Re: [PERFORM] Update table performance

2007-08-09 Thread Merlin Moncure
On 8/8/07, Mark Makarowsky <[EMAIL PROTECTED]> wrote:
> Can you provide more detail on what you mean by your
> two suggestions below:
>
> Yeah, I've used "vertical partitioning" very
> successfully in the past, though I've never done it
> for just a single field. I'll typically leave the few
> most common fields in the "main" table and pull
> everything else into a second table.
>
> I should mention that if you can handle splitting the
> update into multiple transactions, that will help a
> lot since it means you won't be doubling the size of
> the table.
>
> I guess I was just surprised by the time it takes to
> update the field in Postgres, since on an almost
> identical table in FoxPro (400,000 records less), it
> updates the table with the exact same update
> statement in about 4 minutes.

FoxPro is a single-process, DBF-based system with some SQL access.
When you update the records, it updates them in place, since all the
records are fixed size and padded.  Be careful with this
comparison... while certain operations like the above may feel faster,
the locking in FoxPro is extremely crude compared to PostgreSQL.
There are many other things about DBF systems in general that are
pretty lousy from a performance perspective.

That said, 'update' is the slowest operation for postgresql relative
to other databases that are not MVCC.  This is balanced by extremely
efficient locking and good performance under multi-user loads.
PostgreSQL likes to be used a certain way... you will find that when
used properly it is extremely fast.
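
As a rough sketch of the "split the update into multiple transactions"
suggestion quoted above, batching by a key range might look like this
(the table name comes from later in this thread; the integer key column
"id" is an assumption, not something from the original post):

-- Update in batches so VACUUM can reclaim dead row versions between
-- batches, instead of doubling the table size in one huge transaction.
BEGIN;
UPDATE valley SET test = 'this is a test' WHERE id BETWEEN 1 AND 500000;
COMMIT;

VACUUM valley;   -- reclaim the dead tuples from the first batch

BEGIN;
UPDATE valley SET test = 'this is a test' WHERE id BETWEEN 500001 AND 1000000;
COMMIT;
-- ...and so on for the remaining ranges, vacuuming between batches.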

Keep an eye out for the HOT feature, which will hopefully make it into
8.3 and should greatly reduce the penalty for (small) updates in many cases.

merlin



Re: [PERFORM] When/if to Reindex

2007-08-09 Thread Bill Moran
In response to "Decibel!" <[EMAIL PROTECTED]>:

> On Wed, Aug 08, 2007 at 03:27:57PM -0400, Bill Moran wrote:
> > I've had similar experience.  One thing you didn't mention that I've noticed
> > is that VACUUM FULL often bloats indexes.  I've made it SOP that
> > after application upgrades (which usually includes lots of ALTER TABLES and
> > other massive schema and data changes) I VACUUM FULL and REINDEX (in that
> > order).
> 
> You'd be better off with a CLUSTER in that case. It'll be faster, and
> you'll ensure that the table has optimal ordering.

Point taken.

> > Lots of ALTER TABLEs seem to bloat the database size considerably, beyond
> > what normal VACUUM seems to fix.  A FULL seems to fix that, but it appears
> > to bloat the indexes, thus a REINDEX helps.
> 
> Hrm, are you sure that's still true? I just did an ALTER TABLE ... TYPE
> and it created a new file, meaning no bloating.

No, I'm not.  This isn't something I've analyzed or investigated in detail.
During upgrades, a lot happens: ALTER TABLEs, tables are dropped, new tables
are created, massive amounts of data may be altered in a short period, stored
procedures are replaced, etc, etc.

I don't remember what led me to believe that the ALTER TABLES were causing the
worst of the problem, but it's entirely possible that I was off-base.  (I seem
to remember being concerned about too many DROP COLUMN and ADD COLUMNs)  In any
event, my original statement (that it's a good idea to REINDEX after VACUUM
FULL) still seems to be correct.
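
For reference, a minimal sketch of the two maintenance approaches being
compared here (table and index names are placeholders):

-- Current SOP described above: compact the table, then rebuild its
-- indexes, since VACUUM FULL tends to bloat them.
VACUUM FULL my_table;
REINDEX TABLE my_table;

-- The suggested alternative: CLUSTER rewrites the table in index order
-- and rebuilds all of its indexes as a side effect, so no separate
-- REINDEX is needed (note it holds an exclusive lock while it runs).
CLUSTER my_table_pkey ON my_table;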

-- 
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

[EMAIL PROTECTED]
Phone: 412-422-3463x4023



Re: [PERFORM] Update table performance

2007-08-09 Thread Michael Stone

On Thu, Aug 09, 2007 at 06:04:09PM +0530, Merlin Moncure wrote:

Keep an eye out for the HOT feature, which will hopefully make it into
8.3 and should greatly reduce the penalty for (small) updates in many cases.


Is there an overview somewhere of how this feature works and what it 
is expected to do? There have been a lot of references to it over time, 
and it's possible to understand it if you follow list traffic, but 
starting cold it's hard to know what it is. The name was poorly 
chosen as far as Google is concerned. :)


Mike Stone



Re: [PERFORM] When/if to Reindex

2007-08-09 Thread Steven Flatt
On 8/8/07, Tom Lane <[EMAIL PROTECTED]> wrote:
>
> I'd make the same remark about Steven's case: if possible, don't create
> the indexes at all until you've loaded the table fully.


We considered this; however, in some of our 12-hour partitions there are
upwards of 50 or 60 million rows near the end of the 12 hours, so read
performance on the current partition degrades very quickly if there are no
indexes.

It makes more sense for us to accept ~1 hour's worth of reindexing afterwards,
during which read performance on that partition is "compromised".
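
A minimal sketch of the two options being weighed (all object names
invented for illustration):

-- Tom's suggestion: load the partition without indexes, then build them
-- once at the end.  Fast load, but reads on that partition fall back to
-- sequential scans until the index exists.
COPY measurements_20070809_am FROM '/data/batch.csv' WITH CSV;
CREATE INDEX measurements_20070809_am_ts_idx
    ON measurements_20070809_am (recorded_at);

-- Approach described above: keep the index in place during the load so
-- reads stay indexed, then spend ~1 hour rebuilding it afterwards to
-- shed the bloat.
REINDEX TABLE measurements_20070809_am;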

Steve


Re: [PERFORM] Update table performance

2007-08-09 Thread Trevor Talbot
On 8/9/07, Michael Stone <[EMAIL PROTECTED]> wrote:
> On Thu, Aug 09, 2007 at 06:04:09PM +0530, Merlin Moncure wrote:
>Keep an eye out for the HOT feature, which will hopefully make it into
>8.3 and should greatly reduce the penalty for (small) updates in many cases.
>
> Is there an overview somewhere about how this feature works and what it
> is expected to do? There have been a lot of references to it over time,
> and it's possible to understand it if you follow list traffic over time,
> but starting cold it's hard to know what it is. The name was poorly
> chosen as far as google is concerned. :)

This is what I found when I went looking for info earlier:
http://archives.postgresql.org/pgsql-patches/2007-07/msg00142.php
http://archives.postgresql.org/pgsql-patches/2007-07/msg00360.php



Re: [PERFORM] mid 2007 "best bang for the buck" hardware opinions

2007-08-09 Thread Vivek Khera


On Aug 8, 2007, at 11:34 PM, justin wrote:

So what are the thoughts on a current combined rack/disks/cpu combo
around the $10k-$15k point?


I just put into production testing this setup:

SunFire X4100M2 (2x dual-core Opteron) with 20GB RAM and an LSI PCI-e
dual-channel 4Gb Fibre Channel adapter, connected to Partners Data
Systems' Triton 16FA4 RAID array.  The Triton array consists of 16
SATA drives and connects out over 4Gb Fibre Channel.  I run FreeBSD 6.2
on it.


It is very fast. ;-)

It cost a bit over $20k.



Re: [PERFORM] Update table performance

2007-08-09 Thread Decibel!
On Thu, Aug 09, 2007 at 06:04:09PM +0530, Merlin Moncure wrote:
> That said, 'update' is the slowest operation for postgresql relative
> to other databases that are not MVCC.

Actually, it depends on how you do MVCC. In Oracle, DELETE is actually
the most expensive operation, because they not only have to remove the
row from the heap, they also have to copy it to the undo log. And they
need to do something with the indexes as well. Whereas we just update
4 bytes in the heap and that's it.

An UPDATE in Oracle OTOH just needs to store whatever fields have
changed in the undo log. If you haven't messed with indexed fields, it
doesn't have to touch those either.
-- 
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)




[PERFORM] Dell Hardware Recommendations

2007-08-09 Thread Joe Uhl
We have a 30 GB database (according to pg_database_size) running nicely
on a single Dell PowerEdge 2850 right now.  This represents data
specific to 1 US state.  We are in the process of planning a deployment
that will service all 50 US states.

If 30 GB is an accurate number per state that means the database size is
about to explode to 1.5 TB.  About 1 TB of this amount would be OLAP
data that is heavy-read but only updated or inserted in batch.  It is
also largely isolated to a single table partitioned on state.  This
portion of the data will grow very slowly after the initial loading. 
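
For context, "a single table partitioned on state" typically means a
parent table with per-state children and constraint exclusion, roughly
like this sketch (all names here are invented):

-- Parent table seen by queries; per-state children hold the rows.
CREATE TABLE olap_facts (
    state    char(2)      NOT NULL,
    recorded timestamptz  NOT NULL,
    payload  text
);

CREATE TABLE olap_facts_oh (CHECK (state = 'OH')) INHERITS (olap_facts);
CREATE TABLE olap_facts_ny (CHECK (state = 'NY')) INHERITS (olap_facts);
-- ...one child table per state.

-- With constraint_exclusion = on, a query such as
--   SELECT count(*) FROM olap_facts WHERE state = 'OH';
-- only scans olap_facts_oh.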

The remaining 500 GB has frequent individual writes performed against
it.  500 GB is a high estimate and it will probably start out closer to
100 GB and grow steadily up to and past 500 GB.

I am trying to figure out an appropriate hardware configuration for such
a database.  Currently I am considering the following:

PowerEdge 1950 paired with a PowerVault MD1000
2 x Quad Core Xeon E5310
16 GB 667MHz RAM (4 x 4GB leaving room to expand if we need to)
PERC 5/E Raid Adapter
2 x 146 GB SAS in Raid 1 for OS + logs.
A bunch of disks in the MD1000 configured in Raid 10 for Postgres data.

The MD1000 holds 15 disks, so 14 disks + a hot spare is the max.  With
12 250GB SATA drives to cover the 1.5TB we would be able to add another
250GB of usable space for future growth before needing to get a bigger
set of disks.  500GB drives would leave a lot more room and could allow
us to run the MD1000 in split mode and use its remaining disks for other
purposes in the meantime.  I would greatly appreciate any feedback with
respect to drive count vs. drive size and SATA vs. SCSI/SAS.  The price
difference makes SATA awfully appealing.

We plan to involve outside help in getting this database tuned and
configured, but want to get some hardware ballparks in order to get
quotes and potentially request a trial unit.

Any thoughts or recommendations?  We are running openSUSE 10.2 with
kernel 2.6.18.2-34.

Regards,

Joe Uhl
[EMAIL PROTECTED]




Re: [PERFORM] Dell Hardware Recommendations

2007-08-09 Thread Decibel!
On Thu, Aug 09, 2007 at 03:47:09PM -0400, Joe Uhl wrote:
> We have a 30 GB database (according to pg_database_size) running nicely
> on a single Dell PowerEdge 2850 right now.  This represents data
> specific to 1 US state.  We are in the process of planning a deployment
> that will service all 50 US states.
> 
> If 30 GB is an accurate number per state that means the database size is
> about to explode to 1.5 TB.  About 1 TB of this amount would be OLAP
> data that is heavy-read but only updated or inserted in batch.  It is
> also largely isolated to a single table partitioned on state.  This
> portion of the data will grow very slowly after the initial loading. 
> 
> The remaining 500 GB has frequent individual writes performed against
> it.  500 GB is a high estimate and it will probably start out closer to
> 100 GB and grow steadily up to and past 500 GB.

What kind of transaction rate are you looking at?

> I am trying to figure out an appropriate hardware configuration for such
> a database.  Currently I am considering the following:
> 
> PowerEdge 1950 paired with a PowerVault MD1000
> 2 x Quad Core Xeon E5310
> 16 GB 667MHz RAM (4 x 4GB leaving room to expand if we need to)

16GB for 500GB of active data is probably a bit light.

> PERC 5/E Raid Adapter
> 2 x 146 GB SAS in Raid 1 for OS + logs.
> A bunch of disks in the MD1000 configured in Raid 10 for Postgres data.
> 
> The MD1000 holds 15 disks, so 14 disks + a hot spare is the max.  With
> 12 250GB SATA drives to cover the 1.5TB we would be able to add another
> 250GB of usable space for future growth before needing to get a bigger
> set of disks.  500GB drives would leave a lot more room and could allow
> us to run the MD1000 in split mode and use its remaining disks for other
> purposes in the mean time.  I would greatly appreciate any feedback with
> respect to drive count vs. drive size and SATA vs. SCSI/SAS.  The price
> difference makes SATA awfully appealing.

Well, how does this compare with what you have right now? And do you
expect your query rate to be 50x what it is now, or higher?

> We plan to involve outside help in getting this database tuned and
> configured, but want to get some hardware ballparks in order to get
> quotes and potentially request a trial unit.

You're doing a very wise thing by asking for information before
purchasing (unfortunately, many people put that cart before the horse).
This list is a great resource for information, but there's no real
substitute for working directly with someone and being able to discuss
your actual system in detail, so I'd suggest getting outside help
involved before actually purchasing or even evaluating hardware. There's
a lot to think about beyond just drives and memory with the kind of
expansion you're looking at. For example, what ability do you have to
scale past one machine? Do you have a way to control your growth rate?
How well will the existing design scale out? (Often times what is a good
design for a smaller set of data is sub-optimal for a large set of
data.)

Something else that might be worth looking at is having your existing
workload modeled; that allows building a pretty accurate estimate of
what kind of hardware would be required to hit a different workload.
-- 
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)




Re: [PERFORM] Dell Hardware Recommendations

2007-08-09 Thread Merlin Moncure
On 8/9/07, Joe Uhl <[EMAIL PROTECTED]> wrote:
> We have a 30 GB database (according to pg_database_size) running nicely
> on a single Dell PowerEdge 2850 right now.  This represents data
> specific to 1 US state.  We are in the process of planning a deployment
> that will service all 50 US states.
>
> If 30 GB is an accurate number per state that means the database size is
> about to explode to 1.5 TB.  About 1 TB of this amount would be OLAP
> data that is heavy-read but only updated or inserted in batch.  It is
> also largely isolated to a single table partitioned on state.  This
> portion of the data will grow very slowly after the initial loading.
>
> The remaining 500 GB has frequent individual writes performed against
> it.  500 GB is a high estimate and it will probably start out closer to
> 100 GB and grow steadily up to and past 500 GB.
>
> I am trying to figure out an appropriate hardware configuration for such
> a database.  Currently I am considering the following:
>
> PowerEdge 1950 paired with a PowerVault MD1000
> 2 x Quad Core Xeon E5310
> 16 GB 667MHz RAM (4 x 4GB leaving room to expand if we need to)
> PERC 5/E Raid Adapter
> 2 x 146 GB SAS in Raid 1 for OS + logs.
> A bunch of disks in the MD1000 configured in Raid 10 for Postgres data.
>
> The MD1000 holds 15 disks, so 14 disks + a hot spare is the max.  With
> 12 250GB SATA drives to cover the 1.5TB we would be able to add another
> 250GB of usable space for future growth before needing to get a bigger
> set of disks.  500GB drives would leave a lot more room and could allow
> us to run the MD1000 in split mode and use its remaining disks for other
> purposes in the mean time.  I would greatly appreciate any feedback with
> respect to drive count vs. drive size and SATA vs. SCSI/SAS.  The price
> difference makes SATA awfully appealing.

I'm getting an MD1000 tomorrow to play with for just this type of
analysis, as it happens.  First of all, move the o/s drives to the
backplane and get the cheapest available.

I might consider picking up an extra perc 5/e, since the MD1000 is
active/active, and doing either raid 10 or 05 with one of the raid levels
in software.  For example, two raid 5 volumes (hardware raid 5)
striped in software as raid 0.  A 15k SAS drive is worth at least two
SATA drives (unless they are raptors) for OLTP performance loads.

Where the extra controller especially pays off is if you have to
expand to a second tray.  It's easy to add trays but installing
controllers on a production server is scary.

Raid 10 is usually better for databases but in my experience it's a
roll of the dice.  If you factor cost into the matrix a SAS raid 05
might outperform a SATA raid 10 because you are getting better storage
utilization out of the drives (n - 2 vs. n / 2).  Then again, you
might not.

merlin



Re: [PERFORM] Dell Hardware Recommendations

2007-08-09 Thread Decibel!
On Thu, Aug 09, 2007 at 05:50:10PM -0400, Merlin Moncure wrote:
> Raid 10 is usually better for databases but in my experience it's a
> roll of the dice.  If you factor cost into the matrix a SAS raid 05
> might outperform a SATA raid 10 because you are getting better storage
> utilization out of the drives (n - 2 vs. n / 2).  Then again, you
> might not.

It's going to depend heavily on the controller and the workload.
Theoretically, if most of your writes are to stripes that the controller
already has cached then you could actually out-perform RAID10. But
that's a really, really big IF, because if the stripe isn't in cache you
have to read the entire thing in before you can do the write... and that
costs *a lot*.

Also, a good RAID controller can spread reads out across both drives in
each mirror on a RAID10. Though, there is an argument for not doing
that... it makes it much less likely that both drives in a mirror will
fail close enough to each other that you'd lose that chunk of data.

Speaking of failures, keep in mind that a normal RAID5 puts you only 2
drive failures away from data loss, while with RAID10 you can
potentially lose half the array without losing any data. If you do RAID5
with multiple parity copies that does change things; I'm not sure which
is better at that point (I suspect it matters how many drives are
involved).

The suggestion of an extra controller isn't a bad idea, although I
would hope that you'll have some kind of backup server available, which
makes an extra controller much less useful.
-- 
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)




Re: [PERFORM] Dell Hardware Recommendations

2007-08-09 Thread Arjen van der Meijden

On 9-8-2007 23:50 Merlin Moncure wrote:

Where the extra controller especially pays off is if you have to
expand to a second tray.  It's easy to add trays but installing
controllers on a production server is scary.


For connectivity's sake that's not a necessity. You can either connect 
(two?) extra MD1000s to your first MD1000, or you can use the second 
external SAS port on your controller. Obviously it depends on the 
controller whether it's good enough to just add the disks to it, rather 
than adding another controller for the second tray. Whether the perc5/e 
is good enough for that I don't know; we've only equipped ours with a 
single MD1000 holding 15x 15k rpm drives, but in our benchmarks it 
scaled pretty well going from a few to all 14 disks (+1 hotspare).


Best regards,

Arjen



Re: [PERFORM] Dell Hardware Recommendations

2007-08-09 Thread Joe Uhl
Thanks for the input.  Thus far we have used Dell but I would certainly
be willing to explore other options.

I found a "Reference Guide" for the MD1000 from April, 2006 that
includes info on the PERC 5/E at:

http://www.dell.com/downloads/global/products/pvaul/en/pvaul_md1000_solutions_guide.pdf

To answer the questions below:

> How many users do you expect to hit the db at the same time?
There are 2 types of users.  For roughly every 5000 active accounts, 10
or fewer of those will have additional privileges.  Only those more
privileged users interact substantially with the OLAP portion of the
database.  For 1 state, 10 concurrent connections was about the max, so
if that holds for 50 states we are looking at 500 concurrent users as a
top end, with a very small fraction of those users interacting with the
OLAP portion.

> How big of a dataset will each one be grabbing at the same time?
For the OLTP data it is mostly single object reads and writes and
generally touches only a few tables at a time.

> Will your Perc RAID controller have a battery backed cache on board?
> If so (and it better!) how big of a cache can it hold?
According to the above link, it has a 256 MB cache that is battery
backed.

> Can you split this out onto two different machines, one for the OLAP
> load and the other for what I'm assuming is OLTP?
> Can you physically partition this out by state if need be?
Right now this system isn't in production, so we can explore any option.
We are looking into splitting the OLAP and OLTP portions right now, and I
imagine physically splitting the partitions on the big OLAP table is an
option as well.

Really appreciate all of the advice.  Before we pull the trigger on
hardware we probably will get some external advice from someone but I
knew this list would provide some excellent ideas and feedback to get us
started.

Joe Uhl
[EMAIL PROTECTED]

On Thu, 9 Aug 2007 16:02:49 -0500, "Scott Marlowe"
<[EMAIL PROTECTED]> said:
> On 8/9/07, Joe Uhl <[EMAIL PROTECTED]> wrote:
> > We have a 30 GB database (according to pg_database_size) running nicely
> > on a single Dell PowerEdge 2850 right now.  This represents data
> > specific to 1 US state.  We are in the process of planning a deployment
> > that will service all 50 US states.
> >
> > If 30 GB is an accurate number per state that means the database size is
> > about to explode to 1.5 TB.  About 1 TB of this amount would be OLAP
> > data that is heavy-read but only updated or inserted in batch.  It is
> > also largely isolated to a single table partitioned on state.  This
> > portion of the data will grow very slowly after the initial loading.
> >
> > The remaining 500 GB has frequent individual writes performed against
> > it.  500 GB is a high estimate and it will probably start out closer to
> > 100 GB and grow steadily up to and past 500 GB.
> >
> > I am trying to figure out an appropriate hardware configuration for such
> > a database.  Currently I am considering the following:
> >
> > PowerEdge 1950 paired with a PowerVault MD1000
> > 2 x Quad Core Xeon E5310
> > 16 GB 667MHz RAM (4 x 4GB leaving room to expand if we need to)
> > PERC 5/E Raid Adapter
> > 2 x 146 GB SAS in Raid 1 for OS + logs.
> > A bunch of disks in the MD1000 configured in Raid 10 for Postgres data.
> >
> > The MD1000 holds 15 disks, so 14 disks + a hot spare is the max.  With
> > 12 250GB SATA drives to cover the 1.5TB we would be able to add another
> > 250GB of usable space for future growth before needing to get a bigger
> > set of disks.  500GB drives would leave a lot more room and could allow
> > us to run the MD1000 in split mode and use its remaining disks for other
> > purposes in the mean time.  I would greatly appreciate any feedback with
> > respect to drive count vs. drive size and SATA vs. SCSI/SAS.  The price
> > difference makes SATA awfully appealing.
> >
> > We plan to involve outside help in getting this database tuned and
> > configured, but want to get some hardware ballparks in order to get
> > quotes and potentially request a trial unit.
> >
> > Any thoughts or recommendations?  We are running openSUSE 10.2 with
> > kernel 2.6.18.2-34.
> 
> Some questions:
> 
> How many users do you expect to hit the db at the same time?
> How big of a dataset will each one be grabbing at the same time?
> Will your Perc RAID controller have a battery backed cache on board?
> If so (and it better!) how big of a cache can it hold?
> Can you split this out onto two different machines, one for the OLAP
> load and the other for what I'm assuming is OLTP?
> Can you physically partition this out by state if need be?
> 
> A few comments:
> 
> I'd go with the bigger drives.  Just as many, so you have spare
> storage as you need it.  You never know when you'll need to migrate
> your whole data set from one pg db to another for testing etc...
> Extra space comes in REAL handy when things aren't quite going right.
> With 10krpm 500 and 750 Gig drives you can use smaller partitions on
> 

Re: [PERFORM] Dell Hardware Recommendations

2007-08-09 Thread Scott Marlowe
Oops, the wrong list...  now the right one.

On 8/9/07, Decibel! <[EMAIL PROTECTED]> wrote:
> You forgot the list. :)
>
> On Thu, Aug 09, 2007 at 05:29:18PM -0500, Scott Marlowe wrote:
> > On 8/9/07, Decibel! <[EMAIL PROTECTED]> wrote:
> >
> > > Also, a good RAID controller can spread reads out across both drives in
> > > each mirror on a RAID10. Though, there is an argument for not doing
> > > that... it makes it much less likely that both drives in a mirror will
> > > fail close enough to each other that you'd lose that chunk of data.
> >
> > I'd think that kind of failure mode is pretty uncommon, unless you're
> > in an environment where physical shocks are common, which is not a
> > typical database environment.  (Tell that to the guys writing a db for
> > a modern tank fire control system, though. :) )
> >
> > > Speaking of failures, keep in mind that a normal RAID5 puts you only 2
> > > drive failures away from data loss,
> >
> > Not only that, but the first drive failure puts you way down the list
> > in terms of performance, whereas a single failed drive in a large
> > RAID-10 only marginally affects performance.
> >
> > > while with RAID10 you can
> > > potentially lose half the array without losing any data.
> >
> > Yes, but the RIGHT two drives can kill EITHER RAID 5 or RAID10.
> >
> > > If you do RAID5
> > > with multiple parity copies that does change things; I'm not sure which
> > > is better at that point (I suspect it matters how many drives are
> > > involved).
> >
> > That's RAID6.  The primary advantages of RAID6 over RAID10 or RAID5
> > are two fold:
> >
> > 1:  A single drive failure has no negative effect on performance, so
> > the array is still pretty fast, especially for reads, which just suck
> > under RAID 5 with a missing drive.
> > 2:  No two drive failures can cause loss of data.  Admittedly, by the
> > time the second drive fails, you're now running on the equivalent of a
> > degraded RAID5, unless you've configured >2 drives for parity.
> >
> > On very large arrays (100s of drives), RAID6 with 2, 3, or 4 drives
> > for parity makes some sense, since having that many extra drives means
> > the RAID controller (SW or HW) can now have elections to decide which
> > drive might be lying if you get data corruption.
> >
> > Note that you can also look into RAID10 with 3 or more drives per
> > mirror.  I.e. build 3 RAID-1 sets of 3 drives each, then you can lose
> > any two drives and still stay up.  Plus, on a mostly read database,
> > where users might be reading the same drives but in different places,
> > multi-disk RAID-1 makes sense under RAID-10.
> >
> > While I agree with Merlin that for OLTP a faster drive is a must, for
> > OLAP, more drives is often the real key.  The high aggregate bandwidth
> > of a large array of SATA drives is an amazing thing to watch when
> > running a reporting server with otherwise unimpressive specs.
> >
>
> --
> Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED]
> Give your computer some brain candy! www.distributed.net Team #1828
>
>



Re: [PERFORM] Dell Hardware Recommendations

2007-08-09 Thread Greg Smith

On Thu, 9 Aug 2007, Joe Uhl wrote:

The MD1000 holds 15 disks, so 14 disks + a hot spare is the max.  With 
12 250GB SATA drives to cover the 1.5TB we would be able to add another 
250GB of usable space for future growth before needing to get a bigger 
set of disks.  500GB drives would leave a lot more room and could allow 
us to run the MD1000 in split mode and use its remaining disks for other 
purposes in the meantime.  I would greatly appreciate any feedback with 
respect to drive count vs. drive size and SATA vs. SCSI/SAS.  The price 
difference makes SATA awfully appealing.


The SATA II drives in the MD1000 all run at 7200 RPM, and are around 
$0.80/GB (just grabbed a random quote from the configurator on their site 
for all these) for each of the 250GB, 500GB, and 750GB capacities.  If you 
couldn't afford to fill the whole array with 500GB models, then it might 
make sense to get the 250GB ones instead just to spread the load out over 
more spindles; if you're filling it regardless, surely the reduction in 
stress over capacity issues with the 500GB models makes more sense.  Also, 
using the 500GB models would make it much easier to only ever use 12 
active drives and have 3 hot spares, with less pressure to convert spares 
into active storage; drives die in surprisingly correlated batches far too 
often to have only 1 spare IMHO.


The two SAS options that you could use are both 300GB, and you can have 
10K RPM for $2.3/GB or 15K RPM for $3.0/GB.  So relative to the SATA 
options, you're paying about 3X as much to get a 40% faster spin rate, or 
around 4X as much to get an over 100% faster spin.  There are certainly 
other things that factor into performance besides that, but just staring at 
the RPM gives you a gross idea of how much higher a raw transaction rate 
the drives can support.


The question you have to ask yourself is how much actual I/O you are 
dealing with.  The tiny 256MB cache on the PERC 5/E isn't going to help 
much with buffering writes in particular, so raw disk performance may be 
critical for your update-intensive workload.  If the combination of 
transaction rate and total bandwidth is low enough that the 7200 RPM 
drives can keep up with your load, by all means save yourself a lot of 
cash and get the SATA drives.


In your situation, I'd be spending a lot of my time measuring the 
transaction and I/O bandwidth rates on the active system very carefully to 
figure out which way to go here.  You're in a better position than most 
people buying new hardware to estimate what you need, since you have the 
existing system in place; take advantage of that by drilling into the exact 
numbers for what you're pushing through your disks now.  Every dollar spent 
on work to quantify that early will easily pay for itself by helping guide 
your purchase and future plans; that's what I'd be bringing people in to do 
right now if I were you, if that's not something you're already familiar 
with measuring.
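
One inexpensive starting point for that measurement is to sample the
counters PostgreSQL already keeps and diff two samples taken a known
interval apart (a sketch only; in releases of that era the blks_* columns
require stats_block_level to be enabled, and OS-level tools such as iostat
should be used alongside it):

-- Cumulative transaction and block I/O counters for the current database.
-- Sample twice, N seconds apart; the deltas divided by N give rough
-- transactions/sec and blocks-read/sec figures.
SELECT datname,
       xact_commit,
       xact_rollback,
       blks_read,   -- reads that missed PostgreSQL's buffer cache
       blks_hit     -- reads satisfied from the buffer cache
  FROM pg_stat_database
 WHERE datname = current_database();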


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [PERFORM] Dell Hardware Recommendations

2007-08-09 Thread Decibel!
On Thu, Aug 09, 2007 at 08:58:19PM -0500, Scott Marlowe wrote:
> > On Thu, Aug 09, 2007 at 05:29:18PM -0500, Scott Marlowe wrote:
> > > On 8/9/07, Decibel! <[EMAIL PROTECTED]> wrote:
> > >
> > > > Also, a good RAID controller can spread reads out across both drives in
> > > > each mirror on a RAID10. Though, there is an argument for not doing
> > > > that... it makes it much less likely that both drives in a mirror will
> > > > fail close enough to each other that you'd lose that chunk of data.
> > >
> > > I'd think that kind of failure mode is pretty uncommon, unless you're
> > > in an environment where physical shocks are common, which is not a
> > > typical database environment.  (Tell that to the guys writing a db for
> > > a modern tank fire control system, though. :) )

You'd be surprised. I've seen more than one case of a bunch of drives
failing within a month, because they were all bought at the same time.

> > > > while with RAID10 you can
> > > > potentially lose half the array without losing any data.
> > >
> > > Yes, but the RIGHT two drives can kill EITHER RAID 5 or RAID10.

Sure, but with RAID5 the odds that two failures are "the right two" are
100%, while in a RAID10 they're much lower.

> > > While I agree with Merlin that for OLTP a faster drive is a must, for
> > > OLAP, more drives is often the real key.  The high aggregate bandwidth
> > > of a large array of SATA drives is an amazing thing to watch when
> > > running a reporting server with otherwise unimpressive specs.

True. In this case, the OP will probably want to have one array for the
OLTP stuff and one for the OLAP stuff.
-- 
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)




Re: [PERFORM] Update table performance

2007-08-09 Thread Kenneth Marshall
Mark,

You are not alone: when people post their system specifications,
CPU and memory are always listed while the disk I/O subsystem
invariably is not.  This is a very disk-intensive operation and I
suspect that your disk system is maxed out.  If you want it faster,
you will need more I/O capacity.

Regards,
Ken

On Tue, Aug 07, 2007 at 05:58:35AM -0700, Mark Makarowsky wrote:
> I have a table with 4,889,820 records in it.  The
> table also has 47 fields.  I'm having problems with
> update performance.  Just as a test, I issued the
> following update:
> 
> update valley set test='this is a test'
> 
> This took 905641 ms.  Isn't that kind of slow?  There
> aren't any indexes, triggers, constraints or anything
> on this table.  The version of Postgres is "PostgreSQL
> 8.2.4 on i686-pc-mingw32, compiled by GCC gcc.exe
> (GCC) 3.4.2 (mingw-special)".  The operating
> environment is Windows 2003 Standard Edition w/service
> pack 2.  It is 2.20 Ghz with 1.0 GB of RAM.  Here is
> the results from Explain:
> 
> "Seq Scan on valley  (cost=0.00..1034083.57
> rows=4897257 width=601)"
> 
> Here are the settings in the postgresql.conf.  Any



[PERFORM] Performance on writable views

2007-08-09 Thread Enrico Weigelt

Hi folks,


I'm often using writable views as interfaces to clients, so 
they only see "virtual" objects and never have to cope with
the actual storage, i.e. to give some client a totally 
denormalized view of certain things, containing only the 
information required for certain kinds of operations. 

This method is nice for creating easy and robust client 
interfaces - internal schema changes are not visible to 
the client. In situations where many, many clients - often
coded/maintained by different people - have to access a
database which is still under development (typical for 
many in-house applications), it helps to circumvent interface
instabilities.

Now I've got the strange feeling that this makes updates
slow, since it always has to run the whole view query to
fetch a record to be updated (i.e. to get OLD.*).

Could anyone with some deep insight please give me some 
details about that issue?
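
For context, a minimal sketch of the kind of writable view being
described, using a rewrite rule (the schema and all names are invented):

-- Denormalized "virtual object" exposed to clients.
CREATE VIEW client_orders AS
SELECT o.order_id, c.customer_name, o.status
  FROM orders o
  JOIN customers c ON c.customer_id = o.customer_id;

-- Rule that turns an UPDATE on the view into an UPDATE on the base table.
-- Producing OLD.* for the rewritten query is where the view's own query
-- comes back into play, which is the cost being asked about.
CREATE RULE client_orders_upd AS
    ON UPDATE TO client_orders
    DO INSTEAD
    UPDATE orders
       SET status = NEW.status
     WHERE orders.order_id = OLD.order_id;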


cu 
-- 
-
 Enrico Weigelt==   metux IT service - http://www.metux.de/
-
 Please visit the OpenSource QM Taskforce:
http://wiki.metux.de/public/OpenSource_QM_Taskforce
 Patches / Fixes for a lot dozens of packages in dozens of versions:
http://patches.metux.de/
-



[PERFORM] select count(*) performance

2007-08-09 Thread runic
Hello Group,

I'm new to the PostgreSQL business, therefore please forgive me a "newbie"
question. I have a table with ca. 1,250,000 records. When I execute
a "select count (*) from table" (with pgAdmin III) it takes about 40
seconds.
I think that takes much too long. Can you please give me hints on where
I can look for improvements?

TIA, Det




Re: [PERFORM] Dell Hardware Recommendations

2007-08-09 Thread david

On Thu, 9 Aug 2007, Decibel! wrote:


On Thu, Aug 09, 2007 at 08:58:19PM -0500, Scott Marlowe wrote:

On Thu, Aug 09, 2007 at 05:29:18PM -0500, Scott Marlowe wrote:

On 8/9/07, Decibel! <[EMAIL PROTECTED]> wrote:


Also, a good RAID controller can spread reads out across both drives in
each mirror on a RAID10. Though, there is an argument for not doing
that... it makes it much less likely that both drives in a mirror will
fail close enough to each other that you'd lose that chunk of data.


I'd think that kind of failure mode is pretty uncommon, unless you're
in an environment where physical shocks are common, which is not a
typical database environment.  (Tell that to the guys writing a db for
a modern tank fire control system, though. :) )


You'd be surprised. I've seen more than one case of a bunch of drives
failing within a month, because they were all bought at the same time.


while with RAID10 you can
potentially lose half the array without losing any data.


Yes, but the RIGHT two drives can kill EITHER RAID 5 or RAID10.


Sure, but the odds of that with RAID5 are 100%, while they're much less
in a RAID10.


so you go with Raid6, not Raid5.


While I agree with Merlin that for OLTP a faster drive is a must, for
OLAP, more drives is often the real key.  The high aggregate bandwidth
of a large array of SATA drives is an amazing thing to watch when
running a reporting server with otherwise unimpressive specs.


True. In this case, the OP will probably want to have one array for the
OLTP stuff and one for the OLAP stuff.


One thing that's interesting is that the I/O throughput on the large SATA 
drives can actually be higher than on the faster but smaller SCSI drives. 
The SCSI drives can win on seeking, but how much seeking you need to do 
depends on how large the OLTP database ends up being.


David Lang



[PERFORM] Implementing an regex filter

2007-08-09 Thread Enrico Weigelt

Hi folks,


I've coded an auction crawler, which queries several auction
platforms for user-defined keywords. The results are then 
put into queues and sorted out from there. For example, each
auction query can be passed along with a maxprice value, 
which is put into the result records. Every few minutes a
filter moves those result records where the article reached
the maxprice away to another queue. 

The blacklist filter (which gives me headaches) moves those 
records where the article's title matches some regexes to another 
queue. One of the users has more than 630 blacklist entries,
concatenated into one regex of about 24kb. I've tried different
approaches: comparing against each single blacklist entry (joining the 
blacklist table) or comparing against one huge regex (putting the
compiled regexes per user into a separate table). Both run 
very, very long (> 15 mins) and generate heavy load.

My current scheme (reduced to required stuff):

* article table:  article_id(oid) 
  title(text)
  current_price(float)
  ... 

* user table: user_id(oid)
  username(text)
  ...
 
* user_results table: article_id(oid)
  user_id(oid)  
  username(oid)
  queue(oid)   <-- only scanning 'incoming'
  seen(boolean)<-- seen results are skipped
  ...
  
* user_db_list:   username(text)
  dbname(text)
  data(text)
  ...

* heap:   name(text)
  data(text)


This is the explain analyze output of the compiled-regex approach:
(the compiled regex is stored in the "heap" table)

auctionwatch=> explain analyze update base.user_results set queue='FOO' 
   WHERE queue = 'incoming' AND 
 NOT seen AND 
 base.user_results.article_id = base.articles.inode_id AND 
 base.articles.end_time > current_timestamp AND 
 base.articles.title ~ (
SELECT data FROM base.heap WHERE name = 
'blacklist.title::'||base.user_results.username);

Hash Join  (cost=2131.38..7622.69 rows=22 width=56) (actual 
time=1040416.087..1128977.760 rows=1 loops=1)
Hash Cond: ("outer".article_id = "inner".inode_id)
Join Filter: ("inner".title ~ (subplan))
->  Seq Scan on user_results  (cost=0.00..593.08 rows=11724 width=56) 
(actual time=0.104..518.036 rows=11189 loops=1)
Filter: ((queue = 'incoming'::text) AND (NOT seen))
->  Hash  (cost=2014.41..2014.41 rows=8787 width=57) (actual 
time=250.946..250.946 rows=0 loops=1)
->  Seq Scan on articles  (cost=0.00..2014.41 rows=8787 width=57) 
(actual time=0.702..232.754 rows=8663 loops=1)
Filter: (end_time > ('now'::text)::timestamp(6) with time zone)
SubPlan
  ->  Seq Scan on heap  (cost=0.00..1.01 rows=1 width=32) (actual 
time=0.070..0.072 rows=1 loops=5998)
 Filter: (name = ('blacklist.title::'::text || $0))

Total runtime: 1129938.362 ms
  

And the approach via joining the regex table:

auctionwatch=> explain analyze update base.user_results set queue = 'FOO' 
   WHERE queue = 'incoming' AND 
 NOT seen AND 
 base.user_results.article_id = base.articles.inode_id AND 
 base.articles.end_time > current_timestamp AND 
 base.articles.title ~ base.user_db_list.data AND 
 base.user_db_list.username = base.user_results.username 
AND 
 base.user_db_list.dbname = 'BLACKLIST.TITLE' ;

Hash Join  (cost=3457.12..11044097.45 rows=3619812 width=56) (actual 
time=90458.408..126119.167 rows=2 loops=1)
   Hash Cond: ("outer".username = "inner".username)
   Join Filter: ("inner".title ~ "outer".data)
   ->  Seq Scan on user_db_list  (cost=0.00..5268.16 rows=186333 width=51) 
(actual time=512.939..514.394 rows=634 loops=1)
 Filter: (dbname = 'BLACKLIST.TITLE'::text)
   ->  Hash  (cost=3373.49..3373.49 rows=4254 width=109) (actual 
time=466.177..466.177 rows=0 loops=1)
 ->  Hash Join  (cost=2221.01..3373.49 rows=4254 width=109) (actual 
time=225.006..439.334 rows=6023 loops=1)
   Hash Cond: ("outer".article_id = "inner".inode_id)
   ->  Seq Scan on user_results  (cost=0.00..593.08 rows=11724 
width=56) (actual time=0.155..85.865 rows=11223 loops=1)
 Filter: ((queue = 'incoming'::text) AND (NOT seen))
   ->  Hash  (cost=2099.20..2099.20 rows=9127 width=57) (actual 
time=205.996..205.996 rows=0 loops=1)
 ->  Seq Scan on articles  (cost=0.00..2099.20 rows=9127 
width=57) (actual time=0.373..187.468 rows=8662 loops=1)
   Filter: 

Re: [PERFORM] select count(*) performance

2007-08-09 Thread slawekj
> I'm new to the PostgreSQL business, therefore please forgive me a "newbie"
> question. I have a table with ca. 1,250,000 records. When I execute
> a "select count (*) from table" (with pgAdmin III) it takes about 40
> seconds.
> I think that takes much too long. Can you please give me hints on where
> I can look for improvements?
> TIA, Det

Maybe try changing shared_buffers and test it.

Or you can create a trigger on this table, fired after insert and delete:
if insert, then increase the value in another table "counts";
if delete, then decrease it.
Then you don't have to execute select count(*) ...
but can instead execute
select my_count from counts where tablename='your_table'
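
A sketch of that trigger-maintained counter (all names invented, and it
assumes PL/pgSQL is installed; note that a single counter row serializes
concurrent writers on that table):

-- Side table holding one precomputed count per monitored table.
CREATE TABLE counts (
    tablename text PRIMARY KEY,
    n         bigint NOT NULL
);
INSERT INTO counts VALUES ('your_table', (SELECT count(*) FROM your_table));

-- Trigger function: bump the counter on INSERT, decrement on DELETE.
CREATE OR REPLACE FUNCTION maintain_count() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        UPDATE counts SET n = n + 1 WHERE tablename = TG_RELNAME;
        RETURN NEW;
    ELSE  -- DELETE
        UPDATE counts SET n = n - 1 WHERE tablename = TG_RELNAME;
        RETURN OLD;
    END IF;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER your_table_count
    AFTER INSERT OR DELETE ON your_table
    FOR EACH ROW EXECUTE PROCEDURE maintain_count();

-- The cheap "count":
SELECT n FROM counts WHERE tablename = 'your_table';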

sj





Re: [PERFORM] Update table performance

2007-08-09 Thread Piotr Kołaczkowski
Hi,

update valley set test='this is a test'

Such a query updates ALL of the records in the table:
5 million records * 47 fields - that can be several gigabytes of data.
The system has to scan those gigabytes to change every record. This is a huge 
task. Try vacuuming and see if it helps. It can help a lot if you perform 
such "whole table updates" often.
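
A small sketch of checking whether vacuuming helps, using the table name
from the quoted statement (the size functions are just one convenient way
to watch the bloat):

-- After a whole-table UPDATE every row leaves behind a dead version, so the
-- table roughly doubles on disk until it is vacuumed.
VACUUM VERBOSE valley;   -- reports how many dead row versions were removed

-- Note: plain VACUUM makes the space reusable for future updates rather than
-- shrinking the file; VACUUM FULL (or CLUSTER) is what actually returns the
-- space to the operating system.
SELECT pg_size_pretty(pg_relation_size('valley'));   -- watch the table's size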

Best regards,
Piotr Kolaczkowski
