Re: [PERFORM] Question on REINDEX
> > Josh Berkus writes:
> >> 1) When is it necessary to run REINDEX or drop/create
> >> an index?  All I could really find in the docs is:
> >
> > If you need to VACUUM FULL, you need to REINDEX as well.
> > For example, if you drop millions of rows from a table.
>
> That's probably a pretty good rule of thumb.  It's worth
> noting that VACUUM FULL tends to actively bloat indexes, not
> reduce them in size, because it has to create new index
> entries for the rows it moves before it can delete the old
> ones.  So if a VACUUM FULL moves many rows you are likely to
> see the indexes get bigger, not smaller.

Is my current understanding correct:

1) VACUUM defragments each page locally: it moves free space to the end of the page.
2) VACUUM FULL defragments the table globally: it tries to fill up all partially free pages and deletes all resulting empty pages.
3) Both VACUUM and VACUUM FULL do only local defragmentation for indexes.
4) If you want indexes to become fully defragmented, you need to REINDEX.

If you happen to use triggers for denormalization, like I do, then you have a lot of updates, which means that tables and indexes quickly become cluttered with pages that contain mostly dead tuples. If those tables and indexes fill up shared buffers, then PostgreSQL slows down, because it has to do a lot more I/O than normal. Regular VACUUM FULL helped, but I needed REINDEX as well, otherwise the indexes grew bigger than the tables themselves!

> > Better to up your max_fsm_pages and do regular VACUUMs regularly and
> > frequently so that you don't have to REINDEX at all.
>
> Yes, definitely.  Also consider using CLUSTER rather than
> VACUUM FULL when you need to clean up after massive deletions
> from a table.  It's not any less intrusive in terms of
> locking, but it's often faster and it avoids the index bloat
> problem (since it effectively does a REINDEX).

Hmm, thanks for the tip.
BTW, is the output of

select count(1), sum(relpages) from pg_class where relkind in ('r','i','t')

a good estimate for max_fsm_relations and max_fsm_pages? Are these parameters used only during VACUUM, or at runtime too?

Tambet

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
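The two cleanup approaches discussed above can be written out concretely; the table and index names below are illustrative, and the CLUSTER syntax shown is the 8.0-era form:

```sql
-- Option 1: compact the heap, then rebuild the bloated indexes.
VACUUM FULL big_table;            -- moves rows down and shortens the file,
                                  -- but tends to bloat the indexes
REINDEX TABLE big_table;          -- rebuilds every index on the table

-- Option 2: CLUSTER does both in one pass, rewriting the heap in
-- index order and effectively REINDEXing as it goes.
CLUSTER big_table_pkey ON big_table;
ANALYZE big_table;                -- refresh statistics after the rewrite
```

Both take an exclusive lock on the table, so either way this is an offline-maintenance operation, as noted above.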
Re: [PERFORM] How to improve db performance with $7K?
> -----Original Message-----
> From: Alex Turner [mailto:[EMAIL PROTECTED]
> Sent: Monday, April 18, 2005 5:50 PM
> To: Bruce Momjian
> Cc: Kevin Brown; pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] How to improve db performance with $7K?
>
> Does it really matter at which end of the cable the queueing is done
> (assuming both ends know as much about drive geometry etc.)?
> [...]

The parenthetical is an assumption I'd rather not make. If my performance depends on my kernel knowing how my drive is laid out, I would always be wondering whether a new drive is going to break any of the kernel's geometry assumptions. Drive geometry doesn't seem like a kernel's business any more than a kernel should be able to decode the CCD signal of an optical mouse. The kernel should queue requests at a level of abstraction that doesn't depend on intimate knowledge of drive geometry, and the drive should queue requests on the concrete level where geometry matters. A drive shouldn't guess whether a process is trying to read a file sequentially, and a kernel shouldn't guess whether sector 30 is contiguous with sector 31 or not.

__
David B. Held
Software Engineer/Array Services Group
200 14th Ave. East, Sartell, MN 56377
320.534.3637 320.253.7800 800.752.8129
Re: [PERFORM] Question on REINDEX
"Tambet Matiisen" <[EMAIL PROTECTED]> writes:
> Is my current understanding correct:
> 1) VACUUM defragments each page locally - moves free space to the end of
> page.
> 2) VACUUM FULL defragments table globally - tries to fill up all
> partially free pages and deletes all resulting empty pages.

Both versions of VACUUM do within-page defragmentation. Also, both versions will remove entirely-empty pages at the end of a table. The difference is that VACUUM FULL actively attempts to make pages at the end empty, by moving their contents into free space in earlier pages. Plain VACUUM never does cross-page data movement, which is why it doesn't need as strong a lock.

BTW, VACUUM FULL does the data movement back-to-front, and stops as soon as it finds a tuple it cannot move down; which is a reasonable strategy since the goal is merely to make the file shorter. But it's entirely likely that there will be lots of empty space left at the end. For instance the final state could have one 4K tuple in the last page and up to 4K-1 free bytes in every earlier page.

> 3) Both VACUUM and VACUUM FULL do only local defragmentation for indexes.
> 4) If you want indexes to become fully defragmented, you need to
> REINDEX.

I don't think "defragment" is a notion that applies to indexes, at least not in the same way as for tables. It's true that there is no cross-page data movement in either case. In the last release or two we've been able to recognize and recycle entirely-empty pages in both btree and hash indexes, but such pages are almost never returned to the OS; they're put on a freelist for re-use within the index, instead.

If you allow the table to grow to much more than its "normal" size, ie, you allow many dead tuples to be formed, then getting back to "normal" size is going to require VACUUM FULL + REINDEX (or you can use CLUSTER or some varieties of ALTER TABLE). This is not the recommended maintenance process, however.
Sufficiently frequent plain VACUUMs should generally hold the free space to a tolerable level without requiring any exclusive locking.

> Hmm, thanks for the tip. BTW, is the output of
> select count(1), sum(relpages) from pg_class where relkind in
> ('r','i','t')
> a good estimate for max_fsm_relations and max_fsm_pages?

Within that one database, yes --- don't forget you must sum these numbers across all DBs in the cluster. Also you need some slop in the max_fsm_pages setting because of quantization in the space usage. It's probably easier to let VACUUM VERBOSE do the calculation for you.

regards, tom lane
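For reference, the VACUUM VERBOSE calculation Tom mentions appears at the end of a database-wide vacuum. The exact wording varies by version, so the output shown here is approximate for the 8.0 era; the number to compare against max_fsm_pages (with some slop) is the "total pages needed" figure:

```sql
VACUUM VERBOSE;
-- ... per-table output ...
-- INFO:  free space map: 40 relations, 3522 pages stored; 5680 total pages needed
-- DETAIL:  Allocated FSM size: 1000 relations + 20000 pages = 178 kB shared memory.
```

If "total pages needed" regularly exceeds the allocated pages, max_fsm_pages is too small for the vacuum interval being used.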
Re: [PERFORM] How to improve db performance with $7K?
Good question. If the SCSI system was moving the head from track 1 to 10, and a request then came in for track 5, could the system make the head stop at track 5 on its way to track 10? That is something that only the controller could do. However, I have no idea if SCSI does that.

|| SCSI, AFAIK, does NOT do this. What SCSI can do is allow "next" request insertion into the head of the request queue (queue-jumping), and/or defer request ordering to the drive itself (queue re-ordering). I have looked, in vain, for evidence that SCSI somehow magically "stops in the middle of a request to pick up data" (my words, not yours).

The only part I am pretty sure about is that real-world experience shows SCSI is better for a mixed I/O environment. Not sure why, exactly, but the command queueing obviously helps, and I am not sure what else does.

|| TCQ is the secret sauce, no doubt. I think NCQ (the SATA version of drive-side request reordering) should go a looong way (but not all the way) toward making SATA 'enterprise acceptable'. Multiple initiators (e.g. more than one host being able to talk to a drive) is a biggie, too. AFAIK only SCSI drives/controllers do that for now.

-- Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Re: [PERFORM] Postgresql works too slow
On Mon, 2005-04-18 at 08:50 +0400, Nurlan Mukhanov (AL/EKZ) wrote:
> I'm trying to restore my database from a dump in several parallel processes,
> but the restore process works too slowly.
> Number of rows: about 100 000 000
> RAM: 8192M
> CPU: Ultra Sparc 3
> Number of CPUs: 4
> OS: SunOS sun 5.8
> RDBMS: PostgreSQL 8.0
> How to increase postgresql speed? Why does postgres take only 5.0% of CPU time?

When you say restore... what are you actually doing? An archive recovery? A reload? A file-level restore of the database?

Best Regards, Simon Riggs
Re: [PERFORM] Postgresql works too slow
Simon Riggs wrote:
> On Mon, 2005-04-18 at 08:50 +0400, Nurlan Mukhanov (AL/EKZ) wrote:
>> I'm trying to restore my database from a dump in several parallel
>> processes, but the restore process works too slowly.
>> Number of rows: about 100 000 000
>> RAM: 8192M, CPU: Ultra Sparc 3, Number of CPUs: 4
>> OS: SunOS sun 5.8, RDBMS: PostgreSQL 8.0
>> How to increase postgresql speed? Why does postgres take only 5.0% of
>> CPU time?
>
> When you say restore... what are you actually doing? An archive
> recovery? A reload? A file-level restore of the database?

If you are doing a restore off a pg_dump, did you dump the data as inserts? This takes a lot more time to restore.

-- Brad Nicholson 416-673-4106
Database Administrator, Afilias Canada Corp.
Re: [PERFORM] Question on REINDEX
Tambet,

> Hmm, thanks for the tip. BTW, is the output of
>
> select count(1), sum(relpages) from pg_class where relkind in
> ('r','i','t')

Well, if you do that for all databases in the cluster, it's the number you start with. However, setting max_fsm_pages to that would be assuming that you expected 100% of the rows to be replaced by UPDATEs or DELETEs before you ran VACUUM. I generally run VACUUM a little sooner than that.

See the end portion of: http://www.powerpostgresql.com/PerfList

-- Josh Berkus
Aglio Database Solutions
San Francisco
Re: [PERFORM] How to improve db performance with $7K?
Mohan, Ross wrote:
> The only part I am pretty sure about is that real-world experience shows
> SCSI is better for a mixed I/O environment. Not sure why, exactly, but the
> command queueing obviously helps, and I am not sure what else does.
>
> || TCQ is the secret sauce, no doubt. I think NCQ (the SATA version of
> drive-side request reordering) should go a looong way (but not all the way)
> toward making SATA 'enterprise acceptable'. Multiple initiators (e.g. more
> than one host being able to talk to a drive) is a biggie, too. AFAIK only
> SCSI drives/controllers do that for now.

What is 'multiple initiators' used for in the real world?

-- Bruce Momjian | http://candle.pha.pa.us
Re: [PERFORM] How to improve db performance with $7K?
Clustered file systems are the first/best example that comes to mind: Host A and Host B can both issue requests to the diskfarm, e.g.

-----Original Message-----
From: Bruce Momjian [mailto:[EMAIL PROTECTED]
Sent: Tuesday, April 19, 2005 12:10 PM
To: Mohan, Ross
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?

Mohan, Ross wrote:
> || TCQ is the secret sauce, no doubt. [...] Multiple initiators (e.g. more
> than one host being able to talk to a drive) is a biggie, too. AFAIK only
> SCSI drives/controllers do that for now.

What is 'multiple initiators' used for in the real world?
Re: [PERFORM] How to improve db performance with $7K?
Mohan, Ross wrote:
> Clustered file systems are the first/best example that comes to mind:
> Host A and Host B can both issue requests to the diskfarm, e.g.

So one host writes to part of the disk and another host writes to a different part?

-- Bruce Momjian | http://candle.pha.pa.us
Re: [PERFORM] How to improve db performance with $7K?
[EMAIL PROTECTED] wrote on 04/19/2005 11:10:22 AM:
> What is 'multiple initiators' used for in the real world?

I asked this same question and got an answer off list: somebody said their SAN hardware used multiple initiators. I would try to check the archives for you, but this thread is becoming more of a rope.

Multiple initiators means multiple sources on the bus issuing I/O instructions to the drives. In theory you can have two computers on the same SCSI bus issuing I/O requests to the same drive, or to anything else on the bus, but I've never seen this implemented. Others have noted this feature as being a big deal, so somebody is benefiting from it.

Rick
Re: [PERFORM] Question on REINDEX
Josh Berkus writes:
>> select count(1), sum(relpages) from pg_class where relkind in
>> ('r','i','t')

> Well, if you do that for all databases in the cluster, it's the number you
> start with. However, setting max_fsm_pages to that would be assuming that
> you expected 100% of the rows to be replaced by UPDATEs or DELETEs before
> you ran VACUUM. I generally run VACUUM a little sooner than that.

Not at all. What it says is that you expect 100% of the pages to have useful amounts of free space, which is a *much* weaker criterion. I think you can usually get away with setting max_fsm_pages to less than your actual disk footprint, but I'm not sure how much less. It'd probably depend a lot on your usage pattern --- for instance, insert-only history tables don't need any FSM space.

regards, tom lane
Re: [PERFORM] How to improve db performance with $7K?
Well, more like they both are allowed to issue disk requests and the magical "clustered file system" manages locking, etc. In reality, any disk is only reading/writing to one part of the platter at any given time, of course, but in the multiple-initiator arrangement, multiple streams of requests from multiple hosts can be queued.

-----Original Message-----
From: Bruce Momjian [mailto:[EMAIL PROTECTED]
Sent: Tuesday, April 19, 2005 12:16 PM
To: Mohan, Ross
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?

Mohan, Ross wrote:
> Clustered file systems are the first/best example that comes to mind:
> Host A and Host B can both issue requests to the diskfarm, e.g.

So one host writes to part of the disk and another host writes to a different part?
Re: [PERFORM] Question on REINDEX
Tom,

> Not at all. What it says is that you expect 100% of the pages to have
> useful amounts of free space, which is a *much* weaker criterion.

Hmmm. Good point. This seems to be another instance where my rule of thumb was based on false logic but nevertheless arrived at correct numbers. I've seldom, if ever, set max_fsm_pages above 50% of the pages in the active database ... and never run out.

Hmmm ... actually, it seems like, if you are vacuuming regularly, you only *do* need to track pages that have been touched by DELETE or UPDATE. Other pages would have already been vacuumed and not have any useful free space left. Yes?

-- Josh Berkus
Aglio Database Solutions
San Francisco
Re: [PERFORM] Question on REINDEX
Josh Berkus writes:
>> Not at all. What it says is that you expect 100% of the pages to have
>> useful amounts of free space, which is a *much* weaker criterion.

> Hmmm ... actually, it seems like, if you are vacuuming regularly, you only
> *do* need to track pages that have been touched by DELETE or UPDATE. Other
> pages would have already been vacuumed and not have any useful free space
> left. Yes?

Well, the space has to be remembered until it's reused. On the other hand, there's nothing that says the FSM has to be aware of all the free space available at all times --- the real criterion to avoid bloat is that after a VACUUM, enough space is logged in the FSM to satisfy all the insertions that will happen before the next VACUUM. So you could have situations where free space is temporarily forgotten (for lack of slots in the FSM), but other free space gets used instead, and eventually a later VACUUM re-finds that free space and puts it into the FSM.

I think it's true that the more often you vacuum, the less FSM you need, but this doesn't have much to do with how much free space is actually out there on disk. It's because you only need enough FSM to record the free space you'll need until the next vacuum.

regards, tom lane
Re: [PERFORM] Question on REINDEX
On Tue, Apr 19, 2005 at 10:06:40AM -0400, Tom Lane wrote:
> BTW, VACUUM FULL does the data movement back-to-front, and stops as soon
> as it finds a tuple it cannot move down; which is a reasonable strategy
> since the goal is merely to make the file shorter. But it's entirely
> likely that there will be lots of empty space left at the end. For
> instance the final state could have one 4K tuple in the last page and
> up to 4K-1 free bytes in every earlier page.

Am I right in thinking that vacuum does at least two passes: one front-to-back to find removable tuples, and another back-to-front for movement? Because if it didn't work this way, it wouldn't relabel (change the Xmin/Xmax of) tuples in early pages. Or does it do something different?

I know maintenance_work_mem is used for storing TIDs of to-be-moved tuples for index cleanup ... how does it relate to the above?

-- Alvaro Herrera (<[EMAIL PROTECTED]>)
"To create is as difficult as to be free" (Elsa Triolet)
Re: [PERFORM] Question on REINDEX
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Am I right in thinking that vacuum does at least two passes: one
> front-to-back to find removable tuples, and another back-to-front for
> movement?

VACUUM FULL, yes. Plain VACUUM only does the first one.

> I know maintenance_work_mem is used for storing TIDs of to-be-moved
> tuples for index cleanup ... how does it relate to the above?

TIDs of to-be-deleted tuples, actually. Movable tuples aren't stored; they're just found on-the-fly during the back-to-front pass.

regards, tom lane
[PERFORM] Bad n_distinct estimation; hacks suggested?
Folks,

Params: PostgreSQL 8.0.1 on Solaris 10
Statistics = 500
(table names have been changed to protect the NDA)

e1=# select tablename, null_frac, correlation, n_distinct from pg_stats
     where tablename = 'clickstream1' and attname = 'session_id';
  tablename   | null_frac | correlation | n_distinct
--------------+-----------+-------------+------------
 clickstream1 |         0 |    0.412034 |     378174
(2 rows)

e1=# select count(distinct session_id) from clickstream1;
  count
---------
 3174813

As you can see, the n_distinct estimate is off by a factor of 10x and it's causing query planning problems. Any suggested hacks to improve the histogram on this?

(BTW, increasing the stats target to 1000 only doubles n_distinct, and doesn't solve the problem)

-- Josh Berkus
Aglio Database Solutions
San Francisco
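For reference, the per-column statistics target being discussed is set and re-checked like this (the table and column names come from the message above; 500 matches the setting Josh reports):

```sql
-- Raise the sample size ANALYZE uses for this column, then re-analyze
-- and look at the resulting estimate in pg_stats:
ALTER TABLE clickstream1 ALTER COLUMN session_id SET STATISTICS 500;
ANALYZE clickstream1;

SELECT n_distinct
FROM pg_stats
WHERE tablename = 'clickstream1' AND attname = 'session_id';
```

A positive n_distinct is an absolute count estimate; a negative value (e.g. -0.5) means a fraction of the row count.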
Re: [PERFORM] Bad n_distinct estimation; hacks suggested?
> -----Original Message-----
> From: Josh Berkus [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, April 19, 2005 2:09 PM
> To: pgsql-perform
> Subject: [PERFORM] Bad n_distinct estimation; hacks suggested?
>
> [...]
> (BTW, increasing the stats to 1000 only doubles n_distinct,
> and doesn't solve the problem)

Speaking of which, is there a reason why statistics targets are limited to 1000? Performance?

__
David B. Held
Software Engineer/Array Services Group
200 14th Ave. East, Sartell, MN 56377
320.534.3637 320.253.7800 800.752.8129
Re: [PERFORM] Bad n_distinct estimation; hacks suggested?
Josh Berkus writes:
> As you can see, the n_distinct estimate is off by a factor of 10x and it's
> causing query planning problems. Any suggested hacks to improve the
> histogram on this?

What's the histogram itself look like? (I'd like to see the whole pg_stats row, not just part of it ...) There's probably no point in showing the target=1000 version, but maybe target=100 would be informative.

regards, tom lane
Re: [PERFORM] Bad n_distinct estimation; hacks suggested?
Tom, > What's the histogram itself look like? (I'd like to see the whole > pg_stats row not just part of it ...) There's probably no point in > showing the target=1000 version, but maybe target=100 would be > informative. Here is the stats = 100 version. Notice that n_distinct has gone down. schemaname | tablename | attname | null_frac | avg_width | n_distinct | most_common_vals |most_common_freqs | histogram_bounds | correlation +--++---+---++---+--+-+- public | web_site_activity_fa | session_id | 0 | 8 | 96107 | {4393922,6049228,6026260,4394034,60341,4393810,2562999,2573850,3006299,4705488,2561499,4705258,3007378,4705490,60327,60352,2560950,2567640,2569852,3006604,4394329,2570739,2406633,2407292,3006356,4393603,4394121,6449083,2565815,4387881,2406770,2407081,2564340,3007328,2406578,2407295,2562813,2567603,4387835,71014,2566253,2566900,6103079,2289424,2407597,2567627,2568333,3457448,23450,23670,60743,70739,2406818,2406852,2407511,2562816,3007446,6306095,60506,71902,591543,1169136,1447077,2285047,2406830,2573964,6222758,61393,70955,70986,71207,71530,262368,2289213,2406899,2567361,2775952,3006824,4387864,6239825,6244853,6422152,1739,58600,179293,278473,488407,1896390,2286976,2407020,2546720,2677019,2984333,3006133,3007497,3310286,3631413,3801909,4366116,4388025} | {0.0017,0.00146667,0.0013,0.0011,0.00093,0.0009,0.0008,0.0008,0.00073,0.00073,0.0007,0.00063,0.0006,0.0006,0.00057,0.00057,0.00057,0.00057,0.00057,0.00057,0.00057,0.00053,0.0005,0.0005,0.0005,0.0005,0.0005,0.0005,0.00047,0.00047,0.00043,0.00043,0.00043,0.00043,0.0004,0.0004,0.0004,0.0004,0.0004,0.00037,0.00037,0.00037,0.00037,0.00033,0.00033,0.00033,0.00033,0.00033,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.00027,0.00027,0.00027,0.00027,0.00027,0.00027,0.00027,0.00027,0.00027,0.00023,0.00023,0.00023,0.00023,0.00023,0.00023,0.00023,0.00023,0.00023,0.00023,0.00
Re: [PERFORM] [HACKERS] PLM pulling from CVS nightly for testing in STP
I have dbt-2 tests automatically running against each pull from CVS and have started to automatically compile results here:

http://developer.osdl.org/markw/postgrescvs/

I did start with a bit of a minimalistic approach, so I'm open to any comments, feedback, etc.

Mark
Re: [PERFORM] How to improve db performance with $7K?
On Thu, Apr 14, 2005 at 10:51:46AM -0500, Matthew Nuzum wrote:
> So if you all were going to choose between two hard drives where:
> drive A has capacity C and spins at 15K rpm, and
> drive B has capacity 2 x C and spins at 10K rpm, and
> all other features are the same, the price is the same and C is enough
> disk space, which would you choose?
>
> I've noticed that on IDE drives, as the capacity increases the data
> density increases and there is a perceived (I've not measured it)
> performance increase.
>
> Would the increased data density of the higher capacity drive be of
> greater benefit than the faster spindle speed of drive A?

The increased data density will help transfer speed off the platter, but that's it. It won't help rotational latency.

-- Jim C. Nasby, Database Consultant [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828
Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"
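The rotational-latency point can be put in numbers: the average rotational delay is half a revolution, so spindle speed sets a floor on random-access time that data density cannot improve. A quick back-of-the-envelope calculation, runnable in psql:

```sql
-- Average rotational latency = time for half a revolution.
-- (60 s/min) / rpm = seconds per revolution; halve it, convert to ms.
SELECT round(60.0 / 10000 / 2 * 1000, 2) AS ms_10k_rpm,   -- ~3.00 ms
       round(60.0 / 15000 / 2 * 1000, 2) AS ms_15k_rpm;   -- ~2.00 ms
```

So the 15K drive saves about a millisecond per random I/O regardless of platter density, which is why density only helps sequential transfer rate.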
Re: [PERFORM] How to improve db performance with $7K?
On Mon, Apr 18, 2005 at 07:41:49PM +0200, Jacques Caron wrote:
> It would be interesting to actually compare this to real-world (or
> nearly-real-world) benchmarks to measure the effectiveness of features
> like TCQ/NCQ etc.

I was just thinking that it would be very interesting to benchmark different RAID configurations using dbt2. I don't know if this is something that the lab is set up for or capable of, though.

-- Jim C. Nasby, Database Consultant [EMAIL PROTECTED]
Re: [PERFORM] How to improve db performance with $7K?
On Mon, Apr 18, 2005 at 10:20:36AM -0500, Dave Held wrote:
> Hmm... so you're saying that at some point, quantity beats quality?
> That's an interesting point. However, it presumes that you can
> actually distribute your data over a larger number of drives. If
> you have a db with a bottleneck of one or two very large tables,
> the extra spindles won't help unless you break up the tables and
> glue them together with query magic. But it's still a point to
> consider.

Huh? Do you know how RAID10 works?

-- Jim C. Nasby, Database Consultant [EMAIL PROTECTED]
Re: [PERFORM] How to improve db performance with $7K?
On Mon, Apr 18, 2005 at 06:41:37PM -0000, Mohan, Ross wrote:
> Don't you think "optimal stripe width" would be
> a good question to research the binaries for? I'd
> think that drives the answer, largely. (uh oh, pun alert)
>
> E.g., Oracle issues I/O requests (this may have changed _just_
> recently) in 64KB chunks, regardless of what you ask for.
> So when I did my striping (many moons ago, when the Earth
> was young...) I did it in 128KB widths, and set the Oracle
> "multiblock read count" accordingly. For Oracle, any stripe size
> under 64KB = stupid, anything much over 128K/256K = wasteful.
>
> I am eager to find out how PG handles all this.

AFAIK PostgreSQL requests data one database page at a time (normally 8K). Of course the OS might do something different.

-- Jim C. Nasby, Database Consultant [EMAIL PROTECTED]
Re: [PERFORM] How to improve db performance with $7K?
On Tue, Apr 19, 2005 at 11:22:17AM -0500, [EMAIL PROTECTED] wrote:
> > What is 'multiple initiators' used for in the real world?
>
> I asked this same question and got an answer off list: somebody said
> their SAN hardware used multiple initiators. [...]
>
> Multiple initiators means multiple sources on the bus issuing I/O
> instructions to the drives. In theory you can have two computers on the
> same SCSI bus issuing I/O requests to the same drive, or to anything else
> on the bus, but I've never seen this implemented. Others have noted this
> feature as being a big deal, so somebody is benefiting from it.

It's a big deal for Oracle clustering, which relies on shared drives. Of course most people doing Oracle clustering are probably using a SAN and not raw SCSI...

-- Jim C. Nasby, Database Consultant [EMAIL PROTECTED]
Re: [PERFORM] immutable functions vs. join for lookups ?
You should re-run the function test using SQL as the function language instead of plpgsql. There might be some performance to be had there. -- Jim C. Nasby, Database Consultant
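For what it's worth, the reason a SQL-language function can win is that the planner can inline a simple one into the calling query, while plpgsql is always invoked row by row. A hedged sketch of the rewrite (lookup_name and the states table are made-up names, not anything from this thread):

```sql
-- SQL-language version of a trivial lookup; simple enough that the
-- planner may inline it into the calling query.
CREATE FUNCTION lookup_name(integer) RETURNS text AS '
    SELECT name FROM states WHERE id = $1;
' LANGUAGE sql STABLE;
```

Marking it STABLE rather than IMMUTABLE is the honest choice here, since it reads from a table.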
Re: [PERFORM] How to improve postgres performace
On Mon, Apr 18, 2005 at 11:36:01AM -0300, Rodrigo Moreno wrote: > I'm really worried about that, because it's not enough anymore, and users complain > about performance. But after running vacuumdb full, everything starts to run > better again, so i think the problem is not related to a specific query. Vacuum full will skew your results, unless you plan on running vacuum full all the time. This is because you will always have some amount of dead tuples in a table that has any update or delete activity. A regular vacuum doesn't remove these tuples, it just marks their space as available for reuse. So over time, depending on how frequently a table is vacuumed, it will settle down to a steady-state size that is greater than its size after a vacuum full. -- Jim C. Nasby, Database Consultant
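One rough way to watch a table settle toward that steady state (relpages is only refreshed by VACUUM/ANALYZE, and 'mytable' is a placeholder name) is to record its page count after each regular vacuum and compare against the figure taken right after a VACUUM FULL:

```sql
-- Page count and live-tuple estimate as of the last VACUUM/ANALYZE.
SELECT relname, relpages, reltuples
FROM pg_class
WHERE relname = 'mytable';
```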
Re: [PERFORM] Sort and index
On Mon, Apr 18, 2005 at 10:44:43AM -0500, Dave Held wrote: > Since you are fetching the entire table, you are touching all the rows. > If the query were to fetch the rows in index order, it would be seeking > all over the table's tracks. By fetching in sequence order, it has a > much better chance of fetching rows in a way that minimizes head seeks. > Since disk I/O is generally 10-100x slower than RAM, the in-memory sort > can be surprisingly slow and still beat indexed disk access. Of course, > this is only true if the table can fit and be sorted entirely in memory > (which, with 1500 rows, probably can). Actually, the planner (at least in 7.4) isn't smart enough to consider if the sort would fit in memory or not. I'm running a test right now to see if it's actually faster to use an index in this case. -- Jim C. Nasby, Database Consultant
[PERFORM] What to do with 6 disks?
Now that we've hashed out which drives are quicker and more money equals faster... Let's say you had a server with 6 separate 15k RPM SCSI disks, what raid option would you use for a standalone postgres server? a) 3xRAID1 - 1 for data, 1 for xlog, 1 for os? b) 1xRAID1 for OS/xlog, 1xRAID5 for data c) 1xRAID10 for OS/xlog/data d) 1xRAID1 for OS, 1xRAID10 for data e) . I was initially leaning towards b, but after talking to Josh a bit, I suspect that with only 4 disks the raid5 might be a performance detriment vs 3 raid 1s or some sort of split raid10 setup. -- Jeff Frost, Owner <[EMAIL PROTECTED]> Frost Consulting, LLC http://www.frostconsultingllc.com/ Phone: 650-780-7908 FAX: 650-649-1954
Re: [PERFORM] What to do with 6 disks?
http://stats.distributed.net is set up with the OS, WAL, and temp on a RAID1 and the database on a RAID10. The drives are 200G SATA with a 3ware raid card. I don't think the controller has battery-backed cache, but I'm not sure. In any case, it's almost never disk-bound on the mirror; when it's disk-bound it's usually the RAID10. But this is a read-mostly database. If it were write-heavy, that might not be the case. Also, in general, I see very little disk activity from the OS itself, so I don't think there's a large disadvantage to having it on the same drives as part of your database. I would recommend different filesystems for each, though. (ie: not one giant / partition) On Tue, Apr 19, 2005 at 06:00:42PM -0700, Jeff Frost wrote: > Now that we've hashed out which drives are quicker and more money equals > faster... > > Let's say you had a server with 6 separate 15k RPM SCSI disks, what raid > option would you use for a standalone postgres server? > > a) 3xRAID1 - 1 for data, 1 for xlog, 1 for os? > b) 1xRAID1 for OS/xlog, 1xRAID5 for data > c) 1xRAID10 for OS/xlog/data > d) 1xRAID1 for OS, 1xRAID10 for data > e) . > > I was initially leaning towards b, but after talking to Josh a bit, I > suspect that with only 4 disks the raid5 might be a performance detriment > vs 3 raid 1s or some sort of split raid10 setup. -- Jim C. Nasby, Database Consultant
[PERFORM] Slow copy with little CPU/disk usage
A friend of mine has an application where he's copying in 4000 rows at a time into a table that has about 4M rows. Each row is 40-50 bytes. This is taking 25 seconds on a dual PIII-1GHz with 1G of RAM and a 2 disk SATA mirror, running FBSD 4.10-stable. There's one index on the table. What's really odd is that neither the CPU nor the disk is being hammered. The box appears to be pretty idle; the postgresql process is using 4-5% CPU. I seem to recall others running into this before, but I can't remember what the issue was and I can't find it in the archives. This is version 8.0, btw. -- Jim C. Nasby, Database Consultant
Re: [PERFORM] Sort and index
"Jim C. Nasby" <[EMAIL PROTECTED]> writes: > Actually, the planner (at least in 7.4) isn't smart enough to consider > if the sort would fit in memory or not. Really? Have you read cost_sort()? It's certainly possible that the calculation is all wet, but to claim that the issue is not considered is just wrong. regards, tom lane
Re: [PERFORM] What to do with 6 disks?
Jeff, > Let's say you had a server with 6 separate 15k RPM SCSI disks, what raid > option would you use for a standalone postgres server? > > a) 3xRAID1 - 1 for data, 1 for xlog, 1 for os? > b) 1xRAID1 for OS/xlog, 1xRAID5 for data > c) 1xRAID10 for OS/xlog/data > d) 1xRAID1 for OS, 1xRAID10 for data > e) . > > I was initially leaning towards b, but after talking to Josh a bit, I > suspect that with only 4 disks the raid5 might be a performance detriment > vs 3 raid 1s or some sort of split raid10 setup. Knowing that your installation is read-heavy, I'd recommend (d), with the WAL on the same disk as the OS, i.e. RAID1 2 disks OS, pg_xlog RAID 1+0 4 disks pgdata -- --Josh Josh Berkus Aglio Database Solutions San Francisco
Re: [PERFORM] What to do with 6 disks?
My experience: 1xRAID10 for postgres 1xRAID1 for OS + WAL Jeff Frost wrote: Now that we've hashed out which drives are quicker and more money equals faster... Let's say you had a server with 6 separate 15k RPM SCSI disks, what raid option would you use for a standalone postgres server? a) 3xRAID1 - 1 for data, 1 for xlog, 1 for os? b) 1xRAID1 for OS/xlog, 1xRAID5 for data c) 1xRAID10 for OS/xlog/data d) 1xRAID1 for OS, 1xRAID10 for data e) . I was initially leaning towards b, but after talking to Josh a bit, I suspect that with only 4 disks the raid5 might be a performance detriment vs 3 raid 1s or some sort of split raid10 setup.
Re: [PERFORM] Slow copy with little CPU/disk usage
"Jim C. Nasby" <[EMAIL PROTECTED]> writes: > A friend of mine has an application where he's copying in 4000 rows at a > time into a table that has about 4M rows. Each row is 40-50 bytes. This > is taking 25 seconds on a dual PIII-1GHz with 1G of RAM and a 2 disk > SATA mirror, running FBSD 4.10-stable. There's one index on the table. If there are no hidden costs such as foreign key checks, that does seem pretty dang slow. > What's really odd is that neither the CPU nor the disk is being > hammered. The box appears to be pretty idle; the postgresql process is > using 4-5% CPU. It's very hard to believe that *neither* disk nor CPU is maxed. Can we see a reproducible test case, please? regards, tom lane
Re: [PERFORM] Sort and index
On Tue, Apr 19, 2005 at 11:01:26PM -0400, Tom Lane wrote:
> "Jim C. Nasby" <[EMAIL PROTECTED]> writes:
> > Actually, the planner (at least in 7.4) isn't smart enough to consider
> > if the sort would fit in memory or not.
>
> Really? Have you read cost_sort()?
>
> It's certainly possible that the calculation is all wet, but to claim
> that the issue is not considered is just wrong.

To be fair, no, I haven't looked at the code. This is based strictly on anecdotal evidence on a 120M row table. I'm currently running a test to see how an index scan compares to a seqscan. I also got the same results when I added a where clause that would restrict it to about 7% of the table.

Actually, after running some tests (below), the plan cost does change when I change sort_mem (it was originally 5).

stats=# \d email_contrib
      Table "public.email_contrib"
   Column   |  Type   | Modifiers
------------+---------+-----------
 project_id | integer | not null
 id         | integer | not null
 date       | date    | not null
 team_id    | integer |
 work_units | bigint  | not null
Indexes:
    "email_contrib_pkey" primary key, btree (project_id, id, date)
    "email_contrib__pk24" btree (id, date) WHERE (project_id = 24)
    "email_contrib__pk25" btree (id, date) WHERE (project_id = 25)
    "email_contrib__pk8" btree (id, date) WHERE (project_id = 8)
    "email_contrib__project_date" btree (project_id, date)
Foreign-key constraints:
    "fk_email_contrib__id" FOREIGN KEY (id) REFERENCES stats_participant(id) ON UPDATE CASCADE
    "fk_email_contrib__team_id" FOREIGN KEY (team_id) REFERENCES stats_team(team) ON UPDATE CASCADE

stats=# explain select * from email_contrib where project_id=8 order by project_id, id, date;
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Sort  (cost=3613476.05..3635631.71 rows=8862263 width=24)
   Sort Key: project_id, id, date
   ->  Seq Scan on email_contrib  (cost=0.00..2471377.50 rows=8862263 width=24)
         Filter: (project_id = 8)
(4 rows)

stats=# explain select * from email_contrib order by project_id, id, date;
                                   QUERY PLAN
----------------------------------------------------------------------------------
 Sort  (cost=25046060.83..25373484.33 rows=130969400 width=24)
   Sort Key: project_id, id, date
   ->  Seq Scan on email_contrib  (cost=0.00..2143954.00 rows=130969400 width=24)
(3 rows)

stats=# select 8862263::float/130969400;
    ?column?
----------------
 0.067687027657
(1 row)

stats=# explain select * from email_contrib where project_id=8 order by project_id, id, date;
                                          QUERY PLAN
-----------------------------------------------------------------------------------------------------
 Index Scan using email_contrib_pkey on email_contrib  (cost=0.00..6832005.57 rows=8862263 width=24)
   Index Cond: (project_id = 8)
(2 rows)

stats=# explain select * from email_contrib order by project_id, id, date;
                                           QUERY PLAN
--------------------------------------------------------------------------------------------------------
 Index Scan using email_contrib_pkey on email_contrib  (cost=0.00..100055905.62 rows=130969400 width=24)
(1 row)

stats=# set enable_seqscan=on;
SET
stats=# set sort_mem=1000;
SET
stats=# explain select * from email_contrib order by project_id, id, date;
                                   QUERY PLAN
----------------------------------------------------------------------------------
 Sort  (cost=28542316.63..28869740.13 rows=130969400 width=24)
   Sort Key: project_id, id, date
   ->  Seq Scan on email_contrib  (cost=0.00..2143954.00 rows=130969400 width=24)
(3 rows)

stats=#
-- Jim C. Nasby, Database Consultant
Re: [PERFORM] Slow copy with little CPU/disk usage
"Jim C. Nasby" <[EMAIL PROTECTED]> writes: > What's really odd is that neither the CPU or the disk are being > hammered. The box appears to be pretty idle; the postgresql proces is > using 4-5% CPU. Is he committing every row? In that case you would see fairly low i/o bandwidth usage because most of the time is being spent seeking and waiting for rotational latency. -- greg
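A back-of-envelope check of that theory, assuming a 7200 RPM drive with no write caching and roughly one fsync'd commit per platter rotation (both assumptions, not measurements): 120 rotations per second caps you at about 120 commits/sec, so 4000 single-row commits would take on the order of 33 seconds, the same ballpark as the 25 seconds observed.

```sql
-- Rough commit-rate ceiling for per-row commits on a 7200 RPM disk.
SELECT 7200 / 60 AS commits_per_sec,         -- 120 rotations/sec
       4000 / (7200 / 60.0) AS est_seconds;  -- about 33 seconds
```

Batching the 4000 rows into one transaction (or one COPY) pays that rotational cost once instead of 4000 times.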
Re: [PERFORM] Slow copy with little CPU/disk usage
Quoting Tom Lane <[EMAIL PROTECTED]>: > "Jim C. Nasby" <[EMAIL PROTECTED]> writes: > > A friend of mine has an application where he's copying in 4000 rows at a > > time into a table that has about 4M rows. Each row is 40-50 bytes. This > > is taking 25 seconds on a dual PIII-1GHz with 1G of RAM and a 2 disk > > SATA mirror, running FBSD 4.10-stable. There's one index on the table. > > If there's no hidden costs such as foreign key checks, that does seem > pretty dang slow. > > > What's really odd is that neither the CPU or the disk are being > > hammered. The box appears to be pretty idle; the postgresql proces is > > using 4-5% CPU. -- This sounds EXACTLY like my problem, if you change the box to a Xeon 2.4GHz, 2GB RAM ... with two SCSI drives (xlog and base); loading 10K rows of about 200 bytes each takes about 20 secs at best, and much longer at worst. By any chance does your friend have several client machines/processes trying to mass-load rows at the same time? Or at least some other processes updating that table in a bulkish way? What I get is low disk I/O, low CPU, even low context-switches ... and I'm betting he should take a look at pg_locks. For my own problem, I gather that an exclusive lock is necessary while updating indexes and heap, and the multiple processes doing the update can make that pathological. Anyway, have your friend check pg_locks. "Dreams come true, not free." -- S.Sondheim, ITW
Re: [PERFORM] Opteron vs Xeon (Was: What to do with 6 disks?)
RAID1 2 disks OS, pg_xlog RAID 1+0 4 disks pgdata Looks like the consensus is RAID 1 for OS, pg_xlog and RAID10 for pgdata. Now here's another performance related question: I've seen quite a few folks touting the Opteron as 2.5x faster with postgres than a Xeon box. What makes the Opteron so quick? Is it that Postgres really prefers to run in 64-bit mode? When I look at AMD's TPC-C scores where they are showing off the Opteron http://www.amd.com/us-en/Processors/ProductInformation/0,,30_118_8796_8800~96125,00.html It doesn't appear 2.5x as fast as the Xeon systems, though I have heard from a few Postgres folks that a dual Opteron is 2.5x as fast as a dual Xeon. I would think that AMD would be all over that press if they could show it, so what am I missing? Is it a bus speed thing? Better south bridge on the boards? -- Jeff Frost, Owner <[EMAIL PROTECTED]>
[PERFORM] How to tell what your postgresql server is doing
Is there a way to look at the stats tables and tell what is jamming up your postgres server the most? Other than seeing long running queries and watching top, atop, iostat, vmstat in separate xterms...I'm wondering if postgres keeps some stats on what it spends the most time doing or if there's a way to extract that sort of info from other metrics it keeps in the stats table? Maybe a script which polls the stats table and correlates the info with stats about the system in /proc? -- Jeff Frost, Owner <[EMAIL PROTECTED]>
Re: [PERFORM] How to tell what your postgresql server is doing
Is there a way to look at the stats tables and tell what is jamming up your postgres server the most? Other than seeing long running queries and watch top, atop, iostat, vmstat in separate xterms...I'm wondering if postgres keeps some stats on what it spends the most time doing or if there's a way to extract that sort of info from other metrics it keeps in the stats table? Maybe a script which polls the stats table and correlates the info with stats about the system in /proc? Turn on logging of all queries, sample for a few hours or one day. Then run Practical Query Analyzer (PQA on pgfoundry.org) over it to get aggregate query information. Chris
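The logging step might look like this in postgresql.conf (8.0 syntax; earlier releases used a boolean log_statement, so adjust for your version):

```
log_statement = 'all'   # log every statement
log_duration = true     # and how long each one took
```

Remember to turn it back off after the sampling window; logging every statement on a busy server is not free.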
Re: [PERFORM] Opteron vs Xeon (Was: What to do with 6 disks?)
I've seen quite a few folks touting the Opteron as 2.5x faster with postgres than a Xeon box. What makes the Opteron so quick? Is it that Postgres really prefers to run in 64-bit mode? I don't know about 2.5x faster (perhaps on specific types of loads), but the reason Opterons rock for database applications is their insanely good memory bandwidth and latency that scales much better than the Xeon. Opterons also have a ccNUMA-esque I/O fabric and two dedicated on-die memory channels *per processor* -- no shared bus there, closer to real UNIX server iron than a glorified PC. We run a large Postgres database on a dual Opteron in 32-bit mode that crushes Xeons running at higher clock speeds. It has little to do with bitness or theoretical instruction dispatch, and everything to do with the superior memory controller and I/O fabric. Databases are all about moving chunks of data around and the Opteron systems were engineered to do this very well and in a very scalable fashion. For the money, it is hard to argue with the price/performance of Opteron based servers. We started with one dual Opteron postgres server just over a year ago (with an equivalent uptime) and have considered nothing but Opterons for database servers since. Opterons really are clearly superior to Xeons for this application. I don't work for AMD, just a satisfied customer. :-) re: 6 disks. Unless you are tight on disk space, a hot spare might be nice as well depending on your needs. Cheers, J. Andrew Rogers
Re: [PERFORM] Opteron vs Xeon (Was: What to do with 6 disks?)
On Tue, 19 Apr 2005, J. Andrew Rogers wrote: I don't know about 2.5x faster (perhaps on specific types of loads), but the reason Opterons rock for database applications is their insanely good memory bandwidth and latency that scales much better than the Xeon. Opterons also have a ccNUMA-esque I/O fabric and two dedicated on-die memory channels *per processor* -- no shared bus there, closer to real UNIX server iron than a glorified PC. Thanks J! That's exactly what I was suspecting it might be. Actually, I found an anandtech benchmark that shows the Opteron coming in at close to 2.0x performance: http://www.anandtech.com/linux/showdoc.aspx?i=2163&p=2 It's an Opteron 150 (2.4ghz) vs. Xeon 3.6ghz from August. I wonder if the differences are more pronounced with the newer Opterons. -Jeff