Re: [PERFORM] possible improvement between G4 and G5
There are a few things that you can do to help force yourself to be I/O bound. These include:

- RAID 5 for write intensive applications, since multiple writes per synch write is good. (There is a special case for logging or other streaming sequential writes on RAID 5.)

- Data journaling file systems are helpful in stress testing your checkpoints.

- Using midsized battery backed up write-through buffering controllers. In general, if you have a small cache, you see the problem directly, and a huge cache will balance out load and defer writes to quieter times. That is why a midsized cache is so useful in showing stress in your system only when it is being stressed.

Only partly in jest,
/Aaron

BTW - I am truly curious about what happens to your system if you use separate RAID 0+1 for your logs, disk sorts, and at least the most active tables. This should reduce I/O load by an order of magnitude.

"Vivek Khera" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> > "JB" == Josh Berkus <[EMAIL PROTECTED]> writes:
>
> JB> Aaron,
>
> >> I do consulting, so they're all over the place and tend to be complex.
> >> Very few fit in RAM, but still are very buffered. These are almost all
> >> backed with very high end I/O subsystems, with dozens of spindles with
> >> battery backed up writethrough cache and gigs of buffers, which may be
> >> why I worry so much about CPU. I have had this issue with multiple
> >> servers.
>
> JB> Aha, I think this is the difference. I never seem to be able to
> JB> get my clients to fork out for adequate disk support. They are
> JB> always running off single or double SCSI RAID in the host server;
> JB> not the sort of setup you have.
>
> Even when I upgraded my system to a 14-spindle RAID5 with 128M cache
> and 4GB RAM on a dual Xeon system, I still wind up being I/O bound
> quite often.
>
> I think it depends on what your "working set" turns out to be. My
> workload really spans a lot more of the DB than I can end up caching.
>
> --
> Vivek Khera, Ph.D.                Khera Communications, Inc.
> Internet: [EMAIL PROTECTED]       Rockville, MD  +1-301-869-4449 x806
> AIM: vivekkhera  Y!: vivek_khera  http://www.khera.org/~vivek/

---(end of broadcast)---
TIP 6: Have you searched our list archives?
   http://archives.postgresql.org
Re: [PERFORM] possible improvement between G4 and G5
> "JB" == Josh Berkus <[EMAIL PROTECTED]> writes:

JB> Aaron,

>> I do consulting, so they're all over the place and tend to be complex.
>> Very few fit in RAM, but still are very buffered. These are almost all
>> backed with very high end I/O subsystems, with dozens of spindles with
>> battery backed up writethrough cache and gigs of buffers, which may be
>> why I worry so much about CPU. I have had this issue with multiple
>> servers.

JB> Aha, I think this is the difference. I never seem to be able to
JB> get my clients to fork out for adequate disk support. They are
JB> always running off single or double SCSI RAID in the host server;
JB> not the sort of setup you have.

Even when I upgraded my system to a 14-spindle RAID5 with 128M cache
and 4GB RAM on a dual Xeon system, I still wind up being I/O bound
quite often.

I think it depends on what your "working set" turns out to be. My
workload really spans a lot more of the DB than I can end up caching.

--
Vivek Khera, Ph.D.                Khera Communications, Inc.
Internet: [EMAIL PROTECTED]       Rockville, MD  +1-301-869-4449 x806
AIM: vivekkhera  Y!: vivek_khera  http://www.khera.org/~vivek/
Re: [PERFORM] possible improvement between G4 and G5
Aaron,

> I do consulting, so they're all over the place and tend to be complex. Very
> few fit in RAM, but still are very buffered. These are almost all backed
> with very high end I/O subsystems, with dozens of spindles with battery
> backed up writethrough cache and gigs of buffers, which may be why I worry
> so much about CPU. I have had this issue with multiple servers.

Aha, I think this is the difference. I never seem to be able to get my clients to fork out for adequate disk support. They are always running off single or double SCSI RAID in the host server; not the sort of setup you have.

> What my CPU tends to be doing is a combination of general processing,
> complex SQL processing: nested loops and sorting and hashing and triggers
> and SPs.

I haven't noticed SPs to be particularly CPU-hoggish, more RAM.

> I'm curious about you having flat CPU, which is not my experience. Are your
> apps mature and stable?

Well, "flat" was a bit of an exaggeration ... there are spikes ... but average CPU load is < 30%. I think the difference is that your clients listen to you about disk access. Mine are all too liable to purchase a quad-Xeon machine but with an Adaptec RAID-5 card with 4 drives, and *then* call me and ask for advice. As a result, most intensive operations don't tend to swamp the CPU because they are waiting for disk.

I have noticed the limitations on RAM for 64 vs. 32, as I find it easier to convince a client to get 8GB RAM than four-channel RAID with 12 drives, mostly because the former is cheaper. Linux 2.4 + Bigmem just doesn't cut it for making effective use of > 3GB of RAM.

--
Josh Berkus
Aglio Database Solutions
San Francisco
Re: [PERFORM] possible improvement between G4 and G5
----- Original Message -----
From: "Josh Berkus" <[EMAIL PROTECTED]>
To: "Aaron Werman" <[EMAIL PROTECTED]>; "Qing Zhao" <[EMAIL PROTECTED]>; "Tom Lane" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Tuesday, April 06, 2004 2:52 PM
Subject: Re: [PERFORM] possible improvement between G4 and G5

> Aaron,
>
> > I'm surprised by this thought. I tend to hit CPU bottlenecks more often
> > than I/O ones. In most applications, db I/O is a combination of buffer
> > misses and logging, which are both reasonably constrained.
>
> Not my experience at all. In fact, the only times I've seen modern
> platforms max out the CPU was when:
> a) I had bad queries with bad plans, or
> b) I had reporting queries that did a lot of calculation for display
>    (think OLAP).
>
> Otherwise, on the numerous servers I administrate, RAM spikes, and I/O
> bottlenecks, but the CPU stays almost flat.
>
> Of course, most of my apps are large databases (i.e. too big for RAM)
> with a heavy transaction-processing component.
>
> What kind of applications are you running?
>
> --
> -Josh Berkus
> Aglio Database Solutions
> San Francisco

I do consulting, so they're all over the place and tend to be complex. Very few fit in RAM, but still are very buffered. These are almost all backed with very high end I/O subsystems, with dozens of spindles with battery backed up writethrough cache and gigs of buffers, which may be why I worry so much about CPU. I have had this issue with multiple servers.

Consider an analysis db with 10G data. Of that, 98% of the access is read and only 2% write (that is normal for almost anything that is not order entry, even transaction processing with thorough cross validation). Almost all the queries access 10%, or 1G, of the data. The reads average ~3-level b-trees, with the first 2 levels certainly cached, and the last ones often cached. Virtually all the I/O activity is logical reads against buffer.

A system with 100 transactions which on average access 200 rows does 98% of 200 rows x 100 transactions x 3 logical I/Os per read = 58,800 logical reads, of which maybe a hundred are actually physical reads. It also does 2% of 200 rows x 100 transactions x (1 table logical write and say 2 index logical writes) per write = 1,200 logical writes to log, of which there are 100 transaction commit synch writes, and in reality fewer than that because of queuing against logs (there are also 1,200 logical writes deferred to checkpoint, of which there are likely to be only 40 physical writes because of page overlaps).

Transaction processing is a spectrum between activity logging and database-centric design. The former, where actions are simply stored in the database, is totally I/O bound, with the engine acting as a thin layer of logical-to-physical mapping. Database-centric processing makes the engine a functional server of discrete actions - and is a big CPU hog.

What my CPU tends to be doing is a combination of general processing and complex SQL processing: nested loops and sorting and hashing and triggers and SPs. I'm curious about you having flat CPU, which is not my experience. Are your apps mature and stable?

/Aaron
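[For the record, the arithmetic above checks out; here is the same back-of-the-envelope calculation as a quick shell sketch, using only the hypothetical workload numbers from the message - 100 transactions, 200 rows each, 3 logical I/Os per read, a 98/2 read/write split:]

```shell
# Hypothetical workload from the example above (not measured figures)
txns=100          # transactions
rows=200          # rows accessed per transaction
ios=3             # logical I/Os per read (b-tree depth)

# 98% reads: logical reads against buffer
reads=$(( rows * txns * ios * 98 / 100 ))
echo "logical reads:  $reads"      # 58800

# 2% writes: 1 table write + 2 index writes per written row
writes=$(( rows * txns * 3 * 2 / 100 ))
echo "logical writes: $writes"     # 1200
```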
Re: [PERFORM] possible improvement between G4 and G5
Aaron,

> I'm surprised by this thought. I tend to hit CPU bottlenecks more often
> than I/O ones. In most applications, db I/O is a combination of buffer
> misses and logging, which are both reasonably constrained.

Not my experience at all. In fact, the only times I've seen modern platforms max out the CPU were when:
a) I had bad queries with bad plans, or
b) I had reporting queries that did a lot of calculation for display (think OLAP).

Otherwise, on the numerous servers I administrate, RAM spikes and I/O bottlenecks, but the CPU stays almost flat.

Of course, most of my apps are large databases (i.e. too big for RAM) with a heavy transaction-processing component.

What kind of applications are you running?

--
-Josh Berkus
Aglio Database Solutions
San Francisco
Re: [PERFORM] possible improvement between G4 and G5
----- Original Message -----
From: "Tom Lane" <[EMAIL PROTECTED]>
To: "Qing Zhao" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Tuesday, April 06, 2004 1:47 AM
Subject: Re: [PERFORM] possible improvement between G4 and G5

> Qing Zhao <[EMAIL PROTECTED]> writes:
> > We have got a G5 64-bit processor to replace an old G4 32-bit
> > processor. Given everything else equal, should we see a big
> > improvement on PG's performance?
>
> Nope. Database performance typically depends on disk performance first,
> and RAM size second.

I'm surprised by this thought. I tend to hit CPU bottlenecks more often than I/O ones. In most applications, db I/O is a combination of buffer misses and logging, which are both reasonably constrained. RAM size seems to me to be the best way to improve performance; then CPU, which is needed to perform the in-memory searching, locking, versioning, and processing; and finally I/O (this is not the case in small I/O subsystems - if you have fewer than a dozen drives, you're easily I/O bound). I/O is often the thing I tune first, because I can tune it in place without buying hardware.

Conceptually, an RDBMS converts slow random I/O into in-memory processing and sequential logging writes. If successful, it should reduce the I/O overhead.

/Aaron
Re: [PERFORM] possible improvement between G4 and G5
On Tue, Apr 06, 2004 at 01:47:22AM -0400, Tom Lane wrote:
> Qing Zhao <[EMAIL PROTECTED]> writes:
> > We have got a G5 64-bit processor to replace an old G4 32-bit
> > processor. Given everything else equal, should we see a big
> > improvement on PG's performance?
>
> Nope. Database performance typically depends on disk performance first,
> and RAM size second. A 64-bit processor might help by allowing you to
> install more RAM, but you didn't say that you had.

Memory bandwidth is a consideration too, so you might see some performance improvement on a G5. We recently debated between Xeons and Opterons in a new PGSQL server, and a little poking around on the lists indicated that the Opterons did perform better, presumably due to the increased memory bandwidth. Incidentally, this is why you need about 2x the CPUs on Sun hardware vs. RS6000 hardware for database stuff (and that gets expensive if you're paying per CPU!).

--
Jim C. Nasby, Database Consultant            [EMAIL PROTECTED]
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"
Re: [PERFORM] possible improvement between G4 and G5
Qing Zhao <[EMAIL PROTECTED]> writes:
> We have got a G5 64-bit processor to replace an old G4 32-bit
> processor. Given everything else equal, should we see a big
> improvement on PG's performance?

Nope. Database performance typically depends on disk performance first, and RAM size second. A 64-bit processor might help by allowing you to install more RAM, but you didn't say that you had.

> The other question I have is that, when I tried different sizes for
> shared_buffers (I used 10,000, 1,000, 528, 256) and max
> connections=32, it gave me an error when I tried to start PG using
> pg_ctl start as postgres. It kept saying this is bigger than the
> system Shared Memory.

Out-of-the-box, Mac OS X has a very low SHMMAX limit. See the PG admin docs or the mail list archives for how to increase it. You should do this --- most people find that you want to set shared_buffers to 1000 or 10000 or so for best performance.

regards, tom lane
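[Editor's note: for readers hitting the same startup error, a sketch of the kind of adjustment Tom is pointing at, assuming the kern.sysv.* sysctl names that Mac OS X of this vintage used; the values shown are illustrative, not a recommendation - consult the PG admin docs for figures appropriate to your RAM:]

```shell
# Inspect the current System V shared memory ceiling (bytes)
sysctl kern.sysv.shmmax

# Raise it, e.g. to 128 MB; shmall is counted in 4 kB pages,
# so 134217728 / 4096 = 32768. On OS X of this era the kern.sysv
# settings had to be applied at boot (e.g. from /etc/rc) to stick.
sysctl -w kern.sysv.shmmax=134217728
sysctl -w kern.sysv.shmall=32768
```

With the kernel limit raised, a shared_buffers setting like the 10,000 Qing tried should fit within SHMMAX and postmaster should start.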