Re: [SPAM?] Re: [PERFORM] PG8 Tuning
Jeff, 4) pg_xlog: If you're pg_xlog is on a spindle is *only* for pg_xlog you're better off. Like Mr. Stone said earlier, this is pure dogma. In my experience, xlogs on the same volume with data is much faster if both are on battery-backed write-back RAID controller memory. Moving from this situation to xlogs on a single normal disk is going to be much slower in most cases. The advice on separate drives for xlog (as is all advice on that web page) is based on numerous, repeatable tests at OSDL. However, you are absolutely correct in that it's *relative* advice, not absolute advice. If, for example, you're using a $100,000 EMC SAN as your storage you'll probably be better off giving it everything and letting its controller and cache handle disk allocation etc. On the other hand, if you're dealing with the 5 drives in a single Dell 6650, I've yet to encounter a case where a separate xlog disk did not benefit an OLTP application. For Solaris, the advantage of using a separate disk or partition is that the mount options you want for the xlog (including forcedirectio) are considerably different from what you'd use with the main database. -- Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [SPAM?] Re: [PERFORM] PG8 Tuning
On Tue, Aug 16, 2005 at 09:12:31AM -0700, Josh Berkus wrote: However, you are absolutely correct in that it's *relative* advice, not absolute advice. If, for example, you're using a $100,000 EMC SAN as your storage you'll probably be better off giving it everything and letting its controller and cache handle disk allocation etc. On the other hand, if you're dealing with the 5 drives in a single Dell 6650, I've yet to encounter a case where a separate xlog disk did not benefit an OLTP application. I've been asked this a couple of times and I don't know the answer: what happens if you give XLog a single drive (unmirrored single spindle), and that drive dies? So the question really is, should you be giving two disks to XLog? -- Alvaro Herrera (alvherre[a]alvh.no-ip.org) [PostgreSQL] is a great group; in my opinion it is THE best open source development communities in existence anywhere.(Lamar Owen) ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [SPAM?] Re: [PERFORM] PG8 Tuning
I've been asked this a couple of times and I don't know the answer: what happens if you give XLog a single drive (unmirrored single spindle), and that drive dies? So the question really is, should you be giving two disks to XLog? If that drive dies your restoring from backup. You would need to run at least RAID 1, preferrably RAID 10. Sincerely, Joshua D. Drkae -- Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240 PostgreSQL Replication, Consulting, Custom Programming, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [SPAM?] Re: [PERFORM] PG8 Tuning
On Tue, Aug 16, 2005 at 09:12:31AM -0700, Josh Berkus wrote: However, you are absolutely correct in that it's *relative* advice, not absolute advice. If, for example, you're using a $100,000 EMC SAN as your storage you'll probably be better off giving it everything and letting its controller and cache handle disk allocation etc. Well, you don't have to spend *quite* that much to get a decent storage array. :) On the other hand, if you're dealing with the 5 drives in a single Dell 6650, I've yet to encounter a case where a separate xlog disk did not benefit an OLTP application. IIRC, that's an older raid controller that tops out at 128MB write cache, and 5 spindles ain't a lot--so it makes sense that it would benefit from a seperate spindle for xlog. Also note that I said the write cache advice goes out the window if you have a workload that involves constant writing (or if your xlog writes come in faster than your write cache can drain) because at that point you essentially drop back to raw disk speed; I assume the OLTP apps you mention are fairly write-intensive. OTOH, in a reasonably safe configuration I suppose you'd end up with a 3 disk raid5 / 2 disk raid1 or 2 raid 1 pairs on that dell 6650; is that how you test? Once you're down to that small a data set I'd expect the system's ram cache to be a much larger percentage of the working set, which would tend to make the xlog just about the *only* latency-critical i/o. That's a different creature from a data mining app that might really benefit from having additional spindles to accelerate read performance from indices much larger than RAM. At any rate, this just underscores the need for testing a particular workload on particular hardware. Things like the disk speed, raid configuration, write cache size, transaction size, data set size, working set size, concurrent transactions, read vs write emphasis, etc., are each going to have a fairly large impact on performance. For Solaris, the advantage of using a separate disk or partition is that the mount options you want for the xlog (including forcedirectio) are considerably different from what you'd use with the main database. Yeah, having a seperate partition is often good even if you do have everything on the same disks. Mike Stone ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [SPAM?] Re: [PERFORM] PG8 Tuning
On Aug 11, 2005, at 12:58 PM, Jeffrey W. Baker wrote: Like Mr. Stone said earlier, this is pure dogma. In my experience, xlogs on the same volume with data is much faster if both are on battery-backed write-back RAID controller memory. Moving from this situation to xlogs on a single normal disk is going to be much slower in most cases. This does also point one important point about performance. Which is a touch unfortunate (and expensive to test): Your milage may vary on any of these improvements. Some people have 0 problems and incredible performance with say, 1000 shared_bufs and the WAL on the same disk.. Others need 10k shared bufs and wal split over a 900 spindle raid with data spread across 18 SAN's... Unfortunately there is no one true way :( The best bet (which is great if you can): Try out various settings.. if you still run into problems look into some more hardware.. see if you can borrow any or fabricate a poor man's equivalent for testing. -- Jeff Trout [EMAIL PROTECTED] http://www.jefftrout.com/ http://www.stuarthamm.net/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[SPAM?] Re: [PERFORM] PG8 Tuning
Paul, Before I say anything else, one online document which may be of assistance to you is: http://www.powerpostgresql.com/PerfList/ Some thoughts I have: 3) You're shared RAM setting seems overkill to me. Part of the challenge is you're going from 1000 to 262K with no assessment in between. Each situation can be different, but try in the range of 10 - 50K. 4) pg_xlog: If you're pg_xlog is on a spindle is *only* for pg_xlog you're better off. If it is sharing with any other OS/DB resource, the performance will be impacted. From what I have learned from others on this list, RAID5 is not the best choice for the database. RAID10 would be a better solution (using 8 of your disks) then take the remaining disk and do mirror with your pg_xlog if possible. Best of luck, Steve Poe On Thu, 2005-08-11 at 13:23 +0100, Paul Johnson wrote: Hi all, we're running PG8 on a Sun V250 with 8GB RAM and 2*1.3GHz SPARC CPUs running Solaris 10. The DB cluster is on an external fibre-attached Sun T3 array that has 9*36GB drives configured as a single RAID5 LUN. The system is for the sole use of a couple of data warehouse developers, hence we are keen to use 'aggressive' tuning options to maximise performance. So far we have made the following changes and measured the impact on our test suite: 1) Increase checkpoint_segments from 3 to 64. This made a 10x improvement in some cases. 2) Increase work_mem from 1,024 to 524,288. 3) Increase shared_buffers from 1,000 to 262,143 (2 GB). This required setting SHMMAX=4294967295 (4 GB) in /etc/system and re-booting the box. Question - can Postgres only use 2GB RAM, given that shared_buffers can only be set as high as 262,143 (8K pages)? So far so good... 4) Move /pg_xlog to an internal disk within the V250. This has had a severe *negative* impact on performance. Copy job has gone from 2 mins to 12 mins, simple SQL job gone from 1 min to 7 mins. Not even run long SQL jobs. I'm guessing that this is because pg_xlog has gone from a 9 spindle LUN to a single spindle disk? In cases such as this, where an external storage array with a hardware RAID controller is used, the normal advice to separate the data from the pg_xlog seems to come unstuck, or are we missing something? Cheers, Paul Johnson. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [SPAM?] Re: [PERFORM] PG8 Tuning
On Fri, 2005-08-12 at 08:47 +, Steve Poe wrote: Paul, Before I say anything else, one online document which may be of assistance to you is: http://www.powerpostgresql.com/PerfList/ Some thoughts I have: 3) You're shared RAM setting seems overkill to me. Part of the challenge is you're going from 1000 to 262K with no assessment in between. Each situation can be different, but try in the range of 10 - 50K. 4) pg_xlog: If you're pg_xlog is on a spindle is *only* for pg_xlog you're better off. Like Mr. Stone said earlier, this is pure dogma. In my experience, xlogs on the same volume with data is much faster if both are on battery-backed write-back RAID controller memory. Moving from this situation to xlogs on a single normal disk is going to be much slower in most cases. -jwb ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [SPAM?] Re: [PERFORM] PG8 Tuning
I think the T-3 RAID at least breaks some of these rules -- I've got 2 T-3's, 1 configured as RAID-10 and the other as RAID5, and they both seem to perform about the same. I use RAID5 with a hot spare, so it's using 8 spindles. I got a lot of performance improvement out of mount the fs noatime and turning journaling off. Of course it takes a *long* time to recover from a crash. Steve Poe wrote: Paul, Before I say anything else, one online document which may be of assistance to you is: http://www.powerpostgresql.com/PerfList/ Some thoughts I have: 3) You're shared RAM setting seems overkill to me. Part of the challenge is you're going from 1000 to 262K with no assessment in between. Each situation can be different, but try in the range of 10 - 50K. 4) pg_xlog: If you're pg_xlog is on a spindle is *only* for pg_xlog you're better off. If it is sharing with any other OS/DB resource, the performance will be impacted. From what I have learned from others on this list, RAID5 is not the best choice for the database. RAID10 would be a better solution (using 8 of your disks) then take the remaining disk and do mirror with your pg_xlog if possible. Best of luck, Steve Poe On Thu, 2005-08-11 at 13:23 +0100, Paul Johnson wrote: Hi all, we're running PG8 on a Sun V250 with 8GB RAM and 2*1.3GHz SPARC CPUs running Solaris 10. The DB cluster is on an external fibre-attached Sun T3 array that has 9*36GB drives configured as a single RAID5 LUN. The system is for the sole use of a couple of data warehouse developers, hence we are keen to use 'aggressive' tuning options to maximise performance. So far we have made the following changes and measured the impact on our test suite: 1) Increase checkpoint_segments from 3 to 64. This made a 10x improvement in some cases. 2) Increase work_mem from 1,024 to 524,288. 3) Increase shared_buffers from 1,000 to 262,143 (2 GB). This required setting SHMMAX=4294967295 (4 GB) in /etc/system and re-booting the box. Question - can Postgres only use 2GB RAM, given that shared_buffers can only be set as high as 262,143 (8K pages)? So far so good... 4) Move /pg_xlog to an internal disk within the V250. This has had a severe *negative* impact on performance. Copy job has gone from 2 mins to 12 mins, simple SQL job gone from 1 min to 7 mins. Not even run long SQL jobs. I'm guessing that this is because pg_xlog has gone from a 9 spindle LUN to a single spindle disk? In cases such as this, where an external storage array with a hardware RAID controller is used, the normal advice to separate the data from the pg_xlog seems to come unstuck, or are we missing something? Cheers, Paul Johnson. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq