Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++
apologies for such a late response to this thread, but there is something I think is _really_ dangerous here.

On Thu, 18 Aug 2011, Aidan Van Dyk wrote:

On Thu, Aug 18, 2011 at 1:35 AM, Craig Ringer ring...@ringerc.id.au wrote: On 18/08/2011 11:48 AM, Ogden wrote: Isn't this very dangerous? I have the Dell PERC H700 card - I see that it has 512Mb Cache. Is this the same thing and good enough to switch to nobarrier? Just worried if a sudden power shut down, then data can be lost on this option.

Yeah, I'm confused by that too. Shouldn't a write barrier flush data to persistent storage - in this case, the RAID card's battery backed cache? Why would it force a RAID controller cache flush to disk, too?

The barrier is the linux fs/block way of saying these writes need to be on persistent media before I can depend on them. On typical spinning media disks, that means out of the disk cache (which is not persistent) and on platters. The way it assures that the writes are on persistent media is with a flush cache type of command. The flush cache is a close approximation to make sure it's persistent. If your cache is battery backed, it is now persistent, and there is no need to flush cache, hence the nobarrier option if you believe your cache is persistent. Now, make sure that even though your raid cache is persistent, your disks have their cache in write-through mode, because it would suck for your raid cache to work, but believe the data is safely on disk, and only find out that it was in the disks' (small) cache, and your raid is out of sync after an outage because of that... I believe most raid cards will handle that correctly for you automatically.

if you don't have barriers enabled, the data may not get written out of main memory to the battery backed memory on the card as the OS has no reason to do the write out of the OS buffers now rather than later. Every raid card I have seen has ignored the 'flush cache' type of command if it has a battery and that battery is good, so you leave the barriers enabled and the card still gives you great performance.

David Lang

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
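For the "make sure the disks themselves are in write-through mode" part, a rough sketch of how one might check from the OS, assuming the drives are visible directly rather than hidden behind the RAID controller (behind a PERC/MegaRAID-style card you would use the controller's own tool for this instead; device names here are only placeholders):

    hdparm -W /dev/sda            # report a SATA drive's write-cache setting
    hdparm -W 0 /dev/sda          # force write-through on that drive
    sdparm --get=WCE /dev/sda     # same check for a SAS drive (WCE = write cache enable)
    sdparm --set=WCE=0 /dev/sda   # and disable its volatile write cache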
Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++
On Mon, Sep 12, 2011 at 6:57 PM, da...@lang.hm wrote:

The barrier is the linux fs/block way of saying these writes need to be on persistent media before I can depend on them. On typical spinning media disks, that means out of the disk cache (which is not persistent) and on platters. The way it assures that the writes are on persistent media is with a flush cache type of command. The flush cache is a close approximation to make sure it's persistent. If your cache is battery backed, it is now persistent, and there is no need to flush cache, hence the nobarrier option if you believe your cache is persistent. Now, make sure that even though your raid cache is persistent, your disks have their cache in write-through mode, because it would suck for your raid cache to work, but believe the data is safely on disk, and only find out that it was in the disks' (small) cache, and your raid is out of sync after an outage because of that... I believe most raid cards will handle that correctly for you automatically.

if you don't have barriers enabled, the data may not get written out of main memory to the battery backed memory on the card as the OS has no reason to do the write out of the OS buffers now rather than later.

It's not quite so simple. The sync calls (pick your flavour) are what tell the OS buffers they have to go out. The syscall (on a working FS) won't return until the write and its data have reached the device safely and are considered persistent. But in linux, a barrier is actually a synchronization point, not just a flush cache... It's a guarantee that everything up to now is persistent, and I'm going to start counting on it. But depending on your card, drivers and yes, kernel version, that barrier is sometimes a drain/block I/O queue, issue cache flush, wait, write specific data, flush, wait, open I/O queue. The double flush is because it needs to guarantee everything previous is good before it writes the critical piece, and then needs to guarantee that too. Now, on good raid hardware it's not usually that bad.

And then, just to confuse people more, LVM up until 2.6.29 (so that includes all those RHEL5/CentOS5 installs out there which default to using LVM) didn't handle barriers; it just sort of threw them out as it came across them, meaning that you got the performance of nobarrier even if you thought you were using barriers on poor raid hardware.

Every raid card I have seen has ignored the 'flush cache' type of command if it has a battery and that battery is good, so you leave the barriers enabled and the card still gives you great performance.

The XFS FAQ goes over much of it, starting at Q24: http://xfs.org/index.php/XFS_FAQ#Q:_What_is_the_problem_with_the_write_cache_on_journaled_filesystems.3F

So, for pure performance, on a battery-backed controller, nobarrier is the recommended *performance* setting. But, to throw a wrench into the plan, what happens when during normal battery tests, your raid controller decides the battery is failing... of course, it's going to start screaming and send all your monitoring alarms off (you're monitoring that, right?), but have you thought to make sure that your FS is remounted with barriers at the first sign of battery trouble?

a.

-- Aidan Van Dyk Create like a god, ai...@highrise.ca command like a king, http://www.highrise.ca/ work like a slave.

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
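A rough sketch of the "remount with barriers at the first sign of battery trouble" idea, suitable for cron, assuming an LSI/PERC card managed by MegaCli and an ext4 data partition at /var/lib/pgsql (the "Battery State" string and the paths are assumptions - BBU output wording varies by firmware, and XFS uses the plain barrier/nobarrier spelling and may need a full umount/mount on older kernels):

    BBU_STATE=$(MegaCli64 -AdpBbuCmd -GetBbuStatus -a0 | awk -F': *' '/Battery State/ {print $2}')
    if [ "$BBU_STATE" != "Optimal" ]; then
        logger "RAID BBU state is '$BBU_STATE' - re-enabling write barriers"
        mount -o remount,barrier=1 /var/lib/pgsql
    fi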
Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++
On Mon, 12 Sep 2011, Aidan Van Dyk wrote:

On Mon, Sep 12, 2011 at 6:57 PM, da...@lang.hm wrote:

The barrier is the linux fs/block way of saying these writes need to be on persistent media before I can depend on them. On typical spinning media disks, that means out of the disk cache (which is not persistent) and on platters. The way it assures that the writes are on persistent media is with a flush cache type of command. The flush cache is a close approximation to make sure it's persistent. If your cache is battery backed, it is now persistent, and there is no need to flush cache, hence the nobarrier option if you believe your cache is persistent. Now, make sure that even though your raid cache is persistent, your disks have their cache in write-through mode, because it would suck for your raid cache to work, but believe the data is safely on disk, and only find out that it was in the disks' (small) cache, and your raid is out of sync after an outage because of that... I believe most raid cards will handle that correctly for you automatically.

if you don't have barriers enabled, the data may not get written out of main memory to the battery backed memory on the card as the OS has no reason to do the write out of the OS buffers now rather than later.

It's not quite so simple. The sync calls (pick your flavour) are what tell the OS buffers they have to go out. The syscall (on a working FS) won't return until the write and its data have reached the device safely and are considered persistent. But in linux, a barrier is actually a synchronization point, not just a flush cache... It's a guarantee that everything up to now is persistent, and I'm going to start counting on it. But depending on your card, drivers and yes, kernel version, that barrier is sometimes a drain/block I/O queue, issue cache flush, wait, write specific data, flush, wait, open I/O queue. The double flush is because it needs to guarantee everything previous is good before it writes the critical piece, and then needs to guarantee that too. Now, on good raid hardware it's not usually that bad.

And then, just to confuse people more, LVM up until 2.6.29 (so that includes all those RHEL5/CentOS5 installs out there which default to using LVM) didn't handle barriers; it just sort of threw them out as it came across them, meaning that you got the performance of nobarrier, even if you thought you were using barriers on poor raid hardware.

this is part of the problem. if you have a simple fs-on-hardware setup you may be able to get away with the barriers, but if you have a fs-on-x-on-y-on-hardware type of thing (specifically where LVM is one of the things in the middle), and those things in the middle do not honor barriers, the fsync becomes meaningless, because without propagating the barrier down the stack, the writes that the fsync triggers may not get to the disk.

Every raid card I have seen has ignored the 'flush cache' type of command if it has a battery and that battery is good, so you leave the barriers enabled and the card still gives you great performance.

The XFS FAQ goes over much of it, starting at Q24: http://xfs.org/index.php/XFS_FAQ#Q:_What_is_the_problem_with_the_write_cache_on_journaled_filesystems.3F

So, for pure performance, on a battery-backed controller, nobarrier is the recommended *performance* setting. But, to throw a wrench into the plan, what happens when during normal battery tests, your raid controller decides the battery is failing... 
of course, it's going to start screaming and send all your monitoring alarms off (you're monitoring that, right?), but have you thought to make sure that your FS is remounted with barriers at the first sign of battery trouble? yep. on a good raid card with battery backed cache, the performance difference between barriers being on and barriers being off should be minimal. If it's not, I think that you have something else going on. David Lang -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
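One quick sanity check for the "is something in the stack silently dropping barriers" problem is to look at what the journal logged when the filesystem was mounted - a rough sketch (the exact message text is only an example and varies by kernel version and filesystem):

    dmesg | grep -i barrier
    # ext3/ext4's journal typically logs something like
    #   "JBD: barrier-based sync failed on dm-2-8 - disabling barriers"
    # the first time a flush gets refused by a lower layer
    uname -r    # device-mapper/LVM only started passing barriers through around 2.6.29+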
Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++
On Mon, Sep 12, 2011 at 8:47 PM, da...@lang.hm wrote:

The XFS FAQ goes over much of it, starting at Q24: http://xfs.org/index.php/XFS_FAQ#Q:_What_is_the_problem_with_the_write_cache_on_journaled_filesystems.3F So, for pure performance, on a battery-backed controller, nobarrier is the recommended *performance* setting. But, to throw a wrench into the plan, what happens when during normal battery tests, your raid controller decides the battery is failing... of course, it's going to start screaming and send all your monitoring alarms off (you're monitoring that, right?), but have you thought to make sure that your FS is remounted with barriers at the first sign of battery trouble?

yep. on a good raid card with battery backed cache, the performance difference between barriers being on and barriers being off should be minimal. If it's not, I think that you have something else going on.

The performance boost you'll get is that you don't have the temporary stall in parallelization that the barriers have. With barriers, even if the controller cache doesn't really flush, you still have the "can't send more writes to the device until the barrier'ed write is done" behaviour, so at all those points you have only a single write command in flight. The performance penalty of barriers on good cards comes because barriers exist to prevent the devices from reordering write persistence, and they do that by waiting for a write to be persistent before allowing more to be queued to the device.

With nobarrier, you operate under the assumption that the block device's writes are persisted in the order commands are issued to the devices, so you never have to drain the queue as you do in the normal barrier implementation, and can (in theory) always have more requests than the raid card can be working on processing, reordering, and dispatching to platters, for the maximum theoretical throughput...

Of course, linux has completely re-written/changed the sync/barrier/flush methods over the past few years, and there is no guarantee they don't keep changing the implementation details in the future, so keep up on the filesystem details of whatever you're using...

So keep doing burn-ins, with real pull-the-cord tests... They can't prove it's 100% safe, but they can quickly prove when it's not ;-)

a.

-- Aidan Van Dyk Create like a god, ai...@highrise.ca command like a king, http://www.highrise.ca/ work like a slave.

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
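For the pull-the-cord testing, one commonly used tool is Brad Fitzpatrick's diskchecker.pl; roughly, from memory (check the script's own usage output - host names, paths and the 500 MB size below are only placeholders):

    # on a second machine that stays up:
    ./diskchecker.pl -l
    # on the machine under test, writing through the filesystem/controller in question:
    ./diskchecker.pl -s otherhost create /var/lib/pgsql/testfile 500
    # ... pull the power cord mid-run, boot the test box back up, then:
    ./diskchecker.pl -s otherhost verify /var/lib/pgsql/testfile
    # any verification errors mean writes the OS thought were durable never
    # actually reached stable storage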
Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++
On 18/08/11 17:35, Craig Ringer wrote:

On 18/08/2011 11:48 AM, Ogden wrote: Isn't this very dangerous? I have the Dell PERC H700 card - I see that it has 512Mb Cache. Is this the same thing and good enough to switch to nobarrier? Just worried if a sudden power shut down, then data can be lost on this option.

Yeah, I'm confused by that too. Shouldn't a write barrier flush data to persistent storage - in this case, the RAID card's battery backed cache? Why would it force a RAID controller cache flush to disk, too?

If the card's cache has a battery, then the cache is preserved in the event of a crash/power loss etc - provided it has enough charge - so setting the 'writeback' property on arrays is safe. The PERC/SERVERRAID cards I'm familiar with (LSI Megaraid rebranded models) all switch to write-through mode if they detect the battery is dangerously discharged, so this is not normally a problem (but commit/fsync performance will fall off a cliff when this happens)!

Cheers

Mark

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++
On Thu, Aug 18, 2011 at 1:35 AM, Craig Ringer ring...@ringerc.id.au wrote:

On 18/08/2011 11:48 AM, Ogden wrote: Isn't this very dangerous? I have the Dell PERC H700 card - I see that it has 512Mb Cache. Is this the same thing and good enough to switch to nobarrier? Just worried if a sudden power shut down, then data can be lost on this option.

Yeah, I'm confused by that too. Shouldn't a write barrier flush data to persistent storage - in this case, the RAID card's battery backed cache? Why would it force a RAID controller cache flush to disk, too?

The barrier is the linux fs/block way of saying these writes need to be on persistent media before I can depend on them. On typical spinning media disks, that means out of the disk cache (which is not persistent) and on platters. The way it assures that the writes are on persistent media is with a flush cache type of command. The flush cache is a close approximation to make sure it's persistent. If your cache is battery backed, it is now persistent, and there is no need to flush cache, hence the nobarrier option if you believe your cache is persistent. Now, make sure that even though your raid cache is persistent, your disks have their cache in write-through mode, because it would suck for your raid cache to work, but believe the data is safely on disk, and only find out that it was in the disks' (small) cache, and your raid is out of sync after an outage because of that... I believe most raid cards will handle that correctly for you automatically.

a.

-- Aidan Van Dyk Create like a god, ai...@highrise.ca command like a king, http://www.highrise.ca/ work like a slave.

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++
On Aug 18, 2011, at 2:07 AM, Mark Kirkwood wrote:

On 18/08/11 17:35, Craig Ringer wrote: On 18/08/2011 11:48 AM, Ogden wrote: Isn't this very dangerous? I have the Dell PERC H700 card - I see that it has 512Mb Cache. Is this the same thing and good enough to switch to nobarrier? Just worried if a sudden power shut down, then data can be lost on this option. Yeah, I'm confused by that too. Shouldn't a write barrier flush data to persistent storage - in this case, the RAID card's battery backed cache? Why would it force a RAID controller cache flush to disk, too?

If the card's cache has a battery, then the cache is preserved in the event of a crash/power loss etc - provided it has enough charge - so setting the 'writeback' property on arrays is safe. The PERC/SERVERRAID cards I'm familiar with (LSI Megaraid rebranded models) all switch to write-through mode if they detect the battery is dangerously discharged, so this is not normally a problem (but commit/fsync performance will fall off a cliff when this happens)!

Cheers Mark

So a setting such as this:

Device Name         : /dev/sdb
Type                : SAS
Read Policy         : No Read Ahead
Write Policy        : Write Back
Cache Policy        : Not Applicable
Stripe Element Size : 64 KB
Disk Cache Policy   : Enabled

Is this sufficient to enable nobarrier then, with these settings?

Thank you

Ogden

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++
On Aug 17, 2011, at 4:17 PM, Greg Smith wrote: On 08/17/2011 02:26 PM, Ogden wrote: I am using bonnie++ to benchmark our current Postgres system (on RAID 5) with the new one we have, which I have configured with RAID 10. The drives are the same (SAS 15K). I tried the new system with ext3 and then XFS but the results seem really outrageous as compared to the current system, or am I reading things wrong? The benchmark results are here: http://malekkoheavyindustry.com/benchmark.html Congratulations--you're now qualified to be a member of the RAID5 sucks club. You can find other members at http://www.miracleas.com/BAARF/BAARF2.html Reasonable read speeds and just terrible write ones are expected if that's on your old hardware. Your new results are what I would expect from the hardware you've described. The only thing that looks weird are your ext4 Sequential Output - Block results. They should be between the ext3 and the XFS results, not far lower than either. Normally this only comes from using a bad set of mount options. With a battery-backed write cache, you'd want to use nobarrier for example; if you didn't do that, that can crush output rates. I have mounted the ext4 system with the nobarrier option: /dev/sdb1 on /var/lib/pgsql type ext4 (rw,noatime,data=writeback,barrier=0,nobh,errors=remount-ro) Yet the results show absolutely a decrease in performance in the ext4 Sequential Output - Block results: http://malekkoheavyindustry.com/benchmark.html However, the Random seeks is better, even more so than XFS... Any thoughts as to why this is occurring? Ogden
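One variable worth ruling out here (just a guess): data=writeback plus nobh is itself a non-default combination, so it may be worth re-running the ext4 test with stock journaling options before blaming the filesystem. The data= mode can't be changed with a plain remount, so roughly (device and mount point as in the quoted output):

    umount /var/lib/pgsql
    mount -o noatime,nobarrier /dev/sdb1 /var/lib/pgsql   # defaults: data=ordered, no nobh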
Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++
On 19/08/11 02:09, Ogden wrote:

On Aug 18, 2011, at 2:07 AM, Mark Kirkwood wrote: On 18/08/11 17:35, Craig Ringer wrote: On 18/08/2011 11:48 AM, Ogden wrote: Isn't this very dangerous? I have the Dell PERC H700 card - I see that it has 512Mb Cache. Is this the same thing and good enough to switch to nobarrier? Just worried if a sudden power shut down, then data can be lost on this option. Yeah, I'm confused by that too. Shouldn't a write barrier flush data to persistent storage - in this case, the RAID card's battery backed cache? Why would it force a RAID controller cache flush to disk, too? If the card's cache has a battery, then the cache is preserved in the event of a crash/power loss etc - provided it has enough charge - so setting the 'writeback' property on arrays is safe. The PERC/SERVERRAID cards I'm familiar with (LSI Megaraid rebranded models) all switch to write-through mode if they detect the battery is dangerously discharged, so this is not normally a problem (but commit/fsync performance will fall off a cliff when this happens)! Cheers Mark

So a setting such as this:

Device Name         : /dev/sdb
Type                : SAS
Read Policy         : No Read Ahead
Write Policy        : Write Back
Cache Policy        : Not Applicable
Stripe Element Size : 64 KB
Disk Cache Policy   : Enabled

Is this sufficient to enable nobarrier then, with these settings?

Hmm - that output looks different from the cards I'm familiar with. I'd want to see the manual entries for Cache Policy=Not Applicable and Disk Cache Policy=Enabled to understand what the settings actually mean. Assuming Disk Cache Policy=Enabled means what I think it does (i.e. writes are cached in the physical drives' cache), this setting seems wrong: if your card has on-board cache + battery, you would want to only cache 'em in the *card's* cache (too many caches to keep straight in one's head, lol).

Cheers

Mark

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++
On 19/08/11 12:52, Mark Kirkwood wrote:

On 19/08/11 02:09, Ogden wrote: On Aug 18, 2011, at 2:07 AM, Mark Kirkwood wrote: On 18/08/11 17:35, Craig Ringer wrote: On 18/08/2011 11:48 AM, Ogden wrote: Isn't this very dangerous? I have the Dell PERC H700 card - I see that it has 512Mb Cache. Is this the same thing and good enough to switch to nobarrier? Just worried if a sudden power shut down, then data can be lost on this option. Yeah, I'm confused by that too. Shouldn't a write barrier flush data to persistent storage - in this case, the RAID card's battery backed cache? Why would it force a RAID controller cache flush to disk, too? If the card's cache has a battery, then the cache is preserved in the event of a crash/power loss etc - provided it has enough charge - so setting the 'writeback' property on arrays is safe. The PERC/SERVERRAID cards I'm familiar with (LSI Megaraid rebranded models) all switch to write-through mode if they detect the battery is dangerously discharged, so this is not normally a problem (but commit/fsync performance will fall off a cliff when this happens)! Cheers Mark So a setting such as this:

Device Name         : /dev/sdb
Type                : SAS
Read Policy         : No Read Ahead
Write Policy        : Write Back
Cache Policy        : Not Applicable
Stripe Element Size : 64 KB
Disk Cache Policy   : Enabled

Is this sufficient to enable nobarrier then, with these settings?

Hmm - that output looks different from the cards I'm familiar with. I'd want to see the manual entries for Cache Policy=Not Applicable and Disk Cache Policy=Enabled to understand what the settings actually mean. Assuming Disk Cache Policy=Enabled means what I think it does (i.e. writes are cached in the physical drives' cache), this setting seems wrong: if your card has on-board cache + battery, you would want to only cache 'em in the *card's* cache (too many caches to keep straight in one's head, lol).

FWIW - here's what our ServeRAID (M5015) output looks like for a RAID 1 array configured with writeback, reads not cached on the card's memory, physical disk caches disabled:

$ MegaCli64 -LDInfo -L0 -a0

Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name                :
RAID Level          : Primary-1, Secondary-0, RAID Level Qualifier-0
Size                : 67.054 GB
State               : Optimal
Strip Size          : 64 KB
Number Of Drives    : 2
Span Depth          : 1
Default Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Access Policy       : Read/Write
Disk Cache Policy   : Disabled
Encryption Type     : None

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
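For reference, the policies behind those lines can be queried and set with MegaCli as well - roughly like this (the exact spelling of the property flags is an assumption and varies a little between MegaCli versions):

    MegaCli64 -LDGetProp -Cache -LAll -aAll            # current write/read/IO cache policy per logical drive
    MegaCli64 -LDGetProp -DskCache -LAll -aAll         # per-drive (volatile) disk cache setting
    MegaCli64 -LDSetProp -NoCachedBadBBU -LAll -aAll   # drop to write-through if the BBU goes bad
    MegaCli64 -LDSetProp -DisDskCache -LAll -aAll      # keep the physical disks' own caches off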
Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++
On Wed, Aug 17, 2011 at 01:26:56PM -0500, Ogden wrote: I am using bonnie++ to benchmark our current Postgres system (on RAID 5) with the new one we have, which I have configured with RAID 10. The drives are the same (SAS 15K). I tried the new system with ext3 and then XFS but the results seem really outrageous as compared to the current system, or am I reading things wrong? The benchmark results are here: http://malekkoheavyindustry.com/benchmark.html Thank you Ogden That looks pretty normal to me. Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++
On Aug 17, 2011, at 1:31 PM, k...@rice.edu wrote: On Wed, Aug 17, 2011 at 01:26:56PM -0500, Ogden wrote: I am using bonnie++ to benchmark our current Postgres system (on RAID 5) with the new one we have, which I have configured with RAID 10. The drives are the same (SAS 15K). I tried the new system with ext3 and then XFS but the results seem really outrageous as compared to the current system, or am I reading things wrong? The benchmark results are here: http://malekkoheavyindustry.com/benchmark.html Thank you Ogden That looks pretty normal to me. Ken But such a jump from the current db01 system to this? Over 20 times difference from the current system to the new one with XFS. Is that much of a jump normal? Ogden -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++
On Wed, Aug 17, 2011 at 01:32:41PM -0500, Ogden wrote:

On Aug 17, 2011, at 1:31 PM, k...@rice.edu wrote: On Wed, Aug 17, 2011 at 01:26:56PM -0500, Ogden wrote: I am using bonnie++ to benchmark our current Postgres system (on RAID 5) with the new one we have, which I have configured with RAID 10. The drives are the same (SAS 15K). I tried the new system with ext3 and then XFS but the results seem really outrageous as compared to the current system, or am I reading things wrong? The benchmark results are here: http://malekkoheavyindustry.com/benchmark.html Thank you Ogden That looks pretty normal to me. Ken

But such a jump from the current db01 system to this? Over 20 times difference from the current system to the new one with XFS. Is that much of a jump normal? Ogden

Yes, RAID5 is bad in many ways. XFS is much better than EXT3. You would get similar results with EXT4 as well, I suspect, although you did not test that.

Regards, Ken

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++
On 17/08/2011 7:26 PM, Ogden wrote:

I am using bonnie++ to benchmark our current Postgres system (on RAID 5) with the new one we have, which I have configured with RAID 10. The drives are the same (SAS 15K). I tried the new system with ext3 and then XFS but the results seem really outrageous as compared to the current system, or am I reading things wrong? The benchmark results are here: http://malekkoheavyindustry.com/benchmark.html

The results are not completely outrageous, however you don't say what drives, how many, and what RAID controller you have in the current and new systems. You might expect that performance from 10/12 disks in RAID 10 with a good controller. I would say that your current system is outrageous in that it is so slow!

Cheers, Gary.

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++
On 8/17/2011 1:35 PM, k...@rice.edu wrote:

On Wed, Aug 17, 2011 at 01:32:41PM -0500, Ogden wrote: On Aug 17, 2011, at 1:31 PM, k...@rice.edu wrote: On Wed, Aug 17, 2011 at 01:26:56PM -0500, Ogden wrote: I am using bonnie++ to benchmark our current Postgres system (on RAID 5) with the new one we have, which I have configured with RAID 10. The drives are the same (SAS 15K). I tried the new system with ext3 and then XFS but the results seem really outrageous as compared to the current system, or am I reading things wrong? The benchmark results are here: http://malekkoheavyindustry.com/benchmark.html Thank you Ogden That looks pretty normal to me. Ken But such a jump from the current db01 system to this? Over 20 times difference from the current system to the new one with XFS. Is that much of a jump normal? Ogden

Yes, RAID5 is bad in many ways. XFS is much better than EXT3. You would get similar results with EXT4 as well, I suspect, although you did not test that. Regards, Ken

A while back I tested ext3 and xfs myself and found xfs performs better for PG. However, I also have a photos site with 100K files (split into a small subset of directories), and xfs sucks bad on it. So my db is on xfs, and my photos are on ext4.

The numbers between raid5 and raid10 don't really surprise me either. I went from 100 Meg/sec to 230 Meg/sec going from 3 disk raid 5 to 4 disk raid 10. (I'm, of course, using SATA drives with 4 gig of ram... and 2 cores. Everyone with more than 8 cores and 64 gig of ram is off my Christmas list! :-) )

-Andy

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
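The random-write side of that gap is easy to ballpark: a RAID10 write costs two disk I/Os (one per mirror side), while a RAID5 small write costs four (read old data, read old parity, write both). Assuming roughly 175 random IOPS per 15k drive - a made-up but typical figure - a back-of-the-envelope estimate looks like:

    echo "6-disk RAID10: about $(( 6 * 175 / 2 )) random write IOPS"
    echo "4-disk RAID5:  about $(( 4 * 175 / 4 )) random write IOPS"

Sequential streams are kinder to RAID5 than this, but the random-write case is the one databases mostly live in.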
Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++
On Aug 17, 2011, at 1:48 PM, Andy Colson wrote:

On 8/17/2011 1:35 PM, k...@rice.edu wrote: On Wed, Aug 17, 2011 at 01:32:41PM -0500, Ogden wrote: On Aug 17, 2011, at 1:31 PM, k...@rice.edu wrote: On Wed, Aug 17, 2011 at 01:26:56PM -0500, Ogden wrote: I am using bonnie++ to benchmark our current Postgres system (on RAID 5) with the new one we have, which I have configured with RAID 10. The drives are the same (SAS 15K). I tried the new system with ext3 and then XFS but the results seem really outrageous as compared to the current system, or am I reading things wrong? The benchmark results are here: http://malekkoheavyindustry.com/benchmark.html Thank you Ogden That looks pretty normal to me. Ken But such a jump from the current db01 system to this? Over 20 times difference from the current system to the new one with XFS. Is that much of a jump normal? Ogden Yes, RAID5 is bad in many ways. XFS is much better than EXT3. You would get similar results with EXT4 as well, I suspect, although you did not test that. Regards, Ken

A while back I tested ext3 and xfs myself and found xfs performs better for PG. However, I also have a photos site with 100K files (split into a small subset of directories), and xfs sucks bad on it. So my db is on xfs, and my photos are on ext4.

What about the OS itself? I put the Debian linux system also on XFS but haven't played around with it too much. Is it better to put the OS itself on ext4 and the /var/lib/pgsql partition on XFS?

Thanks

Ogden

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++
On Aug 17, 2011, at 1:33 PM, Gary Doades wrote:

On 17/08/2011 7:26 PM, Ogden wrote: I am using bonnie++ to benchmark our current Postgres system (on RAID 5) with the new one we have, which I have configured with RAID 10. The drives are the same (SAS 15K). I tried the new system with ext3 and then XFS but the results seem really outrageous as compared to the current system, or am I reading things wrong? The benchmark results are here: http://malekkoheavyindustry.com/benchmark.html

The results are not completely outrageous, however you don't say what drives, how many, and what RAID controller you have in the current and new systems. You might expect that performance from 10/12 disks in RAID 10 with a good controller. I would say that your current system is outrageous in that it is so slow! Cheers, Gary.

Yes, under heavy writes the load would shoot right up, which is what caused us to look at upgrading. If it is the RAID 5, it is mind boggling that it could be that much of a difference. I expected a difference, but not that much.

The new system has 6 drives, 300GB 15K SAS, and I've put them into a RAID 10 configuration. The current system is ext3 with RAID 5 over 4 disks on a Perc/5i controller, which has half the write cache of the new one (256MB vs 512MB).

Ogden

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++
On 17/08/2011 7:56 PM, Ogden wrote: On Aug 17, 2011, at 1:33 PM, Gary Doades wrote: On 17/08/2011 7:26 PM, Ogden wrote: I am using bonnie++ to benchmark our current Postgres system (on RAID 5) with the new one we have, which I have configured with RAID 10. The drives are the same (SAS 15K). I tried the new system with ext3 and then XFS but the results seem really outrageous as compared to the current system, or am I reading things wrong? The benchmark results are here: http://malekkoheavyindustry.com/benchmark.html The results are not completely outrageous, however you don't say what drives, how many and what RAID controller you have in the current and new systems. You might expect that performance from 10/12 disks in RAID 10 with a good controller. I would say that your current system is outrageous in that is is so slow! Cheers, Gary. Yes, under heavy writes the load would shoot right up which is what caused us to look at upgrading. If it is the RAID 5, it is mind boggling that it could be that much of a difference. I expected a difference, now that much. The new system has 6 drives, 300Gb 15K SAS and I've put them into a RAID 10 configuration. The current system is ext3 with RAID 5 over 4 disks on a Perc/5i controller which has half the write cache as the new one (256 Mb vs 512Mb). Hmm... for only 6 disks in RAID 10 I would say that the figures are a bit higher than I would expect. The PERC 5 controller is pretty poor in my opinion, PERC 6 a lot better and the new H700's pretty good. I'm guessing you have a H700 in your new system. I've just got a Dell 515 with a H700 and 8 SAS in RAID 10 and I only get around 600 MB/s read using ext4 and Ubuntu 10.4 server. Like I say, your figures are not outrageous, just unexpectedly good :) Cheers, Gary. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++
On 8/17/2011 1:55 PM, Ogden wrote:

On Aug 17, 2011, at 1:48 PM, Andy Colson wrote: On 8/17/2011 1:35 PM, k...@rice.edu wrote: On Wed, Aug 17, 2011 at 01:32:41PM -0500, Ogden wrote: On Aug 17, 2011, at 1:31 PM, k...@rice.edu wrote: On Wed, Aug 17, 2011 at 01:26:56PM -0500, Ogden wrote: I am using bonnie++ to benchmark our current Postgres system (on RAID 5) with the new one we have, which I have configured with RAID 10. The drives are the same (SAS 15K). I tried the new system with ext3 and then XFS but the results seem really outrageous as compared to the current system, or am I reading things wrong? The benchmark results are here: http://malekkoheavyindustry.com/benchmark.html Thank you Ogden That looks pretty normal to me. Ken But such a jump from the current db01 system to this? Over 20 times difference from the current system to the new one with XFS. Is that much of a jump normal? Ogden Yes, RAID5 is bad in many ways. XFS is much better than EXT3. You would get similar results with EXT4 as well, I suspect, although you did not test that. Regards, Ken A while back I tested ext3 and xfs myself and found xfs performs better for PG. However, I also have a photos site with 100K files (split into a small subset of directories), and xfs sucks bad on it. So my db is on xfs, and my photos are on ext4.

What about the OS itself? I put the Debian linux system also on XFS but haven't played around with it too much. Is it better to put the OS itself on ext4 and the /var/lib/pgsql partition on XFS? Thanks Ogden

I doubt it matters. The OS is not going to batch delete thousands of files. Once it's set up, it's pretty constant. I would not worry about it.

-Andy

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++
On Wed, Aug 17, 2011 at 1:55 PM, Ogden li...@darkstatic.com wrote:

What about the OS itself? I put the Debian linux system also on XFS but haven't played around with it too much. Is it better to put the OS itself on ext4 and the /var/lib/pgsql partition on XFS?

We've always put the OS on whatever default filesystem it uses, and then put PGDATA on a RAID 10/XFS and PGXLOG on RAID 1/XFS (and for our larger installations, we set up another RAID 10/XFS for heavily accessed indexes or tables). If you have a battery-backed cache on your controller (and it's been tested to work), you can increase performance by mounting the XFS partitions with nobarrier... just make sure your battery backup works.

I don't know how current this information is for 9.x (we're still on 8.4), but there is (used to be?) a threshold above which more shared_buffers didn't help. The numbers vary, but somewhere between 8 and 16 GB is typically quoted. We set ours to 25% of RAM, but no more than 12 GB (even for our machines with 128+ GB of RAM) because that seems to be a breaking point for our workload.

Of course, no advice will take the place of testing with your workload, so be sure to test =)
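A minimal sketch of that layout, with invented device names and mount points (only the general shape is from the advice above; adjust everything to the actual hardware):

    # /etc/fstab
    /dev/sdb1   /pgdata    xfs   noatime,nobarrier   0 0   # RAID 10: PGDATA
    /dev/sdc1   /pgxlog    xfs   noatime,nobarrier   0 0   # RAID 1:  pg_xlog, symlinked from $PGDATA/pg_xlog

    # postgresql.conf
    # shared_buffers = 12GB    # ~25% of RAM, capped where the gains tail off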
Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++
On Aug 17, 2011, at 1:35 PM, k...@rice.edu wrote:

On Wed, Aug 17, 2011 at 01:32:41PM -0500, Ogden wrote: On Aug 17, 2011, at 1:31 PM, k...@rice.edu wrote: On Wed, Aug 17, 2011 at 01:26:56PM -0500, Ogden wrote: I am using bonnie++ to benchmark our current Postgres system (on RAID 5) with the new one we have, which I have configured with RAID 10. The drives are the same (SAS 15K). I tried the new system with ext3 and then XFS but the results seem really outrageous as compared to the current system, or am I reading things wrong? The benchmark results are here: http://malekkoheavyindustry.com/benchmark.html Thank you Ogden That looks pretty normal to me. Ken But such a jump from the current db01 system to this? Over 20 times difference from the current system to the new one with XFS. Is that much of a jump normal? Ogden

Yes, RAID5 is bad in many ways. XFS is much better than EXT3. You would get similar results with EXT4 as well, I suspect, although you did not test that.

I tested ext4 and the results did not seem to be that close to XFS, especially when looking at the Block K/sec for the Sequential Output. http://malekkoheavyindustry.com/benchmark.html

So XFS would be best in this case?

Thank you

Ogden
Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++
On Wed, Aug 17, 2011 at 03:40:03PM -0500, Ogden wrote: On Aug 17, 2011, at 1:35 PM, k...@rice.edu wrote: On Wed, Aug 17, 2011 at 01:32:41PM -0500, Ogden wrote: On Aug 17, 2011, at 1:31 PM, k...@rice.edu wrote: On Wed, Aug 17, 2011 at 01:26:56PM -0500, Ogden wrote: I am using bonnie++ to benchmark our current Postgres system (on RAID 5) with the new one we have, which I have configured with RAID 10. The drives are the same (SAS 15K). I tried the new system with ext3 and then XFS but the results seem really outrageous as compared to the current system, or am I reading things wrong? The benchmark results are here: http://malekkoheavyindustry.com/benchmark.html Thank you Ogden That looks pretty normal to me. Ken But such a jump from the current db01 system to this? Over 20 times difference from the current system to the new one with XFS. Is that much of a jump normal? Ogden Yes, RAID5 is bad for in many ways. XFS is much better than EXT3. You would get similar results with EXT4 as well, I suspect, although you did not test that. i tested ext4 and the results did not seem to be that close to XFS. Especially when looking at the Block K/sec for the Sequential Output. http://malekkoheavyindustry.com/benchmark.html So XFS would be best in this case? Thank you Ogden It appears so for at least the Bonnie++ benchmark. I would really try to benchmark your actual DB on both EXT4 and XFS because some of the comparative benchmarks between the two give the win to EXT4 for INSERT/UPDATE database usage with PostgreSQL. Only your application will know for sure:) Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++
On Aug 17, 2011, at 3:56 PM, k...@rice.edu wrote: On Wed, Aug 17, 2011 at 03:40:03PM -0500, Ogden wrote: On Aug 17, 2011, at 1:35 PM, k...@rice.edu wrote: On Wed, Aug 17, 2011 at 01:32:41PM -0500, Ogden wrote: On Aug 17, 2011, at 1:31 PM, k...@rice.edu wrote: On Wed, Aug 17, 2011 at 01:26:56PM -0500, Ogden wrote: I am using bonnie++ to benchmark our current Postgres system (on RAID 5) with the new one we have, which I have configured with RAID 10. The drives are the same (SAS 15K). I tried the new system with ext3 and then XFS but the results seem really outrageous as compared to the current system, or am I reading things wrong? The benchmark results are here: http://malekkoheavyindustry.com/benchmark.html Thank you Ogden That looks pretty normal to me. Ken But such a jump from the current db01 system to this? Over 20 times difference from the current system to the new one with XFS. Is that much of a jump normal? Ogden Yes, RAID5 is bad for in many ways. XFS is much better than EXT3. You would get similar results with EXT4 as well, I suspect, although you did not test that. i tested ext4 and the results did not seem to be that close to XFS. Especially when looking at the Block K/sec for the Sequential Output. http://malekkoheavyindustry.com/benchmark.html So XFS would be best in this case? Thank you Ogden It appears so for at least the Bonnie++ benchmark. I would really try to benchmark your actual DB on both EXT4 and XFS because some of the comparative benchmarks between the two give the win to EXT4 for INSERT/UPDATE database usage with PostgreSQL. Only your application will know for sure:) Ken What are some good methods that one can use to benchmark PostgreSQL under heavy loads? Ie. to emulate heavy writes? Are there any existing scripts and what not? Thank you Afra -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
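For write-heavy load against the database itself (rather than the raw filesystem), pgbench, shipped in contrib, is the usual starting point. A rough sketch - the database name, scale factor and client counts below are arbitrary placeholders:

    createdb bench
    pgbench -i -s 300 bench            # initialize; scale 300 is roughly a 4-5 GB database
    pgbench -c 32 -j 4 -T 600 bench    # 32 clients for 10 minutes of the default, write-heavy TPC-B-like mix
    # (-j needs a reasonably recent pgbench; drop it on older releases)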
Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++
On 08/17/2011 02:26 PM, Ogden wrote: I am using bonnie++ to benchmark our current Postgres system (on RAID 5) with the new one we have, which I have configured with RAID 10. The drives are the same (SAS 15K). I tried the new system with ext3 and then XFS but the results seem really outrageous as compared to the current system, or am I reading things wrong? The benchmark results are here: http://malekkoheavyindustry.com/benchmark.html Congratulations--you're now qualified to be a member of the RAID5 sucks club. You can find other members at http://www.miracleas.com/BAARF/BAARF2.html Reasonable read speeds and just terrible write ones are expected if that's on your old hardware. Your new results are what I would expect from the hardware you've described. The only thing that looks weird are your ext4 Sequential Output - Block results. They should be between the ext3 and the XFS results, not far lower than either. Normally this only comes from using a bad set of mount options. With a battery-backed write cache, you'd want to use nobarrier for example; if you didn't do that, that can crush output rates. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
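To re-test a single mount after changing options, something along these lines keeps the page cache from flattering the numbers (the size-suffix syntax is an assumption - older bonnie++ builds want the size in megabytes):

    # file size roughly 2x RAM so reads can't be served from memory; -n 0 skips the small-file pass
    bonnie++ -d /var/lib/pgsql/bonnie -s 64g -n 0 -u postgres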
Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++
-Original Message- From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance- ow...@postgresql.org] On Behalf Of Greg Smith Sent: Wednesday, August 17, 2011 3:18 PM To: pgsql-performance@postgresql.org Subject: Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++ On 08/17/2011 02:26 PM, Ogden wrote: I am using bonnie++ to benchmark our current Postgres system (on RAID 5) with the new one we have, which I have configured with RAID 10. The drives are the same (SAS 15K). I tried the new system with ext3 and then XFS but the results seem really outrageous as compared to the current system, or am I reading things wrong? The benchmark results are here: http://malekkoheavyindustry.com/benchmark.html Congratulations--you're now qualified to be a member of the RAID5 sucks club. You can find other members at http://www.miracleas.com/BAARF/BAARF2.html Reasonable read speeds and just terrible write ones are expected if that's on your old hardware. Your new results are what I would expect from the hardware you've described. The only thing that looks weird are your ext4 Sequential Output - Block results. They should be between the ext3 and the XFS results, not far lower than either. Normally this only comes from using a bad set of mount options. With a battery-backed write cache, you'd want to use nobarrier for example; if you didn't do that, that can crush output rates. To clarify maybe for those new at using non-default mount options. With XFS the mount option is nobarrier. With ext4 I think it is barrier=0 Someone please correct me if I am misleading people or otherwise mistaken. -mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++
On 08/17/2011 08:35 PM, mark wrote: With XFS the mount option is nobarrier. With ext4 I think it is barrier=0 http://www.mjmwired.net/kernel/Documentation/filesystems/ext4.txt ext4 supports both; nobarrier and barrier=0 mean the same thing. I tend to use nobarrier just because I'm used to that name on XFS systems. -- Greg Smith 2ndQuadrant USg...@2ndquadrant.com Baltimore, MD PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
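So, assuming a tested battery-backed cache, the equivalent spellings look roughly like this (whether a running kernel lets you toggle barriers on a plain remount varies; a full umount/mount always works):

    mount -o remount,nobarrier /var/lib/pgsql     # XFS, or ext4
    mount -o remount,barrier=0 /var/lib/pgsql     # ext4's other spelling
    # matching fstab options column: noatime,nobarrier   (or noatime,barrier=0 on ext4)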
Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++
On Aug 17, 2011, at 4:16 PM, Greg Smith wrote: On 08/17/2011 02:26 PM, Ogden wrote: I am using bonnie++ to benchmark our current Postgres system (on RAID 5) with the new one we have, which I have configured with RAID 10. The drives are the same (SAS 15K). I tried the new system with ext3 and then XFS but the results seem really outrageous as compared to the current system, or am I reading things wrong? The benchmark results are here: http://malekkoheavyindustry.com/benchmark.html Congratulations--you're now qualified to be a member of the RAID5 sucks club. You can find other members at http://www.miracleas.com/BAARF/BAARF2.html Reasonable read speeds and just terrible write ones are expected if that's on your old hardware. Your new results are what I would expect from the hardware you've described. The only thing that looks weird are your ext4 Sequential Output - Block results. They should be between the ext3 and the XFS results, not far lower than either. Normally this only comes from using a bad set of mount options. With a battery-backed write cache, you'd want to use nobarrier for example; if you didn't do that, that can crush output rates. Isn't this very dangerous? I have the Dell PERC H700 card - I see that it has 512Mb Cache. Is this the same thing and good enough to switch to nobarrier? Just worried if a sudden power shut down, then data can be lost on this option. I did not do that with XFS and it did quite well - I know it's up to my app and more testing, but in your experience, what is usually a good filesystem to use? I keep reading conflicting things.. Thank you Ogden -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Raid 5 vs Raid 10 Benchmarks Using bonnie++
On 18/08/2011 11:48 AM, Ogden wrote: Isn't this very dangerous? I have the Dell PERC H700 card - I see that it has 512Mb Cache. Is this the same thing and good enough to switch to nobarrier? Just worried if a sudden power shut down, then data can be lost on this option. Yeah, I'm confused by that too. Shouldn't a write barrier flush data to persistent storage - in this case, the RAID card's battery backed cache? Why would it force a RAID controller cache flush to disk, too? -- Craig Ringer -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance