Re: [zfs-discuss] repost - high read iops

2009-12-30 Thread Toby Thain
On 29-Dec-09, at 11:53 PM, Ross Walker wrote: On Dec 29, 2009, at 12:36 PM, Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote: ... However, zfs does not implement RAID 1 either. This is easily demonstrated since you can unplug one side of the mirror and the writes to the zfs mirror

Re: [zfs-discuss] repost - high read iops

2009-12-30 Thread Bob Friesenhahn
On Tue, 29 Dec 2009, Ross Walker wrote: Some important points to consider are that every write to a raidz vdev must be synchronous. In other words, the write needs to complete on all the drives in the stripe before the write may return as complete. This is also true of RAID 1 (mirrors)
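The point above can be sketched numerically: because a raidz stripe write returns only when every disk in the stripe has acknowledged, the stripe's write latency is governed by its slowest member, not the average. A minimal Python sketch, with invented per-disk latency figures purely for illustration:

```python
def stripe_write_latency_ms(per_disk_latencies):
    """A raidz stripe write completes only when the slowest
    disk in the stripe acknowledges, so the stripe latency is
    the maximum of the per-disk latencies, not the mean."""
    return max(per_disk_latencies)

# One slow disk (12.5 ms) gates the whole stripe write.
print(stripe_write_latency_ms([4.1, 3.9, 12.5, 4.0]))  # 12.5
```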

Re: [zfs-discuss] repost - high read iops

2009-12-30 Thread Ross Walker
On Wed, Dec 30, 2009 at 12:35 PM, Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote: On Tue, 29 Dec 2009, Ross Walker wrote: Some important points to consider are that every write to a raidz vdev must be synchronous.  In other words, the write needs to complete on all the drives in the

Re: [zfs-discuss] repost - high read iops

2009-12-30 Thread Richard Elling
On Dec 30, 2009, at 9:35 AM, Bob Friesenhahn wrote: On Tue, 29 Dec 2009, Ross Walker wrote: Some important points to consider are that every write to a raidz vdev must be synchronous. In other words, the write needs to complete on all the drives in the stripe before the write may

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread przemolicc
On Mon, Dec 28, 2009 at 01:40:03PM -0800, Brad wrote: This doesn't make sense to me. You've got 32 GB, why not use it? Artificially limiting the memory use to 20 GB seems like a waste of good money. I'm having a hard time convincing the DBAs to increase the size of the SGA to 20GB because

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Brad
Thanks for the suggestion! I have heard mirrored vdev configurations are preferred for Oracle, but what's the difference between a mirrored raidz vdev and a raid10 setup? We have tested a zfs stripe configuration before with 15 disks and our tester was extremely happy with the performance.

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Ross Walker
On Dec 29, 2009, at 7:55 AM, Brad bene...@yahoo.com wrote: Thanks for the suggestion! I have heard mirrored vdev configurations are preferred for Oracle, but what's the difference between a mirrored raidz vdev and a raid10 setup? A mirrored raidz provides redundancy at a steep cost to

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Eric D. Mudama
On Tue, Dec 29 at 4:55, Brad wrote: Thanks for the suggestion! I have heard mirrored vdev configurations are preferred for Oracle, but what's the difference between a mirrored raidz vdev and a raid10 setup? We have tested a zfs stripe configuration before with 15 disks and our tester was

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Brad
@ross Because each write of a raidz is striped across the disks, the effective IOPS of the vdev is equal to that of a single disk. This can be improved by utilizing multiple (smaller) raidz vdevs which are striped, but not by mirroring them. So with random reads, would it perform better on a

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Brad
@eric As a general rule of thumb, each vdev has random performance roughly the same as a single member of that vdev. Having six RAIDZ vdevs in a pool should give roughly the same performance as a stripe of six bare drives, for random IO. It sounds like we'll need 16 vdevs striped in a pool to at
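The rule of thumb quoted above can be turned into back-of-the-envelope arithmetic: each raidz vdev delivers roughly the random-read IOPS of a single member disk, so a pool's random IOPS scale with the number of top-level vdevs. A hedged sketch; the 133 IOPS per-disk figure is taken from later in this thread and assumed here for illustration:

```python
import math

def pool_random_iops(vdevs: int, disk_iops: float = 133.0) -> float:
    """Approximate small random-read IOPS of a pool of raidz vdevs,
    using the rule of thumb that each vdev performs like one disk."""
    return vdevs * disk_iops

def vdevs_needed(target_iops: float, disk_iops: float = 133.0) -> int:
    """How many raidz vdevs are needed to reach a target random IOPS?"""
    return math.ceil(target_iops / disk_iops)

print(pool_random_iops(6))   # 798.0 -- six vdevs ~ six bare drives
print(vdevs_needed(2000))    # 16 -- consistent with the 16 vdevs mentioned above
```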

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Bob Friesenhahn
On Tue, 29 Dec 2009, Ross Walker wrote: A mirrored raidz provides redundancy at a steep cost to performance and, might I add, a high monetary cost. I am not sure what a mirrored raidz is. I have never heard of such a thing before. With raid10, each mirrored pair has the IOPS of a single

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Mattias Pantzare
On Tue, Dec 29, 2009 at 18:16, Brad bene...@yahoo.com wrote: @eric As a general rule of thumb, each vdev has random performance roughly the same as a single member of that vdev. Having six RAIDZ vdevs in a pool should give roughly the same performance as a stripe of six bare drives, for

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Eric D. Mudama
On Tue, Dec 29 at 9:16, Brad wrote: @eric As a general rule of thumb, each vdev has random performance roughly the same as a single member of that vdev. Having six RAIDZ vdevs in a pool should give roughly the same performance as a stripe of six bare drives, for random IO. It sounds like we'll

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Richard Elling
On Dec 29, 2009, at 9:16 AM, Brad wrote: @eric As a general rule of thumb, each vdev has random performance roughly the same as a single member of that vdev. Having six RAIDZ vdevs in a pool should give roughly the same performance as a stripe of six bare drives, for random IO. This model

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Tim Cook
On Tue, Dec 29, 2009 at 12:07 PM, Richard Elling richard.ell...@gmail.comwrote: On Dec 29, 2009, at 9:16 AM, Brad wrote: @eric As a general rule of thumb, each vdev has random performance roughly the same as a single member of that vdev. Having six RAIDZ vdevs in a pool should give

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Erik Trimble
Eric D. Mudama wrote: On Tue, Dec 29 at 9:16, Brad wrote: The disk cost of a raidz pool of mirrors is identical to the disk cost of raid10. ZFS can't do a raidz of mirrors or a mirror of raidz. Members of a mirror or raidz[123] must be a fundamental device (i.e., a file or drive). This

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Brad
@relling For small, random read IOPS, the performance of a single top-level vdev is:

performance = performance of a disk * (N / (N - P))
            = 133 * (12 / (12 - 1)) = 133 * 12/11

where N = number of disks in the vdev, P = number of parity devices in the vdev, performance of a disk
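The arithmetic above is easy to check directly. This sketch just evaluates the model as quoted in the thread (the 133 IOPS per-disk figure comes from the message itself); it makes no claim beyond the quoted formula:

```python
def vdev_random_read_iops(disk_iops: float, n_disks: int, n_parity: int) -> float:
    """Small random-read IOPS of one top-level raidz vdev,
    per the model quoted above: disk_iops * N / (N - P)."""
    return disk_iops * n_disks / (n_disks - n_parity)

# 12-disk raidz1 (P = 1) built from 133-IOPS disks:
iops = vdev_random_read_iops(133, 12, 1)
print(round(iops, 1))  # 145.1
```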

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Richard Elling
On Dec 29, 2009, at 11:26 AM, Brad wrote: @relling For small, random read IOPS, the performance of a single top-level vdev is: performance = performance of a disk * (N / (N - P)) = 133 * (12 / (12 - 1)) = 133 * 12/11, where N = number of disks in the vdev, P = number of parity

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Eric D. Mudama
On Tue, Dec 29 at 11:14, Erik Trimble wrote: Eric D. Mudama wrote: On Tue, Dec 29 at 9:16, Brad wrote: The disk cost of a raidz pool of mirrors is identical to the disk cost of raid10. ZFS can't do a raidz of mirrors or a mirror of raidz. Members of a mirror or raidz[123] must be a

Re: [zfs-discuss] repost - high read iops

2009-12-29 Thread Ross Walker
On Dec 29, 2009, at 12:36 PM, Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote: On Tue, 29 Dec 2009, Ross Walker wrote: A mirrored raidz provides redundancy at a steep cost to performance and might I add a high monetary cost. I am not sure what a mirrored raidz is. I have never heard

Re: [zfs-discuss] repost - high read iops

2009-12-28 Thread Richard Elling
Hi Brad, comments below... On Dec 27, 2009, at 10:24 PM, Brad wrote: Richard - the l2arc is c1t13d0. What tools can be used to show the l2arc stats?

raidz1      2.68T   580G   543   453  4.22M  3.70M
  c1t1d0        -      -   258   102   689K   358K
  c1t2d0        -      -   256

Re: [zfs-discuss] repost - high read iops

2009-12-28 Thread Brad
Try an SGA more like 20-25 GB. Remember, the database can cache more effectively than any file system underneath. The best I/O is the I/O you don't have to make. We'll be turning up the SGA size from 4GB to 16GB. The ARC size will be set from 8GB to 4GB. This can be a red herring. Judging by the

Re: [zfs-discuss] repost - high read iops

2009-12-28 Thread Richard Elling
On Dec 28, 2009, at 12:40 PM, Brad wrote: Try an SGA more like 20-25 GB. Remember, the database can cache more effectively than any file system underneath. The best I/O is the I/O you don't have to make. We'll be turning up the SGA size from 4GB to 16GB. The arc size will be set from 8GB to

Re: [zfs-discuss] repost - high read iops

2009-12-28 Thread Brad
This doesn't make sense to me. You've got 32 GB, why not use it? Artificially limiting the memory use to 20 GB seems like a waste of good money. I'm having a hard time convincing the DBAs to increase the size of the SGA to 20GB because their philosophy is that, no matter what, eventually you'll have

Re: [zfs-discuss] repost - high read iops

2009-12-28 Thread Bob Friesenhahn
On Mon, 28 Dec 2009, Brad wrote: I'm having a hard time convincing the DBAs to increase the size of the SGA to 20GB because their philosophy is that, no matter what, eventually you'll have to hit disk to pick up data that's not stored in cache (ARC or L2ARC). The typical database server in our

Re: [zfs-discuss] repost - high read iops

2009-12-28 Thread Richard Elling
On Dec 28, 2009, at 1:40 PM, Brad wrote: This doesn't make sense to me. You've got 32 GB, why not use it? Artificially limiting the memory use to 20 GB seems like a waste of good money. I'm having a hard time convincing the DBAs to increase the size of the SGA to 20GB because their

Re: [zfs-discuss] repost - high read iops

2009-12-27 Thread Richard Elling
OK, I'll take a stab at it... On Dec 26, 2009, at 9:52 PM, Brad wrote: repost - Sorry for ccing the other forums. I'm running into an issue where there seems to be a high number of read iops hitting disks, and physical free memory is fluctuating between 200MB and 450MB of 16GB total. We

Re: [zfs-discuss] repost - high read iops

2009-12-27 Thread Brad
Richard - the l2arc is c1t13d0. What tools can be used to show the l2arc stats?

raidz1      2.68T   580G   543   453  4.22M  3.70M
  c1t1d0        -      -   258   102   689K   358K
  c1t2d0        -      -   256   103   684K   354K
  c1t3d0        -      -   258   102   690K   359K