Re: [zfs-discuss] Large scale performance query

2011-08-09 Thread Richard Elling
On Aug 8, 2011, at 4:01 PM, Peter Jeremy wrote: On 2011-Aug-08 17:12:15 +0800, Andrew Gabriel andrew.gabr...@oracle.com wrote: periodic scrubs to cater for this case. I do a scrub via cron once a week on my home system. Having almost completely filled the pool, this was taking about 24
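For reference, a weekly scrub like the one described can be scheduled with a single crontab entry. A minimal sketch, assuming a hypothetical pool named tank and the Solaris path to zpool:

  # run a scrub every Sunday at 03:00
  0 3 * * 0 /usr/sbin/zpool scrub tank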

Re: [zfs-discuss] Large scale performance query

2011-08-08 Thread Andrew Gabriel
Alexander Lesle wrote: And what is your suggestion for scrubbing a mirror pool? Once per month, every 2 weeks, every week. There isn't just one answer. For a pool with redundancy, you need to do a scrub just before the redundancy is lost, so you can be reasonably sure the remaining data

Re: [zfs-discuss] Large scale performance query

2011-08-08 Thread Peter Jeremy
On 2011-Aug-08 17:12:15 +0800, Andrew Gabriel andrew.gabr...@oracle.com wrote: periodic scrubs to cater for this case. I do a scrub via cron once a week on my home system. Having almost completely filled the pool, this was taking about 24 hours. However, now that I've replaced the disks and

Re: [zfs-discuss] Large scale performance query

2011-08-07 Thread Alexander Lesle
Hello Bob Friesenhahn and List, On August 06, 2011, 20:41 Bob Friesenhahn wrote in [1]: I think that this depends on the type of hardware you have, how much new data is written over a period of time, the typical I/O load on the server (i.e. does scrubbing impact usability?), and how critical

Re: [zfs-discuss] Large scale performance query

2011-08-07 Thread Roy Sigurd Karlsbakk
The hardware is an SM board, Xeon, 16 GB reg. RAM, LSI 9211-8i HBA, 6x Hitachi 2 TB Deskstar 5K3000 HDS5C3020ALA632. The server stands in the basement at 32°C. The HDs are filled to 80% and the workload is mostly reading. What's best? Scrubbing every week, every second week, once a

Re: [zfs-discuss] Large scale performance query

2011-08-07 Thread Alexander Lesle
Hello Roy Sigurd Karlsbakk and List, On August 07, 2011, 19:27 Roy Sigurd Karlsbakk wrote in [1]: Generally, you can't scrub too often. If you have a set of striped mirrors, the scrub shouldn't take too long. The extra stress on the drives during scrub shouldn't matter much; drives are made

Re: [zfs-discuss] Large scale performance query

2011-08-06 Thread Orvar Korvar
OK, so mirrors resilver faster. But it is not uncommon that another disk shows problems during resilver (for instance r/w errors); this scenario would mean your entire raid is gone, right? If you are using mirrors, and one disk crashes and you start a resilver, then the other disk shows r/w

Re: [zfs-discuss] Large scale performance query

2011-08-06 Thread Mark Sandrock
Shouldn't the choice of RAID type also be based on the i/o requirements? Anyway, with RAID-10, even a second failed disk is not catastrophic, so long as it is not the counterpart of the first failed disk, no matter the no. of disks. (With 2-way mirrors.) But that's why we do backups, right? Mark

Re: [zfs-discuss] Large scale performance query

2011-08-06 Thread Edward Ned Harvey
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Orvar Korvar OK, so mirrors resilver faster. But it is not uncommon that another disk shows problems during resilver (for instance r/w errors); this scenario would mean your entire raid is

Re: [zfs-discuss] Large scale performance query

2011-08-06 Thread Rob Cohen
I may have RAIDZ reading wrong here. Perhaps someone could clarify. For a read-only workload, does each RAIDZ drive act like a stripe, similar to RAID5/6? Do they have independent queues? It would seem that there is no escaping read/modify/write operations for sub-block writes, forcing the

Re: [zfs-discuss] Large scale performance query

2011-08-06 Thread Rob Cohen
RAIDZ has to rebuild data by reading all drives in the group and reconstructing from parity. Mirrors simply copy a drive. Compare 3 TB mirrors vs. 9x 3 TB RAIDZ2. Mirrors: read 3 TB, write 3 TB. RAIDZ2: read 24 TB, reconstruct data on CPU, write 3 TB. In this case, RAIDZ is at least 8x slower to
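A back-of-the-envelope check of the numbers above, as a shell sketch (illustrative only; real resilver time also depends on pool fullness and I/O load):

  # mirror resilver:  read 3 TB from the surviving half, write 3 TB
  # 9x 3 TB RAIDZ2:   read the 8 remaining drives, reconstruct, write 3 TB
  echo $(( 8 * 3 ))   # prints 24 (TB read), ~8x the mirror's read volume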

Re: [zfs-discuss] Large scale performance query

2011-08-06 Thread Bob Friesenhahn
On Sat, 6 Aug 2011, Orvar Korvar wrote: OK, so mirrors resilver faster. But it is not uncommon that another disk shows problems during resilver (for instance r/w errors); this scenario would mean your entire raid is gone, right? If you are using mirrors, and one disk crashes and you start

Re: [zfs-discuss] Large scale performance query

2011-08-06 Thread Bob Friesenhahn
On Sat, 6 Aug 2011, Rob Cohen wrote: I may have RAIDZ reading wrong here. Perhaps someone could clarify. For a read-only workload, does each RAIDZ drive act like a stripe, similar to RAID5/6? Do they have independent queues? They act like stripes, as in RAID5/6. It would seem that

Re: [zfs-discuss] Large scale performance query

2011-08-06 Thread Bob Friesenhahn
On Sat, 6 Aug 2011, Rob Cohen wrote: Can RAIDZ even do a partial block read? Perhaps it needs to read the full block (from all drives) in order to verify the checksum. If so, then RAIDZ groups would always act like one stripe, unlike RAID5/6. ZFS does not do partial block reads/writes.
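Since ZFS reads and checksums whole blocks, the block size in question is the dataset's recordsize. A quick way to inspect it, assuming a hypothetical dataset tank/data:

  zfs get recordsize tank/data   # 128K by default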

Re: [zfs-discuss] Large scale performance query

2011-08-06 Thread Rob Cohen
On Sat, 6 Aug 2011, Rob Cohen wrote: Can RAIDZ even do a partial block read? Perhaps it needs to read the full block (from all drives) in order to verify the checksum. If so, then RAIDZ groups would always act like one stripe

Re: [zfs-discuss] Large scale performance query

2011-08-06 Thread Alexander Lesle
Hello Bob Friesenhahn and List, On August 06, 2011, 18:34 Bob Friesenhahn wrote in [1]: Those using mirrors or raidz1 are best advised to perform periodic scrubs. This helps avoid future media read errors and also helps flush out failing hardware. And what is your suggestion for scrubbing

Re: [zfs-discuss] Large scale performance query

2011-08-06 Thread Alexander Lesle
Hello Rob Cohen and List, On August 06, 2011, 17:32 Rob Cohen wrote in [1]: In this case, RAIDZ is at least 8x slower to resilver (assuming CPU and writing happen in parallel). In the meantime, performance for the array is severely degraded for RAIDZ, but not for mirrors. Aside from

Re: [zfs-discuss] Large scale performance query

2011-08-06 Thread Roy Sigurd Karlsbakk
How much time would the thread opener need with his config? Technical specs: 216x 3 TB 7k3000 HDDs, 24x 9-drive RAIDZ3. I suggest a resilver would need weeks, and the chance that a second or third HD crashes in that time is high. Murphy's Law. With a full pool, perhaps a couple of weeks, but unless the
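For anyone estimating this on a live pool, zpool itself reports resilver progress. A minimal check, assuming a hypothetical pool named tank:

  zpool status tank   # the scan: line shows percent done and an estimated time to go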

Re: [zfs-discuss] Large scale performance query

2011-08-06 Thread Bob Friesenhahn
On Sat, 6 Aug 2011, Rob Cohen wrote: Perhaps you are saying that they act like stripes for bandwidth purposes, but not for read ops/sec? Exactly. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,

Re: [zfs-discuss] Large scale performance query

2011-08-06 Thread Bob Friesenhahn
On Sat, 6 Aug 2011, Alexander Lesle wrote: Those using mirrors or raidz1 are best advised to perform periodic scrubs. This helps avoid future media read errors and also helps flush out failing hardware. And what is your suggestion for scrubbing a mirror pool? Once per month, every 2 weeks,

Re: [zfs-discuss] Large scale performance query

2011-08-06 Thread Rob Cohen
If I'm not mistaken, a 3-way mirror is not implemented behind the scenes in the same way as a 3-disk raidz3. You should use a 3-way mirror instead of a 3-disk raidz3. RAIDZ2 requires at least 4 drives, and RAIDZ3 requires at least 5 drives. But, yes, a 3-way mirror is implemented totally

Re: [zfs-discuss] Large scale performance query

2011-08-05 Thread Orvar Korvar
Are mirrors really a realistic alternative? I mean, if I have to resilver a raid with 3 TB discs, it can take days, I suspect. With 4 TB disks it can take a week, maybe. So, if I use mirrors and one disk breaks, then I only have single redundancy while the mirror repairs. Repair will take long

Re: [zfs-discuss] Large scale performance query

2011-08-05 Thread Ian Collins
On 08/06/11 10:42 AM, Orvar Korvar wrote: Are mirrors really a realistic alternative? To what? Some context would be helpful. I mean, if I have to resilver a raid with 3 TB discs, it can take days, I suspect. With 4 TB disks it can take a week, maybe. So, if I use mirrors and one disk breaks,

Re: [zfs-discuss] Large scale performance query

2011-08-05 Thread Rob Cohen
Generally, mirrors resilver MUCH faster than RAIDZ, and you only lose redundancy on that stripe, so combined, you're much closer to RAIDZ2 odds than you might think, especially with hot spare(s), which I'd recommend. When you're talking about IOPS, each stripe can support one simultaneous user.

Re: [zfs-discuss] Large scale performance query

2011-08-04 Thread Rob Cohen
Try mirrors. You will get much better multi-user performance, and you can easily split the mirrors across enclosures, as sketched below. If your priority is performance over capacity, you could experiment with n-way mirrors, since more mirrors will load balance reads better than more stripes. -- This message
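A sketch of that layout, using hypothetical Solaris device names where the c1* and c2* drives sit in different enclosures:

  # each vdev pairs one drive from enclosure 1 with one from enclosure 2
  zpool create tank mirror c1t0d0 c2t0d0 \
                    mirror c1t1d0 c2t1d0 \
                    mirror c1t2d0 c2t2d0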

Re: [zfs-discuss] Large scale performance query

2011-07-31 Thread Evgueni Martynov
On 25/07/2011 2:34 AM, Phil Harrison wrote: Hi All, Hoping to gain some insight from some people who have done large scale systems before? I'm hoping to get some performance estimates, suggestions and/or general discussion/feedback. I cannot discuss the exact specifics of the purpose but will

Re: [zfs-discuss] Large scale performance query

2011-07-26 Thread Rocky Shek
Phil, Recently, we built a large configuration on a 4-way Xeon server with 8x 4U 24-bay JBODs. We are using 2x LSI 6160 SAS switches so we can easily expand the storage in the future. 1) If you are planning to expand your storage, you should consider using LSI SAS switches for easy

Re: [zfs-discuss] Large scale performance query

2011-07-25 Thread Orvar Korvar
Wow. If you ever finish this monster, I would really like to hear more about the performance and how you connected everything. Could be useful as a reference for anyone else building big stuff. *drool* -- This message posted from opensolaris.org

Re: [zfs-discuss] Large scale performance query

2011-07-25 Thread Roberto Waltman
Phil Harrison wrote: Hi All, Hoping to gain some insight from some people who have done large scale systems before? I'm hoping to get some performance estimates, suggestions and/or general discussion/feedback. No personal experience, but you may find this useful: Petabytes on a budget

Re: [zfs-discuss] Large scale performance query

2011-07-25 Thread Tiernan OToole
They don't go into too much detail on their setup, and they are not running Solaris, but they do mention how their SATA cards see different drives based on where they are placed. They also have a second revision at

Re: [zfs-discuss] Large scale performance query

2011-07-25 Thread Brandon High
On Sun, Jul 24, 2011 at 11:34 PM, Phil Harrison philha...@gmail.com wrote: What kind of performance would you expect from this setup? I know we can multiply the base IOPS by 24, but what about max sequential read/write? You should have a theoretical max close to 144x single-disk throughput.
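The 144x figure follows from counting data spindles. A quick check, assuming the 216-drive layout of 24x 9-drive RAIDZ3 quoted earlier in the thread:

  echo $(( 24 * (9 - 3) ))   # 24 vdevs x 6 data drives each, prints 144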

Re: [zfs-discuss] Large scale performance query

2011-07-25 Thread Roy Sigurd Karlsbakk
Workloads: mainly streaming compressed data. That is, pulling compressed data in a sequential manner; however, there could be multiple streams happening at once, making it somewhat random. We are hoping to have 5 clients pull 500 Mbit sustained. That shouldn't be much of a problem with that amount

Re: [zfs-discuss] Large scale performance query

2011-07-25 Thread Roy Sigurd Karlsbakk
Even with a controller per JBOD, you'll be limited by the SAS connection. The 7k3000 has throughput from 115-150 MB/s, meaning each of your JBODs will be capable of 5.2-6.8 GB/sec, roughly 10 times the bandwidth of a single SAS 6g connection. Use multipathing if you can to increase
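Those figures work out if each JBOD holds 45 drives, which is an assumption on my part; the arithmetic below is only a sketch:

  echo $(( 45 * 115 ))   # prints 5175 (MB/s), ~5.2 GB/sec at the low end
  echo $(( 45 * 150 ))   # prints 6750 (MB/s), ~6.8 GB/sec at the high end
  # one 6 Gb/s SAS lane carries roughly 600 MB/s, hence the ~10x gap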