On Dec 7, 2007 7:09 AM, Greg Freemyer <[EMAIL PROTECTED]> wrote:
> On Dec 6, 2007 1:26 PM, Chris Worley <[EMAIL PROTECTED]> wrote:
> >
> > On Dec 5, 2007 1:50 PM, Greg Freemyer <[EMAIL PROTECTED]> wrote:
> <snip>
> > >
> > > Single threaded access to a raid array may not be helped by adding
> > > drives.  Drive access can end up being sequential and you're not
> > > really buying anything.
> > >
> > > Multi-threaded storage performance is definitely positively affected
> > > by adding disks to an array.
> > >
> > > For multi-threaded, effectively each disk can do N IOPS (IOs per Second.)
> > >
> > > So if you have M drives, you can do M*N IOPS.
> > >
> > > The trouble with Raid 5 is that it typically requires 4 IOs to update
> > > a single sector.
> > >
> > > ie.
> > > Read checksum,
> > > Read original sector, (so you can remove it from the checksum)
> > > write updated sector
> > > write new checksum.
> > >
> > > So it ends up being M*N / 4 IOPS.
> >
> > Greg,
> >
> > Doesn't that assume a sector/block mismatch?  If your sectors and
> > blocks are aligned (sectors are some multiple of blocks), then no
> > read-mask-write is necessary.
> >
> > Even if there is a misalignment, if the amount of data being written
> > is large, the read-mask-write operation is only at the beginning and
> > tail ends of the entire operation.
>
> The above does not assume misalignment.  I think what you're talking
> about is that if you are doing a large write that spans the entire
> raid5 stripe, then the existing parity data can be ignored.  Linux is
> smart enough to do this, but raid5 stripes are pretty large.  Typically
> 64K * (M - 1) I believe.  So if you have a 5-disk raid 5, your entire
> stripe is 256KB.  And that ignores the alignment issues you mention;
> with those, to guarantee a full stripe is written you need to write
> 512KB at a time.  Not many programs do that from user space.  I'm not
> sure how efficient the Linux kernel is at coalescing individual
> sequential writes to a raid5 array and trying to create full stripe
> updates.
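
For reference, the arithmetic quoted above can be sketched in a few
lines of Python.  This is only an illustration: the function names are
made up for the example, the 64K chunk size is the figure Greg quotes,
and the 100 IOPS per disk in the usage lines is an assumed number, not
a measurement.

```python
# Hypothetical helpers; none of these names come from md or any RAID tool.
# The 64 KiB default chunk size is the figure quoted in the thread.

def full_stripe_kib(disks, chunk_kib=64):
    """Data held by one full RAID5 stripe: one chunk per non-parity disk."""
    return chunk_kib * (disks - 1)

def guaranteed_full_stripe_write_kib(disks, chunk_kib=64):
    """A write of twice the stripe size covers a whole stripe at any offset."""
    return 2 * full_stripe_kib(disks, chunk_kib)

def small_write_iops(disks, iops_per_disk, ios_per_update=4):
    """Aggregate small-write rate when each update costs 4 disk IOs (M*N/4)."""
    return disks * iops_per_disk / ios_per_update

def read_iops(disks, iops_per_disk):
    """Aggregate read rate on a non-degraded array (M*N)."""
    return disks * iops_per_disk

print(full_stripe_kib(5))                    # 5-disk array, 64K chunks
print(guaranteed_full_stripe_write_kib(5))   # size needed when unaligned
print(small_write_iops(4, 100))              # 4 disks at an assumed 100 IOPS each
print(read_iops(4, 100))
```

With 4 disks the small-write figure comes out equal to the per-disk
figure, which is the "at least a 4 drive array just to be as fast as a
single disk" point made further down.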

Granted: in my line of work, an app doing a single 1MB read/write call
is small; anything smaller would be too trivial to mention.

> > Also, the writes are all in parallel.  The above makes it sound like
> > the writes of updated stripes, and the write of the checksum are
> > serial... they should all be posted nearly simultaneously (some
> > serialization introduced by the CPU).
>
> The above is a max throughput calculation, not an individual write
> calculation.  ie. Which is faster, a sports car or a semi-truck?  The
> semi-truck is if you have lots to move, so it effectively has a higher
> throughput than a sports car.  (but nowhere near as fast for small
> loads).
>
> So the above assumes a busy server with lots going on.  ie every disk
> in the array is running at full capacity.  The IOPS is obviously
> affected by the workload and the seeking, but once the workload is
> set, the IOPS per disk can be characterized and used to feed the
> equation.
>
> > >
> > > So from a performance perspective on _writes_ you need at least a 4
> > > drive array just to be as fast as a single disk.
> > >
> > > Reads OTOH just need to read the sector they want (unless you have a
> > > failed drive).
> > >
> > > So _read_ performance is M*N.  Or always faster than a single drive.
> > >
> >
> > On a RAID5 you only need M-1 (or M-2 for RAID6) completions of
> > parallel operations... you can discard the slowest disk's results, as
> > that can be recreated without all the data.
>
> No idea what you meant there.  In a non-degraded raid5 every drive has
> valid, non-parity data on it.  If you have a heavy multi-threaded read
> load, all disks can be actively providing valid data at one time.  i.e.
> M * N IOPS

If "M" is the number of disks and you are, for example, reading one
stride, then in a RAID5 you only need the stripes from M-1 disks: you
can complete the single-stride I/O without having received the Mth
stripe yet, and discard it when it shows up.
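
A minimal sketch of why any M-1 pieces are enough, assuming plain XOR
parity as in RAID5.  The helper names are hypothetical and the "disks"
are just byte strings.  It shows the reconstruction arithmetic only;
whether a given RAID implementation actually issues reads to all M
disks and drops the slowest is not established here.

```python
from functools import reduce

def xor_chunks(chunks):
    """Byte-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))

def make_stripe(data_chunks):
    """Return the data chunks plus one XOR parity chunk (M chunks total)."""
    return list(data_chunks) + [xor_chunks(data_chunks)]

def rebuild_missing(stripe, missing):
    """Recreate the chunk at index `missing` from the other M-1 chunks."""
    others = [chunk for i, chunk in enumerate(stripe) if i != missing]
    return xor_chunks(others)

# Four data chunks plus parity: the layout of one row on a 5-disk RAID5.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
stripe = make_stripe(data)

# Pretend disk 2 is the slow one: its chunk is recomputed from the rest.
slow_disk = 2
print(rebuild_missing(stripe, slow_disk) == data[slow_disk])
```

The same XOR works whichever chunk is late, including the parity chunk
itself, in which case nothing needs to be recomputed for a read.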

Chris

> Greg
> --
> Greg Freemyer
> Litigation Triage Solutions Specialist
> http://www.linkedin.com/in/gregfreemyer
> First 99 Days Litigation White Paper -
> http://www.norcrossgroup.com/forms/whitepapers/99%20Days%20whitepaper.pdf
>
> The Norcross Group
> The Intersection of Evidence & Technology
> http://www.norcrossgroup.com

--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
