On Sun, May 3, 2020 at 5:32 PM antlists <antli...@youngman.org.uk> wrote:
>
> On 03/05/2020 21:07, Rich Freeman wrote:
> > I don't think you should focus so much on whether read=write in your
> > RAID.  I'd focus more on whether read and write both meet your
> > requirements.
>
> If you think about it, it's obvious that raid-1 will read faster than it
> writes - it has to write two copies while it only reads one.

Yes.  The same is true for RAID10, since it also has to write two
copies of everything.
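
To put rough numbers on that: writing 1 GiB of file data to a RAID1
or RAID10 array pushes 2 GiB onto the platters, so usable write
throughput tops out at half the aggregate raw write bandwidth.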

>
> Likewise, raids 5 and 6 will be slower writing than reading - for a
> normal read it only reads the data disks, but when writing it has to
> write (and calculate!) parity as well.

Yes, but with any of the striped modes (0, 5, 6, 10) there is an
additional issue.  Writes generally have to be made in whole stripes,
so if you overwrite data in place in units smaller than a full
stripe, the stripe first has to be read back before it can be
rewritten.  This is an absolute requirement when parity is involved,
since the parity covers the entire stripe.  Without parity (RAID 0,
10) an implementation might be able to overwrite part of a stripe in
place without harming the rest.
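
If it helps to see that read-modify-write cycle concretely, here's a
toy sketch in Python.  It is not how md actually buffers I/O -- the
chunk size and names are made up for illustration -- but the XOR
arithmetic is the real RAID5 parity update:

  def xor(a, b):
      # byte-wise XOR of two equal-length chunks
      return bytes(x ^ y for x, y in zip(a, b))

  # One stripe on a 3-disk RAID5: two data chunks plus their XOR parity.
  data0  = bytes(4)                # pretend on-disk contents
  data1  = bytes([0xFF] * 4)
  parity = xor(data0, data1)

  # The filesystem overwrites only data0 -- less than a full stripe.
  new_data0 = b"\x01\x02\x03\x04"

  # Step 1: read the old data chunk and the old parity back from disk.
  # Step 2: new parity = old parity XOR old data XOR new data, which
  # cancels data0's old contribution and mixes in the new one.
  new_parity = xor(xor(parity, data0), new_data0)

  # Step 3: write new_data0 and new_parity back out.  Net cost: two
  # reads plus two writes for what the caller saw as one small write.
  assert new_parity == xor(new_data0, data1)

(The other strategy is to read all the remaining data chunks in the
stripe and recompute parity from scratch; either way a sub-stripe
write costs extra I/O.)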

>
> A raid 1 should read data faster than a lone disk. A raid 5 or 6 should
> read noticeably faster because it's reading across more than one disk.

More or less.  RAID 1 generally benefits from lower latency, because
reads can be divided across the mirrored copies (and a mirror can
hold more than two copies).  Any of the striped modes will match a
single disk on latency, but with much greater bandwidth.  That
bandwidth gain applies to both reading and writing, as long as the
access is sequential.
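
Back-of-envelope, with per-disk numbers that are pure assumptions
(150 MB/s and 10 ms are not benchmarks of anything):

  disk_mb_s = 150    # assumed per-disk sequential throughput
  seek_ms   = 10.0   # assumed per-disk random-read latency
  n         = 4      # disks in the array

  # Striped (0/5/6/10): sequential bandwidth scales with the number of
  # data disks, but each random read still waits on one disk's seek.
  print(f"stripe: ~{n * disk_mb_s} MB/s sequential, {seek_ms} ms/random read")

  # Mirrored (RAID1): a single stream runs at one disk's speed, but up
  # to n reads can be serviced in parallel, one per copy -- that's the
  # latency win under concurrent load.
  print(f"mirror: ~{disk_mb_s} MB/s per stream, {n} random reads in flight")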

This is why it is important to understand your application.  There is
no one "best" RAID level.  They all have pros and cons, depending on
whether you care more about latency or bandwidth, and more about
reads or writes.

And of course RAID isn't the only solution out there for this stuff.
Distributed filesystems also have pros and cons, and often they offer
multiple modes of operation of their own (usually mirroring the
options available for RAID, but across multiple hosts).

For general storage I'm using zfs with raid1 pairs of disks (the pool
can have multiple pairs), and on my NAS I'm using lizardfs for
larger-scale media/etc storage.  I'd use ceph instead in any kind of
enterprise setup, but that is much more RAM-hungry and I'm cheap.
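
(For the curious, a pool of mirror pairs like that is just -- device
names are placeholders:

  zpool create tank mirror sda sdb mirror sdc sdd

Each "mirror" group is its own vdev and zfs stripes across the vdevs,
so it behaves much like RAID10 that you can grow a pair at a time
with "zpool add".)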

-- 
Rich
