On Wednesday 17 June 2015, Karel Gardas wrote:
> Hello,
>
> I'm curious if anybody is working on implementing block-level
> checksumming on softraid?
Not that I'm aware of.
> Background: I'm coming from the Solaris 11/ZFS world and I like ZFS's
> focus on data integrity from drive level up to the RAM. I've been
> thinking about OpenBSD and how to get the same with minimal
> effort (not porting ZFS) and I've thought that having checksumming
> implemented in a virtual drive (softraid) may be the easiest way.
> I think it is easier than, for example, enhancing ffs to include file-based
> checksums. Another issue is how to propagate block failures up to the
> file level, but for me it would be enough to just know something bad
> happening with the drive. At least for now. I hope disciplines may be
> stacked on top of one another so there is a possibility of using RAID1
> with checksumming disciplines on two drives, hence getting something
> similar to what I use now with ZFS (zpool with two drives in mirroring
> setup). If stacking is not possible for whatever reason, then I would
> probably go and clone and modify RAID1 to add checksum support (if
> feasible of course).
Stacking in softraid does work, but it is not officially supported (there are
a number of gotchas that you need to be aware of, such as the need to manually
reassemble the volumes). It was never really designed to work this way and it
results in I/O going through multiple layers unnecessarily. At some point it
needs to be rearchitected so that it is stackable internally, which will then
allow for a set of fixed but flexible disciplines.
Re adding some form of checksumming, it only seems to make sense in the case
of RAID 1 where you can decide that the data on a disk is invalid, then fail
the read and pull the data from another drive. That coupled with block
level "healing" or similar could be interesting. Otherwise checksumming on
its own is not overly useful at this level - you would simply fail a read,
which then potentially results in something worse than bit-flipping at higher
layers.
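
To make the RAID 1 case concrete, here is a rough user-space sketch in
Python. It is not softraid code: the Chunk and Mirror classes, the use of
CRC32 and the 512-byte block size are all assumptions made for illustration.
A read that fails verification on one chunk is retried on the next, and the
good copy is then written back over the bad one:

```python
import zlib

BLOCK_SIZE = 512  # assumed block size, for illustration only


class Chunk:
    """One disk of a hypothetical mirror: data blocks plus one checksum each."""

    def __init__(self, nblocks):
        zero = bytes(BLOCK_SIZE)
        self.data = [zero] * nblocks
        self.sums = [zlib.crc32(zero)] * nblocks

    def write(self, blkno, buf):
        # Store the block together with a checksum of its contents.
        self.data[blkno] = buf
        self.sums[blkno] = zlib.crc32(buf)

    def read(self, blkno):
        # Return the block, or None if it no longer matches its checksum.
        buf = self.data[blkno]
        if zlib.crc32(buf) != self.sums[blkno]:
            return None
        return buf


class Mirror:
    """Hypothetical RAID 1 volume that verifies reads and heals bad copies."""

    def __init__(self, nchunks, nblocks):
        self.chunks = [Chunk(nblocks) for _ in range(nchunks)]

    def write(self, blkno, buf):
        if len(buf) != BLOCK_SIZE:
            raise ValueError("buffer must be exactly one block")
        for chunk in self.chunks:
            chunk.write(blkno, buf)

    def read(self, blkno):
        bad = []
        for chunk in self.chunks:
            buf = chunk.read(blkno)
            if buf is not None:
                # Block-level "healing": rewrite the copies that failed.
                for failed in bad:
                    failed.write(blkno, buf)
                return buf
            bad.append(chunk)
        # No chunk holds a valid copy, so the read has to fail.
        raise OSError("block %d failed verification on every chunk" % blkno)
```

With a single chunk the only possible outcome of a mismatch is the OSError,
which is the "simply fail a read" case described above.
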
If you wanted to investigate this I would suggest considering it as an option
to the existing RAID 1 implementation. The bulk of it would be calculating
and adding a checksum to each write and offsetting each block accordingly,
along with verification on read. The failure modes would need to be thought
through and handled - the re-reading from a different disk is already there,
however what you then do with the failure is an open question (failing the
chunk entirely is the heavy-handed but already supported approach).
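
One possible reading of "offsetting each block accordingly", sketched in
Python. The layout here is purely an assumption for illustration, not an
existing softraid on-disk format: each group of 128 data blocks is followed
by one 512-byte block holding their 4-byte CRC32 values, and the hypothetical
locate() helper maps a logical block number to where its data and checksum
would live on the chunk:

```python
BLOCK_SIZE = 512                           # assumed block size in bytes
SUM_SIZE = 4                               # assumed checksum size (CRC32)
SUMS_PER_BLOCK = BLOCK_SIZE // SUM_SIZE    # 128 checksums fit in one block
GROUP_SIZE = SUMS_PER_BLOCK + 1            # 128 data blocks + 1 checksum block


def locate(blkno):
    """Map a logical block number to its physical location.

    Returns (data_block, checksum_block, checksum_byte_offset), all relative
    to the start of the data area of the chunk.
    """
    group, index = divmod(blkno, SUMS_PER_BLOCK)
    group_start = group * GROUP_SIZE
    data_block = group_start + index
    checksum_block = group_start + SUMS_PER_BLOCK
    return data_block, checksum_block, index * SUM_SIZE
```

Under this layout a write touches two physical blocks (data and checksum),
so the ordering of those two updates is part of the failure modes that would
need to be thought through.
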
> Any comment on this topic welcome.
>
> Thanks,
> Karel
--
"Action without study is fatal. Study without action is futile."
-- Mary Ritter Beard