I don't know about Fletcher, but I'm working on CRC32-based
checksumming for softraid RAID1. The basic implementation is ready, but
I'm not satisfied with write performance in some cases: small files,
lots of collisions in checksum blocks, etc. In the worst case I see
6-7x slower performance than plain RAID1. I've tried to improve that
with a cache for the checksum blocks, which I've been working on for
the last few weekends, but it's still not right, and on a 32k-block
fs the improvement is not worth the much higher code complexity. So
I'll probably switch to hacking on scrub, which is something you
usually need with checksumming anyway. :-)
On the bright side: the code happily "self-heals" bad blocks and
refuses to hand you bad data when all chunks return errors. Also,
thanks to the simple design, if something goes really badly you can
still detach a drive, attach it as plain RAID1, and get your data out.
W.r.t. performance, reads are at 70-80% of plain RAID1, and writes of
big data (>=32k on a 32k-block fs) are at about 70% of plain RAID1.
PostgreSQL pgbench also runs at about 70% of RAID1 speed (again on a
32k-block fs). Only small files suck. W.r.t. Fletcher, I think we
don't need it and will still be able to detect a moved block. That's
thanks to the layout, which is really simple:
<SR metadata><data area><chksum area>.
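
To make the layout point a bit more concrete, here is a rough sketch
in C. It is not the actual patch: it assumes one 4-byte CRC32 per data
block, stored in the chksum area at the same index as the data block,
and every name in it (struct layout, data_offset, cksum_offset,
crc32_buf, first_good_copy) is made up for illustration. Because the
checksum slot is derived from the block number, a block that comes
back from the wrong place gets compared against the checksum of the
block that was asked for, and should normally fail the check.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout parameters (bytes); not taken from the real patch. */
struct layout {
	uint64_t meta_size;	/* size of <SR metadata> */
	uint64_t data_blocks;	/* number of blocks in <data area> */
	uint32_t block_size;	/* e.g. 32768 for a 32k-block fs */
};

/* Byte offset of data block `blk` on the chunk. */
static uint64_t
data_offset(const struct layout *l, uint64_t blk)
{
	return l->meta_size + blk * l->block_size;
}

/* Byte offset of the CRC32 slot for data block `blk` (assumed 4 bytes). */
static uint64_t
cksum_offset(const struct layout *l, uint64_t blk)
{
	uint64_t cksum_base = l->meta_size + l->data_blocks * l->block_size;

	return cksum_base + blk * sizeof(uint32_t);
}

/* Plain bitwise CRC-32 (reflected polynomial 0xEDB88320), no table. */
static uint32_t
crc32_buf(const uint8_t *p, size_t n)
{
	uint32_t crc = 0xFFFFFFFFu;

	while (n--) {
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/*
 * Return the index of the first mirror copy whose CRC matches its stored
 * checksum, or -1 if every copy is bad. A caller could rewrite the bad
 * copies from the good one ("self-heal") or fail the read on -1.
 */
static int
first_good_copy(const uint8_t *const copies[], const uint32_t stored[],
    int nchunks, size_t len)
{
	for (int i = 0; i < nchunks; i++)
		if (crc32_buf(copies[i], len) == stored[i])
			return i;
	return -1;
}
```

A read path built on this would fetch the block at data_offset() and
the slot at cksum_offset() from each chunk, then call
first_good_copy(); a result of -1 corresponds to the "all chunks bad"
case where no data is returned.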

Are you willing to test the code on your setup? If so, I can put the
patch somewhere for you, but my tree is a month old or so, if you
don't mind...

PS: all performance figures were measured on a Haswell-based server
with 2 WD Re drives with 512-byte sectors (physical size). So your
numbers may vary, and I'm certainly interested to know them if you
benchmark.

On Tue, Dec 1, 2015 at 6:31 PM, Tinker <[email protected]> wrote:
> Hi!
>
> I heard someone was working with implementing Fletcher checksums in
> softraid.
>
> Do you know any updates on this?
>
>
>
> Fletcher checksums are how OpenBSD would guarantee that the data you read
> from disk actually has integrity. What makes it different from traditional
> checksumming is that it not only guarantees that a sector/block of data read
> has integrity within itself, but also that it actually belonged in the place
> on the disk that it was read from.
>
> This is of particular importance when having sensitive information on disks
> with sector mapping, like all SSD:s (and even magnet disks, or??) have,
> which can break down.
>
> For this reason, with ordinary filesystems, reading file contents could give
> you just about any data that's anywhere on the disk, while a Fletcher-based
> disk would give you a read error.
>
> So it's really like a night and day difference.
>
> https://en.wikipedia.org/wiki/Fletcher%27s_checksum
>
> Thanks!
> Tinker
