On Thu, Jun 23, 2016 at 06:26:22PM -0600, Chris Murphy wrote:
> On Thu, Jun 23, 2016 at 1:32 PM, Goffredo Baroncelli <kreij...@inwind.it> wrote:
> > The raid5 write hole is avoided in BTRFS (and in ZFS) thanks to the checksum.
> 
> Yeah I'm kinda confused on this point.
> 
> https://btrfs.wiki.kernel.org/index.php/RAID56
> 
> It says there is a write hole for Btrfs. But defines it in terms of
> parity possibly being stale after a crash. I think the term comes not
> from merely parity being wrong but parity being wrong *and* then being
> used to wrongly reconstruct data because it's blindly trusted.

I think the opposite is more likely, as the layers above raid56
seem to check the data against sums before raid56 ever sees it.
(If those layers seem inverted to you, I agree, but OTOH there are
probably good reasons to do it that way).

It looks like uncorrectable failures might occur because parity is
correct, but the parity checksum is out of date, so the parity checksum
doesn't match even though data blindly reconstructed from the parity
*would* match the data.

> I don't read code well enough, but I'd be surprised if Btrfs
> reconstructs from parity and doesn't then check the resulting
> reconstructed data to its EXTENT_CSUM.

I wouldn't be surprised if both things happen in different code paths,
given the number of different paths leading into the raid56 code and
the number of distinct failure modes it seems to have.
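
To make the two cases concrete, here is a toy model of a three-disk
stripe with XOR parity. It is only a sketch and has nothing to do with
the actual btrfs code paths: xor_blocks(), crc(), the block contents,
and the use of CRC32 as a stand-in for EXTENT_CSUM are all made up for
illustration. It shows a crash leaving parity stale, a rebuild that
blindly trusts that parity, and a checksum comparison on the rebuilt
block catching the bad result.

```python
import zlib


def xor_blocks(*blocks):
    """XOR equal-length byte blocks together (RAID5-style parity)."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def crc(block):
    """Stand-in for a per-block data checksum (hypothetical, not EXTENT_CSUM)."""
    return zlib.crc32(block)


# One stripe: two data blocks and their parity, all consistent.
d0_old, d1 = b"AAAA", b"BBBB"
parity_old = xor_blocks(d0_old, d1)
csum_d1 = crc(d1)  # checksum stored elsewhere, as a filesystem layer would

# d0 is overwritten, then a crash happens before parity is rewritten.
d0_new = b"CCCC"

# Later the disk holding d1 is lost and d1 is rebuilt from d0 + parity.
rebuilt_stale = xor_blocks(d0_new, parity_old)  # blindly trusts stale parity
stale_detected = crc(rebuilt_stale) != csum_d1  # checksum exposes the bad rebuild

# For comparison, the same rebuild when parity had been updated in time.
parity_new = xor_blocks(d0_new, d1)
rebuilt_good = xor_blocks(d0_new, parity_new)
good_verified = crc(rebuilt_good) == csum_d1

print("stale rebuild equals d1:", rebuilt_stale == d1)
print("stale rebuild caught by checksum:", stale_detected)
print("fresh rebuild verified by checksum:", good_verified)
```

In this model the checksum turns silent corruption into a detected
error, but it does not bring d1 back: with stale parity the block is
still unrecoverable from this stripe alone.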
