Re: [PATCH] fix random failures in shell/integrity.sh

Mikulas Patocka Thu, 07 Aug 2025 07:08:53 -0700


On Thu, 7 Aug 2025, Stuart D Gathman wrote:

> On Wed, 6 Aug 2025, John Stoffel wrote:
> 
> > >>>>> "Mikulas" == Mikulas Patocka <mpato...@redhat.com> writes:
> > 
> > > The problem is that the raid1 implementation may freely choose which leg
> > > to read from. If it chooses to read from the non-corrupted leg, the
> > > corruption is not detected, the number of mismatches is not incremented
> > > and the test reports this as a failure.
> > 
> > So wait, how is integrity supposed to work in this situation then?  In
> > real life?  I understand the test is hard, maybe doing it in a loop
> > three times?  Or configure the RAID1 to prefer one half over another
> > is the way to make this test work?

If you want to make sure that you detect (and correct) all mismatches, you 
have to scrub the raid array.
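
For an LVM raid LV, a scrub might look roughly like the sketch below. The 
name vg/lv is a placeholder and the helper function scrub_raid_lv is made up 
for illustration; it relies only on "lvchange --syncaction" and the 
raid_sync_action and raid_mismatch_count report fields described in 
lvmraid(7).

```shell
# Sketch only: scrub_raid_lv is a hypothetical helper, not part of LVM.
# Usage: scrub_raid_lv vg/lv   (vg/lv is a placeholder name)
scrub_raid_lv() {
    lv=$1

    # Start a read-only scrub that reads every leg of the array.
    # Use "--syncaction repair" instead to also rewrite bad blocks.
    lvchange --syncaction check "$lv" || return 1

    # Poll until the sync action reports "idle" again. This assumes the
    # action is already visible by the time we first look at it.
    while [ "$(lvs --noheadings -o raid_sync_action "$lv" | tr -d ' ')" != "idle" ]; do
        sleep 1
    done

    # Print the mismatch count that raid recorded during the scrub.
    lvs --noheadings -o raid_mismatch_count "$lv" | tr -d ' '
}
```

Because the scrub reads all legs, it does not depend on which leg raid1 
would have picked for an ordinary read. With integrity enabled, the 
integritymismatches report field is probably the more interesting counter 
to check afterwards.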

> Linux needs an optional parameter to read() syscall that is "leg index"
> for the blk interface.  Thus, btrfs scrub can check all legs, and this
> test can check all legs.  Filesystems with checks can repair corruption
> by rewriting the block after finding a leg with correct csum.
> 
> This only needs a few bits (how many legs can there be?), so can go in
> the FLAGS argument.

I think that adding a new bit for the read syscalls is not a workable 
solution. There are so many programs using the read() syscall that teaching 
them all to use this new bit is impossible.

Mikulas
