But csum verification is a point-in-time check, not a tree-based transid verification. That means if there is stale data with a matching csum, we may silently return junk data.
The usual idea, then, is to use a stronger but slower csum in the first place, to avoid the csum-match case.
This is just a general observation; that's ok, let's assume the current point-in-time csum verification works (as opposed to tree-based parent transid verification).
This problem is easily reproducible when csum is disabled, but it is not impossible to hit even when csum is enabled.
In that case, it's the user who is to blame, for deciding to disable csum in the first place.
The point here is: the logic isn't aware of the write hole on the other disk, on which the metadata is not verified. I disagree that nodatasum, or the user, is to blame.
A tree-based integrity verification is important for all data, and that is what's missing.
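To make the failure mode above concrete, here is a toy model (plain Python, hypothetical names, not the actual btrfs code) of why a per-block csum check can pass on stale data while a parent-transid check would catch it:

```python
import zlib

def make_block(data: bytes, transid: int):
    # The csum is computed over the block's own contents, so a stale
    # block carries a csum that matches its own (stale) data.
    return {"data": data, "transid": transid, "csum": zlib.crc32(data)}

def csum_verify(block):
    # Point-in-time check: only asks "is this block self-consistent?"
    return zlib.crc32(block["data"]) == block["csum"]

def transid_verify(block, expected_transid):
    # Tree-based check: the parent records which generation it expects.
    return block["transid"] == expected_transid

# One mirror missed a write (write hole): it still holds generation 7,
# while the parent node expects generation 8.
stale = make_block(b"old contents", transid=7)
print(csum_verify(stale))        # True  -> stale data is returned silently
print(transid_verify(stale, 8))  # False -> staleness is detected
```

A stronger csum would not change the first result: the stale data was valid when written, so its csum matches by construction.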
Fix:
This RFC patch proposes to read the data from the same disk from which the metadata was read.
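A minimal sketch of the proposed read policy (toy Python, hypothetical names, assuming a raid1 extent with two mirrors, one of which missed the last write):

```python
# Two mirrors of a raid1 data extent; mirror "B" hit the write hole
# and still holds the previous generation of the data.
mirrors = {"A": b"new data, generation 8", "B": b"old data, generation 7"}

def read_data(mirrors, metadata_mirror=None):
    # Proposed policy: prefer the mirror the transid-verified metadata
    # came from, on the assumption that a disk which served good
    # metadata is less likely to have dropped the matching data write.
    if metadata_mirror is not None:
        return mirrors[metadata_mirror]
    # Default policy stand-in: any mirror may be chosen; here we model
    # the unlucky case where the stale mirror is picked.
    return mirrors["B"]

print(read_data(mirrors, metadata_mirror="A"))  # the fresh copy
print(read_data(mirrors))                       # the stale copy
```

This also shows why the idea is mirror-only: in a striped profile there is no second full copy to redirect the read to.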
The obvious problem I found is that the idea only works for RAID1/RAID10.
For striped profiles it makes no sense, and may even have a worse chance of getting stale data.
To me, the idea of using a possibly better mirror makes some sense, but it is very profile-limited.
Yep. This problem and fix apply only to mirror-based profiles such as raid1/raid10.
Another idea this inspires: make it more generic, so that a bad/stale device gets a lower priority.
When it comes to reading junk data, it's not about priority, it's about elimination. When the problem is only a few blocks, I am against marking the whole disk as bad.
And it suffers the same problem as I described above.
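To contrast the two policies discussed above, here is a toy sketch (hypothetical per-device state, not btrfs code): demoting a whole device versus eliminating only the devices known stale for a given block.

```python
devices = [
    {"name": "A", "priority": 2, "stale_blocks": set()},
    {"name": "B", "priority": 1, "stale_blocks": {100}},  # block 100 hit the write hole
]

def pick_by_priority(devices):
    # Priority policy: prefer the higher-priority device. The demoted
    # device is still a valid fallback, so a stale block can still be
    # served from it if the preferred mirror is unavailable.
    return max(devices, key=lambda d: d["priority"])

def pick_by_elimination(devices, block):
    # Elimination policy: exclude only devices known stale *for this
    # block*, rather than writing off the whole disk.
    for d in devices:
        if block not in d["stale_blocks"]:
            return d
    return None

print(pick_by_priority(devices)["name"])          # A
print(pick_by_elimination(devices, 100)["name"])  # A
```

Here both policies pick device A, but only elimination guarantees block 100 is never read from B, while still leaving B usable for all its good blocks.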
To make the point short: the use case looks very limited. It applies to raid1/raid10 with nodatacow (which implies nodatasum).
In my understanding that's not rare.
Any comments on the fix offered here?
Thanks, Anand
Thanks,
Qu