But csum verification is a point-in-time verification, not a tree-based
transid verification. Which means that if there is stale data with a
matching csum, we may silently return junk data.

Then the normal idea is to use a stronger but slower csum in the first
place, to reduce the chance that stale data still matches its csum.

 This is just a general observational comment; it's OK, let's assume the
 current point-in-time csum verification works (as opposed to tree-based
 parent transid verification).
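
To spell out the distinction being made here, a minimal standalone
sketch (hypothetical structures and names, not btrfs's actual
verification code):

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct block {
	uint64_t transid;	/* generation that last wrote this block */
	uint32_t csum;		/* checksum of the payload */
	char payload[4096];
};

/* stand-in checksum; btrfs would use crc32c/xxhash/sha256/blake2b */
static uint32_t calc_csum(const char *buf, size_t len)
{
	uint32_t sum = 0;

	while (len--)
		sum = sum * 31 + (uint8_t)*buf++;
	return sum;
}

/*
 * Point-in-time verification: only proves this copy is internally
 * consistent.  A stale copy left behind by a lost write is internally
 * consistent too, so it passes.
 */
static bool csum_ok(const struct block *b)
{
	return b->csum == calc_csum(b->payload, sizeof(b->payload));
}

/*
 * Tree-based verification: the parent records the transid it expects,
 * so a stale copy (older transid) is rejected even with a valid csum.
 */
static bool transid_ok(const struct block *b, uint64_t expected)
{
	return csum_ok(b) && b->transid == expected;
}

int main(void)
{
	struct block stale = { .transid = 100 };

	memcpy(stale.payload, "old contents", 13);
	stale.csum = calc_csum(stale.payload, sizeof(stale.payload));

	/* the stale copy passes the point-in-time check ... */
	int pass_csum = csum_ok(&stale);
	/* ... but fails once the parent expects transid 101 */
	int pass_transid = transid_ok(&stale, 101);

	return !(pass_csum && !pass_transid);
}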

  This problem is easily reproducible when csum is disabled, but it is
  not impossible to hit even when csum is enabled.

In this case, the user is to blame for the decision to disable the
csum in the first place.

 The point here is: the logic isn't aware of the write hole on the other
 disk, on which the metadata has not been verified. I disagree that
 nocsum or the user is to blame.
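
A toy illustration of the write hole being described (not btrfs code):
a crash between the two mirror writes leaves the copies divergent, and
with nodatasum nothing flags the stale copy on a later read.

#include <stdio.h>

int main(void)
{
	const char *mirror[2] = {
		"new data",	/* write to disk 0 completed */
		"old data",	/* write to disk 1 was lost in the crash */
	};
	int pid;

	/* a pid-based mirror pick lands on either copy, junk included */
	for (pid = 0; pid < 2; pid++)
		printf("reader %d sees: %s\n", pid, mirror[pid % 2]);
	return 0;
}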


  A tree-based integrity verification is important for all data, and it
  is currently missing.

  Fix:
  In this RFC patch it proposes to use the same disk from which the
  metadata is read to read the data.
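
If I read the proposal right, the policy change could look roughly like
this (a hedged sketch with invented names, not the actual patch; it only
assumes the default raid1 read policy is pid-based, which it
historically is in btrfs):

struct read_ctx {
	int metadata_mirror;	/* mirror the metadata came from, -1 if unknown */
};

/* default-style policy: rotate across mirrors, unaware of staleness */
static int pick_mirror_default(int num_mirrors, int pid)
{
	return pid % num_mirrors;
}

/*
 * Proposed policy: follow the metadata's mirror (already csum- and
 * transid-verified) for the data read, so a write hole confined to the
 * other disk cannot feed us stale data.
 */
static int pick_mirror_follow_metadata(const struct read_ctx *ctx,
				       int num_mirrors)
{
	if (ctx->metadata_mirror >= 0 && ctx->metadata_mirror < num_mirrors)
		return ctx->metadata_mirror;
	return 0;	/* fall back to the first mirror */
}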

The obvious problem I found is that the idea only works for RAID1/10.

For striped profiles it makes no sense, or may even have a worse chance
of returning stale data.
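
To make the striping point concrete, a small self-contained
illustration (simplified chunk math, not btrfs's real mapping code; it
only assumes the 64K default stripe length):

#include <stdio.h>

#define STRIPE_LEN	(64 * 1024)	/* btrfs default stripe length */

static int dev_for_offset(unsigned long long logical, int num_devs)
{
	return (int)((logical / STRIPE_LEN) % num_devs);
}

int main(void)
{
	/* a 256K read on a 2-device RAID0 alternates between both disks,
	 * so there is no single "same disk" to follow */
	unsigned long long off;

	for (off = 0; off < 256 * 1024; off += STRIPE_LEN)
		printf("offset %lluK -> dev %d\n", off / 1024,
		       dev_for_offset(off, 2));
	return 0;
}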


To me, the idea of using a possibly better mirror makes some sense, but
it is very profile-limited.

 Yep. This problem and fix are only for the mirror-based profiles such
 as raid1/raid10.


Another idea this inspires is to make it more generic, so that a
bad/stale device gets a lower priority.
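
A hypothetical sketch of that "lower priority" idea (invented names,
not existing btrfs code): weight the mirror choice by a per-device
error count instead of hard-pinning reads to one disk.

struct dev_state {
	int read_errors;	/* csum/transid failures seen on this device */
};

/* prefer the mirror whose device has accumulated the fewest errors */
static int pick_mirror_by_priority(const struct dev_state *devs,
				   int num_mirrors)
{
	int best = 0;
	int i;

	for (i = 1; i < num_mirrors; i++)
		if (devs[i].read_errors < devs[best].read_errors)
			best = i;
	return best;
}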

 When it comes to reading junk data, it's not about priority, it's
 about elimination. When the problem is only a few blocks, I am against
 marking the whole disk as bad.

Although it suffers from the same problem I described above.

To make the point short, the use case looks very limited.

 It applies to raid1/raid10 with nodatacow (which implies nodatasum).
 In my understanding that's not rare.

 Any comments on the fix offered here?

Thanks, Anand


Thanks,
Qu
