On 03/14/2018 01:02 PM, Austin S. Hemmelgarn wrote:
[...]
>>
>> In btrfs, a checksum mismatch causes an -EIO error on read. In a 
>> conventional filesystem (or a btrfs filesystem w/o datasum) there is no 
>> checksum, so this problem doesn't exist.
>>
>> I am curious how ZFS solves this problem.
> It supports neither disabling COW nor the O_DIRECT flag, so it just never 
> has the problem in the first place.

I would like to perform some tests; however, I think that you are right. If you 
use a "double buffering" approach (copy the data into the page cache, compute 
the checksum, then write the data to disk), the mismatch should not happen. Of 
course this is incompatible with O_DIRECT. But disabling O_DIRECT is only a 
prerequisite for "double buffering"; alone it may not be sufficient. What 
about mmap? Are we sure that it does double buffering?
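
To make the idea concrete, here is a minimal userspace sketch of the two write
paths. It is only a simulation under stated assumptions: zlib.crc32 stands in
for the crc32c that btrfs uses, the "disk" is just a bytes object, and all
function names are hypothetical, not kernel interfaces.

```python
import zlib


def write_without_copy(user_buf, modify):
    """O_DIRECT-like path: checksum the caller's buffer, then write that same buffer."""
    csum = zlib.crc32(user_buf)   # checksum taken from the live buffer
    modify(user_buf)              # application changes the buffer while the write is in flight
    on_disk = bytes(user_buf)     # the device ends up with the modified contents
    return csum, on_disk


def write_with_copy(user_buf, modify):
    """Double-buffering path: snapshot first, then checksum and write the snapshot."""
    stable = bytes(user_buf)      # private copy (the page cache in the real case)
    csum = zlib.crc32(stable)     # checksum taken from the stable copy
    modify(user_buf)              # later changes do not reach the snapshot
    on_disk = stable
    return csum, on_disk


def flip_first_byte(buf):
    """Hypothetical stand-in for a concurrent modification by the application."""
    buf[0] ^= 0xFF


def matches(csum, data):
    """True if the stored checksum agrees with the stored data."""
    return csum == zlib.crc32(data)
```

With the first function the stored checksum and the stored data disagree; with
the second they always agree, because both are derived from the same snapshot.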

I would prefer that btrfs didn't allow O_DIRECT with COW files. I prefer 
this to the checksum mismatch bug.


>>
>> However, I have to point out that this problem is not solved by COW. COW 
>> solves only the problem of an interrupted commit of the filesystem, where 
>> the data is updated in place (so it is visible to the user) but the 
>> metadata is not.
> COW is irrelevant if you're bypassing it.  It's only enforced for metadata so 
> that you don't have to check the FS every time you mount it (because the way 
> BTRFS uses it guarantees consistency of the metadata).
>>
>>>
>>> Even if not... It should only be a problem in case of a crash during
>>> that... and then I'd still prefer to get the false positive than bad
>>> data.
>>
>> How can you know whether it is "bad data" or a "bad checksum"?
> You can't directly.  Just like you can't know which copy in a two-device MD 
> RAID1 array is bad when they mismatch.
> 
> That's part of why I'm not all that fond of the idea of having checksums 
> without COW: you need to verify the data using secondary means anyway, so why 
> exactly should you waste time verifying it twice?

This is true

>>
>>>
>>> Anyway... it's not going to happen so the discussion is pointless.
>>> I think people can probably use dm-integrity (which btw: does no CoW
>>> either (IIRC) and still can provide integrity... ;-) ) to see whether
>>> their data is valid.
>>> Not nice, but since it won't change on btrfs, a possible alternative.
>>
>> Even in this case I am curious about how dm-integrity would solve this issue.
> dm-integrity uses journaling, and actually based on the testing I've done, 
> will typically have much worse performance than the overhead of just enabling 
> COW on files on BTRFS and manually defragmenting them on a regular basis.

Good to know


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5