On Tue, 2017-08-15 at 07:37 -0400, Austin S. Hemmelgarn wrote:
> Go look at Chrome, or Firefox, or Opera, or any other major web
> browser.  At minimum, they will safely bail out if they detect
> corruption in the user profile and can trivially resync the profile
> from another system if the user has profile sync set up.

Aha... I'd rather see a concrete reference to a white paper or to
code showing that these programs actually *do* their own checksumming.
But even going by what you claim here (that they would only detect the
corruption and then resync from another system, which is nothing other
than recovering from a backup), I don't see the big problem with EIO.
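
To make that concrete, here is a minimal Python sketch of an application
that does its own checksumming and "resyncs" on failure. The names
read_verified, load_profile and resync are hypothetical, and SHA-256 is
only an assumed choice; this is not how any particular browser works.
The point is that EIO from the filesystem ends up in the same recovery
path as the application's own checksum mismatch:

```python
import errno
import hashlib


def read_verified(path, expected_sha256):
    """Return the file's bytes, or None if the data cannot be trusted."""
    try:
        with open(path, "rb") as f:
            data = f.read()
    except OSError as e:
        if e.errno == errno.EIO:
            # The filesystem itself refused to hand out data it could not
            # verify (e.g. a checksum mismatch detected below the application).
            return None
        raise
    if hashlib.sha256(data).hexdigest() != expected_sha256:
        # The application's own checksum caught the corruption.
        return None
    return data


def load_profile(path, expected_sha256, resync):
    """Load the data, falling back to resync() (hypothetical) if untrusted."""
    data = read_verified(path, expected_sha256)
    if data is None:
        # Same recovery path for EIO and for an application-level mismatch.
        data = resync()
    return data
```

Either way the application restores from another copy; EIO just means
the detection happened one layer further down.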


> Go take a look at any enterprise database application from a
> reasonable company, it will almost always support replication across
> systems and validate data it reads.

Okay, I already showed you that PostgreSQL, MySQL, BDB and SQLite
either can't do this or don't do it by default... so which enterprise
DB do you mean (Oracle?), and where's the reference showing that they
really do general checksumming? And that EIO would be a problem for
their recovery strategies?

And again, we're not talking about the WALs (or whatever these programs
call them), which are there to handle a crash... we are talking about
silent data corruption.



> Agreed, but there's also the counter argument for log files that most
> people who are not running servers rarely (if ever) look at old logs,
> and it's the old logs that are the most likely to have at rest
> corruption (the longer something sits idle on media, the more likely
> it will suffer from a media error).

I don't have any solid proof that it's really the "idle" data that is
most likely to suffer silent corruption (at least not for all types of
storage media), but even if that is the case, as you say... then it's
probably more likely to hit the stuff in /usr/, /lib/ and so on on
stable distros... logs are typically rotated and then rewritten at
least once (when compressed).


> Go install OpenSUSE in a VM.  Look at what filesystem it uses.  Go
> install Solaris in a VM, lo and behold it uses ZFS _with no option for
> anything else_ as it's root filesystem.  Go install a recent version
> of Windows server in a VM, notice that it also has the option of a
> properly checked filesystem (ReFS).  Go install FreeBSD in a VM,
> notice that it provides the option (which is actively recommended by
> many people who use FreeBSD) to install with root on ZFS.  Install
> Android or Chrome OS (or AOSP or Chromium OS) in a VM.  Root the
> system and take a look at the storage stack, both of them use
> dm-verity, and Android (and possibly Chrome OS too, not 100% certain)
> uses per-file AEAD through the VFS encryption API on encrypted
> devices.

So your argument for not adding support for this is basically:
People don't or shouldn't use btrfs for this? o.O



> The fact that some OS'es blindly trust the underlying storage
> hardware is not our issue, it's their issue, and it shouldn't be
> 'fixed' by BTRFS because it doesn't just affect their customers who
> run the OS in a VM on BTRFS.

Then you can probably drop checksumming from btrfs altogether. And with
the same "argument", any other advanced feature too.
For resilience there is hardware RAID or Linux MD RAID... so no need
to keep it in btrfs o.O


> Most enterprise database apps offer support for replication, and
> quite a few do their own data validation when reading from the
> database.

First of all: replication != the capability to detect silent data
corruption.
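
A toy Python sketch of why (everything here is hypothetical: the block
layout, the helper names and the use of CRC32 are assumptions for
illustration, not any real database's replication scheme). A replica
faithfully copies whatever the primary holds, so a bit flipped at rest
is replicated too; only a checksum taken at write time notices it:

```python
import zlib


def write_block(data):
    # Store the payload together with a CRC32 taken at write time.
    return {"data": bytearray(data), "crc": zlib.crc32(data)}


def replicate(block):
    # Replication copies whatever the primary currently holds, good or bad.
    return {"data": bytearray(block["data"]), "crc": block["crc"]}


def is_intact(block):
    # Recompute the CRC32 and compare it with the stored one.
    return zlib.crc32(bytes(block["data"])) == block["crc"]


primary = write_block(b"account balance: 100")
primary["data"][-1] ^= 0x01   # a single bit flips silently at rest
replica = replicate(primary)  # the corruption is copied along

# Both copies agree with each other, yet both are wrong.
replicas_agree = primary["data"] == replica["data"]
```

Comparing the two copies says everything is fine; is_intact() is what
reports the problem, on the primary and on the replica alike.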

You still haven't named a single one that does checksumming by
default. At least those that are quite popular in the FLOSS world
don't seem to.



Cheers,
Chris.
