On Tue, 2017-08-15 at 07:37 -0400, Austin S. Hemmelgarn wrote:
> Go look at Chrome, or Firefox, or Opera, or any other major web
> browser. At minimum, they will safely bail out if they detect
> corruption in the user profile and can trivially resync the profile
> from another system if the user has profile sync set up.

Aha... I'd rather see a concrete reference to some white paper or code
where one can really see that these programs actually *do* their own
checksumming.
But even going by what you claim here (that they'd only detect the
corruption and then resync from another system, which is nothing other
than recovering from a backup), I don't see the big problem with EIO.


> Go take a look at any enterprise database application from a
> reasonable company, it will almost always support replication across
> systems and validate data it reads.

Okay, I already showed you that PostgreSQL, MySQL, BDB and SQLite
either can't do this or don't do it by default... so which enterprise
DB do you mean (Oracle?), and where's the reference that shows that
they really do general checksumming? And that EIO would be a problem
for their recovery strategies?

And again, we're not talking about the WALs (or whatever these
programs call them), which are there to handle a crash... we are
talking about silent data corruption.


> Agreed, but there's also the counter argument for log files that
> most people who are not running servers rarely (if ever) look at old
> logs, and it's the old logs that are the most likely to have at rest
> corruption (the longer something sits idle on media, the more likely
> it will suffer from a media error).

I don't have any valid proof that it's really the "idle" data which is
the most likely to suffer silent corruption (at least not for all
types of storage media), but even if that is the case as you say...
then it's probably more likely to hit /usr/, /lib/ and so on, on
stable distros... logs are typically rotated and then re-written at
least once (when compressed).


> Go install OpenSUSE in a VM. Look at what filesystem it uses. Go
> install Solaris in a VM, lo and behold it uses ZFS _with no option
> for anything else_ as it's root filesystem. Go install a recent
> version of Windows server in a VM, notice that it also has the
> option of a properly checked filesystem (ReFS). Go install FreeBSD
> in a VM, notice that it provides the option (which is actively
> recommended by many people who use FreeBSD) to install with root on
> ZFS. Install Android or Chrome OS (or AOSP or Chromium OS) in a VM.
> Root the system and take a look at the storage stack, both of them
> use dm-verity, and Android (and possibly Chrome OS too, not 100%
> certain) uses per-file AEAD through the VFS encryption API on
> encrypted devices.

So your argument for not adding support for this is basically: people
don't or shouldn't use btrfs for this? o.O


> The fact that some OS'es blindly trust the underlying storage
> hardware is not our issue, it's their issue, and it shouldn't be
> 'fixed' by BTRFS because it doesn't just affect their customers who
> run the OS in a VM on BTRFS.

Then you can probably drop checksumming from btrfs altogether. And,
with the same "argument", any other advanced feature.
For resilience there is hardware RAID or Linux's MD RAID... so no need
to keep it in btrfs o.O


> Most enterprise database apps offer support for replication, and
> quite a few do their own data validation when reading from the
> database.

First of all... replication != the capability to detect silent data
corruption.

You still haven't named a single one which does checksumming by
default. At least those which are quite popular in the FLOSS world
don't seem to.


Cheers,
Chris.