On Wed, 18.02.15 06:22, Andrei Borzenkov (arvidjaar at gmail.com) wrote:

> On Wed, 18 Feb 2015 01:14:44 +0100
> Zbigniew Jędrzejewski-Szmek <zbyszek at in.waw.pl> wrote:
>
> > On Tue, Feb 17, 2015 at 08:05:29PM +0100, Goffredo Baroncelli wrote:
> > > Hi Lennart,
> > >
> > > On 2015-02-16 23:59, Lennart Poettering wrote:
> > > >         * journald now sets the special FS_NOCOW file flag for its
> > > >           journal files. This should improve performance on btrfs, by
> > > >           avoiding heavy fragmentation when journald's write-pattern
> > > >           is used on COW file systems. It degrades btrfs' data
> > > >           integrity guarantees for the files to the same levels as for
> > > >           ext3/ext4 however. This should be OK though as journald does
> > > >           its own data integrity checks and all its objects are
> > > >           checksummed on disk. Also, journald should handle btrfs disk
> > > >           full events a lot more gracefully now, by processing SIGBUS
> > > >           errors, and not relying on fallocate() anymore.
> > >
> > > If I read correctly the code, the FS_NOCOW is a temporary workaround, i.e.
> > > when the file is closed (or rotated ?) the FS_NOCOW flags is unset again.
> > > It is true ?
> > Yes, but you miss the point in general. FS_NOCOW is set during the
> > entire time when the file is being written to, which could be months,
> > and then it is unset when the file will not be written to anymore. So
> > indeed, the file is not protected by btrfs checksums for the majority
> > of time, but journald does its own checksumming, so the contents are
> > protected in a different way.
> >
>
> btrfs checksumming theoretically allows you to transparently recover
> after media corruption if filesystem has redundancy (more than one copy
> of data). Journald checksum will probably detect corruption, but can it
> repair it?

No, it cannot.

But btrfs checksumming cannot fix things for you either if you lose
non-trivial amounts of data. It might be able to fix a few bit
errors, but not non-trivial amounts. I mean, that's a simple property
of error correction codes: the more you want to be able to correct, the
longer your checksum must be. Neither btrfs' nor journald's are
substantial enough to correct even a sector...
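
To make the size argument concrete, here is a minimal sketch. It assumes a
4096-byte sector and uses Python's zlib.crc32 purely as a stand-in 32-bit
checksum; it is not the code path of btrfs or journald. The checksum notices
the damage, but at 32 bits against 32768 bits of data it carries nowhere near
enough information to rebuild the sector.

```python
import zlib

SECTOR_SIZE = 4096  # assumed sector size in bytes (illustrative)

# Build one sector of sample data and record its 32-bit checksum.
sector = bytes(range(256)) * (SECTOR_SIZE // 256)
stored_crc = zlib.crc32(sector)

# Flip a single bit, as a media error might.
damaged = bytearray(sector)
damaged[100] ^= 0x01
damaged = bytes(damaged)

# The checksum mismatch tells us the sector is bad...
detected = zlib.crc32(damaged) != stored_crc

# ...but the checksum is tiny compared to the data it covers,
# so it cannot be used to reconstruct a lost sector.
checksum_bits = 32
sector_bits = SECTOR_SIZE * 8
ratio = sector_bits // checksum_bits  # data bits per checksum bit
```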

Lennart

--
Lennart Poettering, Red Hat

Hi Lennart,

It's correct that checksums alone are not suitable for recovering a file;
BUT when using btrfs RAID, checksums are used to determine which copy of the
file is corrupted
(and to restore it, if a redundant good copy exists).

Using FS_NOCOW on journal files prevents btrfs from restoring the journal,
even if a sane copy exists
(for example after a hardware or drive failure).
That probably means losing important data.
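
To illustrate what I mean, here is a minimal sketch of checksum-guided repair
across mirrored copies. It is only a model: read_with_repair and checksum are
hypothetical helper names, and zlib.crc32 stands in for whatever checksum the
filesystem actually uses. The point is that the checksum does not repair
anything by itself; it only tells us which copy to trust, and the repair comes
from the redundant copy.

```python
import zlib


def checksum(data):
    """Stand-in 32-bit checksum (assumption: not the filesystem's real one)."""
    return zlib.crc32(data)


def read_with_repair(copies, expected):
    """Return a copy matching `expected` and overwrite any bad copies with it.

    `copies` is a list of bytearray objects modelling mirrored blocks.
    Raises IOError if no copy passes the checksum.
    """
    good = None
    for copy in copies:
        if checksum(copy) == expected:
            good = bytes(copy)
            break
    if good is None:
        raise IOError("all copies fail the checksum; nothing to repair from")

    # Rewrite every copy that does not match, using the verified one.
    for copy in copies:
        if checksum(copy) != expected:
            copy[:] = good
    return good


# Usage: two mirrors, one of them silently corrupted.
block = b"journal entry " * 10
expected = checksum(block)
mirrors = [bytearray(block), bytearray(block)]
mirrors[0][3] ^= 0xFF  # simulate corruption on the first device

data = read_with_repair(mirrors, expected)
```

Without a stored checksum for the block, which is the situation for NOCOW
files on btrfs, the two mirrors simply disagree and there is no way to tell
which one is the good copy.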

While this IMHO seems like a temporary workaround until btrfs autodefrag (on a
per-file basis) exists,
I'd rather make this configurable, and surely not the default!

Do you have any further info or opinion on this?


Best regards,
Florian
