On Jul 2, 2016, at 1:18 AM, Pavel Raiskup <prais...@redhat.com> wrote:
> 
> There are optimizations in archivers (tar, rsync, ...) that rely on up-to-date
> st_blocks info.  For example, GNU tar has an optimization check [1] for
> whether 'st_size' reports more data than 'st_blocks' can hold; if so,
> tar considers the file sparse (and takes additional steps).
> 
> It looks like btrfs doesn't report the correct value in 'st_blocks' until the
> data is synced.  At the moment, the following can happen:
> 
>    a) some "tool" creates sparse file
>    b) that tool does not sync explicitly and exits ..
>    c) tar is called immediately after that to archive the sparse file
>    d) tar considers [2] the file completely sparse (because st_blocks is
>       zero) and archives no data, which results in data loss.
> 
> Because we fixed btrfs to report non-zero 'st_blocks' when the file data is
> small and in-lined (no real data blocks), I consider this, too, a bug in
> btrfs worth fixing.
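
To make the check described in the report concrete, here is a rough sketch in
Python.  It is not GNU tar's code: the function names are made up for
illustration, and it assumes st_blocks is counted in 512-byte units, as on
Linux.

```python
# Hypothetical helpers illustrating the st_size vs. st_blocks comparison.
# Assumption: st_blocks is expressed in 512-byte units (true on Linux).
ST_BLOCK_UNIT = 512


def blocks_needed(st_size: int) -> int:
    """Number of 512-byte units required to hold st_size bytes (rounded up)."""
    return (st_size + ST_BLOCK_UNIT - 1) // ST_BLOCK_UNIT


def looks_sparse(st_size: int, st_blocks: int) -> bool:
    """True when the allocated blocks cannot hold all st_size bytes."""
    return st_blocks < blocks_needed(st_size)


def looks_completely_sparse(st_size: int, st_blocks: int) -> bool:
    """True for a non-empty file reporting no allocated blocks at all.

    This is the case that goes wrong when the filesystem has not yet
    accounted for unsynced data: real data exists, but st_blocks is 0.
    """
    return st_size > 0 and st_blocks == 0
```

With os.stat() on Linux, the inputs would be st.st_size and st.st_blocks.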

We had a similar problem with both ext4 and Lustre - the client was reporting
zero blocks due to delayed allocation until data was written to disk.  While
those problems were fixed in the filesystem to report an estimate of the block
count before any blocks were actually written to disk, it seems like this may
be a problem that will come up again with other filesystems in the future.

I think in addition to fixing btrfs (because it needs to work with existing
tar/rsync/etc. tools) it makes sense to *also* fix the heuristics of tar
to handle this situation more robustly.  One option: if st_blocks == 0, tar
should also check whether st_mtime is less than 60s in the past.  If it is,
tar should either call fsync() on the file to flush any unwritten data to disk,
or assume the file is not sparse and read the whole file, so that it doesn't
wrongly treat the file as sparse and skip archiving its data.
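
A minimal sketch of that heuristic in Python, for illustration only.  The
function names and the returned strategy strings are hypothetical, the 60s
window is just the value suggested above, and the sketch takes the "read the
whole file" branch rather than calling fsync().

```python
import os
import time

RECENT_SECONDS = 60  # age window suggested above; not a tested value


def zero_blocks_untrustworthy(st_size, st_blocks, st_mtime, now):
    """True when st_blocks == 0 may only mean "not flushed to disk yet".

    A non-empty file with zero blocks that was modified within the last
    RECENT_SECONDS is treated as possibly having unwritten data.
    """
    return st_size > 0 and st_blocks == 0 and (now - st_mtime) < RECENT_SECONDS


def choose_strategy(path):
    """Pick how to archive `path` (hypothetical strategy names)."""
    st = os.stat(path)
    if zero_blocks_untrustworthy(st.st_size, st.st_blocks,
                                 st.st_mtime, time.time()):
        # Do not trust the block count: read and archive every byte.
        return "read-whole-file"
    # Block count is believable: the usual sparse detection may run.
    return "use-sparse-detection"
```

The time-based check keeps the extra cost limited to recently modified files;
older files with st_blocks == 0 are still handled by the existing fast path.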

Cheers, Andreas

> 
> [1] http://git.savannah.gnu.org/cgit/paxutils.git/tree/lib/system.h?id=ec72abd9dd63bbff4534ec77e97b1a6cadfc3cf8#n392
> [2] http://git.savannah.gnu.org/cgit/tar.git/tree/src/sparse.c?id=ac065c57fdc1788a2769fb119ed0c8146e1b9dd6#n273
> 
> Tested on kernel:
> kernel-4.5.7-300.fc24.x86_64
> 
> Originally reported here, reproducer available there:
> https://bugzilla.redhat.com/show_bug.cgi?id=1352061
> 
> Pavel
> 
> 