Re: CoW behavior when writing same content
On Tue, Oct 9, 2018 at 11:25 AM, Andrei Borzenkov wrote:
> 09.10.2018 18:52, Chris Murphy wrote:
>>> In this case, do root/big_file and snapshot/big_file still share the same
>>> data?
>>
>> You'll be left with three files. /big_file and root/big_file will
>> share extents,
>
> How come they share extents? This requires --reflink; is it the default now?

Good catch. It's not the default. I meant to write that initially only
root/big_file and snapshot/big_file have shared extents, and the shared
extents are lost when snapshot/big_file is "overwritten" by the copy
into snapshot/.

>> and snapshot/big_file will have its own extents. You'd
>> need to copy with --reflink for snapshot/big_file to have shared
>> extents with /big_file - or deduplicate.
>>
> This still overwrites the whole file in the sense that the original file
> content of "snapshot/big_file" is lost. That the new content happens to be
> identical, and that it will probably be reflinked, does not change the
> fact that the original file is gone.

Agreed.

-- 
Chris Murphy
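The --reflink copy discussed in this exchange can be sketched as a small shell helper. This is only an illustration: copy_shared is a made-up name, and it assumes GNU cp (which provides --reflink) and, for the reflink to succeed, that source and destination sit on the same Btrfs filesystem.

```shell
#!/bin/sh
# Hypothetical helper (not part of btrfs-progs or coreutils).
# usage: copy_shared SRC DST
# Tries a reflink copy first, so DST shares extents with SRC; falls back
# to an ordinary copy when the filesystem cannot reflink.
# Assumption: GNU cp, which provides the --reflink option.
copy_shared() {
    if cp --reflink=always -- "$1" "$2" 2>/dev/null; then
        # Extents are shared; no file data was duplicated.
        echo reflinked
    else
        # Plain copy: DST gets its own, unshared extents.
        cp -- "$1" "$2" && echo copied
    fi
}
```

Using this for both copies in the original example (into root/ and into snapshot/) should leave all three files sharing extents; using it only for the last copy would make snapshot/big_file share with /big_file but not with root/big_file. On Btrfs the result can be inspected with `btrfs filesystem du root/big_file snapshot/big_file`.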
Re: CoW behavior when writing same content
09.10.2018 18:52, Chris Murphy wrote:
> On Tue, Oct 9, 2018 at 8:48 AM, Gervais, Francois wrote:
>> Hi,
>>
>> If I have a snapshot where I overwrite a big file of which only a
>> small portion is different, will the whole file be rewritten in
>> the snapshot? Or only the different part of the file?
>

If you overwrite the whole file, the whole file will be overwritten.

> Depends on how the application modifies files. Many applications write
> out a whole new file with a pseudorandom filename, fsync, then rename.
>
>>
>> Something like:
>>
>> $ dd if=/dev/urandom of=/big_file bs=1M count=1024
>> $ cp /big_file root/
>> $ btrfs sub snap root snapshot
>> $ cp /big_file snapshot/
>>

And which portion of these three files is different? They must be
identical. Not that it really matters, but that does not match your
question.

>> In this case, do root/big_file and snapshot/big_file still share the same
>> data?
>
> You'll be left with three files. /big_file and root/big_file will
> share extents,

How come they share extents? This requires --reflink; is it the default now?

> and snapshot/big_file will have its own extents. You'd
> need to copy with --reflink for snapshot/big_file to have shared
> extents with /big_file - or deduplicate.
>

This still overwrites the whole file in the sense that the original file
content of "snapshot/big_file" is lost. That the new content happens to be
identical, and that it will probably be reflinked, does not change the
fact that the original file is gone.
Re: CoW behavior when writing same content
On Tue, 9 Oct 2018 09:52:00 -0600
Chris Murphy wrote:

> You'll be left with three files. /big_file and root/big_file will
> share extents, and snapshot/big_file will have its own extents. You'd
> need to copy with --reflink for snapshot/big_file to have shared
> extents with /big_file - or deduplicate.

Or use rsync for copying, in the mode where it reads and checksums blocks of
both files, to copy only the non-matching portions.

rsync --inplace

    This option is useful for transferring large files with
    block-based changes or appended data, and also on systems
    that are disk bound, not network bound. It can also help keep
    a copy-on-write filesystem snapshot from diverging the entire
    contents of a file that only has minor changes.

-- 
With respect,
Roman
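The idea behind an in-place, block-wise update can be sketched in plain shell. This is not how rsync is implemented, just a minimal illustration of the principle: update_inplace is a hypothetical helper, the 1 MiB default block size is an arbitrary choice, and cksum (a CRC) stands in for a stronger comparison.

```shell
#!/bin/sh
# Hypothetical helper (not rsync, not part of btrfs-progs).
# usage: update_inplace SRC DST [BLOCK_SIZE_IN_BYTES]
# Rewrites only the blocks of DST that differ from SRC, in place, and
# prints the number of blocks it rewrote.
update_inplace() {
    src=$1
    dst=$2
    bs=${3:-1048576}

    size=$(( $(wc -c < "$src") ))
    blocks=$(( (size + bs - 1) / bs ))
    rewritten=0
    i=0

    while [ "$i" -lt "$blocks" ]; do
        # Checksum block i of each file. cksum keeps the sketch portable;
        # a careful tool would use a stronger hash or compare the bytes.
        a=$(dd if="$src" bs="$bs" skip="$i" count=1 2>/dev/null | cksum)
        b=$(dd if="$dst" bs="$bs" skip="$i" count=1 2>/dev/null | cksum)
        if [ "$a" != "$b" ]; then
            # conv=notrunc leaves every other block of DST untouched, so
            # on a CoW filesystem only this block gets new extents.
            dd if="$src" of="$dst" bs="$bs" skip="$i" seek="$i" count=1 \
                conv=notrunc 2>/dev/null || return 1
            rewritten=$((rewritten + 1))
        fi
        i=$((i + 1))
    done

    # Cut DST down to SRC's length in case it was longer.
    dd if=/dev/null of="$dst" bs=1 seek="$size" 2>/dev/null || return 1

    echo "$rewritten"
}
```

With rsync itself, note that delta transfer is off by default when both paths are local (--whole-file is implied), so the invocation would probably need to look like `rsync --inplace --no-whole-file /big_file snapshot/` to get the behaviour described in the man page excerpt.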
Re: CoW behavior when writing same content
On Tue, Oct 9, 2018 at 8:48 AM, Gervais, Francois wrote:
> Hi,
>
> If I have a snapshot where I overwrite a big file of which only a
> small portion is different, will the whole file be rewritten in
> the snapshot? Or only the different part of the file?

Depends on how the application modifies files. Many applications write
out a whole new file with a pseudorandom filename, fsync, then rename.

>
> Something like:
>
> $ dd if=/dev/urandom of=/big_file bs=1M count=1024
> $ cp /big_file root/
> $ btrfs sub snap root snapshot
> $ cp /big_file snapshot/
>
> In this case, do root/big_file and snapshot/big_file still share the same data?

You'll be left with three files. /big_file and root/big_file will
share extents, and snapshot/big_file will have its own extents. You'd
need to copy with --reflink for snapshot/big_file to have shared
extents with /big_file - or deduplicate.

-- 
Chris Murphy
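The write-new-file, fsync, rename pattern mentioned in this reply looks roughly like the following in shell. It is a sketch under assumptions: replace_atomically and the .tmp.XXXXXX name are invented for illustration, and passing a file operand to sync assumes a reasonably recent coreutils.

```shell
#!/bin/sh
# Hypothetical helper illustrating the pattern; not taken from any
# particular application.
# usage: replace_atomically TARGET < new_contents
replace_atomically() {
    target=$1
    dir=$(dirname "$target")

    # Temporary file with a random suffix in the same directory, so the
    # final rename stays within one filesystem.
    tmp=$(mktemp "$dir/.tmp.XXXXXX") || return 1

    # Every byte goes into the new file, whether it changed or not.
    cat > "$tmp" || { rm -f -- "$tmp"; return 1; }

    # Flush the new file to disk before it takes over the name
    # (assumption: this sync accepts file operands).
    sync "$tmp" || { rm -f -- "$tmp"; return 1; }

    # Atomically point the name at the new inode.
    mv -f -- "$tmp" "$target"
}
```

Because the result is a brand-new inode with freshly written data, a file replaced this way inside a snapshot shares nothing with the copy in the original subvolume, however small the actual change was.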
CoW behavior when writing same content
Hi,

If I have a snapshot where I overwrite a big file of which only a small
portion is different, will the whole file be rewritten in the snapshot?
Or only the different part of the file?

Something like:

$ dd if=/dev/urandom of=/big_file bs=1M count=1024
$ cp /big_file root/
$ btrfs sub snap root snapshot
$ cp /big_file snapshot/

In this case, do root/big_file and snapshot/big_file still share the same
data?

Thank you