Hi Lionel,

On 2021-03-29 21:53, Lionel Bouton wrote:
> Hi Claudius,
>
> On 29/03/2021 at 21:14, Claudius Heine wrote:
>> [...]
>> Are you sure?
>>
>> I did a test with a 32MiB random file. I created one snapshot, then
>> changed (not deleted or added) one byte in that file and then created
>> a snapshot again. `btrfs send` created a >32MiB `btrfs-stream` file.
>> If it were only block-based, I would have expected it to contain just
>> the changed block, not the whole file.
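
(For context, my test was roughly the following; paths are made up, and
the one-byte change in step 3 is exactly the step in question:)

  # 1. create a 32MiB random file in a btrfs subvolume
  dd if=/dev/urandom of=/mnt/subvol/testfile bs=1M count=32
  # 2. take a read-only snapshot
  btrfs subvolume snapshot -r /mnt/subvol /mnt/snap1
  # 3. change one byte in /mnt/subvol/testfile (the tool used here is
  #    what is in question)
  # 4. take a second read-only snapshot
  btrfs subvolume snapshot -r /mnt/subvol /mnt/snap2
  # 5. measure the size of the incremental stream
  btrfs send -p /mnt/snap1 /mnt/snap2 | wc -c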

> I suspect there is another possible explanation: the tool you used to
> change one byte actually rewrote the whole file.
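
Indeed, that distinction matters. For what it's worth, something like
this (GNU dd, an untested sketch) should change a single byte truly in
place, without rewriting the rest of the file:

  # overwrite exactly 1 byte at offset 1000, keeping the rest intact
  printf 'X' | dd of=/mnt/subvol/testfile bs=1 seek=1000 conv=notrunc

whereas "sed -i" and most editors write a complete new copy of the file
and rename it over the original.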

> You can test this by appending data to your file (for example with "cat
> otherfile >> originalfile" or "dd if=/dev/urandom of=originalfile bs=1M
> count=4 conv=notrunc oflag=append") and checking the size of `btrfs
> send`'s output.

> When I append data with dd as described above to a 32M file originally
> created with "dd if=/dev/urandom of=originalfile bs=1M count=32", I get
> a file with 1 extent only in each snapshot, both marked shared, and a
> little over 4M in `btrfs send`'s output.
> filefrag -v should tell you if the extents in your file are shared.
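
(To check that here, comparing the two snapshots' copies of the file
should work; hypothetical paths again:)

  filefrag -v /mnt/snap1/testfile
  filefrag -v /mnt/snap2/testfile
  # extents flagged "shared" are referenced by more than one file or
  # snapshot; matching physical_offset values in both listings mean the
  # data blocks are reused rather than copied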

> Note that if you use compression and your files compress well, they
> will use small extents (128kB from memory). This can be bad when you
> try to avoid fragmentation, but it could help COW find more data to
> share, if I understand how COW works with respect to extents correctly.
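
(A sketch to observe this, assuming the compress=zstd mount option and a
file that compresses well:)

  mount -o remount,compress=zstd /mnt
  dd if=/dev/zero of=/mnt/subvol/zeros bs=1M count=32  # compresses well
  filefrag -v /mnt/subvol/zeros  # expect many small (128K) extents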

Finally, using "dd if=/dev/urandom of=originalfile bs=1M count=1
conv=notrunc seek=12M" to write in the middle of my now 36M file results
in a little over 1M with `btrfs send` using -p <previous snapshot>
And filefrag -v shows 3 extents for this file. 2 of them share the same
logical offsets than the file in the previous snapshot, the last use a
new range, confirming the allocation of a new extent and reuse of the
previous ones.
This seems to confirm my hypothesis that the tool you used did rewrite
the whole file.

Yes, I think you are right here. I will have to experiment with this a
bit further. Thanks!
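
One check I plan to add when experimenting further: whether the tool
replaces the file or modifies it in place should show up in the inode
number (hypothetical path):

  stat -c %i /mnt/subvol/testfile   # note the inode number
  # ... change one byte with the tool under test ...
  stat -c %i /mnt/subvol/testfile   # a new inode number means the file
                                    # was replaced (write-and-rename)
  # if the inode is unchanged, filefrag -v before/after still shows
  # whether new extents were allocated for the whole file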

regards,
Claudius
