At 10/17/2016 02:54 AM, Stefan Priebe - Profihost AG wrote:
On 16.10.2016 at 00:37, Hans van Kranenburg wrote:

On 10/15/2016 10:49 PM, Stefan Priebe - Profihost AG wrote:

cp --reflink=always sometimes takes very long (e.g. 25-35 minutes).

An example:

source file:
# ls -la vm-279-disk-1.img
-rw-r--r-- 1 root root 204010946560 Oct 14 12:15 vm-279-disk-1.img

target file after around 10 minutes:
# ls -la vm-279-disk-1.img.tmp
-rw-r--r-- 1 root root 65022328832 Oct 15 22:13 vm-279-disk-1.img.tmp

Two quick thoughts:
1. How many extents does this img have?

filefrag says:
1011508 extents found

That is too many fragments.
The average extent size is only about 200K.
This is quite common for VM images when the no copy-on-write (C) attr is not set.
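
The 200K figure follows from the two numbers quoted above: the file size from ls and the extent count from filefrag. A quick check in shell arithmetic (avg_extent_size is just an illustrative helper name, not an existing tool):

```shell
# Average extent size = file size in bytes / number of extents.
# avg_extent_size is a hypothetical helper, not part of any package.
avg_extent_size() {
    local size_bytes=$1 extents=$2
    echo $(( size_bytes / extents ))
}

# Numbers taken from the ls and filefrag output quoted above.
avg_extent_size 204010946560 1011508    # prints 201689 (bytes, roughly 197 KiB)
```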

Normally it's not a good idea to put VM images into btrfs without any tuning.

Several default features of btrfs are not suitable for that use case:
1) Copy-on-Write
   A VM image sees a lot of random writes.
   With CoW, these create a lot of small extents, just as you see here.

   On traditional non-CoW filesystems, like Ext4 and (current) XFS,
   an overwrite is just an overwrite; the data isn't written to a new place.
   So on these filesystems, no matter how many writes happen, the
   extent count stays mostly unchanged.

2) Extent booking
   Another result of CoW: a data extent won't be freed until all of
   its referencers are removed.
   This can waste quite a lot of space.

3) Slow metadata operations
   Btrfs tree CoW and its locking mechanism make metadata operations
   quite slow compared to other filesystems.

   Normal read/write is not metadata-heavy, while reflinking is.
   (IIRC, xfs with reflink support, not mainlined yet, is faster than
    btrfs at reflinking.)

Normally, the no-CoW (C) attr is recommended for the VM image use case.
This flag makes btrfs act much like a traditional fs, until a snapshot containing this file is created.

It does have a limitation, though: it prohibits reflink, so you can't use cp --reflink=always then.
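
A minimal sketch of how the C attr is usually applied, assuming the image is copied into a fresh directory: chattr +C on a directory is inherited by files created in it afterwards, whereas setting it on a file that already holds data is not reliable. make_nocow_copy and the paths in the usage line are placeholders, not anything from this thread.

```shell
# Sketch: store a VM image without CoW by creating it inside a directory
# that carries the C attribute. make_nocow_copy is a hypothetical helper.
make_nocow_copy() {
    local src=$1 dir=$2
    mkdir -p "$dir" || return 1
    # On a directory, +C has no effect on the directory itself but is
    # inherited by files created in it afterwards.
    chattr +C "$dir" || return 1
    # Copy into a newly created file so it picks up the attribute.
    cp "$src" "$dir/" || return 1
    # List the attributes; a 'C' should appear in the flags column.
    lsattr "$dir"
}

# Usage (placeholder paths):
#   make_nocow_copy vm-279-disk-1.img /mnt/images-nocow
```

Note that this writes a full second copy of the data, so it needs as much free space as the image itself.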

If the no-CoW flag is not what you want, and no other snapshot/subvolume/reflinked file shares the file's extents, defrag is highly recommended before reflinking.

That will hugely reduce the number of extents (fragments) and so the time the reflink takes.

However, I suspect the defrag itself may take even longer than the reflink.
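
To see which of the two actually costs more on a given image, the defrag and the reflink can be timed separately. This is only a sketch: defrag_then_reflink is a made-up helper name, and the 32M target extent size passed to -t is an arbitrary choice, not a recommendation from this thread.

```shell
# Sketch: defragment an image, then reflink it, reporting the extent
# count and the elapsed seconds for each step.
# defrag_then_reflink is a hypothetical helper.
defrag_then_reflink() {
    local src=$1 dst=$2 start
    filefrag "$src"                      # extent count before
    start=$(date +%s)
    # -t sets the target extent size; 32M is an assumption here.
    btrfs filesystem defragment -t 32M "$src" || return 1
    echo "defrag took $(( $(date +%s) - start ))s"
    filefrag "$src"                      # extent count after
    start=$(date +%s)
    cp --reflink=always "$src" "$dst" || return 1
    echo "reflink took $(( $(date +%s) - start ))s"
}

# Usage (placeholder names):
#   defrag_then_reflink vm-279-disk-1.img vm-279-disk-1.img.tmp
```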


2. Is this an XY problem? Why not just put the img in a subvolume and
snapshot that?
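
For reference, the subvolume approach would look roughly like this. It assumes the image already lives alone in its own subvolume (created beforehand with btrfs subvolume create); snapshot_image_subvol and the paths are placeholders.

```shell
# Sketch: snapshot the subvolume that holds the image instead of
# reflinking the file. snapshot_image_subvol is a hypothetical helper.
snapshot_image_subvol() {
    local subvol=$1 snap=$2
    # -r makes the snapshot read-only; drop it for a writable snapshot.
    btrfs subvolume snapshot -r "$subvol" "$snap"
}

# Usage (placeholder paths):
#   snapshot_image_subvol /mnt/vm-279 /mnt/vm-279-snap
```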

Sorry, what's an XY problem?

Implementing cp reflink was easier - as the original code was based on
XFS. But shouldn't cp reflink / cloning a file be nearly identical to a
snapshot? Just creating refs to the extents?

To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to
More majordomo info at