On Fri, 24 Aug 2018 at 7:41, Lakshmipathi.G <lakshmipath...@giis.co.in> wrote:
>
> Hi -
>
> dduper is an offline dedupe tool. Instead of reading whole file blocks and
> computing checksums, it works by fetching checksums from the BTRFS csum tree.
> This hugely improves performance.
>
> dduper works like this:
> - Read the csums for the two given files.
> - Find matching locations.
> - Pass the locations to ioctl_ficlonerange directly
>   instead of ioctl_fideduperange
>
> By default, dduper adds a safety check to the above steps by creating a
> backup reflink file and comparing the md5sum after dedupe.
> If the backup file matches the new deduped file, the backup file is
> removed. You can skip this check by passing the --skip option. Here is
> sample cli usage [1] and a quick demo [2]
>
> Some performance numbers (with the --skip option):
>
> Dedupe two 1GB files with same content - 1.2 seconds
> Dedupe two 5GB files with same content - 8.2 seconds
> Dedupe two 10GB files with same content - 13.8 seconds
>
> dduper requires the `btrfs inspect-internal dump-csum` command; you can use
> this branch [3] or apply the patch yourself [4]
>
> [1] https://gitlab.collabora.com/laks/btrfs-progs/blob/dump_csum/Documentation/dduper_usage.md
> [2] http://giis.co.in/btrfs_dedupe.gif
> [3] git clone https://gitlab.collabora.com/laks/btrfs-progs.git -b dump_csum
> [4] https://patchwork.kernel.org/patch/10540229/
>
> Please remember it's version 0.1, so test it out if you plan to use dduper
> on real data.
> Let me know if you have suggestions, feedback, or bugs :)
>
> Cheers.
> Lakshmipathi.G
>
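
For reference, here is how I read the "find matching location" step, as a rough sketch and not dduper's actual code. It assumes one checksum per 4 KiB block and that the per-block checksums are already parsed into two lists; `matching_ranges` and its arguments are hypothetical names, and the dump-csum output format is not modelled here.

```python
BLOCK_SIZE = 4096  # assumption: one checksum per 4 KiB data block


def matching_ranges(src_csums, dst_csums, block_size=BLOCK_SIZE):
    """Return (src_offset, dst_offset, length) byte ranges with equal checksums.

    src_csums / dst_csums are per-block checksum lists (hypothetical input).
    Adjacent matching blocks are merged into one longer range.
    """
    # Remember the first source block that carries each checksum value.
    first_seen = {}
    for i, csum in enumerate(src_csums):
        first_seen.setdefault(csum, i)

    ranges = []
    for j, csum in enumerate(dst_csums):
        i = first_seen.get(csum)
        if i is None:
            continue  # no source block has this checksum
        if ranges:
            s, d, n = ranges[-1]
            # Extend the previous range when both sides continue contiguously.
            if s + n == i * block_size and d + n == j * block_size:
                ranges[-1] = (s, d, n + block_size)
                continue
        ranges.append((i * block_size, j * block_size, block_size))
    return ranges


# Blocks 1..3 match at the same offsets, so they merge into one range.
print(matching_ranges([1, 2, 3, 4], [9, 2, 3, 4]))
```

Note that equal checksums only suggest equal data: two different blocks can share a checksum, so whatever consumes these ranges still has to make sure the bytes really match.
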
One question: why not ioctl_fideduperange?
Without it, you lose the main benefit of that ioctl: atomicity.

-- 
Have a nice day,
Timofey.
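
P.S. To make the difference concrete, here is a minimal sketch of both calls from Python. It is an illustration only, not dduper's code. The struct layouts follow `<linux/fs.h>`; the request numbers are computed with the generic `_IOC` encoding used on x86 and ARM (an assumption, some architectures encode them differently), and `clone_range` / `dedupe_range` are hypothetical helper names.

```python
import fcntl
import struct

# struct file_clone_range { s64 src_fd; u64 src_offset, src_length, dest_offset; }
CLONE_FMT = "=qQQQ"
# struct file_dedupe_range { u64 src_offset, src_length; u16 dest_count, reserved1; u32 reserved2; }
DEDUPE_HDR_FMT = "=QQHHI"
# struct file_dedupe_range_info { s64 dest_fd; u64 dest_offset, bytes_deduped; s32 status; u32 reserved; }
DEDUPE_INFO_FMT = "=qQQiI"

_IOC_WRITE, _IOC_READ = 1, 2


def _ioc(direction, type_, nr, size):
    # Generic ioctl number encoding (assumed: x86/ARM layout).
    return (direction << 30) | (size << 16) | (type_ << 8) | nr


FICLONERANGE = _ioc(_IOC_WRITE, 0x94, 13, struct.calcsize(CLONE_FMT))
FIDEDUPERANGE = _ioc(_IOC_WRITE | _IOC_READ, 0x94, 54,
                     struct.calcsize(DEDUPE_HDR_FMT))

FILE_DEDUPE_RANGE_SAME = 0
FILE_DEDUPE_RANGE_DIFFERS = 1


def clone_range(src_fd, src_off, length, dst_fd, dst_off):
    """Share extents unconditionally; the kernel does not compare any data."""
    arg = struct.pack(CLONE_FMT, src_fd, src_off, length, dst_off)
    fcntl.ioctl(dst_fd, FICLONERANGE, arg)  # issued on the destination fd


def dedupe_range(src_fd, src_off, length, dst_fd, dst_off):
    """Share extents only if the kernel finds both ranges byte-identical."""
    hdr = struct.pack(DEDUPE_HDR_FMT, src_off, length, 1, 0, 0)
    info = struct.pack(DEDUPE_INFO_FMT, dst_fd, dst_off, 0, 0, 0)
    buf = bytearray(hdr + info)
    fcntl.ioctl(src_fd, FIDEDUPERANGE, buf)  # issued on the source fd
    _, _, bytes_deduped, status, _ = struct.unpack_from(
        DEDUPE_INFO_FMT, buf, len(hdr))
    # status is FILE_DEDUPE_RANGE_SAME, FILE_DEDUPE_RANGE_DIFFERS,
    # or a negative errno value.
    return bytes_deduped, status
```

With `dedupe_range` the compare and the extent sharing happen together inside the kernel, so a write that lands between "find matching location" and the ioctl just yields `FILE_DEDUPE_RANGE_DIFFERS`. With `clone_range` the same race silently replaces the destination data.
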