On 12/8/16 1:36 PM, Christoph Anton Mitterer wrote:
> Hey.
>
> I just wondered whether out-of-band/"offline" dedup is safe for general
> use... https://btrfs.wiki.kernel.org/index.php/Status kinda implies so
> (it tells about unspecified performance issues), but this seems again
> already outdated (kernel 4.7)...
> :-(

SUSE supports it in SLE12 using our 3.12 and 4.4 -based kernels.  There
haven't been a lot of changes to the kernel component of it.  It's
pretty simple: check to see if the ranges are identical between two
files and then reflink between them.

> My intention was to use it with duperemove, but AFAIU, the kernel
> itself will anyway do a byte-by-byte comparison before any
> deduplication, so in principle it should be totally safe regardless of
> the stability of the userland tool, right?
> Especially I wouldn't want that "identity" is only assumed because of
> some checksum identity (or collision ;) ).

Yep.  It does a full check in the kernel for precisely that reason.
It's not even enough to do it in userspace because we don't want dedupe
to be race prone.  It's either atomically identical or it's not, and we
don't dedupe if it's not.  If it changes immediately after the ioctl
returns, that's fine -- the cloned range will be CoW'd properly.

> Also, is there anything to take note of when this is used with
> compression and snapshots?

I don't believe so.  IIRC dedupe maps the file to see if it's already
cloned, so it's safe for snapshots (or could relink extents in a
snapshot that diverged and then were restored to their original
contents).  Dedupe works with the uncompressed data, so compression
shouldn't matter here.  I haven't tested it, though.

> What when I use it with incremental send/receive... i.e. I dedupe the
> "master" and then send/receive this to another btrfs... will it work
> (that is will the copy be also deduplicated, with no longer needed
> extents properly being freed)... or at least not cause any corruptions?

It should.  IIRC send also maps the file (using a different mechanism)
and receive will clone those ranges on the other end.

> Any other things in terms of possible issues, data corruption, etc.
> that one should know when using deduplication?

There shouldn't be.  We haven't had any bug reports at SUSE.

-Jeff

-- 
Jeff Mahoney
SUSE Labs