On 2017-08-28 06:32, Adam Borowski wrote:
> On Mon, Aug 28, 2017 at 12:49:10PM +0530, shally verma wrote:
>> I'm a bit confused here: is your description based on offline dedupe,
>> or is it with inline deduplication?

> It doesn't matter _how_ you get to excessive reflinking; the resulting
> slowdown is the same.

> By the way, you can try "bees": it does nearline dedupe, which is for
> practical purposes as good as fully online and, unlike the latter, has no
> way to damage your data in case of bugs (a mistaken userland dedupe can at
> most make the kernel pointlessly read and compare data).

> I haven't tried it myself, but what it does is dedupe using the
> FILE_EXTENT_SAME ioctl asynchronously, right after a write gets put into
> the page cache, which in most cases is quick enough to avoid writeout.
I would also recommend looking at 'bees'. If you absolutely _must_ have online or near-online deduplication, it is currently your best option from a data-safety perspective.
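
For anyone curious where that safety property comes from: bees and similar tools go through the kernel's extent-same interface (BTRFS_IOC_FILE_EXTENT_SAME on btrfs, exposed generically as FIDEDUPERANGE since 4.5), and the kernel refuses to share extents unless the two ranges compare byte-for-byte equal. Here's a minimal sketch of a userland caller; the file names and the 1 MiB length are made up for illustration, and error handling is trimmed:

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>   /* FIDEDUPERANGE, struct file_dedupe_range */

int main(void)
{
    /* Hypothetical files; a.img's first MiB is believed to match b.img's. */
    int src = open("a.img", O_RDONLY);
    int dst = open("b.img", O_RDWR);
    if (src < 0 || dst < 0) { perror("open"); return 1; }

    /* One destination range; the struct ends in a flexible array. */
    struct file_dedupe_range *r =
        calloc(1, sizeof(*r) + sizeof(struct file_dedupe_range_info));
    r->src_offset = 0;
    r->src_length = 1 << 20;        /* try to dedupe the first 1 MiB */
    r->dest_count = 1;
    r->info[0].dest_fd = dst;
    r->info[0].dest_offset = 0;

    if (ioctl(src, FIDEDUPERANGE, r) < 0) { perror("ioctl"); return 1; }

    /* The kernel compares both ranges byte-for-byte and shares extents
     * only if they match, so a buggy caller can at worst waste I/O. */
    if (r->info[0].status == FILE_DEDUPE_RANGE_SAME)
        printf("deduped %llu bytes\n",
               (unsigned long long)r->info[0].bytes_deduped);
    else
        printf("not deduped, status=%d\n", r->info[0].status);

    free(r);
    return 0;
}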

That said, it's worth pointing out that in-line deduplication is not always the best answer. In fact, it's often a sub-optimal answer compared to a combination of compression, sparse files, and batch deduplication.

Compression and sparse files will usually get you about the same space savings as in-line deduplication (I've tested this with ZFS on FreeBSD using its native in-line deduplication, and with BTRFS on Linux using bees) while using much less memory and about the same amount of processor time. If you need better space savings than that, you're better off with batch deduplication: it gives you control over when the extra system resources get used, and it often produces better overall results than in-line deduplication anyway.
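
To make the first half of that concrete: on btrfs, compression is just a mount option (e.g. -o compress=zlib or compress=lzo), and sparse files only require the writer to skip, or later deallocate, runs of zeroes. A minimal sketch of the latter using fallocate(2); the file name, offset, and length are made up for illustration:

#define _GNU_SOURCE     /* for fallocate() and the FALLOC_FL_* flags */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Hypothetical image file with a zero-filled region at 16 MiB. */
    int fd = open("vm.img", O_RDWR);
    if (fd < 0) { perror("open"); return 1; }

    /* Deallocate a 4 MiB run of zeroes; KEEP_SIZE leaves the file
     * length unchanged, so readers still see zeroes but the blocks
     * are given back to the filesystem. */
    if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                  16 << 20, 4 << 20) < 0) {
        perror("fallocate");
        return 1;
    }

    close(fd);
    return 0;
}

Batch tools like duperemove then do the equivalent of the FIDEDUPERANGE sketch above across whole directory trees, on whatever schedule you pick, which is where the control over resource usage comes from.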