On Wed, 15 Nov 2017 08:11:04 +0100, waxhead <waxh...@dirtcellar.net> wrote:
> As for dedupe there is (to my knowledge) nothing fully automatic yet.
> You have to run a program to scan your filesystem but all the
> deduplication is done in the kernel.
> duperemove works apparently quite well when I tested it, but there
> may be some performance implications.

There's bees as a near-line deduplication tool: it watches for generation changes in the filesystem and walks the inodes. It only looks at extents, not at files. Deduplication itself is then delegated to the kernel, which ensures all changes are data-safe.

The process runs as a daemon and handles your changes in near-realtime (delayed by a few seconds to minutes, of course, due to the transaction commit and hashing phases). You need to dedicate part of your RAM to it; around 1 GB is usually sufficient to work well. That RAM is locked and cannot be swapped out, so you should have a sufficiently equipped system.

Works very well here (2 TB of data, 1 GB hash table, 16 GB RAM). Newly duplicated files are picked up within seconds, scanned (hitting the cache most of the time, thus not requiring physical IO), and then submitted to the kernel for deduplication.

I'd call that fully automatic: once set up, it just works, and works well. The performance impact is very low once the initial scan is done.

https://github.com/Zygo/bees

-- 
Regards,
Kai

Replies to list-only preferred.
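P.S.: In case it helps, a minimal setup sketch. This assumes the beesd wrapper script and systemd unit shipped with bees; the exact config variable names and paths may differ between versions, so check the project README before copying anything:

```shell
# Find the UUID of the btrfs filesystem to watch
# (/dev/sdX1 is a placeholder for your device)
blkid -s UUID -o value /dev/sdX1

# Create a config file the beesd wrapper will read,
# e.g. /etc/bees/<UUID>.conf, containing something like:
#
#   UUID=<UUID>                  # filesystem to deduplicate
#   DB_SIZE=$((1024*1024*1024))  # 1 GiB hash table, locked in RAM

# Start the daemon and have it come up on boot
systemctl enable --now beesd@<UUID>.service
```

The hash table size is the main knob: 1 GiB was enough for my ~2 TB of data, but a bigger or more duplicate-heavy filesystem may want more.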