Hi, I like your idea and implementation for offline deduplication a lot. I think it will save me 50% of my backup storage!
Your code walks the directory/file tree of the filesystem. Would it be possible to scan the disk extents sequentially in disk order instead?

- This would be more I/O-efficient.
- It would save you reading previously deduped/snapshotted/hardlinked files more than once.
- Maybe it would make it possible to deduplicate directories as well.

Kind regards,
Arjen Nienhuis

P.S. The NTFS implementation on Windows has 'ioctls' to read the MFT sequentially in disk order and it's *fast*. It's being used for things like defrag.
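
To illustrate the extent-order idea, here is a minimal sketch in Python. It is not taken from the dedup tool. The function name extent_scan_order and the extent layout are made up for illustration, and the sketch assumes the per-file extent map is already available (on Linux it could perhaps be obtained via the FIEMAP ioctl, which is not shown here). It only shows how a scan ordered by physical offset visits each shared extent once.

```python
from collections import defaultdict


def extent_scan_order(file_extents):
    """Return extents sorted by physical offset, each listed only once.

    file_extents maps a path to a list of (physical_offset, length)
    tuples. This is a hypothetical input format; a real tool would have
    to ask the filesystem for the extent map.
    """
    owners = defaultdict(list)
    for path, extents in file_extents.items():
        for extent in extents:
            # Files that share an extent (snapshots, already deduped
            # data) end up under the same key.
            owners[extent].append(path)
    # Sorting the keys by (offset, length) gives a sequential disk order.
    return [(offset, length, sorted(paths))
            for (offset, length), paths in sorted(owners.items())]


# Made-up layout: two backups share the extent at offset 4096.
layout = {
    "backup1/a.img": [(4096, 8192), (65536, 4096)],
    "backup2/a.img": [(4096, 8192), (20480, 4096)],
    "backup1/b.txt": [(12288, 4096)],
}

for offset, length, paths in extent_scan_order(layout):
    print(offset, length, paths)
```

With a tree walk, the shared extent at offset 4096 would be read once per file that references it. In the extent-ordered scan it appears once, together with the list of files that point to it.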