On 2016-05-25 11:29, Hugo Mills wrote:
On Wed, May 25, 2016 at 01:58:15AM -0700, H. Peter Anvin wrote:
Hi,

I'm looking at using btrfs with snapshots to implement a generational
backup capability.  However, doing it the naïve way would have the side
effect that, for a file that has been partially modified, after
snapshotting the file would be rewritten with *mostly* the same data.
How does btrfs' COW algorithm deal with that?  If necessary I might want to
write some smarter user space utilities for this.
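
To make the concern concrete: rewriting the whole file allocates fresh
extents even for the blocks that did not change, whereas overwriting in
place only the blocks that differ leaves the untouched extents shared
with the snapshot.  A minimal sketch of such a "smarter" userspace
writer in C, assuming 4 KiB blocks and illustrative file names
(newfile, target):

    /* Sketch: copy the contents of "newfile" over "target", but only
     * pwrite() the 4 KiB blocks that actually differ, so unchanged
     * ranges keep sharing extents with any snapshot of "target".
     * Block size and file names are illustrative assumptions. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    #define BLK 4096

    int main(void)
    {
        int src = open("newfile", O_RDONLY);
        int dst = open("target", O_RDWR);
        if (src < 0 || dst < 0) { perror("open"); return 1; }

        char a[BLK], b[BLK];
        off_t off = 0;
        ssize_t n;

        while ((n = pread(src, a, BLK, off)) > 0) {
            ssize_t m = pread(dst, b, BLK, off);
            /* Overwrite this block only if it is new or has changed. */
            if (m != n || memcmp(a, b, n) != 0) {
                if (pwrite(dst, a, n, off) != n) { perror("pwrite"); return 1; }
            }
            off += n;
        }
        ftruncate(dst, off);    /* drop any tail the old version had */
        close(src); close(dst);
        return 0;
    }

Tools like rsync with --inplace take roughly this approach; the point is
that writes must go to the existing file in place rather than via a
rename of a freshly written copy.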

Sounds like it might be a job for one of the dedup tools
(duperemove, bedup), or, if you're writing your own, the safe
deduplication ioctl which underlies those tools.
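
For reference, that ioctl is FIDEDUPERANGE from <linux/fs.h> (it started
life as BTRFS_IOC_FILE_EXTENT_SAME); the kernel compares the two ranges
and only shares the extents if the bytes really are identical.  A minimal
sketch of calling it, with the paths and the 1 MiB range being
assumptions for illustration:

    /* Sketch: ask the kernel to share one 1 MiB range between two
     * backup generations, if and only if the contents match.
     * Needs FIDEDUPERANGE (<linux/fs.h>, kernel 4.5+); paths are
     * illustrative only. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/ioctl.h>
    #include <linux/fs.h>

    int main(void)
    {
        int src = open("/backups/monday/data.img", O_RDONLY);
        int dst = open("/backups/tuesday/data.img", O_RDWR);
        if (src < 0 || dst < 0) { perror("open"); return 1; }

        size_t sz = sizeof(struct file_dedupe_range)
                  + sizeof(struct file_dedupe_range_info);
        struct file_dedupe_range *arg = calloc(1, sz);

        arg->src_offset = 0;
        arg->src_length = 1024 * 1024;      /* dedupe the first 1 MiB */
        arg->dest_count = 1;
        arg->info[0].dest_fd = dst;
        arg->info[0].dest_offset = 0;

        if (ioctl(src, FIDEDUPERANGE, arg) < 0) { perror("FIDEDUPERANGE"); return 1; }

        if (arg->info[0].status == FILE_DEDUPE_RANGE_SAME)
            printf("shared %llu bytes\n",
                   (unsigned long long)arg->info[0].bytes_deduped);
        else
            printf("ranges differ (or error %d), nothing shared\n",
                   arg->info[0].status);

        free(arg);
        return 0;
    }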

Hugo.

Perhaps it really makes sense to delegate de-duplication to third-party
software like BackupPC [1]. I am not sure whether btrfs can manage it more
efficiently: in order to find duplicates it would need to scan and analyse
all blocks, so at the very least it would take longer.

[1] https://sourceforge.net/projects/backuppc/

--
With best regards,
Dmitry