I have 3-4 years' worth of snapshots I use for backup purposes: read-only (R-O) 
live snapshots, two local backups, and AWS Glacier Deep Archive. I use both 
send | receive and send > file. This works well, but I get massive deltas when 
files are moved around in a GUI via Samba. Reorganize a bunch of files and the 
next snapshot is 50 or 100 GB. Perhaps mv or cp with --reflink=always would 
avoid the problem, but that workflow is just not usable enough for my family.
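For context, the reflink "move" I have in mind would look something like the 
sketch below: copy with shared extents, then unlink the source, so the data 
stays shared with existing snapshots. The paths are a throwaway demo, not my 
real layout; --reflink=always fails loudly on filesystems without reflink 
support, while --reflink=auto silently falls back to a byte copy.

```shell
# Emulate a GUI "move" as reflink copy + unlink so the data extents
# stay shared with any existing read-only snapshot.
demo=$(mktemp -d)
printf 'some file data' > "$demo/old_name"
# Use --reflink=always on Btrfs to guarantee extent sharing; =auto is
# used here only so the demo also runs on non-reflink filesystems.
cp --reflink=auto "$demo/old_name" "$demo/new_name"
rm "$demo/old_name"
```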

I'd like a solution to the massive-delta problem. If someone already has one, 
that would be great. If not, I need advice on a few ideas.

A realistic solution seems to be deduplicating the subvolume before each 
snapshot is taken, and in theory I could write a small program to do that. 
However, I don't know whether it would work. Will Btrfs let me deduplicate 
between a file on the live subvolume and a file on the R-O snapshot (really 
the same file, just at a different path)? If so, will btrfs send with -p 
produce a small delta?

Failing that, I could probably rewrite the send data stream, but that's 
suboptimal for the live volume and for any backup volumes where data has 
already been received.

Also, is it possible to access the Btrfs checksum values for file data so I 
don't have to recalculate file hashes for the whole volume myself?
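Otherwise, the fallback I'm considering is something like duperemove with a 
persistent hash database, so repeat runs only rehash changed files. The paths 
and cache location below are illustrative, and the run is guarded so the 
sketch is harmless on machines without duperemove or that mount point:

```shell
pool=/mnt/pool/live   # hypothetical path to the live subvolume
if command -v duperemove >/dev/null && [ -d "$pool" ]; then
    # -d actually submits dedupes, -r recurses; --hashfile caches
    # block hashes between runs so unchanged files are skipped.
    duperemove -dr --hashfile="$HOME/.cache/dupehash.db" "$pool"
else
    echo "skipping: duperemove or $pool not available"
fi
```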

Thanks in advance for any advice.
