[rdiff-backup-users] fuzzy match - moved/renamed

David Sun, 07 Feb 2016 05:09:55 -0800

Hi All,

Are there plans to implement fuzzy matching or a similar algorithm to match moved/renamed files?

In a scenario where large files are renamed or moved between folders, rdiff-backup treats these files as new ones. As a result it transfers large amounts of data and takes a lot of space to store the diffs, e.g. for 12 weeks or so, whilst the data is in fact the very same.


What I would like to suggest is:

In case of discovering a new file, calculate its checksum and check whether that checksum already exists in the destination folder (under any subfolder):

a) if the file does exist in the destination folder (under a different file name/path) and that file name/path does not exist anymore in the source, simply rename/move the file;

b) if the file does exist in the destination folder (under a different file name/path) and that file name/path _does_ exist in the source, we have a duplicate of the file and can either do hardlinking or create a local copy.
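To make the two cases concrete, here is a rough Python sketch of the matching step. It is not rdiff-backup code: the function names (file_sha256, index_tree, classify_new_files), the choice of SHA-256, and the idea of comparing two in-memory path-to-checksum indexes are all my own assumptions, for illustration only.

```python
import hashlib
import os


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_tree(root):
    """Map each file path (relative to root) to its checksum."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            index[os.path.relpath(full, root)] = file_sha256(full)
    return index


def classify_new_files(source_index, dest_index):
    """Decide what to do with each path that is new in the source.

    Returns a list of (action, new_path, existing_dest_path) tuples:
      "rename"    - case a): same content exists under a path gone from source
      "duplicate" - case b): same content exists and its path is still in source
      "transfer"  - content is unknown to the destination, send it in full
    """
    by_checksum = {}
    for path, checksum in dest_index.items():
        by_checksum.setdefault(checksum, []).append(path)

    actions = []
    claimed = set()  # destination paths already consumed by a rename
    for path in sorted(source_index):
        if path in dest_index:
            continue  # path already known, not a "new" file
        candidates = sorted(by_checksum.get(source_index[path], []))
        vanished = [c for c in candidates
                    if c not in source_index and c not in claimed]
        if vanished:
            claimed.add(vanished[0])
            actions.append(("rename", path, vanished[0]))
        elif candidates:
            actions.append(("duplicate", path, candidates[0]))
        else:
            actions.append(("transfer", path, None))
    return actions
```

The indexes would come from something like index_tree("/data") on the source side and a stored checksum list on the destination side, so that only checksums (not file contents) have to cross the network before the decision is made.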

The above approach would solve the problem of transmitting and storing a lot of data for the same files being moved between folders.

The deletions should be done at the very end of the process, as that way we could re-use files already stored.
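As a small illustration of that ordering, the planned operations could be sorted so that anything re-using existing data runs before any deletion. Again this is only a sketch under my own assumptions: the operation names and the order_operations helper are hypothetical and not part of rdiff-backup.

```python
# Lower number runs first; deletions are deliberately last so that
# renames and hardlinks can still find the files they want to re-use.
PRIORITY = {"rename": 0, "hardlink": 1, "transfer": 2, "delete": 3}


def order_operations(operations):
    """Return operations sorted by priority, keeping the original order
    among operations of the same kind (sorted() is stable)."""
    return sorted(operations, key=lambda op: PRIORITY[op[0]])
```

For example, a plan containing a delete of "tmp/old.log" followed by a rename of "old/a.bin" to "new/a.bin" would be reordered so the rename happens first.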

The diff between backups would then store only differences again and not full copies of the files.

Does this sound like something which could be implemented in the near future?

This might not be the best place to post this question, but if there is a better backup solution handling situations like this, please let me know too. I'm looking to keep all the goodies of rdiff-backup, hence rsync with the fuzzy option is not the way to go for me.


Thanks,
David

_______________________________________________
rdiff-backup-users mailing list at rdiff-backup-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/rdiff-backup-users
Wiki URL: http://rdiff-backup.solutionsfirst.com.au/index.php/RdiffBackupWiki
