On Wed, Oct 26, 2005 at 03:02:51PM +0200, Tomasz Chmielewski wrote:
> I use rsync for backing up user data, profiles, important network shares
> etc. (from several locations over WAN).
>
> Overall it works flawlessly, as it transfers only changes, but sometimes
> there are some serious hiccups.
>
> Suppose this scenario, suppose it's 1 GB of files:
>
> user shares:
>
> /home/joe/data/file1
>               /file2
>               /...
>               /file1000
>
> Now the user _moves_ that data to some other folder:
>
> /home/joe/WAN_goes_crazy/file1
>                         /file2
>                         /...
>                         /file1000
>
> ...and we start a backup process.
>
> rsync will first transfer data from "/home/joe/WAN_goes_crazy/file...",
> and then deletes "/home/joe/data/data...".
>
> Basically, this is how rsync works, but in the end, we transfer 1 GB of
> files over WAN that we already have locally - the only thing that
> changed was the folder where that data is.
>
> Is there some workaround for this (some intelligent script etc.)?

ISTM it would be quite useful to make rsync "rename-aware". Caveat: I
haven't hacked on rsync for quite a while, so my understanding may be
wrong or outdated. But I think this could be implemented thusly:

You'd want to make this optional, say --detect-renames, because it does
incur an extra processing cost. That option should imply at least
--checksum, and --delete-after if --delete is used at all.

Then you just need the generator to be slightly more clever. For each
file on the sender which is *missing* from the receiver, it needs to
search the checksums of all of the receiver's existing files for a
match. If it finds one, it can simply take the matched file and either
copy or move it to the new filename. Then that file just gets skipped
in the transfer. I don't think this would require any changes to
sender, receiver or protocol.

What I described would only handle rename-without-modification, but
its cost is not very high. I think it's O(N*M), where N = # of files on
sender that are missing on receiver and M = # of files on receiver.
That's the cost over and above whatever --checksum costs.

I don't see how rename-with-modification could be handled efficiently,
though. Better not to go there.

If nobody says I'm way off base here, I might be inspired to try to
implement this. Unless someone else has the time and inclination...

-chris
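
To make the matching step above concrete, here is a minimal Python
sketch of the idea. It is not rsync code and does not touch the real
generator: the function names (tree_checksums, detect_renames,
apply_renames, demo) are made up for illustration, SHA-256 stands in
for whatever checksum --checksum would use, and both trees are assumed
to be local directories. It copies rather than moves, on the
assumption that --delete-after would clean up the old name later.

```python
import hashlib
import os
import shutil
import tempfile


def file_checksum(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_checksums(root):
    """Map each file's path (relative to root) to its checksum."""
    sums = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            sums[os.path.relpath(full, root)] = file_checksum(full)
    return sums


def detect_renames(sender_sums, receiver_sums):
    """Return {new_path: existing_receiver_path} for unmodified renames.

    For each sender path missing on the receiver, scan every receiver
    checksum for a match. This is the O(N*M) search from the post.
    """
    renames = {}
    for new_path, checksum in sender_sums.items():
        if new_path in receiver_sums:
            continue  # not missing, so the normal transfer logic applies
        for old_path, old_sum in receiver_sums.items():
            if old_sum == checksum:
                renames[new_path] = old_path
                break
    return renames


def apply_renames(receiver_root, renames):
    """Copy each matched file to its new name inside receiver_root."""
    for new_path, old_path in renames.items():
        target = os.path.join(receiver_root, new_path)
        os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
        shutil.copy2(os.path.join(receiver_root, old_path), target)


def demo():
    """Recreate the moved-directory scenario from the thread in a temp dir."""
    with tempfile.TemporaryDirectory() as tmp:
        sender = os.path.join(tmp, "sender")
        receiver = os.path.join(tmp, "receiver")
        os.makedirs(os.path.join(sender, "WAN_goes_crazy"))
        os.makedirs(os.path.join(receiver, "data"))
        for root, sub in ((sender, "WAN_goes_crazy"), (receiver, "data")):
            with open(os.path.join(root, sub, "file1"), "wb") as handle:
                handle.write(b"same contents")
        renames = detect_renames(tree_checksums(sender),
                                 tree_checksums(receiver))
        apply_renames(receiver, renames)
        copied = os.path.exists(
            os.path.join(receiver, "WAN_goes_crazy", "file1"))
        return renames, copied
```

Calling demo() builds a receiver that still has data/file1 and a
sender where the same file now lives under WAN_goes_crazy/, then
resolves the rename locally instead of transferring the contents.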
