On Thu, Mar 05, 2009 at 03:27:50PM -0800, Peter Salameh wrote:
> My proposal is to first send a checksum of the file list for each
> directory. If is found to be identical to the same checksum on the
> remote side then the list need not be sent for that directory!

My rZync source does something like that for directories: it treats a
directory-list transfer like a file transfer. That means that the
receiving side sends a set of checksums to the sending side telling it
what its version of the directory looks like, and then the sender sends
a normal set of delta data that lets the receiver reconstruct the
sender's version of the directory (which it compares to its own). One
potential drawback is having to deal with false checksum matches (which
should be rare, but would require the dir data to be resent). I hadn't
optimized it for block size or (possibly) data order to make it more
efficient, but it is an interesting idea for speeding up a slow
connection.

I'm not sure if it would really help out that much for a more modern,
faster connection, because rsync sends the file-list data at the same
time as it is being scanned, and sometimes the scan is the bottleneck.

The best way to optimize sending of really large numbers of files that
are mostly the same is to start to leverage a file-change notification
system, such as inotify. Using that, it is possible to distill a list
of what files/directories need to be copied, and to just copy what is
needed.

..wayne..
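
For illustration only, here is a rough Python sketch of the
per-directory checksum idea from the quoted proposal. It is not how
rsync or rZync work. The function names (dir_list_digest, dirs_to_send),
the choice of SHA-256, and the fields hashed (name, size, mtime, type)
are all assumptions made for the example. Subdirectories contribute
only their name and type, on the assumption that each one is covered by
its own digest.

```python
import hashlib
import os


def dir_list_digest(path):
    """Hex digest over the sorted immediate entries of `path` (hypothetical helper)."""
    h = hashlib.sha256()
    with os.scandir(path) as it:
        for entry in sorted(it, key=lambda e: e.name):
            if entry.is_dir(follow_symlinks=False):
                # Subdirectories get their own digest, so hash name and type only.
                fields = [os.fsencode(entry.name), b"d"]
            else:
                st = entry.stat(follow_symlinks=False)
                fields = [
                    os.fsencode(entry.name),
                    b"f",
                    str(st.st_size).encode(),
                    str(st.st_mtime_ns).encode(),
                ]
            h.update(b"\0".join(fields) + b"\n")
    return h.hexdigest()


def dirs_to_send(root, remote_digests):
    """Relative directories whose digest differs from the remote side's.

    `remote_digests` maps a relative directory path to the digest the
    other side computed; how it gets exchanged is out of scope here.
    """
    changed = []
    for dirpath, dirnames, _ in os.walk(root):
        dirnames.sort()  # deterministic traversal order
        rel = os.path.relpath(dirpath, root)
        if remote_digests.get(rel) != dir_list_digest(dirpath):
            changed.append(rel)
    return changed
```

Only the directories returned by dirs_to_send() would need their full
file list sent. Both sides still have to scan every directory to
compute the digests, so this saves bandwidth, not scan time.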