Any way to predict the amount of data to be copied when re-copying a file?

2009-11-29 Thread Andrew Gideon
I do backups using rsync, and - every so often - a file takes far longer than it normally does. These are large data files which typically change only a little over time. I'm guessing that these large transfers are caused by occasional changes that break (i.e., yield poor performance) in the

Re: Any way to predict the amount of data to be copied when re-copying a file?

2009-11-29 Thread Matt McCutchen
On Sun, 2009-11-29 at 16:07 +, Andrew Gideon wrote: Is there some way to run rsync in some type of dry run mode but where an actual determination of what pages should be copied is performed? The current --dry-run doesn't go down to this level of detail as far as I can see. It
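One way to measure the per-file delta without touching the destination is rsync's batch mode. The sketch below is illustrative (the file paths are made up): `--only-write-batch` records the transfer to a batch file instead of applying it, and `--no-whole-file` forces the delta algorithm even for local paths (rsync defaults to whole-file copies when both sides are local). The batch file's size then approximates the literal data a real run would send.

```shell
# Sketch: record the delta for one file without updating the destination.
# --no-whole-file forces delta transfer for local paths; the batch file's
# size approximates the data that would actually be copied.
rsync --no-whole-file --only-write-batch=/tmp/delta.batch \
      /data/current/bigfile.dat /backup/mirror/bigfile.dat
ls -l /tmp/delta.batch          # rough size of the would-be transfer
rm -f /tmp/delta.batch /tmp/delta.batch.sh
```

For a quicker, coarser view, `--stats` on a real (non-dry) run reports "Literal data" versus "Matched data", which shows after the fact how well the delta algorithm performed on that file.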

Re: Any way to predict the amount of data to be copied when re-copying a file?

2009-11-29 Thread Eliot Moss
I can't answer your question directly, but I can say that it is not strictly the number of bytes that differ that matters, but also how the differences are distributed through the file. Unless you explicitly set the block size, rsync uses a block size that is roughly the square root of the file's size, thus
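This point can be made concrete with a back-of-the-envelope calculation. The sketch below assumes a block length near the square root of the file size, clamped to a floor of 700 bytes and a cap of 128 KiB (the exact bounds and rounding vary across rsync versions, so treat these constants as assumptions). Since any change inside a block forces that whole block to be resent as literal data, N widely scattered one-byte edits cost roughly N times the block length:

```shell
# Sketch: approximate rsync's default block length (~ sqrt(file size),
# clamped; exact floor/cap/rounding differ across rsync versions).
filesize=$((1 << 30))        # e.g. a 1 GiB data file
blen=$(awk -v s="$filesize" \
  'BEGIN { b = int(sqrt(s)); if (b < 700) b = 700; if (b > 131072) b = 131072; print b }')
echo "block length: $blen bytes"       # 32768 bytes for a 1 GiB file
edits=1000                             # widely scattered small edits
echo "worst-case literal data: $((edits * blen)) bytes"
```

So 1000 scattered one-byte edits in a 1 GiB file can cost on the order of 32 MB of literal data, while the same 1000 bytes changed contiguously would dirty only a handful of blocks.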