> rsync *is* minimizing the number of blocks sent, that's why it takes
> longer -- it needs to figure out which blocks are the ones that
> changed. But in one sentence you're talking about time, and the next
> you're talking about minimizing blocks sent (bandwidth use). You need
> to figure out which one you want.

Both. The ultimate goal is to have the job complete quickly, but that can only happen if the number of blocks sent is minimized.

Presently, rsync reads the whole local file and also reads the whole remote file in order to diff them and send only the changed blocks. "Read the entire remote file" is the fault here. You could write the entire remote file faster, and with less traffic, than you can read it and send the changes.

If rsync stored checksums of a file's internal blocks during the initial send, then on subsequent sends it would only need to read the local file and recalculate checksums to see which blocks need to be sent. That would happen entirely at local disk speed, with little or no network traffic, and certainly with no need to read the entire remote file.

This still leaves room for improvement; it cannot compete with ZFS incremental sends. But the point is that you're wrong if you think "minimizing the time" and "minimizing the blocks sent" are mutually exclusive.

_______________________________________________
Tech mailing list
[email protected]
http://lopsa.org/cgi-bin/mailman/listinfo/tech
This list provided by the League of Professional System Administrators
http://lopsa.org/
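To make the stored-checksum idea concrete, here is a rough sketch in Python. It is not how rsync works today and not a proposed patch; the names (BLOCK_SIZE, block_checksums, changed_blocks) and the 4096-byte block size are made up for illustration. It assumes fixed block boundaries, and it assumes the remote copy has not been touched since the last send, so the checksums saved at that time still describe it.

```python
import hashlib
from typing import List

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for the example


def block_checksums(data: bytes, block_size: int = BLOCK_SIZE) -> List[str]:
    """Return one SHA-256 hex digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(data), block_size)
    ]


def changed_blocks(local: bytes, stored: List[str],
                   block_size: int = BLOCK_SIZE) -> List[int]:
    """Return indexes of local blocks that differ from the stored checksums.

    stored is the checksum list saved during the previous send. Blocks past
    the end of stored are new and therefore always counted as changed.
    """
    current = block_checksums(local, block_size)
    return [
        index
        for index, digest in enumerate(current)
        if index >= len(stored) or digest != stored[index]
    ]


# Previous send: three full blocks, checksums saved locally afterwards.
old = b"a" * BLOCK_SIZE + b"b" * BLOCK_SIZE + b"c" * BLOCK_SIZE
saved = block_checksums(old)

# Since then the second block was rewritten and a short block was appended.
new = old[:BLOCK_SIZE] + b"X" * BLOCK_SIZE + old[2 * BLOCK_SIZE:] + b"d" * 10

print(changed_blocks(new, saved))  # [1, 3]
```

Only blocks 1 and 3 would need to go over the wire, and nothing on the remote side was read to work that out. The obvious weaknesses: an insertion near the start of the file shifts every later block, so everything after it looks changed (rsync's rolling checksum copes with that, this sketch does not), a file that got shorter needs its new length communicated separately, and the whole scheme falls apart if anything else modifies the remote file between sends.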
