> rsync *is* minimizing the number of blocks sent, that's why it takes
> longer -- it needs to figure out which blocks are the ones that
> changed.  But in one sentence you're talking about time, and the next
> you're talking about minimizing blocks sent (bandwidth use).  You need
> to figure out which one you want.

Both.
The ultimate goal is to have the job completed quickly.  But that can only
be done if the number of blocks sent is minimized.  Presently, rsync reads
the whole local file, and also reads the whole remote file to diff them, and
sends only the changed blocks.  "Read the entire remote file" is the fault
here.  You could write the entire remote file faster, and with less traffic,
than you can read it and send the changes.

If rsync, during the initial send, stored checksums of the internal blocks
of a file, then on subsequent sends, rsync would only need to read the local
file and recalculate checksums to see which blocks needed to be sent.  This
would occur entirely at local disk speeds, with little or no network
traffic, and certainly no need to read the entire remote file.
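
A minimal sketch of that idea in Python, to make it concrete.  This is not
how rsync works today; the function names and the block size are made up
for illustration.  It assumes fixed-offset blocks (so an insertion in the
middle of a file shifts every later block, unlike rsync's rolling
checksum), and it assumes nothing else modifies the remote copy between
runs.

```python
import hashlib

BLOCK_SIZE = 128 * 1024  # hypothetical block size, not an rsync default


def block_checksums(path, block_size=BLOCK_SIZE):
    """Return one SHA-256 hex digest per fixed-size block of the file."""
    sums = []
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            sums.append(hashlib.sha256(block).hexdigest())
    return sums


def changed_blocks(stored, current):
    """Return indices of blocks in `current` that differ from `stored`.

    Blocks past the end of `stored` count as changed (the file grew).
    A file that shrank yields no extra indices, so the sender would
    also have to pass along the new length.
    """
    return [
        i for i, digest in enumerate(current)
        if i >= len(stored) or stored[i] != digest
    ]
```

On the first send you would save the output of block_checksums() next to
the local file.  On each later send you recompute it, pass both lists to
changed_blocks(), and ship only those blocks.  All of the reading happens
on the local disk.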

This still leaves room for improvement - it cannot compete with ZFS
incremental sends.  But the point is, you're wrong if you think "minimizing
the time" and "minimizing the blocks sent" are mutually exclusive.


_______________________________________________
Tech mailing list
[email protected]
http://lopsa.org/cgi-bin/mailman/listinfo/tech
This list provided by the League of Professional System Administrators
 http://lopsa.org/
