https://bugzilla.samba.org/show_bug.cgi?id=5482


[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[EMAIL PROTECTED]




------- Comment #2 from [EMAIL PROTECTED]  2008-05-24 17:46 CST -------
Matt,

Although it's true that the delta-transfer algorithm detects small changed
regions, and block boundaries are not an issue, even the delta-transfer
algorithm is slow for _very_ large files with small changes.

Think about a 4GB media file with 10k of changes near the start.  In this
case, the delta-transfer algorithm transmits a lot of block checksum data just
to do the comparisons - enough to be substantially affected by network
bandwidth.  Alternatively, if the blocks are made large to reduce the number of
checksums, transmitting even a single block of data is significant.
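
To put rough numbers on that trade-off, here is a back-of-envelope sketch.  The
20 bytes per block (a 4-byte rolling checksum plus a 16-byte strong checksum)
and the two block sizes are assumptions picked for illustration; the figures
rsync actually uses depend on its version, its options and the file size.

```python
# Back-of-envelope estimate of checksum traffic versus block size.
# ASSUMPTION: 20 bytes of checksum per block (4-byte rolling + 16-byte strong).
# The block sizes below are illustrative, not what rsync would pick itself.

FILE_SIZE = 4 * 2**30          # the 4GB media file from the example
CHECKSUM_BYTES_PER_BLOCK = 20  # assumed figure, see above


def checksum_bytes(file_size: int, block_size: int,
                   per_block: int = CHECKSUM_BYTES_PER_BLOCK) -> int:
    """Bytes of checksum data sent to describe every block of the file."""
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * per_block


for block_size in (700, 64 * 1024):
    overhead = checksum_bytes(FILE_SIZE, block_size)
    print(f"block size {block_size:>6} bytes: "
          f"{overhead:>11,} bytes of checksums, "
          f"at least {block_size:,} bytes resent per changed block")
```

Under these assumptions the small blocks cost over a hundred megabytes of
checksums to find 10k of changes, while the large blocks cut the checksums to
about a megabyte but make every changed block expensive to resend.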

I doubt that the scenario described in this bug report is all that common.  How
often do you change the header of a huge video file without transcoding the
contents as well?  However, if it is common, can the delta-transfer algorithm
be tuned better for this case by using smaller blocks near the start of the
file?

(More generally, a hierarchical delta algorithm (checksums of blocks of
checksums of blocks, in a tree structure, though 2 or 3 levels may be plenty)
would solve this in a general way for a number of cases involving small
changes in very large files.  If you concatenate all the files and metadata to
make a single structured data stream to be delta-transferred, it may also be a
good optimisation for data sets consisting of large numbers of files with only
a few changed.  The logical extreme is to transfer a single checksum, which is
enough to compare the whole data set in just a few bytes, followed by a
top-down breadth-first traversal of the checksum tree.  This is what I am
attempting in a project of mine.)
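
As a minimal sketch of the tree idea (not rsync's algorithm, and not the
project mentioned above), the following builds a checksum tree over fixed-size
blocks and walks two trees top-down, breadth first, descending only where the
checksums differ.  The names build_tree and changed_blocks, the 4096-byte
block size and the fanout of 16 are all made up for illustration.  It assumes
both sides hold data of the same length changed in place; it has no rolling
checksum, so it would not cope with inserted or deleted bytes.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(data: bytes, block_size: int = 4096, fanout: int = 16):
    """Return a list of levels: levels[0] holds leaf hashes, levels[-1] the root."""
    leaves = [_h(data[i:i + block_size])
              for i in range(0, len(data), block_size)] or [_h(b"")]
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        # Each parent is the hash of up to `fanout` concatenated child hashes.
        levels.append([_h(b"".join(prev[i:i + fanout]))
                       for i in range(0, len(prev), fanout)])
    return levels


def changed_blocks(local, remote, fanout: int = 16):
    """Compare two trees top-down; return (changed leaf indices, hashes compared)."""
    if len(local) != len(remote) or any(
            len(a) != len(b) for a, b in zip(local, remote)):
        raise ValueError("this sketch assumes trees of identical shape")
    frontier = [0]  # start at the single root hash
    compared = 0
    for level in range(len(local) - 1, -1, -1):
        differing = []
        for idx in frontier:
            compared += 1
            if local[level][idx] != remote[level][idx]:
                differing.append(idx)
        if level == 0:
            return differing, compared
        # Descend only into the children of nodes whose hashes differ.
        below = len(local[level - 1])
        frontier = [child for idx in differing
                    for child in range(idx * fanout,
                                       min((idx + 1) * fanout, below))]


old = bytes(1 << 20)   # 1 MiB of zeros: 256 blocks of 4096 bytes
new = bytearray(old)
new[10000] = 1         # a one-byte change inside block 2
changed, compared = changed_blocks(build_tree(old), build_tree(bytes(new)))
print(changed, compared)   # prints: [2] 33
```

Here 33 hashes are compared instead of all 256 leaf hashes, and two identical
data sets are confirmed equal after comparing the root hash alone.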

