https://bugzilla.samba.org/show_bug.cgi?id=5482
[EMAIL PROTECTED] changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |[EMAIL PROTECTED]
------- Comment #2 from [EMAIL PROTECTED] 2008-05-24 17:46 CST -------
Matt,
Although it's true that the delta-transfer algorithm detects the small changed
regions, and block boundaries are not an issue, even the delta-transfer
algorithm is slow for _very_ large files with small changes.
Think about a 4GB media file with 10k of changes near the start. In this case,
the delta-transfer algorithm transmits a lot of block checksum data just to do
the comparisons - enough to be substantially affected by network bandwidth.
Alternatively, if the blocks are made large to reduce the number of checksums,
transmitting even a single block of data becomes significant.
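
To put rough numbers on that trade-off, here is a small back-of-the-envelope
sketch. It is not rsync's actual accounting: the function name, the 20 bytes
of checksum per block, and the candidate block sizes are all assumptions made
for illustration.

```python
def delta_overhead(file_size, block_size, checksum_bytes=20, changed_blocks=1):
    """Estimate traffic for a flat (single-level) block checksum scheme.

    Returns (checksum_traffic, literal_traffic) in bytes.
    checksum_bytes=20 is an assumed per-block cost, not a value taken
    from rsync itself.
    """
    # Number of blocks, rounding up so a short final block is counted.
    n_blocks = (file_size + block_size - 1) // block_size
    # One checksum record is sent for every block of the file.
    checksum_traffic = n_blocks * checksum_bytes
    # Each changed block has to be resent in full.
    literal_traffic = min(changed_blocks * block_size, file_size)
    return checksum_traffic, literal_traffic


if __name__ == "__main__":
    size = 4 * 1024**3  # a 4 GiB file with one changed block
    for bs in (1024, 64 * 1024, 1024**2, 64 * 1024**2):
        checksums, literal = delta_overhead(size, bs)
        print(f"block={bs:>10}  checksums={checksums:>12}  literal={literal:>10}")
```

Small blocks push the cost into checksum traffic, large blocks push it into
resending one big block; no single block size makes both terms small.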
I doubt that the scenario described in this bug report is all that common. How
often do you change the header of a huge video file without transcoding the
contents as well? However, if it is common, could the delta-transfer algorithm
be tuned better for it by using smaller blocks near the start of the file?
(More generally, a hierarchical delta algorithm (checksums of blocks of
checksums of blocks, arranged as a tree, though 2 or 3 levels may be plenty)
would solve this in a general way for many cases involving small changes in
very large files. If you concatenate all the files and metadata into a single
structured data stream to be delta-transferred, it may also be a good
optimisation for data sets consisting of large numbers of files with only a
few changed. At the logical extreme, a single checksum of just a few bytes is
enough to compare the whole data set, followed by a top-down, breadth-first
traversal of the checksum tree wherever it differs. This is what I am
attempting in a project of mine.)
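
A minimal sketch of the tree idea, under several assumptions that are mine
and not part of rsync: fixed-size blocks, SHA-256 as the checksum, a fanout
of 16, and both sides holding data of the same length (so it ignores the
shifted-data case that rsync's rolling checksum handles). The names
build_tree and changed_blocks are hypothetical.

```python
import hashlib


def build_tree(data, block_size=4096, fanout=16):
    """Return a list of levels: levels[0] holds leaf hashes, levels[-1] the root."""
    leaves = [hashlib.sha256(data[i:i + block_size]).digest()
              for i in range(0, max(len(data), 1), block_size)]
    levels = [leaves]
    # Hash groups of `fanout` child hashes until a single root remains.
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([hashlib.sha256(b"".join(prev[i:i + fanout])).digest()
                       for i in range(0, len(prev), fanout)])
    return levels


def changed_blocks(local, remote, fanout=16):
    """Walk two trees top-down, breadth-first.

    Returns (indices of differing leaf blocks, number of hashes compared).
    Only children of differing nodes are examined at the next level.
    """
    if len(local) != len(remote) or any(len(a) != len(b)
                                        for a, b in zip(local, remote)):
        raise ValueError("this sketch assumes both trees have the same shape")
    frontier = [0]  # start at the root
    compared = 0
    for depth in range(len(local) - 1, -1, -1):
        differing = []
        for idx in frontier:
            compared += 1
            if local[depth][idx] != remote[depth][idx]:
                differing.append(idx)
        if depth == 0:
            return differing, compared
        n_children = len(local[depth - 1])
        frontier = [child for idx in differing
                    for child in range(idx * fanout,
                                       min((idx + 1) * fanout, n_children))]


if __name__ == "__main__":
    old = bytes(1 << 20)          # 1 MiB of zeros
    new = bytearray(old)
    new[5000] = 1                 # one changed byte near the start
    print(changed_blocks(build_tree(old), build_tree(bytes(new))))
```

In this toy case the walk compares 33 hashes to locate the one changed block,
where a flat scheme would exchange all 256 leaf hashes, and identical inputs
are confirmed with a single root comparison.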