Lars Karlslund wrote:

Maybe I didn't express myself thoroughly enough :-)

Or me.

Yes, a block is a minimum storage unit, which is considered for transfer.

In size, yes. Not in position.

But it's a fact that the rsync algorithm as it is now checks to see if a block should have moved. And in that case, the 700 bytes default is very much worth considering.

No, because the rsync algorithm can detect this 700-byte block even when it has moved by a single byte.


If no blocks at all move in a 700 byte increment (i.e. 700 bytes gets inserted somewhere - optimally at a 700-byte boundary in the file), then all you get is larger memory and CPU usage and

all the bandwidth reduction you need.

The point I think you are missing is that the 700-byte blocks need not sit on 700-byte boundaries. A block can be matched at any byte offset.
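To make this concrete, here is a minimal Python sketch of the rolling weak checksum as described in the rsync technical report. It is not rsync's actual code. The function names (`weak_checksum`, `roll`, `find_block_offsets`) and the random test data are made up for illustration. The window is slid one byte at a time, and each slide updates the checksum in constant time, which is what makes testing every byte offset affordable.

```python
import random

M = 1 << 16  # modulus of the weak checksum in the rsync technical report


def weak_checksum(block):
    """Return the (a, b) pair of the weak checksum, computed from scratch."""
    n = len(block)
    a = sum(block) % M
    b = sum((n - i) * x for i, x in enumerate(block)) % M
    return a, b


def roll(a, b, out_byte, in_byte, n):
    """Slide an n-byte window forward by one byte in constant time."""
    a = (a - out_byte + in_byte) % M
    b = (b - n * out_byte + a) % M
    return a, b


def find_block_offsets(data, block):
    """Return every byte offset in data at which block occurs."""
    n = len(block)
    if n == 0 or len(data) < n:
        return []
    target = weak_checksum(block)
    a, b = weak_checksum(data[:n])
    hits = []
    for pos in range(len(data) - n + 1):
        # Cheap checksum test first, then a full comparison to rule out collisions.
        if (a, b) == target and data[pos:pos + n] == block:
            hits.append(pos)
        if pos + n < len(data):
            a, b = roll(a, b, data[pos], data[pos + n], n)
    return hits


# Hypothetical test data: 2800 pseudo-random bytes, i.e. four 700-byte blocks.
rng = random.Random(0)
old = bytes(rng.getrandbits(8) for _ in range(2800))
block = old[700:1400]   # the second 700-byte block of the old file
new = b"X" + old        # one byte inserted at the very start

print(find_block_offsets(new, block))  # [701]
```

The block that started at offset 700 in the old data is found at offset 701 in the new data, which is not a multiple of 700.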

It may very well be that, for your specific application, increasing the block size considerably will be better. If your files are huge, and the changed areas are very small compared to the file size, a larger block size can yield a significant improvement. However, this is due to the trade-offs I talked about in my previous email. It has nothing to do with 700 bytes being unrealistic or incorrect.

True, and in that scenario it makes no difference what block size you choose: if the one byte is inserted at the beginning, the entire file will be transferred.

No, just the first block.
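A toy comparison in Python of the two models being argued here. This is a sketch under simplifying assumptions, not rsync itself: blocks are compared by content directly instead of by weak and strong checksums, and the function names and test data are hypothetical. It counts how many bytes would have to be sent as literal data after a one-byte insertion at the start of the file.

```python
import random

BLOCK = 700


def literal_bytes_aligned(old, new, block=BLOCK):
    """Naive model: a block only counts as unchanged at the same offset."""
    literal = 0
    for i in range(0, len(new), block):
        chunk = new[i:i + block]
        if chunk != old[i:i + block]:
            literal += len(chunk)
    return literal


def literal_bytes_sliding(old, new, block=BLOCK):
    """rsync-like model: blocks of old may be found at any byte offset in new."""
    known = {old[i:i + block] for i in range(0, len(old) - block + 1, block)}
    literal = 0
    pos = 0
    while pos < len(new):
        window = new[pos:pos + block]
        if len(window) == block and window in known:
            pos += block   # matched: only a block reference is sent
        else:
            literal += 1   # no match here: send this byte and slide by one
            pos += 1
    return literal


# Hypothetical test data: ten 700-byte blocks of pseudo-random bytes.
rng = random.Random(1)
old = bytes(rng.getrandbits(8) for _ in range(10 * BLOCK))
new = b"X" + old  # one byte inserted at the beginning

print("aligned-only model:", literal_bytes_aligned(old, new))
print("sliding model:     ", literal_bytes_sliding(old, new))
```

In the aligned-only model every block looks changed, so the whole file counts as literal data. In the sliding model only the data before the first match is literal, which here is the single inserted byte, well within the first block.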

Rsync is not diff, and does not "patch" the file dynamically if the file has random insertions/removals.

Well, in a way, it does. It's really quite ingenious. As I have no relation to its implementation, I can say that wholeheartedly. I encourage you to read about the algorithm on the site.


You make no comment on my calculations on the block-moving algorithm in my real-world scenario, which was the basis for this discussion anyway.

I'm sorry. You just stated as facts things I knew to be incorrect, so I allowed myself to skip your calculations. I don't think there is any argument that you are getting sub-optimal results from rsync. The question is "why".


How much memory is on the machines? Try raising the block size to 1MB. You will then have only about 524 thousand blocks, which may prove more manageable.
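The arithmetic behind that figure, as a small Python sketch. The total size is not restated in this message. The 512 GiB used below is an assumption inferred from "524 thousand blocks at 1MB", taking 1MB to mean 2^20 bytes. `block_count` is a hypothetical helper, not part of rsync.

```python
def block_count(file_size, block_size):
    """Number of blocks needed to cover file_size bytes (last block may be short)."""
    return -(-file_size // block_size)  # integer ceiling division


# ASSUMPTION: total data size inferred from the block count quoted above.
ASSUMED_SIZE = 512 * 2**30

for size in (700, 2**20):
    print(f"block size {size:>9,} bytes -> {block_count(ASSUMED_SIZE, size):>12,} blocks")
```

Under that assumption, the 700-byte default gives roughly 785 million blocks to track, against about 524 thousand at 1MB.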

Best regards,

Shachar

--
Shachar Shemesh
Lingnu Open Source Consulting ltd.
Have you backed up today's work? http://www.lingnu.com/backup.html

--
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
