Re: reducing file list bytes transferred
Thanks for that. Doesn't the delta-transfer algorithm compare the files on sender and receiver? For the file list, we would only need to compare the new file list with the last one on the sender (it would be a simple matter to check, with a checksum, that the receiver's file list is still valid).

If the same rsync command is run over and over on a huge number of files, then an option like --save_list=path/filename could be used to specify where to store the list (on both ends). The delta-transfer algorithm (or some difference code specific to file lists) could be used on the sender to find the differences between the new list and the saved list.

After reading the thread you referenced, I realized that the rsync command should be identical (no change in options) for this to work without side effects, since I infer that the file list generated depends on the options. One possibility is to store the rsync command used to generate the list along with the saved list, to check that the list is still valid.

I'm sure there are many rsync users out there who would benefit from this. Still, rsync is greatly appreciated.

Peter

> On 10/1/07, Peter Salameh [EMAIL PROTECTED] wrote:
> > Has the rsync team considered an rsync option which would remember the
> > last file list on both ends, and only send changes to the list?
>
> Perhaps a more natural approach is to use the delta-transfer algorithm to
> send the file list. Jamie Lokier suggested this approach here:
>
> http://lists.samba.org/archive/rsync/2007-August/018345.html
>
> Matt

--
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
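A minimal Python sketch of the saved-list idea in this message, not anything rsync implements. The sender diffs the freshly scanned list against its saved copy, and a checksum plus the stored command line decides whether the receiver's saved copy can still be trusted. The entry layout (path, size, mtime) and the names list_checksum, diff_lists, apply_diff and state_is_reusable are hypothetical, chosen only for illustration; --save_list is just the option proposed above.

```python
import hashlib


def list_checksum(entries):
    """Digest of a file list; equal digests mean both ends hold the same list."""
    h = hashlib.sha256()
    for path, size, mtime in sorted(entries):
        h.update(f"{path}\0{size}\0{mtime}\n".encode())
    return h.hexdigest()


def diff_lists(saved, new):
    """Sender side: entries to add or update, and paths to remove."""
    saved_by_path = {e[0]: e for e in saved}
    new_by_path = {e[0]: e for e in new}
    changed = sorted(e for p, e in new_by_path.items() if saved_by_path.get(p) != e)
    removed = sorted(p for p in saved_by_path if p not in new_by_path)
    return changed, removed


def apply_diff(saved, changed, removed):
    """Receiver side: rebuild the new list from the saved list plus the diff."""
    by_path = {e[0]: e for e in saved}
    for path in removed:
        by_path.pop(path, None)
    for entry in changed:
        by_path[entry[0]] = entry
    return sorted(by_path.values())


def state_is_reusable(saved_cmd, current_cmd, sender_sum, receiver_sum):
    """The saved list is only valid for the identical command and matching lists."""
    return saved_cmd == current_cmd and sender_sum == receiver_sum
```

If state_is_reusable returns False, the only safe fallback in this sketch is to send the complete file list as rsync does today.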
Re: reducing file list bytes transferred
I agree that rsync being stateless is a good thing. I have found it very reliable and predictable for years. I certainly wouldn't change to unison (which apparently is currently not maintained) just to speed up the file list transfer.

Delta-transfer of the file list seems to make good sense for rsync, especially for one-way backup involving lots of files. Is there any chance that this will happen, or any way to gauge interest in this?

Peter

> Unison ( http://www.cis.upenn.edu/~bcpierce/unison/ ) is a stateful
> two-way synchronizer that does essentially this by default; you can use
> Unison even for your one-way copy to get the performance benefit. Rsync is
> meant to be stateless, so if it were enhanced to reduce the amount of file
> list transferred, I think delta-transferring the file list would be more
> in keeping with its mission than saving a last file list.
>
> Matt
Re: reducing file list bytes transferred
On Mon, Oct 01, 2007 at 08:21:25PM -0500, Stephen Zemlicka wrote:
> I thought it was supposed to do it more efficiently

Yes, protocol 30 has several improvements that reduce the number of bytes sent over the wire, in addition to its incremental recursion mode. The latter is slightly less byte-efficient than a non-incremental protocol 30 recursion, but is more disk I/O efficient the larger your file list gets (not to mention more memory efficient).

For example, I did an rsync on a hierarchy of 30,103 files that didn't need any updates. This resulted in the following transfer counts:

Protocol 29:                      sent 533,947 bytes, received 20 bytes
Protocol 30, --inc-recursive:     sent 503,658 bytes, received 676 bytes
Protocol 30, --no-inc-recursive:  sent 501,189 bytes, received 11 bytes

You'll note that the inc-recursive mode has a little more back chatter due to the generator letting the sender know about its progress through the file list. Since that direction of flow is more lightly loaded than the flow from sender to receiver, the small amount of extra data should not adversely affect rsync's speed.

So, the inc-recursive transfer may be slightly less efficient over the wire than protocol 30 no-inc-recursive, but because the sending and receiving sides tend to be doing more disk I/O at the same time, the total transfer time can be less (depending on how much directory data is in the disk cache, how large the transfer is, and how many files need to be transferred).

It would be interesting to look into applying more general compression to the file list in a future version. It would also be interesting to see if an rsync-checksum-delta technique helps out on an inc-recursive file list. I used that idea in my protocol-experiment software a while back, and it may be a net win (though it has tradeoffs, as chunks of the file list must be buffered and sorted before being delta-difference transferred to the other side, while the current code sends file list info as soon as it is scanned).

..wayne..
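To put the transfer counts above in perspective, here is a small Python calculation that uses only the figures quoted in this message. The names COUNTS, FILES and summarize are illustrative, not part of rsync. It works out total bytes on the wire, bytes per file, and the saving relative to protocol 29.

```python
# Figures quoted in the message: (bytes sent, bytes received) per run.
COUNTS = {
    "Protocol 29": (533_947, 20),
    "Protocol 30, --inc-recursive": (503_658, 676),
    "Protocol 30, --no-inc-recursive": (501_189, 11),
}
FILES = 30_103  # size of the hierarchy in the test


def summarize(counts, files, baseline="Protocol 29"):
    """Return {label: (total_bytes, bytes_per_file, percent_saved_vs_baseline)}."""
    base_total = sum(counts[baseline])
    summary = {}
    for label, (sent, received) in counts.items():
        total = sent + received
        saved = 100.0 * (base_total - total) / base_total
        summary[label] = (total, total / files, saved)
    return summary


if __name__ == "__main__":
    for label, (total, per_file, saved) in summarize(COUNTS, FILES).items():
        print(f"{label:34s} {total:>8,d} bytes  {per_file:5.1f} B/file  {saved:4.1f}% saved")
```

On these numbers both protocol 30 modes save roughly 6% over protocol 29 and still cost on the order of 17 bytes per unchanged file, which is why the thread goes on to discuss delta-transferring the list itself.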
Re: reducing file list bytes transferred
On 10/1/07, Peter Salameh [EMAIL PROTECTED] wrote:
> Has the rsync team considered an rsync option which would remember the
> last file list on both ends, and only send changes to the list?

Perhaps a more natural approach is to use the delta-transfer algorithm to send the file list. Jamie Lokier suggested this approach here:

http://lists.samba.org/archive/rsync/2007-August/018345.html

Matt
RE: reducing file list bytes transferred
I think the next release is supposed to speed this up.

Stephen Zemlicka
Integrated Computer Technologies
PH. 608-558-5926
E-Mail [EMAIL PROTECTED]

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter Salameh
Sent: Monday, October 01, 2007 5:42 PM
To: rsync@lists.samba.org
Subject: reducing file list bytes transferred

Hello,

This is my first posting to the rsync list. I mirror a database containing directories which contain a very large number of files (say 30,000), and sending the file list can often take longer than transferring the new files. (Rsync ends up sending nearly the same file list on every transfer, with only the addition of a few new files.)

Has the rsync team considered an rsync option which would remember the last file list on both ends, and only send changes to the list? In my case this would reduce the number of files in the file list from 300,000 per day to under 1,000 per day. Might be a good addition to rsync-3.0.

Thanks in advance.

Peter Salameh
[EMAIL PROTECTED]
Re: reducing file list bytes transferred
On 10/1/07, Stephen Zemlicka [EMAIL PROTECTED] wrote:
> I think the next release is supposed to speed this up.

Are you thinking of incremental recursion? It interleaves file list and file data transmission but does not decrease the total amount of file-list data sent.

Matt
RE: reducing file list bytes transferred
I thought it was supposed to do it more efficiently

Stephen Zemlicka
Integrated Computer Technologies
PH. 608-558-5926
E-Mail [EMAIL PROTECTED]

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Matt McCutchen
Sent: Monday, October 01, 2007 8:10 PM
To: Stephen Zemlicka
Cc: Peter Salameh; rsync@lists.samba.org
Subject: Re: reducing file list bytes transferred

On 10/1/07, Stephen Zemlicka [EMAIL PROTECTED] wrote:
> I think the next release is supposed to speed this up.

Are you thinking of incremental recursion? It interleaves file list and file data transmission but does not decrease the total amount of file-list data sent.

Matt
Re: reducing file list bytes transferred
On 10/1/07, Stephen Zemlicka [EMAIL PROTECTED] wrote:
> I thought it was supposed to do it more efficiently

Not that I know of. (Wayne, please correct me if necessary.)

Matt
Re: reducing file list bytes transferred
On 10/2/07, Peter Salameh [EMAIL PROTECTED] wrote:
> Thanks for that. Doesn't the delta-transfer algorithm compare the files on
> sender and receiver?

Correct.

> For the file list, we would only need to compare the new file list with
> the last one on the sender

Unison ( http://www.cis.upenn.edu/~bcpierce/unison/ ) is a stateful two-way synchronizer that does essentially this by default; you can use Unison even for your one-way copy to get the performance benefit. Rsync is meant to be stateless, so if it were enhanced to reduce the amount of file list transferred, I think delta-transferring the file list would be more in keeping with its mission than saving a last file list.

Matt
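A rough Python sketch of what delta-transferring a file list could look like, under several assumptions: both ends can serialize their list to bytes, the receiver still has a list from its own scan of the destination, and a fixed block size is acceptable. For brevity it hashes every window with SHA-256 instead of using rsync's rolling weak checksum followed by a strong checksum, so it shows the idea rather than rsync's actual algorithm or wire format. BLOCK, block_signatures, make_delta and apply_delta are hypothetical names.

```python
import hashlib

BLOCK = 64  # illustrative block size in bytes


def block_signatures(old: bytes, block: int = BLOCK) -> dict:
    """Receiver side: map the hash of each full block of its list to the block index."""
    sigs = {}
    for start in range(0, len(old) - block + 1, block):
        digest = hashlib.sha256(old[start:start + block]).digest()
        sigs.setdefault(digest, start // block)
    return sigs


def make_delta(new: bytes, sigs: dict, block: int = BLOCK) -> list:
    """Sender side: encode the new list as ('copy', index) and ('data', bytes) ops."""
    ops, literal, pos = [], bytearray(), 0
    while pos + block <= len(new):
        index = sigs.get(hashlib.sha256(new[pos:pos + block]).digest())
        if index is None:
            literal.append(new[pos])  # no match here: emit one literal byte
            pos += 1
        else:
            if literal:
                ops.append(("data", bytes(literal)))
                literal = bytearray()
            ops.append(("copy", index))  # receiver already has this block
            pos += block
    literal += new[pos:]  # tail shorter than one block is always sent literally
    if literal:
        ops.append(("data", bytes(literal)))
    return ops


def apply_delta(old: bytes, ops: list, block: int = BLOCK) -> bytes:
    """Receiver side: rebuild the sender's list from its own list plus the ops."""
    out = bytearray()
    for kind, arg in ops:
        out += old[arg * block:(arg + 1) * block] if kind == "copy" else arg
    return bytes(out)
```

Because both lists are produced by fresh scans on each run, nothing is saved between runs, which keeps this approach stateless in the sense Matt describes.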