Re: reducing file list bytes transferred

2007-10-03 Thread Peter Salameh
Thanks for that.  Doesn't the delta-transfer algorithm compare the files
on sender and receiver?  For the file list, we would only need to compare the
new file list with the last one on the sender (it would be a simple matter
to check that the receiver's file list is still valid with a checksum).

If the same rsync command is done over and over on a huge number of files,
then an option like --save_list=path/filename could be used to specify where
to store the list (on both ends).  The delta-transfer algorithm (or some
difference code specific to the file lists) could be used on the sender to
find differences between the new list and the saved list.

After reading the chain you referenced, I realized that the rsync command
should be identical (no change in options) for this to work without
side-effects, since I infer that the file list generated depends on the
options.  One possibility is to store the rsync command used to generate
the list along with the saved list, to check that the list is still valid.
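
A minimal sketch of the idea in this message, written in Python for brevity and not a description of how rsync works: the sender keeps the previous file list together with the command that produced it, confirms by checksum that the receiver's saved copy is the same, and then sends only the entries that changed. The function names, the (path, size, mtime) entry layout, and the choice of SHA-256 are assumptions made for illustration; --save_list is the hypothetical option proposed above, not an existing rsync flag.

```python
import hashlib

# Hypothetical entry layout: (path, size, mtime).
# A real rsync file list carries more fields (mode, uid, gid, ...).

def list_digest(command, entries):
    """Checksum over the command line plus the sorted entries."""
    h = hashlib.sha256()
    h.update(command.encode("utf-8") + b"\0")
    for path, size, mtime in sorted(entries):
        h.update(f"{path}\0{size}\0{mtime}\n".encode("utf-8"))
    return h.hexdigest()

def diff_lists(saved, new):
    """Return (changed_or_added_entries, deleted_paths) relative to saved."""
    saved_by_path = {e[0]: e for e in saved}
    new_by_path = {e[0]: e for e in new}
    changed = [e for p, e in sorted(new_by_path.items())
               if saved_by_path.get(p) != e]
    deleted = sorted(p for p in saved_by_path if p not in new_by_path)
    return changed, deleted

def plan_transfer(command, saved_command, saved, new, receiver_digest):
    """Send a delta only if the command is unchanged and the receiver's
    saved list matches ours; otherwise fall back to the full list."""
    if (command != saved_command
            or receiver_digest != list_digest(saved_command, saved)):
        return ("full", list(new), [])
    changed, deleted = diff_lists(saved, new)
    return ("delta", changed, deleted)
```

Falling back to the full list whenever the command or the checksum differs keeps the saved list a pure optimization: a stale or missing list costs bytes but never correctness.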

I'm sure there are many rsync users out there who would benefit from this.
Still, rsync is greatly appreciated.

Peter


> On 10/1/07, Peter Salameh [EMAIL PROTECTED] wrote:
> > Has the rsync team considered an rsync option which would remember the
> > last file list on both ends, and only send changes to the list?
>
> Perhaps a more natural approach is to use the delta-transfer algorithm
> to send the file list.  Jamie Lokier suggested this approach here:
>
> http://lists.samba.org/archive/rsync/2007-August/018345.html
>
> Matt


-- 
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html


Re: reducing file list bytes transferred

2007-10-03 Thread Peter Salameh
I agree that rsync being stateless is a good thing.  I have found it very
reliable and predictable for years.  I certainly wouldn't change to unison
(which apparently is currently not maintained) just to speed up the file
list transfer.

Delta-transfer of the file list seems to make good sense for rsync,
especially for one-way backup involving lots of files.  Is there any
chance that this will happen, or any way to gauge interest in this?

Peter


> Unison ( http://www.cis.upenn.edu/~bcpierce/unison/ ) is a stateful
> two-way synchronizer that does essentially this by default; you can
> use Unison even for your one-way copy to get the performance benefit.
>
> Rsync is meant to be stateless, so if it were enhanced to reduce the
> amount of file list transferred, I think delta-transferring the file
> list would be more in keeping with its mission than saving a last
> file list.
>
> Matt




Re: reducing file list bytes transferred

2007-10-02 Thread Wayne Davison
On Mon, Oct 01, 2007 at 08:21:25PM -0500, Stephen Zemlicka wrote:
> I thought it was supposed to do it more efficiently

Yes, protocol 30 has several improvements that reduce the number of
bytes sent over the wire in addition to its incremental recursion mode.
The latter is slightly less byte-efficient than a non-incremental pr.30
recursion, but is more disk I/O efficient the larger your file list gets
(not to mention more memory efficient).

For example, I did an rsync on a hierarchy of 30,103 files that didn't
need any updates.  This resulted in the following transfer counts:

Protocol 29:                     sent 533,947 bytes, received 20 bytes
Protocol 30, --inc-recursive:    sent 503,658 bytes, received 676 bytes
Protocol 30, --no-inc-recursive: sent 501,189 bytes, received 11 bytes

You'll note that the inc-recursive mode has a little more back chatter
due to the generator letting the sender know about its progress through
the file list.  Since that direction of flow is more lightly loaded than
the flow from sender to receiver, the small amount of extra data should
not adversely affect rsync's speed.
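
To put the three measurements above in proportion, here is a small Python calculation using only the byte counts and the file count quoted in this message. The variable and function names are illustrative, and the per-file figure is simply bytes sent divided by 30,103, not a statement about rsync's wire format.

```python
FILES = 30_103

# (bytes sent, bytes received) as reported above.
runs = {
    "protocol 29": (533_947, 20),
    "protocol 30, --inc-recursive": (503_658, 676),
    "protocol 30, --no-inc-recursive": (501_189, 11),
}
base_sent = runs["protocol 29"][0]

def summarize(sent, received):
    """Derive totals and savings relative to protocol 29 for one run."""
    return {
        "total": sent + received,
        "sent_per_file": sent / FILES,
        "saving_pct": 100.0 * (base_sent - sent) / base_sent,
    }

summary = {name: summarize(*counts) for name, counts in runs.items()}

for name, s in summary.items():
    print(f"{name}: {s['total']} bytes total, "
          f"{s['sent_per_file']:.1f} bytes sent per file, "
          f"{s['saving_pct']:.1f}% fewer bytes sent than protocol 29")
```

On these figures both protocol 30 modes send roughly 6% fewer bytes than protocol 29, and the extra back chatter of the incremental mode is a few hundred bytes against about half a megabyte sent.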

So, the inc-recursive transfer may be slightly less efficient over the
wire than pr.30 no-inc-recursive, but because the sending and receiving
sides tend to be doing more disk I/O at the same time, the total
transfer time can be less (depending on how much directory data is in
the disk cache, how large the transfer is, and how many files need to
be transferred).

It would be interesting to look into applying more general compression
to the file list in a future version.  It would also be interesting to
see if an rsync-checksum-delta technique helps out on an inc-recursive
file list.  I used that idea in my protocol-experiment software a while
back, and it may be a net win (though it has tradeoffs, as chunks of the
file list must be buffered and sorted before being delta-difference
transferred to the other side, while the current code sends file list
info as soon as it is scanned).
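
The buffering tradeoff described above can be sketched as follows. This is an illustration only, written in Python; it is neither the protocol-experiment code mentioned here nor rsync's rolling-checksum algorithm, but a simpler per-entry hash comparison on one buffered, sorted chunk of the file list. The string entry format, the function names, and the 8-byte truncated SHA-256 digest are all assumptions, and hash collisions are ignored.

```python
import hashlib

def entry_hash(entry):
    """Short digest identifying one file-list entry (assumed format)."""
    return hashlib.sha256(entry.encode("utf-8")).digest()[:8]

def receiver_signature(receiver_chunk):
    """Receiver side: hashes of the entries it already holds for this chunk."""
    return {entry_hash(e) for e in receiver_chunk}

def sender_delta(sender_chunk, signature):
    """Sender side: the chunk must be fully buffered and sorted before any
    of it can be sent. Known entries go out as short hash references,
    unknown entries as literal text."""
    ops = []
    for e in sorted(sender_chunk):
        h = entry_hash(e)
        ops.append(("ref", h) if h in signature else ("lit", e))
    return ops

def receiver_apply(receiver_chunk, ops):
    """Receiver side: rebuild the sender's sorted chunk from the ops."""
    by_hash = {entry_hash(e): e for e in receiver_chunk}
    return [by_hash[v] if kind == "ref" else v for kind, v in ops]
```

The sort inside sender_delta is where the cost shows up: nothing can be emitted until the whole chunk has been scanned, whereas the current code can send each entry as soon as it is found.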

..wayne..


Re: reducing file list bytes transferred

2007-10-01 Thread Matt McCutchen
On 10/1/07, Peter Salameh [EMAIL PROTECTED] wrote:
> Has the rsync team considered an rsync option which would remember the
> last file list on both ends, and only send changes to the list?

Perhaps a more natural approach is to use the delta-transfer algorithm
to send the file list.  Jamie Lokier suggested this approach here:

http://lists.samba.org/archive/rsync/2007-August/018345.html

Matt


RE: reducing file list bytes transferred

2007-10-01 Thread Stephen Zemlicka
I think the next release is supposed to speed this up.

_
Stephen Zemlicka
Integrated Computer Technologies
PH. 608-558-5926
E-Mail [EMAIL PROTECTED] 

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of
Peter Salameh
Sent: Monday, October 01, 2007 5:42 PM
To: rsync@lists.samba.org
Subject: reducing file list bytes transferred

Hello,

This is my first posting to the rsync list.  I mirror a database
containing directories which contain a very large number of files (say
30,000), and sending the file list can often take longer than transferring
the new files.  (Rsync ends up sending nearly the same file list on
every transfer, with only the addition of a few new files.)

Has the rsync team considered an rsync option which would remember the
last file list on both ends, and only send changes to the list?  In my
case this would reduce the number of files in the file list from 300,000
per day to under 1,000 per day.

Might be a good addition to rsync-3.0.

Thanks in advance.

Peter Salameh
[EMAIL PROTECTED]




Re: reducing file list bytes transferred

2007-10-01 Thread Matt McCutchen
On 10/1/07, Stephen Zemlicka [EMAIL PROTECTED] wrote:
> I think the next release is supposed to speed this up.

Are you thinking of incremental recursion?  It interleaves file list
and file data transmission but does not decrease the total amount of
file-list data sent.

Matt


RE: reducing file list bytes transferred

2007-10-01 Thread Stephen Zemlicka
I thought it was supposed to do it more efficiently.

_
Stephen Zemlicka
Integrated Computer Technologies
PH. 608-558-5926
E-Mail [EMAIL PROTECTED] 

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Matt
McCutchen
Sent: Monday, October 01, 2007 8:10 PM
To: Stephen Zemlicka
Cc: Peter Salameh; rsync@lists.samba.org
Subject: Re: reducing file list bytes transferred

On 10/1/07, Stephen Zemlicka [EMAIL PROTECTED] wrote:
> I think the next release is supposed to speed this up.

Are you thinking of incremental recursion?  It interleaves file list
and file data transmission but does not decrease the total amount of
file-list data sent.

Matt



Re: reducing file list bytes transferred

2007-10-01 Thread Matt McCutchen
On 10/1/07, Stephen Zemlicka [EMAIL PROTECTED] wrote:
> I thought it was supposed to do it more efficiently

Not that I know of.  (Wayne, please correct me if necessary.)

Matt


Re: reducing file list bytes transferred

2007-10-01 Thread Matt McCutchen
On 10/2/07, Peter Salameh [EMAIL PROTECTED] wrote:
> Thanks for that.  Doesn't the delta-transfer algorithm compare the files
> on sender and receiver?

Correct.

> For the file list, we would only need to compare the
> new file list with the last one on the sender

Unison ( http://www.cis.upenn.edu/~bcpierce/unison/ ) is a stateful
two-way synchronizer that does essentially this by default; you can
use Unison even for your one-way copy to get the performance benefit.

Rsync is meant to be stateless, so if it were enhanced to reduce the
amount of file list transferred, I think delta-transferring the file
list would be more in keeping with its mission than saving a last
file list.

Matt