On Tue, Mar 06, 2001 at 07:08:05PM +1030, Alex C wrote:
> I'm working on a project where I need to automate the transfer of
> files securely over a dialup connection. Files need to be moved both
> ways with wildcard pattern matching needed on both sides to find the
> right files.
> 
> I've got this working with ssh and scp, but this requires many
> separate ssh invocations (especially for retrieving files, e.g. ssh to
> ls files, scp to copy files, ssh to rm files). There is a noticeable
> delay for each ssh invocation, and this is more error prone since
> accidental disconnection (e.g. of the dialup link) could leave the
> files existing on both sides.
> 
> This is what I need:
> - Secure transfers
> - The ability to send/retrieve all files matching wildcard patterns
> - The ability to have files deleted after they have been transferred
> - Atomic operation as much as possible, so that files won't end up
>   existing on both sides in the case of an error
> - The ability to do it all with as few reconnections as possible
> 
> It looks like rsync would be great for this, since it can work over
> ssh, match wildcards on the remote side with --include etc. but there
> doesn't appear to be a way to remove the files (at least on the remote
> side) after they have been received, and only one transfer direction
> is supported per rsync invocation. Is there a way to get around these
> problems or would I be better off just using ssh or something else?
> Connecting once per send operation and once per receive operation
> would be satisfactory, but moving instead of copying is essential.
> 
> I guess what I really want to be able to do is
>   rsync --move src dest , src2 dest2 , src3 dest3


I don't think that type of operation is likely to get into rsync itself
but I could certainly see that something could be built successfully on
top of rsync to do that.
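
As a rough illustration of what such a wrapper might look like, here is a
minimal Python sketch. It assumes an rsync recent enough to have the
--remove-source-files option, which later rsync releases added and which
deletes each source file on the sending side once it has been transferred.
The function names, the job tuples, and the host and directory names in the
usage note are made up for illustration. Each direction still costs one
rsync (and so one ssh) connection.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple

# One job: (source directory, destination directory, wildcard patterns).
Job = Tuple[str, str, Sequence[str]]


def build_move_command(src: str, dest: str, patterns: Sequence[str]) -> List[str]:
    """Build one rsync-over-ssh command that moves the files matching patterns.

    Assumption: the installed rsync supports --remove-source-files.
    """
    cmd = ["rsync", "-rt", "-e", "ssh", "--remove-source-files"]
    # Include only the wanted patterns, then exclude everything else.
    cmd += [f"--include={p}" for p in patterns]
    cmd.append("--exclude=*")
    cmd += [src, dest]
    return cmd


def run_moves(jobs: Sequence[Job],
              run: Callable[[List[str]], int] = subprocess.call) -> int:
    """Run the jobs in order, stopping at the first rsync that fails.

    Returns the number of jobs that completed with exit status 0.
    """
    done = 0
    for src, dest, patterns in jobs:
        if run(build_move_command(src, dest, patterns)) != 0:
            break
        done += 1
    return done
```

A call such as run_moves([("outbox/", "remote:inbox/", ["*.dat"]),
("remote:outbox/", "inbox/", ["*.ack"])]) would run one rsync per direction.
The commands are passed as argument lists, so the wildcards reach rsync
without being expanded by a local shell.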



> Also, it seems to be possible to send all the files with rsync and
> then remove files based on rsync's output with --log-format=%f, but
> rsync sometimes lists files even if they haven't been successfully
> transferred. Is this a bug? Is the assumption that a file has been
> transferred successfully if it is listed on stdout with --log-format
> and its name did not appear on stderr reasonable?


People have long wanted the --log-format option to be more robust and to
provide some guarantees, but unfortunately it does not.  It needs some
close attention before it can do a good job.
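
Until then, a script that deletes files could confirm each transfer on its
own instead of trusting the log output. The following Python sketch compares
SHA-256 digests of the source and destination copies and returns only the
names that match. It assumes both directories are reachable as local paths;
for a remote side the same comparison would have to be run over ssh. The
function names are hypothetical.

```python
import hashlib
import os
from typing import Iterable, List


def file_digest(path: str) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def confirmed_transfers(names: Iterable[str], src_dir: str, dest_dir: str) -> List[str]:
    """Return the names whose destination copy exists and matches the source.

    Only these names would be safe to delete from the source side.
    """
    confirmed = []
    for name in names:
        src = os.path.join(src_dir, name)
        dest = os.path.join(dest_dir, name)
        if (os.path.isfile(src) and os.path.isfile(dest)
                and file_digest(src) == file_digest(dest)):
            confirmed.append(name)
    return confirmed
```

Names that rsync listed but that fail this check are simply left in place,
so an interrupted transfer can be retried on the next run.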

The most recent related discussion was last October, but I see it took
place in private email, so I will attach it here for the record.
Follow the references to the earlier discussions in January 1999.

- Dave Dykstra


On Wed, Oct 11, 2000 at 11:04:15AM -0500, Dave Dykstra wrote:
> Date: Wed, 11 Oct 2000 11:04:15 -0500
> From: Dave Dykstra <[EMAIL PROTECTED]>
> To: Douglas N Arnold <[EMAIL PROTECTED]>
> Cc: [EMAIL PROTECTED], Andrew Tridgell <[EMAIL PROTECTED]>, [EMAIL PROTECTED]
> Subject: Re: --report patch for rsync
> 
> I'm including Martin Pool in the Cc on this now since he seems to have
> taken over primary maintenance of rsync; it will really be up to him in
> discussion with Andrew (they work in the same office) to decide.  According
> to my email record it looks like you contacted Andrew and me directly, not
> via the rsync mailing list, and didn't send us a patch to look at.  Ah, I
> see the patch is available now on your web page
>     http://www.math.psu.edu/dna/synchron
> in the "download" section.
> 
> My opinion is that rsync *should* change in scattered places throughout the
> code in order to do this properly.  I don't think it has to be an
> "extensive" change, but I do expect that it will not be entirely localized.
> Protocol changes would be acceptable.
> 
> The main thing I don't like about --report is that its format is completely
> set and doesn't give the user any control over the output like --log-format
> can.  Also, I don't see why it should have to imply --dry-run (not to
> mention so many other options); some people might like to get that
> information about what rsync did, not just what it would do.  My aim is for
> maximum flexibility so that somebody else doesn't need to come along in
> another 6 months with a requirement for something that's slightly different
> and need to add yet another new option.
> 
> Another possibility: it seems to me you could do a compromise between your
> current --report and what I had envisioned for --log-format by extending
> the --log-format option to do pretty much exactly what your --report option
> does (except implying --dry-run) but not do the other things I had
> proposed.  If you need to put restrictions on to say that it only works on
> the client side, or that whatever % substitution you choose can't be used
> in combination with some of the others, that's probably something we could
> handle.  Somebody in the future may come along and remove the restrictions.
> 
> - Dave Dykstra
> 
> 
> On Tue, Oct 10, 2000 at 06:07:01PM -0400, Douglas N Arnold wrote:
> > 
> > Dear Dave,
> > 
> > After some delay we would like to return to the question of our patch
> > to rsync for reporting differences between filesystems. Recall that we
> > have written a patch that adds an option --report which causes rsync
> > not to transfer any files, but just to report filesystem differences on
> > standard output in a format like this:
> > 
> >      building file list ... done
> >      _F text/papers/dgerr/TODO
> >      _F info/ethernet/twisted-pair-cables
> >      Fd README
> >      F_ databases/maps/map4.ps
> >      fF text/addresses.prof
> >      D_ text/notes
> >      F_ text/notes/README
> >      Ff www/complex.html
> >      lL www/interpolation/index.html
> >      wrote 873 bytes  read 20 bytes  71.44 bytes/sec
> >      total size is 83312  speedup is 93.29
> > 
> > The report lists all files that differ between the filesystems for any
> > reason (they exist on only one side, or they have different dates on
> > the two sides, or they are different file types on the two sides,...). 
> > The two character code at the beginning of the line encodes the reason.
> > 
> > Such a report can be quite valuable in itself to let a user know the
> > status quo, but its main value is that it can be easily parsed in order
> > to determine what actions are needed to synchronize the two file
> > systems according to the user's requirements.  We have written a couple
> > of scripts which do exactly this (see www.math.psu.edu/dna/synchron), and
> > these have become extremely popular here.
> > 
> > You wrote:
> > 
> > >>  Something like this is badly needed, but I had in mind doing it a little
> > >>  differently:
> > >>  
> > >>      - I thought it would be a modification to the --log-format option
> > >>          rather than a separate option.   Please review the threads at
> > >>              http://lists.samba.org/pipermail/rsync/1999-January/000950.html
> > >>          and
> > >>              http://lists.samba.org/pipermail/rsync/1999-January/000954.html
> > >>          I see that you were involved in the discussion back then too.  There
> > >>          were some problems with --log-format not working during --dry-run,
> > >>          but that's certainly fixable.  A nice side benefit to this is that
> > >>          this information could also then be put into the rsync --daemon's
> > >>          server log.
> > 
> > We have reviewed the various threads and given this a lot of thought,
> > but we think that what you ask is impossible without very extensive
> > changes to the way rsync now works. (By contrast, our patch is quite
> > localized in the code, and if the new --report option is not on the
> > command line, the behavior of rsync is 100% unchanged).  The reason we
> > can't see how to implement filesystem difference reporting as part of
> > the --log-format option is that the server machine that writes the log
> > entries may be either the sender or the receiver, and when it is the
> > sender, it does not know the information we want to report, at the
> > moment the file has been sent. In the current distribution, the log
> > information for any file is "logged" right before "rsync" starts
> > transmitting the file itself.  At this moment, though, the sender
> > knows only the file name, but does not know the reason "why this file
> > should be sent".  To get this information to the sender, the rsync
> > protocol would have to be changed so that along with a request for a
> > file, the receiver would have to issue the reason for the request.
> > 
> > With our --report option, the status of the file on the sender and on
> > the receiver is output by the receiver on FINFO and shows up on the
> > user's standard output.
> > 
> > In short, logging is done by the server machine, while reporting
> > is done by the receiver.  Switching reporting to the server, so it
> > could be incorporated in the log is impossible without extensive
> > changes, because only the receiver has the necessary information.
> > 
> > Or at least this is our understanding.  Did we get something wrong?
> > 
> > You also wrote:
> > 
> > >>  - I think it would be better for it to not imply any option unless
> > >>  it really needs to, because some people may want to use it without
> > >>  doing a --dry-run, for example.
> > 
> > We view our option as analogous to the --dry-run option, but more
> > informative. The man page describes --dry-run as
> > 
> >   This tells rsync to not do any file transfers, instead it will just
> >   report the actions it would have taken.
> > 
> > The equivalent description for --report would be
> > 
> >   This tells rsync to not do any file transfers, instead it will just
> >   report the differences between the two filesystems.
> > 
> > Like --dry-run, the information is written to the user's terminal,
> > not to the log.
> > 
> > Of course we could easily implement our patch so that it replaces the
> > previous --dry-run option.  But this would hurt backward compatibility:
> > if users have scripts that depend on the current format of the
> > --dry-run output, those would break.
> > 
> > We hope that this convinces you that the new --report option
> > is valuable and that it is worth incorporating the patch in the rsync
> > distribution.  Again, we emphasize that this will not break or change
> > any current aspect of rsync functionality, and it will add a capability
> > that many people find useful.
> > 
> > What do you think?
> > 
> > Doug Arnold and Ludmil Zikatanov
> > 
> > 
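
The --report listing quoted above is meant to be parsed by scripts, so here
is a minimal Python sketch of one way to read it. The meaning of the
individual code letters is not spelled out in this thread, so the sketch
keeps the two characters as opaque values. It also assumes, from the sample
alone, that code characters are letters or underscores. The names
ReportEntry and parse_report are hypothetical and are not taken from the
synchron scripts.

```python
import re
from typing import Iterable, List, NamedTuple


class ReportEntry(NamedTuple):
    first: str   # first character of the two-character code
    second: str  # second character of the two-character code
    path: str    # file name, which may contain spaces


# Two code characters, whitespace, then the path. Status lines such as
# "building file list ... done" start with a longer word and do not match.
_REPORT_LINE = re.compile(r"^([A-Za-z_])([A-Za-z_])\s+(\S.*)$")


def parse_report(lines: Iterable[str]) -> List[ReportEntry]:
    """Extract (first, second, path) entries from --report style output."""
    entries = []
    for line in lines:
        match = _REPORT_LINE.match(line.strip())
        if match:
            entries.append(ReportEntry(*match.groups()))
    return entries
```

A script could then select, for example, every entry whose code is "_F" and
decide what to do with those files. A status line that happened to begin
with a two-letter word would be misread as an entry, so a real parser would
need the exact output rules from the patch.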
