state of the rsync nation? (revisited 6/2003 from 11/2000)

2003-06-07 Thread Jeff Kowalczyk
I'm interested in these very questions (librsync-rsync relationship,
remaining limitations of rsync, active prospects for ground-up rewrites),
but Google searches for rsync info have proved a little too vague due to
the program's ubiquity. Much has certainly changed since this was written.
Could some people with knowledge in these areas update Martin's
response for the state of rsync, June 2003? Thanks.

On 13 Nov 2000, Jason Ozolins wrote:
http://lists.samba.org/pipermail/rsync/2000-November/003147.html
 Just a quick question: is the librsync contained within the rproxy
 source code meant to be tracking the development of the mainstream
 rsync, or is it a stripped-down thing meant only to support rproxy?

On 13 Nov 2000, Martin Pool Responded: Here's a quick history:

In the beginning was rsync, which is a file transfer protocol. At the
moment I look after the day-to-day stuff, and tridge watches the
evolution.

rsync gave rise to Josh Macdonald's XDelta, which is optimized for the
case where old and new versions are on the same machine, and so it can
generate more efficient deltas.

tridge extracted the algorithm into librsync, which I renamed to libhsync
when I changed the wire format.  The code currently checked in as librsync
is in my opinion not very good.  It tries to make the algorithm available
at various levels to programs that would like to use it, though the only
user at the moment is rproxy.  rsync doesn't use libhsync -- possibly it
never will, as we care enough about rsync performance that tighter
integration is justified.  Well, if we were starting from scratch it might
be separated out, but it's not worth doing it retrospectively now.

The problems with rsync at the moment are basically:

 * Quirks of design ('triangular' TCP sockets, etc) tend to provoke
   bugs in operating systems or remote shells.

 * Useful features have been added in ad-hoc, and so the code is
   fairly crufty in places.

 * People still want even more features for special cases.  To avoid
   feature hell, my opinion is that we need a clean scripting or plugin
   mechanism.

 * rsync is optimized for transferring relatively small trees
   (e.g. the rsync source tree) across slow links (e.g. 56kbps ppp). This
   is fine and important, but people want to use it for different
   situations (10GB, 100Mbps, 50 in parallel) where some design decisions
   (e.g. traverse the whole tree up front) are no longer optimal or even
   adequate.

rproxy uses the rsync algorithm to improve HTTP caching -- it's not
rsync-over-HTTP.  I'm the lead developer for it, and it's in beta.

Completely unrelated to rproxy, sfr has added a small feature to tunnel
rsync through HTTP CONNECT proxies.

Therefore, some people at Linuxcare (primarily rusty, tridge and myself)
are looking at a ground-up rewrite with new code and a new network
protocol.  (Of course we will have a fallback mode.)  This might be called
rsync-3.0, or rsync-tng, or tsync, or something else.

This will likely be a more traditional client-server protocol, somewhat
similar to FTP and HTTP in that the client sends commands to the server to
put or get files.  However, commands will be pipelined,
network-independent binary, and using only a single tcp connection. In
general we hope that there will be fewer special cases, and probably that
there will be less application-level intelligence in the server and more
in the client.  This should be a firmer foundation for building things
such as

 * implementations in different languages/platforms (Java, Win32
   native, INTERCAL, ...)

 * interactive rsync (like ftp(1))

 * two-way rsync (controlled by the client, which could be automatic
   or even have a GUI.)

 * rsync as a transport for things such as CVS

Discussion about either feature requests or implementation ideas would be
very welcome.  It's probably best to send them to the rsync mailing list.

 The reason I ask is that I am thinking of extending Bob Edwards'
 rsync-based backup server architecture here at DCS, using a database to
 hold file metadata, doing binary deltas for history, and doing block
 compression on backed up data.  This is a fair amount of stuff to
 change, and I was wondering which source base would be better to start
 with.

You might like to look at the XDelta work on XDFS and PCVS, or in the
longer term to work on rsync 3.0.

-- 
To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html


rsync and cygwin

2003-06-07 Thread Trey Nolen
I had sent a message to the list earlier this week detailing an error
message I am getting using rsync on Cygwin on Windows 2000.  I have now
duplicated the problem on another windows 2000 server. I have a third server
on which the process works, so I don't really know what is going wrong.  All
three servers are rsyncing to the same Linux server.  Below is the original
question. Any help would be appreciated.

The error is this:

Invalid file index some number (count=some number)
rsync error: protocol incompatibility (code 2) at sender.c (135)


The command line I'm using to start this is:
c:\cygwin\bin\rsync -aR -e c\cygwin\bin\ssh.exe -v --numeric-ids
--exclude-from=/cydrive/c/ssh/rsync-excludes.txt --dry-run --delete
--delete-after /cygdrive/c root@backupserver:/

rsync-excludes.txt  contains things like:
-PEACHW/
-WINPOINT/
-PNTDATA/
:
:


At first I thought that maybe my file list was getting too long, but I
changed my excludes so that I had a file list that I knew was short enough
and I still get the error. I get the error on different files depending on
what I have excluded, but if I run the same command twice,  I get the error
in the exact same spot.  Thanks in advance for any help.





Re: rsync and cygwin

2003-06-07 Thread Trey Nolen
Hmm... This backup server is handling backup jobs from about a dozen other
machines -- all but three are Linux. The other three are Windows 2000
servers running Cygwin.  Two of those are not working (giving the error).
The backup machine has other duties, too, and has not had any issues. The
uptime is great, and the performance has been fine.

One other thing that might be important:  This error happens *JUST* after
building the file list. It is like the transfer doesn't work at all. It
errors immediately as the transfer starts.

Trey Nolen


 The error is this:

 Invalid file index some number (count=some number)
 rsync error: protocol incompatibility (code 2) at sender.c (135)


 That error indicates something is seriously wrong at the
 communications level.  Given an ssh over TCP connection i'd
 rule out the network layer.  That leaves internal data
 corruption.  I'd start by questioning the integrity of the
 common point of failure--the backup server.  Check the logs
 for errors, run memtest, etc.





Re: rsync and cygwin

2003-06-07 Thread jw schultz
On Sat, Jun 07, 2003 at 05:36:03PM -0500, Trey Nolen wrote:
 Hmm... This backup server is handling backup jobs from about a dozen other
 machines -- all but three are Linux. The other three are Windows 2000
 servers running Cygwin.  Two of those are not working (giving the error).
 The backup machine has other duties, too, and has not had any issues. The
 uptime is great, and the performance has been fine.

OK.  Now we get a fuller picture.  All i knew before was you
had one Linux box and three W2K cygwin boxes.  That means we
have 2/3 of cygwin boxes having the problem but not the
Linux boxes.

 One other thing that might be important:  This error happens *JUST* after
 building the file list. It is like the transfer doesn't work at all. It
 errors immediately as the transfer starts.
 
 Trey Nolen
 
 
  The error is this:
 
  Invalid file index some number (count=some number)

Rsync uses the index number of the file list array to
identify files.  This message comes from the sender.  The
first number is the file number that was requested by the
receiver, the second is the total number of files in the
index.  For the sender to get a request for a file number
not in the list indicates serious corruption somewhere.
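
A minimal sketch of that check, written in Python rather than rsync's C.
The names ProtocolError and lookup_requested_file are hypothetical and
only illustrate the idea; this is not the code in sender.c.

```python
class ProtocolError(Exception):
    """Raised when the peer sends something the protocol does not allow."""


def lookup_requested_file(file_list, index):
    """Return the file-list entry the receiver asked for, or fail loudly.

    Hypothetical helper: it assumes both sides hold the same file list
    and that the receiver names files only by their position in it.
    """
    count = len(file_list)
    if index < 0 or index >= count:
        # Same wording as the message quoted in this thread.
        raise ProtocolError(f"Invalid file index {index} (count={count})")
    return file_list[index]
```

An index outside 0..count-1 can only arrive if the two sides disagree
about the list or the request was damaged on the way, which is why the
error is reported as a protocol problem.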

My next step would be to examine the binaries on cygwin,
rsync and ssh.  Try checksumming them for a difference
between the badly behaving machines and the still OK one.  
I'd then consider that the cygwin binaries might have been 
built incorrectly.  After that, i don't know, perhaps
creeping corruption on the two machines?
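
If md5sum and sha1sum are not available on every box, something like the
following Python sketch would produce comparable digests. The function
name file_digests is made up for this example; only the standard hashlib
module is assumed.

```python
import hashlib


def file_digests(path, chunk_size=1 << 16):
    """Return (md5_hex, sha1_hex) for the file at `path`.

    The file is read in chunks so large binaries need not fit in memory.
    """
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()
```

Run it against rsync.exe and ssh.exe on each machine and compare the
pairs; any mismatch between a failing box and the working one is worth
following up.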

  rsync error: protocol incompatibility (code 2) at sender.c (135)
 
 
  That error indicates something is seriously wrong at the
  communications level.  Given an ssh over TCP connection i'd
  rule out the network layer.  That leaves internal data
  corruption.  I'd start by questioning the integrity of the
  common point of failure--the backup server.  Check the logs
  for errors, run memtest, etc.

-- 

J.W. Schultz            Pegasystems Technologies
email address:  [EMAIL PROTECTED]

Remember Cernan and Schmitt


Re: rsync and cygwin

2003-06-07 Thread Lapo Luchini
jw schultz wrote:

I'd then consider that the cygwin binaries might have been
built incorrectly.  After that, i don't know, perhaps
creeping corruption on the two machines?

BTW: those are my latest builds of rsync/cygwin:
$ md5sum /usr/bin/rsync.exe
60ee6b728e94c3dde4b5da0274057b7e */usr/bin/rsync.exe
$ sha1sum /usr/bin/rsync.exe
d4afecddd974fbdddc3d4d152f8a0d8da942daf8 */usr/bin/rsync.exe
$ rsync --version
rsync  version 2.5.6  protocol version 26
Copyright (C) 1996-2002 by Andrew Tridgell and others
http://rsync.samba.org/
Capabilities: 32-bit files, socketpairs, hard links, symlinks, batchfiles,
 no IPv6, 32-bit system inums, 64-bit internal inums
rsync comes with ABSOLUTELY NO WARRANTY.  This is free software, and you
are welcome to redistribute it under certain conditions.  See the GNU
General Public Licence for details.
- --
Lapo 'Raist' Luchini
[EMAIL PROTECTED] (PGP  X.509 keys available)
http://www.lapo.it (ICQ UIN: 529796)


Re: rsync and cygwin

2003-06-07 Thread cbarratt
 On Sat, Jun 07, 2003 at 05:36:03PM -0500, Trey Nolen wrote:
  Hmm... This backup server is handling backup jobs from about a dozen other
  machines -- all but three are Linux. The other three are Windows 2000
  servers running Cygwin.  Two of those are not working (giving the error).
  The backup machine has other duties, too, and has not had any issues. The
  uptime is great, and the performance has been fine.

Another thing to try is the --blocking-io or --no-blocking-io options.
It's not too likely this will help (since some of your cygwin boxes work
ok), but it is still worth a try.  This was an issue on older Solaris
versions because of problems with their ssh.

I've also found rsync over ssh on cygwin is not fully reliable.  On
the other hand, rsyncd on cygwin has been highly reliable for me.  So
if all else fails you could run rsyncd as a WinXX service (assuming
you initiate the connection from the backup server rather than the
other way around).

Craig


Re: state of the rsync nation? (revisited 6/2003 from 11/2000)

2003-06-07 Thread Donovan Baarda
On Sun, 2003-06-08 at 00:31, Jeff Kowalczyk wrote:
 I'm interested in these very questions (librsync-rsync relationship,
 remaining limitations of rsync, active prospects for ground-up rewrites),
 but Google searches for rsync info have proved a little too vague due to
 the program's ubiquity. Much has certainly changed since this was written.
 Could some people with knowledge in these areas update Martin's
 response for the state of rsync, June 2003? Thanks.

Regarding librsync... It is still in sort-of-active development on
SourceForge by a variety of developers... a new release is waiting in
CVS for me to finally get around to releasing it, but I'm busy on a big
contract at the moment so it's currently on hold pending some more
cygwin/win32 testing. It is in active use by projects like rdiff-backup.

AFAIK, rproxy is pretty much dead, and the only version that exists
depends on a very old version of libhsync. The closest thing to this
available now is the http proxy proof of concept with xdelta, but it's
radically different in many ways to the old rproxy (due to xdelta not
using signatures).

 On 13 Nov 2000, Jason Ozolins wrote:
 http://lists.samba.org/pipermail/rsync/2000-November/003147.html
  Just a quick question: is the librsync contained within the rproxy
  source code meant to be tracking the development of the mainstream
  rsync, or is it a stripped-down thing meant only to support rproxy?
 
 On 13 Nov 2000, Martin Pool Responded: Here's a quick history:
[...]
 rsync gave rise to Josh Macdonald's XDelta, which is optimized for the
 case where old and new versions are on the same machine, and so it can
 generate more efficient deltas.

xdelta is still under active development by Josh, and is evolving into a
fancy versioning virtual file system... an ideal back-end for something
like subversion. Josh tends to develop stuff with little fanfare, but
his code tends to be _very_ clean.

 tridge extracted the algorithm into librsync, which I renamed to libhsync
 when I changed the wire format.  The code currently checked in as librsync
 is in my opinion not very good.  It tries to make the algorithm available
 at various levels to programs that would like to use it, though the only
 user at the moment is rproxy.  rsync doesn't use libhsync -- possibly it
 never will, as we care enough about rsync performance that tighter
 integration is justified.  Well, if we were starting from scratch it might
 be separated out, but it's not worth doing it retrospectively now.
[...]

This is largely still true, except libhsync changed back to librsync and
now has its own project on SourceForge separate from the mostly defunct
rproxy. librsync itself has no wire format, being just a general
purpose signature/delta/patch library implementing the rsync algorithm.
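
To make the signature/delta/patch split concrete, here is a toy Python
sketch of the three steps. It is not librsync's API or on-disk format:
the function names, the 4-byte block size and the choice of adler32 plus
MD5 as weak and strong checksums are all assumptions for illustration.
For brevity it also recomputes the weak checksum at every offset, where
the real algorithm rolls it forward one byte at a time.

```python
import hashlib
import zlib

BLOCK = 4  # tiny block size so the example is easy to follow


def signature(old: bytes) -> dict:
    """Map (weak, strong) checksums of each full block of `old` to its offset."""
    sig = {}
    for offset in range(0, len(old) - BLOCK + 1, BLOCK):
        block = old[offset:offset + BLOCK]
        key = (zlib.adler32(block), hashlib.md5(block).digest())
        sig.setdefault(key, offset)
    return sig


def delta(sig: dict, new: bytes) -> list:
    """Describe `new` as ("copy", offset) and ("literal", bytes) operations."""
    weak_sums = {weak for weak, _ in sig}
    ops, literal, pos = [], bytearray(), 0
    while pos < len(new):
        block = new[pos:pos + BLOCK]
        weak = zlib.adler32(block)
        # Only pay for the strong checksum when the cheap one matches.
        key = (weak, hashlib.md5(block).digest()) if weak in weak_sums else None
        if len(block) == BLOCK and key in sig:
            if literal:
                ops.append(("literal", bytes(literal)))
                literal = bytearray()
            ops.append(("copy", sig[key]))
            pos += BLOCK
        else:
            literal.append(new[pos])
            pos += 1
    if literal:
        ops.append(("literal", bytes(literal)))
    return ops


def patch(old: bytes, ops: list) -> bytes:
    """Rebuild the new data from `old` plus the delta operations."""
    out = bytearray()
    for kind, arg in ops:
        out += old[arg:arg + BLOCK] if kind == "copy" else arg
    return bytes(out)
```

With old = b"the quick brown fox" and new = b"the quick red fox", the
delta is two copies (offsets 0 and 4) followed by the literal
b"k red fox", and patch(old, delta(signature(old), new)) gives back new.
Only the signature and the delta would need to cross the wire; how they
are serialised is left to the application.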

The comments about rsync never using libhsync/librsync are still true
for the foreseeable future. There are many things rsync includes that
are still missing from librsync, and the rsync implementation is very
tightly coupled, with many backwards compatibility issues. Even when
librsync reaches the point of being as good as or better than rsync at
signature/delta/patch calculation, it would be a major task to fit it
into rsync.

rsync also has more active development, mostly in the form of
incremental feature additions and the resulting bugfix fire-fighting,
all of which lead to an even more tangled implementation. Occasionally
there are efforts to re-write and clean up sections of the code, but
they are (rightly) regarded cautiously because of the breakage risk
involved for little immediate gain.

The librsync code in CVS is still largely not very good. It is pretty 
messy and needs a good cleanup. The API is mostly OK though, and it
_does_ work quite well, with no known bugs. I have some plans for a
major cleanup and optimisation of the code based on my experiences with
pysync. I have a patch submitted that I plan to commit after the next
release that optimises and cleans up the delta calculation code quite a
bit.

The next big thing in delta calculation is probably going to be the
vcdiff encoding format, which should allow a common delta format for
various applications and supports self-referencing deltas, which
makes it capable of compression. According to the xdelta project this
has already been implemented, and I'm keen to see Josh's code, as it
could be used as the basis for a cleanup/replacement of at least the
patch component of librsync.
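
A rough Python sketch of what "self-referencing" means here. This is a
simplified model, not the vcdiff byte encoding: the instruction tuples
and the apply_delta name are invented for the example. The one idea it
borrows is that a copy address may point past the end of the source and
into the output produced so far.

```python
def apply_delta(source: bytes, instructions: list) -> bytes:
    """Apply ("add", data) and ("copy", address, length) instructions.

    Addresses index into `source` followed by the target built so far,
    so a copy may refer to output the same delta has already emitted.
    """
    target = bytearray()
    for instruction in instructions:
        if instruction[0] == "add":
            target += instruction[1]
        else:
            _, address, length = instruction
            for i in range(length):
                pos = address + i
                # Byte at a time, so a copy may overlap what it is writing.
                if pos < len(source):
                    target.append(source[pos])
                else:
                    target.append(target[pos - len(source)])
    return bytes(target)
```

With an empty source, [("add", b"ab"), ("copy", 0, 6)] expands to
b"abababab": two literal bytes plus one short copy describe eight bytes
of output, which is how a delta with no old file at all can still act as
a compressor.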

Also possibly worth mentioning are things like pysync, which is a
demonstration implementation of rsync and xdelta, as well as a wrapper
for librsync. I'm kind of embarrassed though that at the moment
rdiff-backup probably has a better python wrapper of librsync than
pysync does.

I believe there have also been some implementations of rsync in Perl (one
that claims to talk to rsync, which is an amazing achievement), but I'm
not up to date on those. I think someone has a Perl wrapper for librsync
that was being used as