I've trying to do a backup from a remote host to a local machine via rsync using the following configuration:

Remote host (Red Hat Enterprise Linux ES3):

$ uname -srmpi
Linux 2.4.21-47.EL i686 athlon i386
$ cat /etc/redhat-release
Red Hat Enterprise Linux ES release 3 (Taroon Update 8)
$ rsync --version
rsync  version 2.6.8  protocol version 29

Local machine (Mac OS X 10.4.8, Intel):

$ uname -v
Darwin Kernel Version 8.8.1: Mon Sep 25 19:42:00 PDT 2006; root:xnu-792.13.8.obj~1/RELEASE_I38
$ rsync --version
rsync  version 2.6.8  protocol version 29

The backup is started from the local machine with root privs (using sudo), encrypted transport provided by SSH with public-key authentication; the invocation looks like this:

sudo rsync -e "ssh -i full_path_to_key_file" -avzx \
           --numeric-ids --delete --progress \
           [EMAIL PROTECTED]:/ /var/root/backups/example.com/

On the remote host, rsync logs in with root privileges and a forced command appears in the authorized keys file; logging shows that this is the actual command being sent to the server:

rsync --server --sender -vlogDtprxz --numeric-ids . /

Now, I think the SSH side of things is working fine because dry runs (with the -n switch) work fine but as soon as I try to do a real backup I always bail with an error like this:

Invalid file index: -1610612736 (count=9406) [sender]
rsync error: protocol incompatibility (code 2) at sender.c(169) [sender=2.6.8] rsync: writefd_unbuffered failed to write 4092 bytes [generator]: Broken pipe (32)rsync: connection unexpectedly closed (195901 bytes received so far) [receiver] rsync error: error in rsync protocol data stream (code 12) at io.c (463) [receiver=2.6.8] rsync error: error in rsync protocol data stream (code 12) at io.c (1119) [generator=2.6.8]

Sometimes the numbers are different (-1610612736, 1610612736 or 524288) but the error is always the same: an "Invalid file index" error and rsync subsequently dies.

In a previous post to the list (http://lists.samba.org/archive/rsync/ 2006-June/015828.html) Wayne wrote this:

This number is exactly 0x60000000, so another possibility that comes to mind is that the byte-order messed up somehow. Look in byteorder.h and make sure that CAREFUL_ALIGNMENT is being defined on any system that uses most-significant-byte-first ordering.

I'm getting the same number (1610612736), and I also note that 524288 is a suspicious-looking number as well (0x80000). I don't think there is a byte-ordering issue at play here. The configure script correctly identifies the build machine as little-endian (i386) and even if I override things and set CAREFUL_ALIGNMENT rsync still bails with those invalid file index errors.

Some warnings were produced during the build but they all look harmless to me. Things like:

rsync.c: In function ‘sig_int’:
rsync.c:231: warning: unused parameter ‘val’

This is because __attribute__((__unused__)) is omitted on the Mac OS X builds because of a bug in GCC; incidentally, that bug is no longer present in the current version:

$ gcc --version
i686-apple-darwin8-gcc-4.0.1 (GCC) 4.0.1 (Apple Computer, Inc. build 5363)

And some other warnings like this:

log.c: In function ‘rwrite’:
log.c:307: warning: passing argument 2 of ‘libiconv’ from incompatible pointer type

But that's just complaining because it expects a "const char*" and it's being passed a "char *", and it looks harmless enough to me.

"make test" succeeds with no failures. I also tried all this out with a copy of rsync built from yesterday's CVS and still the same problems...

Any ideas for how I could troubleshoot this further? I've tried running the backup with super verbosity (-vvv) at both ends; not sure if this will be of any use, but here are some relevant bits of output:

* It transfers the files list like so; this continues on for many thousands of lines:

receiving file list ...
server_sender starting pid=8651
[sender] make_file(.,*,2)
recv_file_name(.)
[sender] make_file(lost+found,*,2)
recv_file_name(lost+found)
[sender] make_file(boot,*,2)
recv_file_name(boot)
[sender] make_file(dev,*,2)
recv_file_name(dev)
[sender] make_file(proc,*,2)
recv_file_name(proc)
[sender] make_file(tmp,*,2)
recv_file_name(tmp)
[sender] make_file(var,*,2)
recv_file_name(var)
[sender] make_file(etc,*,2)
recv_file_name(etc)
[sender] make_file(root,*,2)
recv_file_name(root)
[sender] make_file(usr,*,2)
recv_file_name(usr)
[sender] make_file(lib,*,2)
recv_file_name(lib)
[sender] make_file(bin,*,2)
recv_file_name(bin)
[sender] make_file(home,*,2)
recv_file_name(home)
[sender] make_file(initrd,*,2)
recv_file_name(initrd)
[sender] make_file(mnt,*,2)
recv_file_name(mnt)
[sender] make_file(opt,*,2)
recv_file_name(opt)
[sender] make_file(sbin,*,2)
recv_file_name(sbin)
[sender] make_file(misc,*,2)
recv_file_name(misc)
[sender] make_file(nsr,*,2)
recv_file_name(nsr)
[sender] make_file(.autofsck,*,2)
recv_file_name(.autofsck)
[sender] make_file(dev/pts,*,2)
recv_file_name(dev/pts)

* After about 320,000 lines this phase comes to an end and the next one starts:

[sender] make_file(nsr/cores/nsrexecd,*,2)
recv_file_name(nsr/cores/nsrexecd)
[sender] make_file(nsr/cores/nsrexecd/.nsr,*,2)
recv_file_name(nsr/cores/nsrexecd/.nsr)
received 161416 names
161416 files to consider
recv_file_list done
get_local_name count=161416 /var/root/backups/example.com/
generator starting pid=7608 count=161416
deleting in .
recv_files(161416) starting
delete_in_dir(.)
send_file_list done
send_files starting
[generator] make_file(..autofsck.RLTEQ5,*,2)
[generator] make_file(bin,*,2)
[generator] make_file(boot,*,2)
[generator] make_file(dev,*,2)
[generator] make_file(etc,*,2)
[generator] make_file(home,*,2)
[generator] make_file(lib,*,2)
deleting ..autofsck.RLTEQ5
delete_in_dir(bin)
[generator] make_file(bin/awk,*,2)
[generator] make_file(bin/bash2,*,2)
[generator] make_file(bin/bsh,*,2)
[generator] make_file(bin/csh,*,2)
[generator] make_file(bin/dnsdomainname,*,2)
[generator] make_file(bin/domainname,*,2)
[generator] make_file(bin/egrep,*,2)
[generator] make_file(bin/ex,*,2)
[generator] make_file(bin/fgrep,*,2)
[generator] make_file(bin/gtar,*,2)
[generator] make_file(bin/nisdomainname,*,2)
[generator] make_file(bin/rbash,*,2)

* This continues up to about the 348,000-line mark and we see "delta- transmission enabled":

delete_in_dir(lib)
delta-transmission enabled
recv_generator(.,0)
set modtime of . to (1153412202) Thu Jul 20 18:16:42 2006
recv_generator(.autofsck,1)
recv_generator(bin,2)
recv_generator(bin/arch,3)
recv_generator(bin/ash,4)
recv_generator(bin/ash.static,5)
recv_generator(bin/awk,6)
bin/awk is uptodate
recv_generator(bin/basename,7)
recv_generator(bin/bash,8)
recv_generator(bin/bash2,9)
bin/bash2 is uptodate
recv_generator(bin/bsh,10)
bin/bsh is uptodate

* After a while there are a whole bunch of "mknod" calls; up to this point we produce about 350,000 lines of log statements:

recv_generator(dev/hdo19,974)
mknod(dev/hdo19,060660,0x59000013)
set modtime of dev/hdo19 to (1088100278) Thu Jun 24 20:04:38 2004
set gid of dev/hdo19 from 0 to 6
recv_generator(dev/hdo2,975)
mknod(dev/hdo2,060660,0x59000002)
set modtime of dev/hdo2 to (1088100278) Thu Jun 24 20:04:38 2004
set gid of dev/hdo2 from 0 to 6
recv_generator(dev/hdo20,976)

* At this point it appears that the directories and device nodes have all been created, so it starts to actually send/receive regular files and very soon runs into the "Invalid file index" failure:

send_files(1, /.autofsck)
recv_files(.autofsck)
mknod(dev/hdo20,060660,0x59000014)
.autofsck
set modtime of dev/hdo20 to (1088100278) Thu Jun 24 20:04:38 2004
set gid of dev/hdo20 from 0 to 6
recv_generator(dev/hdo21,977)
send_files(0, /.)
send_files(0, /.)
mknod(dev/hdo21,060660,0x59000015)
send_files(0, /.)
set modtime of dev/hdo21 to (1088100278) Thu Jun 24 20:04:38 2004
set gid of dev/hdo21 from 0 to 6
recv_generator(dev/hdo22,978)
0 100% 0.00kB/s 0:00:00 (xfer#1, to- check=161414/161416)
send_files(0, /.)
mknod(dev/hdo22,060660,0x59000016)
Invalid file index: -1610612736 (count=161416) [sender]
set modtime of dev/hdo22 to (1088100278) Thu Jun 24 20:04:38 2004
rsync error: protocol incompatibility (code 2) at sender.c(169) [sender=2.6.8]
_exit_cleanup(code=2, file=sender.c, line=169): about to call exit(2)
set gid of dev/hdo22 from 0 to 6
recv_generator(dev/hdo23,979)
mknod(dev/hdo23,060660,0x59000017)

* Finally we see a bunch more mknod calls (about 3,000 lines' worth):

set modtime of dev/hdo23 to (1088100278) Thu Jun 24 20:04:38 2004
set gid of dev/hdo23 from 0 to 6
recv_generator(dev/hdo24,980)
mknod(dev/hdo24,060660,0x59000018)
set modtime of dev/hdo24 to (1088100278) Thu Jun 24 20:04:38 2004
set gid of dev/hdo24 from 0 to 6
recv_generator(dev/hdo25,981)
mknod(dev/hdo25,060660,0x59000019)

* And we ultimately die like this:

set modtime of dev/ptydd to (1088100278) Thu Jun 24 20:04:38 2004
set gid of dev/ptydd from 0 to 5
recv_generator(dev/ptyde,1822)
mknod(dev/ptyde,020666,0x20000ee)
set modtime of dev/ptyde to (1088100278) Thu Jun 24 20:04:38 2004
set gid of dev/ptyde from 0 to 5
rsync: connection unexpectedly closed (3208995 bytes received so far) [generator] rsync error: error in rsync protocol data stream (code 12) at io.c (463) [generator=2.6.8]
_exit_cleanup(code=12, file=io.c, line=463): about to call exit(12)

So, not sure if there's anything in there which may explain the failure. Please let me know if there's any more information that would be useful.

--
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html

Reply via email to