Re: exit status 13 in version 3.1

2009-11-15 Thread Wayne Davison
On Thu, Nov 12, 2009 at 10:00 PM, Matt McCutchen m...@mattmccutchen.net wrote:

 My point is that it's a clunky way to achieve the goal, and it would be
 simpler for the sender to just keep reading after a write error.


Yeah, that's a good idiom, and the latest code wasn't doing a good enough
job of that when the socket closed.  One new issue turned out to be that the
client started getting a read error (where the old code would get a write
error) due to the buffer-filling code getting an EOF on the socket (even
though the perform_io() call was ostensibly there to try to write some
data).  There was also a bug where an EOF error did not try to empty out any
message data currently in the buffer.

I've committed some improvements for this, so the error reporting should be
better when talking to older versions.  For instance, the input code now
just notes an input EOF unless it is there to read some data and the data is
not present.  It also does a better job of draining any pending error
messages from the input buffer (and reading any final messages) when a write
error occurs.
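
The idiom discussed here (keep reading after a write error so that pending error messages are not lost) can be sketched outside of rsync. The following is a minimal Python illustration, not rsync's actual C code; the function name, the chunk size, and the one-second read timeout are all assumptions made for the example.

```python
import socket


def send_with_best_effort_read(sock, payload, chunk=4096, read_timeout=1.0):
    """Write payload; if the write fails, drain what the peer already sent.

    Hypothetical helper for illustration only.
    Returns (bytes_sent, drained_bytes, write_error_or_None).
    """
    sent = 0
    write_error = None
    try:
        while sent < len(payload):
            sent += sock.send(payload[sent:sent + chunk])
    except (BrokenPipeError, ConnectionResetError) as exc:
        # The peer went away. Remember the error but do not exit yet.
        write_error = exc

    drained = b""
    if write_error is not None:
        # Best effort: read whatever is still queued on our side of the
        # socket, stopping at EOF, on a timeout, or on any read error.
        sock.settimeout(read_timeout)
        try:
            while True:
                data = sock.recv(chunk)
                if not data:
                    break
                drained += data
        except OSError:
            pass
    return sent, drained, write_error


if __name__ == "__main__":
    ours, peer = socket.socketpair()
    # The peer reports its fatal error and then closes the connection.
    peer.sendall(b"write failed: No space left on device (28)\n")
    peer.close()
    sent, drained, err = send_with_best_effort_read(ours, b"x" * 100000)
    print("write error:", err)
    print("message recovered from peer:", drained.decode().strip())
    ours.close()
```

The point of the sketch is only the ordering: the write error is recorded, the input side is drained, and the decision about how to exit is made afterwards.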

As for the arbitrary io_timeout that gets set, yes, it is debatable if it is
a good idea.  While not strictly necessary, I kind of like having it just as
a final sanity check for the error-exit code.  I did make the arbitrary
timeout larger, though, since it should not ever really be needed.  I'll
also review the error-handling a bit more, and possibly reconsider this
further.

..wayne..
-- 
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html

Re: exit status 13 in version 3.1

2009-11-12 Thread Matt McCutchen
On Sun, 2009-11-08 at 01:57 -0500, Matt McCutchen wrote:
 I tested commit 2907af472d1f33b3c422cb9f601c121b242aa9c7 and, again, the
 output is different but the problem is not fixed:
 
 $ rsync-dev big-file small-fs/
 rsync: connection unexpectedly closed (146 bytes received so far) [sender]
 
 $ rsync-dev --msgs2stderr big-file small-fs/
 rsync: write failed on /PATH/TO/small-fs/big-file: No space left on device (28)
 rsync: connection unexpectedly closed (36 bytes received so far) [sender]
 rsync error: error in file IO (code 11) at io.c(181) [receiver=3.1.0dev]

I tested commit cece2e3f5e335b8d1bd0862dbc9edbf2d5a4f5dd and the problem
is finally fixed:

$ rsync-dev big-file small-fs/
rsync: write failed on /PATH/TO/small-fs/big-file: No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(334) [receiver=3.1.0dev]

The output is not as clean as Wayne may have intended for remote pushes
and pulls, but at least the critical information is still there:

$ rsync-dev localhost:PATH/TO/big-file small-fs/
rsync: write failed on /PATH/TO/small-fs/big-file: No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(334) [receiver=3.1.0dev]
rsync: connection unexpectedly closed (35 bytes received so far) [generator]

$ rsync-dev big-file localhost:PATH/TO/small-fs/
rsync: write failed on /PATH/TO/small-fs/big-file: No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(334) [receiver=3.1.0dev]
rsync: [sender] write error: Broken pipe (32)

-- 
Matt



Re: exit status 13 in version 3.1

2009-11-12 Thread Matt McCutchen
On Sat, 2009-11-07 at 09:38 -0800, Wayne Davison wrote:
 Yeah, that's the long-standing issue where a fatal error on the server
 side can cause the client side to get a socket error trying to write
 to the socket before it has a chance to read the error(s) from the
 socket.  The latest git archive finally has a fix for this.

It looks like the implementation has the receiver hang around for a
hard-coded 10 seconds, accepting data from the sender and discarding it.
That's a hack: I don't like to have the sender dependent upon this
special cooperation from the receiver in the event of abnormal
termination.  It seems to me that when the sender hits a write error, it
could just read messages on a best-effort basis before exiting, as in
the old IO code.  Is this approach unworkable for some reason?

-- 
Matt



Re: exit status 13 in version 3.1

2009-11-12 Thread Wayne Davison
On Thu, Nov 12, 2009 at 5:47 PM, Matt McCutchen m...@mattmccutchen.net wrote:

 It looks like the implementation has the receiver hang around for a
 hard-coded 10 seconds, accepting data from the sender and discarding it.


No, it sets a timeout of 10 seconds (i.e. 10 seconds of inactivity), which
in the new protocol should never be reached because the "we're exiting with
an error" message gets everyone to die in unison.  The necessity of
discarding data is there due to the pipelining nature of rsync, particularly
if the error is coming from the receiver.
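
The difference between an inactivity timeout and a fixed wait can be shown with a small sketch. This is a Python illustration under stated assumptions, not rsync's implementation: the function name and return values are hypothetical, and only the 10-second default comes from the discussion above. Each call to select() starts a fresh timer, so the limit applies to the gap between reads rather than to the total time spent discarding.

```python
import select
import socket


def discard_until_idle(sock, idle_timeout=10.0, chunk=4096):
    """Read and throw away data until EOF, an error, or a quiet period.

    Hypothetical helper for illustration only.
    Returns (bytes_discarded, reason), where reason is "eof", "idle",
    or "error".
    """
    discarded = 0
    while True:
        # The timeout restarts on every iteration, so it only fires after
        # idle_timeout seconds with nothing at all to read.
        readable, _, _ = select.select([sock], [], [], idle_timeout)
        if not readable:
            return discarded, "idle"
        try:
            data = sock.recv(chunk)
        except OSError:
            return discarded, "error"
        if not data:
            return discarded, "eof"
        discarded += len(data)


if __name__ == "__main__":
    ours, peer = socket.socketpair()
    peer.sendall(b"x" * 1000)   # pipelined data that nobody wants any more
    peer.close()                # the peer exits once it has seen the error
    print(discard_until_idle(ours, idle_timeout=0.5))
    ours.close()
```

In the normal case the loop ends at EOF as soon as the other side exits; the timeout only matters if the peer stays connected but silent.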

..wayne..

Re: exit status 13 in version 3.1

2009-11-12 Thread Matt McCutchen
On Thu, 2009-11-12 at 21:47 -0800, Wayne Davison wrote:
 On Thu, Nov 12, 2009 at 5:47 PM, Matt McCutchen m...@mattmccutchen.net wrote:
  It looks like the implementation has the receiver hang around for a
  hard-coded 10 seconds, accepting data from the sender and discarding it.
 
 No, it sets a timeout of 10 seconds (i.e. 10 seconds of inactivity),

You're right, my mistake.

 which in the new protocol should never be reached because the "we're
 exiting with an error" message gets everyone to die in unison.

Unless the network is slow.  IMO, hard-coding values like this should be
avoided when an easy alternative exists.

 The necessity of discarding data is there due to the pipelining nature
 of rsync, particularly if the error is coming from the receiver.

I understand that the data discarding serves to avoid giving the sender
a write error so that it survives to read the message explaining the
error exit.  My point is that it's a clunky way to achieve the goal, and
it would be simpler for the sender to just keep reading after a write
error.

-- 
Matt



Re: exit status 13 in version 3.1

2009-11-07 Thread Wayne Davison
On Wed, Nov 4, 2009 at 8:05 PM, Matt McCutchen m...@mattmccutchen.net wrote:

 With commit 84c11e85a4c4a12ecacba24afe9617222e4361e6, I get different
 output, but still not the desired No space left on device:


Yeah, that's the long-standing issue where a fatal error on the server side
can cause the client side to get a socket error trying to write to the
socket before it has a chance to read the error(s) from the socket.  The
latest git archive finally has a fix for this.  It should make the exit
messages much cleaner for an abnormally-exiting protocol 31 transfer, both
receiving the final error from the server and exiting with the proper error
code.  When an older client is talking to a new server, the error should
also make it to the client, but the client will exit with the (sadly)
traditional error about an unexpectedly closed connection.

..wayne..

Re: exit status 13 in version 3.1

2009-11-07 Thread Matt McCutchen
On Sat, 2009-11-07 at 09:38 -0800, Wayne Davison wrote:
 On Wed, Nov 4, 2009 at 8:05 PM, Matt McCutchen m...@mattmccutchen.net wrote:
 
  With commit 84c11e85a4c4a12ecacba24afe9617222e4361e6, I get different
  output, but still not the desired No space left on device:
 
 Yeah, that's the long-standing issue where a fatal error on the server
 side can cause the client side to get a socket error trying to write
 to the socket before it has a chance to read the error(s) from the
 socket.  The latest git archive finally has a fix for this.

I tested commit 2907af472d1f33b3c422cb9f601c121b242aa9c7 and, again, the
output is different but the problem is not fixed:

$ rsync-dev big-file small-fs/
rsync: connection unexpectedly closed (146 bytes received so far) [sender]

$ rsync-dev --msgs2stderr big-file small-fs/
rsync: write failed on /PATH/TO/small-fs/big-file: No space left on device (28)
rsync: connection unexpectedly closed (36 bytes received so far) [sender]
rsync error: error in file IO (code 11) at io.c(181) [receiver=3.1.0dev]

-- 
Matt



Re: exit status 13 in version 3.1

2009-11-04 Thread Wayne Davison
On Tue, Nov 3, 2009 at 9:25 AM, Matt McCutchen m...@mattmccutchen.net wrote:

 rsync error: errors with program diagnostics (code 13) at log.c(340) [sender=3.1.0dev]


This means that you didn't update recently.  Sadly, it appears my reply
mentioning that I fixed the problem only went to Carlos, and missed the
list.  Here it is:

Wayne said:

 That is likely to have been caused by the code going into --msgs2stderr
 mode when it gets a socket error, but rwrite() couldn't handle some of the
 more esoteric error codes (because they don't normally make it that far
 without being transformed into a simpler code).  I've checked in a fix for
 this.


If you would, please re-check your disk-overflow test case with the latest
code (and --msgs2stderr turned off) and see if that gets the error to you.

..wayne..

Re: exit status 13 in version 3.1

2009-11-03 Thread Matt McCutchen
On Thu, 2009-10-29 at 13:20 -0200, Carlos Carvalho wrote:
 Got this in the log:
 
 rsync error: errors with program diagnostics (code 13) at log.c(340) [generator=3.1.0dev]

I got this too on a big local run backing up my system to an external
disk using rsnapshot.

/etc/rsnapshot-rsync -aAXxx --del --numeric-ids --relative \
--delete-excluded --remote-option=--munge-links --filter=P_/home/ \
--filter=P_/boot/ --exclude=/home/*/.gvfs \
--link-dest=/media/crypt-backup/snapshots-mattlaptop2/occasional.0/ml2/ \
/home /media/crypt-backup/snapshots-mattlaptop2/.sync/ml2/ 
rsync error: errors with program diagnostics (code 13) at log.c(340) [sender=3.1.0dev]
rsync error: received SIGUSR1 (code 19) at log.c(340) [generator=3.1.0dev]

(The /etc/rsnapshot-rsync wrapper script is just prepending another
--link-dest option to the command line.)

This probably has to do with Wayne's recent I/O changes.  When I have
some time tonight, I'll see if I can reproduce the problem and capture
some more information that might help Wayne figure it out.

-- 
Matt



Re: exit status 13 in version 3.1

2009-11-03 Thread Matt McCutchen
On Tue, 2009-11-03 at 12:25 -0500, Matt McCutchen wrote:
 On Thu, 2009-10-29 at 13:20 -0200, Carlos Carvalho wrote:
  Got this in the log:
  
  rsync error: errors with program diagnostics (code 13) at log.c(340) [generator=3.1.0dev]
 
 I got this too on a big local run backing up my system to an external
 disk using rsnapshot.

It looks like I was just out of disk space, but the "No space left on
device" message was not making it through the pipeline.  I reproduced
this with a single file and a small reiserfs image:

$ rsync-dev big-file small-fs/
rsync error: errors with program diagnostics (code 13) at log.c(340) [sender=3.1.0dev]

$ rsync-dev --msgs2stderr big-file small-fs/
rsync: write failed on /PATH-TO/small-fs/big-file: No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(334) [receiver=3.1.0dev]
rsync error: errors with program diagnostics (code 13) at log.c(340) [sender=3.1.0dev]
rsync error: received SIGUSR1 (code 19) at main.c(1387) [generator=3.1.0dev]

Since I'm doing a local run, I can probably use --msgs2stderr for now.
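
When comparing runs like the ones above, it can help to pull the exit code, source location, and reporting process out of each "rsync error:" line. The following Python sketch does that for the line format seen in this thread. The helper names are hypothetical, the regular expression is inferred only from the sample output quoted here, and it assumes each message sits on a single line (that is, any wrapping added by a mail client has been undone).

```python
import re

# Matches lines such as:
#   rsync error: error in file IO (code 11) at receiver.c(334) [receiver=3.1.0dev]
RSYNC_ERROR_RE = re.compile(
    r"^rsync error: (?P<text>.+) \(code (?P<code>\d+)\) "
    r"at (?P<file>[^\s(]+)\((?P<line>\d+)\) "
    r"\[(?P<role>[^=\]]+)=(?P<version>[^\]]+)\]\s*$"
)


def parse_rsync_error(line):
    """Return a dict describing one 'rsync error:' line, or None."""
    match = RSYNC_ERROR_RE.match(line.strip())
    if match is None:
        return None
    info = match.groupdict()
    info["code"] = int(info["code"])
    info["line"] = int(info["line"])
    return info


def errors_by_role(output):
    """Group parsed error lines by the process that reported them."""
    grouped = {}
    for raw in output.splitlines():
        info = parse_rsync_error(raw)
        if info is not None:
            grouped.setdefault(info["role"], []).append(info)
    return grouped


if __name__ == "__main__":
    sample = (
        "rsync: write failed on /PATH-TO/small-fs/big-file: "
        "No space left on device (28)\n"
        "rsync error: error in file IO (code 11) at receiver.c(334) "
        "[receiver=3.1.0dev]\n"
        "rsync error: errors with program diagnostics (code 13) at "
        "log.c(340) [sender=3.1.0dev]\n"
        "rsync error: received SIGUSR1 (code 19) at main.c(1387) "
        "[generator=3.1.0dev]\n"
    )
    for role, errors in errors_by_role(sample).items():
        for err in errors:
            print(role, err["code"], err["file"], err["line"], err["text"])
```

Grouping by role makes it easier to see which process reported the underlying failure (here the receiver, with code 11) and which ones only reported that they were shut down as a consequence.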

-- 
Matt



Re: exit status 13 in version 3.1

2009-10-29 Thread Carlos Carvalho
Carlos Carvalho (car...@fisica.ufpr.br) wrote on 29 October 2009 13:20:
 Got this in the log:
 
 rsync error: errors with program diagnostics (code 13) at log.c(340) [generator=3.1.0dev]

Another event:

rsync: read error: Connection reset by peer (104)rsync error: errors with program diagnostics (code 13) at log.c(340) [generator=3.1.0dev]
rsync error: received SIGUSR1 (code 19) at main.c(1384) [receiver=3.1.0dev]