Re: exit status 13 in version 3.1
On Thu, Nov 12, 2009 at 10:00 PM, Matt McCutchen m...@mattmccutchen.net wrote:

> My point is that it's a clunky way to achieve the goal, and it would be simpler for the sender to just keep reading after a write error.

Yeah, that's a good idiom, and the latest code wasn't doing a good enough job of that when the socket closed. One new issue turned out to be that the client started getting a read error (where the old code would get a write error) because the buffer-filling code got an EOF on the socket (even though the perform_io() call was ostensibly there to try to write some data). There was also a bug where an EOF error did not try to empty out any message data currently in the buffer.

I've committed some improvements for this, so the error reporting should be better when talking to older versions. For instance, the input code now just notes an input EOF unless it is there to read some data and the data is not present. It also does a better job of draining any pending error messages from the input buffer (and reading any final messages) when a write error occurs.

As for the arbitrary io_timeout that gets set, yes, it is debatable whether it is a good idea. While not strictly necessary, I kind of like having it just as a final sanity check for the error-exit code. I did make the arbitrary timeout larger, though, since it should not ever really be needed. I'll also review the error-handling a bit more, and possibly reconsider this further.

..wayne..

--
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
Re: exit status 13 in version 3.1
On Sun, 2009-11-08 at 01:57 -0500, Matt McCutchen wrote:

> I tested commit 2907af472d1f33b3c422cb9f601c121b242aa9c7 and, again, the output is different but the problem is not fixed:
>
> $ rsync-dev big-file small-fs/
> rsync: connection unexpectedly closed (146 bytes received so far) [sender]
> $ rsync-dev --msgs2stderr big-file small-fs/
> rsync: write failed on /PATH/TO/small-fs/big-file: No space left on device (28)
> rsync: connection unexpectedly closed (36 bytes received so far) [sender]
> rsync error: error in file IO (code 11) at io.c(181) [receiver=3.1.0dev]

I tested commit cece2e3f5e335b8d1bd0862dbc9edbf2d5a4f5dd and the problem is finally fixed:

$ rsync-dev big-file small-fs/
rsync: write failed on /PATH/TO/small-fs/big-file: No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(334) [receiver=3.1.0dev]

The output is not as clean as Wayne may have intended for remote pushes and pulls, but at least the critical information is still there:

$ rsync-dev localhost:PATH/TO/big-file small-fs/
rsync: write failed on /PATH/TO/small-fs/big-file: No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(334) [receiver=3.1.0dev]
rsync: connection unexpectedly closed (35 bytes received so far) [generator]
$ rsync-dev big-file localhost:PATH/TO/small-fs/
rsync: write failed on /PATH/TO/small-fs/big-file: No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(334) [receiver=3.1.0dev]
rsync: [sender] write error: Broken pipe (32)

--
Matt
Re: exit status 13 in version 3.1
On Sat, 2009-11-07 at 09:38 -0800, Wayne Davison wrote:

> Yeah, that's the long-standing issue where a fatal error on the server side can cause the client side to get a socket error trying to write to the socket before it has a chance to read the error(s) from the socket. The latest git archive finally has a fix for this.

It looks like the implementation has the receiver hang around for a hard-coded 10 seconds, accepting data from the sender and discarding it. That's a hack: I don't like to have the sender dependent upon this special cooperation from the receiver in the event of abnormal termination. It seems to me that when the sender hits a write error, it could just read messages on a best-effort basis before exiting, as in the old IO code. Is this approach unworkable for some reason?

--
Matt
Re: exit status 13 in version 3.1
On Thu, Nov 12, 2009 at 5:47 PM, Matt McCutchen m...@mattmccutchen.net wrote:

> It looks like the implementation has the receiver hang around for a hard-coded 10 seconds, accepting data from the sender and discarding it.

No, it sets a timeout of 10 seconds (i.e. 10 seconds of inactivity), which in the new protocol should never be reached because the "we're exiting with an error" message gets everyone to die in unison. The necessity of discarding data is there due to the pipelining nature of rsync, particularly if the error is coming from the receiver.

..wayne..
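The difference between a fixed wait and an inactivity timeout can be sketched as follows. This is an illustration in Python, not rsync's C code: `drain_until_idle` and `idle_timeout` are hypothetical names, and a local socket pair stands in for the rsync connection. The loop discards incoming data and gives up only after `idle_timeout` seconds pass with nothing arriving, so a peer that closes promptly ends the loop at once.

```python
import select
import socket


def drain_until_idle(sock, idle_timeout):
    """Read and discard data until EOF or until idle_timeout seconds pass
    with no data arriving. Returns (bytes_discarded, reason)."""
    discarded = 0
    while True:
        # select() waits at most idle_timeout seconds for new data, so the
        # clock restarts every time something arrives.
        readable, _, _ = select.select([sock], [], [], idle_timeout)
        if not readable:
            return discarded, "idle-timeout"
        chunk = sock.recv(65536)
        if not chunk:
            # An empty read means the peer closed its end.
            return discarded, "eof"
        discarded += len(chunk)


# Peer sends some data and closes: the drain ends on EOF without waiting.
a, b = socket.socketpair()
b.sendall(b"x" * 1000)
b.close()
eof_result = drain_until_idle(a, 0.2)
a.close()

# Peer sends data but stays open: the drain ends after the idle period.
c, d = socket.socketpair()
d.sendall(b"abc")
idle_result = drain_until_idle(c, 0.05)
c.close()
d.close()
```

In the first case the timeout is never reached, which matches the expectation that it acts only as a final sanity check.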
Re: exit status 13 in version 3.1
On Thu, 2009-11-12 at 21:47 -0800, Wayne Davison wrote:

> On Thu, Nov 12, 2009 at 5:47 PM, Matt McCutchen m...@mattmccutchen.net wrote:
> > It looks like the implementation has the receiver hang around for a hard-coded 10 seconds, accepting data from the sender and discarding it.
>
> No, it sets a timeout of 10 seconds (i.e. 10 seconds of inactivity),

You're right, my mistake.

> which in the new protocol should never be reached because the "we're exiting with an error" message gets everyone to die in unison.

Unless the network is slow. IMO, hard-coding values like this should be avoided when an easy alternative exists.

> The necessity of discarding data is there due to the pipelining nature of rsync, particularly if the error is coming from the receiver.

I understand that the data discarding serves to avoid giving the sender a write error so that it survives to read the message explaining the error exit. My point is that it's a clunky way to achieve the goal, and it would be simpler for the sender to just keep reading after a write error.

--
Matt
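The "keep reading after a write error" idea can be sketched roughly as below. This is a minimal Python illustration under stated assumptions, not rsync's implementation: `send_with_best_effort_drain` is a hypothetical helper, and a local Unix socket pair stands in for the connection. With a real TCP connection, data still queued by the peer may be lost when the connection is reset, so this shows only the general idiom.

```python
import socket


def send_with_best_effort_drain(sock, payload):
    """Try to send payload. On a write error, read whatever the peer left
    behind and return (write_error, pending_bytes); return None on success."""
    try:
        sock.sendall(payload)
        return None
    except OSError as write_err:
        pending = bytearray()
        # Bound the wait so a silent peer cannot block the error exit.
        sock.settimeout(1.0)
        while True:
            try:
                chunk = sock.recv(4096)
            except OSError:
                # Timeout or reset: stop, keeping what was already read.
                break
            if not chunk:
                break  # EOF: nothing more will arrive
            pending += chunk
        return write_err, bytes(pending)


# Stand-in for the failing side: it reports its error, then goes away.
sender, receiver = socket.socketpair()
receiver.sendall(b"write failed: No space left on device (28)\n")
receiver.close()

# The sender's write fails, but the explanation is still readable.
result = send_with_best_effort_drain(sender, b"file data")
sender.close()
```

Here the write fails because the peer is gone, yet the message it sent before closing is recovered from the sender's receive queue.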
Re: exit status 13 in version 3.1
On Wed, Nov 4, 2009 at 8:05 PM, Matt McCutchen m...@mattmccutchen.net wrote:

> With commit 84c11e85a4c4a12ecacba24afe9617222e4361e6, I get different output, but still not the desired "No space left on device":

Yeah, that's the long-standing issue where a fatal error on the server side can cause the client side to get a socket error trying to write to the socket before it has a chance to read the error(s) from the socket. The latest git archive finally has a fix for this. It should make the exit messages much cleaner for an abnormally-exiting protocol 31 transfer, both receiving the final error from the server and exiting with the proper error code. When an older client is talking to a new server, the error should also make it to the client, but the client will exit with the (sadly) traditional error about an unexpectedly closed connection.

..wayne..
Re: exit status 13 in version 3.1
On Sat, 2009-11-07 at 09:38 -0800, Wayne Davison wrote:

> On Wed, Nov 4, 2009 at 8:05 PM, Matt McCutchen m...@mattmccutchen.net wrote:
> > With commit 84c11e85a4c4a12ecacba24afe9617222e4361e6, I get different output, but still not the desired "No space left on device":
>
> Yeah, that's the long-standing issue where a fatal error on the server side can cause the client side to get a socket error trying to write to the socket before it has a chance to read the error(s) from the socket. The latest git archive finally has a fix for this.

I tested commit 2907af472d1f33b3c422cb9f601c121b242aa9c7 and, again, the output is different but the problem is not fixed:

$ rsync-dev big-file small-fs/
rsync: connection unexpectedly closed (146 bytes received so far) [sender]
$ rsync-dev --msgs2stderr big-file small-fs/
rsync: write failed on /PATH/TO/small-fs/big-file: No space left on device (28)
rsync: connection unexpectedly closed (36 bytes received so far) [sender]
rsync error: error in file IO (code 11) at io.c(181) [receiver=3.1.0dev]

--
Matt
Re: exit status 13 in version 3.1
On Tue, Nov 3, 2009 at 9:25 AM, Matt McCutchen m...@mattmccutchen.net wrote:

> rsync error: errors with program diagnostics (code 13) at log.c(340) [sender=3.1.0dev]

This means that you didn't update recently. Sadly, it appears my reply mentioning that I fixed the problem only went to Carlos, and missed the list. Here it is:

Wayne said:

> That is likely to have been caused by the code going into --msgs2stderr mode when it gets a socket error, but rwrite() couldn't handle some of the more esoteric error codes (because they don't normally make it that far without being transformed into a simpler code). I've checked in a fix for this.

If you would, please re-check your disk-overflow test case with the latest code (and --msgs2stderr turned off) and see if that gets the error to you.

..wayne..
Re: exit status 13 in version 3.1
On Thu, 2009-10-29 at 13:20 -0200, Carlos Carvalho wrote:

> Got this in the log:
>
> rsync error: errors with program diagnostics (code 13) at log.c(340) [generator=3.1.0dev]

I got this too on a big local run backing up my system to an external disk using rsnapshot.

/etc/rsnapshot-rsync -aAXxx --del --numeric-ids --relative \
    --delete-excluded --remote-option=--munge-links --filter=P_/home/ \
    --filter=P_/boot/ --exclude=/home/*/.gvfs \
    --link-dest=/media/crypt-backup/snapshots-mattlaptop2/occasional.0/ml2/ \
    /home /media/crypt-backup/snapshots-mattlaptop2/.sync/ml2/
rsync error: errors with program diagnostics (code 13) at log.c(340) [sender=3.1.0dev]
rsync error: received SIGUSR1 (code 19) at log.c(340) [generator=3.1.0dev]

(The /etc/rsnapshot-rsync wrapper script is just prepending another --link-dest option to the command line.)

This probably has to do with Wayne's recent I/O changes. When I have some time tonight, I'll see if I can reproduce the problem and capture some more information that might help Wayne figure it out.

--
Matt
Re: exit status 13 in version 3.1
On Tue, 2009-11-03 at 12:25 -0500, Matt McCutchen wrote:

> On Thu, 2009-10-29 at 13:20 -0200, Carlos Carvalho wrote:
> > Got this in the log:
> >
> > rsync error: errors with program diagnostics (code 13) at log.c(340) [generator=3.1.0dev]
>
> I got this too on a big local run backing up my system to an external disk using rsnapshot.

It looks like I was just out of disk space, but the "No space left on device" message was not making it through the pipeline. I reproduced this with a single file and a small reiserfs image:

$ rsync-dev big-file small-fs/
rsync error: errors with program diagnostics (code 13) at log.c(340) [sender=3.1.0dev]
$ rsync-dev --msgs2stderr big-file small-fs/
rsync: write failed on /PATH-TO/small-fs/big-file: No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(334) [receiver=3.1.0dev]
rsync error: errors with program diagnostics (code 13) at log.c(340) [sender=3.1.0dev]
rsync error: received SIGUSR1 (code 19) at main.c(1387) [generator=3.1.0dev]

Since I'm doing a local run, I can probably use --msgs2stderr for now.

--
Matt
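When comparing runs like these, it can help to pull the exit code and the reporting process out of each "rsync error:" line. The sketch below is one way to do that in Python. It assumes only the line format visible in the output quoted in this thread; `RSYNC_ERROR` and `parse_rsync_errors` are hypothetical names, not part of rsync.

```python
import re

# Matches lines shaped like:
#   rsync error: <text> (code N) at <file>(<line>) [<role>=<version>]
RSYNC_ERROR = re.compile(
    r"rsync error: (?P<text>.+?) \(code (?P<code>\d+)\) "
    r"at (?P<file>[\w.]+)\((?P<line>\d+)\) "
    r"\[(?P<role>\w+)=\s*(?P<version>[^\]]+)\]"
)


def parse_rsync_errors(output):
    """Return (role, exit_code, text, location) for each 'rsync error:' line."""
    return [
        (m["role"], int(m["code"]), m["text"], f'{m["file"]}:{m["line"]}')
        for m in RSYNC_ERROR.finditer(output)
    ]


# Output copied from the --msgs2stderr run quoted above.
sample = (
    "rsync: write failed on /PATH-TO/small-fs/big-file: "
    "No space left on device (28)\n"
    "rsync error: error in file IO (code 11) at receiver.c(334) "
    "[receiver=3.1.0dev]\n"
    "rsync error: errors with program diagnostics (code 13) at log.c(340) "
    "[sender=3.1.0dev]\n"
    "rsync error: received SIGUSR1 (code 19) at main.c(1387) "
    "[generator=3.1.0dev]\n"
)
errors = parse_rsync_errors(sample)
```

For the sample, the receiver's code 11 carries the real cause, while the sender's code 13 and the generator's code 19 are follow-on failures.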
Re: exit status 13 in version 3.1
Carlos Carvalho (car...@fisica.ufpr.br) wrote on 29 October 2009 13:20:

> Got this in the log:
>
> rsync error: errors with program diagnostics (code 13) at log.c(340) [generator=3.1.0dev]

Another event:

rsync: read error: Connection reset by peer (104)
rsync error: errors with program diagnostics (code 13) at log.c(340) [generator=3.1.0dev]
rsync error: received SIGUSR1 (code 19) at main.c(1384) [receiver=3.1.0dev]