I think I've tracked down another hang in rsync 2.4.6. This one appears to be caused by the sender process finishing all its work and going into a pid-reading loop before it has read all of the error stream coming in from the generator process -- if this data is large enough, the generator hangs waiting for the sender to read it.

A simple way to reproduce the hang is to rsync a local copy of the Linux source (e.g. 2.4.3) to another local dir with options -av.

The fix appears to be to put the read of the final goodbye message before the sender begins its pid-wait loop. This allows the error stream to flush, and things don't hang. Here's what I changed:

Index: main.c
--- main.c      29 May 2001 14:37:54 -0000      1.127
+++ main.c      5 Jun 2001 09:30:44 -0000
@@ -504,15 +504,15 @@
                rprintf(FINFO,"file list sent\n");
        send_files(flist,f_out,f_in);
+       if (remote_version >= 24) {
+               /* final goodbye message */
+               read_int(f_in);
+       }
        if (pid != -1) {
                if (verbose > 3)
                        rprintf(FINFO,"client_run waiting on %d\n",pid);
                io_flush();
                wait_process(pid, &status);
-       }
-       if (remote_version >= 24) {
-               /* final goodbye message */
-               read_int(f_in);
        }
        report(-1);
        exit_cleanup(status);

I'd appreciate it if someone more familiar with this code would take a look at this to see if there might be any unforeseen problems with this change.

I've also refined my previous anti-hang patch some more, since I noticed that in a really rare circumstance it could cause the buffered redo bytes to get read in the wrong order. For this to happen, the input buffer had to be empty, some error output had to arrive at the same time as some redo bytes, and we had to be in the read function reading the raw redo-channel fd. When all that came together, bytes would get written down the pipe to the sender, causing redo bytes to get buffered, and the following read of the redo fd would read some bytes in the wrong order.
While I was at it I also made my input-buffer code more efficient in certain boundary cases (it might do too much memcpy-ing if the buffer was nearly full and the read and write calls started to alternate).

My latest anti-hang changes (including the change above) can be grabbed from here:

http://www.clari.net/~wayne/rsync-nohang.patch

This is relative to the CVS source, and replaces my previous patches.

..wayne..
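
For anyone who wants to see the general shape of the first hang outside of rsync, here is a minimal sketch of the ordering issue. It is not rsync code: drain_then_wait() is a hypothetical helper, the child merely stands in for a process that still has output queued, and a plain pipe stands in for the error stream. The only point it illustrates is that the parent reads everything before it waits on the pid.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of rsync): fork a child that writes
 * child_bytes to a pipe, drain the pipe to EOF, and only then reap the
 * child. Returns the number of bytes read, or -1 on error. */
long drain_then_wait(size_t child_bytes, int *exit_code)
{
    int fds[2];
    assert(exit_code != NULL);
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: stands in for the process with pending output. */
        char chunk[4096];
        size_t left = child_bytes;
        close(fds[0]);
        memset(chunk, 'e', sizeof chunk);
        while (left > 0) {
            size_t want = left < sizeof chunk ? left : sizeof chunk;
            ssize_t n = write(fds[1], chunk, want);
            if (n <= 0)
                _exit(1);
            left -= (size_t)n;
        }
        close(fds[1]);
        _exit(0);
    }

    /* Parent: read first. If waitpid() came before this loop, the child
     * would block in write() once it had written more than the pipe can
     * hold, and the parent would wait on it forever. */
    close(fds[1]);
    long total = 0;
    char buf[4096];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        total += n;
    close(fds[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    *exit_code = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    return n < 0 ? -1 : total;
}
```

Calling drain_then_wait(1 << 20, &code) should return once the full megabyte has been read and the child has exited. Swapping the read loop and the waitpid() call is the analogue of the old ordering in main.c, and with that much data I'd expect it to hang.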