On Tue, Sep 03, 2002 at 05:00:02PM -0700, [EMAIL PROTECTED] wrote: > > JW, > > I pushed everything to a LINUX box and did the diff again this time with the > -bur option. > It does look significantly different. I think I've got below what you are > looking for, so I am > resending it to [EMAIL PROTECTED] > > I'm including two additional people as they have asked for the patch information > as well. > The diff was performed on main.c (2.5.5 version from rsync.samba.org). I hope > everyone > knows that I'm my purpose in providing this information, is so that everyone can > critique > my code (to make it better) and we can eliminate this problem for everyone. > > Dave
Thank you Dave. I've been abusing/educating him a bit offline so be nice everybody. He has not only diagnosed the bug independantly but has come up with a direction to go toward a permanent solution. Aside from my dislike of a fixed size array my comments are below. > ------------------------------------------------------------- > > --- main.c.orig Tue Sep 3 16:38:23 2002 > +++ main.c Tue Sep 3 16:41:08 2002 > @@ -27,6 +27,11 @@ > > extern int verbose; > > +struct pid_status { > + pid_t pid; > + int status; > + } pid_stat_table[10]; The use of a literal here is bad news. Should be a #define CHILD_TABLE_SIZE or whatever. Then macro can be used in the loops. Question. Is 10 a good size? Should it be more? I've not dug into the child spawning of rsync. > + > static void show_malloc_stats(void); > > /**************************************************************************** > @@ -34,9 +39,27 @@ > ****************************************************************************/ > void wait_process(pid_t pid, int *status) > { > - while (waitpid(pid, status, WNOHANG) == 0) { > + pid_t waited_pid; > + int cnt; > + > + while ( 1 ) { > + waited_pid = waitpid(pid, status, WNOHANG); > + if ( waited_pid == 0 ) { > msleep(20); > io_flush(); > + } else break; > + } > + if (( waited_pid == -1 ) && ( errno == ECHILD )) { > + /* status of requested child no longer available. Check */ > + /* to see if it was processed by the sigchld_handler. */ > + cnt = 0; > + while ( cnt < 10 ) { > + if ( pid == pid_stat_table[cnt].pid ) { > + *status = pid_stat_table[cnt].status; If we do 'pid_stat_table[cnt].pid = 0' here we'll be able to reuse the slots. (if necessary) > + break; > + } > + cnt++; > + } > } > > /* TODO: If the child exited on a signal, then log an > @@ -792,7 +815,22 @@ > > static RETSIGTYPE sigchld_handler(int UNUSED(val)) { > #ifdef WNOHANG > - while (waitpid(-1, NULL, WNOHANG) > 0) ; > + int cnt = 0; > + pid_t pid = 0; > + int status = 0; > + while ( 1 ) { > + pid = waitpid(-1, &status, WNOHANG); > + cnt = 0; > + while ( cnt < 10 ) { > + if (pid_stat_table[cnt].pid == 0 ) { > + pid_stat_table[cnt].pid = pid; > + pid_stat_table[cnt].status = status; > + break; > + } > + cnt++; > + } > + if ( pid < 1 ) break; If we run off the end of the array (cnt == 10) we should kick out a warning or we should resize it. > + }; > #endif > } -- ________________________________________________________________ J.W. Schultz Pegasystems Technologies email address: [EMAIL PROTECTED] Remember Cernan and Schmitt -- To unsubscribe or change options: http://lists.samba.org/mailman/listinfo/rsync Before posting, read: http://www.tuxedo.org/~esr/faqs/smart-questions.html