A NOTE has been added to this issue.
======================================================================
http://www.dbmail.org/mantis/view.php?id=363
======================================================================
Reported By:                ryo
Assigned To:
======================================================================
Project:                    DBMail
Issue ID:                   363
Category:                   General
Reproducibility:            sometimes
Severity:                   minor
Priority:                   normal
Status:                     new
target:
======================================================================
Date Submitted:             12-Jun-06 09:22 CEST
Last Modified:              13-Jun-06 07:05 CEST
======================================================================
Summary:                    Sometimes the count of grandchild processes does not decrease.
Description:
I'm sorry, my English is poor.
After many accesses to dbmail-imapd, the count of grandchild processes sometimes never decreases back to NCHILDREN.

Using strace, I found that the child process of dbmail-imapd was blocked in waitpid(), as follows.

[EMAIL PROTECTED] ~]# strace -p 21208
Process 21208 attached - interrupt to quit
waitpid(3422,

When I sent SIGTERM to the grandchild process (pid 3422 in the example above) with the kill command, the child process resumed and the count of grandchild processes decreased.

I think the cause is that waitpid() is called without the WNOHANG option in pool.c:reap_child().
Is this intentional? Any ideas?
======================================================================
Relationships       ID      Summary
----------------------------------------------------------------------
related to          0000361 IMAP zombies after about a day.
======================================================================

----------------------------------------------------------------------
 aaron - 12-Jun-06 18:13
----------------------------------------------------------------------
For bug http://www.dbmail.org/mantis/view.php?id=361, I removed a trigger of this bug, but it looks like the core issue is reaping the exit status from child processes.

----------------------------------------------------------------------
 kaname - 13-Jun-06 07:05
----------------------------------------------------------------------
I think the options argument of waitpid() should be changed as follows.
Note that processing blocks in waitpid() when kill() fails. kill() almost always succeeds, but it sometimes fails. A pid for which kill() failed will be signalled again eventually, because reap_child() is called again later.
-------------------------------------------------------------
# diff -urN -U 9 pool.c~ pool.c
--- pool.c~     2006-06-09 11:31:11.000000000 +0900
+++ pool.c      2006-06-13 13:47:44.939044486 +0900
@@ -461,19 +461,19 @@
 
 static pid_t reap_child()
 {
 	pid_t chpid=0;
 
 	if ((chpid = get_idle_spare()) < 0)
 		return chpid;
 
 	kill(chpid, SIGTERM);
-	if (waitpid(chpid, NULL, 0) == chpid)
+	if (waitpid(chpid, NULL, WNOHANG|WUNTRACED) == chpid)
 		scoreboard_release(chpid);
 
 	return chpid;
 }
 
 void manage_spare_children()
 {
 	/*
 	 *
---------------------------------------------------------------

Issue History
Date Modified   Username       Field                    Change
======================================================================
12-Jun-06 09:22 ryo            New Issue
12-Jun-06 18:11 aaron          Relationship added       related to 0000361
12-Jun-06 18:13 aaron          Note Added: 0001244
12-Jun-06 18:19 aaron          Relationship added       child of 0000364
13-Jun-06 07:05 kaname         Note Added: 0001246
======================================================================
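
The difference between the blocking and the WNOHANG call discussed in the notes can be seen in isolation with a small sketch. This is not DBMail code: spawn_stubborn_child(), try_reap() and force_reap() are hypothetical helpers invented for illustration, and the child that ignores SIGTERM only stands in for a spare child that does not exit when signalled. With WNOHANG, waitpid() returns 0 while the child is still running, so the caller can come back later instead of hanging. The sketch leaves out WUNTRACED, which would also report stopped children.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from pool.c): fork a child that ignores SIGTERM.
 * The ignore disposition is set before fork() so the child inherits it
 * without a race, then the parent's previous disposition is restored. */
static pid_t spawn_stubborn_child(void)
{
	void (*old)(int) = signal(SIGTERM, SIG_IGN);
	pid_t pid = fork();

	if (pid == 0) {
		for (;;)
			pause();	/* child: sleep until killed */
	}
	signal(SIGTERM, old);
	return pid;		/* parent: child pid, or -1 if fork() failed */
}

/* Ask the child to exit and poll once without blocking.
 * Returns 1 if the child was reaped, 0 if it is still running,
 * -1 if kill() or waitpid() failed. */
static int try_reap(pid_t pid)
{
	pid_t r;

	if (kill(pid, SIGTERM) < 0)
		return -1;
	r = waitpid(pid, NULL, WNOHANG);
	if (r == pid)
		return 1;
	return (r == 0) ? 0 : -1;
}

/* Last resort: SIGKILL cannot be ignored, so a blocking wait is safe here. */
static int force_reap(pid_t pid)
{
	if (kill(pid, SIGKILL) < 0)
		return -1;
	return (waitpid(pid, NULL, 0) == pid) ? 1 : -1;
}
```

Calling try_reap() on a pid returned by spawn_stubborn_child() returns 0 immediately, where a waitpid(pid, NULL, 0) at the same point would block in the way the strace output shows. A caller that gets 0 has to keep the scoreboard slot and retry on a later pass; force_reap() shows one possible fallback and is an assumption of this sketch, not something proposed in the issue.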