A BUGNOTE has been added to this bug.
======================================================================
http://www.dbmail.org/mantis/bug_view_advanced_page.php?bug_id=0000162
======================================================================
Reported By:                xing
Assigned To:                
======================================================================
Project:                    DBMail
Bug ID:                     162
Category:                   POP3 daemon
Reproducibility:            always
Severity:                   major
Priority:                   normal
Status:                     new
======================================================================
Date Submitted:             18-Jan-05 01:44 CET
Last Modified:              31-Jan-05 08:12 CET
======================================================================
Summary:                    dbmail-pop3d zombies galore..
Description: 
I believe this problem started with 2.0.3.

dbmail-pop3d is creating a bunch of dbmail-pop3d zombie processes that must
be killed with kill -9.

I see a lot of the following in my mail log. 

serverchild.c,CreateChild: child_register failed
Jan 17 16:29:16 mail dbmail/pop3d[19630]: serverchild.c,CreateChild: child_register failed

as shown in ps:

19624 ?        Z      0:00 [dbmail-pop3d] <defunct>
19625 ?        Z      0:00 [dbmail-pop3d] <defunct>
19626 ?        Z      0:00 [dbmail-pop3d] <defunct>

I have 144 of these zombies at this very moment even though I just killed
them all and restarted pop3d daemon a minute ago.
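
For context on the <defunct> entries above: a process stays in state Z from the moment it exits until its parent collects its exit status with wait() or waitpid(). The following is a minimal, generic sketch of that mechanism. It is not code from dbmail, and the function names are made up for the example.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork n children that exit immediately.
 * Until the parent waits on them, each one is listed by ps as
 * <defunct> (state Z). Returns the number of children forked. */
static int spawn_short_lived_children(int n)
{
    int forked = 0;

    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == 0)
            _exit(0);      /* child: exit at once */
        if (pid > 0)
            forked++;      /* parent: count the successful fork */
    }
    return forked;
}

/* Hypothetical helper: reap every child of this process, blocking
 * until none are left. Returns the number of children reaped. */
static int reap_all_children(void)
{
    int reaped = 0;

    for (;;) {
        pid_t pid = waitpid(-1, NULL, 0);
        if (pid > 0) {
            reaped++;      /* one zombie entry removed */
            continue;
        }
        if (pid == -1 && errno == EINTR)
            continue;      /* interrupted by a signal, retry */
        break;             /* ECHILD: no children remain */
    }
    return reaped;
}
```

Calling spawn_short_lived_children(3) and then reap_all_children() leaves no zombie entries behind. A zombie has already exited, so signals sent to it have no effect; its entry disappears only when the parent reaps it or the parent itself exits.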

Important Note: Setting trace=5 for pop3d ALLEVIATES the problem! Thus I
cannot provide trace info here. Weird. I have duplicated this many times
on my end before submitting this report.

Here are my relevant dbmail.conf entries:
[DBMAIL]
# Database settings
host=localhost
user=postfix
pass=postfix
db=dbmail
sqlsocket=/tmp/mysql.sock
# trace level for dbmail-maintenance
TRACE_LEVEL=1


[POP]
EFFECTIVE_USER=postfix            # the user that dbmail-pop3d will run as (need to be root to bind to a port<1024)
EFFECTIVE_GROUP=postfix           # the group that dbmail-pop3d will run as
BINDIP=*                          # the ipaddress the dbmail-pop3d server has to bind to, * for all addresses
PORT=110                          # the port number the dbmail-pop3d server has to bind to.
NCHILDREN=5                       # default number of POP3 handlers (each is a process)
MAXCHILDREN=20                    # max. number of POP3 handlers
MAXCONNECTS=10000                 # the maximum number of connections a default child makes
TIMEOUT=31                        # the time (s) before the dbmail-pop3d should shutdown a connection which is being idle.
RESOLVE_IP=no                     # if yes, the pop daemon resolves IP numbers to DNS names in the log
POP_BEFORE_SMTP=no
TRACE_LEVEL=1




======================================================================

----------------------------------------------------------------------
 paul - 18-Jan-05 09:25 CET 
----------------------------------------------------------------------
Xing,

I recently changed the manage_stop_children code to fix bug 
http://www.dbmail.org/mantis/bug_view_advanced_page.php?bug_id=0000158. Could
you please test the current 2.0 cvs code to check if that also helps in
your case?

----------------------------------------------------------------------
 xing - 18-Jan-05 11:47 CET 
----------------------------------------------------------------------
Checked out the CVS branch and still have the exact same problem.

Again the weird thing here is that the bug is completely gone when trace
is set to 5 for the pop daemon in dbmail.conf.

My only theory based on the trace level difference is perhaps the trace=5
produces noticeable "delays" between thread/process forking which allow
the system to work? Without the verbose trace, the server is trying to
spawn way too fast? Just a wild guess.

Extra info:

I can reproduce this bug with trace=1 almost immediately upon pop3d
startup each time. However, sometimes the startup is fine, but after
3-5 minutes all the children get unregistered and the failed
re-registration attempts create the same zombie pool. So the problem is
not only related to startup.

edited on: 18-Jan-05 11:47

----------------------------------------------------------------------
 sersop - 26-Jan-05 11:39 CET 
----------------------------------------------------------------------
The same problem occurs for dbmail-pop3d and dbmail-lmtpd on a high-load system.

Fedora Core 2
Linux  2.6.10 #1 SMP Mon Jan 24 14:01:32 YEKT 2005 i686 i686 i386 GNU/Linux

----------------------------------------------------------------------
 xing - 31-Jan-05 08:12 CET 
----------------------------------------------------------------------
Running the trace=5 workaround has so far eliminated the pop3d errors for
the past week, but today my 2.0.3 dbmail-pop3d servers completely locked
up. They will not accept any new connections even though they are still
running. I suspect this is related to the zombie problem, in how the
server starts and kills child processes.

Bug History
Date Modified    Username       Field                    Change
======================================================================
18-Jan-05 01:44  xing           New Bug
18-Jan-05 09:25  paul           Bugnote Added: 0000539
18-Jan-05 11:42  xing           Bugnote Added: 0000540
18-Jan-05 11:47  xing           Bugnote Edited: 0000540
26-Jan-05 11:39  sersop         Bugnote Added: 0000569
31-Jan-05 08:12  xing           Bugnote Added: 0000572
======================================================================
