The following issue has been REOPENED. 
====================================================================== 
http://www.dbmail.org/mantis/view.php?id=256 
====================================================================== 
Reported By:                idk
Assigned To:                paul
====================================================================== 
Project:                    DBMail
Issue ID:                   256
Category:                   General
Reproducibility:            always
Severity:                   major
Priority:                   normal
Status:                     feedback
====================================================================== 
Date Submitted:             20-Aug-05 23:54 CEST
Last Modified:              23-Aug-05 18:24 CEST
====================================================================== 
Summary:                    Invalid child management after database restart etc.
Description: 
After the mysql service is stopped, the pool manager kills all children
(pool.c,manage_stop_children: General stop requested. Killing children..).
After the mysql service is started again, only MINSPARECHILDREN children
are started, and no more children are started even when they are requested.
====================================================================== 

---------------------------------------------------------------------- 
 idk - 21-Aug-05 00:06  
---------------------------------------------------------------------- 
My suggestions are:

1) call manage_start_children() instead of manage_spare_children()
after the database resumes

2) after the db connection resumes, call alarm(10) to restart the alarm
timer (I have not tested whether this is adequate)

3) fix the LIFO and the infinite loop described above 

---------------------------------------------------------------------- 
 paul - 22-Aug-05 10:15  
---------------------------------------------------------------------- 
I've fixed this problem. There was some faulty logic in
manage_spare_children, the alarm is reset after the database resumes, and
the missing breaks were added. Thanks a lot for working on this. Please
test the latest svn code. 

---------------------------------------------------------------------- 
 idk - 23-Aug-05 18:24  
---------------------------------------------------------------------- 
I tried to add

trace(1, "spare: %d %d", count_children(), count_spare_children());

into manage_spare_children() just before the first loop. I started the
daemon, then made MAX+ connections. This was logged (attached maillog.txt,
I hope):

Aug 23 17:13:21 start
Aug 23 17:13:21 spare: 5 5
Aug 23 17:13:51 spare: 5 5 - 20s, not 10
Aug 23 17:14:00 connect
Aug 23 17:14:01 spare: 5 4
Aug 23 17:14:05 disconnect
Aug 23 17:14:11 spare: 5 5
Aug 23 17:14:21 spare: 5 5
Aug 23 17:14:31 spare: 5 5
Aug 23 17:14:33+ connect 5 times
Aug 23 17:14:41 spare: 5 0 - last trace of this message, so the alarm
stopped between 17:14:41 and 17:14:50
Aug 23 17:14:41 register children 5-19
Aug 23 17:14:41 child_register failed (21st, ok)

no more messages (alarm)

killall

Aug 23 17:23:31 got signal [15]
Aug 23 17:23:31 stop requested
Aug 23 17:23:31 child [19785] unregistered

All three messages appeared 20 times, but ps ax shows many zombies:

19782 ?        S      0:00 /_/dbmail/dbmail-2.0/.libs/lt-dbmail-imapd
19783 ?        S      0:00 /_/dbmail/dbmail-2.0/.libs/lt-dbmail-imapd
19785 ?        Z      0:00 [lt-dbmail-imapd] <defunct>
19787 ?        Z      0:00 [lt-dbmail-imapd] <defunct>
19789 ?        Z      0:00 [lt-dbmail-imapd] <defunct>
19791 ?        Z      0:00 [lt-dbmail-imapd] <defunct>
19793 ?        Z      0:00 [lt-dbmail-imapd] <defunct>
20075 ?        Z      0:00 [lt-dbmail-imapd] <defunct>
20077 ?        Z      0:00 [lt-dbmail-imapd] <defunct>
20079 ?        Z      0:00 [lt-dbmail-imapd] <defunct>
20081 ?        Z      0:00 [lt-dbmail-imapd] <defunct>
20083 ?        Z      0:00 [lt-dbmail-imapd] <defunct>
20085 ?        Z      0:00 [lt-dbmail-imapd] <defunct>
20087 ?        Z      0:00 [lt-dbmail-imapd] <defunct>
20089 ?        Z      0:00 [lt-dbmail-imapd] <defunct>
20091 ?        Z      0:00 [lt-dbmail-imapd] <defunct>
20093 ?        Z      0:00 [lt-dbmail-imapd] <defunct>
20095 ?        Z      0:00 [lt-dbmail-imapd] <defunct>
20097 ?        Z      0:00 [lt-dbmail-imapd] <defunct>
20099 ?        Z      0:00 [lt-dbmail-imapd] <defunct>
20101 ?        Z      0:00 [lt-dbmail-imapd] <defunct>
20103 ?        Z      0:00 [lt-dbmail-imapd] <defunct>

Killing them one by one by pid had no effect. I tried to start a new
instance, but

Aug 23 17:23:39 File [/var/run/dbmail-imapd.pid] exists

So I deleted the pid file

Aug 23 17:25:35 could not bind address to socket

Sorry, I have only this production server (no test servers), so I had to
restart it immediately (because of the zombies). I cannot test this issue
now, maybe later (tonight UTC+0200, or at the weekend). 
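
For context on the <defunct> entries above: a zombie has already exited,
so sending it further signals has no effect. It only disappears once its
parent collects its exit status with wait() or waitpid(). Below is a
minimal sketch of how a pool manager could do that. reap_children and
stop_child are hypothetical names, not functions from pool.c, and the
sketch does not claim to describe what the daemon currently does.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Reap every child that has already exited, without blocking.
 * Returns the number of children reaped. */
int reap_children(void)
{
    int reaped = 0;
    int status;
    pid_t pid;

    for (;;) {
        pid = waitpid(-1, &status, WNOHANG);
        if (pid > 0) {
            reaped++;
            continue;
        }
        if (pid == -1 && errno == EINTR)
            continue;
        /* 0: children exist but none has exited yet.
         * -1 with ECHILD: there are no children at all. */
        break;
    }
    return reaped;
}

/* Ask one child to stop and wait for it, so that it does not linger
 * as <defunct>. Returns 0 on success, -1 on error. */
int stop_child(pid_t pid)
{
    int status;

    if (kill(pid, SIGTERM) == -1 && errno != ESRCH)
        return -1;
    while (waitpid(pid, &status, 0) == -1) {
        if (errno != EINTR)
            return -1;
    }
    return 0;
}
```

In such a design, reap_children() would run from the periodic check or
after SIGCHLD, and stop_child() would be used for each registered child
when a general stop is requested.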

Issue History 
Date Modified   Username       Field                    Change               
====================================================================== 
20-Aug-05 23:54 idk            New Issue                                    
21-Aug-05 00:06 idk            Note Added: 0000848                          
22-Aug-05 10:15 paul           Status                   new => resolved     
22-Aug-05 10:15 paul           Resolution               open => fixed       
22-Aug-05 10:15 paul           Assigned To               => paul            
22-Aug-05 10:15 paul           Note Added: 0000849                          
23-Aug-05 18:24 idk            Status                   resolved => feedback
23-Aug-05 18:24 idk            Resolution               fixed => reopened   
23-Aug-05 18:24 idk            Note Added: 0000872                          
======================================================================
