Re: prefork 'orphaned child' messages
One of my associates suggested this patch. The idea is to reduce $loop_sleep if we had to spawn a lot of children last time through the loop. This way, on restart (since $created_children is initialized to $idle_children) and on sudden concurrency spikes, we loop quickly enough to spawn the children we need; under normal conditions, we don't bother. I also decreased $loop_sleep by half in our own version, so that the longest wait clients will experience during a drastic concurrency increase is just slightly above 15 seconds. Would there be any interest in accepting this if I tested it and put it in my git?

-Jared

On 06/02/2009 08:09 AM, Jared Johnson wrote:
>> Why not set --idle-children at start-up to something higher (or just 0 to disable)?
>
> Setting it higher is a bit bothersome because our QP children use too much memory right now (mainly from using DBIx::Class), so it would be a bit unfortunate to have more memory-consuming kids around that aren't needed.
>
> As for disabling idle-children.. what exactly would the behavior be with disabled idle children? In a situation where concurrency is saturated, it seems that QP spawns $idle_children processes every $loop_wait seconds. Would QP instead spawn a new child _immediately_ when a new connection attempt came in? Or if it still waits $loop_wait seconds, how many children does it spawn when it sees some are needed?
-Jared

--- a/qpsmtpd-prefork
+++ b/qpsmtpd-prefork
@@ -75,7 +75,7 @@ my $max_children = 15;    # max number of child processes to spawn
 my $idle_children = 5;    # number of idle child processes to spawn
 my $maxconnip = 10;
 my $child_lifetime = 100; # number of times a child may be reused
-my $loop_sleep = 30;      # seconds main_loop sleeps before checking children
+my $loop_sleep = 15;      # seconds main_loop sleeps before checking children
 my $re_nice = 5;          # substracted from parents current nice level
 my $d_start = 0;
 my $quiet = 0;
@@ -339,10 +339,11 @@ sub reaper {
 #arg0: void
 #ret0: void
 sub main_loop {
+    my $created_children = $idle_children;
     while (1) {
         # if there is no child death to process, then sleep EXPR seconds
         # or until signal (i.e. child death) is received
-        sleep $loop_sleep unless @children_term;
+        sleep $loop_sleep / ($created_children * 2 + 1) unless @children_term;

         # block CHLD signals to avoid race
         my $sigset = block_signal(SIGCHLD);
@@ -379,9 +380,9 @@ sub main_loop {
         }

         # spawn children
-        for (my $i = scalar(keys %children) ; $i < $chld_pool ; $i++) {
-            new_child();    # add to the child pool
-        }
+        $created_children = $chld_pool - keys %children;
+        $created_children = 0 if $created_children < 0;
+        new_child() for 1 .. $created_children;

         # unblock signals
         unblock_signal($sigset);
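For anyone evaluating the idea: the effect of the new sleep expression is easy to model outside of Perl. This Python sketch mirrors the arithmetic in the patch (the function name is made up for illustration; values follow the patch's halved $loop_sleep of 15):

```python
def adaptive_sleep(loop_sleep, created_children):
    """Seconds the parent sleeps before re-checking the child pool:
    the more children it had to spawn on the last pass, the shorter the nap."""
    return loop_sleep / (created_children * 2 + 1)

# With $loop_sleep halved to 15 as in the patch:
adaptive_sleep(15, 0)   # 15.0 -> quiet system: full interval
adaptive_sleep(15, 5)   # ~1.4 -> restart (created = idle_children = 5)
adaptive_sleep(15, 10)  # ~0.7 -> sudden concurrency spike
```

So under steady load the parent still sleeps the full interval, while a restart or spike drops the interval to around a second until the pool catches up.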
Re: prefork 'orphaned child' messages
Jared Johnson wrote:
>> Why not set --idle-children at start-up to something higher (or just 0 to disable)?
>
> Setting it higher is a bit bothersome because our QP children use too much memory right now (mainly from using DBIx::Class), so it would be a bit unfortunate to have more memory-consuming kids around that aren't needed.

I understand :-)

> As for disabling idle-children.. what exactly would the behavior be with disabled idle children?

Parent would spawn --max-children.

--
Best regards,
Diego d'Ambra
Re: prefork 'orphaned child' messages
> Why not set --idle-children at start-up to something higher (or just 0 to disable)?

Setting it higher is a bit bothersome because our QP children use too much memory right now (mainly from using DBIx::Class), so it would be a bit unfortunate to have more memory-consuming kids around that aren't needed.

As for disabling idle-children.. what exactly would the behavior be with disabled idle children? In a situation where concurrency is saturated, it seems that QP spawns $idle_children processes every $loop_wait seconds. Would QP instead spawn a new child _immediately_ when a new connection attempt came in? Or if it still waits $loop_wait seconds, how many children does it spawn when it sees some are needed?

-Jared
Re: prefork 'orphaned child' messages
Diego d'Ambra wrote:
> [...]
> But you're right, there is also code in the reaper function - remove the array of children terminated, hmmm... I think we should delete that.

This can't be deleted - the parent is using this to track children, clean-up and possible reset of shared memory.

--
Best regards,
Diego d'Ambra
Re: prefork 'orphaned child' messages
Robert Spier wrote:
> Diego d'Ambra wrote:
>> Charlie Brady wrote:
>>> On Fri, 29 May 2009, Diego d'Ambra wrote:
>> [...]
>>>> Latest version of prefork also handles a possible race better, the parent will detect a lock and reset shared memory.
>>
>> Sorry, I've to correct myself, that's not true. Apparently my previously suggested changes didn't make it into trunk.
>
> Please re-post it.

After looking a little more at the code, I did find a reset of shared memory - Radu just implemented my patch in a better way :-)

--
Best regards,
Diego d'Ambra
Re: prefork 'orphaned child' messages
Jared Johnson wrote:
>> Even if you're not near max children the parent will only spawn max idle children, then it sleeps until an event, wake-up and see if more children are needed. Debug log should give some clue, if this is the reason.
>
> Bingo. Looking further into things, it was apparent that a freshly restarted node was grossly underprovisioned WRT child processes. On initial restart, it had 5... 30 seconds later, it had 10. I manually set $loop_sleep=5 (originally hard-coded to 30) and this addressed the issue. Is this drastically smaller loop likely to have any side-effects?

Parent using more CPU time looking after children, but I guess it's minor.

> If not, it might be a good idea to set this upstream.

Why not set --idle-children at start-up to something higher (or just 0 to disable)?

--
Best regards,
Diego d'Ambra
Re: prefork 'orphaned child' messages
> Even if you're not near max children the parent will only spawn max idle children, then it sleeps until an event, wake-up and see if more children are needed. Debug log should give some clue, if this is the reason.

Bingo. Looking further into things, it was apparent that a freshly restarted node was grossly underprovisioned WRT child processes. On initial restart, it had 5... 30 seconds later, it had 10. I manually set $loop_sleep=5 (originally hard-coded to 30) and this addressed the issue. Is this drastically smaller loop likely to have any side-effects? If not, it might be a good idea to set this upstream.

-Jared
Re: prefork 'orphaned child' messages
Diego d'Ambra wrote:
>
> Charlie Brady wrote:
>> On Fri, 29 May 2009, Diego d'Ambra wrote:
> [...]
>>> Latest version of prefork also handles a possible race better, the parent will detect a lock and reset shared memory.
>
> Sorry, I've to correct myself, that's not true. Apparently my previously suggested changes didn't make it into trunk.

Please re-post it.

-R

>> While what you say may be true, I think there is further improvement to be made. Child processes should remove themselves from the shared memory hash, rather than do it by the parent via sigchild.
>
> That is how it currently should work - look at the end of function qpsmtpd_session - the child removes pid from shared memory.
>
> But you're right, there is also code in the reaper function - remove the array of children terminated, hmmm... I think we should delete that.
>
> Also add code so the child detects if the parent has gone away (child should exit, not go back and wait for next connection).
>
> I would do a patch myself, if time permitted, but currently none is available, sorry.
>
>> is there someone who is the 'design authority' on this aspect of the prefork daemon?
>
> I don't know, but last time I suggested a patch Radu Greab made the commit. I posted a PATCH: message to the list and he picked it up.
>
> --
> Best regards,
> Diego d'Ambra
Re: prefork 'orphaned child' messages
Charlie Brady wrote:
> On Fri, 29 May 2009, Diego d'Ambra wrote:
> [...]
>> Latest version of prefork also handles a possible race better, the parent will detect a lock and reset shared memory.

Sorry, I've to correct myself, that's not true. Apparently my previously suggested changes didn't make it into trunk.

> While what you say may be true, I think there is further improvement to be made. Child processes should remove themselves from the shared memory hash, rather than do it by the parent via sigchild.

That is how it currently should work - look at the end of function qpsmtpd_session - the child removes pid from shared memory.

But you're right, there is also code in the reaper function - remove the array of children terminated, hmmm... I think we should delete that.

Also add code so the child detects if the parent has gone away (child should exit, not go back and wait for next connection).

I would do a patch myself, if time permitted, but currently none is available, sorry.

> is there someone who is the 'design authority' on this aspect of the prefork daemon?

I don't know, but last time I suggested a patch Radu Greab made the commit. I posted a PATCH: message to the list and he picked it up.

--
Best regards,
Diego d'Ambra
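Diego's suggestion that a child should detect a vanished parent is usually done with getppid(): once the parent exits, the child is reparented, so getppid() stops matching the pid recorded at fork time. A Python sketch of the idea (the real daemon is Perl, and the helper name here is made up for illustration):

```python
import os

def parent_has_gone_away(ppid_at_fork):
    """True once the parent has exited: the child gets reparented
    (to init or a subreaper), so getppid() no longer returns the
    pid the child recorded at fork time."""
    return os.getppid() != ppid_at_fork

# In the child's accept loop (sketch):
#   if parent_has_gone_away(ppid_at_fork):
#       os._exit(0)   # exit instead of waiting for another connection
```

The Perl equivalent would check getppid() against the value saved right after the fork, before each wait for a new connection.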
Re: prefork 'orphaned child' messages
On Fri, 29 May 2009, Diego d'Ambra wrote:

> Jared Johnson wrote:
>>> What's orphaned is not a child process, but a shared mem hash record for a process which no longer exists. I suspect that code is racy.
>>
>> Hrm, then if we're getting a whole lot of these, does this mean child processes are going away at a high rate? I wouldn't expect such a condition using prefork with relatively low concurrency, maybe this reflects a problem during the end of the connection..
>
> Are you running latest version of prefork? IIRC there was an issue with calculation of needed children, so if your current running children are all terminated at once, it could take some time before the parent notice this and spawns new.
>
> Even if you're not near max children the parent will only spawn max idle children, then it sleeps until an event, wake-up and see if more children are needed. Debug log should give some clue, if this is the reason.
>
> Latest version of prefork also handles a possible race better, the parent will detect a lock and reset shared memory.

While what you say may be true, I think there is further improvement to be made. Child processes should remove themselves from the shared memory hash, rather than do it by the parent via sigchild.

Is there someone who is the 'design authority' on this aspect of the prefork daemon?

---
Charlie
Re: prefork 'orphaned child' messages
Jared Johnson wrote:
>> What's orphaned is not a child process, but a shared mem hash record for a process which no longer exists. I suspect that code is racy.
>
> Hrm, then if we're getting a whole lot of these, does this mean child processes are going away at a high rate? I wouldn't expect such a condition using prefork with relatively low concurrency, maybe this reflects a problem during the end of the connection..

Are you running the latest version of prefork? IIRC there was an issue with the calculation of needed children, so if your currently running children are all terminated at once, it could take some time before the parent notices this and spawns new ones.

Even if you're not near max children, the parent will only spawn max idle children; then it sleeps until an event, wakes up and sees if more children are needed. The debug log should give some clue, if this is the reason.

The latest version of prefork also handles a possible race better: the parent will detect a lock and reset shared memory.

--
Best regards,
Diego d'Ambra
Re: prefork 'orphaned child' messages
>> Inbound and outbound email scanned for spam and viruses by the DoubleCheck Email Manager v5: http://www.doublecheckemail.com
>
> Do we have to be exposed to this spam?

*blush* Since I administer our qp installation that adds those, I suppose I could exempt myself without anybody noticing :)

-Jared
Re: prefork 'orphaned child' messages
On Fri, 29 May 2009, Jared Johnson wrote:

>> What's orphaned is not a child process, but a shared mem hash record for a process which no longer exists. I suspect that code is racy.
>
> Hrm, then if we're getting a whole lot of these, does this mean child processes are going away at a high rate?

Perhaps. Would need further analysis. I think we already established a little while ago that lifetime of pre-forked child processes is cut short by any plugin which does a disconnect. That would lead to premature child death, but wouldn't by itself be a race.

What I suspect is that there is a backlog of sigchild processing while the shared mem is locked (or perhaps missed sigchild). I haven't looked in great detail at the code, but I think the race would be removed by the child itself removing its entry in the shared mem hash. A dying child would be blocked waiting for the lock until the parent had finished its !kill(0, $pid) loop.

> I wouldn't expect such a condition using prefork with relatively low concurrency, maybe this reflects a problem during the end of the connection..
>
> -Jared
>
> Inbound and outbound email scanned for spam and viruses by the DoubleCheck Email Manager v5: http://www.doublecheckemail.com

Do we have to be exposed to this spam?
Re: prefork 'orphaned child' messages
On Fri, 2009-29-05 at 11:47 -0500, Larry Nedry wrote:
> Hey Guy,

Better to CC the list.

> I'd like a copy of your script please.

http://p6.hpfamily.net/myTune

Enjoy. This version is public domain but it'll probably be GPL/Artistic if I find time to improve it. I have a different email address listed in the pod.

> Nedry
>
> On 5/29/09 at 10:48 AM -0400 you wrote:
>> On Fri, 2009-29-05 at 08:26 -0500, Jared Johnson wrote:
>>> The basic problem we've been encountering is that very rarely, all of our dozen QP nodes inexplicably introduce long delays before answering with a banner (no banner delay involved); watching the logs, it doesn't look like any child is spawned during this wait, and we aren't anywhere near max-children.
>>
>> Mysql can run out of database connections and it can be tuned from the SQL prompt. I have a script which allows tuning parameters safely from the shell. It will also dump all the tunable parameters to STDOUT. I can post it if anyone wants.
>>
>> --
>> --gh

--
--gh
Re: prefork 'orphaned child' messages
> What's orphaned is not a child process, but a shared mem hash record for a process which no longer exists. I suspect that code is racy.

Hrm, then if we're getting a whole lot of these, does this mean child processes are going away at a high rate? I wouldn't expect such a condition using prefork with relatively low concurrency, maybe this reflects a problem during the end of the connection..

-Jared

Inbound and outbound email scanned for spam and viruses by the DoubleCheck Email Manager v5: http://www.doublecheckemail.com
Re: prefork 'orphaned child' messages
On Thu, 28 May 2009, Jared Johnson wrote:

> We're experiencing some strange issues and have been looking at qpsmtpd-prefork's output with $debug set. We're getting a whole lot of lines like this:
>
>   orphaned child, pid: 1285 removed from memory at /usr/bin/qpsmtpd-prefork line 598.
>
> ...
>
> Any ideas?

What's orphaned is not a child process, but a shared mem hash record for a process which no longer exists. I suspect that code is racy.
Re: prefork 'orphaned child' messages
On Fri, 2009-29-05 at 08:26 -0500, Jared Johnson wrote:
> The basic problem we've been encountering is that very rarely, all of our dozen QP nodes inexplicably introduce long delays before answering with a banner (no banner delay involved); watching the logs, it doesn't look like any child is spawned during this wait, and we aren't anywhere near max-children.

Mysql can run out of database connections and it can be tuned from the SQL prompt. I have a script which allows tuning parameters safely from the shell. It will also dump all the tunable parameters to STDOUT. I can post it if anyone wants.

--
--gh
Re: prefork 'orphaned child' messages
> No I don't think it's normal. What are you doing in your plugins? Wasn't there some issue we uncovered a while ago to do with MySQL?

I don't recall hearing of issues with mysql.. we use postgres here and do, well, lots of stuff: lookups for global and ip-based max concurrency rules in hook_pre_connect, lookups for various preferences in hook_rcpt.. if it's a plugin that's breaking things then it's almost certainly our own code, as we've re-written most of qp's plugins for our uses :)

Any clue what's going on in general? Do children wind up 'orphaned' when they go completely unresponsive, e.g. they're hanging out in some infinite loop somewhere or something similarly evil?

The basic problem we've been encountering is that very rarely, all of our dozen QP nodes inexplicably introduce long delays before answering with a banner (no banner delay involved); watching the logs, it doesn't look like any child is spawned during this wait, and we aren't anywhere near max-children. Eventually, a child is spawned and then handles the connection normally, assuming the client hasn't timed out. And eventually, every slave recovers and starts behaving normally.

It seems like this issue would have to be network related in some way, as every node is affected around the same time; DNS was the first obvious choice but we don't seem to be having any problems with that. Once we find said network problem this issue might go away, but it seems that our smtpd should not be so fragile to whatever issue is hanging it up..

-Jared

Inbound and outbound email scanned for spam and viruses by the DoubleCheck Email Manager v5: http://www.doublecheckemail.com
Re: prefork 'orphaned child' messages
On Thu, 28 May 2009, Jared Johnson wrote:

> orphaned child, pid: 1285 removed from memory at /usr/bin/qpsmtpd-prefork line 598.
>
> [snip]
>
> Is this expected behavior? Note that since we have some customizations to qpsmtpd-prefork and plenty of other forked code, I'm not necessarily ready to call this a QP bug rather than a bug in my own code. But I'm not sure what would cause these warnings to be generated, or even whether it's reasonable to assume they reflect any bug at all. Any ideas?

No I don't think it's normal. What are you doing in your plugins? Wasn't there some issue we uncovered a while ago to do with MySQL?

Matt.