Some day, I knew I had to learn more about mpm_event. =)

Adding more DEBUGs, I see in the example below that 2 connections were idling 
at start of the graceful and the get added to the linger_chain. 2 workers are 
then woken up and process the socket. connection_count stays at 2 however. As I 
read it, that count drops when the connection pool is destroyed/cleanup'ed. 
This normally seem to happen on the worker_queue_info, but in this example this 
just does not happen.

Is this a correct read?

[Thu Jul 08 09:39:32.423198 2021] [mpm_event:trace1] [pid 9272:tid 4494732800] 
event.c(2973): graceful termination received, join workers
[Thu Jul 08 09:39:32.423032 2021] [mpm_event:debug] [pid 9272:tid 
123145324126208] event.c(2250): (4)Interrupted system call: AH: listener: 
process defer_linger_chain have_idle_worker=1 listener_may_exit=1 
connection_count=2
[Thu Jul 08 09:39:32.422765 2021] [mpm_event:debug] [pid 9272:tid 
123145321443328] event.c(2431): (4)Interrupted system call: AH: worker 
pop_something interrupted, goto worker_pop
[Thu Jul 08 09:39:32.422251 2021] [mpm_event:debug] [pid 9272:tid 
123145311784960] event.c(2431): (4)Interrupted system call: AH: worker 
pop_something interrupted, goto worker_pop
[Thu Jul 08 09:39:32.421638 2021] [mpm_event:debug] [pid 9272:tid 
123145318760448] event.c(2431): (4)Interrupted system call: AH: worker 
pop_something interrupted, goto worker_pop
[Thu Jul 08 09:39:32.423541 2021] [mpm_event:debug] [pid 9272:tid 
123145324126208] event.c(2259): (4)Interrupted system call: AH: listener: 
process defer_linger_chain pushed to worker, listener_may_exit=1 
connection_count=2
[Thu Jul 08 09:39:32.422212 2021] [mpm_event:debug] [pid 9272:tid 
123145311248384] event.c(2408): (4)Interrupted system call: AH: worker 
pop_something, workers_may_exit=0, dying=1
[Thu Jul 08 09:39:32.421806 2021] [mpm_event:debug] [pid 9272:tid 
123145312858112] event.c(2431): (4)Interrupted system call: AH: worker 
pop_something interrupted, goto worker_pop
[Thu Jul 08 09:39:32.422789 2021] [mpm_event:debug] [pid 9272:tid 
123145323053056] event.c(2431): (4)Interrupted system call: AH: worker 
pop_something interrupted, goto worker_pop
[Thu Jul 08 09:39:32.423346 2021] [mpm_event:debug] [pid 9272:tid 
123145317150720] event.c(2431): (4)Interrupted system call: AH: worker 
pop_something interrupted, goto worker_pop
[Thu Jul 08 09:39:32.423755 2021] [mpm_event:debug] [pid 9272:tid 
123145311248384] event.c(2431): (4)Interrupted system call: AH: worker 
pop_something interrupted, goto worker_pop
[Thu Jul 08 09:39:32.424230 2021] [mpm_event:debug] [pid 9272:tid 
123145317150720] event.c(2402): (4)Interrupted system call: AH: worker read 
worker_queue, workers_may_exit=0, dying=1
[Thu Jul 08 09:39:32.424373 2021] [mpm_event:debug] [pid 9272:tid 
123145311248384] event.c(2402): (4)Interrupted system call: AH: worker read 
worker_queue, workers_may_exit=0, dying=1
[Thu Jul 08 09:39:32.424384 2021] [mpm_event:debug] [pid 9272:tid 
123145317150720] event.c(2408): AH: worker pop_something, workers_may_exit=0, 
dying=1
[Thu Jul 08 09:39:32.424477 2021] [mpm_event:debug] [pid 9272:tid 
123145317150720] event.c(2338): AH: process_defer_linger, processing socket, 
connection_count=2
[Thu Jul 08 09:39:32.424653 2021] [mpm_event:debug] [pid 9272:tid 
123145317150720] event.c(2342): AH: process_defer_linger, processed socket, 
connection_count=2
[Thu Jul 08 09:39:32.424664 2021] [mpm_event:debug] [pid 9272:tid 
123145317150720] event.c(2338): AH: process_defer_linger, processing socket, 
connection_count=2
[Thu Jul 08 09:39:32.424680 2021] [mpm_event:debug] [pid 9272:tid 
123145317150720] event.c(2342): AH: process_defer_linger, processed socket, 
connection_count=2
[Thu Jul 08 09:39:32.424708 2021] [mpm_event:debug] [pid 9272:tid 
123145317150720] event.c(2319): AH: process_defer_linger, nothing more to do
[Thu Jul 08 09:39:32.424754 2021] [mpm_event:debug] [pid 9272:tid 
123145317150720] event.c(2402): AH: worker read worker_queue, 
workers_may_exit=0, dying=1
[Thu Jul 08 09:39:32.424756 2021] [mpm_event:debug] [pid 9272:tid 
123145324126208] event.c(1978): AH: pollset returned listener_may_exit=1 
connection_count=2
[Thu Jul 08 09:39:32.424830 2021] [mpm_event:debug] [pid 9272:tid 
123145311248384] event.c(2408): AH: worker pop_something, workers_may_exit=0, 
dying=1
[Thu Jul 08 09:39:32.424910 2021] [mpm_event:debug] [pid 9272:tid 
123145311248384] event.c(808): AH: decrement_connection_count: 
last_connection=0 listener_may_exit=1 connection_count=1
[Thu Jul 08 09:39:32.424924 2021] [mpm_event:debug] [pid 9272:tid 
123145311248384] event.c(816): AH: decrement_connection_count: pollset_wakeup
[Thu Jul 08 09:39:32.424958 2021] [mpm_event:debug] [pid 9272:tid 
123145311248384] event.c(2319): AH: process_defer_linger, nothing more to do
[Thu Jul 08 09:39:32.424960 2021] [mpm_event:debug] [pid 9272:tid 
123145324126208] event.c(1978): (4)Interrupted system call: AH: pollset 
returned listener_may_exit=1 connection_count=1
[Thu Jul 08 09:39:37.349569 2021] [mpm_event:debug] [pid 9272:tid 
123145324126208] event.c(1978): (70007)The timeout specified has expired: AH: 
pollset returned listener_may_exit=1 connection_count=1
[Thu Jul 08 09:39:37.349661 2021] [mpm_event:trace1] [pid 9272:tid 
123145324126208] event.c(1816): All workers are busy or dying, will close 0 
keep-alive connections

Attachment: event_debugv3.diff
Description: Binary data

> Am 07.07.2021 um 18:49 schrieb Stefan Eissing <[email protected]>:
> 
> 
> 
>> Am 07.07.2021 um 18:34 schrieb Yann Ylavic <[email protected]>:
>> 
>> Does this attached patch help? This is to verify an hypothesis where
>> killed keepalive connections are still in the defer_linger_chain
>> without a worker to handle them.
>> 
>> I don't see "interrupted, but continuing" triggering in your log but
>> possibly it happened earlier..
> 
> 
> [Wed Jul 07 16:48:04.237865 2021] [mpm_event:debug] [pid 55134:tid 
> 4643098112] event.c(599): AH: wakeup_listener: start
> [Wed Jul 07 16:48:04.237895 2021] [mpm_event:debug] [pid 55134:tid 
> 4643098112] event.c(603): AH: wakeup_listener: apr_pollset_wakeup
> [Wed Jul 07 16:48:04.238042 2021] [mpm_event:debug] [pid 55134:tid 
> 4643098112] event.c(610): AH: wakeup_listener: ap_queue_info_term
> [Wed Jul 07 16:48:04.238224 2021] [mpm_event:debug] [pid 55134:tid 
> 123145362530304] event.c(1971): (4)Interrupted system call: AH: pollset 
> returned listener_may_exit=1 connection_count=1
> [Wed Jul 07 16:48:04.238381 2021] [mpm_event:trace1] [pid 55134:tid 
> 123145362530304] event.c(1809): All workers are busy or dying, will close 1 
> keep-alive connections
> [Wed Jul 07 16:48:04.238196 2021] [mpm_event:debug] [pid 55134:tid 
> 4643098112] event.c(629): AH: wakeup_listener: pthread_kill
> [Wed Jul 07 16:48:04.238513 2021] [mpm_event:debug] [pid 55134:tid 
> 123145362530304] event.c(1971): (4)Interrupted system call: AH: pollset 
> returned listener_may_exit=1 connection_count=1
> [Wed Jul 07 16:48:04.238506 2021] [mpm_event:debug] [pid 55134:tid 
> 4643098112] event.c(635): AH: wakeup_listener: end
> [Wed Jul 07 16:48:04.238555 2021] [mpm_event:trace1] [pid 55134:tid 
> 4643098112] event.c(2928): graceful termination received, join workers
> [Wed Jul 07 16:48:08.443130 2021] [mpm_event:debug] [pid 55134:tid 
> 123145362530304] event.c(1971): (70007)The timeout specified has expired: AH: 
> pollset returned listener_may_exit=1 connection_count=1
> [Wed Jul 07 16:48:08.443149 2021] [mpm_event:trace1] [pid 55134:tid 
> 123145362530304] event.c(1809): All workers are busy or dying, will close 0 
> keep-alive connections
> 
>> 
>> On Wed, Jul 7, 2021 at 5:51 PM Stefan Eissing
>> <[email protected]> wrote:
>>> 
>>> The things I do for you...
>>> 
>>> [Wed Jul 07 15:48:52.591983 2021] [mpm_event:debug] [pid 48747:tid 
>>> 4385197568] event.c(599): AH: wakeup_listener: start
>>> [Wed Jul 07 15:48:52.592000 2021] [mpm_event:debug] [pid 48747:tid 
>>> 4385197568] event.c(603): AH: wakeup_listener: apr_pollset_wakeup
>>> [Wed Jul 07 15:48:52.592046 2021] [mpm_event:debug] [pid 48747:tid 
>>> 4385197568] event.c(610): AH: wakeup_listener: ap_queue_info_term
>>> [Wed Jul 07 15:48:52.592188 2021] [mpm_event:debug] [pid 48747:tid 
>>> 4385197568] event.c(629): AH: wakeup_listener: pthread_kill
>>> [Wed Jul 07 15:48:52.592166 2021] [mpm_event:debug] [pid 48747:tid 
>>> 123145502474240] event.c(1985): (4)Interrupted system call: AH: pollset 
>>> returned listener_may_exit=1 connection_count=2
>>> [Wed Jul 07 15:48:52.592523 2021] [mpm_event:debug] [pid 48747:tid 
>>> 4385197568] event.c(635): AH: wakeup_listener: end
>>> [Wed Jul 07 15:48:52.592727 2021] [mpm_event:trace1] [pid 48747:tid 
>>> 4385197568] event.c(2942): graceful termination received, join workers
>>> [Wed Jul 07 15:48:52.592515 2021] [mpm_event:trace1] [pid 48747:tid 
>>> 123145502474240] event.c(1809): All workers are busy or dying, will close 2 
>>> keep-alive connections
>>> [Wed Jul 07 15:48:52.593381 2021] [mpm_event:debug] [pid 48747:tid 
>>> 123145502474240] event.c(1985): AH: pollset returned listener_may_exit=1 
>>> connection_count=2
>>> [Wed Jul 07 15:48:52.593478 2021] [mpm_event:debug] [pid 48747:tid 
>>> 123145502474240] event.c(1985): (4)Interrupted system call: AH: pollset 
>>> returned listener_may_exit=1 connection_count=1
>>> [Wed Jul 07 15:48:57.552623 2021] [mpm_event:debug] [pid 48747:tid 
>>> 123145502474240] event.c(1985): (70007)The timeout specified has expired: 
>>> AH: pollset returned listener_may_exit=1 connection_count=1
>>> [Wed Jul 07 15:48:57.552658 2021] [mpm_event:trace1] [pid 48747:tid 
>>> 123145502474240] event.c(1809): All workers are busy or dying, will close 0 
>>> keep-alive connections
>>> 
>>> 
>>> 
>>> Btw. I opened a bottle of nice wine - this is entertaining. ;-)
>>> 
>>> - Stefan
>>> 
>>>> Am 07.07.2021 um 17:48 schrieb Yann Ylavic <[email protected]>:
>>>> 
>>>> On Wed, Jul 7, 2021 at 5:45 PM Stefan Eissing
>>>> <[email protected]> wrote:
>>>>> 
>>>>>> Am 07.07.2021 um 17:39 schrieb Ruediger Pluem <[email protected]>:
>>>>>> 
>>>>>> Hm. The ap_log_error statically writes listener_may_exit=0. Can you put 
>>>>>> the actual value of listener_may_exit in the log message?
>>>>> 
>>>>> I put it statically since that was the value in the "if ()" 6 lines 
>>>>> above. it should have been visible then and there to work.
>>>> 
>>>> The above if () does not always break.
>>>> 
>>>> 
>>>> Cheers;
>>>> Yann.
>>> 
>> <event_EINTR.diff>
> 

Reply via email to