Hello Yann,

I'm observing some deadlocks again.

I'm using
httpd 2.4.27
+ mod_h2
+ httpd-2.4.x-mpm_event-wakeup-v7.1.patch
+ your ssl linger fix patch from this thread

What kind of information do you need? If you need a full stack
backtrace: from which PID, or from all httpd PIDs?

Thanks!

Greets,
Stefan

On 14.07.2017 at 21:52, Yann Ylavic wrote:
> On Fri, Jun 30, 2017 at 1:33 PM, Yann Ylavic <[email protected]> wrote:
>> On Fri, Jun 30, 2017 at 1:20 PM, Ruediger Pluem <[email protected]> wrote:
>>>
>>> On 06/30/2017 12:18 PM, Yann Ylavic wrote:
>>>>
>>>> IMHO mod_ssl shouldn't (BIO_)flush unconditionally in
>>>> modssl_smart_shutdown(), only in the "abortive" mode of
>>>> ssl_filter_io_shutdown().
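
For illustration, a minimal sketch of what such a conditional flush
could look like, using only standard OpenSSL calls (SSL_shutdown,
SSL_get_wbio, BIO_flush). The "abortive" parameter is an assumption
here, passed down from ssl_filter_io_shutdown(); this is not the
actual mod_ssl code.

    #include <openssl/ssl.h>

    /* Sketch only: flush the write BIO during SSL shutdown solely in
     * the abortive case, so a normal lingering close cannot block on
     * an unresponsive peer.  The 'abortive' flag is hypothetical. */
    static int smart_shutdown_sketch(SSL *ssl, int abortive)
    {
        int rc = SSL_shutdown(ssl);  /* queue our close_notify alert */

        if (abortive) {
            BIO *wbio = SSL_get_wbio(ssl);
            /* Force the alert onto the wire only when aborting. */
            if (wbio && BIO_flush(wbio) <= 0)
                rc = -1;
        }
        return rc;
    }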
>>>
>>> I think the issue starts before that.
>>> ap_prep_lingering_close calls the pre_close_connection hook, and
>>> modules registered to this hook can perform all sorts of
>>> long-lasting blocking operations there. While it can be argued
>>> that this would be a bug in the module, I think the only safe way
>>> is to have the whole of start_lingering_close_nonblocking executed
>>> by a worker thread.
>>
>> Correct, that'd be much simpler/safer indeed.
>> We need a new SHUTDOWN state then, right?
> 
> Actually it was less simple than expected, and it has some caveats 
> obviously...
> 
> The attached patch does not introduce a new state but reuses the
> existing CONN_STATE_LINGER since it was not really considered by the
> listener thread (which uses CONN_STATE_LINGER_NORMAL and
> CONN_STATE_LINGER_SHORT instead), but that's a detail.
> 
> Mainly, start_lingering_close_nonblocking() now simply schedules a
> shutdown (i.e. pre_close_connection() followed by an immediate close)
> that will be run by a worker thread.
> A new shutdown_linger_q is created/handled (with the same timeout as
> the short_linger_q, namely 2 seconds) to hold connections waiting to
> be shut down.
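
To make the hand-off concrete, here is a minimal, self-contained
sketch (plain pthreads, not the actual event.c patch): the listener
only enqueues the connection on a shutdown queue and wakes a worker,
which then runs the potentially blocking shutdown/close. The conn_t
type and all function names are illustrative; only the queue name
mirrors the shutdown_linger_q above, and the 2-second queue timeout
is omitted for brevity.

    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    typedef struct conn {
        int fd;                  /* the connection's socket */
        struct conn *next;
    } conn_t;

    /* The "shutdown_linger_q": connections awaiting shutdown/close. */
    static conn_t *shutdown_linger_q;
    static pthread_mutex_t q_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  q_cond = PTHREAD_COND_INITIALIZER;

    /* Listener side: schedule the shutdown, never block here. */
    static void schedule_shutdown(conn_t *c)
    {
        pthread_mutex_lock(&q_lock);
        c->next = shutdown_linger_q;
        shutdown_linger_q = c;
        pthread_cond_signal(&q_cond);   /* wake one worker */
        pthread_mutex_unlock(&q_lock);
    }

    /* Worker side: the potentially blocking close runs here, where
     * the real MPM would call pre_close_connection() then close. */
    static void *worker(void *arg)
    {
        (void)arg;
        for (;;) {
            pthread_mutex_lock(&q_lock);
            while (!shutdown_linger_q)
                pthread_cond_wait(&q_cond, &q_lock);
            conn_t *c = shutdown_linger_q;
            shutdown_linger_q = c->next;
            pthread_mutex_unlock(&q_lock);

            printf("worker: shutting down fd %d\n", c->fd);
            free(c);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t tid;
        pthread_create(&tid, NULL, worker, NULL);

        conn_t *c = malloc(sizeof(*c));
        c->fd = 42;
        schedule_shutdown(c);   /* what the listener does on timeout */

        sleep(1);               /* let the worker drain the queue */
        return 0;
    }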
> 
> So now when a connection times out in the write_completion or
> keepalive queues, it needs (i.e. the listener may wait for) an
> available worker to process its shutdown/close.
> This means we can *not* close kept-alive connections immediately like
> before when becoming short of workers, which will favor active KA
> connections over new ones in this case (I don't think it's that
> serious, but the previous code was taking care of that; for me it's
> up to the admin to size the workers appropriately...).
> 
> Likewise, when a connection in the shutdown_linger_q itself times
> out, the patch will require a worker immediately to do the job (see
> the shutdown_lingering_close() callback).
> 
> So overall, this patch may introduce the need for more workers than
> before; what was (wrongly) done by the listener thread has to be done
> somewhere anyway...
> 
> Finally, I think there is room for improvement, like batching
> shutdowns in the same worker, if there is no objection to the
> approach so far.
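
One way such batching could look, reusing the queue from the sketch
above (again purely illustrative): a woken worker detaches up to a
fixed number of queued connections under the lock and shuts them all
down outside of it, amortizing the wakeup cost when many connections
time out at once.

    #define BATCH 16

    /* Drop-in replacement for worker() in the sketch above. */
    static void *batching_worker(void *arg)
    {
        (void)arg;
        for (;;) {
            conn_t *batch = NULL;
            int n = 0;

            pthread_mutex_lock(&q_lock);
            while (!shutdown_linger_q)
                pthread_cond_wait(&q_cond, &q_lock);
            /* Detach up to BATCH connections while holding the lock. */
            while (shutdown_linger_q && n < BATCH) {
                conn_t *c = shutdown_linger_q;
                shutdown_linger_q = c->next;
                c->next = batch;
                batch = c;
                n++;
            }
            pthread_mutex_unlock(&q_lock);

            /* Shut them all down outside the lock. */
            while (batch) {
                conn_t *c = batch;
                batch = batch->next;
                printf("worker: shutting down fd %d\n", c->fd);
                free(c);
            }
        }
        return NULL;
    }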
> 
> WDYT?
> 
> Regards,
> Yann.
> 
