Hello Yann, I'm observing some deadlocks again.
I'm using httpd 2.4.27 + mod_h2 + httpd-2.4.x-mpm_event-wakeup-v7.1.patch + your SSL linger fix patch from this thread.

What kind of information do you need? If you need a full stack backtrace: from which PID, or from all httpd PIDs?

Thanks!

Greets,
Stefan

On 14.07.2017 at 21:52, Yann Ylavic wrote:
> On Fri, Jun 30, 2017 at 1:33 PM, Yann Ylavic <[email protected]> wrote:
>> On Fri, Jun 30, 2017 at 1:20 PM, Ruediger Pluem <[email protected]> wrote:
>>>
>>> On 06/30/2017 12:18 PM, Yann Ylavic wrote:
>>>>
>>>> IMHO mod_ssl shouldn't (BIO_)flush unconditionally in
>>>> modssl_smart_shutdown(), only in the "abortive" mode of
>>>> ssl_filter_io_shutdown().
>>>
>>> I think the issue starts before that.
>>> ap_prep_lingering_close calls the pre_close_connection hook, and
>>> modules registered to this hook can perform all sorts of long-lasting
>>> blocking operations there. While it can be argued that this would be
>>> a bug in the module, I think the only safe way is to have the whole
>>> start_lingering_close_nonblocking be executed by a worker thread.
>>
>> Correct, that'd be much simpler/safer indeed.
>> We need a new SHUTDOWN state then, right?
>
> Actually it was less simple than expected, and it has some caveats,
> obviously...
>
> The attached patch does not introduce a new state but reuses the
> existing CONN_STATE_LINGER, since it was not really considered by the
> listener thread (which uses CONN_STATE_LINGER_NORMAL and
> CONN_STATE_LINGER_SHORT instead), but that's a detail.
>
> Mainly, start_lingering_close_nonblocking() now simply schedules a
> shutdown (i.e. pre_close_connection() followed by an immediate close)
> that will be run by a worker thread.
> A new shutdown_linger_q is created/handled (with the same timeout as
> the short_linger_q, namely 2 seconds) to hold the connections to be
> shut down.
>
> So now when a connection times out in the write_completion or
> keepalive queues, it needs (i.e. the listener may wait for) an
> available worker to process its shutdown/close.
> This means we can *not* close kept-alive connections immediately, like
> before, when becoming short of workers, which will favor active KA
> connections over new ones in this case (I don't think it's that
> serious, but the previous code was taking care of that. For me it's up
> to the admin to size the workers appropriately...).
>
> Same when a connection in the shutdown_linger_q itself times out: the
> patch will require a worker immediately to do the job (see the
> shutdown_lingering_close() callback).
>
> So overall, this patch may introduce the need for more workers than
> before; what was (wrongly) done by the listener thread has to be done
> somewhere anyway...
>
> Finally, I think there is room for improvements, like batching
> shutdowns in the same worker, if there is no objection on the approach
> so far.
>
> WDYT?
>
> Regards,
> Yann.
