On Wed, Jun 20, 2018 at 04:42:58PM +0200, Christopher Faulet wrote:
> When HAProxy is shutting down, it exits the polling loop when there are no jobs
> anymore (jobs == 0). When there is no thread, it works pretty well, but when
> HAProxy is started with several threads, a thread can decide to exit because
> jobs variable reached 0 while another one is processing a task (e.g. a
> health-check). At this stage, the running thread could decide to request a
> synchronization. But because at least one of them has already gone, the others
> will wait infinitely in the sync point and the process will never die.

Just a comment on this last sentence: I think this is the root cause of the
problem. A thread quits and its thread_mask bit doesn't disappear. In my
opinion, if we're looping, it's precisely because there's no way to tell, by
looking at all_threads_mask, whether some threads are missing. Thus I think
that a more reliable long-term solution would require each exiting thread to
do at least "all_threads_mask &= ~tid_bit".
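
To illustrate the idea, here is a minimal sketch using C11 atomics. It is not
HAProxy's actual code: thread_register(), thread_unregister() and
sync_complete() are hypothetical helpers, and only the all_threads_mask and
tid_bit names come from the discussion above. It assumes the sync point only
waits for threads whose bit is still set in the mask.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the global mask of running threads. */
static _Atomic unsigned long all_threads_mask;

/* Mark thread <tid> as running by setting its bit. */
static void thread_register(unsigned int tid)
{
    atomic_fetch_or(&all_threads_mask, 1UL << tid);
}

/* Called by a thread right before it exits: atomically clears its bit
 * (the "all_threads_mask &= ~tid_bit" step) and returns the mask of the
 * threads that remain.
 */
static unsigned long thread_unregister(unsigned int tid)
{
    unsigned long tid_bit = 1UL << tid;

    /* atomic_fetch_and() returns the previous value, so clear the bit
     * in the returned copy as well.
     */
    return atomic_fetch_and(&all_threads_mask, ~tid_bit) & ~tid_bit;
}

/* Assumed sync point condition: every thread still present in the mask
 * has arrived. Threads that already left are no longer waited for.
 */
static int sync_complete(unsigned long arrived)
{
    unsigned long mask = atomic_load(&all_threads_mask);

    return (arrived & mask) == mask;
}
```

With threads 0, 1 and 2 registered, a sync point where only 0 and 1 have
arrived is incomplete; once thread 2 calls thread_unregister(2), the same
check succeeds instead of waiting forever.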

Now I have no idea whether the current sync point code is compatible with
this, or whether this will be sufficient, but I'm pretty sure that over time
we'll have to go this way to fix this inconsistency.

Willy
