Hi again Pieter,

On Tue, Jun 11, 2019 at 04:24:47AM +0200, Willy Tarreau wrote:
> I'm
> going to have a look at this this morning. I now see how to make things
> worse to observe the changes, I suspect that forcing a high nbthread and
> binding all of them to a single CPU should reveal the issue much better.

So I cannot reproduce your cases, but by cheating I could make a very slight
difference: I started 50 processes in parallel, all on CPU #0, each with
64 threads. That's a total of 3200 threads on a single CPU. Doing this with
the TLS health check regtest, I see that before the patches it took 14.2
seconds and after them it took 14.7. However, by modifying the startup code
with the attached patch, it goes down to 11.3 seconds.

I'd like you to give it a try in your environment to confirm whether or not
it improves things. If so, I'll clean it up and merge it. I'm also
interested in any reproducer you could have, given that the made-up test
case I did above doesn't even show anything alarming.

Thank you!
Willy
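
For reference, the kind of stress setup described above (many copies of a test started at once, all bound to CPU #0) could be sketched roughly as below. This is only an illustration, not the script actually used: the helper names `pin_to_cpu0` and `run_parallel` are made up, the command is a placeholder to be replaced by the real regtest invocation, and the 64 threads per process are assumed to come from the configuration under test (e.g. its nbthread setting), which this sketch does not touch.

```python
import os
import subprocess
import sys
import time


def pin_to_cpu0():
    """Runs in each child just before exec: bind it to CPU #0 (Linux only)."""
    try:
        os.sched_setaffinity(0, {0})
    except (AttributeError, OSError):
        # Not on Linux, or CPU #0 is not usable here: leave affinity unchanged.
        pass


def run_parallel(cmd, count=50):
    """Start `count` copies of `cmd` at once, all pinned to CPU #0.

    Returns (elapsed_seconds, list_of_exit_codes).
    """
    start = time.monotonic()
    procs = [subprocess.Popen(cmd, preexec_fn=pin_to_cpu0) for _ in range(count)]
    codes = [p.wait() for p in procs]
    return time.monotonic() - start, codes


if __name__ == "__main__":
    # Placeholder command: substitute the real test command to be measured.
    cmd = [sys.executable, "-c", "pass"]
    elapsed, codes = run_parallel(cmd)
    print(f"{len(codes)} runs, {codes.count(0)} ok, {elapsed:.1f}s total")
```

Running it once on the old build and once on the patched one gives two wall-clock totals to compare, in the same way as the 14.2 / 14.7 / 11.3 second figures above.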