On Tue, Jun 11, 2019 at 09:06:46AM +0200, Willy Tarreau wrote:
> I'd like you to give it a try in your environment to confirm whether or
> not it does improve things. If so, I'll clean it up and merge it. I'm
> also interested in any reproducer you could have, given that the made up
> test case I did above doesn't even show anything alarming.
No need to waste your time on this anymore, I've now found how to
reproduce it with this config:
  global
      stats socket /tmp/sock1 mode 666 level admin
      nbthread 64

  backend stopme
      timeout server 1s
      option tcp-check
      tcp-check send "debug dev exit\n"
      server cli unix@/tmp/sock1 check
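In case it's not obvious, the trick here is that the check server points
back at haproxy's own stats socket, so the first health check sends a CLI
command that makes the process exit; total run time is then dominated by
startup. Roughly annotated (comments added for clarity, not part of the
original config):

```
global
    stats socket /tmp/sock1 mode 666 level admin   # admin-level CLI on a unix socket
    nbthread 64                                    # many threads to stress thread init

backend stopme
    timeout server 1s                              # don't wait long on the check
    option tcp-check                               # scripted tcp-check sequence
    tcp-check send "debug dev exit\n"              # ask our own CLI to exit the process
    server cli unix@/tmp/sock1 check               # the "server" is our own stats socket
```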
Then I run it in a loop, bound to different CPU sets:
  $ time for i in {1..20}; do
        taskset -c 0,1,2,3 ./haproxy -db -f slow-init.cfg >/dev/null 2>&1
    done
With a single CPU, it can take up to 10 seconds to run the loop on
commits e186161 and e4d7c9d, while it takes 0.18 seconds with the patch.
With 4 CPUs as above, it takes 1.5s with e186161, 2.3s with e4d7c9d,
and 0.16 seconds with the patch.
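For convenience, the comparison can be scripted. This is only a sketch:
time_runs is a made-up helper, and it assumes a Linux box with taskset
(util-linux) installed and slow-init.cfg in the current directory:

```shell
#!/bin/sh
# time_runs: hypothetical helper that runs a command 20 times pinned to the
# given CPU set and reports rough wall-clock time (output silenced, as above).
time_runs() {
    cpus="$1"; shift
    start=$(date +%s)
    i=0
    while [ "$i" -lt 20 ]; do
        taskset -c "$cpus" "$@" >/dev/null 2>&1
        i=$((i + 1))
    done
    echo "cpus=$cpus elapsed=$(( $(date +%s) - start ))s"
}

# Compare a single CPU against four:
time_runs 0       ./haproxy -db -f slow-init.cfg
time_runs 0,1,2,3 ./haproxy -db -f slow-init.cfg
```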
The tests I had run consisted of starting hundreds of thousands of
listeners to amplify the impact of the start time, but in the end that
just diluted the extra time into an already very long total run time.
Running it in a loop like above is quite close to what the regtests do,
and explains why I couldn't spot the difference (e.g. a few hundred ms
at worst among tens of seconds).
Thus I'm merging the patch now (already cleaned up, and tested without
threads as well).
Let's hope it's the last time :-)
Thanks,
Willy