On 10-06-2019 at 16:14, Willy Tarreau wrote:
> On Mon, Jun 10, 2019 at 04:06:13PM +0200, PiBa-NL wrote:
>> Things certainly look better again now regarding this issue.
>> Running the test repeatedly and manually looking over the results, it's
>> pretty much as good as it was before. There seems to be a 1 ms increase in
>> the check duration, but maybe this is because the moved initialization
>> delays the first check on startup by a millisecond or so?
> It should not. At this point I think it can be anything, including
> measurement noise or even thread assignment on startup!
>> Below are some test results based on manual observation and some
>> in-my-head filtering of the console output.. (mistakes included ;) )
>> repeat 10 ./vt -v ./work/haproxy-*/reg-tests/checks/tls_health_checks.vtc |
>> grep Layer7 | grep OK | grep WARNING
>> Commit-ID , min-max time for ~95% of check durations , comment
>> e4d7c9d , 6 - 9 ms , all tests pass (1 test out of ~100 showed 29 ms;
>> none below 6 ms, and almost half of them show 7 ms)
>> 6ec902a , 11 - 150 ms , for the 12 tests that passed
> That's quite a difference indeed.
>> e186161 , 5 - 8 ms , all tests pass (1 test used 15 ms; more than half
>> the tests show a 5 ms check duration, and the majority of the remainder
>> show 6 ms)
>> I'm not sure if this deserves further investigation at the moment; I think
>> it does not. Thanks for spending your weekend on this :) that wasn't my
>> intention.
> Oh don't worry, you know I'm a low-level guy; just give me a problem to solve
> with only a few bits available and I can spend countless hours on it! Others
> entertain themselves playing games; for me this is a game :-)
> Thanks a lot for testing, at least we know there isn't another strange
> thing hidden behind it.
After a bit more fiddling I noticed that the new startup method seems
more CPU intensive.
It can also be seen that the vtest run takes a bit longer to pass, 1.3 s
vs. 0.8 s, even though the health-check durations themselves are as short
as expected. It is also using quite a bit more 'user' CPU. I was wondering
whether this is a consequence of the new init sequence, or whether some
improvement is still needed there. I noticed this after trying to run
multiple tests simultaneously again; they interfered with each other more
than they used to..
** h1 1.279 WAIT4 pid=63820 status=0x0002 (user 4.484293 sys 0.054781)
** h2 1.394 WAIT4 pid=63823 status=0x0002 (user 4.637692 sys 0.015588)
# top TEST ./test/tls_health_checks-org.vtc passed (1.395)
Before the latest changes it used less 'user' CPU:
** h1 0.783 WAIT4 pid=65811 status=0x0002 (user 1.077052 sys 0.031218)
** h2 0.897 WAIT4 pid=65814 status=0x0002 (user 0.341360 sys 0.037928)
# top TEST ./test/tls_health_checks-org.vtc passed (0.899)
And with 'nbthread 1' the user CPU usage is dramatically lower still,
with the same test..
** h1 0.684 WAIT4 pid=67990 status=0x0002 (user 0.015203 sys 0.015203)
** h2 0.791 WAIT4 pid=67993 status=0x0002 (user 0.013551 sys 0.009034)
# top TEST ./test/tls_health_checks-org.vtc passed (0.793)
** h1 0.682 WAIT4 pid=65854 status=0x0002 (user 0.007158 sys 0.021474)
** h2 0.790 WAIT4 pid=65857 status=0x0002 (user 0.007180 sys 0.014361)
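The user/sys figures in the WAIT4 lines above are the child's rusage as
reported by vtest. As a sanity check independent of vtest, the same kind of
user-CPU figure can be obtained with the POSIX `time -p` idiom; a minimal
sketch, with `sleep 0` standing in as a placeholder for the actual haproxy
invocation:

```shell
# Capture real/user/sys CPU seconds of a child process, similar to what
# vtest's WAIT4 line reports ('sleep 0' is a placeholder command):
{ time -p sleep 0; } 2> times.txt
cat times.txt               # three lines: real / user / sys, in seconds
grep -c '^user' times.txt   # prints 1: the 'user' figure compared above
```

Comparing the 'user' line of such a run across different nbthread settings
should reproduce the differences seen in the WAIT4 lines.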
If a single-threaded haproxy process can run with 0.015 s of user CPU, I
would not have expected it to require 4.4 s on a 16-core CPU for the same
startup and actions. Spawning a second thread with the already-parsed
config should be cheaper, not more expensive. Even if every thread parsed
the config separately, 16 threads at 0.015 s each would still only add up
to about 0.24 s, nowhere near 4.4 s.
So I also tried 'nbthread 8', and that still seems 'alright', as seen
below, so I guess with the default nbthread of 16 the h1 and h2 processes
get into some conflict fighting over the available cores. And since
haproxy by default uses all cores as of 2.0-dev3, I guess this might cause
some undesirable effects in the field once it gets released and haproxy
isn't the only process running on a machine. Even if it is the only
intensive process, I wonder what other VMs on the same hypervisor would
think of it, though I know VMs always give 'virtual performance' ;) ..
Running with nbthread 8, user CPU usage and test time are still relatively low:
** h1 0.713 WAIT4 pid=68467 status=0x0002 (user 0.197443 sys 0.022781)
** h2 0.824 WAIT4 pid=68470 status=0x0002 (user 0.184567 sys 0.026366)
# top TEST ./test/tls_health_checks-org.vtc passed (0.825)
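Given the observation above, one way to keep two test instances from
fighting over all 16 cores would be to pin the thread count (and
optionally the CPU binding) in the global section. A minimal sketch for
haproxy 2.0; the cpu-map line and its exact mapping are illustrative
assumptions, not taken from the test configs:

```
global
    nbthread 8
    # optional: bind the 8 threads of (single) process 1 to CPUs 0-7,
    # leaving the remaining cores for the other test instance
    cpu-map auto:1/1-8 0-7
```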
Hope you can make sense of some of this. Sorry for not noticing earlier;
I guess I was too focused on just the health-check duration. Or maybe it's
just me interpreting the numbers wrongly; that's surely also an option.