Hi,
HAProxy 2.4-dev11 was released on 2021/03/05. It added 60 new commits
after version 2.4-dev10.
This version got a lot of cleanups for code style, typos, naming, etc,
and brings some improvements to the wireshark peers protocol dissector.
In addition, that left us some time to start to attack some long-lasting
annoying issues that frequently pop up on the issue tracker from people
getting trace dumps under many threads. Having had the opportunity to
run extended tests on an 8-core/16-thread then on a 64-core machine allowed
us to address another dose of high contention issues. Among them, I can
list:
  - excessive sharing on a few counters updated by the scheduler for
    stats reporting
  - excessive sharing of a few lists, such as the list of streams attached
    to a server in order to honor "shutdown server sessions" on the CLI.
  - missing CPU relax calls in the multi-threaded lists, which meant that
    the situation did not always recover
  - expensive locking of the idle lists that happened on every I/O wakeup
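
To illustrate the "CPU relax" point for readers who have not met it, here
is a minimal sketch of a spin loop that relaxes the CPU each time the
acquisition attempt fails. It is not HAProxy's code: the demo_* names are
hypothetical, the lock is a plain C11 atomic_flag, and the instructions
chosen per architecture are an assumption for the sake of the example.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical helper (not HAProxy's pl_cpu_relax()): hint to the CPU that
 * we are busy-waiting, so that the loop slows down and leaves resources to
 * the other threads, including the one holding the lock. */
static inline void demo_cpu_relax(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__asm__ volatile("pause" ::: "memory");
#elif defined(__aarch64__)
	__asm__ volatile("yield" ::: "memory");
#else
	atomic_signal_fence(memory_order_seq_cst); /* compiler barrier only */
#endif
}

/* Hypothetical spinlock built on a C11 atomic_flag. */
struct demo_spin {
	atomic_flag busy;
};

static inline void demo_spin_init(struct demo_spin *l)
{
	atomic_flag_clear(&l->busy);
}

static inline void demo_spin_lock(struct demo_spin *l)
{
	/* test_and_set returns the previous value: true means the lock is
	 * already held, so relax on every failed attempt before retrying. */
	while (atomic_flag_test_and_set_explicit(&l->busy, memory_order_acquire))
		demo_cpu_relax();
}

static inline void demo_spin_unlock(struct demo_spin *l)
{
	atomic_flag_clear_explicit(&l->busy, memory_order_release);
}

/* Example use: bump a shared counter under the lock, return the new value. */
static long demo_shared;

static long demo_bump(struct demo_spin *l)
{
	long v;

	assert(l);
	demo_spin_lock(l);
	v = ++demo_shared;
	demo_spin_unlock(l);
	return v;
}
```

Without the relax call, many threads retrying at full speed keep fighting
over the same cache line, which slows down the very thread that is trying
to release the lock.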
On some test workloads running on 40 to 48 threads, the request rate
increased by a factor of 14-20 and the response time decreased by as much
(in fact we were way past the point where CPU time was essentially spent
on contention).
But more importantly, I used to occasionally trigger some watchdog panics
under extreme contention on certain lists. Also, thanks to @ngaugler who
continues to run some tests in relation to issue #822, now I've become
strongly convinced that a number of the occasional reports of panics in
socket() or socket_at() when running on many threads were just the outcome
of the expensive locking of the idle lists: one of the traces he provided
me showed a thread being killed there on the lock after not having done
anything that could justify looping, and the link with the socket() call
is just that it's the first syscall after these locks, and that it can
definitely trigger the check for the CPU timeout.
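
As an illustration of how a trylock avoids that kind of stall, here is a
minimal sketch of picking an idle connection from per-thread lists without
ever waiting on a contended lock. It only shows the general technique: the
demo_* names, the fixed thread count and the atomic_flag-based lock are all
assumptions, and none of this is the actual HAProxy code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define DEMO_NBTHREAD 4 /* hypothetical thread count */

struct demo_conn {
	struct demo_conn *next;
};

/* One idle list per thread, each protected by its own flag used as a lock. */
struct demo_idle_list {
	atomic_flag busy;
	struct demo_conn *head;
};

static struct demo_idle_list demo_idle[DEMO_NBTHREAD];

static void demo_idle_init(void)
{
	for (int i = 0; i < DEMO_NBTHREAD; i++) {
		atomic_flag_clear(&demo_idle[i].busy);
		demo_idle[i].head = NULL;
	}
}

/* The owner parks a connection in its list; here waiting is acceptable. */
static void demo_idle_put(int thr, struct demo_conn *c)
{
	struct demo_idle_list *l;

	assert(thr >= 0 && thr < DEMO_NBTHREAD);
	l = &demo_idle[thr];
	while (atomic_flag_test_and_set_explicit(&l->busy, memory_order_acquire))
		; /* spin until the list is free */
	c->next = l->head;
	l->head = c;
	atomic_flag_clear_explicit(&l->busy, memory_order_release);
}

/* Look for an idle connection, starting with our own list. A list whose
 * lock is already held is skipped instead of being waited for. */
static struct demo_conn *demo_idle_take(int self)
{
	for (int i = 0; i < DEMO_NBTHREAD; i++) {
		struct demo_idle_list *l = &demo_idle[(self + i) % DEMO_NBTHREAD];
		struct demo_conn *c;

		/* trylock: a true return value means somebody else holds it */
		if (atomic_flag_test_and_set_explicit(&l->busy, memory_order_acquire))
			continue;
		c = l->head;
		if (c)
			l->head = c->next;
		atomic_flag_clear_explicit(&l->busy, memory_order_release);
		if (c)
			return c;
	}
	return NULL; /* nothing available right now: caller opens a new one */
}
```

The trade-off is that a usable connection may occasionally be missed when
its list happens to be locked, but the caller never spins long enough for
a watchdog to notice it.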
For this reason I decided that some of these patches will have to be
backported because some users are facing performance or stability issues
under certain situations. The patches were arranged to be easier to
backport and a -next branch was created for 2.3 with the backport
candidates in it, which survived all tests and showed close to the same
performance gains.
As you can expect, I'm very interested in getting some test reports of
this version, especially from those facing occasional issues. In any case,
we'll try to emit another 2.3 next week, hopefully with some of these
improvements backported. I don't know yet if any of these ones will go to
2.2 though, time will tell.
There are still quite some cleanups pending in the todo list and some
issues to address but for now we're on the right track, so let's keep up
the good work and have a nice week-end, all.
Please find the usual URLs below :
Site index : http://www.haproxy.org/
Discourse : http://discourse.haproxy.org/
Slack channel : https://slack.haproxy.org/
Issue tracker : https://github.com/haproxy/haproxy/issues
Wiki : https://github.com/haproxy/wiki/wiki
Sources : http://www.haproxy.org/download/2.4/src/
Git repository : http://git.haproxy.org/git/haproxy.git/
Git Web browsing : http://git.haproxy.org/?p=haproxy.git
Changelog : http://www.haproxy.org/download/2.4/src/CHANGELOG
Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/
Willy
PS: sorry for author "Ubuntu" below, it was me from a test machine, and
I've got caught a few times by this: when re-editing the commit
message later, the user never appears and I don't see that I need to
fix it. It will certainly continue to happen until git commit exposes
all fields like a mailer does :-/ Not a big deal anyway.
---
Complete changelog :
Amaury Denoyelle (7):
      CLEANUP: backend: fix a wrong comment
      BUG/MINOR: backend: free allocated bind_addr if reuse conn
      MINOR: backend: handle reuse for conns with no server as target
      REGTESTS: test http-reuse if no server target
      DOC: fix originalto except clause on destination address
      MINOR: backend: add a BUG_ON if conn mux NULL in connect_server
      BUG/MINOR: backend: fix condition for reuse on mode HTTP

Christopher Faulet (8):
      BUG/MINOR: tcp-act: Don't forget to set the original port for IPv4
        set-dst rule
      BUG/MINOR: connection: Use the client's dst family for adressless servers
      BUG/MEDIUM: spoe: Kill applets if there are pending connections and
        nbthread > 1
      DOC: spoe: Add a note about fragmentation support in HAProxy
      BUG/MINOR: hlua: Don't strip last non-LWS char in
        hlua_pushstrippedstring()
      BUG/MINOR: server-state: Don't load server-state file for disabled
        backends
      CLEANUP: dns: Use DISGUISE() on a never-failing ring_attach() call
      CLEANUP: dns: Remove useless test on ns->dgram in dns_connect_nameserver()

Frédéric Lécaille (4):
      BUILD: proxy: Missing header inclusion for quic_transport_params_init()
      BUILD: quic: Implicit conversion between SSL related enums.
      MINOR: contrib: add support for heartbeat control messages.
      MINOR: contrib: Enhance peers dissector heuristic.

Ilya Shipitsin (3):
      CI: codespell: skip Makefile for spell check
      CLEANUP: assorted typo fixes in the code and comments
      CLEANUP: assorted typo fixes in the code and comments

Olivier Houchard (1):
      BUILD: Fix build when using clang without optimizing.

Tim Duesterhus (9):
      CLEANUP: Use ist2(const void*, size_t) whenever possible
      CLEANUP: Use IST_NULL whenever possible
      BUG/MINOR: mux-h2: Fix typo in scheme adjustment
      CLEANUP: Reapply the ist2() replacement patch
      CLEANUP: Use istadv(const struct ist, const size_t) whenever possible
      CLEANUP: Use isttest(const struct ist) whenever possible
      Revert "CI: Pin VTest to a known good commit"
      CLEANUP: Use the ist() macro whenever possible
      CLEANUP: Replace for loop with only a condition by while

Ubuntu (4):
      MINOR: atomic: implement a more efficient arm64 __ha_cas_dw() using pairs
      CLEANUP: stream: explain why we queue the stream at the head of the
        server list
      MEDIUM: backend: use a trylock when trying to grab an idle connection
      OPTIM: lb-random: use a cheaper PRNG to pick a server

Willy Tarreau (24):
      REORG: atomic: reimplement pl_cpu_relax() from atomic-ops.h
      BUG/MINOR: mt-list: always perform a cpu_relax call on failure
      MINOR: atomic: add armv8.1-a atomics variant for cas-dw
      BUG/MINOR: ssl: don't truncate the file descriptor to 16 bits in debug
        mode
      MEDIUM: pools: add CONFIG_HAP_NO_GLOBAL_POOLS and CONFIG_HAP_GLOBAL_POOLS
      MINOR: pools: double the local pool cache size to 1 MB
      MINOR: stream: use ABORT_NOW() and not abort() in stream_dump_and_crash()
      REORG: tools: promote the debug PRNG to more general use as a statistical
        one
      MINOR: task: stop abusing the nice field to detect a tasklet
      MINOR: task: move the nice field to the struct task only
      MEDIUM: task: extend the state field to 32 bits
      MINOR: task: add an application specific flag to the state: TASK_F_USR1
      MEDIUM: muxes: mark idle conns tasklets with TASK_F_USR1
      MINOR: xprt: add new xprt_set_idle and xprt_set_used methods
      MEDIUM: ssl: implement xprt_set_used and xprt_set_idle to relax context
        checks
      MINOR: server: don't read curr_used_conns multiple times
      CLEANUP: global: reorder some fields to respect cache lines
      CLEANUP: sockpair: silence a coverity check about fcntl()
      CLEANUP: lua: set a dummy file name and line number on the dummy servers
      MINOR: server: add a global list of all known servers
      MINOR: cfgparse: finish to set up servers outside of the proxy setup loop
      MINOR: server: allocate a per-thread struct for the per-thread
        connections stuff
      MINOR: server: move actconns to the per-thread structure
      CLEANUP: server: reorder some fields in the server struct to respect
        cache lines
---