Hi,
HAProxy 2.4-dev9 was released on 2021/02/20. It added 54 new commits
after version 2.4-dev8.
I'm pleased to see things calming down. A number of bugs were addressed,
though nothing critical or unusual. In fact the libressl issue affecting
3.3.x kept a few of us busy looking deep at plenty of places in the code,
trying to add debugging information. While it now clearly looks like purely
a libressl issue, we cannot yet totally rule out the possibility that we
have our share of the responsibility for triggering it, even though this
possibility is progressively fading away thanks to all the analysis that
was performed.
After discussing with Christopher and Baptiste, it turned out that the
state file version increase was not necessary so the 5 extra fields were
added to the regular v1 and the file format was brought back to v1. There
is still the need for a more durable format but that's probably not for
2.4 anymore (unless someone manages to quickly do something awesome and
non-intrusive that can be reviewed but I doubt it).
The cleanup phase has visibly started, with less important changes being
applied at various places with code cleanups and some doc updates. Some
new fields were added to the prometheus exporter to report some listener
statistics. The dynamic idle connection part I was talking about last week
was already moved to a dynamically allocated node so as to save memory on
frontend connections. A few cleanups were performed on the buffer_wait queues.
To be honest I don't think these still work on modern versions (they
were used to subscribe tasks requesting a buffer under out-of-memory conditions
and to wake them up once a buffer was available, following some heuristics).
At least with the latest changes it should work less badly.
Some minor adjustments were made based on some recent benchmarks on 16
threads showing a few severe contention points, like the server lock and
a few counters. In addition it appeared that the old default value of 200
for the runqueue-depth was way too high with the modern scheduler and
tasklets, and that a reduction to 40 increased the performance by up to
18% in both request rate and bandwidth, and reduced average response times
by as much. Similarly changing maxaccept from 64 to 4 gave comparable
results for short-lived connections. These default values were thus changed
to better match the current situation. I'm even thinking about backporting
these changes as far as 2.2 after some time. Some other low-hanging fruit
was spotted, essentially in the scheduler with historic accesses to some
shared variables (like tasks_run_queue) that significantly hinder scalability.
These are more or less easy to fix and will have to be done before 2.4,
at least because the variable names are now totally unclear or misleading.
For developers, debugging has further improved with DEBUG_TASK that notes the
location where a task_wakeup() / tasklet_wakeup() was performed and eases the
production of debugging stats.
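For illustration only, here is a minimal sketch of how a wakeup call can
record the location of its caller. It is not the actual DEBUG_TASK code: the
demo_task structure, its fields and the demo_task_wakeup() names are all
hypothetical, and the only thing assumed is the usual __FILE__ / __LINE__
preprocessor trick.

```c
#include <stddef.h>

/* Hypothetical task structure, reduced to what this sketch needs. */
struct demo_task {
	int wakeups;              /* number of wakeups received so far */
	const char *caller_file;  /* source file of the last wakeup call */
	int caller_line;          /* source line of the last wakeup call */
};

/* Does the work; the caller's location is passed explicitly. */
static inline void demo_task_wakeup_at(struct demo_task *t,
                                       const char *file, int line)
{
	t->wakeups++;
	t->caller_file = file;
	t->caller_line = line;
}

/* The macro captures the call site, so existing callers stay unchanged. */
#define demo_task_wakeup(t) demo_task_wakeup_at((t), __FILE__, __LINE__)
```

With something like this, a debugger or a dump routine can show which place
in the code last woke a given task up, at the cost of two extra fields per
task.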
While walking over some code parts I noted a number of cleanups to do:
- keywords list should take a const for the default proxy. This will
be easy to do but annoying as it will roughly touch half of the files
and might require some manual processing (but while painful, this is
a safe change)
- there are still quite a number of init_* functions that should be
moved to initcalls so that they don't need to be called anymore from
haproxy.c
- lots of old occurrences of "struct stream *sess" dating from the
migration from sessions to streams. Probably 1/4 of the files have to
be touched, but this must really be done as some parts are confusing;
similarly, there's still "sess_" in some function names, some of their
local variables or some struct members and we really need to clear that
up.
- the worklists haven't been used in a while, I was thinking about just
getting rid of them;
- I found 290 occurrences of "free(foo); foo=NULL;". I made a patch to
change them to "destroy(&foo)" to encourage resetting pointers on free
but I figured that a macro was better as it could ultimately also allow
some easy compile-time checks. However we can't call a macro "destroy"
as this conflicts with existing functions. I thought about ha_free()
but then should we call ha_free(&ptr) or ha_free(ptr)? I tend to
prefer the former to make it clear that it will modify ptr, but I'm pretty
sure that more often than not the second form would be used by analogy with
"free()". So I didn't merge anything and am interested in developers'
opinions on this and even alternate proposals.
- I noticed while reading the code that "option tcpka" has apparently no
effect in a defaults section
- I wanted to free all the unused "defaults" sections but I figured that
they may be convenient in the future if we want to create backends at
runtime from the CLI. These could be a nice way to pass most of the
settings. So maybe I'll end up only freeing the unusable ones (those
with conflicting names). Opinions welcome on this as well.
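To make the ha_free() question from the list above more concrete, here is a
minimal sketch of the ha_free(&ptr) form. Nothing of this was merged: the
macro body is an assumption, and demo_conf / demo_conf_release() are
hypothetical names used only for the example.

```c
#include <stdlib.h>

/* Hypothetical macro: frees the pointed-to pointer and resets it to NULL.
 * Taking the address makes it visible at the call site that the pointer is
 * modified. Note that the argument is evaluated twice, so it must not have
 * side effects.
 */
#define ha_free(pptr) do { free(*(pptr)); *(pptr) = NULL; } while (0)

/* Hypothetical structure owning one dynamically allocated string. */
struct demo_conf {
	char *name;
};

/* Releases the name; returns 1 if the pointer was reset, otherwise 0. */
static int demo_conf_release(struct demo_conf *conf)
{
	ha_free(&conf->name);   /* replaces "free(conf->name); conf->name = NULL;" */
	return conf->name == NULL;
}
```

With this form, mistakenly writing ha_free(ptr) on a "char *" or "void *"
would normally be caught or at least warned about by the compiler, since the
macro then tries to free and assign through something that is not a pointer
to a pointer. Calling it twice on the same pointer is harmless because
free(NULL) is a no-op.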
That's about all for this one, I'm going to deploy it on haproxy.org. Seeing
that dev8 worked without any glitch is already encouraging, so thanks to all
those involved!
Please find the usual URLs below:
Site index : http://www.haproxy.org/
Discourse : http://discourse.haproxy.org/
Slack channel : https://slack.haproxy.org/
Issue tracker : https://github.com/haproxy/haproxy/issues
Wiki : https://github.com/haproxy/wiki/wiki
Sources : http://www.haproxy.org/download/2.4/src/
Git repository : http://git.haproxy.org/git/haproxy.git/
Git Web browsing : http://git.haproxy.org/?p=haproxy.git
Changelog : http://www.haproxy.org/download/2.4/src/CHANGELOG
Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/
Willy
---
Complete changelog :
Amaury Denoyelle (8):
BUG/MAJOR: connection: prevent double free if conn selected for removal
REGTESTS: fix http_reuse_conn_hash proxy test
BUG/MINOR: backend: do not call smp_make_safe for sni conn hash
MINOR: connection: remove pointers for prehash in conn_hash_params
REGTESTS: reorder reuse conn proxy protocol test
MINOR: mux_h2: do not try to remove front conn from idle trees
REGTESTS: workaround for a crash with recent libressl on http-reuse sni
MINOR: connection: allocate dynamically hash node for backend conns
Christopher Faulet (8):
BUG/MINOR: server: Remove RMAINT from admin state when loading server state
BUG/MINOR: sample: Always consider zero size string samples as unsafe
BUG/MEDIUM: spoe: Resolve the sink if a SPOE logs in a ring buffer
BUG/MINOR: http-rules: Always replace the response status on a return action
BUG/MINOR: server: Init params before parsing a new server-state line
BUG/MINOR: server: Be sure to cut the last parsed field of a server-state line
MEDIUM: server: Don't introduce a new server-state file version
BUG/MINOR: server: Fix test on number of fields allowed in a server-state line
David Carlier (2):
BUILD/MEDIUM: da Adding pcre2 support.
DOC: DeviceAtlas documentation typo fix.
Emeric Brun (5):
BUG/MINOR: dns: add test on result getting value from buffer into ring.
BUG/MINOR: dns: dns_connect_server must return -1 unsupported nameserver's type
BUG/MINOR: dns: missing test writing in output channel in session handler
BUG/MINOR: dns: fix ring attach control on dns_session_new
BUG/MEDIUM: dns: fix multiple double close on fd in dns.c
Ilya Shipitsin (2):
BUILD: ssl: introduce fine guard for OpenSSL specific SCTL functions
CI: github actions: switch to stable LibreSSL release
Olivier Houchard (1):
BUG/MEDIUM: lists: Avoid an infinite loop in MT_LIST_TRY_ADDQ().
William Dauchy (9):
CLEANUP: check: fix get_check_status_info declaration
CLEANUP: contrib/prometheus-exporter: align for with srv status case
MEDIUM: stats: allow to select one field in `stats_fill_li_stats`
MINOR: stats: add helper to get status string
MEDIUM: contrib/prometheus-exporter: add listen stats
MINOR: cli: add missing agent commands for set server
DOC: contrib/prometheus-exporter: remove htx reference
REGTESTS: contrib/prometheus-exporter: test NaN values
REGTESTS: contrib/prometheus-exporter: test well known labels
Willy Tarreau (19):
BUG/MINOR: session: atomically increment the tracked sessions counter
BUG/MINOR: checks: properly handle wrapping time in __health_adjust()
BUG/MEDIUM: checks: don't needlessly take the server lock in health_adjust()
DEBUG: thread: add 5 extra lock labels for statistics and debugging
OPTIM: server: switch the actconn list to an mt-list
Revert "MINOR: threads: change lock_t to an unsigned int"
MINOR: lb/api: let callers of take_conn/drop_conn tell if they have the lock
OPTIM: lb-first: do not take the server lock on take_conn/drop_conn
OPTIM: lb-leastconn: do not take the server lock on take_conn/drop_conn
OPTIM: lb-leastconn: do not unlink the server if it did not change
MINOR: tasks: add DEBUG_TASK to report caller info in a task
MINOR: tasks/debug: add some extra controls of use-after-free in DEBUG_TASK
DOC: explain the relation between pool-low-conn and tune.idle-pool.shared
MINOR: tasks: refine the default run queue depth
MINOR: listener: refine the default MAX_ACCEPT from 64 to 4
MINOR: dynbuf: make the buffer wait queue per thread
MINOR: dynbuf: use regular lists instead of mt_lists for buffer_wait
MINOR: dynbuf: pass offer_buffers() the number of buffers instead of a threshold
MINOR: sched: have one runqueue ticks counter per thread
---