Hi,

HAProxy 2.4-dev9 was released on 2021/02/20. It added 54 new commits after version 2.4-dev8.

I'm pleased to see things calming down. A number of bugs were addressed, though nothing critical or unusual. In fact the libressl issue affecting 3.3.x kept a few of us busy looking deep at plenty of places in the code, trying to add debugging information. While it now clearly looks like purely a libressl issue, we cannot yet totally rule out the possibility that we have our share of the responsibility for triggering it, even though this possibility is progressively fading away thanks to all the analyses that were performed.

After discussing with Christopher and Baptiste, it turned out that the state file version increase was not necessary, so the 5 extra fields were added to the regular v1 and the file format was brought back to v1. There is still the need for a more durable format but that's probably not for 2.4 anymore (unless someone manages to quickly do something awesome and non-intrusive that can be reviewed, but I doubt it).

The cleanup phase has visibly started, with less important changes being applied in various places: code cleanups and some doc updates.

Some new fields were added to the prometheus exporter to report some listener statistics.

The dynamic idle connection part I was talking about last week was already moved to a dynamically allocated node so as to save memory on frontend connections.

A few cleanups were performed on the buffer_wait queues. To be honest I don't think these continue to work on modern versions (they were used to subscribe tasks requesting a buffer under out-of-memory condition and to wake them up once a buffer was available, following some heuristics). At least with the latest changes it should work less badly.

Some minor adjustments were made based on some recent benchmarks on 16 threads showing a few severe contention points, like the server lock and a few counters.

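To illustrate why a shared counter can become a contention point with many threads, here is a minimal sketch of the general per-thread counter technique (the changelog entry "have one runqueue ticks counter per thread" goes in this direction). It is not taken from the HAProxy sources: the names thr_counter, counter_inc(), counter_total(), MAX_THREADS and the 64-byte cache line size are all assumptions made for the example.

```c
#include <stdatomic.h>
#include <stddef.h>

#define MAX_THREADS 16   /* hypothetical limit, matching the 16-thread benchmark */
#define CACHE_LINE  64   /* assumed cache line size */

/* One counter per thread, each aligned on its own cache line so that
 * threads do not bounce the same line between cores when counting.
 */
struct thr_counter {
	_Alignas(CACHE_LINE) atomic_ulong value;
};

/* Static storage: all counters start at zero. */
static struct thr_counter counters[MAX_THREADS];

/* Meant to be called by thread <tid> only (tid < MAX_THREADS is assumed):
 * it only touches that thread's own cache line.
 */
static inline void counter_inc(unsigned int tid)
{
	atomic_fetch_add_explicit(&counters[tid].value, 1, memory_order_relaxed);
}

/* Meant to be called rarely (e.g. when dumping stats): sums all the
 * per-thread values. The result is only a snapshot while threads run.
 */
static inline unsigned long counter_total(void)
{
	unsigned long total = 0;

	for (size_t i = 0; i < MAX_THREADS; i++)
		total += atomic_load_explicit(&counters[i].value, memory_order_relaxed);
	return total;
}
```

The trade-off in this sketch is that writes become cheap and local while reads become more expensive, which usually suits statistics that are updated on every event but read only occasionally.
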
In addition it appeared that the old default value of 200 for the runqueue-depth was way too high with the modern scheduler and tasklets, and that a reduction to 40 increased the performance by up to 18% in both request rate and bandwidth, and reduced average response times by as much. Similarly, changing maxaccept from 64 to 4 gave comparable results for short-lived connections. These default values were thus changed to better match the current situation. I'm even thinking about backporting these changes as far as 2.2 after some time.

Some other low-hanging fruits were spotted, essentially in the scheduler, with historic accesses to some shared variables (like tasks_run_queue) that significantly hinder scalability. These are more or less easy to fix and will have to be done before 2.4, at least because the variable names are now totally unclear or misleading.

For developers, debugging has further improved with DEBUG_TASK, which notes the location where a task_wakeup() / tasklet_wakeup() was performed and eases the production of debugging stats.

While walking over some code parts I noted a number of cleanups to do:
  - keywords list should take a const for the default proxy. This will be easy to do but annoying as it will roughly touch half of the files and might require some manual processing (but while painful, this is a safe change).
  - there are still quite a number of init_* functions that should be moved to initcalls so that they don't need to be called anymore from haproxy.c.
  - lots of old occurrences of "struct stream *sess" dating from the migration from sessions to streams. Probably 1/4 of the files have to be touched, but this must really be done as some parts are confusing; similarly, there's still "sess_" in some function names, some of their local variables or some struct members and we really need to clear that up.
  - the worklists haven't been used in a while, I was thinking about just getting rid of them.
  - I found 290 occurrences of "free(foo); foo=NULL;". I made a patch to change them to "destroy(&foo)" to encourage resetting pointers on free but I figured that a macro was better as it could ultimately also allow some easy compile-time checks. However we can't call a macro "destroy" as this conflicts with existing functions. I thought about ha_free() but then should we call ha_free(&ptr) or ha_free(ptr) ? I tend to prefer the former to know that it will modify ptr, but I'm pretty sure that more often than not the second form would be used by analogy with "free()". So I didn't merge anything and am interested in developers' opinions on this and even alternate proposals.
  - I noticed while reading the code that "option tcpka" apparently has no effect in a defaults section.
  - I wanted to free all the unused "defaults" sections but I figured that they may be convenient in the future if we want to create backends at runtime from the CLI. These could be a nice way to pass most of the settings. So maybe I'll end up only freeing the unusable ones (those with conflicting names). Opinions welcome on this as well.

That's about all for this one, I'm going to deploy it on haproxy.org. Seeing that dev8 worked without any glitch is already encouraging, so thanks to all those involved!

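To make the ha_free() question from the cleanup list above more concrete, here is a minimal sketch of what the ha_free(&ptr) form could look like. Nothing was merged, so this is only one possible shape for such a macro; the structure example_conf and the function example_conf_release() are hypothetical and only serve the illustration.

```c
#include <stdlib.h>

/* Hypothetical sketch of the ha_free(&ptr) form: frees the area pointed to
 * by *ptr, then resets *ptr to NULL. Note that <ptr> is evaluated twice in
 * this simple version, so it must not have side effects.
 */
#define ha_free(ptr) do { free(*(ptr)); *(ptr) = NULL; } while (0)

/* Illustrative structure, not taken from the HAProxy sources. */
struct example_conf {
	char *name;
};

/* Releases the name and leaves the field in a safe, reusable state.
 * Calling it twice is harmless since free(NULL) is a no-op.
 */
static void example_conf_release(struct example_conf *conf)
{
	ha_free(&conf->name);   /* replaces "free(conf->name); conf->name = NULL;" */
}
```

With this form, writing ha_free(ptr) by mistake on a plain "char *" or "struct foo *" generally fails to build, because *ptr is then not a pointer. This is one example of the kind of compile-time check a macro makes possible, and an argument in favor of the &ptr form.
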
Please find the usual URLs below :
   Site index       : http://www.haproxy.org/
   Discourse        : http://discourse.haproxy.org/
   Slack channel    : https://slack.haproxy.org/
   Issue tracker    : https://github.com/haproxy/haproxy/issues
   Wiki             : https://github.com/haproxy/wiki/wiki
   Sources          : http://www.haproxy.org/download/2.4/src/
   Git repository   : http://git.haproxy.org/git/haproxy.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy.git
   Changelog        : http://www.haproxy.org/download/2.4/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/

Willy
---
Complete changelog :
Amaury Denoyelle (8):
      BUG/MAJOR: connection: prevent double free if conn selected for removal
      REGTESTS: fix http_reuse_conn_hash proxy test
      BUG/MINOR: backend: do not call smp_make_safe for sni conn hash
      MINOR: connection: remove pointers for prehash in conn_hash_params
      REGTESTS: reorder reuse conn proxy protocol test
      MINOR: mux_h2: do not try to remove front conn from idle trees
      REGTESTS: workaround for a crash with recent libressl on http-reuse sni
      MINOR: connection: allocate dynamically hash node for backend conns

Christopher Faulet (8):
      BUG/MINOR: server: Remove RMAINT from admin state when loading server state
      BUG/MINOR: sample: Always consider zero size string samples as unsafe
      BUG/MEDIUM: spoe: Resolve the sink if a SPOE logs in a ring buffer
      BUG/MINOR: http-rules: Always replace the response status on a return action
      BUG/MINOR: server: Init params before parsing a new server-state line
      BUG/MINOR: server: Be sure to cut the last parsed field of a server-state line
      MEDIUM: server: Don't introduce a new server-state file version
      BUG/MINOR: server: Fix test on number of fields allowed in a server-state line

David Carlier (2):
      BUILD/MEDIUM: da Adding pcre2 support.
      DOC: DeviceAtlas documentation typo fix.

Emeric Brun (5):
      BUG/MINOR: dns: add test on result getting value from buffer into ring.
      BUG/MINOR: dns: dns_connect_server must return -1 unsupported nameserver's type
      BUG/MINOR: dns: missing test writing in output channel in session handler
      BUG/MINOR: dns: fix ring attach control on dns_session_new
      BUG/MEDIUM: dns: fix multiple double close on fd in dns.c

Ilya Shipitsin (2):
      BUILD: ssl: introduce fine guard for OpenSSL specific SCTL functions
      CI: github actions: switch to stable LibreSSL release

Olivier Houchard (1):
      BUG/MEDIUM: lists: Avoid an infinite loop in MT_LIST_TRY_ADDQ().

William Dauchy (9):
      CLEANUP: check: fix get_check_status_info declaration
      CLEANUP: contrib/prometheus-exporter: align for with srv status case
      MEDIUM: stats: allow to select one field in `stats_fill_li_stats`
      MINOR: stats: add helper to get status string
      MEDIUM: contrib/prometheus-exporter: add listen stats
      MINOR: cli: add missing agent commands for set server
      DOC: contrib/prometheus-exporter: remove htx reference
      REGTESTS: contrib/prometheus-exporter: test NaN values
      REGTESTS: contrib/prometheus-exporter: test well known labels

Willy Tarreau (19):
      BUG/MINOR: session: atomically increment the tracked sessions counter
      BUG/MINOR: checks: properly handle wrapping time in __health_adjust()
      BUG/MEDIUM: checks: don't needlessly take the server lock in health_adjust()
      DEBUG: thread: add 5 extra lock labels for statistics and debugging
      OPTIM: server: switch the actconn list to an mt-list
      Revert "MINOR: threads: change lock_t to an unsigned int"
      MINOR: lb/api: let callers of take_conn/drop_conn tell if they have the lock
      OPTIM: lb-first: do not take the server lock on take_conn/drop_conn
      OPTIM: lb-leastconn: do not take the server lock on take_conn/drop_conn
      OPTIM: lb-leastconn: do not unlink the server if it did not change
      MINOR: tasks: add DEBUG_TASK to report caller info in a task
      MINOR: tasks/debug: add some extra controls of use-after-free in DEBUG_TASK
      DOC: explain the relation between pool-low-conn and tune.idle-pool.shared
      MINOR: tasks: refine the default run queue depth
      MINOR: listener: refine the default MAX_ACCEPT from 64 to 4
      MINOR: dynbuf: make the buffer wait queue per thread
      MINOR: dynbuf: use regular lists instead of mt_lists for buffer_wait
      MINOR: dynbuf: pass offer_buffers() the number of buffers instead of a threshold
      MINOR: sched: have one runqueue ticks counter per thread

---