Hi,

HAProxy 2.4.8 was released on 2021/11/03. It added 61 new commits
after version 2.4.7.

After almost one month, this is a bug fix release which addresses quite a
number of issues that were reported since 2.4.7:

  - resolvers: there were a large number of structural issues in the code,
    and quite frankly we're not proud of the solutions but it's impossible
    to do something more elegant in the current state without a major
    rewrite. So what matters here is that all race conditions are addressed
    and that the code works reliably. While the 2.5 fixes add a lookup tree
    to perform significant CPU savings on SRV records, that code was not
    backported to 2.4 because it adds further changes that do not seem
    necessary in the current situation. We may revisit that choice later if
    users still face important CPU usage. But I'm now more confident that
    the observed CPU loops were in fact infinite loops due to the list
    bugs, rather than high CPU usage. In the current situation everything
    was done so that the resolvers code couldn't crash anymore (19
    patches). I sincerely hope that we will not have to experience another
    journey in that swamp for a while.

  - an interesting bug in the ring API caused boundary checks for the
    wrapping at the end of the buffer to be shifted by one both in the
    producer and the consumer, thus they both cancel each other and are not
    observable... until the byte after the buffer is not mapped or belongs
    to another area. One crash was met on boot (since startup messages are
    duplicated into a ring for later retrieval), and it is possible that
    those sending logs over TCP might have faced it as well, otherwise it's
    extremely unlikely to be observed outside of these use cases.

  - the CPU affinity setting on FreeBSD was relying on a wrong macro to get
    the number of CPUs, assuming it was always one, so the affinity
    settings were rejected for the second and higher CPUs.

  - using the tarpit could lead to head-of-line blocking of an H2
    connection as the pending data were not drained.
    And in other protocols, the presence of these pending data could cause
    a wakeup loop between the mux and the stream, which usually ended in
    the process being detected as faulty and being killed by the safety
    checks.

  - a similar wakeup loop could also happen when waiting for more data
    (e.g. option http-buffer-request) with lots of data already present in
    the receive buffer while the lower layer could only deliver a full
    block at once, that couldn't fit.

  - a slow memory leak in 2.4 with Lua on non-glibc systems was addressed.
    Glibc's realloc() function exactly matches Lua's allocator semantics,
    thus the allocator was simplified in 2.4... except that the man page is
    not very clear on the fact that it's a glibc-ism to free on realloc(0),
    leading to a slow leak in other environments.

  - the h2spec tests in the CI were regularly failing on a few tests
    expecting HTTP/2 GOAWAY frames that were sent (even seen in strace).
    The problem was that we didn't perform a graceful shutdown and that
    this copes badly with bidirectional communications as unread pending
    data cause the connection to be reset and the frame to be lost. This
    was addressed by performing a clean shutdown. It's unlikely that anyone
    ever noticed this given that this essentially happens when
    communication errors are reported (i.e. when the client has little
    reason to complain).

  - some users complained that TLS handshakes were renewed too often in
    some cases. Emeric found that with the migration to the muxes in
    1.9-2.0 we've lost the clean shutdown on end of connection that's also
    used to commit the TLS session cache entry. For HTTP/2 this was
    addressed as a side effect of the fix above, and for HTTP/1, a fix was
    produced to also perform a clean shutdown on keep-alive connections (it
    used to work fine only for close ones).

  - the validity checks for sample fetch functions were only applied to the
    frontend capability of a proxy.
    This means that using a small set of sample fetch functions (like
    "be_name()") in proxies that are both a frontend and a backend
    ("listen" or "defaults") would lead to a config error while it is
    technically valid. This problem has always been there and was never
    reported.

  - automatic cast of variables to other types would fail to first verify
    if a cast method was known, possibly causing a crash at runtime when
    calling them for the first time (e.g. using a variable of type address
    as an argument to strcmp() or a boolean with secure_memcmp()).

  - some streams could sometimes be frozen when filters were enabled (such
    as compression) and an error was raised with data still left to be
    processed.

  - HTTP health checks could report an L7 timeout when facing a parse
    error, because the response is dropped before being translated to HTX,
    while the check waiting for a response didn't explicitly check for a
    possible end-of-input.

  - http-after-response rules must stop after an "allow" action, to match
    their http-response counterpart.

  - the parsing of the "Authorization" header field would fail if more than
    one space was present between the scheme and the value.

  - the "fix_tag_value()" fetch function wouldn't properly wait for more
    data due to an inverted condition.

  - build failures could happen on Mac/arm64 with a recent clang compiler.

A few tiny improvements:

  - halog updates to report headers and query strings were backported, as
    these are the type of improvements expected where halog is used (i.e.
    in the field).

  - the memory profiler now also takes into account the bookkeeping size
    used by each allocated area so that any future leak like the Lua one
    will not be able to stay unnoticed anymore.

What is pleasant here is to see that very few of the issues above concern
2.4 only. In other words, 2.4 is at least as good as older versions.
This is very encouraging because it will allow us to push a bit less hard
on backporting complex fixes too far: for some rare issues we might
sometimes prefer to encourage users to use a more recent version rather
than risk breaking many other usages. [Yes, I'm looking at you, resolver
patches, that will hardly apply to 2.0 while that's not where users
complain the most].

Updates for other versions are on their way as well. It just takes time.

Please find the usual URLs below :
   Site index       : http://www.haproxy.org/
   Discourse        : http://discourse.haproxy.org/
   Slack channel    : https://slack.haproxy.org/
   Issue tracker    : https://github.com/haproxy/haproxy/issues
   Wiki             : https://github.com/haproxy/wiki/wiki
   Sources          : http://www.haproxy.org/download/2.4/src/
   Git repository   : http://git.haproxy.org/git/haproxy-2.4.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy-2.4.git
   Changelog        : http://www.haproxy.org/download/2.4/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/

Willy
---
Complete changelog :
Amaury Denoyelle (3):
      BUG/MEDIUM: cpuset: fix cpuset size for FreeBSD
      BUILD: fix compilation on NetBSD
      BUG/MINOR: backend: fix improper insert in avail tree for always reuse

Christopher Faulet (14):
      BUG/MEDIUM: mux_h2: Handle others remaining read0 cases on partial frames
      BUG/MINOR: http-ana: Don't eval front after-response rules if stopped on back
      BUG/MINOR: sample: Fix 'fix_tag_value' sample when waiting for more data
      BUG/MEDIUM: stream: Keep FLT_END analyzers if a stream detects a channel error
      BUG/MEDIUM: tcpcheck: Properly catch early HTTP parsing errors
      BUG/MINOR: mux-h1: Save shutdown mode if the shutdown is delayed
      BUG/MEDIUM: mux-h1: Perform a connection shutdown when the h1c is released
      BUG/MEDIUM: resolvers: Don't recursively perform requester unlink
      BUG/MEDIUM: resolvers: Track api calls with a counter to free resolutions
      BUG/MEDIUM: http-ana: Drain request data waiting the tarpit timeout expiration
      BUG/MEDIUM: stream-int: Block reads if channel cannot receive more data
      BUG/MEDIUM: sample: Cumulate frontend and backend sample validity flags
      DOC: config: Fix alphabetical order of fc_* samples
      MINOR: stream: Improve dump of bogus streams

David CARLIER (1):
      BUILD: atomic: fix build on mac/arm64

David Carlier (1):
      BUILD/MINOR: cpuset freebsd build fix

Emeric Brun (2):
      BUG/MAJOR: dns: tcp session can remain attached to a list after a free
      BUG/MAJOR: dns: attempt to lock globaly for msg waiter list instead of use barrier

John Roesler (1):
      DOC/peers: some grammar fixes for peers 2.1 spec

Olivier Houchard (1):
      MINOR: initcall: Rename __GLOBL and __GLOBL1.

Remi Tricot-Le Breton (1):
      BUG/MINOR: http: Authorization value can have multiple spaces after the scheme

Thayne McCombs (1):
      DOC: configuration: add clarification on escaping in keyword arguments

Tim Duesterhus (6):
      MINOR: halog: Add -qry parameter allowing to preserve the query string in -uX
      DOC: halog: Move the `-qry` parameter into the correct section in help text
      MINOR: halog: Rename -qry to -query
      CLEANUP: halog: Use consistent indentation in help()
      BUG/MINOR: halog: Add missing newlines in die() messages
      MINOR: halog: Add support for extracting captures using -hdr

William Lallemand (1):
      Revert "CLEANUP: server: always include the storage for SSL settings"

Willy Tarreau (29):
      CLEANUP: server: always include the storage for SSL settings
      CLEANUP: sample: rename sample_conv_var2smp() to *_sint
      CLEANUP: sample: uninline sample_conv_var2smp_str()
      MINOR: sample: provide a generic var-to-sample conversion function
      BUG/MEDIUM: sample: properly verify that variables cast to sample
      MINOR: resolvers: fix the resolv_str_to_dn_label() API about trailing zero
      BUG/MEDIUM: resolver: make sure to always use the correct hostname length
      BUG/MINOR: resolvers: do not reject host names of length 255 in SRV records
      MINOR: resolvers: fix the resolv_dn_label_to_str() API about trailing zero
      BUG/MEDIUM: resolvers: fix truncated TLD consecutive to the API fix
      BUG/MEDIUM: resolvers: use correct storage for the target address
      MINOR: resolvers: merge address and target into a union "data"
      BUG/MAJOR: resolvers: add other missing references during resolution removal
      BUILD: resolvers: avoid a possible warning on null-deref
      BUG/MEDIUM: resolvers: always check a valid item in query_list
      BUG/MAJOR: buf: fix varint API post- vs pre- increment
      BUG/MINOR: task: do not set TASK_F_USR1 for no reason
      BUG/MINOR: mux-h2: do not prevent from sending a final GOAWAY frame
      BUG/MEDIUM: lua: fix memory leaks with realloc() on non-glibc systems
      MINOR: memprof: report the delta between alloc and free on realloc()
      MINOR: memprof: add one pointer size to the size of allocations
      CLEANUP: resolvers: do not export resolv_purge_resolution_answer_records()
      CLEANUP: always initialize the answer_list
      CLEANUP: resolvers: simplify resolv_link_resolution() regarding requesters
      CLEANUP: resolvers: replace all LIST_DELETE with LIST_DEL_INIT
      MEDIUM: resolvers: use a kill list to preserve the list consistency
      MEDIUM: resolvers: remove the last occurrences of the "safe" argument
      BUG/MINOR: sample: fix backend direction flags consecutive to last fix
      SCRIPTS: git-show-backports: re-enable file-based filtering

---
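
For those wondering what the Lua realloc() issue mentioned in the notes
looks like in practice, here is a minimal sketch of a portable allocator
callback. It follows the pattern shown in the Lua reference manual and is
not necessarily the exact code of the fix; the name sketch_lua_alloc() is
made up for this illustration. The point is simply that a request for size
zero is turned into an explicit free() instead of trusting realloc(ptr, 0)
to release the block.

```c
#include <stdlib.h>

/* Allocator with the signature Lua expects for its lua_Alloc callback:
 *   ud    : opaque user data (unused in this sketch)
 *   ptr   : block to resize or free, or NULL for a fresh allocation
 *   osize : previous size of the block (unused in this sketch)
 *   nsize : requested size, where 0 means "free the block"
 * The function name is hypothetical.
 */
static void *sketch_lua_alloc(void *ud, void *ptr, size_t osize, size_t nsize)
{
    (void)ud;
    (void)osize;

    if (nsize == 0) {
        /* Do not count on realloc(ptr, 0) freeing the block: that
         * behaviour is not guaranteed outside of glibc. free(NULL)
         * is a no-op, so a NULL ptr is fine here.
         */
        free(ptr);
        return NULL;
    }

    /* realloc(NULL, nsize) behaves like malloc(nsize). */
    return realloc(ptr, nsize);
}
```

Such a callback would typically be handed to lua_newstate() when creating
the Lua state.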
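
Similarly, the "Authorization" parsing issue from the notes can be
illustrated with a small sketch that tolerates any number of spaces between
the scheme and the value. This is only an illustration under simple
assumptions (a NUL-terminated header value, spaces as the only separator);
sketch_auth_value() is a hypothetical helper and not HAProxy's parser.

```c
#include <stddef.h>

/* Splits an Authorization header value such as "Basic   dXNlcjpwYXNz".
 * On success, returns a pointer to the first byte of the credentials and
 * stores the length of the scheme in *scheme_len. Returns NULL if the
 * scheme or the credentials are missing. Hypothetical helper.
 */
static const char *sketch_auth_value(const char *hdr, size_t *scheme_len)
{
    const char *p = hdr;

    /* The scheme runs up to the first space. */
    while (*p && *p != ' ')
        p++;

    *scheme_len = (size_t)(p - hdr);
    if (*scheme_len == 0 || *p == '\0')
        return NULL;

    /* Skip every separating space, not just the first one. */
    while (*p == ' ')
        p++;

    return *p ? p : NULL;
}
```

With this approach "Basic dXNlcjpwYXNz" and "Basic   dXNlcjpwYXNz" yield the
same scheme length and the same credentials.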