Hi,

HAProxy 2.5-dev11 was released on 2021/10/22. It added 50 new commits
after version 2.5-dev10.

This version essentially focuses on bug fixes, most of which will deserve
a backport, and that's a good thing: it indicates the code is improving
and is in good shape already. Among the most visible ones, I can mention
the following:

  - new batch of fixes on resolvers (22 patches or so!). Several causes of
    crashes and memory corruptions were found there, despite the amount of
    fixes in the last months. After all that painful work, an improvement
    was made to include a tree-based lookup of SRV records in order to
    avoid huge latencies and crashes on moderately large setups. In my
    latest test, CPU usage dropped from 70 to 4%, so depending on whether
    users still experience crashes due to excessive CPU usage or not, we
    might even have to backport these improvements (I hope not).

  - an interesting bug was found in rings, causing a crash in the rare
    case of a single-byte overflow when writing a variable integer
    (post-increment instead of pre-increment). It used to work because the
    reader and writer had the same bug. It could have caused occasional
    memory corruption and crashes for those using a lot of traces or
    sending logs over TCP. Here it was discovered with the startup logs
    that are also sent to a buffer.

  - a memory leak could happen with Lua on non-glibc systems where
    realloc(ptr, 0) does not cause a free() of the pointer as documented
    in the glibc man page. This only happened since 2.4 where the code was
    simplified.

  - there was a possibility of hung streams when I/O errors were faced on
    filters. I think this could impact the cache or the compression,
    though I'm not certain.

  - HTTP health checks (which are now internally based on tcp-checks)
    could fail to report a correct error in case of HTTP parsing error, in
    which case they would only see a timeout due to the response being
    lost between layers. There could also be some transient CPU spinning
    loops while waiting for the response in this case.

  - After the H2 mux returned a GOAWAY error, it would close the
    connection, but if a client was still sending, it could miss the
    GOAWAY frame due to TCP dropping the pending output data and sending
    a reset when seeing incoming data. This was the cause for random
    failures with h2spec. I can't think of a scenario where a regular
    client could notice this, though if anyone notices any improvement
    I'm interested of course.

  - The httpclient entry was missing in the management doc. Also it now
    requires expert mode.

  - The build was broken on NetBSD (missing sys/time.h in some includes).

  - Some cleanups on idle connections were performed (e.g. do not compute
    the connection hash when reuse is disabled such as with TCP), and
    traces added to help troubleshooting when needed.

  - the QUIC default max packet size, albeit respecting the spec, was too
    small for some clients such as Firefox or ngtcp2. It was raised to 2kB
    but this point will be cleared up later. Keep in mind that this is
    still extremely experimental and that a lot of features are still
    missing, including some error handling. Some doc will be provided soon
    to guide users through the build of a patched SSL library to support
    QUIC.

  - a few cleanups were applied to JWT.

I noticed after tagging the release that there were a few pending patches
that I missed, I'm sorry about that, I had my brain completely washed by
the resolvers mess and totally forgot to check the list over the full
week! I'll go review them and likely merge the ones that need to be.

The same pending issues that I mentioned last week still hold. Christopher
almost finished addressing the set-src/set-dst stuff that's causing some
grief to its users so that will allow us to have it in 2.5 before deciding
whether it needs to be backported or not.

If some users think about some traps that would deserve a warning or at
least a diag message, it's not too late.
Diags could be added anywhere during a maintenance release, but I don't
want to add warnings after a release, so if you have one in mind, please
do not wait to suggest it.

I noticed that some very old issues reported on github correspond to
problems that were fixed, or that we had kept open because of a suggestion
to implement something that has since been done. I've closed a few of them
and/or asked if they were still relevant. I think that we need to force
ourselves to be a bit more rigorous when we keep them open because one
covers several points, and maybe ask the reporter to split them into
multiple issues and close what's already done. At the moment their number
is growing faster than we can close them and that causes me a concern,
especially seeing that I'm now having to keep 30-40 tabs open in my
browser just for the recent issues I'm following and that I had to install
a plugin to help me deduplicate them :-/

I do not pretend I have any solution to propose, I just want to raise this
concern so that we can start to think about what can be improved in the
process. We definitely do not want to reach 1000 open issues, as we know
the oldest 950 ones could be closed without anyone remembering what they
were about. There's no emergency here as it's still manageable, but it
would be nice if we manage to improve this before 2.6 gets released, i.e.
let's give ourselves 6 more months.

Based on the number of bugs fixed, it's clear that we'll have to issue
another 2.4 very soon, maybe this week-end if I feel brave enough, or
early next week, and that older versions will have to follow. For those
running on stable snapshots, the 2.4-stable branch is now up to date with
latest fixes, so tomorrow's nightly snapshot will already contain all
known fixes.
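
For those wondering about the Lua leak mentioned above, here is a minimal
sketch of the general idea, not the actual patch. The C standard leaves
realloc(ptr, 0) implementation-defined, so a wrapper can free explicitly
instead of relying on the libc. The names portable_realloc() and
lua_style_alloc() are made up for this illustration; only the argument
list mirrors the documented Lua allocator callback (lua_Alloc).

```c
#include <stdlib.h>

/* Hypothetical helper (not HAProxy's code): resize a block, but treat a
 * zero size as an explicit free() so the result does not depend on how
 * the C library handles realloc(ptr, 0).
 */
void *portable_realloc(void *ptr, size_t nsize)
{
	if (nsize == 0) {
		free(ptr);	/* free(NULL) is a no-op */
		return NULL;
	}
	return realloc(ptr, nsize);
}

/* Same idea with the argument list of a Lua allocator: Lua expects the
 * block to be released and NULL to be returned when nsize is 0.
 */
void *lua_style_alloc(void *ud, void *ptr, size_t osize, size_t nsize)
{
	(void)ud;	/* opaque user data, unused in this sketch */
	(void)osize;	/* previous size, unused in this sketch */
	return portable_realloc(ptr, nsize);
}
```

With such a wrapper the zero-size case always releases the block and
returns NULL, whatever the underlying realloc() does.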

Please find the usual URLs below :
   Site index       : http://www.haproxy.org/
   Discourse        : http://discourse.haproxy.org/
   Slack channel    : https://slack.haproxy.org/
   Issue tracker    : https://github.com/haproxy/haproxy/issues
   Wiki             : https://github.com/haproxy/wiki/wiki
   Sources          : http://www.haproxy.org/download/2.5/src/
   Git repository   : http://git.haproxy.org/git/haproxy.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy.git
   Changelog        : http://www.haproxy.org/download/2.5/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/

Willy
---
Complete changelog :
Amaury Denoyelle (5):
      BUILD: fix compilation on NetBSD
      MINOR: backend: add traces for idle connections reuse
      BUG/MINOR: backend: fix improper insert in avail tree for always reuse
      MINOR: backend: improve perf with tcp proxies skipping idle conns
      MINOR: connection: remove unneeded memset 0 for idle conns

Björn Jacke (1):
      MINOR: add ::1 to predefined LOCALHOST acl

Christopher Faulet (2):
      BUG/MEDIUM: stream: Keep FLT_END analyzers if a stream detects a channel error
      BUG/MEDIUM: tcpcheck: Properly catch early HTTP parsing errors

Emeric Brun (2):
      BUG/MAJOR: dns: tcp session can remain attached to a list after a free
      BUG/MAJOR: dns: attempt to lock globaly for msg waiter list instead of use barrier

Frédéric Lécaille (1):
      MINOR: quic: Increase the size of handshake RX UDP datagrams

Ilya Shipitsin (1):
      CLEANUP: assorted typo fixes in the code and comments

Remi Tricot-Le Breton (3):
      MINOR: jwt: Empty the certificate tree during deinit
      MINOR: jwt: jwt_verify returns negative values in case of error
      MINOR: jwt: Do not rely on enum order anymore

Tim Duesterhus (5):
      DEV: coccinelle: Add strcmp.cocci
      CLEANUP: Apply strcmp.cocci
      CI: Add `permissions` to GitHub Actions
      CI: Clean up formatting in GitHub Action definitions
      CLEANUP: Consistently `unsigned int` for bitfields

William Lallemand (2):
      MINOR: httpclient/cli: access should be only done from expert mode
      DOC: management: doc about the CLI httpclient

Willy Tarreau (28):
      MEDIUM: resolvers: lower-case labels when converting from/to DNS names
      MEDIUM: resolvers: replace bogus resolv_hostname_cmp() with memcmp()
      CLEANUP: dns: always detach the appctx from the dns session on release
      DEBUG: dns: add a few more BUG_ON at sensitive places
      BUG/MAJOR: resolvers: add other missing references during resolution removal
      CLEANUP: resolvers: do not export resolv_purge_resolution_answer_records()
      BUILD: resolvers: avoid a possible warning on null-deref
      BUG/MEDIUM: resolvers: always check a valid item in query_list
      CLEANUP: always initialize the answer_list
      CLEANUP: resolvers: simplify resolv_link_resolution() regarding requesters
      CLEANUP: resolvers: replace all LIST_DELETE with LIST_DEL_INIT
      MEDIUM: resolvers: use a kill list to preserve the list consistency
      MEDIUM: resolvers: remove the last occurrences of the "safe" argument
      BUG/MEDIUM: checks: fix the starting thread for external checks
      MEDIUM: resolvers: replace the answer_list with a (flat) tree
      MEDIUM: resolvers: hash the records before inserting them into the tree
      BUG/MAJOR: buf: fix varint API post- vs pre- increment
      OPTIM: resolvers: move the eb32 node before the data in the answer_item
      MINOR: list: add new macro LIST_INLIST_ATOMIC()
      OPTIM: dns: use an atomic check for the list membership
      BUG/MINOR: task: do not set TASK_F_USR1 for no reason
      BUG/MINOR: mux-h2: do not prevent from sending a final GOAWAY frame
      MINOR: connection: add a new CO_FL_WANT_DRAIN flag to force drain on close
      MINOR: mux-h2: perform a full cycle shutdown+drain on close
      CLEANUP: resolvers: get rid of single-iteration loop in resolv_get_ip_from_response()
      BUG/MEDIUM: lua: fix memory leaks with realloc() on non-glibc systems
      MINOR: memprof: report the delta between alloc and free on realloc()
      MINOR: memprof: add one pointer size to the size of allocations

---