Hi,

HAProxy 2.2.11 was released on 2021/03/18. It added 31 new commits
after version 2.2.10.

This release comes a few days after 2.3.7 with the same aim: fixing the
resolvers part. In 2.2.10, some fixes to the expiration of cached records
revealed design problems leading to pretty annoying bugs around the SRV
records. Servers based on SRV resolution could enter a flapping loop,
switching from the UP state to DOWN and back to UP at each resolution. The
current design has clearly shown its limitations and a refactoring is
already planned. However, this will not be part of the stable releases.
Thus several patches were pushed to fix all known issues, based on the
current design.

First, the detection of cached additional records was fixed, just like the
record expiration loop, which fixes the flapping servers bug. The
triggering of extra DNS resolutions to overcome the lack of some additional
records in a SRV response was improved, as was the ability to stop them.
And finally, the time unit for cached records is now the millisecond. It
should mitigate some edge cases with short-lived records and heavy
resolution frequencies.

Thanks to the feedback of the community on 2.3.7, this all seems to be
behind us now. However, note that performance issues remain because of the
current design, and they may trigger the watchdog if you use a huge number
of servers in a server-template declaration. It is hard to say how many
servers it takes for a crash to occur. This will be addressed with the
resolvers refactoring. If possible, some fixes will be backported to stable
versions. But it might require too large a change to be feasible. It is
too early to know.

Note also that there is still an identified bug in the listeners part that
is not fixed yet. It is an AB/BA deadlock problem that may occur when
HAProxy is stopped. Maciej also reported on the mailing list a 100% CPU
usage bug lasting a few seconds on old processes after reloads. Not sure it
is the same bug. We are still investigating.
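For readers wondering whether the resolver fixes above concern them, here
is a minimal sketch of the kind of setup involved: a server-template whose
servers are filled in from a DNS SRV record. All names, the address and the
slot count are illustrative assumptions, not taken from a real deployment.

```haproxy
# Hypothetical example: section names, nameserver address and the SRV
# record name are placeholders.
resolvers mydns
    nameserver ns1 192.0.2.53:53
    hold valid 10s

backend be_app
    # 10 server slots populated from the SRV record. The watchdog issue
    # mentioned above concerns much larger slot counts.
    server-template app 10 _http._tcp.app.example.com resolvers mydns check init-addr none
```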
Here is the list of other fixes :

  * An issue leading to possible infinite loops because of a double locking
    effect in the mt lists was fixed by Olivier. In the MT_LIST_TRY_ADDQ()
    macro, it was possible to try to lock the same element twice, making
    the second lock attempt fail in a loop. It happened when there was
    exactly one element in the mt-list and we tried to add it again to the
    same list.

  * The filters part was fixed to be sure the end analyzer
    (flt_end_analyse) is always called for the request and the response,
    especially when the request analysis is finished before the response
    starts.

  * William fixed possible bugs in the listeners. Listeners are not
    necessarily present when the client is an applet (peers, spoe, Lua) and
    we need to be careful when updating counters. It is hard to say whether
    those could be triggered, but there was at least one way, which
    consists in adding TCP rules to an SPOE backend.

  * Two bugs in the tcpchecks were fixed. The first one was about
    agent-checks when mixed with health-checks. When an agent was setting
    the server state, the health-check's .health threshold was updated
    instead of the agent's one. Thus the agent was competing with the
    health-check and might mark a DOWN server as UP, while the health-checks
    should take precedence. The second bug was a double free error on an
    invalid tcp-check/http-check rule.

  * Emeric fixed a ref counting bug in the stick-tables. Setting multiple
    http track-scX rules generated never-expiring entries. This bug was
    introduced in 2.2, when the http actions management was refactored.

  * Willy fixed a bug in the frequency counters: they were using the
    thread's own time as the start of the current period, leading to
    non-monotonic updates in case of contention. See the commit message for
    details. Now, freq counters rely on a global monotonic time.

Thanks to everyone for this release!

Please find the usual URLs below :
   Site index       : http://www.haproxy.org/
   Discourse        : http://discourse.haproxy.org/
   Slack channel    : https://slack.haproxy.org/
   Issue tracker    : https://github.com/haproxy/haproxy/issues
   Wiki             : https://github.com/haproxy/wiki/wiki
   Sources          : http://www.haproxy.org/download/2.2/src/
   Git repository   : http://git.haproxy.org/git/haproxy-2.2.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy-2.2.git
   Changelog        : http://www.haproxy.org/download/2.2/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/

---
Complete changelog :
Baptiste Assmann (1):
      MINOR: resolvers: new function find_srvrq_answer_record()

Christopher Faulet (23):
      BUG/MINOR: hlua: Don't strip last non-LWS char in hlua_pushstrippedstring()
      BUG/MEDIUM: filters: Set CF_FL_ANALYZE on channels when filters are attached
      BUG/MINOR: tcpcheck: Update .health threshold of agent inside an agent-check
      BUG/MINOR: proxy/session: Be sure to have a listener to increment its counters
      BUG/MINOR: session: Add some forgotten tests on session's listener
      BUG/MINOR: tcpcheck: Fix double free on error path when parsing tcp/http-check
      Revert "BUG/MINOR: resolvers: Only renew TTL for SRV records with an additional record"
      BUG/MINOR: resolvers: Consider server to have no IP on DNS resolution error
      BUG/MINOR: resolvers: Reset server address on DNS error only on status change
      BUG/MINOR: resolvers: Unlink DNS resolution to set RMAINT on SRV resolution
      BUG/MEDIUM: resolvers: Don't set an address-less server as UP
      BUG/MEDIUM: resolvers: Fix the loop looking for an existing ADD item
      BUG/MINOR; resolvers: Ignore DNS resolution for expired SRV item
      BUG/MEDIUM: resolvers: Trigger a DNS resolution if an ADD item is obsolete
      MINOR: resolvers: Use a function to remove answers attached to a resolution
      MINOR: resolvers: Purge answer items when a SRV resolution triggers an error
      MINOR: resolvers: Add function to change the srv status based on SRV resolution
      MINOR: resolvers: Directly call srvrq_update_srv_state() when possible
      BUG/MEDIUM: resolvers: Don't release resolution from a requester callbacks
      BUG/MEDIUM: resolvers: Skip DNS resolution at startup if SRV resolution is set
      MINOR: resolvers: Use milliseconds for cached items in resolver responses
      MINOR: resolvers: Don't try to match immediatly renewed ADD items
      BUG/MINOR: resolvers: Add missing case-insensitive comparisons of DNS hostnames

Emeric Brun (1):
      BUG/MEDIUM: stick-tables: fix ref counter in table entry using multiple http tracksc.

Olivier Houchard (1):
      BUG/MEDIUM: lists: Avoid an infinite loop in MT_LIST_TRY_ADDQ().

William Lallemand (1):
      BUG/MEDIUM: session: NULL dereference possible when accessing the listener

Willy Tarreau (4):
      BUG/MINOR: ssl: don't truncate the file descriptor to 16 bits in debug mode
      CLEANUP: tcp-rules: add missing actions in the tcp-request error message
      MINOR: time: export the global_now variable
      BUG/MINOR: freq_ctr/threads: make use of the last updated global time

--
Christopher Faulet