On Thu, Apr 27, 2023 at 04:59:24PM +0200, Christopher Faulet wrote:
> Hi,
>
> HAProxy 2.7.7 was released on 2023/04/27. It added 163 new commits
> after version 2.7.6.
>
> This release is pretty huge. In one month, the QUIC team did amazing
> work to improve the stack and make it more stable. A big thanks to Tristan
> for his priceless help. More than half of the commits concern the QUIC
> stack. It is hard to sum up all the changes. Many bugs were fixed, the most
> visible being:
>
>   * The congestion algorithms' state was shared between connections instead
>     of being private.
>
>   * HTTP/1.0 responses with an unknown content length and terminated by a
>     close were not properly handled. This was considered an early close by
>     the QUIC multiplexer, leading to a RESET_STREAM emission.
>
>   * The streams fairness was improved to prevent timeouts. A stream sending a
>     large object could block other streams. With a small client timeout,
>     blocked streams could be aborted via a RESET_STREAM.
>
>   * Some contradictions in the code could lead to very long loops sending
>     empty packets (PADDING-only packets). One visible effect was a very low
>     throughput when the client serialized its requests.
>
>   * The control window in congestion algorithms could be zero because of a
>     wrong calculation and could lead to a SIGFPE crash.
>
>   * Padding was missing in very short probe packets.
>
>   * Possible memory leaks were fixed.
>
> And of course, some improvements were brought. The main one is the support
> of thread load balancing on accept. A series of changes made it possible to
> use the default mechanism to accept and handle connections. The least loaded
> thread is now selected, improving the global performance of the QUIC
> stack. In addition, a timer was added to delay the acknowledgments.
>
> A recent refactoring of the stream-connector layer introduced a regression
> in 2.7.4. The read timer was no longer rearmed if the end of the
> message was reached.
> This change was introduced to avoid server timeouts
> when the server replies before the end of the request. But it revealed
> several bugs; some were fixed, but others are pretty hard to fix without
> changing some internals. That is too sensitive for a stable version. Thus,
> for now, we decided to revert this change while waiting for a better
> solution. Note that it is not a big deal because we only restore a behavior
> that has been there for ages.
>
> On soft-stop or reload, idle DNS sessions are now killed. Since 2.7.5,
> these sessions were no longer killed, preventing the process from finishing.
> In addition, we now force the connect timeout for the DNS resolution. The
> "resolve" timeout is used to set its value. Having no connect timeout was an
> issue for resolution over TCP. Connection failures might take quite long to
> be reported, leading to an excess of unusable DNS sessions in connecting
> state. It was especially visible on soft-stop because this prevented the
> process from exiting quickly. Still on the DNS, errors are now properly
> handled when a response is consumed. This was an issue for truncated
> responses followed by an abort. The applet could ignore the abort and loop
> waiting for more data until a timeout was triggered. A similar issue was
> fixed in the syslog applet.
>
> Several bugs in the Lua part were fixed. First, except for Lua tasks, it is
> no longer possible to register functions at runtime. It was clearly stated
> in the documentation, but nothing forbade it in the code. An error is now
> triggered if this happens, preventing potential segfaults. Memory leaks on
> references were fixed and the Lua locking was simplified to be re-entrant to
> prevent deadlocks.
>
> Aurélien fixed several issues in server management. The "visible"
> server list consistency was fixed. It was possible, at least in theory, to
> access an invalid server if several dynamic server deletions were performed
> while the list was accessed.
> For instance, it might happen when the server
> list was dumped in the stats. He also fixed a wrong report for tracking
> servers leaving the drain state. Finally, he centralized proxy and server
> stats updates on server state transitions to be sure not to miss an update
> on some transitions.
>
> The fixes for the issues that were occasionally met around the use of
> malloc_trim(), which had been addressed in 2.8, were finally backported
> after one month of exposure in 2.8. The issue was that not only could
> malloc_trim() still sometimes be used when jemalloc (or any other allocator)
> was used, but our attempts at plugging these special cases didn't work when
> linking with external libs that also explicitly call it. In the end, the
> opposite was done: we redefine our own version of malloc_trim(), which
> contains the tests for the presence of an alternate lib, and calls the
> original if the allocator comes from libc, or calls the equivalent function
> from other allocators. This way external libs that would use it are safe as
> well.
>
> The mixed library version detection in dlopen() was still a bit sensitive,
> and could sometimes detect anomalies related to an external lib depending on
> libcrypto but not libssl for example, as well as libs that were linked with
> an ABI-compatible version of the lib, but not exactly the same one. The
> tests were improved to only validate the grouped presence of a combination
> of relevant symbols that allow distinguishing between different ABI
> versions. It manages to catch libs loaded from Lua that were compiled
> against a totally different libcrypto version without being triggered when
> the ABI is compatible, which was the initial purpose of the test. Typically,
> loading luaossl built against openssl with haproxy built against quictls is
> properly detected as an error, but loading luaossl built against a slightly
> older but compatible libssl version than haproxy's (or conversely) is
> OK.
> Similarly, this has been in 2.8 for one month and a few -dev versions,
> and some users continue to experience problems in 2.7, so it was about time
> to backport this.
>
> The rest is the usual bunch of bug fixes:
>
>   * In the H2 multiplexer, connection errors are now properly detected
>     during handshake. This avoids inserting an invalid connection into an
>     idle list. It can be an issue if such a connection is the only idle
>     connection. If the traffic is too low to create new connections but
>     sufficient to always reuse it before purging it, no connection to the
>     server is possible.
>
>   * It was possible to trigger the watchdog when purging stick-tables on
>     soft-stop. To not spend too much time purging expired entries, we now
>     enforce a budget limitation and the purge is performed in several
>     steps. In addition, memory is reclaimed only when entries are
>     released. Indeed, this operation involves a call to malloc_trim() on
>     glibc, which is rather expensive.
>
>   * NUMA topology detection on FreeBSD was fixed.
>
>   * It was not possible to use the Lua filter API in conjunction with a
>     "wait-for-body" action. Switching the HTTP message to the DATA state
>     prevented the call to most Lua filter functions. It was fixed by
>     keeping the HTTP message in the BODY state at this stage.
>
>   * The read expiration date is now updated on synchronous sends for all
>     streams except independent ones. This fixed an old bug when a filter is
>     configured. Write activities on synchronous sends were lost. With slow
>     clients uploading large objects, it was possible to reach the server
>     timeout.
>
>   * More internal variables are unset from the program section. More
>     specifically, HAPROXY_STARTUPLOGS_FD, HAPROXY_MWORKER_WAIT_ONLY and
>     HAPROXY_PROCESSES variables are now unset.
>
>   * ssl-min-ver and ssl-max-ver parameters are now duplicated for bundles in
>     crt-list.
>
>   * A warning is now emitted during configuration parsing if a header field
>     name contains forbidden chars in an HTTP action. A special case is made
>     for the colon at the beginning so that it remains possible to place any
>     future pseudo-headers that may appear.
>
>   * An error is reported during configuration parsing when the "len"
>     argument of a stick table type contains incorrect characters.
>
>   * DeviceAtlas compile options were updated to support the API v3 from
>     3.1.7 and onwards.
>
>   * The strict-sni documentation was updated to state it is possible to
>     start without certificate on a bind line.
>
> Thanks everyone for your help and your contributions!
And thanks for going through the pain of reviewing and summarizing all of these!

Willy
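
For readers who want a picture of the stick-table purge budget mentioned in the quoted announcement, here is a minimal sketch in C. It is not HAProxy's code and makes no claim about how the real stick-table code is organized: `struct entry`, `purge_step()`, the flat array and the tick values are all hypothetical names and simplifications chosen for illustration. It only shows the general idea of visiting a bounded number of entries per call and resuming later.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical entry, for illustration only (not a real HAProxy type). */
struct entry {
    unsigned int expire; /* expiration date, in arbitrary ticks */
    int in_use;          /* 1 while the slot holds a live entry */
};

/*
 * Visit at most <budget> entries starting at *cursor and release those
 * whose expiration date is not later than <now>. Returns the number of
 * entries released during this step. *cursor is advanced so that the
 * next call resumes where this one stopped; it equals <count> once the
 * whole table has been scanned.
 */
size_t purge_step(struct entry *tab, size_t count, size_t *cursor,
                  unsigned int now, size_t budget)
{
    size_t released = 0;

    assert(budget > 0); /* a zero budget would never make progress */

    while (*cursor < count && budget > 0) {
        struct entry *e = &tab[*cursor];

        if (e->in_use && e->expire <= now) {
            e->in_use = 0; /* stands in for freeing the entry */
            released++;
        }
        (*cursor)++;
        budget--;
    }
    return released;
}
```

A caller would invoke `purge_step()` repeatedly, yielding between calls, until the cursor reaches the end of the table. Since the function returns how many entries were released, the caller can skip an expensive memory-reclaim operation when a step released nothing, which matches the "memory is reclaimed only when entries are released" remark above.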