On Thu, Apr 27, 2023 at 04:59:24PM +0200, Christopher Faulet wrote:
> Hi,
> 
> HAProxy 2.7.7 was released on 2023/04/27. It added 163 new commits
> after version 2.7.6.
> 
> This release is pretty huge. In one month, the QUIC team did an amazing
> job improving the stack and making it more stable. A big thanks to Tristan
> for his priceless help. More than half of the commits concern the QUIC
> stack. It is hard to sum up all the changes. Many bugs were fixed; the most
> visible are:
> 
>  * The congestion algorithm state was shared between connections instead
>    of being private.
> 
>  * HTTP/1.0 responses with an unknown content length and finished on close
>    were not properly handled. They were treated as an early close by the
>    QUIC multiplexer, leading to a RESET_STREAM emission.
> 
>  * Stream fairness was improved to prevent timeouts. A stream sending a
>    large object could block other streams. With a small client timeout,
>    blocked streams could be aborted via a RESET_STREAM.
> 
>  * Some contradictions in the code could lead to very long loops sending
>    empty packets (PADDING-only packets). One visible effect was very low
>    throughput when the client serialized its requests.
> 
>  * The control window in congestion algorithms could be zero because of a
>    wrong calculation and could lead to a SIGFPE crash.
> 
>  * Padding was missing in very short probe packets
> 
>  * Possible memory leaks were fixed
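
To illustrate the zero-window crash class mentioned above: an integer
division whose divisor is a window that was computed as zero typically
raises SIGFPE. Below is a minimal sketch in C, not HAProxy's actual code.
The names cwnd_after_loss(), window_usage_pct() and MIN_CWND, as well as
the value 2400, are hypothetical and only serve the illustration. The idea
is simply to clamp the window to a floor before it can be used as a divisor.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lower bound for the window, in bytes (illustrative value). */
#define MIN_CWND ((size_t)2400)

/* Shrink the window to shrink_pct percent of its size after a loss,
 * never letting the result fall below MIN_CWND.
 */
static size_t cwnd_after_loss(size_t cwnd, unsigned int shrink_pct)
{
    size_t next = cwnd * shrink_pct / 100;

    return next < MIN_CWND ? MIN_CWND : next;
}

/* Percentage of the window currently in flight. The window is the
 * divisor here, so it must never be zero.
 */
static unsigned int window_usage_pct(size_t in_flight, size_t cwnd)
{
    assert(cwnd != 0);
    return (unsigned int)(in_flight * 100 / cwnd);
}
```

With the clamp in place, window_usage_pct(n, cwnd_after_loss(w, p)) is safe
for any w and p, including a window that would otherwise collapse to zero.
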
> 
> And of course, some improvements were brought. The main one is the support
> of thread load-balancing on accept. A series of changes made it possible to
> use the default mechanism to accept and handle connections. The least loaded
> thread is now selected, improving the global performance of the QUIC
> stack. In addition, a timer was added to delay the acknowledgments.
> 
> A recent refactoring of the stream-connector layer introduced a regression
> in 2.7.4. The read timer was no longer rearmed if the end of the
> message was reached. This change was introduced to avoid server timeouts
> when the server replies before the end of the request. But it revealed
> several bugs. Some were fixed, but others are pretty hard to fix without
> changing some internals, which is too sensitive for a stable version. Thus
> for now we decided to revert this change, waiting for a better solution. Note
> that it is not a big deal because we only restore a behavior that has been
> there for ages.
> 
> On soft-stop or reload, idle DNS sessions are now killed. Since 2.7.5,
> these sessions were no longer killed, preventing the process from
> finishing. In addition, we now force the connect timeout for the DNS
> resolution. The "resolve" timeout is used to set its value. Having no
> connect timeout was an issue for resolution over TCP. Connection failures
> might take quite long to report, leading to an excess of unusable DNS
> sessions in connecting state. It was especially visible on soft-stop because
> this prevented the process from exiting quickly. Still on the DNS, errors
> are now properly handled
> when a response is consumed. This was an issue for truncated responses
> followed by an abort. The applet could ignore the abort and loop waiting for
> more data until a timeout is triggered. A similar issue was fixed in the
> syslog applet.
> 
> Several bugs in the Lua part were fixed. First, except for Lua tasks, it is
> no longer possible to register functions at runtime. It was clearly stated
> in the documentation, but nothing forbade it in the code. An error is now
> triggered if this happens, preventing potential segfaults. Memory leaks on
> references were fixed and the Lua locking was simplified to be re-entrant to
> prevent deadlocks.
> 
> Aurélien fixed several issues in server management. The "visible"
> server list consistency was fixed. It was possible, at least in theory, to
> access an invalid server if several dynamic server deletions were performed
> while the list was accessed. For instance it might happen when the server
> list was dumped in the stats. He also fixed a wrong report for tracking
> servers leaving drain state. Finally, he centralized proxy and server stats
> updates on server state transitions to be sure not to miss an update on some
> transitions.
> 
> The issues that were occasionally met around the use of malloc_trim() that
> had been addressed in 2.8 were finally backported after one month of
> exposure in 2.8. The issue was that not only could malloc_trim() still
> sometimes be used when jemalloc (or any other allocator) was used, but our
> attempts at plugging these special cases didn't work when linking with
> external libs that also explicitly call it. In the end, the opposite was
> done: we redefine our own version of malloc_trim(), which contains the tests
> for the presence of an alternate lib, and call the original if the allocator
> comes from libc, or call the equivalent function from other allocators. This
> way external libs that would use it are safe as well.
> 
> The mixed library version detection in dlopen() was still a bit sensitive,
> and could sometimes detect anomalies related to an external lib depending on
> libcrypto but not libssl for example, as well as libs that were linked with
> an ABI-compatible version of the lib, but not exactly the same one. The
> tests were improved to only validate the grouped presence of a combination
> of relevant symbols that make it possible to distinguish between different
> ABI versions. It manages to catch libs loaded from Lua that were compiled
> against a totally different libcrypto version without being triggered when
> the ABI is compatible, which was the initial purpose of the test. Typically,
> loading luaossl built against openssl with haproxy built against quictls is
> properly detected as an error, but loading luaossl built against a slightly
> older but compatible libssl version than haproxy's (or conversely) is
> OK. Similarly this has been in 2.8 for one month and a few -dev versions,
> and some users continue to experience problems in 2.7 so it was about time
> to backport this.
> 
> The rest is the usual bunch of bug fixes:
> 
>   * In the H2 multiplexer, connection errors are now properly detected
>     during handshake. This avoids inserting an invalid connection into an
>     idle list. It can be an issue if such a connection is the only idle
>     connection. If the traffic is too low to create new connections but
>     sufficient to always reuse it before purging it, no connection to the
>     server is possible.
> 
>   * It was possible to trigger the watchdog when purging stick-tables on
>     soft-stop. To avoid spending too much time purging expired entries, we
>     now enforce a budget limitation and the purge is performed in several
>     steps. In addition, memory is reclaimed only when entries are
>     released. Indeed, this operation involves a call to malloc_trim() on
>     glibc, which is rather expensive.
> 
>   * NUMA topology detection on FreeBSD was fixed.
> 
>   * It was not possible to use the Lua filter API in conjunction with a
>     "wait-for-body" action. Switching the HTTP message to the DATA state
>     prevented calls to most of the Lua filter functions. It was fixed by
>     keeping the HTTP message in the BODY state at this stage.
> 
>   * The read expiration date is now updated on synchronous sends for all
>     streams except independent ones. This fixes an old bug that occurred
>     when a filter was configured: write activity on synchronous sends was
>     lost. With slow clients uploading large objects, it was possible to
>     reach the server timeout.
> 
>   * More internal variables are unset from program section. More
>     specifically, HAPROXY_STARTUPLOGS_FD, HAPROXY_MWORKER_WAIT_ONLY and
>     HAPROXY_PROCESSES variables are now unset.
> 
>   * ssl-min-ver and ssl-max-ver parameters are now duplicated for bundles in
>     crt-list.
> 
>   * A warning is now emitted during configuration parsing if a header field
>     name contains forbidden chars in an HTTP action. A special case is made
>     for the colon at the beginning so that it remains possible to place any
>     future pseudo-headers that may appear.
> 
>   * An error is reported during configuration parsing if the "len"
>     argument of a stick table type contains incorrect characters.
> 
>   * DeviceAtlas compile options were updated to support the API v3 from
>     3.1.7 and onwards.
> 
>   * The strict-sni documentation was updated to state it is possible to
>     start without certificate on a bind line.
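
As an illustration of the header field name check described in the list
above, here is a minimal sketch in C. It is not HAProxy's implementation:
it assumes "forbidden chars" means anything outside the RFC 9110 "token"
character set, and the function names is_tchar() and header_name_is_valid()
are hypothetical. A single leading colon is tolerated so that
pseudo-header-like names still pass.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if c is a "tchar" as defined by RFC 9110, else 0. */
static int is_tchar(unsigned char c)
{
    /* strchr() would match the terminating NUL, so exclude it first. */
    return c != '\0' &&
           (isalnum(c) || strchr("!#$%&'*+-.^_`|~", c) != NULL);
}

/* Return 1 if the name only contains token characters, optionally
 * preceded by one ':' (assumed special case), else 0.
 */
static int header_name_is_valid(const char *name)
{
    size_t i = (name[0] == ':') ? 1 : 0;

    if (name[i] == '\0')
        return 0; /* empty name, or a lone ':' */

    for (; name[i] != '\0'; i++) {
        if (!is_tchar((unsigned char)name[i]))
            return 0;
    }
    return 1;
}
```

A configuration parser following this approach would emit its warning
whenever header_name_is_valid() returns 0 for the name given to an HTTP
action.
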
> 
> Thanks everyone for your help and your contributions!

And thanks for going through the pain of reviewing and summarizing all of
these!

Willy
