Re: [ANNOUNCE] haproxy-2.7.7

2023-05-02 Thread Willy Tarreau
On Thu, Apr 27, 2023 at 04:59:24PM +0200, Christopher Faulet wrote:
> Hi,
> 
> HAProxy 2.7.7 was released on 2023/04/27. It added 163 new commits
> after version 2.7.6.
(...)

Just a follow-up for those who are only catching this thread now: if you
use QUIC, please do not use 2.7.7 but switch to 2.7.8 instead, and see
Amaury's announcement for explanations (a patch was missing).

Thanks,
Willy



Re: [ANNOUNCE] haproxy-2.7.7

2023-04-27 Thread Willy Tarreau
On Thu, Apr 27, 2023 at 04:59:24PM +0200, Christopher Faulet wrote:
> Hi,
> 
> HAProxy 2.7.7 was released on 2023/04/27. It added 163 new commits
> after version 2.7.6.
> 
> This release is pretty huge. In one month, the QUIC team did an amazing job
> improving the stack and making it more stable. A big thanks to Tristan for
> his priceless help. More than half of the commits concern the QUIC stack. It
> is hard to sum up all the changes. Many bugs were fixed; the most visible are:
> 
>  * The congestion algorithms' state was shared between connections instead of
>    being private.
> 
>  * HTTP/1.0 responses with an unknown content length and terminated by a
>    connection close were not properly handled. This was considered an early
>    close by the QUIC multiplexer, leading to a RESET_STREAM emission.
> 
>  * Stream fairness was improved to prevent timeouts. A stream sending a
>    large object could block other streams. With a small client timeout,
>    blocked streams could be aborted via a RESET_STREAM.
> 
>  * Some contradictions in the code could lead to very long loops sending
>    empty packets (PADDING-only packets). One visible effect was very low
>    throughput when the client serialized its requests.
> 
>  * The control window in congestion algorithms could be zero because of a
>    wrong calculation, which could lead to a SIGFPE crash.
> 
>  * Padding was missing in very short probe packets.
> 
>  * Possible memory leaks were fixed.
> 
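
To illustrate the zero-window item in the list above: on common platforms an
integer division by zero raises SIGFPE, so a window that is miscalculated as
zero and later used as a divisor crashes the process. The following is only a
minimal sketch of a defensive clamp, not the actual HAProxy fix. Every name
and the floor value are hypothetical and do not come from the HAProxy sources.

```c
#include <stdint.h>

/* Hypothetical floor for the window, in bytes (assumed value, for
 * illustration only). */
#define SKETCH_MIN_WINDOW 2400u

/* Never let the window fall below the floor, so it can safely be used
 * as a divisor afterwards. */
static uint64_t sketch_clamp_window(uint64_t window)
{
    return window < SKETCH_MIN_WINDOW ? SKETCH_MIN_WINDOW : window;
}

/* Typical additive-increase step: grow the window by roughly one
 * datagram per window's worth of acknowledged bytes. The division is
 * what would trap with SIGFPE if window were zero. */
static uint64_t sketch_window_increment(uint64_t acked_bytes,
                                        uint64_t datagram_size,
                                        uint64_t window)
{
    window = sketch_clamp_window(window);
    return acked_bytes * datagram_size / window;
}
```

With the clamp in place, a zero window is treated as the floor value instead
of reaching the division.
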
> And of course, some improvements were made. The main one is support for
> thread load balancing on accept. A series of changes made it possible to use
> the default mechanism to accept and handle connections. The least loaded
> thread is now selected, improving the global performance of the QUIC
> stack. In addition, a timer was added to delay the acknowledgments.
> 
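
For readers unfamiliar with the idea of picking the least loaded thread on
accept, here is a minimal sketch of the selection step. It is an assumption
about the general technique, not the HAProxy implementation: the structure,
the "load" metric and the function name are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical per-thread bookkeeping: "load" could be a connection
 * count or any other metric (assumption, for illustration only). */
struct sketch_thread {
    unsigned int load;
};

/* Return the index of the thread with the smallest load. Ties go to
 * the lowest index. The caller must pass count >= 1. */
static size_t sketch_pick_least_loaded(const struct sketch_thread *thr,
                                       size_t count)
{
    size_t best = 0;

    for (size_t i = 1; i < count; i++) {
        if (thr[i].load < thr[best].load)
            best = i;
    }
    return best;
}
```

A new connection would then be handed to the thread at the returned index,
and that thread's load counter updated accordingly.
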
> A recent refactoring of the stream-connector layer introduced a regression
> in 2.7.4. The read timer was no longer rearmed if the end of the
> message was reached. This change was introduced to avoid server timeouts
> when the server replies before the end of the request. But it revealed
> several bugs: some were fixed, but others are pretty hard to fix without
> changing some internals. That is too sensitive for a stable version. Thus for
> now we decided to revert this change, waiting for a better solution. Note
> that it is not a big deal because we only restore a behavior that has been
> there for ages.
> 
> On soft-stop or reload, idle DNS sessions are now killed. Since 2.7.5,
> these sessions were no longer killed, preventing the process from finishing.
> In addition, we now force the connect timeout for the DNS resolution. The
> "resolve" timeout is used to set its value. Having no connect timeout was an
> issue for resolution over TCP. Connection failures might take quite long to
> be reported, leading to an excess of unusable DNS sessions in the connecting
> state. It was especially visible on soft-stop because this prevented the
> process from exiting quickly. Still on the DNS, errors are now properly
> handled when a response is consumed. This was an issue for truncated
> responses followed by an abort. The applet could ignore the abort and loop
> waiting for more data until a timeout was triggered. A similar issue was
> fixed in the syslog applet.
> 
> Several bugs in the Lua part were fixed. First, except for Lua tasks, it is
> no longer possible to register functions at runtime. This restriction was
> clearly stated in the documentation, but nothing enforced it in the code. An
> error is now triggered if this happens, preventing potential segfaults.
> Memory leaks on references were fixed and the Lua locking was simplified to
> be re-entrant to prevent deadlocks.
> 
> Aurélien fixed several issues in server management. The "visible"
> server list consistency was fixed. It was possible, at least in theory, to
> access an invalid server if several dynamic server deletions were performed
> while the list was being accessed. For instance, it might happen when the
> server list was dumped in the stats. He also fixed a wrong report for
> tracking servers leaving the drain state. Finally, he centralized proxy and
> server stats updates on server state transitions to be sure not to miss an
> update on some transitions.
> 
> The issues that were occasionally met around the use of malloc_trim() that
> had been addressed in 2.8 were finally backported after one month of
> exposure in 2.8. The issue was that not only could malloc_trim() still
> sometimes be used when jemalloc (or any other allocator) was used, but our
> attempts at plugging these special cases didn't work when linking with
> external libs that also explicitly call it. In the end, the opposite was
> done: we redefine our own version of malloc_trim(), which contains the tests
> for the presence of an alternate lib, and call the original if the allocator
> comes from libc, or call the equivalent function from other allocators. This
> way external libs that would use