[ANNOUNCE] haproxy-2.3.18

Christopher Faulet Wed, 02 Mar 2022 01:53:56 -0800

Hi,

HAProxy 2.3.18 was released on 2022/03/02. It added 43 new commits
after version 2.3.17.


The 2.4.13 and 2.4.14 announcements already explained all the fixes included
in this release, so I'm copy-pasting the relevant parts below.

The main issues fixed in this version are:

  - a tiny race condition in the scheduler affecting the rare multi-threaded
    tasks. In some cases, a task could be finishing its run on one thread
    and expiring on another one, and get requeued at a position that the
    thread finishing with it was still calculating. The most likely case
    was the peers task
    disabling the expiration while waiting for other peers to be locked,
    causing such a non-expirable task to be queued and to block all other
    timers from expiring (typically health checks, peers and resolvers, but
    others were affected). This could only happen at high peers traffic rate
    but it definitely did. When built with the suitable options such as
    DEBUG_STRICT it would immediately crash (which is how it was
    detected). This bug was present since 2.0.

  - a bug in the Set-Cookie2 response parser may result in an infinite loop
    triggering the watchdog if a server sends this while it belongs to a
    backend configured with cookie persistence. Usually cookie-based
    persistence is not used with untrusted servers, but if that was the
    case, the following rule would be usable as a workaround for the time it
    takes to upgrade:

         http-response del-header Set-Cookie2

    It reminded us that 2.5 years ago we were discussing completely
    dropping Set-Cookie2, which never succeeded in the field. Tim has opened
    an issue so that we don't forget to remove it after 2.6. The bug was
    diagnosed, reported and fixed by Andrew McDermott and Grant Spence.
    It has been there since 1.9.

  - a bug in the SPOE error handling. When a connection to an agent dies,
    there may still be requests pending that are tied to this connection.
    The list of such requests is scanned so that they can be aborted, except
    that the condition to scan the list was incorrect, and when these
    requests were finally aborted upon processing timeout, they were
    updating the memory area they used to point to, which could have been
    reused for anything, causing random crashes very commonly seen in libc's
    malloc/free via openssl, or haproxy pools with corrupted pointers. In
    short, anyone using SPOE must absolutely update to apply the fix,
    otherwise any bug they face cannot be trusted, as we know there's a rare
    but real case of memory corruption there. This bug has been present since
    1.8.

  - a bug in the H2 multiplexer. An error during the response processing,
    after the HEADERS frame parsing, led to a wakeup loop consuming all the
    CPU because the error was not properly reported to the upper layer. For
    instance, this happened if an invalid header value, an invalid status
    code or a forbidden header was found in the response. Note that only
    HAProxy versions >= 2.4 are affected by this issue.

  - there was a possible race condition on the listeners where it was
    sometimes possible to wake up a temporarily paused listener just after
    it had failed to rebind upon a failed attempt to reload. This would
    access fdtab[-1] causing memory corruption or crashes. It's been there
    since 2.2 but really started to have an effect with 2.3.

  - the master CLI could remain stuck forever if extra characters followed
    by a shutdown were sent before the end of a response. In this case, each
    such connection would remain unusable, and a script doing this would
    face a connection failure after the 10th attempt (master's maxconn). A
    few related issues could also cause it to loop forever (e.g. too long
    pipelined requests, and empty buffers after wrapping).

  - a FD leak on reload failures. When the master process is reloaded on a
    new config, it will try to connect to the previous process' socket to
    retrieve all known listening FDs to be reused by the new listeners. If
    listeners were removed, their unused FDs are simply closed. However
    there's a catch. In case a socket fails to bind, the master will cancel
    its startup and switch to wait mode for a new operation to happen. In
    this case it didn't close the possibly remaining FDs that were left
    unused.

  - a FD leak of a sockpair upon a failed reload. When starting HAProxy in
    master-worker mode, the master pre-allocates a struct mworker_proc and
    calls socketpair() before the configuration parsing. If the configuration
    loading failed, these FDs were never closed because they aren't part of a
    listener; they are not even in the fdtab.

  - it was possible to temporarily lose the stats sockets upon reloads in
    master-worker mode in case of early error (e.g. missing config file),
    in which case the socket transfer from the older process couldn't
    happen.

  - some issues with buffer allocation errors. First, in the H1
    multiplexer: if we failed to send data because we failed to allocate the
    H1 output buffer, the H1 stream was erroneously woken up. This led to a
    wakeup loop trying to send more data, which is not possible without an
    output buffer. Then, in process_stream(): if we failed to allocate
    the channel response buffer while a connect or an analysis timeout
    occurred, the stream was woken up in a loop because its task was requeued
    with an expired date. Now an error is reported when this happens and the
    stream processing is interrupted.

    Note that there is a mechanism to deal with buffer allocation errors.
    Unfortunately, this mechanism has been broken since 1.7, and it is even
    worse now with the multiplexers. All this part must be refactored, but
    for now, HAProxy may be partially frozen if too many entities are
    waiting for a buffer.

  - some alignment problems that were found when using gcc-11 + RHEL8,
    resulting in instant crashes on startup.

  - an issue with multi-line ESMTP responses in the mailer code.

  - an issue in the resolvers code with domain names with a trailing
    dot. The trailing dot was not ignored as expected and a junk character
    was added at the end of the encoded part of the domain name.

  - there were still a number of other issues of lower level of importance,
    such as the CLI being extremely slow to parse pipelined requests because
    it was looking for the line feed first, hence the larger the buffer, the
    slower it was with batch updates like ACL/map updates; a possibly
    truncated pidfile in master mode; a bug with the data transfer in the
    HTX layer for large data blocks; an inconsistency with the parsing of
    IPv4 addresses.
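
To illustrate the pipelined CLI parsing cost mentioned in the last item, here
is a minimal Python sketch, not HAProxy's C code. The function names
next_command() and split_pipelined() are made up for this example, and it
assumes that ';' and a line feed both end a command while ignoring any
escaping. The point is only that stopping at the first delimiter of either
kind keeps the cost of each command proportional to its own length rather
than to the buffer size:

```python
def next_command(buf: bytes, start: int = 0, delims: bytes = b";\n"):
    """Return (command, next_start), scanning only up to the first delimiter.

    Returns (None, start) when no complete command is buffered yet.
    Assumption: ';' and '\\n' both end a command; escaping is ignored here.
    """
    for pos in range(start, len(buf)):
        if buf[pos] in delims:
            return buf[start:pos], pos + 1
    return None, start


def split_pipelined(buf: bytes):
    """Split a buffer of pipelined commands, resuming after each delimiter."""
    commands, start = [], 0
    while True:
        cmd, nxt = next_command(buf, start)
        if cmd is None:
            break  # incomplete trailing command: wait for more data
        if cmd.strip():
            commands.append(cmd.strip())
        start = nxt
    # Second element is the unparsed remainder left in the buffer.
    return commands, buf[start:]
```

Calling split_pipelined() on a buffer holding two complete commands and a
partial third returns the two commands plus the unparsed remainder, which
would stay in the buffer until more data arrives.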
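
For the resolvers item above, the expected handling of a trailing dot can be
shown with a small Python sketch of DNS name encoding (length-prefixed labels
terminated by a zero byte, as described in RFC 1035). This is not HAProxy's
implementation; encode_dns_name() is a hypothetical helper written for this
example, showing that both spellings of a name should produce the same bytes:

```python
def encode_dns_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending with a zero byte.

    Hypothetical helper for illustration only, not HAProxy code.
    """
    if name.endswith("."):
        name = name[:-1]  # ignore a single trailing dot
    out = bytearray()
    if name:
        for label in name.split("."):
            raw = label.encode("ascii")
            if not 1 <= len(raw) <= 63:
                raise ValueError(f"invalid label length in {name!r}")
            out.append(len(raw))  # one length byte per label
            out += raw
    out.append(0)  # the root label terminates the name
    return bytes(out)
```

With this sketch, "example.com" and "example.com." encode to identical bytes,
with no extra character after the last label.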

Note that the EOL of 2.3 is planned for the end of this quarter, so this
release is probably one of the last 2.3 releases. For everyone running
2.3, it could be a good idea to migrate to 2.4.

Thanks everyone for your help and your contributions!

Please find the usual URLs below :
   Site index       : http://www.haproxy.org/
   Discourse        : http://discourse.haproxy.org/
   Slack channel    : https://slack.haproxy.org/
   Issue tracker    : https://github.com/haproxy/haproxy/issues
   Wiki             : https://github.com/haproxy/wiki/wiki
   Sources          : http://www.haproxy.org/download/2.3/src/
   Git repository   : http://git.haproxy.org/git/haproxy-2.3.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy-2.3.git
   Changelog        : http://www.haproxy.org/download/2.3/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/


---
Complete changelog :
Andrew McDermott (1):
      BUG/MAJOR: http/htx: prevent unbounded loop in http_manage_server_side_cookies

Christopher Faulet (9):
      BUG/MEDIUM: htx: Adjust length to add DATA block in an empty HTX buffer
      BUG/MEDIUM: cli: Never wait for more data on client shutdown
      BUG/MINOR: sink: Use the right field in appctx context in release callback
      BUG/MEDIUM: resolvers: Really ignore trailing dot in domain names
      BUG/MEDIUM: htx: Be sure to have a buffer to perform a raw copy of a message
      BUG/MEDIUM: mux-h1: Don't wake h1s if mux is blocked on lack of output buffer
      BUG/MAJOR: mux-h2: Be sure to always report HTX parsing error to the app layer
      BUG/MEDIUM: stream: Abort processing if response buffer allocation fails
      REGTESTS: fix the race conditions in secure_memcmp.vtc

David Carlier (1):
      BUILD/MINOR: fix solaris build with clang.

Ilya Shipitsin (5):
      BUILD: adopt script/build-ssl.sh for OpenSSL-3.0.0beta2
      CI: github actions: add OpenSSL-3.0.0 builds
      CI: github actions: relax OpenSSL-3.0.0 version comparision
      CI: github actions: update OpenSSL to 3.0.1
      CI: github actions: use cache for SSL libs

Lukas Tribus (1):
      BUG/MINOR: mailers: negotiate SMTP, not ESMTP

William Lallemand (6):
      BUG/MINOR: mworker: does not erase the pidfile upon reload
      BUG/MINOR: mworker: fix a FD leak of a sockpair upon a failed reload
      BUILD: fix compilation for OpenSSL-3.0.0-alpha17
      CI: github actions: -Wno-deprecated-declarations with OpenSSL 3.0.0
      CI: github: switch to OpenSSL 3.0.0
      BUG/MINOR: tools: url2sa reads ipv4 too far

Willy Tarreau (20):
      MEDIUM: cli: yield between each pipelined command
      MINOR: channel: add new function co_getdelim() to support multiple delimiters
      BUG/MINOR: cli: avoid O(bufsize) parsing cost on pipelined commands
      BUG/MEDIUM: mcli: do not try to parse empty buffers
      BUG/MEDIUM: mcli: always realign wrapping buffers before parsing them
      BUG/MEDIUM: mworker: don't lose the stats socket on failed reload
      MINOR: listener: replace the listener's spinlock with an rwlock
      BUG/MEDIUM: listener: read-lock the listener during accept()
      BUG/MAJOR: spoe: properly detach all agents when releasing the applet
      MINOR: sock: move the unused socket cleaning code into its own function
      BUG/MEDIUM: mworker: close unused transferred FDs on load failure
      BUG/MEDIUM: fd: always align fdtab[] to 64 bytes
      CI: ssl: enable parallel builds for OpenSSL on Linux
      CI: ssl: do not needlessly build the OpenSSL docs
      CI: ssl: keep the old method for ancient OpenSSL versions
      CLEANUP: atomic: add a fetch-and-xxx variant for common operations
      BUG/MINOR: task: do not set TASK_F_USR1 for no reason
      BUG/MAJOR: sched: prevent rare concurrent wakeup of multi-threaded tasks
      BUILD/MINOR: sched: drop the DEBUG_TASK parts from latest fix
      CI: github actions: add the output of $CC -dM -E

--
Christopher Faulet
