Hi,

HAProxy 3.0.6 was released on 2024/11/07. It added 92 new commits
after version 3.0.5.

As usual, this release fixes a number of bugs. But a significant part of
the commits is dedicated to debugging. Over the last few months, most of our
time was spent on bugs, and it is becoming more and more urgent to improve
HAProxy observability in order to reduce the time spent on them. While this
debugging work was initially focused on 3.1, some commits were backported to
3.0:

  * The watchdog now emits warnings when it detects apparently locked up
    threads. By default, a warning is emitted if a thread is blocked for
    more than one second, but this can be configured with the global
    parameter "warn-blocked-traffic-after". The "debug dev loop" command was
    also improved so that it can emit such a warning when the "warn"
    argument is set.

  * The dump of threads info on panic was improved. During a panic, each
    thread now uses its own buffer instead of a global one to dump its
    info. This way, all these buffers remain available in the core dump and
    can be retrieved from gdb. This should help bug analysis.

  * Memory profiling was also improved. Some entries were displayed with a
    NULL return address, causing confusion. Now, undecodable stacks causing
    an apparent NULL return address all lead to the "other" bin. In
    addition, per-DSO stats are displayed before showing the total. It is
    more convenient on systems where many libraries are loaded.

  * A magic pattern was placed at the beginning of the post_mortem structure
    to make it easier to find in core dumps. It now starts with the
    32-character pattern "POST-MORTEM STARTS HERE+7654321\0". The
    post_mortem structure is now also placed in its own section, again to
    make it easier to locate. Finally, several important pointers were added
    to it, such as pointers to the pools list and to the proxies list.

  * Non-printable characters are now removed from the "debug dev fd" cli
    command output.

  * Some GDB hints are added when crashing, for instance on a BUG_ON().

  * The backtraces of all threads are now dumped, instead of only those of
    the stuck ones.

  * The version and the command line are now added in the "show dev" cli
    command output.

  * Two new sample fetch functions were added to retrieve the internal error
    name of the frontend (fc_err_name) or the backend (bc_err_name)
    connections. In addition, connection error codes corresponding to common
    errno were added, and they are now set when such errors are encountered
    during recv/send/splice() calls.

  * The current number of alive streams and the total number of streams
    ever created are now tracked and reported in stats. This may be useful
    to diagnose some bugs, such as session leaks.

We really hope this will help us speed up the debugging process.
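
To illustrate how the post_mortem magic pattern can help, here is a minimal
Python sketch that scans a core file for it and reports where it appears.
Only the pattern itself comes from this announcement; the function name
find_post_mortem_offsets() and the approach are illustrative assumptions, not
a tool shipped with HAProxy. Note that the values returned are offsets in the
file, not addresses in the process, so gdb is still needed to inspect the
structure itself.

```python
import mmap

# Magic pattern quoted in the announcement: 31 characters plus a trailing NUL.
POST_MORTEM_MAGIC = b"POST-MORTEM STARTS HERE+7654321\0"


def find_post_mortem_offsets(core_path):
    """Return every file offset at which the magic pattern occurs."""
    offsets = []
    with open(core_path, "rb") as f:
        # mmap avoids loading a multi-gigabyte core dump into memory.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mem:
            pos = mem.find(POST_MORTEM_MAGIC)
            while pos != -1:
                offsets.append(pos)
                pos = mem.find(POST_MORTEM_MAGIC, pos + 1)
    return offsets
```

Calling find_post_mortem_offsets() with the path of a core file returns a
list of offsets. Several matches are possible, for instance if the pattern
also appears in a mapped copy of the binary.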

Now, the list of bugs fixed by this release:

  * It was possible to truncate data with the HTTP compression filter
    because of a bug in the filter API. When a filter may alter the message
    payload, it is important to properly update the HTX message metadata to
    not emit the wrong payload length. But this was not systematically
    performed.

  * In 2.4, it was decided to reject HTTP/1.1 protocol upgrade requests with
    a payload because they are incompatible with H2 on the server side.
    Indeed, such upgrade requests must be converted to CONNECT requests in
    H2, so no payload is supported. However, they remain valid in HTTP/1.1.
    So instead of being rejected on the client side, these requests are now
    accepted and properly handled when sent to an H1 server. They are only
    rejected when they are sent to an H2 server.

  * No special care was taken with H2C protocol upgrades. But this could be
    a security issue if such an upgrade were accepted by a server, because a
    client could then bypass all filtering rules. To fix the issue, the
    Upgrade header is removed from requests if "h2c" or "h2" tokens are
    found.

  * The H1 multiplexer was only able to handle timeouts if the client or
    server timeout was defined, depending on the side. As a result, the
    client-fin/server-fin and http-keep-alive/http-request timeouts could
    be ignored.

  * Some transfers could get blocked in H2 because of an issue with the
    zero-copy data forwarding: an H2 stream might never be removed from the
    send list.

  * An issue with the zero-copy data forwarding of H1 requests waiting for a
    tunnel to be established was fixed. The SE_FL_EOI flag was erroneously
    set on the client sedesc.

  * On the QUIC side, several bugs were fixed: 0-RTT connections could
    freeze; post-handshake frames could leak on the error path; probing
    packets could be malformed; a stream could be erroneously closed with an
    empty frame with the FIN bit set, instead of a RESET_STREAM frame, when
    no data was sent at all; the server timeout was never armed for small
    requests that were fully received when the stream was created; and the
    glitch counter was never reported at the session level, preventing any
    tracking via a stick-table.

  * A server abort was reported on an invalid HTTP response payload instead
    of an internal error. And it was also possible to report a client abort
    instead of a server abort during the HTTP response forwarding. The right
    termination states are now reported in both cases.

  * Immediate client abort on the CLI was not properly handled, blocking the
    CLI applet with no timeout armed.

  * It was possible to experience a deadlock by setting the maxconn of a
    frontend on the CLI, because of a double lock on the proxy lock.

  * The "set ssl cert" CLI command was not properly checking the transaction
    name. That could lead to accidentally committing a transaction on the
    wrong certificate.

  * It was possible to send more data than expected from the stats applet
    via the zero-copy data forwarding. This was an issue for client
    connections limited by flow control, as in H2 and QUIC.

  * There were some issues with early connection shutdowns that could lead to
    truncated messages because some tests on blocked data were missing. In
    addition, data blocked by an error on the sending path was not always
    properly detected, leaving streams blocked without any timeout armed.

  * The dequeuing process was refined to fix some bugs revealed by recent
    fixes in this area.

  * Inter-thread stream shutdown, used by the "shutdown sessions server XXX"
    CLI command or the "on-error shutdown-sessions" server option, was not
    thread-safe.

  * The dump of extra counters with the Prometheus exporter was buggy and
    could lead to a buffer overflow because of a wrong increment on a stats
    field index.

  * It was possible to reuse HTTP connections for requests to different
    endpoints because some address families were not properly handled. The
    issue was encountered with the HTTP client and UNIX socket combination.

  * A memory leak was possible if a failure was encountered when adding a
    dynamic server with the check or agent-check option. In that case, the
    server could not be released because its refcount was incremented too
    early. In addition, access to the global server list during a dynamic
    server deletion was not protected against concurrent accesses. In the
    long term, this could cause list corruption and crashes.


As a side note, some issues remain unresolved in this release. One of them
concerns unexplained 502/SH or 502/SD responses. There are several reports,
and it is not clear that all of them are related to the same issue. It also
seems possible to experience it on older versions. We are still trying to
understand why this happens. So, have a look at your logs to check whether
you are affected. Any information can help us make progress on this issue.
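
For those who want to check their logs, the following Python sketch counts
502 responses whose termination state starts with SH or SD. It assumes the
default "option httplog" layout, where the status code is followed by the
byte count, two cookie fields and then the termination state. The function
name and the regex are illustrative only and must be adapted to any custom
log-format.

```python
import re

# Assumes the default "option httplog" layout: status code, bytes read,
# two captured-cookie fields, then the 4-character termination state.
PATTERN = re.compile(r"\s502\s+\d+\s+\S+\s+\S+\s+(S[HD])\S\S\s")


def count_502_server_aborts(lines):
    """Count 502 responses per termination state prefix (SH or SD)."""
    counts = {"SH": 0, "SD": 0}
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

The function accepts any iterable of log lines, so an open log file can be
passed to it directly. Non-zero counts only indicate that the log contains
such responses, not that they are caused by the issue described above.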

Thanks to everyone for your help!

Please find the usual URLs below :
   Site index       : https://www.haproxy.org/
   Documentation    : https://docs.haproxy.org/
   Wiki             : https://github.com/haproxy/wiki/wiki
   Discourse        : https://discourse.haproxy.org/
   Slack channel    : https://slack.haproxy.org/
   Issue tracker    : https://github.com/haproxy/haproxy/issues
   Sources          : https://www.haproxy.org/download/3.0/src/
   Git repository   : https://git.haproxy.org/git/haproxy-3.0.git/
   Git Web browsing : https://git.haproxy.org/?p=haproxy-3.0.git
   Changelog        : https://www.haproxy.org/download/3.0/src/CHANGELOG
   Dataplane API    : https://github.com/haproxytech/dataplaneapi/releases/latest
   Pending bugs     : https://www.haproxy.org/l/pending-bugs
   Reviewed bugs    : https://www.haproxy.org/l/reviewed-bugs
   Code reports     : https://www.haproxy.org/l/code-reports
   Latest builds    : https://www.haproxy.org/l/dev-packages


---
Complete changelog :
Amaury Denoyelle (7):
      BUG/MINOR: h1: do not forward h2c upgrade header token
      BUG/MINOR: h2: reject extended connect for h2c protocol
      BUG/MINOR: mux-quic: report glitches to session
      BUG/MEDIUM: mux-quic: ensure timeout server is active for short requests
      BUG/MINOR: mux-quic: do not close STREAM with empty FIN if no data sent
      BUG/MINOR: server: fix dynamic server leak with check on failed init
      BUG/MEDIUM: server: fix race on servers_list during server deletion

Aurelien DARRAGON (7):
      BUG/MEDIUM: server: server stuck in maintenance after FQDN change
      BUG/MEDIUM: hlua: make hlua_ctx_renew() safe
      BUG/MEDIUM: hlua: properly handle sample func errors in hlua_run_sample_{fetch,conv}()
      DOC: config: fix rfc7239 forwarded typo in desc
      BUG/MEDIUM: connection/http-reuse: fix address collision on unhandled address families
      DOC: config: add missing glitch_{cnt,rate} data types
      DOC: config: add missing glitch_{cnt,rate} sample definitions

Christopher Faulet (24):
      MINOR: connection: No longer include stconn type header in connection-t.h
      MINOR: mux-h1: Set EOI on SE during demux when both side are in DONE state
      BUG/MEDIUM: mux-h1/mux-h2: Reject upgrades with payload on H2 side only
      REGTESTS: h1/h2: Update script testing H1/H2 protocol upgrades
      BUG/MEDIUM: cli: Be sure to catch immediate client abort
      BUG/MINOR: mux-h1: Fix condition to set EOI on SE during zero-copy forwarding
      BUG/MINOR: http-ana: Disable fast-fwd for unfinished req waiting for upgrade
      BUG/MEDIUM: stconn: Wait iobuf is empty to shut SE down during a check send
      BUG/MINOR: http-ana: Don't report a server abort if response payload is invalid
      BUG/MEDIUM: stconn: Check FF data of SC to perform a shutdown in sc_notify()
      BUG/MAJOR: filters/htx: Add a flag to state the payload is altered by a filter
      REGTESTS: Never reuse server connection in http-messaging/truncated.vtc
      BUG/MEDIUM: stats-html: Never dump more data than expected during 0-copy FF
      BUG/MEDIUM: mux-h2: Remove H2S from send list if data are sent via 0-copy FF
      BUG/MEDIUM: stconn: Report blocked send if sends are blocked by an error
      BUG/MINOR: http-ana: Fix wrong client abort reports during responses forwarding
      BUG/MINOR: stconn: Don't disable 0-copy FF if EOS was reported on consumer side
      BUG/MEDIUM: mux-h1: Fix how timeouts are applied on H1 connections
      BUG/MINOR: http-ana: Report internal error if an action yields on a final eval
      MINOR: stream: Save last evaluated rule on invalid yield
      BUG/MEDIUM: promex: Fix dump of extra counters
      MINOR: stream/stats: Expose the current number of streams in stats
      MINOR: stream/stats: Expose the total number of streams ever created in stats
      BUG/MINOR: stats: Fix the name for the total number of streams created

Frederic Lecaille (4):
      BUG/MINOR: quic: avoid leaking post handshake frames
      BUG/MEDIUM: quic: avoid freezing 0RTT connections
      BUG/MINOR: quic: fix malformed probing packet building
      BUILD: Missing inclusion header for ssize_t type

Oliver Dala (1):
      BUG/MEDIUM: cli: Deadlock when setting frontend maxconn

Valentine Krasnobaeva (3):
      BUG/MINOR: cfgparse-global: fix allowed args number for setenv
      BUG/MINOR: mworker: fix mworker-max-reloads parser
      MINOR: cli/debug: show dev: add cmdline and version

William Lallemand (4):
      BUG/MINOR: httpclient: return NULL when no proxy available during httpclient_new()
      MINOR: cli: remove non-printable characters from 'debug dev fd'
      BUG/MINOR: trace: stop rewriting argv with -dt
      BUG/MINOR: ssl/cli: 'set ssl cert' does not check the transaction name correctly

Willy Tarreau (42):
      REGTESTS: shorten a bit the delay for the h1/h2 upgrade test
      BUG/MINOR: server: make sure the HMAINT state is part of MAINT
      BUILD: tools: only include execinfo.h for the real backtrace() function
      MINOR: tools: do not attempt to use backtrace() on linux without glibc
      MINOR: task: define two new one-shot events for use with WOKEN_OTHER or MSG
      BUG/MEDIUM: stream: make stream_shutdown() async-safe
      BUG/MINOR: queue: make sure that maintenance redispatches server queue
      MINOR: server: make srv_shutdown_sessions() call pendconn_redistribute()
      BUG/MEDIUM: queue: always dequeue the backend when redistributing the last server
      MINOR: debug: make mark_tainted() return the previous value
      MINOR: chunk: drop the global thread_dump_buffer
      MINOR: debug: split ha_thread_dump() in two parts
      MINOR: debug: slightly change the thread_dump_pointer signification
      MINOR: debug: make ha_thread_dump_done() take the pointer to be used
      MINOR: debug: replace ha_thread_dump() with its two components
      MEDIUM: debug: on panic, make the target thread automatically allocate its buf
      BUG/MEDIUM: queue: make sure never to queue when there's no more served conns
      MINOR: activity/memprofile: always return "other" bin on NULL return address
      MINOR: activity/memprofile: show per-DSO stats
      BUILD: debug: silence a build warning with threads disabled
      MINOR: pools: export the pools variable
      MINOR: debug: place a magic pattern at the beginning of post_mortem
      MINOR: debug: place the post_mortem struct in its own section.
      MINOR: debug: store important pointers in post_mortem
      DOC: config: document connection error 44 (reverse connect failure)
      CLEANUP: connection: properly name the CO_ER_SSL_FATAL enum entry
      MINOR: connection: add more connection error codes to cover common errno
      MINOR: rawsock: set connection error codes when returning from recv/send/splice
      MINOR: connection: add new sample fetch functions fc_err_name and bc_err_name
      MINOR: debug: print gdb hints when crashing
      MINOR: debug: do not limit backtraces to stuck threads
      MINOR: debug: also add a pointer to struct global to post_mortem
      MINOR: debug: also add fdtab and acitvity to struct post_mortem
      MINOR: debug: remove the redundant process.thread_info array from post_mortem
      MINOR: wdt: move the local timers to a struct
      MINOR: debug: add a function to dump a stuck thread
      DEBUG: wdt: better detect apparently locked up threads and warn about them
      DEBUG: cli: make it possible for "debug dev loop" to trigger warnings
      DEBUG: wdt: make the blocked traffic warning delay configurable
      DEBUG: wdt: add a stats counter "BlockedTrafficWarnings" in show info
      BUILD: debug: also declare strlen() in __ABORT_NOW()
      MINOR: debug: move the "recover now" warn message after the optional notes

--
Christopher Faulet

