Hi,
HAProxy 2.9.4 was released on 2024/01/31. It added 24 new commits
after version 2.9.3.
This version addresses various long-term stability issues that popped up
all at once, so I preferred to issue it shortly after they were all
addressed rather than risk letting more accumulate and seeing users forced
to roll back later due to a possible regression. We'll also issue new
versions of older branches progressively as time permits.
The issues fixed in this version are:
- an API issue with OpenSSL. The SSL_do_handshake() function returns
SSL_ERROR_WANT_READ when it needs more data, but in certain obscure
circumstances related to internal error handling, it was found that
it may stop trying to read available data and continue to return that
status! This results in wakeup loops that prevent the process from
sleeping, hence it consumes 100% of the CPU (but it's still working
fine). The code does what the doc suggests (but the doc is basically
a one-liner), and neither aws-lc nor wolfSSL exhibit this problem.
Regardless, we decided to do what OpenSSL does in its socket BIO,
which doesn't show this problem and which always clears both direction
flags before any attempt in any direction, and this addressed the
issue without degrading anything even for the other libs. This problem
has been there since 2.0 and is very hard to reproduce without a prior
trace (it's the first time it's reported). I'd like to thank Valentin
Gutierrez for his invaluable report with a capture and a working
reproducer, and Olivier Houchard for the quick fix that clearly looks
more robust than my early workaround. That's issue #2403.
- a regression in the cache's handling of secondary keys in 2.9 that
may sometimes cause a crash (issue #2417).
- a possible crash in the QPACK encoder when encoding HTTP/3 responses
carrying status codes above 599.
- another QUIC issue whereby some streams reset with pending outgoing
data may clog the output buffer until the connection closes, possibly
causing the connection to slow down or even stall.
- in H2, certain errors would only trigger a stream error (i.e. RESET)
instead of a connection error, due to an incomplete fix that was
merged in 2.9.3.
- the status of agent checks is returned as-is in the stats CSV output,
resulting in mangling the CLI's output if it contains line feeds. It
has been there since 2.0.
- the HTTP/1 chunk and header parsers were strengthened a bit. Indeed,
Ben Kallus kindly reminded us that we would still accept the NUL byte
in header values and plain LF in chunks, while we were (wrongly) quite
certain that these had long been rejected. Ben is currently not aware
of situations where this could help convey an attack to any existing
component, but given the surprises he certainly faces in his reviews,
it's probably only a matter of time before one implementation proves to
be too weak and we fail to properly protect it. So it was better to
address both at once. In the extremely unlikely case that anyone would
discover such an invalid byte on their network with an application that
heavily relies on it, option accept-invalid-http* will work as usual to
bypass the check. We'll backport that to older versions as well, and I
think it would be prudent for distros to take that as well.
- an interesting arch-specific bug in the JWT parser: by initializing
a 64-bit variable a bit too early, everything was fine on 64-bit
platforms, but on 32-bit ones, a pointer located closer to the
beginning of the structure got reset by this initialization before it
was used, causing a crash! The fact this was only noticed now by running
VTest on a 32-bit platform just shows that 32-bit users are less common
these days and that their configs are probably simple enough not to use
JWT ;-)
- the "newreno" congestion control algorithm for QUIC was misspelled
"newrno" in the code, making the config parser not recognize it.
- and a few other low-importance fixes and doc updates.
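
To illustrate the HTTP/1 parser item above, here is a minimal sketch in C
of the two stricter checks: rejecting a NUL byte inside a header value, and
requiring CRLF (not a bare LF) at the end of a chunk-size line. This is not
HAProxy's actual code. The function names and return conventions are
hypothetical, chunk extensions are ignored, and the sketch assumes the whole
line is already in the buffer.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not a HAProxy function): returns true if the
 * header value of <len> bytes contains no NUL byte, false otherwise.
 */
bool hdr_value_has_no_nul(const char *val, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (val[i] == '\0')
            return false;
    }
    return true;
}

/* Hypothetical helper (not a HAProxy function): parses a chunk-size line
 * made of hex digits followed by CRLF. Returns the number of bytes
 * consumed (CRLF included) and stores the size in <*size>, or returns -1
 * if the line is invalid. A bare LF after the digits is rejected.
 * Assumption: the complete line is present in <buf>; a real parser would
 * also distinguish "need more data" from "invalid".
 */
long parse_chunk_size_line(const char *buf, size_t len, unsigned long *size)
{
    unsigned long val = 0;
    size_t i = 0;

    while (i < len) {
        int c = (unsigned char)buf[i];
        int digit;

        if (c >= '0' && c <= '9')
            digit = c - '0';
        else if (c >= 'a' && c <= 'f')
            digit = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F')
            digit = c - 'A' + 10;
        else
            break;

        if (val > (ULONG_MAX >> 4))
            return -1;            /* the size would overflow */
        val = (val << 4) | (unsigned long)digit;
        i++;
    }

    if (i == 0)
        return -1;                /* no hex digit at all */

    /* Strict line ending: CR immediately followed by LF. */
    if (i + 1 >= len || buf[i] != '\r' || buf[i + 1] != '\n')
        return -1;

    *size = val;
    return (long)(i + 2);
}
```

With these helpers, "1a\r\n" is accepted as a chunk of 26 bytes while
"1a\n" is refused, and a header value such as "a\0c" is refused as well.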
I'd suggest that all users of 2.9 adopt this one now so that we can later
switch to less important fixes and backports if needed. There's currently
nothing else in the pipe concerning bugs, but we're still investigating a
case we triggered in the lab where the QUIC congestion window sometimes
doesn't open enough, and which could be responsible for lower than expected
performance on large objects when using the default Cubic algorithm (as
Tristan observes). But that's quite difficult because the original RFC was
barely exploitable due to numerous ambiguities, and fortunately there's a
new very recent one that allows us to recheck the code against it (and we'll
take this opportunity to rename some parts according to the updated spec).
We're hopeful that we'll get some good news from this front soon!
Please find the usual URLs below :
Site index : https://www.haproxy.org/
Documentation : https://docs.haproxy.org/
Wiki : https://github.com/haproxy/wiki/wiki
Discourse : https://discourse.haproxy.org/
Slack channel : https://slack.haproxy.org/
Issue tracker : https://github.com/haproxy/haproxy/issues
Sources : https://www.haproxy.org/download/2.9/src/
Git repository : https://git.haproxy.org/git/haproxy-2.9.git/
Git Web browsing : https://git.haproxy.org/?p=haproxy-2.9.git
Changelog : https://www.haproxy.org/download/2.9/src/CHANGELOG
Dataplane API : https://github.com/haproxytech/dataplaneapi/releases/latest
Pending bugs : https://www.haproxy.org/l/pending-bugs
Reviewed bugs : https://www.haproxy.org/l/reviewed-bugs
Code reports : https://www.haproxy.org/l/code-reports
Latest builds : https://www.haproxy.org/l/dev-packages
And thanks again to all issue reporters, it was dense but very efficient
this time!
Willy
---
Complete changelog :
Amaury Denoyelle (7):
BUG/MINOR: h3: fix checking on NULL Tx buffer
MINOR: quic: extract qc_stream_buf free in a dedicated function
BUG/MEDIUM: quic: remove unsent data from qc_stream_desc buf
MINOR: h3: add traces for stream sending function
BUG/MEDIUM: h3: do not crash on invalid response status code
BUG/MEDIUM: qpack: allow 6xx..9xx status codes
BUG/MEDIUM: quic: fix crash on invalid qc_stream_buf_free() BUG_ON
Aurelien DARRAGON (2):
DOC: configuration: fix set-dst in actions keywords matrix
BUG/MINOR: hlua: fix uninitialized var in hlua_core_get_var()
Christopher Faulet (2):
BUG/MINOR: h1: Don't support LF only at the end of chunks
BUG/MEDIUM: h1: Don't support LF only to mark the end of a chunk size
Emeric Brun (1):
BUG/MEDIUM: cli: some err/warn msg dumps add LR into CSV output on stat's CLI
Frederic Lecaille (3):
BUG/MINOR: quic: newreno QUIC congestion control algorithm no more available
CLEANUP: quic: Remove unused CUBIC_BETA_SCALE_FACTOR_SHIFT macro.
MINOR: quic: Stop hardcoding a scale shifting value (CUBIC_BETA_SCALE_FACTOR_SHIFT)
Lukas Tribus (1):
DOC: httpclient: add dedicated httpclient section
Olivier Houchard (1):
BUG/MAJOR: ssl_sock: Always clear retry flags in read/write functions
Remi Tricot-Le Breton (1):
BUG/MEDIUM: cache: Fix crash when deleting secondary entry
Thayne McCombs (1):
DOC: configuration: clarify http-request wait-for-body
Willy Tarreau (5):
BUG/MEDIUM: mux-h2: refine connection vs stream error on headers
MINOR: mux-h2/traces: add a missing trace on connection WU with negative inc
BUG/MINOR: jwt: fix jwt_verify crash on 32-bit archs
BUG/MINOR: h1-htx: properly initialize the err_pos field
BUG/MEDIUM: h1: always reject the NUL character in header values
---