Hi,
HAProxy 2.7.7 was released on 2023/04/27. It added 163 new commits
after version 2.7.6.

This release is pretty huge. In one month, the QUIC team did an amazing job
improving the stack and making it more stable. A big thanks to Tristan for
his priceless help. More than half of the commits concern the QUIC stack. It
is hard to sum up all the changes. Many bugs were fixed; the most visible
are:
* The congestion algorithm state was shared between connections instead of
  being private to each connection.
* HTTP/1.0 responses with an unknown content length, terminated by a
  close, were not properly handled. The QUIC multiplexer considered this an
  early close and emitted a RESET_STREAM.
* Stream fairness was improved to prevent timeouts. A stream sending a
  large object could block other streams. With a small client timeout,
  blocked streams could be aborted via a RESET_STREAM.
* Some contradictions in the code could lead to very long loops sending
  empty packets (PADDING-only packets). One visible effect was a very low
  throughput when the client serialized its requests.
* The congestion control window could become zero because of a wrong
  calculation, which could lead to a SIGFPE crash.
* Padding was missing in very short probe packets.
* Possible memory leaks were fixed.
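
To make the stream fairness point above more concrete, here is a minimal
Python sketch of one common way to keep a large transfer from starving the
other streams: a round-robin queue with a per-turn byte budget. This is only
an illustration under assumptions, not the actual mux-quic code: the
schedule_round_robin() helper, its parameters and the budget value are all
hypothetical.

```python
from collections import deque


def schedule_round_robin(pending, budget_per_turn):
    """Return the (stream_id, nbytes) chunks in the order they are emitted.

    pending maps a stream id to the number of bytes it still has to send.
    On each turn, the stream at the head of the queue sends at most
    budget_per_turn bytes, then goes back to the tail if data remains.
    """
    if budget_per_turn <= 0:
        raise ValueError("budget_per_turn must be positive")

    queue = deque(pending.items())
    emitted = []
    while queue:
        stream_id, remaining = queue.popleft()
        chunk = min(remaining, budget_per_turn)
        emitted.append((stream_id, chunk))
        if remaining > chunk:
            # Not finished: requeue behind the streams that are waiting.
            queue.append((stream_id, remaining - chunk))
    return emitted


order = schedule_round_robin({"large": 10, "small": 2}, budget_per_turn=4)
```

With such a scheme the small stream is served right after the first chunk of
the large one, instead of waiting for the whole large object to be sent.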

And of course, there are some improvements. The main one is the support of
thread load balancing on accept. A series of changes made it possible to use
the default mechanism to accept and handle connections. The least loaded
thread is now selected, improving the overall performance of the QUIC
stack. In addition, a timer was added to delay the acknowledgments.
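
As a rough illustration of what selecting the least loaded thread means,
here is a minimal Python sketch. It is not HAProxy's actual code: the
per-thread connection counts and the pick_least_loaded() helper are
hypothetical, and the real accept path may use a different load metric or
only compare a subset of threads.

```python
def pick_least_loaded(conn_counts):
    """Return the index of the thread with the fewest active connections.

    conn_counts is a hypothetical list where conn_counts[i] is the number
    of connections currently handled by thread i. Ties go to the lowest
    thread index.
    """
    if not conn_counts:
        raise ValueError("at least one thread is required")
    return min(range(len(conn_counts)), key=lambda tid: conn_counts[tid])
```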

A recent refactoring of the stream-connector layer introduced a regression
in 2.7.4. The read timer was no longer rearmed once the end of the message
was reached. This change was introduced to avoid server timeouts when the
server replies before the end of the request. But it revealed several
bugs. Some were fixed, but others are pretty hard to fix without changing
some internals, which is too sensitive for a stable version. Thus, for now,
we decided to revert this change while waiting for a better solution. Note
that it is not a big deal because we only restore a behavior that has been
there for ages.

On soft-stop or reload, idle DNS sessions are now killed. Since 2.7.5,
these sessions were no longer killed, preventing the process from
finishing. In addition, we now force the connect timeout for DNS
resolutions. The "resolve" timeout is used to set its value. Having no
connect timeout was an issue for resolution over TCP. Connection failures
might take quite a long time to be reported, leading to an excess of
unusable DNS sessions in the connecting state. It was especially visible on
soft-stop because this prevented the process from exiting quickly. Still on
DNS, errors are now properly handled when a response is consumed. This was
an issue for truncated responses followed by an abort. The applet could
ignore the abort and loop waiting for more data until a timeout was
triggered. A similar issue was fixed in the syslog applet.

Several bugs in the Lua part were fixed. First, except for Lua tasks, it is
no longer possible to register functions at runtime. This was clearly stated
in the documentation, but nothing forbade it in the code. An error is now
triggered if this happens, preventing potential segfaults. Memory leaks on
references were fixed, and the Lua locking was simplified and made
re-entrant to prevent deadlocks.

Aurélien fixed several issues in server management. The consistency of the
"visible" server list was fixed. It was possible, at least in theory, to
access an invalid server if several dynamic server deletions were performed
while the list was being accessed. For instance, it might happen when the
server list was dumped in the stats. He also fixed a wrong report for
tracking servers leaving the drain state. Finally, he centralized proxy and
server stats updates on server state transitions to be sure not to miss an
update on some transitions.

The fixes for the issues occasionally met around the use of malloc_trim(),
which had been addressed in 2.8, were finally backported after one month of
exposure in 2.8. The issue was that not only could malloc_trim() still
sometimes be used when jemalloc (or any other allocator) was used, but our
attempts at plugging these special cases didn't work when linking with
external libs that also explicitly call it. In the end, the opposite was
done: we redefine our own version of malloc_trim(), which contains the tests
for the presence of an alternate lib, and calls the original if the
allocator comes from libc, or the equivalent function from other
allocators. This way external libs that would use it are safe as well.
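
The dispatch idea described above can be modeled with a small Python
sketch. It only simulates the decision logic and is not the real C code: the
allocator names, the two backend functions and the ALT_PURGE table are
hypothetical stand-ins for "the original malloc_trim() from libc" and "the
equivalent function from another allocator".

```python
def libc_malloc_trim(pad):
    """Stand-in for the original malloc_trim() provided by libc."""
    return f"libc malloc_trim({pad})"


def jemalloc_purge():
    """Stand-in for the purge operation of an alternate allocator."""
    return "jemalloc purge"


# Hypothetical table of purge functions for non-libc allocators.
ALT_PURGE = {"jemalloc": jemalloc_purge}


def malloc_trim(pad, detected_allocator):
    """Replacement entry point: every caller, including external libs,
    ends up here, so the allocator check is done in a single place."""
    if detected_allocator == "libc":
        return libc_malloc_trim(pad)
    purge = ALT_PURGE.get(detected_allocator)
    # No known equivalent for this allocator: do nothing.
    return purge() if purge else None
```

The point of this design is that callers no longer need to know which
allocator is in use; the check lives in the replacement function itself.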

The mixed library version detection in dlopen() was still a bit sensitive,
and could sometimes detect anomalies related to an external lib depending on
libcrypto but not libssl for example, as well as libs that were linked with
an ABI-compatible version of the lib, but not exactly the same one. The
tests were improved to only validate the grouped presence of a combination
of relevant symbols that make it possible to distinguish between different
ABI versions. It manages to catch libs loaded from Lua that were compiled
against a totally different libcrypto version without being triggered when
the ABI is compatible, which was the initial purpose of the test. Typically,
loading luaossl built against openssl with haproxy built against quictls is
properly detected as an error, but loading luaossl built against a slightly
older but compatible libssl version than haproxy's (or conversely) is
OK. Similarly, this has been in 2.8 for one month and a few -dev versions,
and some users continue to experience problems in 2.7, so it was about time
to backport this.

The rest is the usual bunch of bug fixes:
* In the H2 multiplexer, connection errors are now properly detected
  during the handshake. This avoids inserting an invalid connection into an
  idle list. This can be an issue if such a connection is the only idle
  connection: if the traffic is too low to create new connections but
  sufficient to always reuse it before it is purged, no connection to the
  server is possible.
* It was possible to trigger the watchdog when purging stick-tables on
  soft-stop. To avoid spending too much time purging expired entries, we
  now enforce a budget limitation and the purge is performed in several
  steps. In addition, memory is reclaimed only when entries are
  released. Indeed, this operation involves a call to malloc_trim() on
  glibc, which is rather expensive.
* NUMA topology detection on FreeBSD was fixed.
* It was not possible to use the Lua filter API in conjunction with a
  "wait-for-body" action. Switching the HTTP message to the DATA state
  prevented the call to most of the Lua filter functions. It was fixed by
  keeping the HTTP message in the BODY state at this stage.
* The read expiration date is now updated on synchronous sends for all
  streams except independent ones. This fixes an old bug that occurs when a
  filter is configured: write activity on synchronous sends was lost. With
  slow clients uploading large objects, it was possible to reach the server
  timeout.
* More internal variables are now unset in the program section. More
  specifically, the HAPROXY_STARTUPLOGS_FD, HAPROXY_MWORKER_WAIT_ONLY and
  HAPROXY_PROCESSES variables are now unset.
* The ssl-min-ver and ssl-max-ver parameters are now duplicated for bundles
  in crt-list.
* A warning is now emitted during configuration parsing if a header field
  name contains forbidden chars in an HTTP action. A special case is made
  for the colon at the beginning so that it remains possible to place any
  future pseudo-headers that may appear.
* An error is reported during configuration parsing when the "len"
  argument of a stick table type contains incorrect characters.
* DeviceAtlas compile options were updated to support the API v3 from
  3.1.7 onwards.
* The strict-sni documentation was updated to state that it is possible to
  start without a certificate on a bind line.
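
For the header field name warning mentioned in the list above, here is a
minimal Python sketch of the kind of check involved. It assumes that
"forbidden chars" means anything outside the HTTP token character set
(letters, digits and "!#$%&'*+-.^_`|~"), and that a single leading colon is
tolerated; the forbidden_chars() helper is hypothetical and is not the
parser's real function.

```python
import string

# Characters allowed in an HTTP token (and thus in a header field name).
TOKEN_CHARS = frozenset(string.ascii_letters + string.digits + "!#$%&'*+-.^_`|~")


def forbidden_chars(name):
    """Return the characters of a header field name that are not token
    characters. A single leading colon is skipped so that names shaped like
    pseudo-headers remain accepted."""
    body = name[1:] if name.startswith(":") else name
    return [c for c in body if c not in TOKEN_CHARS]
```

An empty list means the name is clean; a non-empty list is what would
justify emitting a warning in this model.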

Thanks everyone for your help and your contributions !
Please find the usual URLs below :
Site index : https://www.haproxy.org/
Documentation : https://docs.haproxy.org/
Wiki : https://github.com/haproxy/wiki/wiki
Discourse : https://discourse.haproxy.org/
Slack channel : https://slack.haproxy.org/
Issue tracker : https://github.com/haproxy/haproxy/issues
Sources : https://www.haproxy.org/download/2.7/src/
Git repository : https://git.haproxy.org/git/haproxy-2.7.git/
Git Web browsing : https://git.haproxy.org/?p=haproxy-2.7.git
Changelog : https://www.haproxy.org/download/2.7/src/CHANGELOG
Dataplane API : https://github.com/haproxytech/dataplaneapi/releases/latest
Pending bugs : https://www.haproxy.org/l/pending-bugs
Reviewed bugs : https://www.haproxy.org/l/reviewed-bugs
Code reports : https://www.haproxy.org/l/code-reports
Latest builds : https://www.haproxy.org/l/dev-packages
---
Complete changelog :
Amaury Denoyelle (39):
MINOR: quic: derive first DCID from client ODCID
MINOR: quic: remove ODCID dedicated tree
MINOR: quic: remove address concatenation to ODCID
BUG/MINOR: task: allow to use tasklet_wakeup_after with tid -1
CLEANUP: quic: remove unused QUIC_LOCK label
CLEANUP: quic: remove unused scid_node
CLEANUP: quic: remove unused qc param on stateless reset token
CLEANUP: quic: rename quic_connection_id vars
MINOR: quic: remove uneeded tasklet_wakeup after accept
MINOR: quic: adjust Rx packet type parsing
MINOR: quic: adjust quic CID derive API
MINOR: quic: remove TID ref from quic_conn
BUG/MINOR: quic: transform qc_set_timer() as a reentrant function
BUG/MEDIUM: quic: prevent crash on Retry sending
BUG/MINOR: mux-quic: fix crash with app ops install failure
BUG/MINOR: mux-quic: properly handle STREAM frame alloc failure
BUG/MINOR: h3: fix crash on h3s alloc failure
BUG/MINOR: quic: prevent crash on qc_new_conn() failure
BUG/MINOR: quic: consume Rx datagram even on error
MEDIUM: quic: use a global CID trees list
MINOR: quic: remove TID encoding in CID
MEDIUM: quic: handle conn bootstrap/handshake on a random thread
MINOR: quic: do not proceed to accept for closing conn
MINOR: protocol: define new callback set_affinity
MINOR: quic: delay post handshake frames after accept
MINOR: fd: implement fd_migrate_on() to migrate on a non-local thread
MEDIUM: quic: implement thread affinity rebinding
MINOR: quic: properly finalize thread rebinding
MAJOR: quic: support thread balancing on accept
MINOR: listener: remove unneeded local accept flag
CLEANUP: quic: rename frame types with an explicit prefix
CLEANUP: quic: rename frame variables
BUG/MEDIUM: mux-quic: do not emit RESET_STREAM for unknown length
BUG/MEDIUM: mux-quic: improve streams fairness to prevent early timeout
BUG/MINOR: quic: prevent buggy memcpy for empty STREAM
MINOR: mux-quic: do not set buffer for empty STREAM frame
MINOR: mux-quic: do not allocate Tx buf for empty STREAM frame
MINOR: quic: finalize affinity change as soon as possible
BUG/MINOR: quic: fix race on quic_conns list during affinity rebind
Aurelien DARRAGON (23):
MINOR: proxy/pool: prevent unnecessary calls to pool_gc()
BUG/MEDIUM: proxy/sktable: prevent watchdog trigger on soft-stop
BUG/MINOR: backend: make be_usable_srv() consistent when stopping
MINOR: server: add SRV_F_DELETED flag
BUG/MINOR: server/del: fix srv->next pointer consistency
BUG/MINOR: stats: properly handle server stats dumping resumption
BUG/MINOR: sink: free forward_px on deinit()
BUG/MINOR: log: free log forward proxies on deinit()
BUG/MINOR: hlua: hook yield does not behave as expected
BUG/MINOR: hlua: enforce proper running context for register_x functions
CLEANUP: hlua: fix conflicting comment in hlua_ctx_destroy()
MINOR: hlua: add simple hlua reference handling API
BUG/MINOR: hlua: fix reference leak in core.register_task()
BUG/MINOR: hlua: fix reference leak in hlua_post_init_state()
BUG/MINOR: hlua: prevent function and table reference leaks on errors
MINOR: hlua: simplify lua locking
BUG/MEDIUM: hlua: prevent deadlocks with main lua lock
BUG/MINOR: errors: invalid use of memprintf in startup_logs_init()
BUG/MINOR: server: incorrect report for tracking servers leaving drain
MINOR: server: explicitly commit state change in srv_update_status()
BUG/MINOR: server: don't miss proxy stats update on server state
transitions
BUG/MINOR: server: don't miss server stats update on server state
transitions
BUG/MINOR: server: don't use date when restoring last_change from state
file
Christopher Faulet (14):
Revert "BUG/MEDIUM: stconn: Don't rearm the read expiration date if EOI was
reached"
BUG/MEDIUM: mux-h2: Be able to detect connection error during handshake
BUG/MEDIUM: channel: Improve reports for shut in co_getblk()
BUG/MEDIUM: dns: Properly handle error when a response consumed
MINOR: http-ana: Add a HTTP_MSGF flag to state the Expect header was
checked
BUG/MINOR: http-ana: Don't switch message to DATA when waiting for payload
BUG/MEDIUM: dns: Kill idle DNS sessions during stopping stage
BUG/MINOR: resolvers: Wakeup DNS idle task on stopping
BUG/MEDIUM: resolvers: Force the connect timeout for DNS resolutions
BUG/MINOR: stream: Fix test on SE_FL_ERROR on the wrong entity
REGTESTS: fix the race conditions in log_uri.vtc
BUG/MEDIUM: log: Properly handle client aborts in syslog applet
CLEANUP: backend: Remove useless debug message in assign_server()
BUG/MEDIUM: Update read expiration date on synchronous send
David Carlier (1):
BUILD: da: extends CFLAGS to support API v3 from 3.1.7 and onwards.
Frédéric Lécaille (56):
BUG/MINOR: quic: Missing padding in very short probe packets
BUG/MINOR: quic: Wrong use of now_ms timestamps (cubic algo)
MINOR: quic: Add recovery related information to "show quic"
BUG/MINOR: quic: Wrong use of now_ms timestamps (newreno algo)
BUG/MINOR: quic: Missing max_idle_timeout initialization for the
connection
MINOR: quic: Implement cubic state trace callback
MINOR: quic: Adjustments for generic control congestion traces
MINOR: quic: Traces adjustments at proto level.
MEDIUM: quic: Ack delay implementation
BUG/MINOR: quic: Wrong rtt variance computing
BUG/MINOR: quic: Remaining useless statements in cubic slow start callback
BUG/MINOR: quic: Cubic congestion control window may wrap
MINOR: quic: Add missing traces in cubic algorithm implementation
BUG/MAJOR: quic: Congestion algorithms states shared between the
connection
BUG/MINOR: quic: Remove useless BUG_ON() in newreno and cubic algo
implementation
MINOR: quic: Add trace to debug idle timer task issues
BUG/MINOR: quic: Unexpected connection closures upon idle timer task
execution
BUG/MINOR: quic: Wrong idle timer expiration (during 20s)
BUILD: quic: 32bits compilation issue in cli_io_handler_dump_quic()
BUG/MINOR: quic: Possible wrong PTO computing
BUG/MINOR: quic: Possible crashes in qc_idle_timer_task()
MINOR: quic: Trace fix in quic_pto_pktns() (handshaske status)
BUG/MINOR: quic: Wrong packet number space probing before confirmed
handshake
MINOR: quic: Modify qc_try_rm_hp() traces
MINOR: quic: Dump more information at proto level when building packets
MINOR: quic: Add a trace for packet with an ACK frame
MINOR: quic: Add packet loss and maximum cc window to "show quic"
BUG/MINOR: quic: Ignored less than 1ms RTTs
MINOR: quic: Add connection flags to traces
BUG/MEDIUM: quic: Code sanitization about acknowledgements requirements
BUG/MINOR: quic: Possible wrapped values used as ACK tree purging limit.
BUG/MINOR: quic: SIGFPE in quic_cubic_update()
MINOR: quic: Display the packet number space flags in traces
MINOR: quic: Remove a useless test about probing in qc_prep_pkts()
BUG/MINOR: quic: Wrong Application encryption level selection when probing
BUG/MINOR: quic: Do not use ack delay during the handshakes
BUG/MINOR: quic: Stop removing ACK ranges when building packets
MINOR: quic: Do not allocate too much ack ranges
BUG/MINOR: quic: Unchecked buffer length when building the token
BUG/MINOR: quic: Wrong Retry token generation timestamp computing
MINOR: quic: Move traces at proto level
BUG/MINOR: quic: Possible memory leak from TX packets
BUG/MINOR: quic: Possible leak during probing retransmissions
BUG/MINOR: quic: Useless probing retransmission in draining or killing
state
BUG/MINOR: quic: Useless I/O handler task wakeups (draining, killing
state)
CLEANUP: quic: Remove useless parameters passes to qc_purge_tx_buf()
CLEANUP: quic: Rename <buf> variable to <token> in
quic_generate_retry_token()
CLEANUP: quic: Rename <buf> variable into quic_padding_check()
CLEANUP: quic: Rename <buf> variable into quic_rx_pkt_parse()
CLEANUP: quic: Rename <buf> variable for several low level functions
CLEANUP: quic: Make qc_build_pkt() be more readable
CLEANUP: quic: Rename quic_get_dgram_dcid() <buf> variable
CLEANUP: quic: Rename several <buf> variables at low level
CLEANUP: quic: Rename <buf> variable into quic_packet_read_long_header()
CLEANUP: quic: Rename <buf> variable into qc_parse_hd_form()
CLEANUP: quic: Rename several <buf> variables into quic_sock.c
Ilya Shipitsin (2):
CI: bump "actions/checkout" to v3 for cross zoo matrix
CI: cirrus-ci: bump FreeBSD image to 13-1
Marcos de Oliveira (1):
DOC/MINOR: reformat configuration.txt's "quoting and escaping" table
Miroslav Zagorac (1):
BUG/MINOR: illegal use of the malloc_trim() function if jemalloc is used
Olivier Houchard (1):
BUG/MEDIUM: fd: don't wait for tmask to stabilize if we're not in it.
Remi Tricot-Le Breton (1):
BUG/MINOR: ssl: ssl-(min|max)-ver parameter not duplicated for bundles in
crt-list
William Lallemand (4):
DOC: config: strict-sni allows to start without certificate
BUG/MINOR: mworker: unset more internal variables from program section
BUG/MINOR: stick_table: alert when type len has incorrect characters
MINOR: ssl: remove OpenSSL 1.0.2 mention into certificate loading error
Willy Tarreau (20):
MINOR: http-act: emit a warning when a header field name contains
forbidden chars
BUILD: compiler: fix __equals_1() on older compilers
MINOR: activity: add a line reporting the average CPU usage to "show
activity"
BUG/MINOR: cfgparse: make sure to include openssl-compat
MINOR: fd: optimize fd_claim_tgid() for use in fd_insert()
MINOR: fd: add a lock bit with the tgid
BUG/MINOR: cli: clarify error message about stats bind-process
BUG/MINOR: config: fix NUMA topology detection on FreeBSD
BUILD: sock_inet: forward-declare struct receiver
BUILD: proto_tcp: export the correct names for proto_tcpv[46]
MINOR: pools: make sure 'no-memory-trimming' is always used
MINOR: pools: intercept malloc_trim() instead of trying to plug holes
MEDIUM: pools: move the compat code from trim_all_pools() to malloc_trim()
MINOR: pools: export trim_all_pools()
MINOR: pattern: use trim_all_pools() instead of a conditional
malloc_trim()
MINOR: tools: relax dlopen() on malloc/free checks
MEDIUM: tools: further relax dlopen() checks too consider grouped symbols
BUG/MINOR: pools: restore detection of built-in allocator
MINOR: pools: report a replaced memory allocator instead of just
malloc_trim()
BUG/MINOR: tools: check libssl and libcrypto separately
--
Christopher Faulet