Hi,

HAProxy 3.0.7 was released on 2024/12/12. It added 54 new commits
after version 3.0.6.

Two major bugs were fixed in this release. The first one concerned the H1
multiplexer. The function responsible for formatting the first line of a
request or a response did not properly handle the wrapping of the output
buffer. This could lead to a corruption of data in the buffer. AFAIK, it
is only an issue with the response formatting when HTTP request pipelining
is in use, and the corruption is limited to a single client. It is not
possible to exploit this issue to retrieve sensitive data because there is
no overflow.

The second major issue concerned QUIC. It was possible to provoke a crash
if an ACK was received just before frames were retransmitted. In that
case, the packet build was not interrupted, leading to an empty packet
being built while it should have been ack-eliciting. The crash itself was
the result of a BUG_ON() which detected the issue.

The way to deal with too many headers in received H2/H3 messages was
fixed. In H2, the maximum number of headers allowed in HEADERS frames on
the sending path was lower than on the receiving path. This could lead to
sending errors being reported although the message had been accepted,
which could be confusing. In H3, the number of headers was tested before
the decoding. However, pseudo-headers and cookie headers consumed extra
slots, so in practice this lowered the maximum number of headers that
could be received. To work around these issues, the number of headers in
received messages is now always tested after the decoding stage. In
addition, unlike H1, the number of headers must be limited when H2/H3
messages are sent, to comply with the limits imposed by the protocols.
This limit was increased to support header rewriting without issue.

An issue with the parsing of QUIC packets containing too many out-of-order
CRYPTO frames, which led to the whole packet being rejected and left
unacknowledged, was fixed. Indeed, these CRYPTO frames must be buffered to
be handled sequentially. But CRYPTO frames too heavily split into small
fragments could reach a limit and be rejected.
Now, the packet parsing is repeated to be able to reassemble the CRYPTO
frames. In addition, the QUIC multiplexer was fixed to properly implement
the wait-for-handshake action. The alert message about the 'socket-owner
connection' support was replaced by a diagnostic warning because there is
an automatic fallback mechanism. Finally, the extra empty H3 DATA frame at
the end of the message is no longer sent when zero-copy data forwarding is
used. It was not invalid but useless.

On the H2 multiplexer, on the server side, it was possible to send an
RST_STREAM frame for streams with an unassigned ID, that is before the
formatting of the HEADERS frame, because the session was aborted during
the connection stage. It was an issue if this happened before the H2
PREFACE was sent because it prevented servers from recognizing the
connection as an H2 connection, leading to an early connection closure.
We now take care not to emit an RST_STREAM frame in that case.

The code was reviewed to never use now_ms as an expiration date for a
task. It was an internal issue that could lead to unexpected behavior when
the now_ms variable wrapped and was exactly equal to 0. We now always make
sure to apply an offset to now_ms when an expiration date must be set.

Four issues with the L7 retries were fixed. First, the server status was
not adjusted at each retry, while it should be. Only the last connection
attempt was considered. Then, the buffer used to save the request to be
able to perform an L7 retry was released too early in some rare cases and
the request could be lost. It is of course unexpected and this could lead
to a crash. Next, the request state was not properly reset on L7 retry:
the request channel flag stating that some data were sent was not reset on
retry. This could lead to a subsequent connection error being considered
as an L7 error while the request was never sent. In that case too, the
request could be lost, leading to a crash.
Finally, the L7 retries could be ignored if a server abort was detected
during the request forwarding when the request was already in DONE state.
In that case, the server abort must be handled on the response analysis
side to be able to properly handle the L7 retries.

In logs, the server response time (%Tr) was erroneously reported as -1
when it was intercepted by HAProxy. -1 is reserved for the case where
response headers were not fully received.

An old issue with the watchdog was fixed. It was possible to consider a
thread as stuck by mistake because it was flagged in the debug handler.
The issue was really visible on 3.1, but on older versions it was possible
to encounter it if a "show threads" command was executed while the
watchdog timer was about to trigger before going back to the scheduler.
Now, only the watchdog is responsible for detecting stuck threads. In
addition, the recently added mechanism to emit warnings when stuck threads
are detected was fixed to work as expected instead of dying too early.

The reason phrase was missing in H2 responses forwarded to H1 clients
while it was stated in the configuration manual that HAProxy should add
one that matched the status code. It is now fixed.

Random crap reports or even crashes could be experienced during a "show
profiling memory" because some pools are replaced during startup and pool
pointers were not properly unreferenced. It is now fixed. In addition,
some high negative values could randomly be shown on the DSO lines in the
"show profiling memory" output. This was fixed too.

The expiration date for the task responsible for cleaning up the server
resolution status when outdated info was inherited from the state file
was fixed. "hold.timeout" was initially used. But it was not accurate,
especially because it could be set to a high value or 0. Now the
expiration date is based on the resolver "resolve" and "retry" timeouts.
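
To illustrate the now_ms fix mentioned above: as far as I know, expiration
dates are 32-bit wrapping millisecond ticks in which the value 0 is
reserved to mean "no expiration", which is why a raw now_ms that happens to
be exactly 0 is a problem. Below is a minimal standalone C sketch of the
idea of applying an offset so that the result can never be 0. The names
sketch_tick_add() and SKETCH_TICK_ETERNITY are made up for this example;
this is an assumption-based illustration, not the actual HAProxy code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name: the value reserved to mean "never expires". */
#define SKETCH_TICK_ETERNITY 0u

/* Add <timeout_ms> to the wrapping 32-bit date <now_ms>. The result is
 * never SKETCH_TICK_ETERNITY: when the sum lands on it, the expiration
 * is pushed 1 ms later.
 */
static inline uint32_t sketch_tick_add(uint32_t now_ms, uint32_t timeout_ms)
{
	uint32_t expire = now_ms + timeout_ms; /* wraps modulo 2^32 */

	if (expire == SKETCH_TICK_ETERNITY)
		expire++;
	return expire;
}

/* Small self-check of the three interesting cases. */
static inline void sketch_tick_selfcheck(void)
{
	assert(sketch_tick_add(0u, 0u) == 1u);         /* now_ms is exactly 0 */
	assert(sketch_tick_add(UINT32_MAX, 1u) == 1u); /* the sum wraps onto 0 */
	assert(sketch_tick_add(1000u, 500u) == 1500u); /* ordinary case */
}
```

With such a helper, a task that must wake up immediately would use
sketch_tick_add(now_ms, 0) as its expiration date rather than now_ms
itself, at the cost of being 1 ms late in the rare case where now_ms is 0.
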
In addition, when a resolver was woken up to process DNS resolutions, it
was possible to trigger an infinite loop on the resolver's wait list
because delayed resolutions were always reinserted at the end of this
list. This led the watchdog to kill the process. Re-inserting them at the
front of the list fixed the issue.

A weird issue with the epoll poller was fixed. Over the last two years,
there were a few reports about immediate closes spuriously happening on
connections where network captures proved that the server had not closed
at all (and sometimes even received the request and responded to it after
HAProxy had closed). The logs showed that a successful connection was
immediately reported in error after the request was sent. After
investigation, it appeared that an EPOLLHUP, or possibly an EPOLLRDHUP,
can be reported by epoll_wait() during the connect() while in
sock_conn_check(), the connect() reports a success. So the connection was
validated but the HUP was handled on the first receive and an error was
reported. So, to work around the issue, we have decided to remove the
FD_POLL_HUP flag on the FD during the connection establishment if
FD_POLL_ERR is not reported too in sock_conn_check(). This way, the call
to connect() is able to validate or reject the connection. In the end, if
the HUP or RDHUP flags were valid, either connect() would report the error
itself, or the next recv() would return 0, confirming the closure that the
poller tried to report. EPOLLRDHUP is only an optimization to save a
syscall anyway, and this pattern is so rare that nobody will ever notice
the extra call to recv(). Please note that at least one reporter confirmed
that using poll() instead of epoll() also addressed the problem, so that
can also be a temporary workaround for those discovering the problem
without the ability to immediately upgrade.

Insertion and removal of the GUIDs assigned to proxies, listeners and
servers were not thread-safe.
While it is not an issue for proxies and listeners, it could be one for
servers, which can be added or removed at runtime. To fix the issue, a
read-write lock was added to protect accesses to the GUID tree.

An issue with the log message formatting was introduced during the 3.0 dev
cycle. When a log-format alias ended up printing nothing, the whole
log-format evaluation was stopped. This was fixed.

The SIGINT signal could be missed by HAProxy when it was started in the
background in a subshell. It is the root cause of some unexpected timeouts
with Vtest scripts. To fix the issue, the default signal handler is
registered for the SIGINT signal during init.

HAPROXY_CLI and HAPROXY_MASTER_CLI could expose the internal sockpairs
which should only be used for the master CLI. These internal sockpairs are
now always hidden.

To finish, some issues in the configuration manual were fixed, the
documentation of "%Tr" and of the "tune.http.maxhdr" directive was
improved and an explanation about quotes and spaces in conditional blocks
was added.

Thanks everyone for your help!

Please find the usual URLs below :
   Site index       : https://www.haproxy.org/
   Documentation    : https://docs.haproxy.org/
   Wiki             : https://github.com/haproxy/wiki/wiki
   Discourse        : https://discourse.haproxy.org/
   Slack channel    : https://slack.haproxy.org/
   Issue tracker    : https://github.com/haproxy/haproxy/issues
   Sources          : https://www.haproxy.org/download/3.0/src/
   Git repository   : https://git.haproxy.org/git/haproxy-3.0.git/
   Git Web browsing : https://git.haproxy.org/?p=haproxy-3.0.git
   Changelog        : https://www.haproxy.org/download/3.0/src/CHANGELOG
   Dataplane API    : https://github.com/haproxytech/dataplaneapi/releases/latest
   Pending bugs     : https://www.haproxy.org/l/pending-bugs
   Reviewed bugs    : https://www.haproxy.org/l/reviewed-bugs
   Code reports     : https://www.haproxy.org/l/code-reports
   Latest builds    : https://www.haproxy.org/l/dev-packages

---
Complete changelog :
Amaury Denoyelle (11):
      MINOR: quic: notify connection layer on handshake completion
      BUG/MINOR: stream: unblock stream on wait-for-handshake completion
      BUG/MEDIUM: quic: support wait-for-handshake
      MINOR: quic: simplify qc_parse_pkt_frms() return path
      MINOR: quic: use dynamically allocated frame on parsing
      MINOR: quic: extend return value of CRYPTO parsing
      BUG/MINOR: quic: repeat packet parsing to deal with fragmented CRYPTO
      CLEANUP: guid: remove global tree export
      BUG/MINOR: guid/server: ensure thread-safety on GUID insert/delete
      BUG/MEDIUM: quic: prevent crash due to CRYPTO parsing error
      BUG/MINOR: quic: remove startup alert if conn socket-owner unsupported

Aurelien DARRAGON (4):
      BUG/MEDIUM: pattern: prevent uninitialized reads in pat_match_{str,beg}
      DOC: lua: fix yield-dependent methods expected contexts
      BUG/MINOR: log: fix lf_text() behavior with empty string
      BUG/MEDIUM: event_hdl: fix uninitialized value in async mode when no data is provided

Christopher Faulet (23):
      BUG/MEDIUM: resolvers: Insert a non-executed resulution in front of the wait list
      BUG/MEDIUM: mux-h2: Don't send RST_STREAM frame for streams with no ID
      BUG/MINOR: Don't report early srv aborts on request forwarding in DONE state
      DOC: config: A a space before ':' for {bs,fs}.aborted and {bs,fs}.rst_code
      DOC: config: Fix a typo in "1.3.1. The Request line"
      BUG/MINOR: http_ana: Report -1 for %Tr for invalid response only
      DOC: config: Slightly improve the %Tr documentation
      DOC: config: Move wait_end in section about internal samples
      DOC: config: Move fs.* and bs.* in section about L5 samples
      BUG/MINOR: http-ana: Adjust the server status before the L7 retries
      BUG/MEDIUM: mux-h2: Increase max number of headers when encoding HEADERS frames
      BUG/MEDIUM: mux-h2: Check the number of headers in HEADERS frame after decoding
      BUG/MEDIUM: h3: Properly limit the number of headers received
      BUG/MEDIUM: h3: Increase max number of headers when sending headers
      DOC: config: Improve documentation of tune.http.maxhdr directive
      BUG/MAJOR: mux-h1: Properly handle wrapping on obuf when dumping the first-line
      DEV: lags/show-sess-to-flags: Properly handle fd state on server side
      BUG/MEDIUM: http-ana: Don't release too early the L7 buffer
      BUG/MEDIUM: sock: Remove FD_POLL_HUP during connect() if FD_POLL_ERR is not set
      MINOR: mux-quic: Don't send an emtpy H3 DATA frame during zero-copy forwarding
      BUG/MEDIUM: http-ana: Reset request flag about data sent to perform a L7 retry
      BUG/MINOR: h1-htx: Use default reason if not set when formatting the response
      BUG/MINOR: server-state: Fix expiration date of srvrq_check tasks

Frederic Lecaille (1):
      BUG/MAJOR: quic: fix wrong packet building due to already acked frames

Valentine Krasnobaeva (2):
      BUG/MINOR: cli: don't show sockpairs in HAPROXY_CLI and HAPROXY_MASTER_CLI
      BUG/MINOR: signal: register default handler for SIGINT in signal_init()

Willy Tarreau (13):
      BUG/MEDIUM: checks: make sure to always apply offsets to now_ms in expiration
      BUG/MEDIUM: mailers: make sure to always apply offsets to now_ms in expiration
      BUG/MINOR: mux_quic: make sure to always apply offsets to now_ms in expiration
      BUG/MINOR: peers: make sure to always apply offsets to now_ms in expiration
      DOC: configuration: explain quotes and spaces in conditional blocks
      DOC: configuration: wrap long line for "strstr()" conditional expression
      BUG/MEDIUM: debug: don't set the STUCK flag from debug_handler()
      BUG/MEDIUM: wdt: fix the stuck detection for warnings
      BUG/MINOR: activity/memprofile: reinitialize the free calls on DSO summary
      MINOR: activity/memprofile: offer a function to unregister stale info
      BUG/MEDIUM: pools/memprofile: always clean stale pool info on pool_destroy()
      MINOR: mux-h2/traces: add a missing trace on negative initial window size
      CLEANUP: mux-h2/traces: reword certain ambiguous traces

-- 
Christopher Faulet