Hi,

HAProxy 2.6.10 was released on 2023/03/10. It added 78 new commits
after version 2.6.9.

A bit more than half of the commits are HTTP/3 and QUIC fixes. However,
as indicated in the 2.8-dev5 announcement, this version also fixes a
concurrency bug introduced in 2.5 that may cause freezes and crashes
when some HTTP/1 backend connections are closed by the server at exactly
the same time as they're about to be reused by another thread. Another
bug, affecting idle connections since 2.2 and possibly causing an
occasional crash, was fixed as well. One possible work-around if you've
faced such issues recently is to disable inter-thread connection reuse
with this directive in the global section:

   tune.idle-pool.shared off

But beware that this may increase the total number of connections kept
established with your backend servers, depending on the reuse frequency
and the number of threads.

I want to be clear on one point: the issue is structural, and trying to
port these fixes to 2.6 just made the situation much worse! The only
solution we found to address it relies on some facilities that were
integrated in 2.7 and that offer the guarantees we need during certain
critical transitions of a file descriptor state (refcount and owning
thread group). I have not found any workable solution to the problem
without these facilities, so I had to backport the strict minimum number
of patches (18) needed to bring these facilities there. I hate having to
do that but this time there was no other option. And that made me
realize that instead of keeping 2.6 on its own half-way architecture
like it was, it's not a bad thing that it resembles the next versions,
as this will make backports of fixes more reliable in the future.
Despite the large number of tests in legacy and master-worker modes,
with/without threads, with reloads, FD passing, saturated listeners etc,
it remains possible that I failed on a corner case. So please watch a
little bit more closely than usual after you update, and do not hesitate
to report any issue you think might be a consequence of this.

Other, less critical, issues are described below.

In master-worker mode, when performing an upgrade from an old version
(before 1.9) to a newer version (>=2.5), the HAPROXY_PROCESSES environment
variable was missing, and this, combined with a missing element in an
internal structure representing old processes, resulted in a null-deref
that crashed the master process after the reload. It's very unlikely
to hit this one, except during migration attempts where it can make one
think the new version doesn't work, and encourage a rollback to the
older one. The reported uptime for processes was also fixed so that wall
clock time is used instead of the internal timer.

A few issues affecting the Lua mapping of the HTTP client were addressed;
one of them is a small memory leak by which a few bytes could leak per
request, which could become problematic if used heavily. Another one is
a concurrency issue with Lua's garbage collector that didn't sufficiently
lock other threads' items while trying to free them.

It was found that the low-latency scheduling of TLS handshakes can
degenerate during extreme loads, and take a long time to recover. The
problem is that in order to prevent TLS handshakes from causing high
latency spikes to the rest of the traffic, they're placed in a dedicated
scheduling class that executes one of them per polling loop. But if there
are too many pending due to a big burst, the extra latency caused to the
pending ones can make clients give up and try again, reaching the point
where none of the processed tasks yields anything useful since they were
already abandoned. Now the number of handshakes per loop will grow as
the number of pending ones grows, and this addresses the problem without
adding extra latency even under extreme loads.

There were various QUIC fixes aiming at addressing some issues reported
by users and tests.

The cache failed to cache a response for a request that had the "no-cache"
directive (typically a forced reload). This prevented the cache from being
refreshed this way; this is now fixed.
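
For illustration, here is a minimal sketch of a cache setup where this
matters. The names "demo_cache" and "app", the server address and the
sizes are made up for the example and are not taken from any report:

```haproxy
# Hypothetical example: names, address and sizes are assumptions.
cache demo_cache
    total-max-size 4      # total cache size, in megabytes
    max-age 60            # entries expire after 60 seconds

backend app
    mode http
    http-request cache-use demo_cache
    http-response cache-store demo_cache
    server s1 192.0.2.10:80
```

With the fix, a request carrying "Cache-Control: no-cache" should still
be forwarded to the server, but its response can now be stored and thus
refresh the cached entry.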

In some rare cases it was possible to freeze a compressing stream if there
was exactly one byte left at the end of the buffer, which was insufficient
to place a new HTX block and prevented any progress from being made. This
has been the case since 2.5 so it doesn't seem easy to trigger!

Layer7 retries did not work anymore on the "empty-response" condition due
to a change that was made in 2.4.
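
As a reminder, this condition is enabled with the "retry-on" directive.
The sketch below only illustrates the kind of setup that was affected;
the backend name, server address and retry count are assumptions:

```haproxy
# Hypothetical example: backend name and server address are assumptions.
backend app
    mode http
    retries 3
    # retry-on replaces the default list, so conn-failure is listed too
    retry-on conn-failure empty-response
    server s1 192.0.2.10:80
```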

The dump of the supported config language keywords with -dK incorrectly
attributed some of the crt-list specific keywords to "bind ... ssl", which
could cause confusion for those designing config parsers or generators by
regularly checking for new stuff there. Now an explicit "crt-list" sub-
section is dumped and "bind ssl" only dumps keywords really supported on
"bind" lines.

The global directive "no numa-cpu-mapping" that forces haproxy to bind to
multiple CPU sockets even if it should result in lower performance was lost
across reloads in master-worker mode, because the master in wait mode
doesn't see it, thus applies the restriction to itself, and that one is
inherited by subsequent masters that pass it to their workers.
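
The affected setups are those that combine this directive with
master-worker mode, for example with a global section looking roughly
like this (a minimal sketch, not taken from an actual report):

```haproxy
# Hypothetical example of a global section relying on this directive.
global
    master-worker
    no numa-cpu-mapping
```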

And a few other minor updates aside, that's about all. Those with high
request rates or who already noticed crashes or strange errors are strongly
encouraged to update and try again.

One more point regarding 2.5: it also requires the fixes mentioned above,
but we need to keep in mind that it's about to reach end of life. Thus I
prefer to delay a last version a little bit so as to encourage the last
users of 2.5 to switch to 2.6 and still have a 2.5 fallback without the
fixes above in the unlikely event something's wrong with them. We'll
probably do a last one with these fixes by the end of the month.

Please find the usual URLs below :
   Site index       : https://www.haproxy.org/
   Documentation    : https://docs.haproxy.org/
   Wiki             : https://github.com/haproxy/wiki/wiki
   Discourse        : https://discourse.haproxy.org/
   Slack channel    : https://slack.haproxy.org/
   Issue tracker    : https://github.com/haproxy/haproxy/issues
   Sources          : https://www.haproxy.org/download/2.6/src/
   Git repository   : https://git.haproxy.org/git/haproxy-2.6.git/
   Git Web browsing : https://git.haproxy.org/?p=haproxy-2.6.git
   Changelog        : https://www.haproxy.org/download/2.6/src/CHANGELOG
   Dataplane API    : https://github.com/haproxytech/dataplaneapi/releases/latest
   Pending bugs     : https://www.haproxy.org/l/pending-bugs
   Reviewed bugs    : https://www.haproxy.org/l/reviewed-bugs
   Code reports     : https://www.haproxy.org/l/code-reports
   Latest builds    : https://www.haproxy.org/l/dev-packages

Willy
---
Complete changelog :
Amaury Denoyelle (9):
      MINOR: h3/hq-interop: handle no data in decode_qcs() with FIN set
      BUG/MINOR: mux-quic: transfer FIN on empty STREAM frame
      MINOR: quic: adjust request reject when MUX is already freed
      BUG/MINOR: quic: also send RESET_STREAM if MUX released
      BUG/MINOR: quic: acknowledge STREAM frame even if MUX is released
      BUG/MINOR: h3: prevent hypothetical demux failure on int overflow
      BUG/MEDIUM: quic: properly handle duplicated STREAM frames
      BUG/MEDIUM: quic: do not crash when handling STREAM on released MUX
      BUG/MINOR: mux-quic: properly init STREAM frame as not duplicated

Aurelien DARRAGON (2):
      BUG/MINOR: lua/httpclient: missing free in hlua_httpclient_send()
      BUG/MEDIUM: httpclient/lua: fix a race between lua GC and hlua_ctx_destroy

Christopher Faulet (12):
      BUG/MEDIUM: stconn: Don't rearm the read expiration date if EOI was reached
      REGTESTS: Fix ssl_errors.vtc script to wait for connections close
      BUG/MEDIUM: h1-htx: Never copy more than the max data allowed during parsing
      DOC: config: Fix description of options about HTTP connection modes
      DOC: config: Add the missing tune.fail-alloc option from global listing
      DOC: config: Clarify the meaning of 'hold' in the 'resolvers' section
      BUG/MEDIUM: connection: Clear flags when a conn is removed from an idle list
      BUG/MINOR: http-check: Don't set HTX_SL_F_BODYLESS flag with a log-format body
      BUG/MINOR: http-check: Skip C-L header for empty body when it's not mandatory
      BUG/MINOR: http-ana: Don't increment conn_retries counter before the L7 retry
      BUG/MINOR: http-ana: Do a L7 retry on read error if there is no response
      BUG/MINOR: fd: Properly init the fd state in fd_insert()

Frédéric Lécaille (16):
      BUILD: thead: Fix several 32 bits compilation issues with uint64_t variables
      BUG/MINOR: quic: Possible unexpected counter incrementation on send*() errors
      BUG/MINOR: quic: Really cancel the connection timer from qc_set_timer()
      BUG/MINOR: quic: Missing call to task_queue() in qc_idle_timer_do_rearm()
      BUG/MINOR: quic: Do not probe with too little Initial packets
      BUG/MINOR: quic: Wrong initialization for io_cb_wakeup boolean
      BUG/MINOR: quic: Do not drop too small datagrams with Initial packets
      BUG/MINOR: quic: Missing padding for short packets
      BUG/MINOR: quic: Do not send too small datagrams (with Initial packets)
      BUG/MINOR: quic: Ensure to be able to build datagrams to be retransmitted
      BUG/MINOR: quic: Remove force_ack for Initial,Handshake packets
      BUG/MINOR: quic: Ensure not to retransmit packets with no ack-eliciting frames
      BUG/MINOR: quic: Do not resend already acked frames
      MINOR: quic: Move code to wakeup the timer task to avoid anti-amplication deadlock
      BUG/MINOR: quic: Missing detections of amplification limit reached
      BUG/MINOR: quic: Missing listener accept queue tasklet wakeups

Michael Prokop (1):
      DOC/CLEANUP: fix typos

Remi Tricot-Le Breton (3):
      BUG/MINOR: cache: Cache response even if request has "no-cache" directive
      BUG/MINOR: cache: Check cache entry is complete in case of Vary
      BUG/MINOR: ssl: Use 'date' instead of 'now' in ocsp stapling callback

William Lallemand (8):
      BUG/MINOR: mworker: stop doing strtok directly from the env
      BUG/MEDIUM: mworker: prevent inconsistent reload when upgrading from old versions
      BUG/MEDIUM: mworker: don't register mworker_accept_wrapper() when master FD is wrong
      MINOR: startup: HAPROXY_STARTUP_VERSION contains the version used to start
      BUG/MINOR: mworker: prevent incorrect values in uptime
      MINOR: ssl: rename confusing ssl_bind_kws
      BUG/MINOR: config: crt-list keywords mistaken for bind ssl keywords
      BUG/MINOR: mworker: use MASTER_MAXCONN as default maxconn value

Willy Tarreau (27):
      MINOR: fd/cli: report the polling mask in "show fd"
      BUG/MINOR: sched: properly report long_rq when tasks remain in the queue
      BUG/MEDIUM: sched: allow a bit more TASK_HEAVY to be processed when needed
      MINOR: mux-h2/traces: do not log h2s pointer for dummy streams
      MINOR: mux-h2/traces: add a missing TRACE_LEAVE() in h2s_frt_handle_headers()
      BUG/MINOR: ring: do not realign ring contents on resize
      BUG/MINOR: init: properly detect NUMA bindings on large systems
      BUG/MEDIUM: master: force the thread count earlier
      BUG/MINOR: init: make sure to always limit the total number of threads
      BUG/MINOR: thread: report thread and group counts in the correct order
      BUG/MINOR: ring: release the backing store name on exit
      MEDIUM: epoll: don't synchronously delete migrated FDs
      MEDIUM: poller: program the update in fd_update_events() for a migrated FD
      MAJOR: fd: remove pending updates upon real close
      MINOR: fd: delete unused updates on close()
      MEDIUM: fd: add the tgid to the fd and pass it to fd_insert()
      MINOR: cli/fd: show fd's tgid and refcount in "show fd"
      MINOR: fd: add functions to manipulate the FD's tgid
      MINOR: fd: add fd_get_running() to atomically return the running mask
      MAJOR: fd: grab the tgid before manipulating running
      MINOR: fd: make fd_clr_running() return the previous value instead
      MEDIUM: fd: make fd_insert/fd_delete atomically update fd.tgid
      MEDIUM: fd: quit fd_update_events() when FD is closed
      MAJOR: poller: only touch/inspect the update_mask under tgid protection
      MEDIUM: fd: support broadcasting updates for foreign groups in updt_fd_polling
      BUG/MAJOR: fd/thread: fix race between updates and closing FD
      BUG/MAJOR: fd/threads: close a race on closing connections after takeover
