[ANNOUNCE] haproxy-2.9-dev3

Willy Tarreau Sat, 12 Aug 2023 12:00:57 -0700

Hi,

HAProxy 2.9-dev3 was released on 2023/08/12. It added 105 new commits
after version 2.9-dev2. It got a bit delayed by the last-minute bugs that
all had to be fixed at the same time last week.


A number of bugs were addressed, essentially the same as those that landed
in 2.8.2 and others. A new one was just fixed, which affects configurations
mixing "lua-load" and "lua-load-per-thread": these can crash if some HTTP
rules reference sample fetches or actions from both of them, because the
stack is not reset between calls.

This version starts to include more sensitive changes, so please test it
with care:

- QUIC: unused parts of connections are now released ASAP. This matters
  at the end of a connection, when there's nothing left to send but the
  connection may still linger for a while: it's now possible to release
  multiple kB of RAM and only keep what's really essential. This is
  comparable to the TCP TIME_WAIT state, which is only a minimal state.
  This should improve memory usage for those dealing with many QUIC
  connections.

- stick-tables and peers: the locking was still particularly heavy, had
  hardly been revisited since 1.8, and didn't scale at all with threads.
  Worse, the performance would quickly collapse, to the point of making
  it hard for me to decide whether or not to enable more threads when
  detected. In my tests with 80 threads, the request rate was divided by
  10 as soon as one peer was declared in the configuration (even if not
  used, such as the local one to deal with reloads). The locking was
  carefully refined by disentangling lookups from peer updates, and fixing
  some undesired cache line sharing, which resulted in the performance
  rising from 277k to 4.4M req/s over several patches. There is still
  room for improvement for small systems due to the way the updates are
  managed, which could be significantly cheaper, but for now that was
  off the radar given that the goal was to regain scalability.

- pools: on large systems we've seen several times H2 and H3 underperform
  due to extreme contention on the shared pools and/or on the allocation
  counter. Here the approach is different because it's related to some
  common information that is accessed by many cores at once, and sometimes
  with a huge inter-core latency (yes, I'm looking at you, EPYC). The
  approach taken here to resolve this contention consists of splitting
  counters into multiple shards. They're not indexed on the threads (since
  objects can migrate between threads) but on the objects' pointers. The
  results are particularly good, seeing H2 request rate jump from
  1.5M to 6.7M requests/s on 80 threads without the shared cache, and from
  368k to 4.7M requests/s with the shared cache (that one can even reach
  5.5M with a 32-wide shard but I consider that it's not worth doing it
  by default as it consumes a bit more memory). Here again it seems like
  the cluster management that unlocked pools in the past could be improved
  to preserve locality, but this is probably for later.

- threads in general: the updated locking code now applies exponential
  backoff during lock upgrades and when a writer is waiting for all
  readers to leave, and makes use of some refined barriers on ARM
  systems, bringing a 2-4% gain on x86 and 14-33% on ARM.

- samples: we'd like to have all of the log-format tags available as
  samples but we've found that some of them are so specific or specially
  formatted that it's unlikely all of them will have a perfect match. But
  at least we're trying to make sure that information available as log
  tags is also available as samples that can be used to build conditions
  or serve in calculations. At the moment, the following ones were added:

  - the various timers available as T* tags in log-format now have an
    equivalent sample fetch function. This makes it possible, for
    example, to pass some info to the server, to perform some
    calculations such as the average download rate, or to change the log
    level when certain thresholds are met.

  - accept_date/request_date return respectively the accept date and
    the request receipt date, in seconds/milliseconds/microseconds
    depending on the argument.

  - a few other ones such as "pid" which returns the process' ID, and
    "act_conn" returning the number of active connections on the process.

- new converters "ms_utime", "ms_ltime", "us_utime", "us_ltime" that
  take as input a timestamp in milliseconds or microseconds respectively
  and format it according to an strftime-compatible format string, with
  support for a milli and/or microsecond tag in it (%3N/%6N). To be used
  with accept_date and request_date.

- a new sample fetch called "acl" is used to evaluate ACLs and return
  their combined result. The principle is that it performs a logical AND
  between several of them, possibly inverted, and delivers this result
  (possibly to be used in turn in ACLs). This helps reduce the number of
  ACLs by using exclusion. For example it becomes possible to create an
  ACL of the source addresses of all corporate offices, and an ACL of the
  guest networks, and use "acl(offices,!guest)" as a check for only
  offices but not guest networks, and possibly create an ACL
  "trusted_networks" from it that can be used everywhere else in the
  config.

- and the usual CI, doc, cleanups
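
To illustrate the pools item above, here is a minimal Python sketch of a
counter split into shards that are selected from an object's address
rather than from the calling thread. The names ShardedCounter,
shard_index and SHARD_BITS, the multiplicative hash constant and the
per-shard locks are all assumptions made for the illustration; this is
not HAProxy's code, which is written in C.

```python
import threading

SHARD_BITS = 3                 # assumed: 8 shards
NUM_SHARDS = 1 << SHARD_BITS
MASK64 = 0xFFFFFFFFFFFFFFFF


def shard_index(addr: int) -> int:
    """Map an object address to a shard number in [0, NUM_SHARDS).

    A 64-bit multiplicative hash is used as a stand-in; the top
    SHARD_BITS bits of the product select the shard.
    """
    return ((addr * 0x9E3779B97F4A7C15) & MASK64) >> (64 - SHARD_BITS)


class ShardedCounter:
    """A counter whose value is spread over NUM_SHARDS independent slots."""

    def __init__(self):
        self.locks = [threading.Lock() for _ in range(NUM_SHARDS)]
        self.values = [0] * NUM_SHARDS

    def add(self, addr: int, delta: int) -> None:
        # Only the shard owning this address is locked and touched.
        i = shard_index(addr)
        with self.locks[i]:
            self.values[i] += delta

    def total(self) -> int:
        # Reading the global value means summing all shards.
        return sum(self.values)
```

Since the shard depends only on the address, an object allocated by one
thread and released by another decrements the very shard it incremented,
which is the reason given above for not indexing on threads.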

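For the exponential backoff mentioned in the threads item, the general
idea can be sketched as follows in Python: a contended lock is retried
with a waiting time that doubles after each failed attempt, up to a cap.
The function names, the default delays and the use of time.sleep() are
assumptions for the illustration only; a real locking library would
typically spin on CPU instructions instead of sleeping.

```python
import threading
import time


def backoff_delays(tries, base=1e-6, cap=1e-3):
    """Return the list of waiting times: base, 2*base, 4*base... up to cap."""
    delays, delay = [], base
    for _ in range(tries):
        delays.append(delay)
        delay = min(delay * 2, cap)
    return delays


def acquire_with_backoff(lock, max_tries=20, base=1e-6, cap=1e-3):
    """Try to take <lock> without blocking, waiting longer after each failure.

    Returns the attempt number that succeeded (starting at 1), or 0 if the
    lock could not be taken within <max_tries> attempts.
    """
    for attempt, delay in enumerate(backoff_delays(max_tries, base, cap), 1):
        if lock.acquire(blocking=False):
            return attempt
        time.sleep(delay)
    return 0
```

The point of growing the delay is that waiters hammer the shared cache
line less and less often while the lock stays busy.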

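The way "acl(offices,!guest)" combines its arguments can be modelled
with a few lines of Python: each name is looked up, optionally inverted
when prefixed with "!", and all results are ANDed. This only models the
semantics described above; make_src_acl, eval_acl_fetch and the example
networks are made up for the illustration and are not HAProxy code or
configuration.

```python
import ipaddress


def make_src_acl(*cidrs):
    """Build a predicate that is true when a source address is in any network."""
    nets = [ipaddress.ip_network(c) for c in cidrs]
    return lambda src: any(ipaddress.ip_address(src) in n for n in nets)


def eval_acl_fetch(expr, acls, src):
    """Evaluate an expression such as "offices,!guest" for one source address.

    Terms are separated by commas and ANDed together; a leading "!"
    inverts the named ACL.
    """
    for term in expr.split(","):
        term = term.strip()
        negate = term.startswith("!")
        name = term[1:] if negate else term
        if acls[name](src) == negate:
            return False       # this term is false, so the AND is false
    return True


# Hypothetical networks, chosen only for the example.
acls = {
    "offices": make_src_acl("10.0.0.0/8"),
    "guest": make_src_acl("10.9.0.0/16"),
}
trusted = lambda src: eval_acl_fetch("offices,!guest", acls, src)
```

Here trusted("10.1.2.3") is true (an office address outside the guest
range) while trusted("10.9.1.1") is false (an office address that
belongs to the guest range).
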
I have no doubt that those interested in saving CPU cycles on large
machines or saving memory with QUIC will want to give it a try (and
they'd be right). It should work (haproxy.org currently runs on it),
just don't deploy this on a Friday before going on vacation :-)

Please find the usual URLs below :
   Site index       : https://www.haproxy.org/
   Documentation    : https://docs.haproxy.org/
   Wiki             : https://github.com/haproxy/wiki/wiki
   Discourse        : https://discourse.haproxy.org/
   Slack channel    : https://slack.haproxy.org/
   Issue tracker    : https://github.com/haproxy/haproxy/issues
   Sources          : https://www.haproxy.org/download/2.9/src/
   Git repository   : https://git.haproxy.org/git/haproxy.git/
   Git Web browsing : https://git.haproxy.org/?p=haproxy.git
   Changelog        : https://www.haproxy.org/download/2.9/src/CHANGELOG
   Dataplane API    : https://github.com/haproxytech/dataplaneapi/releases/latest
   Pending bugs     : https://www.haproxy.org/l/pending-bugs
   Reviewed bugs    : https://www.haproxy.org/l/reviewed-bugs
   Code reports     : https://www.haproxy.org/l/code-reports
   Latest builds    : https://www.haproxy.org/l/dev-packages

Willy
---
Complete changelog :
Amaury Denoyelle (5):
      BUG/MEDIUM: quic: consume contig space on requeue datagram
      BUG/MINOR: quic: reappend rxbuf buffer on fake dgram alloc error
      BUILD: quic: fix wrong potential NULL dereference
      MINOR: h3: abort request if not completed before full response
      BUG/MEDIUM: quic: fix tasklet_wakeup loop on connection closing

Aurelien DARRAGON (3):
      BUG/MINOR: hlua: fix invalid use of lua_pop on error paths
      MINOR: hlua: add hlua_stream_ctx_prepare helper function
      BUG/MEDIUM: hlua: streams don't support mixing lua-load with lua-load-per-thread

Christopher Faulet (10):
      BUG/MEDIUM: h3: Properly report a C-L header was found to the HTX start-line
      BUG/MEDIUM: h3: Be sure to handle fin bit on the last DATA frame
      BUG/MEDIUM: bwlim: Reset analyse expiration date when then channel analyse ends
      MEDIUM: stream: Reset response analyse expiration date if there is no analyzer
      BUG/MINOR: htx/mux-h1: Properly handle bodyless responses when splicing is used
      BUG/MINOR: http-client: Don't forget to commit changes on HTX message
      CLEANUP: stconn: Move comment about sedesc fields on the field line
      REGTESTS: http: Create a dedicated script to test spliced bodyless responses
      REGTESTS: Test SPLICE feature is enabled to execute script about splicing
      BUG/MAJOR: http-ana: Get a fresh trash buffer for each header value replacement

Dragan Dosen (1):
      BUG/MINOR: chunk: fix chunk_appendf() to not write a zero if buffer is full

Frédéric Lécaille (25):
      BUG/MINOR: quic: Possible crash when acknowledging Initial v2 packets
      MINOR: quic: Export QUIC traces code from quic_conn.c
      MINOR: quic: Export QUIC CLI code from quic_conn.c
      MINOR: quic: Move TLS related code to quic_tls.c
      MINOR: quic: Add new "QUIC over SSL" C module.
      MINOR: quic: Add a new quic_ack.c C module for QUIC acknowledgements
      CLEANUP: quic: Defined but no more used function (quic_get_tls_enc_levels())
      MINOR: quic: Split QUIC connection code into three parts
      CLEANUP: quic: quic_conn struct cleanup
      MINOR: quic; Move the QUIC frame pool to its proper location
      BUG/MINOR: quic+openssl_compat: Non initialized TLS encryption levels
      CLEANUP: quic: Remove quic_path_room().
      MINOR: quic: Amplification limit handling sanitization.
      MINOR: quic: Move some counters from [rt]x quic_conn anonymous struct
      MEDIUM: quic: Send CONNECTION_CLOSE packets from a dedicated buffer.
      MINOR: quic: Use a pool for the connection ID tree.
      MEDIUM: quic: Allow the quic_conn memory to be asap released.
      MINOR: quic: Release asap quic_conn memory (application level)
      MINOR: quic: Release asap quic_conn memory from ->close() xprt callback.
      MINOR: quic: Warning for OpenSSL wrapper QUIC bindings without "limited-quic"
      BUG/MINOR: quic: mux started when releasing quic_conn
      BUG/MINOR: quic: Possible crash in quic_cc_conn_io_cb() traces.
      MINOR: quic: Add a trace for QUIC conn fd ready for receive
      BUG/MINOR: quic: Possible crash when issuing "show fd/sess" CLI commands
      BUG/MINOR: quic: Missing tasklet (quic_cc_conn_io_cb) memory release (leak)

Ilya Shipitsin (2):
      CI: do not use "groupinstall" for Fedora Rawhide builds
      CI: get rid of travis-ci wrapper for Coverity scan

Patrick Hemmer (3):
      CLEANUP: acl: remove cache_idx from acl struct
      REORG: cfgparse: extract curproxy as a global variable
      MINOR: acl: add acl() sample fetch

Remi Tricot-Le Breton (1):
      BUG/MINOR: ssl: OCSP callback only registered for first SSL_CTX

William Lallemand (9):
      MINOR: sample: add pid sample
      MINOR: sample: implement act_conn sample fetch
      MINOR: sample: accept_date / request_date return %Ts / %tr timestamp values
      MEDIUM: sample: implement us and ms variant of utime and ltime
      BUG/MINOR: sample: check alloc_trash_chunk() in conv_time_common()
      DOC: configuration: describe Td in Timing events
      MINOR: sample: implement the T* timer tags from the log-format as fetches
      DOC: configuration: add sample fetches for timing events
      DOC: configuration: rework the custom log format table

Willy Tarreau (46):
      BUILD: cfgparse: keep a single "curproxy"
      REORG: http: move has_forbidden_char() from h2.c to http.h
      BUG/MAJOR: h3: reject header values containing invalid chars
      MINOR: mux-h2/traces: also suggest invalid header upon parsing error
      MINOR: ist: add new function ist_find_range() to find a character range
      MINOR: http: add new function http_path_has_forbidden_char()
      MINOR: h2: pass accept-invalid-http-request down the request parser
      REGTESTS: http-rules: add accept-invalid-http-request for normalize-uri tests
      BUG/MINOR: h1: do not accept '#' as part of the URI component
      BUG/MINOR: h2: reject more chars from the :path pseudo header
      BUG/MINOR: h3: reject more chars from the :path pseudo header
      REGTESTS: http-rules: verify that we block '#' by default for normalize-uri
      DOC: clarify the handling of URL fragments in requests
      BUG/MAJOR: http: reject any empty content-length header value
      BUG/MINOR: http: skip leading zeroes in content-length values
      BUG/MEDIUM: mux-h1: fix incorrect state checking in h1_process_mux()
      BUG/MEDIUM: mux-h1: do not forget EOH even when no header is sent
      BUILD: mux-h1: shut a build warning on clang from previous commit
      DEV: makefile: add a new "range" target to iteratively build all commits
      MAJOR: threads/plock: update the embedded library again
      MINOR: stick-table: move the task_queue() call outside of the lock
      MINOR: stick-table: move the task_wakeup() call outside of the lock
      MEDIUM: stick-table: change the ref_cnt atomically
      MINOR: stick-table: better organize the struct stktable
      MEDIUM: peers: update ->commitupdate out of the lock using a CAS
      MEDIUM: peers: drop then re-acquire the wrlock in peer_send_teachmsgs()
      MEDIUM: peers: only read-lock peer_send_teachmsgs()
      MEDIUM: stick-table: use a distinct lock for the updates tree
      MEDIUM: stick-table: touch updates under an upgradable read lock
      MEDIUM: peers: drop the stick-table lock before entering peer_send_teachmsgs()
      MINOR: stick-table: move the update lock into its own cache line
      CLEANUP: stick-table: slightly reorder the stktable struct
      BUILD: defaults: use __WORDSIZE not LONGBITS for MAX_THREADS_PER_GROUP
      MINOR: tools: make ptr_hash() support 0-bit outputs
      MINOR: tools: improve ptr hash distribution on 64 bits
      OPTIM: tools: improve hash distribution using a better prime seed
      OPTIM: pools: use exponential back-off on shared pool allocation/release
      OPTIM: pools: make pool_get_from_os() / pool_put_to_os() not update ->allocated
      MINOR: pools: introduce the use of multiple buckets
      MEDIUM: pools: spread the allocated counter over a few buckets
      MEDIUM: pools: move the used counter over a few buckets
      MEDIUM: pools: move the needed_avg counter over a few buckets
      MINOR: pools: move the failed allocation counter over a few buckets
      MAJOR: pools: move the shared pool's free_list over multiple buckets
      MINOR: pools: make pool_evict_last_items() use pool_put_to_os_no_dec()
      BUILD: pools: fix build error on clang with inline vs forceinline
