[ANNOUNCE] haproxy-2.3-dev6

Willy Tarreau Sat, 10 Oct 2020 03:01:35 -0700

Hi,

HAProxy 2.3-dev6 was released on 2020/10/10. It added 141 new commits
after version 2.3-dev5.


Around 20 bugs were fixed since dev5, hopefully we added fewer! A nice set
of build cleanups on BSD platforms was brought by Brad Smith, lifting their
feature support to match the currently supported versions. We're seeing
accept4(), closefrom() or getaddrinfo() appear on some of them, and
DragonFlyBSD made its entrance into the party.

Amaury joined the team and started by extending the stats so that various
modules can now register their own counters. For example it will now be
possible for the various muxes (H1, H2, FCGI), Lua, or the SSL layer to
register their own stats on proxies, servers, or listeners. Till now it was
particularly difficult because these parts are optional in the build
process, and reserving some entries for them in the stats structures as
well as some fields in the output would be quite cumbersome. Now the idea
is that in the stats dump there is a delimiter ("-") after which the stats
fields are not necessarily stable across restarts. As such, simple
monitoring systems which look at the core stats whose field positions are
documented in management.txt can stop on "-" and the smarter ones which can
match a column according to its name can get everything. For now no new
stats were added but that will allow anyone to much more easily add some
over time (typically I'm really missing H2 stats right now). In addition to
this he implemented support for stats domains. Each domain corresponds to a
certain type of stats. The default domain is proxies, and DNS was added. We
could imagine adding peers, spoe, lua etc in the future, and maybe even
more (polling, sched, threads etc for example). Overall I really count on
this to make sure there is no excuse anymore for not adding stats to a new
subsystem being developed.
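
To illustrate the two ways of consuming the dump, here is a minimal Python
sketch. It assumes the CSV header starts with "# " and that the delimiter
shows up as a column literally named "-". The sample data is invented, and
"h2_example" is a hypothetical module counter, not an existing field.

```python
import csv
import io

# Hypothetical CSV dump: the header starts with "# " as assumed above, the
# columns are abridged, and "h2_example" is an invented module counter
# placed after the "-" delimiter.
SAMPLE = (
    "# pxname,svname,scur,-,h2_example\n"
    "www,FRONTEND,12,-,7\n"
    "www,srv1,3,-,0\n"
)

def parse_stats(text):
    """Return (header, rows) with the leading '# ' removed from the header."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    header[0] = header[0].lstrip("# ")
    return header, [row for row in reader if row]

def core_fields(header, row):
    """Positional consumer: keep only the columns located before '-'."""
    end = header.index("-") if "-" in header else len(header)
    return dict(zip(header[:end], row[:end]))

def all_fields(header, row):
    """Name-based consumer: keep every column except the delimiter itself."""
    return {name: value for name, value in zip(header, row) if name != "-"}

header, rows = parse_stats(SAMPLE)
print(core_fields(header, rows[0]))
print(all_fields(header, rows[0]))
```

The positional consumer never looks past "-", so it is unaffected if the
set of module counters changes, while the name-based one picks up whatever
columns happen to be present.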

Emeric completed the new syslog load balancing feature. Now log-forward
supports receiving TCP syslog in addition to UDP or UNIX logs, and can
forward them to servers using same or different protocols. And since this
is done using the standard communication layers, the usual bind options
also apply. As such we can for example receive TCP logs over SSL with
cert-based client authentication and forward them to a pair of local UDP
servers each taking 50% of the traffic and duplicate that to another TCP
server using a different format. I'm pretty sure that some requests to
turn these logs to other formats (like JSON) will soon come :-)

While trying to clean up the connection layers, Christopher identified a
long-time issue with our abuse of the "tcp-request content" rule sets. It
revolves around the track-sc feature. A long time ago, when TCP was always
processed before HTTP, it used to be the only way to perform tracking.
When keep-alive started to appear we had to make a choice, and in order
to maintain compatibility with L7 fetches that appeared here and there in
TCP rules, it was decided that "tcp-request content" is per-request in HTTP,
so that these rulesets still see the request before HTTP processing starts.
But later with the introduction of muxes required to support H2, these TCP
rules started to become really ambiguous because they only see already valid,
parsed requests and not incomplete nor invalid ones. Worse, in H2, we have
to cheat and re-encode them in H1 for the time it needs to analyse them. Now
that there are http-request rule sets, all these hacks make no sense anymore,
but they've been kept for compatibility with old configs. And sadly our
fingers have been trained to type them, even if we know they won't see
certain things and will not behave similarly in H1 and H2 at all. Now all
the hacks added to continue to support them are causing quite some trouble
in the architecture. To give just one example, when the H1 mux detects a
bad request, it has to instantiate a stream just to execute them and emit
the error! The problem in fact only exists when such requests are used in
an HTTP proxy for things that do not depend on HTTP. For this reason a new
warning was added for this case which recommends migrating the rule to
"tcp-request session" instead, indicating that the current behavior is not
reliable and no longer guaranteed for the long term.
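
As a rough illustration of what such a migration looks like, the Python
sketch below scans a configuration for "tcp-request content track-sc*" rules
whose key only depends on the connection and prints the equivalent
"tcp-request session" rule. This is not how haproxy itself detects the case:
the list of connection-level fetches (src, dst, src_port, dst_port) is an
assumption made for the example, rules carrying an if/unless condition are
skipped since the condition may depend on content, and the proxy mode is not
checked.

```python
import re

# Fetches assumed here to depend only on the connection, not on content.
# This short list is an illustrative assumption, not haproxy's own logic.
CONNECTION_LEVEL_KEYS = {"src", "dst", "src_port", "dst_port"}

RULE = re.compile(r"^\s*tcp-request\s+content\s+(track-sc\d+)\s+(\S+)(.*)$")

def suggest_session_rules(config_text):
    """Yield (line_number, original, suggestion) for candidate rules."""
    for number, line in enumerate(config_text.splitlines(), start=1):
        match = RULE.match(line)
        if not match:
            continue
        action, key, rest = match.groups()
        if key not in CONNECTION_LEVEL_KEYS:
            continue
        # A condition may itself depend on content, so leave such rules alone.
        if re.search(r"\s(if|unless)\s", rest):
            continue
        suggestion = f"tcp-request session {action} {key}{rest}".rstrip()
        yield number, line.strip(), suggestion

# Made-up configuration excerpt used only to exercise the scanner.
SAMPLE = """\
frontend www
    mode http
    tcp-request content track-sc0 src
    tcp-request content track-sc1 hdr(host) table hosts
"""

for number, original, suggestion in suggest_session_rules(SAMPLE):
    print(f"line {number}: {original!r} -> {suggestion!r}")
```

Only the first rule is reported here, since the second one tracks on a
header and therefore really does depend on the request.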

Finally, let's talk about the horror movie. I've been spending more than one
month on something I initially imagined would take only 3 days: splitting
the listeners in two parts, one for the socket layer and one for the stream
layer. The goal is to support QUIC which will have its own stack and yet
will depend on a socket that the listeners must not manipulate but must
configure. This experience was, hmmm, particular. Many parts of my body still
feel sore! It's obvious we've been slowly accumulating crap over crap for more
than a decade there, and for a good reason: listeners very rarely need to be
touched, so nobody feels like going through one month of rework when only a
1-hour hack can solve a problem. But my problem here was that the code was
entirely made of 1-hour hacks. I couldn't go as far as I wanted, but at least
FDs are not manipulated anymore by the listener layer and there should be
everything needed to register a QUIC protocol layer based on UDPv4/v6 (or
even UNIX dgram sockets if we want to). The cleanup phase will continue with
hopefully fewer difficulties now. I've gone through very extensive tests and
fell into a number of traps causing (or revealing) bugs, but for now I can't
find any remaining corner case not working as expected. One observable side
effect here is that the process doesn't report "proxy foo started" anymore
on startup. I had to adjust a few regtests for this, and I guess those running
with thousands of backends will appreciate it.

I couldn't remove some old hacks for two reasons:
  - support of the "grace" directive: it's a very old trick indicating that
    a frontend must remain up for some time *after* receiving the soft-stop
    signal. This was used in combination with "mode health" in environments
    where an L4 LB was present in front of haproxy, to give it the time to
    detect the failure of the health check port while still serving traffic.
    It's not really compatible with soft reloads and is responsible for a
    good part of the remaining hacks in the code (e.g. stopping all listeners
    first, then again through their proxies). Given the obsolescence of the
    feature, I marked it deprecated. I want it gone by 2.4 if nobody raises
    their hand before, and in any case by 2.5. If you use it, please raise
    your hand now and explain your use case!

  - nbproc. What a mess... Thinking that a frontend exists in a process that
    must not listen to it, but that must keep the FD open but not visible,
    just for the sake of being able to pass it to a new process over the CLI
    for a seamless reload explains a good part of the trouble met there. It's
    almost impossible to close a listening socket now because of this, and
    we must carefully disable them and be careful not to affect them during
    reload attempts. And I discovered that some bugs were hiding others:
    sockets which are shared between several processes fail to unbind and to
    rebind on the majority of the processes, but that error was lost...

Ah and the best, nbproc will not work with QUIC at all (packets of the same
connection delivered to random processes), will be terribly inefficient for
DNS and confusing for logs, thus adding to the arm-long list of incompatible
features (stick-tables, peers, map updates, health checks, ...).

As such, now that we have well-working threads, I'm asking this question:
"who really NEEDS nbproc nowadays, and why ?". I mean, I'm fine with
reasons like "it would be a pain to update my configuration". I'd even say
that I think it is the only valid one for not switching but not a valid one
for me not killing this feature. So I'm really opening the discussion
here. My wish is to remove nbproc after 2.4, with a more or less smooth
transition (we could even imagine providing a translation script to assist
in these, and the earlier we get them, the better). So I was thinking about
starting in 2.3 by emitting a warning when nbproc is used, inviting the
user to turn it off or report their use case here. Any objection ?

Another point, looking at deprecated keywords, I found that "http-tunnel"
was marked as such since 2.1-dev2 and is ignored. No objection against it
being removed in 2.3 as well ?
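
For those who want to check whether their configuration uses any of the
keywords discussed above (grace, nbproc, http-tunnel), a naive Python scanner
could look like this. It is only a sketch: it assumes one directive per line,
strips comments at the first "#" without handling quoting, and the status
notes simply paraphrase this message.

```python
# Keywords taken from this announcement; the notes paraphrase it.
FLAGGED = {
    "grace": "marked deprecated in 2.3-dev6",
    "nbproc": "removal proposed after 2.4",
    "http-tunnel": "deprecated since 2.1-dev2 and ignored",
}

def find_flagged(config_text):
    """Return (line_number, keyword, note) for each flagged directive."""
    hits = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        # Naive comment stripping: everything after the first '#' is dropped.
        words = line.split("#", 1)[0].split()
        # "option http-tunnel" and "no option http-tunnel" carry the keyword
        # after the "option" word, so skip those prefixes first.
        while words and words[0] in ("no", "option"):
            words = words[1:]
        if words and words[0] in FLAGGED:
            hits.append((number, words[0], FLAGGED[words[0]]))
    return hits

# Made-up configuration excerpt used only to exercise the scanner.
SAMPLE = """\
global
    nbproc 4
defaults
    option http-tunnel
    grace 10s   # old L4 health-check trick
    timeout client 30s
"""

for number, keyword, note in find_flagged(SAMPLE):
    print(f"line {number}: {keyword} ({note})")
```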

And finally, we added the directive "zero-warning" in 2.2 to refuse to start
when there are any warnings. With modern service managers, users generally
don't see their warnings anymore. As such I was thinking about making this
option the default at some point so that they know that something might be
wrong, read the advice, and can possibly decide to pass over by disabling
the warnings. What's your opinion on this ? Should we put that into 2.3, or
postpone for 2.4 ? Any other choice ? Note that I was also thinking about
adding a diagnostic mode which would report suspicious cases that are valid
and currently don't report a warning (e.g. two servers on the same address
or with the same cookie value). This could also help detect anomalies before
they hit one.

As previously stated, we've now reached the end of the development phase and
are not going to merge new features for this release. Fixes, cleanups, tests
and doc are the priority now, and new stuff is welcome for the next branch.
If we're reasonable we might be able to produce something good for the end of
the month or beginning of November and stick to initial expectations.

Please find the usual URLs below :
   Site index       : http://www.haproxy.org/
   Discourse        : http://discourse.haproxy.org/
   Slack channel    : https://slack.haproxy.org/
   Issue tracker    : https://github.com/haproxy/haproxy/issues
   Wiki             : https://github.com/haproxy/wiki/wiki
   Sources          : http://www.haproxy.org/download/2.3/src/
   Git repository   : http://git.haproxy.org/git/haproxy.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy.git
   Changelog        : http://www.haproxy.org/download/2.3/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/

Willy
---
Complete changelog :
Amaury Denoyelle (19):
      MINOR: tools: support for word expansion of environment in parse_line
      MINOR: counters: fix a typo in comment
      BUG/MINOR: stats: fix validity of the json schema
      REORG: stats: export some functions
      MINOR: stats: add stats size as a parameter for csv/json dump
      MINOR: stats: hide px/sv/li fields in applet struct
      REORG: stats: extract proxy json dump
      REORG: stats: extract proxies dump loop in a function
      MINOR: stats: define the concept of domain for statistics
      MINOR: stats: define additional flag px cap on domain
      MEDIUM: stats: add delimiter for static proxy stats on csv
      MEDIUM: stats: define an API to register stat modules
      MEDIUM: stats: add abstract type to store counters
      MEDIUM: stats: integrate static proxies stats in new stats
      MINOR: stats: support clear counters for dynamic stats
      MINOR: stats: display extra proxy stats on the html page
      MINOR: stats: add config "stats show modules"
      MINOR: dns/stats: integrate dns counters in stats
      MINOR: stats: remove for loop declaration

Brad Smith (8):
      BUILD: makefile: Update feature flags for OpenBSD
      BUILD: makefile: Update feature flags for FreeBSD
      BUILD: makefile: Fix building with closefrom() support enabled
      BUILD: makefile: Enable closefrom() support on Solaris
      DOC: update INSTALL with supported OpenBSD / FreeBSD versions
      BUILD: Add a DragonFlyBSD target
      BUILD: makefile: Update feature flags for NetBSD
      BUILD: makefile: Enable getaddrinfo() on OS/X

Christopher Faulet (14):
      DOC: tcp-rules: Refresh details about L7 matching for tcp-request content rules
      MEDIUM: tcp-rules: Warn if a track-sc* content rule doesn't depend on content
      BUG/MINOR: tcpcheck: Set socks4 and send-proxy flags before the connect call
      MINOR: hlua: Display debug messages on stderr only in debug mode
      BUG/MINOR: proto_tcp: Report warning messages when listeners are bound
      CLEANUP: ssl: Release cached SSL sessions on deinit
      BUG/MINOR: mux-h1: Be sure to only set CO_RFL_READ_ONCE for the first read
      BUG/MINOR: mux-h1: Always set the session on frontend h1 stream
      MINOR: mux-h1: Don't wakeup the H1C when output buffer become available
      CLEANUP: sock-unix: Remove an unreachable goto clause
      BUG/MEDIUM: mux-fcgi: Don't handle pending read0 too early on streams
      BUG/MEDIUM: mux-h2: Don't handle pending read0 too early on streams
      BUG/MINOR: http: Fix content-length of the default 500 error
      BUG/MINOR: http-htx: Expect no body for 204/304 internal HTTP responses

Emeric Brun (7):
      BUG/MINOR: proxy: inc req counter on new syslog messages.
      BUG/MEDIUM: log: old processes with log foward section don't die on soft stop.
      MINOR: stats: inc req counter on listeners.
      MINOR: channel: new getword and getchar functions on channel.
      MEDIUM: log: syslog TCP support on log forward section.
      BUG/MINOR: proxy/log: frontend/backend and log forward names must differ
      DOC: re-work log forward bind statement documentation.

Eric Salama (1):
      BUG/MINOR: Fix several leaks of 'log_tag' in init().

Frédéric Lécaille (2):
      BUG/MINOR: peers: Inconsistency when dumping peer status codes.
      MINOR: peers: heartbeat, collisions and handshake information for "show peers" command.

Ilya Shipitsin (2):
      REGTESTS: use "command" instead of "which" for better POSIX compatibility
      CI: travis-ci: help Coverity to detect BUG_ON() as a real stop

Pierre Cheynier (1):
      DOC: Add missing stats fields in the management doc

Sébastien Gross (1):
      DOC: Fix typos in configuration.txt

Tim Duesterhus (3):
      CLEANUP: ssl: Use structured format for error line report during crt-list parsing
      MINOR: ssl: Add error if a crt-list might be truncated
      CLEANUP: cache: Fix leak of cconf->c.name during config check

William Dauchy (4):
      DOC: agent-check: fix typo in "fail" word expected reply
      DOC: crt: advise to move away from cert bundle
      MINOR: ssl: remove uneeded check in crtlist_parse_file
      DOC: ssl: fix typo about ocsp files

William Lallemand (3):
      BUG/MINOR: ssl/crt-list: exit on warning out of crtlist_parse_line()
      DOC: ssl: new "cert bundle" behavior
      CLEANUP: ssl: "bundle" is not an OpenSSL wording

Willy Tarreau (76):
      REGTEST: fix host part in balance-uri-path-only.vtc
      REGTEST: make ssl_client_samples and ssl_server_samples requiret to 2.3
      REGTEST: the iif converter test requires 2.3
      REGTEST: make agent-check.vtc require 1.8
      REGTEST: make abns_socket.vtc require 1.8
      REGTEST: make map_regm_with_backref require 1.7
      OPTIM: backend/random: never queue on the server, always on the backend
      OPTIM: backend: skip LB when we know the backend is full
      BUILD: makefile: add an EXTRAVERSION variable to ease local naming
      BUILD: tools: fix minor build issue on isspace()
      BUG/MEDIUM: queue: make pendconn_cond_unlink() really thread-safe
      DOC: fix a confusing typo on a regsub example
      BUG/MINOR: makefile: fix a tiny typo in the target list
      REGTESTS: mark abns_socket as broken
      MEDIUM: fd: always wake up one thread when enabling a foreing FD
      MEDIUM: listeners: don't bounce listeners management between queues
      MEDIUM: init: stop disabled proxies after initializing fdtab
      MEDIUM: listeners: make unbind_listener() converge if needed
      MEDIUM: deinit: close all receivers/listeners before scanning proxies
      MEDIUM: listeners: remove the now unused ZOMBIE state
      MINOR: listeners: do not uselessly try to close zombie listeners in soft_stop()
      CLEANUP: proxy: remove the first_to_listen hack in zombify_proxy()
      MINOR: listeners: introduce listener_set_state()
      MINOR: proxy: maintain per-state counters of listeners
      MEDIUM: proxy: remove the unused PR_STFULL state
      MEDIUM: proxy: remove the PR_STERROR state
      MEDIUM: proxy: remove state PR_STPAUSED
      MINOR: startup: don't rely on PR_STNEW to check for listeners
      CLEANUP: peers: don't use the PR_ST* states to mark enabled/disabled
      MEDIUM: proxy: replace proxy->state with proxy->disabled
      MEDIUM: proxy: remove start_proxies()
      MEDIUM: proxy: merge zombify_proxy() with stop_proxy()
      MINOR: listeners: check the current listener state in pause_listener()
      MINOR: listeners: check the current listener earlier state in resume_listener()
      MEDIUM: listener/proxy: make the listeners notify about proxy pause/resume
      MINOR: protocol: introduce protocol_{pause,resume}_all()
      MAJOR: signals: use protocol_pause_all() and protocol_resume_all()
      CLEANUP: proxy: remove the now unused pause_proxies() and resume_proxies()
      MEDIUM: proto_tcp: make the pause() more robust in multi-process
      BUG/MEDIUM: listeners: correctly report pause() errors
      MINOR: listeners: move fd_stop_recv() to the receiver's socket code
      CLEANUP: protocol: remove the ->disable_all method
      CLEANUP: listeners: remove unused disable_listener and disable_all_listeners
      MINOR: listeners: export enable_listener()
      MINOR: protocol: directly call enable_listener() from protocol_enable_all()
      CLEANUP: protocol: remove the ->enable_all method
      CLEANUP: listeners: remove the now unused enable_all_listeners()
      MINOR: protocol: rename the ->listeners field to ->receivers
      MINOR: protocol: replace ->pause(listener) with ->rx_suspend(receiver)
      MINOR: protocol: implement an ->rx_resume() method
      MINOR: listener: use the protocol's ->rx_resume() method when available
      MINOR: sock: provide a set of generic enable/disable functions
      MINOR: protocol: add a new pair of rx_enable/rx_disable methods
      MINOR: protocol: add a new pair of enable/disable methods for listeners
      MEDIUM: listeners: now use the listener's ->enable/disable
      MINOR: listeners: split delete_listener() in two versions
      MINOR: listeners: count unstoppable jobs on creation, not deletion
      MINOR: listeners: add a new stop_listener() function
      MEDIUM: proxy: make stop_proxy() now use stop_listener()
      MEDIUM: proxy: add mode PR_MODE_PEERS to flag peers frontends
      MEDIUM: proxy: centralize proxy status update and reporting
      MINOR: protocol: add protocol_stop_now() to instant-stop listeners
      MEDIUM: proxy: make soft_stop() stop most listeners using protocol_stop_now()
      MEDIUM: udp: implement udp_suspend() and udp_resume()
      MINOR: listener: add a few BUG_ON() statements to detect inconsistencies
      MEDIUM: listeners: always close master vs worker listeners
      BROKEN/MEDIUM: listeners: rework the unbind logic to make it idempotent
      MEDIUM: listener: let do_unbind_listener() decide whether to close or not
      CLEANUP: listeners: remove the do_close argument to unbind_listener()
      MINOR: listeners: move the LI_O_MWORKER flag to the receiver
      MEDIUM: receivers: add an rx_unbind() method in the protocols
      MINOR: listeners: split do_unbind_listener() in two
      MEDIUM: listeners: implement protocol level ->suspend/resume() calls
      MEDIUM: config: mark "grace" as deprecated
      MEDIUM: config: remove the deprecated and dangerous global "debug" directive
      BUG/MINOR: proxy: respect the proper format string in sig_pause/sig_listen
