Hi,
HAProxy 2.3-dev6 was released on 2020/10/10. It added 141 new commits
after version 2.3-dev5.
Around 20 bugs were fixed since dev5, hopefully we added fewer! A nice set
of build cleanups on BSD platforms was brought by Brad Smith, lifting their
feature support to match the currently supported versions. We're seeing
accept4(), closefrom() or getaddrinfo() appear on some of them, and
DragonFlyBSD made its entrance into the party.
Amaury joined the team and started by extending the stats so that various
modules can now register their own counters. For example it will now be
possible for the various muxes (H1, H2, FCGI), Lua, or the SSL layer to
register their own stats on proxies, servers, or listeners. Till now it was
particularly difficult because these parts are optional in the build
process, and reserving some entries for them in the stats structures as
well as some fields in the output would be quite cumbersome. Now the idea
is that in the stats dump there is a delimiter ("-") after which the stats
fields are not necessarily stable across restarts. As such, simple
monitoring systems which look at the core stats whose field positions are
documented in management.txt can stop on "-" and the smarter ones which can
match a column according to its name can get everything. For now no new
stats were added but that will allow anyone to much more easily add some
over time (typically I'm really missing H2 stats right now). In addition to
this he implemented support for stats domains. Each domain corresponds to a
certain type of stats. The default domain is proxies, and DNS was added. We
could imagine adding peers, spoe, lua etc in the future, and maybe even
more (polling, sched, threads etc for example). Overall I really count on
this to make sure there is no excuse anymore for not adding stats to a new
subsystem being developed.
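
To illustrate the two ways a monitoring tool can consume the CSV dump
described above, here is a minimal Python sketch. It assumes the header line
carries a "-" column at the delimiter position. The sample data and the
"extra_counter" field name are invented for the example and are not real
HAProxy fields.

```python
import csv
import io

# Hypothetical sample: a real dump has many more columns, and
# "extra_counter" is an invented name standing in for a field
# registered by an optional module.
SAMPLE = (
    "# pxname,svname,scur,-,extra_counter,\n"
    "www,FRONTEND,12,-,7,\n"
    "www,srv1,5,-,3,\n"
)

def parse_stats(text, core_only):
    """Return one dict per row, keyed by column name.

    With core_only=True, parsing stops at the "-" column, as a simple
    tool relying on documented field positions would do. With
    core_only=False, every named column is returned, which is only
    safe because columns are matched by name and not by position.
    """
    rows = list(csv.reader(io.StringIO(text)))
    header = rows[0]
    header[0] = header[0].lstrip("# ")  # drop the leading "# " marker
    if core_only and "-" in header:
        limit = header.index("-")
    else:
        limit = len(header)
    result = []
    for row in rows[1:]:
        entry = {}
        for name, value in zip(header[:limit], row[:limit]):
            # Skip the delimiter itself and the empty trailing field.
            if name and name != "-":
                entry[name] = value
        result.append(entry)
    return result

if __name__ == "__main__":
    print(parse_stats(SAMPLE, core_only=True))
    print(parse_stats(SAMPLE, core_only=False))
```

The point of the design is that only the name-based reader looks past the
delimiter, so a reordering of the extra fields after a restart does not
affect either consumer.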
Emeric completed the new syslog load balancing feature. Now log-forward
supports receiving TCP syslog in addition to UDP or UNIX logs, and can
forward them to servers using same or different protocols. And since this
is done using the standard communication layers, the usual bind options
also apply. As such we can for example receive TCP logs over SSL with
cert-based client authentication and forward them to a pair of local UDP
servers each taking 50% of the traffic and duplicate that to another TCP
server using a different format. I'm pretty sure that some requests to
turn these logs to other formats (like JSON) will soon come :-)
While trying to clean up the connection layers, Christopher identified a
long-time issue with our abuse of the "tcp-request content" rule sets. It
revolves around the track-sc feature. A long time ago, when TCP was always
processed before HTTP, it used to be the only way to perform tracking.
When keep-alive started to appear we had to make a choice, and in order
to maintain compatibility with L7 fetches that appeared here and there in
TCP rules, it was decided that "tcp-request content" is per-request in HTTP,
so that these rulesets still see the request before HTTP processing starts.
But later with the introduction of muxes required to support H2, these TCP
rules started to become really ambiguous because they only see already valid,
parsed requests and not incomplete nor invalid ones. Worse, in H2, we have
to cheat and re-encode them in H1 for the time it needs to analyse them. Now
that there are http-request rule sets, all these hacks make no sense anymore,
but they've been kept for compatibility with old configs. And sadly our
fingers have been trained to type them, even if we know they won't see
certain things and will not behave similarly in H1 and H2 at all. Now all
the hacks added to continue to support them are causing quite some trouble
in the architecture. To give just one example, when the H1 mux detects a
bad request, it has to instantiate a stream just to execute them and emit
the error! The problem in fact only exists when such rules are used in
an HTTP proxy for things that do not depend on HTTP. For this reason a new
warning was added for this case which recommends migrating the rule to
"tcp-request session" instead, indicating that the current behavior is not
reliable and no longer guaranteed for the long term.
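
For those wanting to look for such rules ahead of time, the Python sketch
below lists candidate lines in a configuration and suggests the
"tcp-request session" form. It is only a rough heuristic and not what
haproxy itself does: it assumes that a track-sc rule keyed on a bare "src"
with no condition cannot depend on content, and it does not check whether
the proxy is in HTTP mode. The function name and sample config are made up
for the example.

```python
import re

# Heuristic assumption: "tcp-request content track-scN src" with nothing
# after the key (no table, no "if"/"unless") does not depend on content.
RULE = re.compile(
    r"^\s*tcp-request\s+content\s+(track-sc\d+)\s+src\s*(#.*)?$"
)

def find_candidates(config_text):
    """Return (line number, suggested replacement) for each candidate."""
    hits = []
    for lineno, line in enumerate(config_text.splitlines(), start=1):
        match = RULE.match(line)
        if match:
            suggestion = "tcp-request session %s src" % match.group(1)
            hits.append((lineno, suggestion))
    return hits

# Made-up configuration used only to exercise the function.
CONFIG = """\
frontend fe
    mode http
    tcp-request content track-sc0 src
    tcp-request content track-sc1 hdr(host) if { path_beg /api }
    tcp-request content track-sc2 src if { hdr(host) -m found }
"""

if __name__ == "__main__":
    for lineno, suggestion in find_candidates(CONFIG):
        print("line %d: consider '%s'" % (lineno, suggestion))
```

Rules with a condition or an L7 key are left alone on purpose, since those
may really need the request and have to be reviewed by hand.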
Finally, let's talk about the horror movie. I've been spending more than one
month on something I initially imagined would take only 3 days: splitting
the listeners in two parts, one for the socket layer and one for the stream
layer. The goal is to support QUIC which will have its own stack and yet
will depend on a socket that the listeners must not manipulate but must
configure. This experience was, hmmm, particular. Many parts of my body still
feel sore! It's obvious we've been slowly accumulating crap over crap for more
than a decade there, and for a good reason: listeners very rarely need to be
touched, so nobody feels like going through one month of rework when only a
1-hour hack can solve a problem. But my problem here was that the code was
entirely made of 1-hour hacks. I couldn't go as far as I wanted, but at least
FDs are not manipulated anymore by the listener layer and there should be
everything needed to register a QUIC protocol layer based on UDPv4/v6 (or
even UNIX dgram sockets if we want to). The cleanup phase will continue with
hopefully fewer difficulties now. I've gone through very extensive tests and
fell into a number of traps causing (or revealing) bugs, but for now I can't
find any remaining corner case not working as expected. One observable side
effect here is that the process doesn't report "proxy foo started" anymore
on startup. I had to adjust a few regtests for this, and I guess those running
with thousands of backends will appreciate it.
I couldn't remove some old hacks for two reasons:
- support of the "grace" directive: it's a very old trick indicating that
a frontend must remain up for some time *after* receiving the soft-stop
signal. This was used in combination with "mode health" in environments
where an L4 LB was present in front of haproxy, to give it the time to
detect the failure of the health check port while still serving traffic.
It's not really compatible with soft reloads and is responsible for a
good part of the remaining hacks in the code (e.g. stopping all listeners
first, then again through their proxies). Given the obsolescence of the
feature, I marked it deprecated. I want it gone by 2.4 if nobody raises
their hand before, and in any case by 2.5. If you use it, please raise
your hand now and explain your use case!
- nbproc. What a mess... Thinking that a frontend exists in a process that
must not listen to it, but that must keep the FD open but not visible,
just for the sake of being able to pass it to a new process over the CLI
for a seamless reload explains a good part of the trouble met there. It's
almost impossible to close a listening socket now because of this, and
we must carefully disable them and be careful not to affect them during
reload attempts. And I discovered that some bugs were hiding others:
sockets which are shared between several processes fail to unbind and to
rebind on the majority of the processes, but that error was lost...
Ah and the best, nbproc will not work with QUIC at all (packets of the same
connection delivered to random processes), will be terribly inefficient for
DNS and confusing for logs, thus adding to the arm-long list of incompatible
features (stick-tables, peers, map updates, health checks, ...).
As such, now that we have well-working threads, I'm asking this question:
"who really NEEDS nbproc nowadays, and why ?". I mean, I'm fine with
reasons like "it would be a pain to update my configuration". I'd even say
that I think it is the only valid one for not switching but not a valid one
for me not killing this feature. So I'm really opening the discussion
here. My wish is to remove nbproc after 2.4, with more or less ease to the
transition (we could even imagine providing a translation script to assist
in these, and the earlier we get them, the better). So I was thinking about
starting in 2.3 by emitting a warning when nbproc is used, inviting the
user to turn it off or report their use case here. Any objection ?
Another point, looking at deprecated keywords, I found that "http-tunnel"
was marked as such since 2.1-dev2 and is ignored. No objection against it
being removed in 2.3 as well ?
And finally, we added the directive "zero-warning" in 2.2 to refuse to start
when there are any warnings. With modern service managers, users generally
don't see their warnings anymore. As such I was thinking about making this
option the default at some point so that they know that something might be
wrong, read the advice, and can possibly decide to pass over by disabling
the warnings. What's your opinion on this ? Should we put that into 2.3, or
postpone for 2.4 ? Any other choice ? Note that I was also thinking about
adding a diagnostic mode which would report suspicious cases that are valid
and currently don't report a warning (e.g. two servers on the same address
or with the same cookie value). This could also help detect anomalies before
they hit one.
As previously stated, we've now reached the end of the development phase and
are not going to merge new features for this release. Fixes, cleanups, tests
and doc are the priority now, and new stuff is welcome for the next branch.
If we're reasonable we might be able to produce something good for the end of
the month or beginning of November and stick to initial expectations.
Please find the usual URLs below :
Site index : http://www.haproxy.org/
Discourse : http://discourse.haproxy.org/
Slack channel : https://slack.haproxy.org/
Issue tracker : https://github.com/haproxy/haproxy/issues
Wiki : https://github.com/haproxy/wiki/wiki
Sources : http://www.haproxy.org/download/2.3/src/
Git repository : http://git.haproxy.org/git/haproxy.git/
Git Web browsing : http://git.haproxy.org/?p=haproxy.git
Changelog : http://www.haproxy.org/download/2.3/src/CHANGELOG
Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/
Willy
---
Complete changelog :
Amaury Denoyelle (19):
MINOR: tools: support for word expansion of environment in parse_line
MINOR: counters: fix a typo in comment
BUG/MINOR: stats: fix validity of the json schema
REORG: stats: export some functions
MINOR: stats: add stats size as a parameter for csv/json dump
MINOR: stats: hide px/sv/li fields in applet struct
REORG: stats: extract proxy json dump
REORG: stats: extract proxies dump loop in a function
MINOR: stats: define the concept of domain for statistics
MINOR: stats: define additional flag px cap on domain
MEDIUM: stats: add delimiter for static proxy stats on csv
MEDIUM: stats: define an API to register stat modules
MEDIUM: stats: add abstract type to store counters
MEDIUM: stats: integrate static proxies stats in new stats
MINOR: stats: support clear counters for dynamic stats
MINOR: stats: display extra proxy stats on the html page
MINOR: stats: add config "stats show modules"
MINOR: dns/stats: integrate dns counters in stats
MINOR: stats: remove for loop declaration
Brad Smith (8):
BUILD: makefile: Update feature flags for OpenBSD
BUILD: makefile: Update feature flags for FreeBSD
BUILD: makefile: Fix building with closefrom() support enabled
BUILD: makefile: Enable closefrom() support on Solaris
DOC: update INSTALL with supported OpenBSD / FreeBSD versions
BUILD: Add a DragonFlyBSD target
BUILD: makefile: Update feature flags for NetBSD
BUILD: makefile: Enable getaddrinfo() on OS/X
Christopher Faulet (14):
DOC: tcp-rules: Refresh details about L7 matching for tcp-request content
rules
MEDIUM: tcp-rules: Warn if a track-sc* content rule doesn't depend on
content
BUG/MINOR: tcpcheck: Set socks4 and send-proxy flags before the connect
call
MINOR: hlua: Display debug messages on stderr only in debug mode
BUG/MINOR: proto_tcp: Report warning messages when listeners are bound
CLEANUP: ssl: Release cached SSL sessions on deinit
BUG/MINOR: mux-h1: Be sure to only set CO_RFL_READ_ONCE for the first read
BUG/MINOR: mux-h1: Always set the session on frontend h1 stream
MINOR: mux-h1: Don't wakeup the H1C when output buffer become available
CLEANUP: sock-unix: Remove an unreachable goto clause
BUG/MEDIUM: mux-fcgi: Don't handle pending read0 too early on streams
BUG/MEDIUM: mux-h2: Don't handle pending read0 too early on streams
BUG/MINOR: http: Fix content-length of the default 500 error
BUG/MINOR: http-htx: Expect no body for 204/304 internal HTTP responses
Emeric Brun (7):
BUG/MINOR: proxy: inc req counter on new syslog messages.
BUG/MEDIUM: log: old processes with log foward section don't die on soft
stop.
MINOR: stats: inc req counter on listeners.
MINOR: channel: new getword and getchar functions on channel.
MEDIUM: log: syslog TCP support on log forward section.
BUG/MINOR: proxy/log: frontend/backend and log forward names must differ
DOC: re-work log forward bind statement documentation.
Eric Salama (1):
BUG/MINOR: Fix several leaks of 'log_tag' in init().
Frédéric Lécaille (2):
BUG/MINOR: peers: Inconsistency when dumping peer status codes.
MINOR: peers: heartbeat, collisions and handshake information for "show
peers" command.
Ilya Shipitsin (2):
REGTESTS: use "command" instead of "which" for better POSIX compatibility
CI: travis-ci: help Coverity to detect BUG_ON() as a real stop
Pierre Cheynier (1):
DOC: Add missing stats fields in the management doc
Sébastien Gross (1):
DOC: Fix typos in configuration.txt
Tim Duesterhus (3):
CLEANUP: ssl: Use structured format for error line report during crt-list
parsing
MINOR: ssl: Add error if a crt-list might be truncated
CLEANUP: cache: Fix leak of cconf->c.name during config check
William Dauchy (4):
DOC: agent-check: fix typo in "fail" word expected reply
DOC: crt: advise to move away from cert bundle
MINOR: ssl: remove uneeded check in crtlist_parse_file
DOC: ssl: fix typo about ocsp files
William Lallemand (3):
BUG/MINOR: ssl/crt-list: exit on warning out of crtlist_parse_line()
DOC: ssl: new "cert bundle" behavior
CLEANUP: ssl: "bundle" is not an OpenSSL wording
Willy Tarreau (76):
REGTEST: fix host part in balance-uri-path-only.vtc
REGTEST: make ssl_client_samples and ssl_server_samples requiret to 2.3
REGTEST: the iif converter test requires 2.3
REGTEST: make agent-check.vtc require 1.8
REGTEST: make abns_socket.vtc require 1.8
REGTEST: make map_regm_with_backref require 1.7
OPTIM: backend/random: never queue on the server, always on the backend
OPTIM: backend: skip LB when we know the backend is full
BUILD: makefile: add an EXTRAVERSION variable to ease local naming
BUILD: tools: fix minor build issue on isspace()
BUG/MEDIUM: queue: make pendconn_cond_unlink() really thread-safe
DOC: fix a confusing typo on a regsub example
BUG/MINOR: makefile: fix a tiny typo in the target list
REGTESTS: mark abns_socket as broken
MEDIUM: fd: always wake up one thread when enabling a foreing FD
MEDIUM: listeners: don't bounce listeners management between queues
MEDIUM: init: stop disabled proxies after initializing fdtab
MEDIUM: listeners: make unbind_listener() converge if needed
MEDIUM: deinit: close all receivers/listeners before scanning proxies
MEDIUM: listeners: remove the now unused ZOMBIE state
MINOR: listeners: do not uselessly try to close zombie listeners in
soft_stop()
CLEANUP: proxy: remove the first_to_listen hack in zombify_proxy()
MINOR: listeners: introduce listener_set_state()
MINOR: proxy: maintain per-state counters of listeners
MEDIUM: proxy: remove the unused PR_STFULL state
MEDIUM: proxy: remove the PR_STERROR state
MEDIUM: proxy: remove state PR_STPAUSED
MINOR: startup: don't rely on PR_STNEW to check for listeners
CLEANUP: peers: don't use the PR_ST* states to mark enabled/disabled
MEDIUM: proxy: replace proxy->state with proxy->disabled
MEDIUM: proxy: remove start_proxies()
MEDIUM: proxy: merge zombify_proxy() with stop_proxy()
MINOR: listeners: check the current listener state in pause_listener()
MINOR: listeners: check the current listener earlier state in
resume_listener()
MEDIUM: listener/proxy: make the listeners notify about proxy pause/resume
MINOR: protocol: introduce protocol_{pause,resume}_all()
MAJOR: signals: use protocol_pause_all() and protocol_resume_all()
CLEANUP: proxy: remove the now unused pause_proxies() and resume_proxies()
MEDIUM: proto_tcp: make the pause() more robust in multi-process
BUG/MEDIUM: listeners: correctly report pause() errors
MINOR: listeners: move fd_stop_recv() to the receiver's socket code
CLEANUP: protocol: remove the ->disable_all method
CLEANUP: listeners: remove unused disable_listener and
disable_all_listeners
MINOR: listeners: export enable_listener()
MINOR: protocol: directly call enable_listener() from
protocol_enable_all()
CLEANUP: protocol: remove the ->enable_all method
CLEANUP: listeners: remove the now unused enable_all_listeners()
MINOR: protocol: rename the ->listeners field to ->receivers
MINOR: protocol: replace ->pause(listener) with ->rx_suspend(receiver)
MINOR: protocol: implement an ->rx_resume() method
MINOR: listener: use the protocol's ->rx_resume() method when available
MINOR: sock: provide a set of generic enable/disable functions
MINOR: protocol: add a new pair of rx_enable/rx_disable methods
MINOR: protocol: add a new pair of enable/disable methods for listeners
MEDIUM: listeners: now use the listener's ->enable/disable
MINOR: listeners: split delete_listener() in two versions
MINOR: listeners: count unstoppable jobs on creation, not deletion
MINOR: listeners: add a new stop_listener() function
MEDIUM: proxy: make stop_proxy() now use stop_listener()
MEDIUM: proxy: add mode PR_MODE_PEERS to flag peers frontends
MEDIUM: proxy: centralize proxy status update and reporting
MINOR: protocol: add protocol_stop_now() to instant-stop listeners
MEDIUM: proxy: make soft_stop() stop most listeners using
protocol_stop_now()
MEDIUM: udp: implement udp_suspend() and udp_resume()
MINOR: listener: add a few BUG_ON() statements to detect inconsistencies
MEDIUM: listeners: always close master vs worker listeners
BROKEN/MEDIUM: listeners: rework the unbind logic to make it idempotent
MEDIUM: listener: let do_unbind_listener() decide whether to close or not
CLEANUP: listeners: remove the do_close argument to unbind_listener()
MINOR: listeners: move the LI_O_MWORKER flag to the receiver
MEDIUM: receivers: add an rx_unbind() method in the protocols
MINOR: listeners: split do_unbind_listener() in two
MEDIUM: listeners: implement protocol level ->suspend/resume() calls
MEDIUM: config: mark "grace" as deprecated
MEDIUM: config: remove the deprecated and dangerous global "debug"
directive
BUG/MINOR: proxy: respect the proper format string in sig_pause/sig_listen
---