Thanks Willy for these updates. While skimming the results on the interop website, I was surprised that haproxy is always more than 50% slower than its competitors. Is it because you've enabled lots of traces as part of your debugging process for the runs?

Ionel

----- Original message -----
From: "Willy Tarreau" <w...@1wt.eu>
To: "haproxy" <haproxy@formilux.org>
Sent: Saturday, 26 March 2022 10:22:02
Subject: [*EXT*] [ANNOUNCE] haproxy-2.6-dev4

Hi,

HAProxy 2.6-dev4 was released on 2022/03/26. It added 80 new commits after version 2.6-dev3.

The activity started to calm down a bit, which is good because we're roughly 2 months before the release and it will become important to avoid introducing last-minute regressions.

This version mostly integrates fixes for various bugs in various places like stream-interfaces, QUIC, the HTTP client or the trace subsystem. The remaining patches are mostly QUIC improvements and code cleanups. In addition the MQTT protocol parser was extended to also support MQTTv3.1.

A change discussed around the previous announcement was made in the H2 mux: the "timeout http-keep-alive" and "timeout http-request" are now respected and work as documented, so that it will finally be possible to force such connections to be closed when no request comes even if they're seeing control traffic such as PING frames. This can typically happen in some server-to-server communications whereby the client application makes use of PING frames to make sure the connection is still alive. I intend to backport this after some time, probably to 2.5 and later 2.4, as I've got reports about stable versions currently posing this problem.

I'm expecting to see another batch of stream-interface code refactoring that Christopher is still working on. This is a very boring and tedious task that should significantly lower the long-term maintenance effort, so I'm willing to wait a little bit for such changes to be ready. What this means for users is a reduction of the bugs we've seen over the last 2-3 years, alternating between truncated responses and never-dying connections, which result from the difficulty of propagating certain events across multiple layers.

Also William still has some updates to finish on the HTTP client (connection retries, SSL cert verification and host name resolution mainly). On paper, each of them is relatively easy, but practically, since the HTTP client is the first one of its category, each attempt to progress is stopped by the discovery of a shortcoming or bug that was not visible before. Thus the progress takes more time than desired but as a side effect, the core code gets much more reliable by getting rid of these old issues.

One front that made impressive progress over the last few months is QUIC. While a few months ago we were counting the number of red boxes on the interop tests at https://interop.seemann.io/ to figure out what to work on as a top priority, now we're rather counting the number of tests that report a full-green state, and haproxy is now on par with other servers in these tests.

Thus the idea emerged, in order to continue to make progress on this front, to start to deploy QUIC on haproxy.org so that interoperability issues with browsers and real-world traffic can be spotted. A few attempts were made and already revealed issues so for now it's disabled again. Be prepared to possibly observe a few occasional hiccups when visiting the site (and if so, please do complain to us). The range of possible issues would likely be frozen transfers and truncated responses, but these should not happen.

From a technical point of view, the way it's done is by having a separate haproxy process listening to QUIC on UDP port 1443, and forwarding HTTP requests to the existing process. The main process constantly checks the QUIC one, and when it's seen as operational, it appends an Alt-Svc header that indicates to the client that an HTTP/3 implementation is available on port 1443, and that this announcement is valid for a short time (we'll leave it to one minute only so that issues can resolve quickly, but for now it's only 10s so that quick tests cause no harm):

    http-response add-header alt-svc 'h3=":1443"; ma=60' if \
       { var(txn.host) -m end haproxy.org } { nbsrv(quic) gt 0 }

As such, compatible browsers are free to try to connect there or not. Other tools (such as git clone) will not use it.

For those impatient to test it, the QUIC process' status is reported at the bottom of the stats page here: http://stats.haproxy.org/. The "quic" socket in the frontend at the top reports the total traffic received from the QUIC process, so if you're seeing it increase while you reload the page it's likely that you're using QUIC to read it. In Firefox I have this little plugin loaded:

    https://addons.mozilla.org/en-US/firefox/addon/http2-indicator/

It displays a small flash on the URL bar with different colors depending on the protocol used to load the page (H1/SPDY/H2/H3). When that works it's green (H3), otherwise it's blue (H2).

At this point I'd still say "do not reproduce these experiments at home". Amaury and Fred are still watching the process' traces very closely to spot bugs and stop it as soon as a problem is detected. But it's still too early for it to be operated by non-developers. The hope is that by 2.6 we'll reach the point where enthusiasts can deploy a few instances on not-too-sensitive sites with sufficient confidence and a little dose of monitoring.

Finally, there's another topic I'd like to bring to the table now about stuff for post-2.6. Several of us had discussions recently around service discovery in general, and the conclusions that will probably not surprise many people are that:

  - using DNS for service discovery is a disaster. The protocol was never designed for this and presents many shortcomings, starting with partial responses that are the cause of flapping servers that users only work around by artificially extending the hold parameter; several users have experienced processes stuck in O(n^5) resolution loops that were fortunately interrupted by the watchdog(!). Even if the complexity was since reduced to something like O(n^3), it's still not something I would recommend to anyone because you start small at 4 servers and one day you figure that your business has grown to a few tens to hundreds of servers and you don't have the time to make the switch to a different solution anymore.

  - alternatives to DNS for service discovery are mature (various HTTP-based APIs) but the need to update the config file and/or to occasionally reload haproxy when needed makes that incompatible with a location inside the haproxy process itself; in addition, such new mechanisms tend to come with ready-to-use libs for high-level languages that would require a lot of time to reimplement inside haproxy anyway.

  - that would typically be a perfect job for the dataplane-API but it can currently only use the CLI to communicate with haproxy, and this CLI was designed for humans, thus adopting every new command (such as server addition/removal) still takes quite some time, because specific work has to be done for each and every single new command.

As such I find it important that for the long term we'd focus on:

  - improving the communication between the dataplane-API and haproxy; we've had discussions with the dataplane-API team and figured out some points that would make their life much better (ability to dump the list of supported keywords, supporting REST/JSON etc). This also means that we need to be even more careful when extending existing directives with new keywords, to use the existing keyword registration subsystems as much as possible and rely less and less on strcmp() and other parser tricks.

  - getting rid of the unreliable DNS-based discovery once it can be done differently (i.e. no more SRV records nor spraying of random IPs to IP-less servers). The issues in this area managed to keep 3 people busy for 2 months during 2.5 development and it will never fully satisfy users because the concept is fundamentally flawed.

To be honest, I have no idea how long all the stuff above could take, especially if we want to design it correctly and not reproduce mistakes from the past. I would have liked to be able to say "no more DNS-based discovery after 2.6" so that we would start to warn users about possible deprecation, and it would probably be reasonable to think that it would be the last LTS version with this. I'm interested in opinions and feedback about this.

And the next question will obviously be "how could we detect such users and warn them?". Using DNS to resolve server names to IPs is fine (it was initially done for use within AWS). I was thinking about a few possible approaches like detecting the combined use of server-templates and nameservers, or maybe just asking that an acknowledgement keyword is added in nameserver sections referenced by multiple servers to confirm that the warning was read, or detecting the service name syntax using the underscore (not sure it's sufficient). Opinions and ideas are welcome here.

Thanks to those who've read me this far, and have a nice week-end!

Please find the usual URLs below :
   Site index       : http://www.haproxy.org/
   Discourse        : http://discourse.haproxy.org/
   Slack channel    : https://slack.haproxy.org/
   Issue tracker    : https://github.com/haproxy/haproxy/issues
   Wiki             : https://github.com/haproxy/wiki/wiki
   Sources          : http://www.haproxy.org/download/2.6/src/
   Git repository   : http://git.haproxy.org/git/haproxy.git/
   Git Web browsing : http://git.haproxy.org/?p=haproxy.git
   Changelog        : http://www.haproxy.org/download/2.6/src/CHANGELOG
   Cyril's HTML doc : http://cbonte.github.io/haproxy-dconv/

Willy

---
Complete changelog :
Amaury Denoyelle (20):
      CLEANUP: mux-quic: change comment style to not mess with git conflict
      CLEANUP: mux-quic: adjust comment for coding-style
      MINOR: mux-quic: complete trace when stream is not found
      MINOR: mux-quic: add comments for send functions
      MINOR: mux-quic: use shorter name for flow-control fields
      MEDIUM: mux-quic: respect peer bidirectional stream data limit
      MEDIUM: mux-quic: respect peer connection data limit
      MINOR: mux-quic: support MAX_STREAM_DATA frame parsing
      MINOR: mux-quic: support MAX_DATA frame parsing
      MINOR: mux-quic: convert fin on push-frame as boolean
      BUILD: quic: add missing includes
      REORG: quic: use a dedicated quic_loss.c
      MINOR: mux-quic: declare the qmux trace module
      MINOR: mux-quic: replace printfs by traces
      MINOR: mux-quic: add trace event for frame sending
      MINOR: mux-quic: add trace event for qcs_push_frame
      MINOR: mux-quic: activate qmux traces on stdout via macro
      BUILD: qpack: fix unused value when not using DEBUG_HPACK
      CLEANUP: qpack: suppress by default stdout traces
      CLEANUP: h3: suppress by default stdout traces

Christopher Faulet (7):
      REGTESTS: fix the race conditions in be2hex.vtc
      BUG/MEDIUM: applet: Don't call .release callback function twice
      BUG/MEDIUM: cli/debug: Properly get the stream-int in all debug I/O handlers
      BUG/MEDIUM: sink: Properly get the stream-int in appctx callback functions
      BUG/MINOR: rules: Initialize the list element when allocating a new rule
      BUG/MINOR: http-rules: Don't free new rule on allocation failure
      DOC: config: Explictly add supported MQTT versions

Dhruv Jain (1):
      MEDIUM: mqtt: support mqtt_is_valid and mqtt_field_value converters for MQTTv3.1

Frédéric Lécaille (19):
      BUG/MEDIUM: quic: Blocked STREAM when retransmitted
      BUG/MAJOR: quic: Possible crash with full congestion control window
      MINOR: quic: Code factorization (TX buffer reuse)
      CLEANUP: quic: "largest_acked_pn" pktns struc member moving
      MEDIUM: quic: Limit the number of ACK ranges
      MEDIUM: quic: Rework of the TX packets memory handling
      BUG/MINOR: quic: Possible crash in parse_retry_token()
      BUG/MINOR: quic: Possible leak in quic_build_post_handshake_frames()
      BUG/MINOR: quic: Unsent frame because of qc_build_frms()
      BUG/MINOR: mux-quic: Access to empty frame list from qc_send_frames()
      BUG/MINOR: mux-quic: Missing I/O handler events initialization
      BUG/MINOR: quic: Missing TX packet initializations
      BUG/MINOR: quic: 1RTT packets ignored after mux was released
      BUG/MINOR: quic: Incorrect peer address validation
      BUG/MINOR: quic: Non initialized variable in quic_build_post_handshake_frames()
      BUG/MINOR: quic: Wrong TX packet related counters handling
      MINOR: quic: Add traces about stream TX buffer consumption
      MINOR: quic: Add traces in qc_set_timer() (scheduling)
      BUG/MINOR: quic: Wrong buffer length passed to generate_retry_token()

Ilya Shipitsin (1):
      CI: github actions: switch to LibreSSL-3.5.1

Tim Duesterhus (5):
      DEV: coccinelle: Fix incorrect replacement in ist.cocci
      CLEANUP: Reapply ist.cocci with `--include-headers-for-types --recursive-includes`
      DEV: coccinelle: Add a new pattern to ist.cocci
      CLEANUP: Reapply ist.cocci
      REGTESTS: Do not use REQUIRE_VERSION for HAProxy 2.5+

William Lallemand (15):
      BUG/MEDIUM: httpclient: don't consume data before it was analyzed
      CLEANUP: htx: remove unused co_htx_remove_blk()
      BUG/MINOR: httpclient: consume partly the blocks when necessary
      BUG/MINOR: httpclient: remove the UNUSED block when parsing headers
      BUG/MEDIUM: httpclient: must manipulate head, not first
      BUG/MINOR: httpclient/lua: stuck when closing without data
      MINOR: server: export server_parse_sni_expr() function
      BUG/MINOR: httpclient: send the SNI using the host header
      BUILD: httpclient: fix build without SSL
      BUG/MINOR: server/ssl: free the SNI sample expression
      BUG/MINOR: httpclient: only check co_data() instead of HTTP_MSG_DATA
      BUG/MINOR: httpclient: process the response when received before the end of the request
      BUG/MINOR: httpclient: CF_SHUTW_NOW should be tested with channel_is_empty()
      BUG/MINOR: tools: fix url2sa return value with IPv4
      BUG/MINOR: tools: url2sa reads too far when no port nor path

Willy Tarreau (12):
      DEV: udp: switch parser to getopt() instead of positional arguments
      DEV: udp: add support for random packet corruption
      BUG/MINOR: logs: fix logsrv leaks on clean exit
      MINOR: actions: add new function free_act_rule() to free a single rule
      BUG/MINOR: tcp-rules: completely free incorrect TCP rules on error
      BUG/MINOR: http-rules: completely free incorrect TCP rules on error
      BUG/MEDIUM: mux-h1: only turn CO_FL_ERROR to CS_FL_ERROR with empty ibuf
      BUG/MEDIUM: stream-int: do not rely on the connection error once established
      BUG/MEDIUM: trace: avoid race condition when retrieving session from conn->owner
      MEDIUM: mux-h2: slightly relax timeout management rules
      BUG/MEDIUM: mux-h2: make use of http-request and keep-alive timeouts
      BUILD: stream-int: avoid a build warning when DEBUG is empty

---