Hi,

I haven't been paying much attention to the list lately, but I am wondering what the current status of http/2 support is in 1.8-(dev|snapshot).

Is it in a usable-but-needs-testing state? Or more like a stay-away-because-it-kills-kittens state?

Greets,

Sander

On 2017-08-18 16:49, Willy Tarreau wrote:
...well, I think everything is in the subject :-)

Hi, by the way!

I'm able to gateway http/2 traffic to www.haproxy.org and am getting logs
to prove it :

   <134>Aug 18 15:56:51 haproxy[6566]: 127.0.0.1:43740
[18/Aug/2017:15:56:51.282] www~ www/<H2CFRT> -1/13/0/-1/18 0 15 - -
---- 1/1/0/0/0 0/0 http=1 "<BADREQ>"
   <134>Aug 18 15:56:51 haproxy[6566]: 127.0.0.1:43746
[18/Aug/2017:15:56:51.302] www~ www/www 0/0/58/18/104 200 36300 - -
CD-- 1/1/0/0/0 0/0 http=2 "GET / HTTP/1.1"
   <134>Aug 18 15:56:51 haproxy[6566]: 127.0.0.1:43746
[18/Aug/2017:15:56:51.415] www~ www/www 0/0/30/16/46 200 504 - - CD--
1/1/0/0/0 0/0 http=2 "GET /size.js HTTP/1.1"
   <134>Aug 18 15:56:51 haproxy[6566]: 127.0.0.1:43746
[18/Aug/2017:15:56:51.466] www~ www/www 0/0/30/16/46 200 215 - - CD--
12/12/11/11/0 0/0 http=2 "GET /size.css?1509x761 HTTP/1.1"
   <134>Aug 18 15:56:51 haproxy[6566]: 127.0.0.1:43746
[18/Aug/2017:15:56:51.491] www~ www/www 0/0/25/19/44 200 11198 - -
CD-- 13/13/12/12/0 0/0 http=2 "GET /img/HAProxy_mini_pub.gif HTTP/1.1"
   <134>Aug 18 15:56:51 haproxy[6566]: 127.0.0.1:43746
[18/Aug/2017:15:56:51.492] www~ www/www 0/0/26/19/45 200 10443 - -
CD-- 12/12/11/11/0 0/0 http=2 "GET /img/POM_mini_pub.gif HTTP/1.1"
   <134>Aug 18 15:56:51 haproxy[6566]: 127.0.0.1:43746
[18/Aug/2017:15:56:51.491] www~ www/www 0/0/28/19/47 200 7772 - - CD--
11/11/10/10/0 0/0 http=2 "GET /img/ALOHA_mini_pub.gif HTTP/1.1"
   <134>Aug 18 15:56:51 haproxy[6566]: 127.0.0.1:43746
[18/Aug/2017:15:56:51.491] www~ www/www 0/0/29/22/51 200 1731 - - CD--
10/10/9/9/0 0/0 http=2 "GET /img/btn_donate_SM_eur.gif HTTP/1.1"
   <134>Aug 18 15:56:51 haproxy[6566]: 127.0.0.1:43746
[18/Aug/2017:15:56:51.489] www~ www/www 0/0/29/24/53 200 3743 - - CD--
9/9/8/8/0 0/0 http=2 "GET /img/logo-med.png HTTP/1.1"
   <134>Aug 18 15:56:51 haproxy[6566]: 127.0.0.1:43746
[18/Aug/2017:15:56:51.490] www~ www/www 0/0/29/23/52 200 1729 - - CD--
8/8/7/7/0 0/0 http=2 "GET /img/btn_donate_SM_usd.gif HTTP/1.1"
   <134>Aug 18 15:56:51 haproxy[6566]: 127.0.0.1:43746
[18/Aug/2017:15:56:51.500] www~ www/www 0/0/26/18/44 200 3220 - - CD--
7/7/6/6/0 0/0 http=2 "GET /img/haproxy-pmode.png HTTP/1.1"
   <134>Aug 18 15:56:51 haproxy[6566]: 127.0.0.1:43746
[18/Aug/2017:15:56:51.501] www~ www/www 0/0/26/18/44 200 2261 - - CD--
6/6/5/5/0 0/0 http=2 "GET /img/pwby.gif HTTP/1.1"
   <134>Aug 18 15:56:51 haproxy[6566]: 127.0.0.1:43746
[18/Aug/2017:15:56:51.492] www~ www/www 0/0/26/31/58 200 19247 - -
CD-- 5/5/4/4/0 0/0 http=2 "GET /img/World_IPv6_launch_banner_256.png
HTTP/1.1"
   <134>Aug 18 15:56:51 haproxy[6566]: 127.0.0.1:43746
[18/Aug/2017:15:56:51.501] www~ www/www 0/0/25/24/49 200 396 - - CD--
4/4/3/3/0 0/0 http=2 "GET /img/fr-off.png HTTP/1.1"
   <134>Aug 18 15:56:51 haproxy[6566]: 127.0.0.1:43746
[18/Aug/2017:15:56:51.501] www~ www/www 0/0/25/25/50 200 441 - - CD--
3/3/2/2/0 0/0 http=2 "GET /img/en-off.png HTTP/1.1"
   <134>Aug 18 15:56:51 haproxy[6566]: 127.0.0.1:43746
[18/Aug/2017:15:56:51.514] www~ www/www 0/0/28/15/43 200 850 - - CD--
2/2/1/1/0 0/0 http=2 "GET /img/ipv6nok.gif HTTP/1.1"
   <134>Aug 18 15:56:51 haproxy[6566]: 127.0.0.1:43746
[18/Aug/2017:15:56:51.525] www~ www/www 0/0/30/232/262 200 376 - -
CD-- 1/1/0/0/0 0/0 http=2 "GET /img/ipv6back.png HTTP/1.1"
   <134>Aug 18 15:57:11 haproxy[6566]: 127.0.0.1:43746
[18/Aug/2017:15:56:51.300] www~ www/<H2CFRT> -1/2/0/-1/20489 0 99131 -
- cD-- 0/0/0/0/0 0/0 http=1 "<BADREQ>"

Look at the accept dates for the requests, many of them are grouped, and
there's this "http=2" field in the log indicating the on-wire format.

But you'll also note all the "CD--" flags, the "<BADREQ>" etc...
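
For those who want to check the grouping without squinting at the logs,
here is a minimal Python sketch. It assumes the log lines have been
re-joined into single lines (the mail wrapped them) and that they follow
the log-format used in this mail, with its "http=" field. The names
parse_line and summarize and the 100 ms window are only illustrative
choices, nothing haproxy provides.

```python
import re
from collections import Counter
from datetime import datetime

# Accept date in brackets, then later the http=<n> field and the quoted request.
# The date pattern is strict so that "haproxy[6566]" is not mistaken for it.
LOG_RE = re.compile(
    r'\[(?P<accept>\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2}\.\d{3})\]'
    r'.*\bhttp=(?P<wire>\d+) "(?P<request>[^"]*)"'
)

# Four of the log lines from this mail, unwrapped.
SAMPLE_LINES = [
    '<134>Aug 18 15:56:51 haproxy[6566]: 127.0.0.1:43740 [18/Aug/2017:15:56:51.282] www~ www/<H2CFRT> -1/13/0/-1/18 0 15 - - ---- 1/1/0/0/0 0/0 http=1 "<BADREQ>"',
    '<134>Aug 18 15:56:51 haproxy[6566]: 127.0.0.1:43746 [18/Aug/2017:15:56:51.491] www~ www/www 0/0/25/19/44 200 11198 - - CD-- 13/13/12/12/0 0/0 http=2 "GET /img/HAProxy_mini_pub.gif HTTP/1.1"',
    '<134>Aug 18 15:56:51 haproxy[6566]: 127.0.0.1:43746 [18/Aug/2017:15:56:51.492] www~ www/www 0/0/26/19/45 200 10443 - - CD-- 12/12/11/11/0 0/0 http=2 "GET /img/POM_mini_pub.gif HTTP/1.1"',
    '<134>Aug 18 15:56:51 haproxy[6566]: 127.0.0.1:43746 [18/Aug/2017:15:56:51.525] www~ www/www 0/0/30/232/262 200 376 - - CD-- 1/1/0/0/0 0/0 http=2 "GET /img/ipv6back.png HTTP/1.1"',
]


def parse_line(line):
    """Return (accept_time, wire_format, request) or None if the line doesn't match."""
    m = LOG_RE.search(line)
    if m is None:
        return None
    accept = datetime.strptime(m.group("accept"), "%d/%b/%Y:%H:%M:%S.%f")
    return accept, int(m.group("wire")), m.group("request")


def summarize(lines):
    """Count lines per on-wire format, and http=2 requests per 100 ms accept window."""
    per_wire = Counter()
    per_window = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        accept, wire, _request = parsed
        per_wire[wire] += 1
        if wire == 2:
            bucket = (accept.microsecond // 100_000) * 100_000
            per_window[accept.replace(microsecond=bucket)] += 1
    return per_wire, per_window


if __name__ == "__main__":
    per_wire, per_window = summarize(SAMPLE_LINES)
    for wire, count in sorted(per_wire.items()):
        print(f"http={wire}: {count} line(s)")
    for window, count in sorted(per_window.items()):
        print(f"{window.time()} +100ms: {count} H2 request(s) accepted")
```

Several requests landing in the same window on a single client port is
what the parallelism looks like from the log side.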

The code more or less works. There are still some race conditions that
will occasionally cause some requests to time out, especially if you
build with "-DDEBUG_H2" which will emit a lot of printf.

At least now with this code in place I could understand what is wrong
and how it should be re-architected. There's still a lot of work to do
in this area (there are some design notes and contradictory thoughts
in doc/internals/h2.txt) but I thought that now that it's more or less
working and that I'm going to break it and restart it from scratch
differently, it could be nice that I share it for those curious who
want to play with it a bit.

DON'T PUT THIS IN PRODUCTION!!! There are a lot of unhandled errors,
there are occasional leaks due to certain races not being caught etc.
I'm not even going to put it myself in front of haproxy.org nor at
home. It may start a fire in your house, attract UFOs full of
man-eating aliens, or even make me temporarily smart, nothing you want
to experience!

The design for now consists in demultiplexing the H2 streams from
the incoming connection, translating them into H1 and processing H1
requests just like before. That's why the logs still report "HTTP/1.1",
it's what is presented in the version string.
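
To give an idea of what "translating them into H1" means for a single
stream, here is a rough Python sketch. It is not the haproxy code, only
an illustration: the function name h2_to_h1_request is made up, the
input is assumed to be the already-decoded header list of one stream,
and request bodies and CONNECT are ignored. The pseudo-header names
(":method", ":path", ":authority") are the ones defined by the HTTP/2
specification.

```python
def h2_to_h1_request(headers):
    """Build an HTTP/1.1 request head from one H2 stream's decoded headers.

    headers is a list of (name, value) pairs, pseudo-headers included.
    """
    pseudo = {}
    regular = []
    for name, value in headers:
        if name.startswith(":"):
            pseudo[name] = value
        else:
            regular.append((name, value))

    # The request line always says HTTP/1.1, which is why the logs show it.
    lines = [f"{pseudo[':method']} {pseudo[':path']} HTTP/1.1"]

    # H2 carries the target host in :authority; H1 needs a host header.
    if ":authority" in pseudo and not any(n == "host" for n, _ in regular):
        lines.append(f"host: {pseudo[':authority']}")

    # Names stay lower case, exactly as they arrive in H2.
    lines.extend(f"{name}: {value}" for name, value in regular)
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    head = h2_to_h1_request([
        (":method", "GET"),
        (":scheme", "https"),
        (":path", "/size.js"),
        (":authority", "h2.haproxy.org"),
        ("accept", "*/*"),
    ])
    print(head.replace("\r\n", "\n"), end="")
```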

Among the things that are still limitations that could possibly be
overcome before the 1.8 release, I can cite :

  - header field names don't have any single upper case letter (as is
    the case in H2), so it might be possible that some bogus hard-coded
    products don't match "Host" when "host:" is present for example.
    It's not very hard to place an upper case after each "-" but for
    now it's not done ; I'm interested in interoperability issue
    reports if any.

  - no server-side keep-alive for now. The cause is simple : the streams
    in HTTP/2 only transport a single request so there is no reuse by
    the same stream and for now we have nowhere to place our idle
    connections, meaning that they have to be closed all the time ; we
    definitely need to address this by having per-session lists, but it
    will not be trivial so I preferred to focus on protocol processing
    for now, which already isn't the easiest thing to implement ;

  - no data upload yet, DATA frames are simply ignored, so POST, PUT and
    CONNECT will not work. It's not a huge amount of work to get them
    to work now but doing it in an architecture that's going to die is
    pointless.

  - CONTINUATION frames are not supported for now, it's due to the current
    architecture making this very complicated.

  - it's mandatory that the buffer size (tune.bufsize) is at least 16384
    bytes. There's no such control for now during the config parsing so
    it starts and H2 connections are simply aborted when it's discovered
    that the buffer is too small and the browser retries using HTTP/1.

  - trailers are not forwarded from the server to the client yet. Not
    critical for most tests.

  - logs also report the H2 front connection and bad requests. This will
    have to be addressed later.

  - request/response aborts are not translated to RST_STREAM frames yet,
    so I guess it's the browser which will detect some inconsistency and
    break the connection, or we'll send a GOAWAY frame to break it.

  - no graceful shutdown of the connection (will probably not exist for
    the release, we'll see).

  - the HTTP/1 to 2 upgrade method is not supported yet, though I'm not
    much worried about it at the moment.

  - no trivial way to report HTTP/2 in the logs. I'm using a sample
    fetch function reporting the on-wire format as 1 or 2 for now. I
    considered replacing "HTTP/1.1" with "HTTP/2.0" in the logs but
    that's inaccurate since we really process "1.1" so it might be
    confusing to those dealing with regex which don't seem to match,
    and in addition "HTTP/2.0" is not the correct version string, the
    correct one is "HTTP/2". But writing this without the dot and the
    minor version is going to break some log processing tools. Thus I
    was thinking about having some optional fields that are supposed
    to be easy to use. Note that we had the same issue with SSL long
    ago, ending with "~" after the frontend's name in the logs...
    Better avoid this for H2. Ideas are welcome.
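
Regarding the first item of the list, placing an upper case letter at
the start and after each "-" is indeed simple. Here is a small Python
sketch of the idea, only as an illustration (the function name
capitalize_header_name is made up, and nothing like this is done in the
branch) :

```python
def capitalize_header_name(name):
    """Upper-case the first letter of each "-"-separated part of a header name."""
    return "-".join(part[:1].upper() + part[1:] for part in name.split("-"))


if __name__ == "__main__":
    for h2_name in ("host", "content-type", "x-forwarded-for"):
        print(h2_name, "->", capitalize_header_name(h2_name))
```

Note that this gives "Www-Authenticate" rather than the usual
"WWW-Authenticate", so products doing exact matches on such names would
still not be happy.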

There are some good points as well, proving we're not too far. For
example, the stats work over H2, even when enabling compression!

I'm pretty sure you'll find a ton of bugs there, and don't look at
the code too closely, sometimes I really had to put a few lines just
to plug a hole through which water was leaking.

I'm more interested in observations, like "oh too bad you didn't do
this" or "why not report the version this way" etc. You'll all see
that it's easier to think about it when playing with it.

Currently I'm running it locally on my laptop, listening on 127.0.0.2
for traffic that my browser sends there when I want to access
h2.haproxy.org or h2.haproxy.com (then the Host header gets rewritten).
I could notice already that loading haproxy.com through this is a bit
faster thanks to the higher parallelism allowing more objects to be
loaded at once.

Those who want to test will have to retrieve branch 20170818-h2-2
from the git development repository :

    git clone -b 20170818-h2-2 http://git.haproxy.org/git/haproxy.git/

It builds as usual. For the config, you have to specify "alpn h2" on
the "bind" line. This requires openssl-1.0.2. If you have an older
version you can try with "npn h2", which some browsers support as well.
The first failed connection in the logs above was made with ALPN then a
fallback to NPN was done. Or maybe both were attempted in parallel, I
haven't checked much.

The working config I'm using here is the following :

    global
        log 127.0.0.1:5514 local0
        stats socket /tmp/sock1 mode 666 level admin
        stats timeout 1h
        tune.ssl.default-dh-param 1024
        ssl-server-verify none
        tune.bufsize 16384

    defaults
        mode http
        timeout client 20s
        timeout server 20s
        timeout connect 1s

    listen www
        log global
        log-format "%ci:%cp [%tr] %ft %b/%s %TR/%Tw/%Tc/%Tr/%Ta %ST %B %CC %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs http=%[http_major] %{+Q}r"
        bind 127.0.0.2:443 ssl crt rsa+dh2048.pem alpn h2
        server www www.haproxy.org:443 ssl verify none
        http-request set-header host www.haproxy.org
        stats uri /stat
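
Since the config parser does not verify the two H2 prerequisites
mentioned in this mail (tune.bufsize of at least 16384, and "alpn h2"
or "npn h2" on a "bind" line), a quick external check can help before
starting. The following Python sketch is only an assumption of how such
a check could look: the function name check_h2_config is made up, and
it does a naive line scan, not a real parse of the config.

```python
H2_MIN_BUFSIZE = 16384  # minimum buffer size required for H2, as stated above


def check_h2_config(text):
    """Return a list of human-readable problems found in a haproxy config text."""
    problems = []
    bufsize = None
    h2_binds = 0
    for raw in text.splitlines():
        # Naive comment stripping; good enough for a sketch.
        words = raw.split("#", 1)[0].split()
        if not words:
            continue
        if words[0] == "tune.bufsize" and len(words) > 1:
            bufsize = int(words[1])
        elif words[0] == "bind":
            for key in ("alpn", "npn"):
                # The keyword must be followed by a protocol list.
                if key in words[:-1]:
                    protos = words[words.index(key) + 1].split(",")
                    if "h2" in protos:
                        h2_binds += 1
                        break
    if bufsize is None:
        problems.append("tune.bufsize not set, check that the default is >= %d"
                        % H2_MIN_BUFSIZE)
    elif bufsize < H2_MIN_BUFSIZE:
        problems.append("tune.bufsize %d is below %d" % (bufsize, H2_MIN_BUFSIZE))
    if h2_binds == 0:
        problems.append('no "bind" line advertises h2 via alpn or npn')
    return problems


if __name__ == "__main__":
    sample = """
    global
        tune.bufsize 16384
    listen www
        bind 127.0.0.2:443 ssl crt rsa+dh2048.pem alpn h2
    """
    print(check_h2_config(sample) or "looks OK for H2 testing")
```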

And in my /etc/hosts, I have :

   127.0.0.2   h2.haproxy.org

Then I can aim my browser at h2.haproxy.org, get over the cert warning,
and continue from there. It's interesting to see how certain sites with
many objects react, even those which are far away, because the browser
visibly allows much more parallelism than on HTTP/1. If some want to try
on www.haproxy.com (many more objects, it's a modern site :-)) you'll
need two such instances, one for "h2." replacing "www" and another one
for "cdn." running on a different address. And you'll have to modify
your /etc/hosts and later complain that you can't access the site
anymore after you forget your test (it happened to me already). So it's
better if you find another site where most of the objects are on the
same host matching the URL bar, it's much easier to configure and to
test.

I'm not really asking for bug reports at this step, I expect a lot.
However if you manage to find 100% reproducible cases for connection
hangs, timeouts or stuff like this, it can still indicate I missed
something, so do not hesitate. Just don't be sad if I only respond
"yeah I know".

I'm still hoping for something working (a bit better) by mid to late
October, which means we could have a very fresh H2 support in haproxy
1.8, very likely still tagged experimental.

Have fun,
Willy
