> Actually that's not the case and that may explain the situation. The machine > runs OpenSSL 1.0.1 so only NPN is used, ALPN isn't. I'll try with a static > build of openssl 1.0.2 to see if the ratio increases.
That might very well be the case, I know for sure that Chrome dropped support NPN and requires 1.0.2 for http2 to work, and I’m suspecting the same with recent Firefox versions (at least on Windows and Mac) seems to only work with ALPN, I’ll check haproxy.org later to see if it then works, and see if I can replicate the issue in Firefox with GET requests being “lost” sometimes. > That's interesting because if the client managed to display the error, it > means the stream was not reset and the connection not aborted. So in fact I > suspect a side effect of the work done to better support the abortonclose > option. At least in 1.8.2 I see a huge amount of failed POST requests when http2 is enabled, like 30%+ sometimes even a lot higher. And those requests do give a 400 Bad Request, nothing “odd” other than the fact that haproxy believes the request got aborted by the client (CH). >> Which is the case for 1.8.2 and latest master. >> >> If I do the patch for 1.8.1 I still get BADREQ sometimes in haproxy with the >> "empty" GET requests. > Just to be clear, do you mean you *only* get the BADREQ with GET (ie it never > happens with POST) or that you *also* get the BADREQ with GET ? I mean, given > that we're fixing bugs, I'm not interested in seeing if patches also work as > alternatives to bugs we've already fixed, but if you're observing > regressions, that's different. 1.8.1: Only <BADREQ> (CR) on GET requests occasionally 1.8.2 + master: Bad Request (CH) on POST/PUT 30%+ of the time and <BADREQ> (CR) on GET requests occasionally Best Regards, Lucas Rolff On 29/12/2017, 11.13, "Willy Tarreau" <w...@1wt.eu> wrote: On Fri, Dec 29, 2017 at 08:46:18AM +0000, Lucas Rolff wrote: > > Yep. For what it's worth, it's been enabled for about one month on haproxy.org and till now we didn't get any bad report, which is pretty encouraging. > > Can I ask where? The negotiated protocol I get on https://haproxy.org/ is > http/1.1 in both Google Chrome and Firefox as an example. That's getting funny, as I'm having H2 here on firefox, as can be seen in the attached capture :-) Looking at the server's logs of the last hour, I'm clearly seeing *some* H2 traffic, about 5.5% of the HTTPS traffic : -bash-4.2$ fgrep 'public~' h.log | grep www.haproxy.org | wc -l 1804 -bash-4.2$ fgrep 'public~' h.log | grep www.haproxy.org | grep HTTP/2 | wc -l 108 I think this ratio is rather low and there are bots. If I focus only on the home page, it's slightly better but still low : -bash-4.2$ fgrep 'public~' h.log | fgrep 'www.haproxy.org/ ' | grep HTTP/2 | wc -l 21 -bash-4.2$ fgrep 'public~' h.log | fgrep 'www.haproxy.org/ ' | wc -l 298 And the fact that your browser doesn't negociate it certainly implies than a number of other browsers do not either. > If I use curl, I can see it has ALPN enabled with http2 - however, mentioned > browsers doesn't seem to actually establish the http2 connection, but rather > a 1.1 connection. Actually that's not the case and that may explain the situation. The machine runs OpenSSL 1.0.1 so only NPN is used, ALPN isn't. I'll try with a static build of openssl 1.0.2 to see if the ratio increases. > For POST requests, it's not resolved with the patch, what I did find when > getting the 400 Bad Request on POST, it sometimes actually arrive at the > backend: > > So haproxy: > > Dec 29 08:01:36 localhost haproxy[4432]: 92.70.20.xx:52949 [29/Dec/2017:08:01:36.800] https_frontend~ cdn-backend/mycdn 0/0/1/-1/3 400 187 - - CH-- 1/1/0/0/0 0/0 "POST /login HTTP/1.1" > Dec 29 08:01:36 localhost haproxy[4432]: 92.70.20.xx:52949 [29/Dec/2017:08:01:36.806] https_frontend~ cdn-backend/mycdn 0/0/0/-1/2 400 187 - - CH-- 1/1/0/0/0 0/0 "POST /login HTTP/1.1" We definitely need to understand what's causing this. I'm currently thinking how to add more information to improve observability inside the mux to match what can currently be done outside (ie: we don't have the equivalent of the "show errors" nor error counters inside the mux). > https://snaps.hcdn.dk/AONBkWojDFwuErMjxF66Mg4VxJl9jz4KUM61jPpgcL.png - > however the request is ~ 68-69 ms, and that browser is Google Chrome, so in > that regard, the 400 Bad Request for POST/PUT also happens in Google Chrome. That's interesting because if the client managed to display the error, it means the stream was not reset and the connection not aborted. So in fact I suspect a side effect of the work done to better support the abortonclose option. > Which is the case for 1.8.2 and latest master. > > If I do the patch for 1.8.1 I still get BADREQ sometimes in haproxy with the "empty" GET requests. Just to be clear, do you mean you *only* get the BADREQ with GET (ie it never happens with POST) or that you *also* get the BADREQ with GET ? I mean, given that we're fixing bugs, I'm not interested in seeing if patches also work as alternatives to bugs we've already fixed, but if you're observing regressions, that's different. In short what I'm interested in is : - bugs that are (still) present in supposedly fixed (hence recent) versions ; - among them, those that were not present in older versions (which are thus regressions) > On nginx backend it looks like this: > > https://gist.github.com/lucasRolff/5eb0ae277c97b7457fbe546a1118e34f > > I do see a few connection resets, however these happen also when things work. OK. > However, that also confirms that "connection" header isn't actually sent, > despite Firefox says it's sent in their dev tools. Great, this means that we *have* to consider errors reported by this version as it's supposed to be trusted then. > So, just so no misunderstandings happen, it seems there's two things going wrong: > > ======= > 1.8.1: > - Firefox sometimes has issues with GET requests failing under http2 (Confirmed by Maximilian B) > - Haproxy log shows: Dec 29 08:17:50 localhost haproxy[4881]: 92.70.20.xx:56726 [29/Dec/2017:08:17:17.178] https_frontend~ https_frontend/<NOSRV> -1/-1/-1/-1/33388 400 0 - - CR-- 24/1/0/0/0 0/0 "<BADREQ>" (CR == The client aborted before sending a full HTTP request) > > 1.8.2 and git master: > - All browsers seem to suffer with occasional issues with POST/PUT requests, the GET requests issue still persist). (POST confirmed by Lukas T) > - Haproxy log shows (for POST requests): Dec 29 08:09:27 localhost haproxy[4432]: 92.70.20.xx:54951 [29/Dec/2017:08:09:27.167] https_frontend~ cdn-backend/mycdn 0/0/0/-1/2 400 187 - - CH-- 1/1/0/0/0 0/0 "POST /login HTTP/1.1" (CH == The client aborted while waiting for the server to start responding), while backend server will see the request as aborted by the client as well. > - - Client in scope of haproxy == browser > - - Client in scope of backend == haproxy I really suspect all of these are still related to the fact that the stream is half-closed during the transfer to H1 and that this continues to trigger some more sensitive code paths that were not used to see this happen in the past. > Semi conclusion: > - Connection headers are not sent by Firefox, Safari or Webkit, despite > webdev tools say so (in current versions at least) > - Not able to replicate GET request failures in other browsers than Firefox > despite doing thousands of requests (Tested Chrome, Safari, Opera, Webkit) OK, this is quite useful. > - Some bug seems like it got introduced between 1.8.1 and 1.8.2 causing > POST/PUT requests to fail in some cases I'm particularly interested in figuring exactly what changed here because most of the H2 changes there were made to actually *fix* the POST that simply didn't work in 1.8.1. But it's possible that some corner cases remain and that these ones were actually the only ones working by then :-/ > What I'd like to know: > - URL on haproxy.org that negotiates http2 correctly for Chrome and Firefox > (not exactly sure why it doesn't do it already?) ( > https://snaps.hcdn.dk/JFsEzPmxspw9hnXuFyM5G4QGYst7Q6R2zXmkZbRRjz.png ) For me it's https://www.haproxy.org/ but as indicated above it may be a matter of NPN vs ALPN so I'll go rebuild haproxy there with a static openssl 1.0.2 to see if it changes anything. Thanks! Willy