Re: HTTP/2 Termination vs. Firefox Quantum
On Sun, Dec 31, 2017 at 07:22:26AM +, Lucas Rolff wrote:
> I've tested the 1.8.3 build, and I can indeed confirm it works like a charm!

Great, thank you for confirming. We're making progress :-)

> @Willy, thanks for the extensive time you spent on debugging and
> investigating this as well!

You're welcome!
Willy
Re: HTTP/2 Termination vs. Firefox Quantum
I've tested the 1.8.3 build, and I can indeed confirm it works like a charm!

@Willy, thanks for the extensive time you spent on debugging and investigating this as well!

Best Regards,
Lucas Rolff
Re: HTTP/2 Termination vs. Firefox Quantum
Just a quick update for all those following this thread. Thanks to Lucas' traces, we could finally fix the problem in the hpack decoder. I'm now releasing 1.8.3, which should be much more usable with H2.

Willy
Re: HTTP/2 Termination vs. Firefox Quantum
On Fri, Dec 29, 2017 at 08:45:57PM +, Lucas Rolff wrote:
> So, this Wx1ZSI usually should be "cookie" - however it's somehow turned into
> garbage.

Ah, this is what I was wondering.

> Repeated - now it's s3U2JV - but still supposed to be "cookie":

Great, so the number of characters is correct but not the decoding. I'll instrument the huffman decoder to dump the original code. It's possible that we have a bug there. It passed all the 26k tests I found, but that doesn't mean it never breaks! Another possibility would be that this one is passed as "never index" and that we'd have a bug there.

Here comes a patch that will dump the raw hpack frame so that I can run it by hand and spot if we're getting something wrong. Beware, it *will* reveal every raw header your browser sends, including the domain name or authentication tokens if any. If you don't feel comfortable with sending this to the list, please just send it to me in private. It should be compatible with the current patch, helping to sort it out.

Thanks a lot for your help!
Willy

diff --git a/src/hpack-dec.c b/src/hpack-dec.c
index 454f55c..d5d282c 100644
--- a/src/hpack-dec.c
+++ b/src/hpack-dec.c
@@ -155,6 +155,8 @@ int hpack_decode_frame(struct hpack_dht *dht, const uint8_t *raw, uint32_t len,
 	int must_index;
 	int ret;
 
+	debug_hexdump(stderr, "[HPACK-DEC] ", raw, 0, len);
+
 	chunk_reset(tmp);
 	ret = 0;
 	while (len) {
Re: HTTP/2 Termination vs. Firefox Quantum
<<<
:authority: dashboard.domain.com
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:59.0) Gecko/20100101 Firefox/59.0
accept: text/css,*/*;q=0.1
accept-language: da,en-US;q=0.8,en;q=0.6,es;q=0.4,tr;q=0.2
accept-encoding: gzip, deflate, br
referer: https://dashboard.domain.com/stats/6
cookie: _ga=GA1.2.2085297229.1474098197
Wx1ZSI: XSRF-TOKEN=SECURE_TOKEN%3D
cookie: laravel_session=SECURE_SESSION%3D%3D
pragma: no-cache
cache-control: no-cache
###

So, this Wx1ZSI usually should be "cookie" - however it's somehow turned into garbage.

Repeated - now it's s3U2JV - but still supposed to be "cookie":

<<<
:authority: dashboard.domain.com
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:59.0) Gecko/20100101 Firefox/59.0
accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
accept-language: da,en-US;q=0.8,en;q=0.6,es;q=0.4,tr;q=0.2
accept-encoding: gzip, deflate, br
referer: https://dashboard.domain.com/stats/1
cookie: _ga=GA1.2.2085297229.1474098197
s3U2JV: XSRF-TOKEN=SECURE_TOKEN%3D
cookie: laravel_session=SECURE_SESSION%3D%3D
upgrade-insecure-requests: 1
pragma: no-cache
cache-control: no-cache
###

It's consistently the cookie header that fails. Some repeated requests, all related to cookie, where the header field became:

6InNEa
InVMdk

Best Regards,
Lucas Rolff

On 29/12/2017, 21.21, "Willy Tarreau" wrote:

    On Fri, Dec 29, 2017 at 06:56:36PM +, Lucas Rolff wrote:
    > h2_make_h1_request:153
    > h2_frt_decode_headers:2621
    > h2_frt_decode_headers:2643
    >
    > /* this can be any type of header */
    > /* RFC7540#8.1.2: upper case not allowed in header field names */
    > for (i = 0; i < list[idx].n.len; i++)
    > 	if ((uint8_t)(list[idx].n.ptr[i] - 'A') < 'Z' - 'A')
    > 		goto fail;
    >
    > That's an interesting place to fail

    OK I can propose the attached patch which will dump all the requests to stderr, as they are received or extracted from the dynamic headers table. The patch needs to be applied without the previous ones.

    This will look like this :

    <<<
    :authority: 127.0.0.1:4443
    user-agent: curl/7.57.0
    accept: */*
    >>>

    <<<
    :authority: 127.0.0.1:4443
    user-agent: curl/7.57.0
    accept: */*
    aaa: AaA
    >>>

    The '<<<' and '>>>' enclose a request. The final one will instead use "###" to indicate that at least one bad char was received, or '!!!' to indicate that another error was met. Please note that it will silently let the request pass through so you need to check the output to see if these "###" happen. Maybe we'll find a bug in the dynamic headers table causing some crap to be returned. Or maybe we'll find that a given browser occasionally sends a bad header.

    Cheers,
    willy
Re: HTTP/2 Termination vs. Firefox Quantum
On Fri, Dec 29, 2017 at 06:56:36PM +, Lucas Rolff wrote:
> h2_make_h1_request:153
> h2_frt_decode_headers:2621
> h2_frt_decode_headers:2643
>
> /* this can be any type of header */
> /* RFC7540#8.1.2: upper case not allowed in header field names */
> for (i = 0; i < list[idx].n.len; i++)
> 	if ((uint8_t)(list[idx].n.ptr[i] - 'A') < 'Z' - 'A')
> 		goto fail;
>
> That's an interesting place to fail

OK I can propose the attached patch which will dump all the requests to stderr, as they are received or extracted from the dynamic headers table. The patch needs to be applied without the previous ones.

This will look like this :

<<<
:authority: 127.0.0.1:4443
user-agent: curl/7.57.0
accept: */*
>>>

<<<
:authority: 127.0.0.1:4443
user-agent: curl/7.57.0
accept: */*
aaa: AaA
>>>

The '<<<' and '>>>' enclose a request. The final one will instead use "###" to indicate that at least one bad char was received, or '!!!' to indicate that another error was met. Please note that it will silently let the request pass through so you need to check the output to see if these "###" happen. Maybe we'll find a bug in the dynamic headers table causing some crap to be returned. Or maybe we'll find that a given browser occasionally sends a bad header.

Cheers,
willy

diff --git a/src/h2.c b/src/h2.c
index 43ed7f3..9ca60de 100644
--- a/src/h2.c
+++ b/src/h2.c
@@ -31,6 +31,8 @@
 #include
 #include
+#include
+#include
 
 /* Prepare the request line into <*ptr> (stopping at ) from pseudo headers
  * stored in . indicates what was found so far. This should be
@@ -134,7 +136,9 @@ int h2_make_h1_request(struct http_hdr *list, char *out, int osize)
 	int phdr;
 	int ret;
 	int i;
+	int bad=0;
 
+	fprintf(stderr, "<<<\n");
 	lck = ck = -1; // no cookie for now
 	fields = 0;
 	for (idx = 0; list[idx].n.len != 0; idx++) {
@@ -145,9 +149,13 @@ int h2_make_h1_request(struct http_hdr *list, char *out, int osize)
 	else {
 		/* this can be any type of header */
 		/* RFC7540#8.1.2: upper case not allowed in header field names */
+
+		fprintf(stderr, "%s: ", istpad(trash.str, list[idx].n).ptr);
+		fprintf(stderr, "%s\n", istpad(trash.str, list[idx].v).ptr);
+
 		for (i = 0; i < list[idx].n.len; i++)
 			if ((uint8_t)(list[idx].n.ptr[i] - 'A') < 'Z' - 'A')
-				goto fail;
+				bad=1;
 
 		phdr = h2_str_to_phdr(list[idx].n);
 	}
@@ -296,8 +304,13 @@ int h2_make_h1_request(struct http_hdr *list, char *out, int osize)
 	*(out++) = '\n';
 	ret = out + osize - out_end;
  leave:
+	if (!bad)
+		fprintf(stderr, ">>>\n");
+	else
+		fprintf(stderr, "###\n");
 	return ret;
  fail:
+	fprintf(stderr, "!!!\n");
 	return -1;
 }
Re: HTTP/2 Termination vs. Firefox Quantum
On Fri, Dec 29, 2017 at 06:56:36PM +, Lucas Rolff wrote:
> h2_make_h1_request:153
> h2_frt_decode_headers:2621
> h2_frt_decode_headers:2643
>
> /* this can be any type of header */
> /* RFC7540#8.1.2: upper case not allowed in header field names */
> for (i = 0; i < list[idx].n.len; i++)
> 	if ((uint8_t)(list[idx].n.ptr[i] - 'A') < 'Z' - 'A')
> 		goto fail;
>
> That's an interesting place to fail

Ah excellent! This also shows a minor bug by which the upper case 'Z' is not correctly caught above. I think I'll add a bit of code to emit the header name on stderr when this situation happens in debug mode. I'll see what can be done, and if we can consider adding an option to work around this (at least forcing all names to be lower case).

Thanks,
Willy
Re: HTTP/2 Termination vs. Firefox Quantum
h2_make_h1_request:153
h2_frt_decode_headers:2621
h2_frt_decode_headers:2643

/* this can be any type of header */
/* RFC7540#8.1.2: upper case not allowed in header field names */
for (i = 0; i < list[idx].n.len; i++)
	if ((uint8_t)(list[idx].n.ptr[i] - 'A') < 'Z' - 'A')
		goto fail;

That's an interesting place to fail

- Lucas R

On 29/12/2017, 19.36, "Willy Tarreau" wrote:

    On Fri, Dec 29, 2017 at 06:18:00PM +, Lucas Rolff wrote:
    > I think you forgot to attach the patch

    Grrr, common mistake, sorry.

    > I did try to add the continue patch for the "blacklisted" headers, and same result (now in this case, it happened on the website request itself:

    Thanks, I'll take a look once I'm home.
    Willy
Re: HTTP/2 Termination vs. Firefox Quantum
On Fri, Dec 29, 2017 at 06:18:00PM +, Lucas Rolff wrote:
> I think you forgot to attach the patch

Grrr, common mistake, sorry.

> I did try to add the continue patch for the "blacklisted" headers, and same
> result (now in this case, it happened on the website request itself:

Thanks, I'll take a look once I'm home.
Willy

diff --git a/src/h2.c b/src/h2.c
index 43ed7f3..1f8c7c8 100644
--- a/src/h2.c
+++ b/src/h2.c
@@ -32,6 +32,9 @@
 #include
 
+#include
+#define goto while (fprintf(stderr, "%s:%d\n", __FUNCTION__, __LINE__),1) goto
+
 /* Prepare the request line into <*ptr> (stopping at ) from pseudo headers
  * stored in . indicates what was found so far. This should be
  * called once at the detection of the first general header field or at the end
Re: HTTP/2 Termination vs. Firefox Quantum
I think you forgot to attach the patch

I did try to add the continue patch for the "blacklisted" headers, and same result (now in this case, it happened on the website request itself:

POST Request to site:
h2s_frt_make_resp_data:3180
h2s_frt_make_resp_data:3067

GET Request to site:
h2_frt_decode_headers:2621
h2_frt_decode_headers:2643

A bit of background info on why you see post and get: basically it's a dropdown with automatic submit which then does a 302 redirect - so I post a bit of data, e.g. the following:

_token  TApEQSj4V3D3TZCUYauWoPan1mKhrk
date    12/2017
zone    1

The application will then do a redirect to a specific page based on that input data.

On 29/12/2017, 19.11, "Willy Tarreau" wrote:

    On Fri, Dec 29, 2017 at 06:02:15PM +, Lucas Rolff wrote:
    > POST Request (to website):
    > h2s_frt_make_resp_data:3180
    > h2s_frt_make_resp_data:3067
    > -
    > GET Request (to website):
    > h2s_frt_make_resp_data:3180
    > h2s_frt_make_resp_data:3067
    > -
    > Get Request (app.css)
    > h2_frt_decode_headers:2621
    > h2_frt_decode_headers:2643

    Excellent, it's this one :

        /* OK now we have our header list in */
        outlen = h2_make_h1_request(list, bi_end(buf), try);
        if (outlen < 0) {
            h2c_error(h2c, H2_ERR_COMPRESSION_ERROR);
            goto fail;
        }

    Now I'm starting to wonder whether it's true or not that the Connection header is *never* sent... But it might also be that something else violates the rules.

    Would you want to retest with the extra patch attached ? It will do the same with h2.c which is responsible for h2_make_h1_request() so that we know better. And after this you could try again by applying the patch I sent this morning which silently skips the connection headers (replaces a goto fail with a continue). Do not hesitate to ask me to redo it if you've lost it!

    Cheers,
    Willy
Re: HTTP/2 Termination vs. Firefox Quantum
On Fri, Dec 29, 2017 at 06:02:15PM +, Lucas Rolff wrote:
> POST Request (to website):
> h2s_frt_make_resp_data:3180
> h2s_frt_make_resp_data:3067
> -
> GET Request (to website):
> h2s_frt_make_resp_data:3180
> h2s_frt_make_resp_data:3067
> -
> Get Request (app.css)
> h2_frt_decode_headers:2621
> h2_frt_decode_headers:2643

Excellent, it's this one :

	/* OK now we have our header list in */
	outlen = h2_make_h1_request(list, bi_end(buf), try);
	if (outlen < 0) {
		h2c_error(h2c, H2_ERR_COMPRESSION_ERROR);
		goto fail;
	}

Now I'm starting to wonder whether it's true or not that the Connection header is *never* sent... But it might also be that something else violates the rules.

Would you want to retest with the extra patch attached ? It will do the same with h2.c which is responsible for h2_make_h1_request() so that we know better. And after this you could try again by applying the patch I sent this morning which silently skips the connection headers (replaces a goto fail with a continue). Do not hesitate to ask me to redo it if you've lost it!

Cheers,
Willy
Re: HTTP/2 Termination vs. Firefox Quantum
Working page load (total of 4 requests), and we see 4x 3180|3067:

POST Request (to website):
h2s_frt_make_resp_data:3180
h2s_frt_make_resp_data:3067
-
GET Request (to website):
h2s_frt_make_resp_data:3180
h2s_frt_make_resp_data:3067
-
GET Request (app.css)
h2s_frt_make_resp_data:3180
h2s_frt_make_resp_data:3067
-
GET Request (app.js)
h2s_frt_make_resp_data:3180
h2s_frt_make_resp_data:3067

Not working page load:

POST Request (to website):
h2s_frt_make_resp_data:3180
h2s_frt_make_resp_data:3067
-
GET Request (to website):
h2s_frt_make_resp_data:3180
h2s_frt_make_resp_data:3067
-
Get Request (app.css)
h2_frt_decode_headers:2621
h2_frt_decode_headers:2643
-
Get Request (app.js)
h2s_frt_make_resp_data:3180
h2s_frt_make_resp_data:3067

Best Regards,
Lucas Rolff

On 29/12/2017, 18.21, "Willy Tarreau" wrote:

    On Fri, Dec 29, 2017 at 04:48:13PM +, Lucas Rolff wrote:
    > > If you're willing to run another test, I can prepare a debugging patch which will try to report every single error path in the H2 and HPACK code so that we can try to understand where the code was upset
    >
    > I'd love to run another test or 10 - in the end, we'll all benefit from it (hopefully)

    OK great, let's start with an easy one. This patch will output a line on stderr for every function and line number where we go through a goto (most goto in the code are error handling and most error handling is unrolled using goto). For me it seldom prints :

    h2_process_demux:1759

    When the client doesn't immediately send because it sees a partial request. But with FF I'm not even seeing this one.

    With a bit of luck you'll find one or a few lines that only happen when you observe the problem, and the sequence will help us figure what code path we're following. If it doesn't work we'll try to do better.

    Thanks!
    Willy
Re: HTTP/2 Termination vs. Firefox Quantum
On Fri, Dec 29, 2017 at 04:48:13PM +, Lucas Rolff wrote:
> > If you're willing to run another test, I can prepare a debugging patch
> > which will try to report every single error path in the H2 and HPACK code
> > so that we can try to understand where the code was upset
>
> I'd love to run another test or 10 - in the end, we'll all benefit from it
> (hopefully)

OK great, let's start with an easy one. This patch will output a line on stderr for every function and line number where we go through a goto (most goto in the code are error handling and most error handling is unrolled using goto). For me it seldom prints :

h2_process_demux:1759

When the client doesn't immediately send because it sees a partial request. But with FF I'm not even seeing this one.

With a bit of luck you'll find one or a few lines that only happen when you observe the problem, and the sequence will help us figure what code path we're following. If it doesn't work we'll try to do better.

Thanks!
Willy

diff --git a/src/mux_h2.c b/src/mux_h2.c
index 71660f8..8e1c821 100644
--- a/src/mux_h2.c
+++ b/src/mux_h2.c
@@ -167,6 +167,9 @@ enum h2_ss {
 #define H2_SF_HEADERS_SENT 0x1000 // a HEADERS frame was sent for this stream
 #define H2_SF_OUTGOING_DATA 0x2000 // set whenever we've seen outgoing data
 
+
+#define goto while (fprintf(stderr, "%s:%d\n", __FUNCTION__, __LINE__),1) goto
+
 /* H2 stream descriptor, describing the stream as it appears in the H2C, and as
  * it is being processed in the internal HTTP representation (H1 for now).
  */
Re: HTTP/2 Termination vs. Firefox Quantum
> If you're willing to run another test, I can prepare a debugging patch which
> will try to report every single error path in the H2 and HPACK code so that
> we can try to understand where the code was upset

I'd love to run another test or 10 - in the end, we'll all benefit from it (hopefully)

Best Regards,
Lucas Rolff
Re: HTTP/2 Termination vs. Firefox Quantum
On Fri, Dec 29, 2017 at 04:10:12PM +, Lucas Rolff wrote:
> > Lucas, Maximilian, can you check the situation with this patch? The POST
> > issue should definitely be gone, please also verify the GET issue with this
> > patch (as I was unable to reproduce it).
>
> Sadly didn't fix the GET request issue for me in Firefox:
>
> https://snaps.hcdn.dk/h1Oz3G950oepEb30AEoK.png

OK, I didn't expect much of it to be honest!

> the app.css?v=1 request still fails in this case (sometimes it's also the
> HTML itself), log:
>
> Dec 29 15:42:25 localhost haproxy[7708]: 80.61.160.xxx:65155
> [29/Dec/2017:15:42:13.362] https_frontend~ https_frontend/
> -1/-1/-1/-1/12575 400 0 - - CR-- 1/1/0/0/0 0/0 ""
> Dec 29 15:42:30 localhost haproxy[7708]: 80.61.160.xxx:65236
> [29/Dec/2017:15:42:25.981] https_frontend~ cdn-backend/mycdn 0/0/0/1/5001 200
> 995 - - 1/1/0/1/0 0/0 "GET /js/app.js?v=1 HTTP/1.1"

This clearly shows that the request was aborted by the H2 gateway after creating the stream. If you're willing to run another test, I can prepare a debugging patch which will try to report every single error path in the H2 and HPACK code so that we can try to understand where the code was upset.

> First entry is app.css, 2nd is the javascript.
>
> As you can see in the screenshot the webdev tools line is kinda malformed,
> however, the actual file is served perfectly, with response headers and what
> not.
> The only thing that is missing in webdev tools for the app.js is the status
> code, remote address and "protocol" (Which is why you see +h2 and not
> HTTP/2.0+h2) - it manages to detect the actual protocol by an internal
> Firefox header (X-Firefox-Spdy), the HTTP/2.0 part is a "Version" attribute
> that Firefox manages as well, and this one never gets populated
>
> I'm still trying to figure out if I can find why that information is missing
> however.

Don't spend too much time trying to figure out the cause of side effects. Sometimes it's simply that the tool was not tested in such conditions and doesn't deal cleanly with such a bug.

> When I do a tcpdump from the haproxy server (a simple tcpdump host
> 80.61.160.xxx and port 443) where 80.61.160.xxx is my home connection:
>
> - When everything works, client port stays the same
> - When I visit a page where one GET request fails, I see the client port
> changing all of a sudden with a reset flag coming first from the server, and
> then the client:
>
> https://gist.github.com/lucasRolff/1b8f29ed61fd8ae443894d28c7efff95#file-tcpdump-pcap-L28-L32
>
> 178.63.183.xx == server (haproxy)
> 80.61.160.xxx == client (browser)

Given the small packet size here, I strongly suspect a GOAWAY followed by a shutdown :

16:03:31.690396 IP 178.63.183.xx.443 > 80.61.160.xxx.56078: Flags [P.], seq 250148:250194, ack 16947, win 622, options [nop,nop,TS val 1776018172 ecr 510066715], length 46
16:03:31.691182 IP 178.63.183.xx.443 > 80.61.160.xxx.56078: Flags [F.], seq 250194, ack 16947, win 622, options [nop,nop,TS val 1776018173 ecr 510066715], length 0

So there's definitely something that haproxy didn't accept. Here I'd more easily blame the H2 gateway than the H1 code though. I'll see if I can find anything related to this.

Thanks,
Willy
Re: HTTP/2 Termination vs. Firefox Quantum
> Lucas, Maximilian, can you check the situation with this patch? The POST
> issue should definitely be gone, please also verify the GET issue with this
> patch (as I was unable to reproduce it).

Sadly didn't fix the GET request issue for me in Firefox:

https://snaps.hcdn.dk/h1Oz3G950oepEb30AEoK.png

the app.css?v=1 request still fails in this case (sometimes it's also the HTML itself), log:

Dec 29 15:42:25 localhost haproxy[7708]: 80.61.160.xxx:65155 [29/Dec/2017:15:42:13.362] https_frontend~ https_frontend/ -1/-1/-1/-1/12575 400 0 - - CR-- 1/1/0/0/0 0/0 ""
Dec 29 15:42:30 localhost haproxy[7708]: 80.61.160.xxx:65236 [29/Dec/2017:15:42:25.981] https_frontend~ cdn-backend/mycdn 0/0/0/1/5001 200 995 - - 1/1/0/1/0 0/0 "GET /js/app.js?v=1 HTTP/1.1"

First entry is app.css, 2nd is the javascript.

As you can see in the screenshot the webdev tools line is kinda malformed; however, the actual file is served perfectly, with response headers and whatnot. The only thing that is missing in webdev tools for the app.js is the status code, remote address and "protocol" (which is why you see +h2 and not HTTP/2.0+h2). It manages to detect the actual protocol by an internal Firefox header (X-Firefox-Spdy); the HTTP/2.0 part is a "Version" attribute that Firefox manages as well, and this one never gets populated.

I'm still trying to figure out if I can find why that information is missing, however.

When I do a tcpdump from the haproxy server (a simple tcpdump host 80.61.160.xxx and port 443) where 80.61.160.xxx is my home connection:

- When everything works, the client port stays the same
- When I visit a page where one GET request fails, I see the client port changing all of a sudden, with a reset flag coming first from the server, and then the client:

https://gist.github.com/lucasRolff/1b8f29ed61fd8ae443894d28c7efff95#file-tcpdump-pcap-L28-L32

178.63.183.xx == server (haproxy)
80.61.160.xxx == client (browser)

Best Regards,
Lucas Rolff

On 29/12/2017, 16.47, "lu...@ltri.eu on behalf of Lukas Tribus" wrote:

    Hi Willy,

    On Fri, Dec 29, 2017 at 3:58 PM, Willy Tarreau wrote:
    > On Fri, Dec 29, 2017 at 03:42:30PM +0100, Willy Tarreau wrote:
    >> OK I managed to reproduce it with nghttp using --expect-continue to
    >> force it to leave a pause before sending the data. And indeed there
    >> the data are immediately followed by a shutdown. Getting closer...
    >
    > So here's what I found : when dealing with request forwarding, we used
    > to let the close migrate from the client to the server with the last
    > block. And this happens only once we switch to fast forwarding, which
    > means that the last block from the request didn't fit in the buffer.
    > Thus it would randomly impact large uploads (though timing would often
    > protect them) and almost always impact small ones if sent in two parts
    > as we could produce.
    >
    > The attached patch fixes it for me. Could you please give it a try ?

    Confirmed, that patch fixes the issue for me.

    The client was just a windows build of curl (with openssl 1.1.0), and a tiny POST:

    curl -kv https://localhost/111 -d "bla=bla" --http2

    Port 443 SSH tunneled to a remote dev box running haproxy (but that should not have affected the H2 stream).

    I'm confident this fixes the POST issue reported in this thread, as already confirmed by Lucas (by modifying nginx abort settings, permitting in-flight half-close). Not sure if the GET issue is just another vector of the same underlying issue, or if that is a completely different issue altogether.

    Lucas, Maximilian, can you check the situation with this patch? The POST issue should definitely be gone; please also verify the GET issue with this patch (as I was unable to reproduce it).

    Also I think we can update the bugs filed with Firefox and Safari; clearly they don't send the connection header (they just show it in the dev-tools and debug logs).

    > I'm currently thinking how to add more information to improve observability
    > inside the mux to match what can currently be done outside (ie: we don't
    > have the equivalent of the "show errors" nor error counters inside the mux).

    Indeed I think that is a good idea. This morning I plastered the mux code with printf's to figure out where exactly haproxy rejects the request, only to then find out the problem was not in the request :) There are quite a few possible H2 failure code paths, and the only debug possibility right now is to manually add printf's.

    Regards,
    Lukas
Re: HTTP/2 Termination vs. Firefox Quantum
On Fri, Dec 29, 2017 at 04:46:57PM +0100, Lukas Tribus wrote:
> On Fri, Dec 29, 2017 at 3:58 PM, Willy Tarreau wrote:
> > On Fri, Dec 29, 2017 at 03:42:30PM +0100, Willy Tarreau wrote:
> >> OK I managed to reproduce it with nghttp using --expect-continue to
> >> force it to leave a pause before sending the data. And indeed there
> >> the data are immediately followed by a shutdown. Getting closer...
> >
> > So here's what I found : when dealing with request forwarding, we used
> > to let the close migrate from the client to the server with the last
> > block. And this happens only once we switch to fast forwarding, which
> > means that the last block from the request didn't fit in the buffer.
> > Thus it would randomly impact large uploads (though timing would often
> > protect them) and almost always impact small ones if sent in two parts
> > as we could produce.
> >
> > The attached patch fixes it for me. Could you please give it a try ?
>
> Confirmed, that patch fixes the issue for me.

Excellent, thank you (and I've just read Lucas' confirmation as well).

> The client was just a windows build of curl (with openssl 1.1.0), and
> a tiny POST:
> curl -kv https://localhost/111 -d "bla=bla" --http2
>
> Port 443 SSH tunneled to a remote dev box running haproxy (but that
> should not have affected the H2 stream).

OK. That was my first attempt (albeit on a different OS) but what matters is to find one case where it fails, so we're fine now.

> I'm confident this fixes the POST issue reported in this thread, as
> already confirmed by Lucas (by modifying nginx abort settings,
> permitting in-flight half-close). Not sure if the GET issue is just
> another vector of the same underlying issue, or if that is a
> completely different issue altogether.

I thought about it and can't imagine how it could be the same issue, since the one I just fixed is in the upload path, which will not happen with a GET (or they're purposely sending a content-length to excite our bugs, but I'd rule this out :-)).

> Lucas, Maximilian, can you check the situation with this patch? The
> POST issue should definitely be gone, please also verify the GET issue
> with this patch (as I was unable to reproduce it).

I'll merge the patch to ease testing for others. I think that between this and the recent mworker fixes, we're good to have another version to progressively get rid of old bug reports.

> Also I think we can update the bugs filed with Firefox and Safari,
> clearly they don't send the connection header (they just show it in
> the dev-tools and debug logs).

Yes, that would be nice to them.

> > I'm currently thinking how to add more information to improve observability
> > inside the mux to match what can currently be done outside (ie: we don't
> > have the equivalent of the "show errors" nor error counters inside the mux).
>
> Indeed I think that is a good idea. This morning I plastered the mux
> code with printf's to figure out where exactly haproxy rejects the
> request. Only to then find out the problem was not in the request :)
> There are quite a few possible H2 failure code paths and the only
> debug possibility right now is to manually add printf's.

I know. During development I had to use a *lot* of printf, even with colors and hexdumps, as the usual strace doesn't give useful info since it's entirely done over SSL. The problem was that during the progressive cleanups leading to the final code merge, this debugging code disappeared and we were left with something much more opaque.

But while thinking about how to report more detailed info there, I'm also seeing what will change once we rearrange the connection layer, and the code will significantly move and be quite a bit reduced, with less out-of-context errors. So it should become easier later to report more precise information. That's why I'm trying to find the sweet spot between absolutely needed elements and temporary stuff that will vanish soon.

Cheers,
Willy
Re: HTTP/2 Termination vs. Firefox Quantum
Hi Willy,

On Fri, Dec 29, 2017 at 3:58 PM, Willy Tarreau wrote:
> On Fri, Dec 29, 2017 at 03:42:30PM +0100, Willy Tarreau wrote:
>> OK I managed to reproduce it with nghttp using --expect-continue to
>> force it to leave a pause before sending the data. And indeed there
>> the data are immediately followed by a shutdown. Getting closer...
>
> So here's what I found : when dealing with request forwarding, we used
> to let the close migrate from the client to the server with the last
> block. And this happens only once we switch to fast forwarding, which
> means that the last block from the request didn't fit in the buffer.
> Thus it would randomly impact large uploads (though timing would often
> protect them) and almost always impact small ones if sent in two parts
> as we could produce.
>
> The attached patch fixes it for me. Could you please give it a try ?

Confirmed, that patch fixes the issue for me.

The client was just a windows build of curl (with openssl 1.1.0), and a tiny POST:

curl -kv https://localhost/111 -d "bla=bla" --http2

Port 443 SSH tunneled to a remote dev box running haproxy (but that should not have affected the H2 stream).

I'm confident this fixes the POST issue reported in this thread, as already confirmed by Lucas (by modifying nginx abort settings, permitting in-flight half-close). Not sure if the GET issue is just another vector of the same underlying issue, or if that is a completely different issue altogether.

Lucas, Maximilian, can you check the situation with this patch? The POST issue should definitely be gone; please also verify the GET issue with this patch (as I was unable to reproduce it).

Also I think we can update the bugs filed with Firefox and Safari; clearly they don't send the connection header (they just show it in the dev-tools and debug logs).

> I'm currently thinking how to add more information to improve observability
> inside the mux to match what can currently be done outside (ie: we don't
> have the equivalent of the "show errors" nor error counters inside the mux).

Indeed I think that is a good idea. This morning I plastered the mux code with printf's to figure out where exactly haproxy rejects the request, only to then find out the problem was not in the request :) There are quite a few possible H2 failure code paths, and the only debug possibility right now is to manually add printf's.

Regards,
Lukas
Re: HTTP/2 Termination vs. Firefox Quantum
Both in Firefox and Chrome, my POST requests in 1.8.2 with the supplied patch seem to do the trick (did about 300 POST requests in each browser with no fails).

Best Regards,

On 29/12/2017, 15.58, "Willy Tarreau" wrote:

    On Fri, Dec 29, 2017 at 03:42:30PM +0100, Willy Tarreau wrote:
    > OK I managed to reproduce it with nghttp using --expect-continue to
    > force it to leave a pause before sending the data. And indeed there
    > the data are immediately followed by a shutdown. Getting closer...

    So here's what I found : when dealing with request forwarding, we used to let the close migrate from the client to the server with the last block. And this happens only once we switch to fast forwarding, which means that the last block from the request didn't fit in the buffer. Thus it would randomly impact large uploads (though timing would often protect them) and almost always impact small ones if sent in two parts as we could produce.

    The attached patch fixes it for me. Could you please give it a try ?

    Thanks,
    Willy
Re: HTTP/2 Termination vs. Firefox Quantum
On Fri, Dec 29, 2017 at 03:42:30PM +0100, Willy Tarreau wrote:
> OK I managed to reproduce it with nghttp using --expect-continue to
> force it to leave a pause before sending the data. And indeed there
> the data are immediately followed by a shutdown. Getting closer...

So here's what I found : when dealing with request forwarding, we used to let the close migrate from the client to the server with the last block. And this happens only once we switch to fast forwarding, which means that the last block from the request didn't fit in the buffer. Thus it would randomly impact large uploads (though timing would often protect them) and almost always impact small ones if sent in two parts as we could produce.

The attached patch fixes it for me. Could you please give it a try ?

Thanks,
Willy

diff --git a/src/proto_http.c b/src/proto_http.c
index f585dee..64bd410 100644
--- a/src/proto_http.c
+++ b/src/proto_http.c
@@ -4963,8 +4963,13 @@ int http_request_forward_body(struct stream *s, struct channel *req, int an_bit)
 	/* When TE: chunked is used, we need to get there again to parse remaining
 	 * chunks even if the client has closed, so we don't want to set CF_DONTCLOSE.
+	 * And when content-length is used, we never want to let the possible
+	 * shutdown be forwarded to the other side, as the state machine will
+	 * take care of it once the client responds. It's also important to
+	 * prevent TIME_WAITs from accumulating on the backend side, and for
+	 * HTTP/2 where the last frame comes with a shutdown.
 	 */
-	if (msg->flags & HTTP_MSGF_TE_CHNK)
+	if (msg->flags & (HTTP_MSGF_TE_CHNK|HTTP_MSGF_CNT_LEN))
 		channel_dont_close(req);
 
 	/* We know that more data are expected, but we couldn't send more that
Re: HTTP/2 Termination vs. Firefox Quantum
On Fri, Dec 29, 2017 at 03:30:41PM +0100, Willy Tarreau wrote: > On Fri, Dec 29, 2017 at 03:26:44PM +0100, Lukas Tribus wrote: > > Indeed when the frontend connection is H1, a single send() call > > contains both headers and payload and the issue does not occur. But > > when the frontend connection is H2, then header and payload are in 2 > > distinct send calls and the issue does occur. > > I'm trying to get curl to do that in h2 but for now I have not yet > figured how to, even when fed from stdin it waits for my whole > request before connecting :-/ > > What client did you use ? OK I managed to reproduce it with nghttp using --expect-continue to force it to leave a pause before sending the data. And indeed there the data are immediately followed by a shutdown. Getting closer... Willy
Re: HTTP/2 Termination vs. Firefox Quantum
On Fri, Dec 29, 2017 at 03:26:44PM +0100, Lukas Tribus wrote: > Indeed when the frontend connection is H1, a single send() call > contains both headers and payload and the issue does not occur. But > when the frontend connection is H2, then header and payload are in 2 > distinct send calls and the issue does occur. I'm trying to get curl to do that in h2 but for now I have not yet figured out how to; even when fed from stdin, it waits for my whole request before connecting :-/ What client did you use ? Willy
Re: HTTP/2 Termination vs. Firefox Quantum
Hello, On Fri, Dec 29, 2017 at 3:05 PM, Willy Tarreau wrote: >> Haproxy calls shutdown() after the HTTP payload was transmitted, nginx >> in the default configuration or nc for that matter closes the >> connection (we see recvfrom = 0) and then we close(): > > I can't reproduce this one for now. I'm pretty sure that it's timing > related. > >> 14:39:57.382142 sendto(9, "POST /111 HTTP/1.1\r\nuser-agent: "..., >> 169, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 169 > (...) >> 14:39:57.392212 read(8, "\27\3\3\0(", 5) = 5 >> 14:39:57.392458 read(8, >> "tQ\325'wD\2375\222\202\1\241\277\35\347\213;\221&\211\303g\322\226[\334\10Z\20\332\36s"..., >> 40) = 40 > (...) >> 14:39:57.392868 sendto(9, "bla=bla", 7, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) >> = 7 >> 14:39:57.393178 shutdown(9, SHUT_WR)= 0 > (...) > > It's interesting to see that the request was sent in two parts, which doesn't > happen in my case. That may be one of the differences. I'll try to play along > this. Indeed when the frontend connection is H1, a single send() call contains both headers and payload and the issue does not occur. But when the frontend connection is H2, then header and payload are in 2 distinct send calls and the issue does occur. Lukas
Re: HTTP/2 Termination vs. Firefox Quantum
On Fri, Dec 29, 2017 at 02:52:39PM +0100, Lukas Tribus wrote: > > For me it happens only when I have "option httpclose" in the configuration, > > ie we end up in tunnel mode. I can't reproduce it with either keep-alive, > > http-server-close nor forceclose. At least abortonclose is now safe > > regarding this. Does this match your observations as well ? > > No, I'm using the default http-keep-alive mode. Then I'm still missing something. At least it's nice because it means there is definitely something odd that remains to be fixed! > Haproxy calls shutdown() after the HTTP payload was transmitted, nginx > in the default configuration or nc for that matter closes the > connection (we see recvfrom = 0) and then we close(): I can't reproduce this one for now. I'm pretty sure that it's timing related. > 14:39:57.382142 sendto(9, "POST /111 HTTP/1.1\r\nuser-agent: "..., > 169, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 169 (...) > 14:39:57.392212 read(8, "\27\3\3\0(", 5) = 5 > 14:39:57.392458 read(8, > "tQ\325'wD\2375\222\202\1\241\277\35\347\213;\221&\211\303g\322\226[\334\10Z\20\332\36s"..., > 40) = 40 (...) > 14:39:57.392868 sendto(9, "bla=bla", 7, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = > 7 > 14:39:57.393178 shutdown(9, SHUT_WR)= 0 (...) It's interesting to see that the request was sent in two parts, which doesn't happen in my case. That may be one of the differences. I'll try to play along this. thanks for the trace! Willy
Re: HTTP/2 Termination vs. Firefox Quantum
Hello, On Fri, Dec 29, 2017 at 2:31 PM, Willy Tarreau wrote: > On Fri, Dec 29, 2017 at 11:45:55AM +0100, Lukas Tribus wrote: >> The FIN behavior comes from a48c141f4 ("BUG/MAJOR: connection: refine >> the situations where we don't send shutw()"), which also hit 1.8.2, so >> that explains the change in behavior between 1.8.1 and 1.8.2. > > For me it happens only when I have "option httpclose" in the configuration, > ie we end up in tunnel mode. I can't reproduce it with either keep-alive, > http-server-close nor forceclose. At least abortonclose is now safe > regarding this. Does this match your observations as well ? No, I'm using the default http-keep-alive mode. Haproxy calls shutdown() after the HTTP payload was transmitted, nginx in the default configuration or nc for that matter closes the connection (we see recvfrom = 0) and then we close(): 14:39:57.075589 gettimeofday({1514554797, 75637}, NULL) = 0 14:39:57.075706 gettimeofday({1514554797, 75746}, NULL) = 0 14:39:57.075809 epoll_wait(3, [{EPOLLIN, {u32=4, u64=4}}], 200, 1000) = 1 14:39:57.283293 gettimeofday({1514554797, 283332}, NULL) = 0 14:39:57.283453 accept4(4, {sa_family=AF_INET, sin_port=htons(50192), sin_addr=inet_addr("127.0.0.1")}, [16], SOCK_NONBLOCK) = 8 14:39:57.283711 setsockopt(8, SOL_TCP, TCP_NODELAY, [1], 4) = 0 14:39:57.283917 accept4(4, 0x7ffd16cfc9f0, 0x7ffd16cfc9e8, SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable) 14:39:57.284197 read(8, 0x26f9713, 5) = -1 EAGAIN (Resource temporarily unavailable) 14:39:57.284399 epoll_ctl(3, EPOLL_CTL_ADD, 8, {EPOLLIN|EPOLLRDHUP, {u32=8, u64=8}}) = 0 14:39:57.284514 gettimeofday({1514554797, 284546}, NULL) = 0 14:39:57.284630 epoll_wait(3, [{EPOLLIN, {u32=8, u64=8}}], 200, 1000) = 1 14:39:57.297577 gettimeofday({1514554797, 297646}, NULL) = 0 14:39:57.297831 read(8, "\26\3\1\0\317", 5) = 5 14:39:57.298051 read(8, "\1\0\0\313\3\0030\243\205\266H\202.\324\3417\262\33\25Bh\"_M\237\216\0169-\3778\4"..., 207) = 207 14:39:57.298283 
gettimeofday({1514554797, 298324}, NULL) = 0 14:39:57.298500 gettimeofday({1514554797, 298539}, NULL) = 0 14:39:57.298893 gettimeofday({1514554797, 298932}, NULL) = 0 14:39:57.299143 gettimeofday({1514554797, 299181}, NULL) = 0 14:39:57.299274 gettimeofday({1514554797, 299318}, NULL) = 0 14:39:57.299394 gettimeofday({1514554797, 299427}, NULL) = 0 14:39:57.302972 write(8, "\26\3\3\0j\2\0\0f\3\3a\323i\222I\33\21f\361\352\303\257\233\251\211y\315z\3310\254"..., 2938) = 2938 14:39:57.303303 read(8, 0x26fce93, 5) = -1 EAGAIN (Resource temporarily unavailable) 14:39:57.303549 gettimeofday({1514554797, 303585}, NULL) = 0 14:39:57.303762 epoll_wait(3, [{EPOLLIN, {u32=8, u64=8}}], 200, 1000) = 1 14:39:57.334310 gettimeofday({1514554797, 334383}, NULL) = 0 14:39:57.334572 read(8, "\26\3\3\0F", 5) = 5 14:39:57.334780 read(8, "\20\0\0BA\4f\315w#\277\37\216\21W\0074\273\227`E\305\16\203\260O\216(\201\270g\305"..., 70) = 70 14:39:57.335413 read(8, "\24\3\3\0\1", 5) = 5 14:39:57.335709 read(8, "\1", 1)= 1 14:39:57.335975 read(8, "\26\3\3\0(", 5) = 5 14:39:57.336186 read(8, "tQ\325'wD\2370\245\242<\362\247\300j\247b\303\367\24\rb\247\251\236\260GG\265(\22="..., 40) = 40 14:39:57.336395 gettimeofday({1514554797, 336431}, NULL) = 0 14:39:57.336589 write(8, "\24\3\3\0\1\1\26\3\3\0(0\r\251\371\4X\3249\212Py\330\310\320\r\315h\10\376Oo"..., 51) = 51 14:39:57.336849 read(8, 0x26f50e3, 5) = -1 EAGAIN (Resource temporarily unavailable) 14:39:57.337071 gettimeofday({1514554797, 337107}, NULL) = 0 14:39:57.337264 epoll_wait(3, [{EPOLLIN, {u32=8, u64=8}}], 200, 1000) = 1 14:39:57.377717 gettimeofday({1514554797, 377794}, NULL) = 0 14:39:57.377942 read(8, "\27\3\3\", 5) = 5 14:39:57.378166 read(8, "tQ\325'wD\2371\307\270^Y\233:\312A\232\355\346\247\35\235C\343\31_gR\272\307\355\362"..., 48) = 48 14:39:57.378439 read(8, "\27\3\3\0003", 5) = 5 14:39:57.378750 read(8, "tQ\325'wD\2372\331\327\363N1\312\277H*${vDA\24w\346\216\211\273@\265R\344"..., 51) = 51 14:39:57.378972 read(8, "\27\3\3\0%", 
5) = 5 14:39:57.379214 read(8, "tQ\325'wD\2373@\300$\32\267\217@\354\226\224Z\246\253\216(\6\v=\213u\220\3730\237"..., 37) = 37 14:39:57.379431 read(8, "\27\3\3\0o", 5) = 5 14:39:57.379642 read(8, "tQ\325'wD\2374&\223\263yW\0x\307\344\365\26\305\33q\355\t\314c'\322>\0\276\324"..., 111) = 111 14:39:57.379862 read(8, 0x26f50e3, 5) = -1 EAGAIN (Resource temporarily unavailable) 14:39:57.380070 setsockopt(8, SOL_TCP, TCP_QUICKACK, [1], 4) = 0 14:39:57.380273 socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 9 14:39:57.380482 fcntl(9, F_SETFL, O_RDONLY|O_NONBLOCK) = 0 14:39:57.380693 setsockopt(9, SOL_TCP, TCP_NODELAY, [1], 4) = 0 14:39:57.380911 connect(9, {sa_family=AF_INET, sin_port=htons(81), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress) 14:39:57.381209 gettimeofday({1514554797, 381268}, NULL) = 0 14:39:57.381435 epoll_wait(3, [], 200, 0) = 0 14:39:57.381640 gettimeofday({1514554797, 381680}, NULL) = 0 14:39:57.381865 write(8, "\27\3\3\0\r
Re: HTTP/2 Termination vs. Firefox Quantum
On Fri, Dec 29, 2017 at 11:45:55AM +0100, Lukas Tribus wrote: > The FIN behavior comes from a48c141f4 ("BUG/MAJOR: connection: refine > the situations where we don't send shutw()"), which also hit 1.8.2, so > that explains the change in behavior between 1.8.1 and 1.8.2. For me it happens only when I have "option httpclose" in the configuration, ie we end up in tunnel mode. I can't reproduce it with either keep-alive, http-server-close nor forceclose. At least abortonclose is now safe regarding this. Does this match your observations as well ? At least in this case it makes sense (even if not recommended) as the httpclose mode ends up in tunnel mode and must forward the shutdown. If it's the case, we can possibly document that httpclose should not be used. But with H1 it rarely causes trouble, so I'm a bit hesitant about what to do with it in H2 :-/ Here's the trace : 14:23:39.621361 sendto(15, "POST /s2 HTTP/1.1\r\nuser-agent: curl/7.54.1\r\naccept: */*\r\ncontent-length: 10\r\ncontent-type: application/x-www-form-urlencoded\r\nhost: 127.0.0.1:4443\r\nX-Forwarded-For: 127.0.0.1\r\nConnection: close\r\n\r\nrob"..., 207, MSG_DONTWAIT|MSG_NOSIGNAL, NULL, 0) = 207 14:23:39.621467 shutdown(15, 1 /* send */) = 0 14:23:39.621531 epoll_wait(3, {}, 200, 0) = 0 14:23:39.621575 recvfrom(15, 0xacd164, 15354, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable) 14:23:39.621617 epoll_ctl(3, EPOLL_CTL_ADD, 15, {EPOLLIN|0x2000, {u32=15, u64=15}}) = 0 14:23:39.621655 epoll_wait(3, {{EPOLLIN, {u32=14, u64=14}}}, 200, 1000) = 1 14:23:39.621695 read(14, "\27\3\3\0!", 5) = 5 14:23:39.621732 read(14, "\353&6\37b8\227\220\360\237\314?\255\353\253\324\244\210$1\366j\323Z7`\236\331\370N\357\10#", 33) = 33 14:23:39.621779 read(14, 0xb17b83, 5) = -1 EAGAIN (Resource temporarily unavailable) 14:23:39.621812 epoll_wait(3, {{EPOLLIN, {u32=15, u64=15}}}, 200, 1000) = 1 14:23:39.621955 recvfrom(15, "HTTP/1.1 200 OK\r\nContent-length: 5\r\n\r\nazert", 15354, 0, NULL, NULL) = 43 14:23:39.622122 
recvfrom(15, "", 15311, 0, NULL, NULL) = 0 14:23:39.622164 close(15) = 0 Willy
Re: HTTP/2 Termination vs. Firefox Quantum
Hi Lukas, On Fri, Dec 29, 2017 at 11:45:55AM +0100, Lukas Tribus wrote: > On Fri, Dec 29, 2017 at 11:22 AM, Lukas Tribus wrote: > > It's that: > > - when sending the POST request to the backend server, haproxy sends a > > FIN before the server responds > > - nginx doesn't like that and closes the request (you will see nginx > > error code 499 in nginx server logs) > > - as there is a race on the backend server between receiving the FIN > > and completing the response, this does not always happen > > - haproxy returns "400 Bad Request" to the client, although the > > request is fine and the response was empty (I consider this a bug) > > > > > > The feature on nginx is basically what we call abortonclose, and can > > be disabled by the following nginx directives (depending which backend > > modules is used): > > http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_ignore_client_abort > > http://nginx.org/en/docs/http/ngx_http_fastcgi_module.html#fastcgi_ignore_client_abort > > > > > > Howto reproduce the haproxy behavior: > > - have a http backend pointing to nc > > - make a POST request > > - this is even reproducible with H1 clients, however H2 has to be > > enabled on haproxy otherwise it doesn't send the FIN (strangely > > enough) > > > > > > Does this make sense? > > The FIN behavior comes from a48c141f4 ("BUG/MAJOR: connection: refine > the situations where we don't send shutw()"), which also hit 1.8.2, so > that explains the change in behavior between 1.8.1 and 1.8.2. That's very nice, thanks for the detailed info. So we're indeed still fighting with some of the stuff inherited from H1 that is hard to reproduce without H2. I've made a lot of tests with nc as a server, others as well with thttpd and a CGI. I couldn't observe this phenomenon anymore, but I'll recheck. I remember having played with certain areas in the code that looked suspicious to me but which didn't appear to improve anything so I refrained from taking the risk to break even more. 
Maybe this needs to be revisited. I'll keep all of you updated regarding this. BTW, I've updated haproxy.org to 1.8-latest and am seeing a much higher ratio of HTTP/2 in the logs (~50%). Thus it seems that lack of ALPN was definitely the reason for it being so limited. I anticipated that it could be the case when deploying it but since I immediately received some traffic, I considered that it was apparently not an issue (and I was wrong). Cheers, Willy
Re: HTTP/2 Termination vs. Firefox Quantum
> Lucas, can you check my previous mail and see if you can enable ignoring > client aborts in your backend, assuming you are using nginx? I can confirm that ignoring client aborts in my backend using fastcgi_ignore_client_abort “resolves” the issue regarding POST requests. Best Regards, Lucas R On 29/12/2017, 11.46, "lu...@ltri.eu on behalf of Lukas Tribus" wrote: Hello, On Fri, Dec 29, 2017 at 11:22 AM, Lukas Tribus wrote: > It's that: > - when sending the POST request to the backend server, haproxy sends a > FIN before the server responds > - nginx doesn't like that and closes the request (you will see nginx > error code 499 in nginx server logs) > - as there is a race on the backend server between receiving the FIN > and completing the response, this does not always happen > - haproxy returns "400 Bad Request" to the client, although the > request is fine and the response was empty (I consider this a bug) > > > The feature on nginx is basically what we call abortonclose, and can > be disabled by the following nginx directives (depending which backend > modules is used): > http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_ignore_client_abort > http://nginx.org/en/docs/http/ngx_http_fastcgi_module.html#fastcgi_ignore_client_abort > > > Howto reproduce the haproxy behavior: > - have a http backend pointing to nc > - make a POST request > - this is even reproducible with H1 clients, however H2 has to be > enabled on haproxy otherwise it doesn't send the FIN (strangely > enough) > > > Does this make sense? The FIN behavior comes from a48c141f4 ("BUG/MAJOR: connection: refine the situations where we don't send shutw()"), which also hit 1.8.2, so that explains the change in behavior between 1.8.1 and 1.8.2. Lucas, can you check my previous mail and see if you can enable ignoring client aborts in your backend, assuming you are using nginx? Thanks, Lukas
Re: HTTP/2 Termination vs. Firefox Quantum
Hello, On Fri, Dec 29, 2017 at 11:22 AM, Lukas Tribus wrote: > It's that: > - when sending the POST request to the backend server, haproxy sends a > FIN before the server responds > - nginx doesn't like that and closes the request (you will see nginx > error code 499 in nginx server logs) > - as there is a race on the backend server between receiving the FIN > and completing the response, this does not always happen > - haproxy returns "400 Bad Request" to the client, although the > request is fine and the response was empty (I consider this a bug) > > > The feature on nginx is basically what we call abortonclose, and can > be disabled by the following nginx directives (depending which backend > modules is used): > http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_ignore_client_abort > http://nginx.org/en/docs/http/ngx_http_fastcgi_module.html#fastcgi_ignore_client_abort > > > Howto reproduce the haproxy behavior: > - have a http backend pointing to nc > - make a POST request > - this is even reproducible with H1 clients, however H2 has to be > enabled on haproxy otherwise it doesn't send the FIN (strangely > enough) > > > Does this make sense? The FIN behavior comes from a48c141f4 ("BUG/MAJOR: connection: refine the situations where we don't send shutw()"), which also hit 1.8.2, so that explains the change in behavior between 1.8.1 and 1.8.2. Lucas, can you check my previous mail and see if you can enable ignoring client aborts in your backend, assuming you are using nginx? Thanks, Lukas
Re: HTTP/2 Termination vs. Firefox Quantum
> Actually that's not the case and that may explain the situation. The machine > runs OpenSSL 1.0.1 so only NPN is used, ALPN isn't. I'll try with a static > build of openssl 1.0.2 to see if the ratio increases. That might very well be the case; I know for sure that Chrome dropped support for NPN and requires 1.0.2 for http2 to work, and I suspect the same for recent Firefox versions, which (at least on Windows and Mac) seem to only work with ALPN. I’ll check haproxy.org later to see if it then works, and see if I can replicate the issue in Firefox with GET requests being “lost” sometimes. > That's interesting because if the client managed to display the error, it > means the stream was not reset and the connection not aborted. So in fact I > suspect a side effect of the work done to better support the abortonclose > option. At least in 1.8.2 I see a huge amount of failed POST requests when http2 is enabled, like 30%+ sometimes even a lot higher. And those requests do give a 400 Bad Request, nothing “odd” other than the fact that haproxy believes the request got aborted by the client (CH). >> Which is the case for 1.8.2 and latest master. >> >> If I do the patch for 1.8.1 I still get BADREQ sometimes in haproxy with the >> "empty" GET requests. > Just to be clear, do you mean you *only* get the BADREQ with GET (ie it never > happens with POST) or that you *also* get the BADREQ with GET ? I mean, given > that we're fixing bugs, I'm not interested in seeing if patches also work as > alternatives to bugs we've already fixed, but if you're observing > regressions, that's different. 1.8.1: Only (CR) on GET requests occasionally 1.8.2 + master: Bad Request (CH) on POST/PUT 30%+ of the time and (CR) on GET requests occasionally Best Regards, Lucas Rolff On 29/12/2017, 11.13, "Willy Tarreau" wrote: On Fri, Dec 29, 2017 at 08:46:18AM +, Lucas Rolff wrote: > > Yep. 
For what it's worth, it's been enabled for about one month on haproxy.org and till now we didn't get any bad report, which is pretty encouraging. > > Can I ask where? The negotiated protocol I get on https://haproxy.org/ is > http/1.1 in both Google Chrome and Firefox as an example. That's getting funny, as I'm having H2 here on firefox, as can be seen in the attached capture :-) Looking at the server's logs of the last hour, I'm clearly seeing *some* H2 traffic, about 5.5% of the HTTPS traffic : -bash-4.2$ fgrep 'public~' h.log | grep www.haproxy.org | wc -l 1804 -bash-4.2$ fgrep 'public~' h.log | grep www.haproxy.org | grep HTTP/2 | wc -l 108 I think this ratio is rather low and there are bots. If I focus only on the home page, it's slightly better but still low : -bash-4.2$ fgrep 'public~' h.log | fgrep 'www.haproxy.org/ ' | grep HTTP/2 | wc -l 21 -bash-4.2$ fgrep 'public~' h.log | fgrep 'www.haproxy.org/ ' | wc -l 298 And the fact that your browser doesn't negotiate it certainly implies that a number of other browsers do not either. > If I use curl, I can see it has ALPN enabled with http2 - however, mentioned > browsers doesn't seem to actually establish the http2 connection, but rather > a 1.1 connection. Actually that's not the case and that may explain the situation. The machine runs OpenSSL 1.0.1 so only NPN is used, ALPN isn't. I'll try with a static build of openssl 1.0.2 to see if the ratio increases. 
> For POST requests, it's not resolved with the patch, what I did find when > getting the 400 Bad Request on POST, it sometimes actually arrive at the > backend: > > So haproxy: > > Dec 29 08:01:36 localhost haproxy[4432]: 92.70.20.xx:52949 [29/Dec/2017:08:01:36.800] https_frontend~ cdn-backend/mycdn 0/0/1/-1/3 400 187 - - CH-- 1/1/0/0/0 0/0 "POST /login HTTP/1.1" > Dec 29 08:01:36 localhost haproxy[4432]: 92.70.20.xx:52949 [29/Dec/2017:08:01:36.806] https_frontend~ cdn-backend/mycdn 0/0/0/-1/2 400 187 - - CH-- 1/1/0/0/0 0/0 "POST /login HTTP/1.1" We definitely need to understand what's causing this. I'm currently thinking how to add more information to improve observability inside the mux to match what can currently be done outside (ie: we don't have the equivalent of the "show errors" nor error counters inside the mux). > https://snaps.hcdn.dk/AONBkWojDFwuErMjxF66Mg4VxJl9jz4KUM61jPpgcL.png - > however the request is ~ 68-69 ms, and that browser is Google Chrome, so in > that regard, the 400 Bad Request for POST/PUT also happens in Google Chrome. That's interesting because if the client managed to display the error, it means the stream was not reset and the connection not aborted. So in fact I suspect a side effect of the work done to better support the abortonclose option. > Which is the case for 1.8.2 and latest master. >
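For reference on the ALPN point discussed above: with haproxy 1.8 built against OpenSSL 1.0.2 or newer, h2 can be advertised via ALPN on the bind line; with 1.0.1 only NPN is offered, which current Chrome (and apparently recent Firefox) no longer uses, so those browsers silently fall back to HTTP/1.1. A typical frontend sketch (certificate path and backend name are placeholders):

```
frontend https_frontend
    mode http
    # "alpn h2,http/1.1" only takes effect when haproxy is linked
    # against OpenSSL >= 1.0.2; older builds can only offer NPN
    bind :443 ssl crt /etc/haproxy/site.pem alpn h2,http/1.1
    default_backend cdn-backend
```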
Re: HTTP/2 Termination vs. Firefox Quantum
On Fri, Dec 29, 2017 at 11:22:56AM +0100, Lukas Tribus wrote: > Hello, > > > On Fri, Dec 29, 2017 at 8:13 AM, Willy Tarreau wrote: > > Yep. For what it's worth, it's been enabled for about one month on > > haproxy.org > > and till now we didn't get any bad report, which is pretty encouraging. > > It appears to run 1.7.5 though: > http://demo.haproxy.org/ No, not this one. In fact demo.haproxy.org runs on my home server. The main site is co-hosted with the formilux site (also hosting the mailing list) and you can check the stats here : http://stats.formilux.org/ Willy
Re: HTTP/2 Termination vs. Firefox Quantum
Hello, On Fri, Dec 29, 2017 at 8:13 AM, Willy Tarreau wrote: > Yep. For what it's worth, it's been enabled for about one month on haproxy.org > and till now we didn't get any bad report, which is pretty encouraging. It appears to run 1.7.5 though: http://demo.haproxy.org/ >> For now, I'll personally leave http2 support disabled - since it's breaking >> my applications for a big percentage of my users, and I'll have to find an >> intermediate solution until at least the bug in regards to Firefox losing >> connections (this thing): >> >> Dec 28 21:22:35 localhost haproxy[1534]: 80.61.160.xxx:64921 >> [28/Dec/2017:21:22:12.309] https_frontend~ https_frontend/ >> -1/-1/-1/-1/22978 400 0 - - CR-- 1/1/0/0/0 0/0 "" >> Dec 28 21:22:40 localhost haproxy[1534]: 80.61.160.xxx:64972 >> [28/Dec/2017:21:22:35.329] https_frontend~ cdn-backend/mycdn 0/0/1/0/5001 >> 200 995 - - 1/1/0/1/0 0/0 "GET /js/app.js?v=1 HTTP/1.1" > > If this is met in production it definitely is a problem that we have to > address. Could you please try the attached patch to see if it fixes the > issue for you ? If so, I would at least like that we can keep some > statistics on it, and maybe even condition it. I've been able to pinpoint the POST issue affecting 20% of the requests. It has nothing to do with connection headers, invalid requests or browsers. 
It's that: - when sending the POST request to the backend server, haproxy sends a FIN before the server responds - nginx doesn't like that and closes the request (you will see nginx error code 499 in nginx server logs) - as there is a race on the backend server between receiving the FIN and completing the response, this does not always happen - haproxy returns "400 Bad Request" to the client, although the request is fine and the response was empty (I consider this a bug) The feature on nginx is basically what we call abortonclose, and can be disabled by the following nginx directives (depending on which backend module is used): http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_ignore_client_abort http://nginx.org/en/docs/http/ngx_http_fastcgi_module.html#fastcgi_ignore_client_abort How to reproduce the haproxy behavior: - have a http backend pointing to nc - make a POST request - this is even reproducible with H1 clients, however H2 has to be enabled on haproxy otherwise it doesn't send the FIN (strangely enough) Does this make sense? cheers, lukas
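As a side note on the mechanics: what nginx's ignore_client_abort directives toggle is whether the server reacts to EOF on the client socket while the upstream request is still pending. A rough standalone sketch of that detection (hypothetical helper, not nginx source), with a socketpair standing in for the client connection:

```c
#include <assert.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if the "server" end observes EOF from the client before it
 * has sent any response (what nginx logs as status 499), else 0. */
int client_aborted(int half_close)
{
    int fds[2]; /* fds[0]: client side, fds[1]: server side */
    char buf[64];
    const char *req = "POST / HTTP/1.1\r\n\r\nbla=bla";
    ssize_t n;
    int aborted = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) < 0)
        return -1;

    write(fds[0], req, strlen(req));
    if (half_close)
        shutdown(fds[0], SHUT_WR); /* the early FIN from the proxy */

    /* server: consume the request first */
    n = read(fds[1], buf, sizeof(buf));
    (void)n;

    /* then peek for a client abort before committing to the (still
     * pending) upstream response */
    struct pollfd pfd = { .fd = fds[1], .events = POLLIN };
    if (poll(&pfd, 1, 0) > 0 && read(fds[1], buf, sizeof(buf)) == 0)
        aborted = 1; /* EOF before any response: treated as an abort */

    close(fds[0]);
    close(fds[1]);
    return aborted;
}
```

With half_close set, the EOF races ahead of the response exactly as in the repro above; without it, poll() sees nothing pending and the request completes normally.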
Re: HTTP/2 Termination vs. Firefox Quantum
On Fri, Dec 29, 2017 at 08:46:18AM +, Lucas Rolff wrote: > > Yep. For what it's worth, it's been enabled for about one month on > haproxy.org and till now we didn't get any bad report, which is pretty > encouraging. > > Can I ask where? The negotiated protocol I get on https://haproxy.org/ is > http/1.1 in both Google Chrome and Firefox as an example. That's getting funny, as I'm having H2 here on firefox, as can be seen in the attached capture :-) Looking at the server's logs of the last hour, I'm clearly seeing *some* H2 traffic, about 5.5% of the HTTPS traffic : -bash-4.2$ fgrep 'public~' h.log | grep www.haproxy.org | wc -l 1804 -bash-4.2$ fgrep 'public~' h.log | grep www.haproxy.org | grep HTTP/2 | wc -l 108 I think this ratio is rather low and there are bots. If I focus only on the home page, it's slightly better but still low : -bash-4.2$ fgrep 'public~' h.log | fgrep 'www.haproxy.org/ ' | grep HTTP/2 | wc -l 21 -bash-4.2$ fgrep 'public~' h.log | fgrep 'www.haproxy.org/ ' | wc -l 298 And the fact that your browser doesn't negotiate it certainly implies that a number of other browsers do not either. > If I use curl, I can see it has ALPN enabled with http2 - however, mentioned > browsers doesn't seem to actually establish the http2 connection, but rather > a 1.1 connection. Actually that's not the case and that may explain the situation. The machine runs OpenSSL 1.0.1 so only NPN is used, ALPN isn't. I'll try with a static build of openssl 1.0.2 to see if the ratio increases. 
> For POST requests, it's not resolved with the patch, what I did find when > getting the 400 Bad Request on POST, it sometimes actually arrive at the > backend: > > So haproxy: > > Dec 29 08:01:36 localhost haproxy[4432]: 92.70.20.xx:52949 > [29/Dec/2017:08:01:36.800] https_frontend~ cdn-backend/mycdn 0/0/1/-1/3 400 > 187 - - CH-- 1/1/0/0/0 0/0 "POST /login HTTP/1.1" > Dec 29 08:01:36 localhost haproxy[4432]: 92.70.20.xx:52949 > [29/Dec/2017:08:01:36.806] https_frontend~ cdn-backend/mycdn 0/0/0/-1/2 400 > 187 - - CH-- 1/1/0/0/0 0/0 "POST /login HTTP/1.1" We definitely need to understand what's causing this. I'm currently thinking how to add more information to improve observability inside the mux to match what can currently be done outside (ie: we don't have the equivalent of the "show errors" nor error counters inside the mux). > https://snaps.hcdn.dk/AONBkWojDFwuErMjxF66Mg4VxJl9jz4KUM61jPpgcL.png - > however the request is ~ 68-69 ms, and that browser is Google Chrome, so in > that regard, the 400 Bad Request for POST/PUT also happens in Google Chrome. That's interesting because if the client managed to display the error, it means the stream was not reset and the connection not aborted. So in fact I suspect a side effect of the work done to better support the abortonclose option. > Which is the case for 1.8.2 and latest master. > > If I do the patch for 1.8.1 I still get BADREQ sometimes in haproxy with the > "empty" GET requests. Just to be clear, do you mean you *only* get the BADREQ with GET (ie it never happens with POST) or that you *also* get the BADREQ with GET ? I mean, given that we're fixing bugs, I'm not interested in seeing if patches also work as alternatives to bugs we've already fixed, but if you're observing regressions, that's different. 
In short what I'm interested in is : - bugs that are (still) present in supposedly fixed (hence recent) versions ; - among them, those that were not present in older versions (which are thus regressions) > On nginx backend it looks like this: > > https://gist.github.com/lucasRolff/5eb0ae277c97b7457fbe546a1118e34f > > I do see a few connection resets, however these happen also when things work. OK. > However, that also confirms that "connection" header isn't actually sent, > despite Firefox says it's sent in their dev tools. Great, this means that we *have* to consider errors reported by this version as it's supposed to be trusted then. > So, just so no misunderstandings happen, it seems there's two things going > wrong: > > === > 1.8.1: > - Firefox sometimes has issues with GET requests failing under http2 > (Confirmed by Maximilian B) > - Haproxy log shows: Dec 29 08:17:50 localhost haproxy[4881]: > 92.70.20.xx:56726 [29/Dec/2017:08:17:17.178] https_frontend~ > https_frontend/ -1/-1/-1/-1/33388 400 0 - - CR-- 24/1/0/0/0 0/0 > "" (CR == The client aborted before sending a full HTTP request) > > 1.8.2 and git master: > - All browsers seem to suffer with occasional issues with POST/PUT requests, > the GET requests issue still persist). (POST confirmed by Lukas T) > - Haproxy log shows (for POST requests): Dec 29 08:09:27 localhost > haproxy[4432]: 92.70.20.xx:54951 [29/Dec/2017:08:09:27.167] https_frontend~ > cdn-backend/mycdn 0/0/0/-1/2 400 187 - - CH-- 1/1/0/0/0 0/0 "POST /login > HTTP/1.1" (CH == The client aborted while waiting for the server to start > responding), while backend server
Re: HTTP/2 Termination vs. Firefox Quantum
> Yep. For what it's worth, it's been enabled for about one month on > haproxy.org and till now we didn't get any bad report, which is pretty > encouraging. Can I ask where? The negotiated protocol I get on https://haproxy.org/ is http/1.1 in both Google Chrome and Firefox as an example. If I use curl, I can see it has ALPN enabled with http2 – however, the mentioned browsers don’t seem to actually establish the http2 connection, but rather a 1.1 connection. > If this is met in production it definitely is a problem that we have to > address. Could you please try the attached patch to see if it fixes the issue > for you ? If so, I would at least like that we can keep some statistics on > it, and maybe even condition it. For POST requests, it’s not resolved with the patch; what I did find is that when getting the 400 Bad Request on POST, the request sometimes actually arrives at the backend: So haproxy: Dec 29 08:01:36 localhost haproxy[4432]: 92.70.20.xx:52949 [29/Dec/2017:08:01:36.800] https_frontend~ cdn-backend/mycdn 0/0/1/-1/3 400 187 - - CH-- 1/1/0/0/0 0/0 "POST /login HTTP/1.1" Dec 29 08:01:36 localhost haproxy[4432]: 92.70.20.xx:52949 [29/Dec/2017:08:01:36.806] https_frontend~ cdn-backend/mycdn 0/0/0/-1/2 400 187 - - CH-- 1/1/0/0/0 0/0 "POST /login HTTP/1.1" Backend: 2017/12/29 08:01:36 [info] 7701#7701: *1191 epoll_wait() reported that client prematurely closed connection, so upstream connection is closed too while sending request to upstream, client: 92.70.20.xxx, server: dashboard.domain.com, request: "POST /login HTTP/1.1", upstream: "fastcgi://unix:/var/run/php-fpm/php5.sock:", host: "dashboard.domain.com", referrer: "https://dashboard.domain.com/login"; 2017/12/29 08:01:36 [info] 7701#7701: *1193 epoll_wait() reported that client prematurely closed connection, so upstream connection is closed too while sending request to upstream, client: 92.70.20.xxx, server: dashboard.domain.com, request: "POST /login HTTP/1.1", upstream: "fastcgi://unix:/var/run/php-fpm/php5.sock:", 
host: "dashboard.domain.com", referrer: https://dashboard.domain.com/login So haproxy says the client aborted while waiting for the server to start responding. https://snaps.hcdn.dk/AONBkWojDFwuErMjxF66Mg4VxJl9jz4KUM61jPpgcL.png - however the request is ~ 68-69 ms, and that browser is Google Chrome, so in that regard, the 400 Bad Request for POST/PUT also happens in Google Chrome. Which is the case for 1.8.2 and latest master. If I do the patch for 1.8.1 I still get BADREQ sometimes in haproxy with the “empty” GET requests. On nginx backend it looks like this: https://gist.github.com/lucasRolff/5eb0ae277c97b7457fbe546a1118e34f I do see a few connection resets, however these happen also when things work. However, that also confirms that “connection” header isn’t actually sent, despite Firefox says it’s sent in their dev tools. So, just so no misunderstandings happen, it seems there’s two things going wrong: === 1.8.1: - Firefox sometimes has issues with GET requests failing under http2 (Confirmed by Maximilian B) - Haproxy log shows: Dec 29 08:17:50 localhost haproxy[4881]: 92.70.20.xx:56726 [29/Dec/2017:08:17:17.178] https_frontend~ https_frontend/ -1/-1/-1/-1/33388 400 0 - - CR-- 24/1/0/0/0 0/0 "" (CR == The client aborted before sending a full HTTP request) 1.8.2 and git master: - All browsers seem to suffer with occasional issues with POST/PUT requests, the GET requests issue still persist). (POST confirmed by Lukas T) - Haproxy log shows (for POST requests): Dec 29 08:09:27 localhost haproxy[4432]: 92.70.20.xx:54951 [29/Dec/2017:08:09:27.167] https_frontend~ cdn-backend/mycdn 0/0/0/-1/2 400 187 - - CH-- 1/1/0/0/0 0/0 "POST /login HTTP/1.1" (CH == The client aborted while waiting for the server to start responding), while backend server will see the request as aborted by the client as well. 
- - Client in scope of haproxy == browser - - Client in scope of backend == haproxy Semi conclusion: - Connection headers are not sent by Firefox, Safari or Webkit, despite webdev tools say so (in current versions at least) - Not able to replicate GET request failures in other browsers than Firefox despite doing thousands of requests (Tested Chrome, Safari, Opera, Webkit) - Some bug seems like it got introduced between 1.8.1 and 1.8.2 causing POST/PUT requests to fail in some cases What I’d like to know: - URL on haproxy.org that negotiates http2 correctly for Chrome and Firefox (not exactly sure why it doesn’t do it already?) ( https://snaps.hcdn.dk/JFsEzPmxspw9hnXuFyM5G4QGYst7Q6R2zXmkZbRRjz.png ) Best Regards, Lucas Rolff On 29/12/2017, 08.13, "Willy Tarreau" wrote: Hi Lucas, On Fri, Dec 29, 2017 at 06:06:49AM +, Lucas Rolff wrote: > As much as I agree about that specs should be followed, I realized that even > if there's people that want to follow the spec 100%, there will always be > implementations used in large scale that won't be following the spec 10
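[Editor's note] The ALPN question above (why these browsers end up on http/1.1) can be checked without a browser. A minimal sketch using Python's standard ssl module; the function names here are made up for illustration, and the result obviously depends on the server you point it at:

```python
import socket
import ssl

def alpn_context():
    """A client TLS context offering h2 then http/1.1 via ALPN, like a browser does."""
    ctx = ssl.create_default_context()
    ctx.set_alpn_protocols(["h2", "http/1.1"])
    return ctx

def negotiated_alpn(host, port=443):
    """Connect and report which protocol the server selected: 'h2', 'http/1.1', or None."""
    with socket.create_connection((host, port), timeout=5) as sock:
        with alpn_context().wrap_socket(sock, server_hostname=host) as tls:
            return tls.selected_alpn_protocol()

# Network example (result depends on the server):
#   negotiated_alpn("haproxy.org")
```

This mirrors what `curl -v` prints as "ALPN, server accepted to use h2" versus "server did not agree to a protocol".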
Re: HTTP/2 Termination vs. Firefox Quantum
Hi Lucas,

On Fri, Dec 29, 2017 at 06:06:49AM +, Lucas Rolff wrote:
> As much as I agree that specs should be followed, I realized that even
> if there's people that want to follow the spec 100%, there will always be
> implementations used in large scale that won't be following the spec 100% -

Absolutely, and the general principle always applies: "be strict in what you send, be liberal in what you accept".

> the reasoning behind this can be multiple - one I could imagine is the fact
> that when browsers or servers start implementing a new protocol (h2 is a good
> example) before the spec is actually 100% finalized - when it's then
> finalized, the vendor might end up with an implementation that is slightly
> violating the actual spec, but won't fix it either because the violations
> are minor or because the violations are not in any way "breaking" when you
> also compare it to the other implementations done.

Oh I know this pretty well, and you forget one of the most important aspects, which is that browsers first implemented SPDY and then modified it to create H2. So some of the H2 rules were not closely followed initially because they were not possible to implement with the initial SPDY code. Also, this early implementation before the protocol was 100% finalized is responsible for some of the dirty things that were promised to be fixed before the RFC but were rejected by some implementers who had already deployed (hpack header ordering, hpack encoding).

> In this case, if I understand you correctly, the errors are related to the
> fact that certain clients didn't implement the spec correctly in the first place.

Yes, but here we were speaking about a very recent client, which doesn't make much sense, especially with such an important header.

> I was very curious why e.g. the Connection header (even if it isn't sent by
> Firefox or Safari/Webkit even though their webdev tools say it is) would
> work in nginx, and Apache for that matter, so I asked on their mailing list
> why they were violating the spec.
>
> Valentin gave a rather interesting answer why they in their software actually
> decided to sometimes violate specific parts; it all boiled down to client
> support, because they also realized that many browsers (that might be EOL
> and never get updated) might have implementations that would not work with
> http2 in that case.
>
> http://mailman.nginx.org/pipermail/nginx/2017-December/055356.html

Oh I totally agree with all the points Valentin made there. We've all been forced to swallow a lot of ugly stuff to be compatible with existing deployments. Look at "option accept-invalid-http-requests/responses" to get an idea. Such ugly things even ended up as relaxed rules in the updated HTTP spec (723x), like support for the multiple content-length headers that some clients or servers send and that we check for correctness.

> I know that it's different software, and that how others decide to design
> their software is completely up to them.

It's important to keep in mind that haproxy's H2 support came very late. I participated a lot in the spec, and even authored a proposal 6 years ago. But despite this I didn't have enough time to start implementing it in haproxy by then. Most implementers already had a working SPDY implementation to start from, allowing them to progress much faster. Nginx was one of them, and as such they had to face all the early implementations and their issues. It's totally normal that they had to proceed like this.

> Violating specs on purpose is generally bad, no doubt about that - but if
> it's a requirement to be able to get good coverage in regards to clients
> (both new and old browsers that are actually in use), then I understand why
> one would go to such lengths as having to "hack" a bit to make sure generally
> used browsers can use the protocol.

We're always doing the same. Violating the specs on stuff you send is a big interoperability problem, because you force everyone else to violate the spec in turn to support you. But violating the specs on stuff you accept is less of a big deal, as long as it doesn't cause vulnerabilities. Silently dropping the connection-specific headers is no big deal, and clearly something we may have to do in the future if trouble with existing clients is reported. Similarly, when you run h2spec you'll see that sometimes we prefer to close using RST_STREAM rather than GOAWAY, because the latter would require us to keep the stream around "long enough" to remember it, and we're not going to keep millions of streams in memory just for the sake of not breaking a connection once in a while.

> > So at least my analysis for now is that for a reason still to be
> > determined, this version of firefox didn't correctly interoperate with
> > haproxy in a given environment
>
> Downgrading Firefox to earlier versions (such as 55, which is "pre"-quantum)
> reveals the same issue with bad requests.

Then I don't understand, as it seems that different
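[Editor's note] For illustration only (this is not haproxy's code): the "silently dropping the connection-specific headers" option Willy mentions could look roughly like this for a decoded header list, following RFC 7540 §8.1.2.2. Note that the real rule still allows "te: trailers", a nuance this sketch ignores:

```python
# Hypothetical sketch: strip the connection-specific headers that HTTP/2
# forbids, instead of rejecting the whole request with a 400.
CONNECTION_SPECIFIC = {
    "connection", "keep-alive", "proxy-connection",
    "transfer-encoding", "upgrade",
}

def sanitize_h2_headers(headers):
    """Drop connection-specific headers from a list of (name, value) pairs.

    Also drops any header nominated by the Connection header itself,
    matching HTTP/1.1 hop-by-hop semantics.
    """
    nominated = set()
    for name, value in headers:
        if name.lower() == "connection":
            nominated.update(t.strip().lower() for t in value.split(","))
    return [(n, v) for (n, v) in headers
            if n.lower() not in CONNECTION_SPECIFIC
            and n.lower() not in nominated]
```

With this approach a Firefox-style "Connection: keep-alive" would simply vanish from the request instead of triggering the 400 responses discussed in this thread.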
Re: HTTP/2 Termination vs. Firefox Quantum
Hi Willy,

> In fact it's a race between the GOAWAY frame caused by the invalid request,
> and the HEADERS frame being sent in response to the stream being closed
> I agree that it's quite confusing, but we're talking about responses to
> conditions that are explicitly forbidden in the spec, so I'd rather not spend
> too much energy on this for now.

As much as I agree that specs should be followed, I realized that even if there's people that want to follow the spec 100%, there will always be implementations used in large scale that won't be following the spec 100% - the reasoning behind this can be multiple - one I could imagine is the fact that when browsers or servers start implementing a new protocol (h2 is a good example) before the spec is actually 100% finalized - when it's then finalized, the vendor might end up with an implementation that is slightly violating the actual spec, but won't fix it either because the violations are minor or because the violations are not in any way "breaking" when you also compare it to the other implementations done.

In this case, if I understand you correctly, the errors are related to the fact that certain clients didn't implement the spec correctly in the first place.

I was very curious why e.g. the Connection header (even if it isn't sent by Firefox or Safari/Webkit even though their webdev tools say it is) would work in nginx, and Apache for that matter, so I asked on their mailing list why they were violating the spec.

Valentin gave a rather interesting answer why they in their software actually decided to sometimes violate specific parts; it all boiled down to client support, because they also realized that many browsers (that might be EOL and never get updated) might have implementations that would not work with http2 in that case.

http://mailman.nginx.org/pipermail/nginx/2017-December/055356.html

I know that it's different software, and that how others decide to design their software is completely up to them.

Violating specs on purpose is generally bad, no doubt about that - but if it's a requirement to be able to get good coverage in regards to clients (both new and old browsers that are actually in use), then I understand why one would go to such lengths as having to "hack" a bit to make sure generally used browsers can use the protocol.

> So at least my analysis for now is that for a reason still to be determined,
> this version of firefox didn't correctly interoperate with haproxy in a given
> environment

Downgrading Firefox to earlier versions (such as 55, which is "pre"-quantum) reveals the same issue with bad requests.

Hopefully you'll not have to violate the http2 spec in any way - but I do see a valid point explained by Valentin: the fact that you cannot guarantee all clients to be 100% compliant with the spec, and there might be a bunch of (used) EOL devices around.

I used to work at a place where haproxy was used extensively, so seeing http2 support getting better and better is a really awesome thing, because it would actually mean that http2 could be implemented in that specific environment. I do hope that in a few releases http2 in haproxy gets to a point where we could rate it as "production ready", with no real visible bugs from a customer perspective; at that point I think it would be good to deploy it in a large-scale environment (for a percentage of the requests) to see how much traffic might actually get dropped in case the spec is followed - to see from some real-world workload how many clients actually violate the spec.

For now, I'll personally leave http2 support disabled - since it's breaking my applications for a big percentage of my users - and I'll have to find an intermediate solution until at least the bug regarding Firefox losing connections (this thing) is fixed:

Dec 28 21:22:35 localhost haproxy[1534]: 80.61.160.xxx:64921 [28/Dec/2017:21:22:12.309] https_frontend~ https_frontend/ -1/-1/-1/-1/22978 400 0 - - CR-- 1/1/0/0/0 0/0 ""
Dec 28 21:22:40 localhost haproxy[1534]: 80.61.160.xxx:64972 [28/Dec/2017:21:22:35.329] https_frontend~ cdn-backend/mycdn 0/0/1/0/5001 200 995 - - 1/1/0/1/0 0/0 "GET /js/app.js?v=1 HTTP/1.1"

I never expect software to be bug-free - but at this given point, this specific issue causes too much visible "trouble" for end-users for me to be able to keep it enabled.

I'll figure out if I can replicate the same issue in more browsers (without the connection: keep-alive header); maybe that would give us more insight.

Best Regards,
Lucas Rolff

On 29/12/2017, 00.08, "Willy Tarreau" wrote:

    Hi Lukas,

    On Thu, Dec 28, 2017 at 09:19:24PM +0100, Lukas Tribus wrote:
    > On Thu, Dec 28, 2017 at 12:29 PM, Lukas Tribus wrote:
    > > Hello,
    > >
    > >> But in this example, you're using HTTP/1.1. The "Connection" header is
    > >> perfectly valid for 1.1. It's HTTP/2 which forbids it. There is no
    > >> inconsistency here.
Re: HTTP/2 Termination vs. Firefox Quantum
Hi Lukas,

On Thu, Dec 28, 2017 at 09:19:24PM +0100, Lukas Tribus wrote:
> On Thu, Dec 28, 2017 at 12:29 PM, Lukas Tribus wrote:
> > Hello,
> >
> >> But in this example, you're using HTTP/1.1. The "Connection" header is
> >> perfectly valid for 1.1. It's HTTP/2 which forbids it. There is no
> >> inconsistency here.
> >
> > For me a request like this:
> > $ curl -kv --http2 https://localhost/111 -H "Connection: keep-alive" -d "bla=bla"
> >
> > Fired multiple times from the shell, leads to a "400 Bad Request"
> > response in about 20~30% of the cases and is forwarded to the
> > backend in other cases.

In fact it's a race between the GOAWAY frame caused by the invalid request, and the HEADERS frame being sent in response to the stream being closed. It pretty much depends which one makes its way through the mux first, and given that both depend on the scheduling of all pending events, I hardly see what we can do to achieve better consistency, except cheating (e.g. killing the stream in a way that makes it silent). In both cases the GOAWAY should be sent, and only sometimes there is enough time to get the 400 sent in the middle, which gets reported. I agree that it's quite confusing, but we're talking about responses to conditions that are explicitly forbidden in the spec, so I'd rather not spend too much energy on this for now.

> However I am unable to reproduce the issue with Firefox: none of the
> quantum releases (57.0, 57.0.1, 57.0.2, 57.0.3) emit a connection
> header in my testing:

That's pretty interesting; so probably in the end it's not really sent. I can't test: I installed 57.0.3 on my machine and it's totally broken, tabs spin forever and even google.com does not load, so I had to revert to the last working Firefox ESR version :-(

> - https://http2.golang.org/reqinfo never shows a connection header
> (not even with POST)

You never know whether this one could be stripped on the server side, however.

> - sniffing with wireshark (using SSLKEYLOGFILE) also shows that
> Firefox never emits a connection header in H2

OK, this one sounds better.

> - the developer tools *always* show a connection header in the
> request, although there really isn't one - clearly there is a
> discrepancy between what is transmitted on the wire and what is shown
> in dev tools

Great, so that makes more sense regarding the observations so far. It's never fun when dev tools report false information, but it possibly depends where the information is extracted, and we could even imagine that the header is internally emitted and stripped just before the request is converted to H2, so let's not completely blame the dev tools yet either :-)

> What am I missing? Can you guys provide a decrypted trace showing this
> behavior, the output of the http2 golang test, and can you please both
> clarify which OS you reproduce this on?

So at least my analysis for now is that, for a reason still to be determined, this version of Firefox didn't correctly interoperate with haproxy in a given environment; that the dev tools reported a connection header, which, once forced to be sent via curl or nghttp, proved that haproxy rejected the request as mandated by the spec. This then led us to conclude that Firefox was hit by the same problem, which in fact isn't the case, as you just found. Thus we're indeed back to round one, trying to figure out why firefox+haproxy over there do not cope well with h2 (given that it doesn't even work with H1 on my machine, and that "ps auxw" clearly shows some buffer overflows affecting the argument strings, so I have zero trust at all in this version for now).

Cheers,
Willy
Re: HTTP/2 Termination vs. Firefox Quantum
Hello,

On Thu, Dec 28, 2017 at 10:26 PM, Lucas Rolff wrote:
>> the output of the http2 golang test and can you please both clarify which OS
>> you reproduce this on?
>
> If I visit the http2 golang test, I also don't see it, and I saw it in developer
> tools (because dev tools shouldn't show headers that aren't requested/received)
> - however, based on your findings, that seems to be the case.
>
> What I find odd is that Firefox (together with Safari and Webkit) all have
> the same behaviour; however, I'm unable to reproduce it in Chrome and Opera.
> I can reproduce the error in nghttp and curl when using -H "Connection:
> keep-alive" - omitting the header makes the request work in nghttp and curl
> as well (as expected).
>
> However, are we sure that the http2 golang test doesn't just ignore the header
> (or even remove it)?

You're right, it does not show up - yet another useless tool:

$ curl -k --http2 https://http2.golang.org/reqinfo -H "Connection: keep-alive"

> I found that Firefox actually has a way to enable HTTP logging to get info
> about what's going on - it can be enabled by going to about:networking#logging
>
> What I did is to first take a sample where it doesn't fail (in this case for
> a GET request), figure out what it does specific to my "app.css" file, and
> then another request (retrying 20-30 times) until it failed, and then
> compare what differs.

This log is just as useless as the developer tools; just look at it: in the main thread everything is HTTP/1.1 in both not-working.txt and Working.txt. Then the socket thread takes over and transforms the existing HTTP/1.1 request into an h2 stream. All the debug and developer tools focus on the main thread and are therefore useless for debugging this issue. You cannot rely on browser debugs or developer tools for this one.

> I'm personally sitting on OS X, and I've been able to reproduce it on Firefox
> on Ubuntu 16.04 as well.

Can you triple check that you can reproduce this on Ubuntu? I was unable, but maybe I didn't try hard enough.

I suggest you guys intercept the stream with SSLKEYLOGFILE in wireshark and take a look at the actual real request, see:
https://jimshaver.net/2015/02/11/decrypting-tls-browser-traffic-with-wireshark-the-easy-way/

Lukas
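[Editor's note] The SSLKEYLOGFILE trick Lukas suggests also works for scripted reproductions: Python 3.8+ can write session secrets in the same NSS key-log format browsers emit, so a capture of a scripted request can be decrypted in Wireshark the same way. A sketch (the helper name is made up):

```python
import os
import ssl

def keylogging_context(path=None):
    """A client TLS context that appends session secrets to a key-log file.

    The file uses the same NSS key-log format browsers write when the
    SSLKEYLOGFILE environment variable is set, so Wireshark can use it to
    decrypt the captured TLS stream. Requires Python 3.8+ / OpenSSL 1.1.1+.
    """
    ctx = ssl.create_default_context()
    ctx.keylog_filename = path or os.environ.get("SSLKEYLOGFILE", "keylog.txt")
    return ctx
```

Point Wireshark at the resulting file under Preferences > Protocols > TLS > "(Pre)-Master-Secret log filename" to see the decrypted H2 frames.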
Re: HTTP/2 Termination vs. Firefox Quantum
> the output of the http2 golang test and can you please both clarify which OS
> you reproduce this on?

If I visit the http2 golang test, I also don't see it, and I saw it in developer tools (because dev tools shouldn't show headers that aren't requested/received) - however, based on your findings, that seems to be the case.

What I find odd is that Firefox (together with Safari and Webkit) all have the same behaviour; however, I'm unable to reproduce it in Chrome and Opera. I can reproduce the error in nghttp and curl when using -H "Connection: keep-alive" - omitting the header makes the request work in nghttp and curl as well (as expected).

However, are we sure that the http2 golang test doesn't just ignore the header (or even remove it)?

I found that Firefox actually has a way to enable HTTP logging to get info about what's going on - it can be enabled by going to about:networking#logging

What I did is to first take a sample where it doesn't fail (in this case for a GET request), figure out what it does specific to my "app.css" file, and then another request (retrying 20-30 times) until it failed, and then compare what differs.

When it doesn't work, it does a "BeginConnect", some "ResolveProxy" and "OnProxyAvailable" function calls; it seems like it is establishing a connection. If that's the case, it would be odd, since the TCP connection should already be established during the initial page request.

From the logs:

Dec 28 21:22:35 localhost haproxy[1534]: 80.61.160.xxx:64921 [28/Dec/2017:21:22:12.309] https_frontend~ https_frontend/ -1/-1/-1/-1/22978 400 0 - - CR-- 1/1/0/0/0 0/0 ""
Dec 28 21:22:40 localhost haproxy[1534]: 80.61.160.xxx:64972 [28/Dec/2017:21:22:35.329] https_frontend~ cdn-backend/mycdn 0/0/1/0/5001 200 995 - - 1/1/0/1/0 0/0 "GET /js/app.js?v=1 HTTP/1.1"

This could explain the additional calls which look like a connection, due to the "CR" states:

C : the TCP session was unexpectedly aborted by the client.
R : the proxy was waiting for a complete, valid REQUEST from the client (HTTP mode only). Nothing was sent to any server.

Why haproxy sees the TCP session as aborted by the client, I'm not sure.

The working example also does actual stuff in the socket thread, whereas the non-working one doesn't really do much other than generating headers (last two lines). The output is here: https://gist.github.com/lucasRolff/c7f25c93281715c3911d36e9488b111a

> and can you please both clarify which OS you reproduce this on?

I'm personally sitting on OS X, and I've been able to reproduce it on Firefox on Ubuntu 16.04 as well.

On 28/12/2017, 21.19, "lu...@ltri.eu on behalf of Lukas Tribus" wrote:

    Hello,

    On Thu, Dec 28, 2017 at 12:29 PM, Lukas Tribus wrote:
    > Hello,
    >
    >> But in this example, you're using HTTP/1.1. The "Connection" header is
    >> perfectly valid for 1.1. It's HTTP/2 which forbids it. There is no
    >> inconsistency here.
    >
    > For me a request like this:
    > $ curl -kv --http2 https://localhost/111 -H "Connection: keep-alive" -d "bla=bla"
    >
    > Fired multiple times from the shell, leads to a "400 Bad Request"
    > response in about 20~30% of the cases and is forwarded to the
    > backend in other cases.
    > I'm unable to reproduce a "400 Bad Request" when using a GET request in
    > my quick tests.
    >
    > Here are two identical requests with different haproxy behavior:

    My previous mail proves that haproxy's behavior is inconsistent.
    However I am unable to reproduce the issue with Firefox: none of the
    quantum releases (57.0, 57.0.1, 57.0.2, 57.0.3) emit a connection
    header in my testing:

    - https://http2.golang.org/reqinfo never shows a connection header
      (not even with POST)
    - sniffing with wireshark (using SSLKEYLOGFILE) also shows that
      Firefox never emits a connection header in H2
    - the developer tools *always* show a connection header in the
      request, although there really isn't one - clearly there is a
      discrepancy between what is transmitted on the wire and what is
      shown in dev tools

    What am I missing? Can you guys provide a decrypted trace showing this
    behavior, the output of the http2 golang test, and can you please both
    clarify which OS you reproduce this on?

    Thanks,
    Lukas
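[Editor's note] The C and R meanings quoted above come from the haproxy documentation's termination-state table (the H meaning, waiting for response HEADERS from the server, is the same table's entry behind the CH-- lines elsewhere in this thread). As an illustration only, a tiny hypothetical decoder for the first two characters:

```python
# Hypothetical helper (not haproxy code): decode chars 1-2 of the 4-char
# termination state in haproxy HTTP logs, e.g. "CR--" or "CH--".
def explain_termination_state(flags):
    """Return (who aborted, at what session stage) for a state like 'CR--'."""
    who = {
        "C": "TCP session unexpectedly aborted by the client",
        "S": "TCP session unexpectedly aborted by the server",
        "-": "normal session completion",
    }
    when = {
        "R": "while waiting for a complete, valid REQUEST from the client",
        "H": "while waiting for complete, valid response HEADERS from the server",
        "-": "at normal session completion",
    }
    return (who.get(flags[0], "unknown"), when.get(flags[1], "unknown"))

# Example: explain_termination_state("CR--")
```

This makes the two failure signatures in the thread easy to tell apart: CR-- means the client gave up before a full request arrived, CH-- means it gave up while the response headers were still pending.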
Re: HTTP/2 Termination vs. Firefox Quantum
Hello,

On Thu, Dec 28, 2017 at 12:29 PM, Lukas Tribus wrote:
> Hello,
>
>> But in this example, you're using HTTP/1.1. The "Connection" header is
>> perfectly valid for 1.1. It's HTTP/2 which forbids it. There is no
>> inconsistency here.
>
> For me a request like this:
> $ curl -kv --http2 https://localhost/111 -H "Connection: keep-alive" -d "bla=bla"
>
> Fired multiple times from the shell, leads to a "400 Bad Request"
> response in about 20~30% of the cases and is forwarded to the
> backend in other cases.
> I'm unable to reproduce a "400 Bad Request" when using a GET request in
> my quick tests.
>
> Here are two identical requests with different haproxy behavior:

My previous mail proves that haproxy's behavior is inconsistent. However I am unable to reproduce the issue with Firefox: none of the quantum releases (57.0, 57.0.1, 57.0.2, 57.0.3) emit a connection header in my testing:

- https://http2.golang.org/reqinfo never shows a connection header (not even with POST)
- sniffing with wireshark (using SSLKEYLOGFILE) also shows that Firefox never emits a connection header in H2
- the developer tools *always* show a connection header in the request, although there really isn't one - clearly there is a discrepancy between what is transmitted on the wire and what is shown in dev tools

What am I missing? Can you guys provide a decrypted trace showing this behavior, the output of the http2 golang test, and can you please both clarify which OS you reproduce this on?

Thanks,
Lukas
Re: HTTP/2 Termination vs. Firefox Quantum
Hello,

> But in this example, you're using HTTP/1.1. The "Connection" header is
> perfectly valid for 1.1. It's HTTP/2 which forbids it. There is no
> inconsistency here.

For me a request like this:

$ curl -kv --http2 https://localhost/111 -H "Connection: keep-alive" -d "bla=bla"

fired multiple times from the shell leads to a "400 Bad Request" response in about 20~30% of the cases, and is forwarded to the backend in the other cases. I'm unable to reproduce a "400 Bad Request" when using a GET request in my quick tests.

Here are two identical requests with different haproxy behavior:

$ curl -kv --http2 https://localhost/111 -H "Connection: keep-alive" -d "bla=bla"
* Trying ::1...
* TCP_NODELAY set
* Connected to localhost (::1) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=temp.lan.ltri.eu
*  start date: May  4 16:35:00 2017 GMT
*  expire date: Aug  2 16:35:00 2017 GMT
*  issuer: C=US; O=Let's Encrypt; CN=Let's Encrypt Authority X3
*  SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x5e63f0)
> POST /111 HTTP/2
> Host: localhost
> User-Agent: curl/7.56.1
> Accept: */*
> Connection: keep-alive
> Content-Length: 7
> Content-Type: application/x-www-form-urlencoded
>
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
* We are completely uploaded and fine
< HTTP/2 200
< server: nginx
< date: Thu, 28 Dec 2017 11:25:15 GMT
< content-type: text/html; charset=utf-8
< x-powered-by: PHP/5.3.10-1ubuntu3.26
< set-cookie: HAPTESTa=1514460315
< set-cookie: HAPTESTb=1514460315
< set-cookie: HAPTESTc=1514460315
< set-cookie: HAPTESTd=1514460315
<
a browser has no cookie
Please wait: GET parameter expectValue not found; resetting now ...
Reset source-code (PHP)
* Connection #0 to host localhost left intact

$ curl -kv --http2 https://localhost/111 -H "Connection: keep-alive" -d "bla=bla"
* Trying ::1...
* TCP_NODELAY set
* Connected to localhost (::1) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=temp.lan.ltri.eu
*  start date: May  4 16:35:00 2017 GMT
*  expire date: Aug  2 16:35:00 2017 GMT
*  issuer: C=US; O=Let's Encrypt; CN=Let's Encrypt Authority X3
*  SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x12f63f0)
> POST /111 HTTP/2
> Host: localhost
> User-Agent: curl/7.56.1
> Accept: */*
> Connection: keep-alive
> Content-Length: 7
> Content-Type: application/x-www-form-urlencoded
>
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
* We are completely uploaded and fine
< HTTP/2 400
< cache-control: no-cache
< content-type: text/html
<
400 Bad request
Your browser sent an invalid request.
* Connection #0 to host localhost left intact
Re: HTTP/2 Termination vs. Firefox Quantum
> Did I get it right, according to the spec, the "Connection" header is
> forbidden ("MUST NOT"), still, Firefox does send it? This leads to the
> described issue.

I think it indeed might be the root cause, also for failed GET requests (which only seem to happen sometimes?), but it's really visible with PUT and POST requests. I've opened https://bugzilla.mozilla.org/show_bug.cgi?id=1427256 - so if by any chance you can go comment with an "I face the same issue", then Firefox might pick it up faster.

> Firefox sends "Connection: keep-alive" while Chrome does not.

Correct; however, it seems like Safari also sends it, so in fact I have to open a bug report to Safari as well.

Best Regards,
Lucas Rolff

On 28/12/2017, 12.08, "Maximilian Böhm" wrote:

    Sorry for my long absence. Thank you, Lucas, for perfectly describing and digging into the issue. I'll be here if there is any further assistance required.

    Did I get it right, according to the spec, the "Connection" header is forbidden ("MUST NOT"), still, Firefox does send it? This leads to the described issue. Just checked it on https://http2.golang.org/. Firefox sends "Connection: keep-alive" while Chrome does not.

    >> I'd rather not fall into such idiocies, you see.

    Thanks 😊 - whereby, I'd rather prefer such idiocies instead of installing plugins without asking users (well, that's another topic, I guess.. https://www.theverge.com/2017/12/16/16784628/mozilla-mr-robot-arg-plugin-firefox-looking-glass )

    -----Original Message-----
    From: Lucas Rolff [mailto:lu...@lucasrolff.com]
    Sent: Thursday, 28 December 2017 11:27
    To: Willy Tarreau
    Cc: haproxy@formilux.org
    Subject: Re: HTTP/2 Termination vs. Firefox Quantum

    > It's normal then, as it's mandated by the HTTP/2 spec to reject
    > requests containing any connection-specific header fields

    In that case, haproxy should be consistent in its way of handling clients sending connection-specific headers:

    $ curl 'https://dashboard.domain.com/js/app.js?v=1' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:57.0) Gecko/20100101 Firefox/57.0' --compressed -H 'Connection: keep-alive' -o /dev/null -vvv
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
      0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
    * Trying 178.63.183.40...
    * TCP_NODELAY set
    * Connected to dashboard.domain.com (178.63.183.xxx) port 443 (#0)
    * ALPN, offering h2
    * ALPN, offering http/1.1
    * Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
    * successfully set certificate verify locations:
    *   CAfile: /etc/ssl/cert.pem
        CApath: none
    * TLSv1.2 (OUT), TLS handshake, Client hello (1):
    } [512 bytes data]
    * TLSv1.2 (IN), TLS handshake, Server hello (2):
    { [93 bytes data]
    * TLSv1.2 (IN), TLS handshake, Certificate (11):
    { [3000 bytes data]
    * TLSv1.2 (IN), TLS handshake, Server key exchange (12):
    { [333 bytes data]
    * TLSv1.2 (IN), TLS handshake, Server finished (14):
    { [4 bytes data]
    * TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
    } [70 bytes data]
    * TLSv1.2 (OUT), TLS change cipher, Client hello (1):
    } [1 bytes data]
    * TLSv1.2 (OUT), TLS handshake, Finished (20):
    } [16 bytes data]
    * TLSv1.2 (IN), TLS change cipher, Client hello (1):
    { [1 bytes data]
    * TLSv1.2 (IN), TLS handshake, Finished (20):
    { [16 bytes data]
    * SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
    * ALPN, server did not agree to a protocol
    * Server certificate:
    *  subject: OU=Domain Control Validated; CN=*.domain.com
    *  start date: Jan  3 11:17:55 2017 GMT
    *  expire date: Jan  4 11:17:55 2018 GMT
    *  subjectAltName: host "dashboard.domain.com" matched cert's "*.domain.com"
    *  issuer: C=BE; O=GlobalSign nv-sa; CN=AlphaSSL CA - SHA256 - G2
    *  SSL certificate verify ok.
    > GET /js/app.js?v=1 HTTP/1.1
    > Host: dashboard.domain.com
    > Accept: */*
    > Accept-Encoding: deflate, gzip
    > User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:57.0) Gecko/20100101 Firefox/57.0
    > Connection: keep-alive
    >
    < HTTP/1.1 200 OK
    < Server: nginx/1.13.5
    < Date: Thu, 28 Dec 2017 10:11:34 GMT
    < Content-Type: application/javascript; charset=utf-8
    < Last-Modified: Sun, 25 Jun 2017 17:17:05 GMT
    < Transfer-Encoding: chunked
    < Vary: Accept-Encoding
    < ETag: W/"594ff011-7b
Re: HTTP/2 Termination vs. Firefox Quantum
Sorry regarding my previous curl – I didn’t use --http2 in my curl request, but result is the same (with negotiated http2 protocol), I’ve removed the TLSv1.2 output since it’s useless in this case: === $ curl 'https://dashboard.domain.com/js/app.js?v=1' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:57.0) Gecko/20100101 Firefox/57.0' --compressed -H 'Connection: keep-alive' -vo /dev/null --http2 % Total% Received % Xferd Average Speed TimeTime Time Current Dload Upload Total SpentLeft Speed 0 00 00 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying 178.63.183.xx... * TCP_NODELAY set * Connected to dashboard.domain.com (178.63.183.xx) port 443 (#0) * ALPN, offering h2 * ALPN, offering http/1.1 * Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH * successfully set certificate verify locations: * CAfile: /etc/ssl/cert.pem * SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256 * ALPN, server accepted to use h2 * Server certificate: * subject: OU=Domain Control Validated; CN=*.domain.com * start date: Jan 3 11:17:55 2017 GMT * expire date: Jan 4 11:17:55 2018 GMT * subjectAltName: host "dashboard.domain.com" matched cert's "*.domain.com" * issuer: C=BE; O=GlobalSign nv-sa; CN=AlphaSSL CA - SHA256 - G2 * SSL certificate verify ok. * Using HTTP2, server supports multi-use * Connection state changed (HTTP/2 confirmed) * Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0 * Using Stream ID: 1 (easy handle 0x7f860b005800) > GET /js/app.js?v=1 HTTP/2 > Host: dashboard.domain.com > Accept: */* > Accept-Encoding: deflate, gzip > User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:57.0) > Gecko/20100101 Firefox/57.0 > Connection: keep-alive > * Connection state changed (MAX_CONCURRENT_STREAMS updated)! 
< HTTP/2 200 < server: nginx/1.13.5 < date: Thu, 28 Dec 2017 10:43:31 GMT < content-type: application/javascript; charset=utf-8 < last-modified: Sun, 25 Jun 2017 17:17:05 GMT < vary: Accept-Encoding < etag: W/"594ff011-7b7" < content-encoding: gzip < { [683 bytes data] 100 6830 6830 0 3749 0 --:--:-- --:--:-- --:--:-- 3752 * Connection #0 to host dashboard.domain.com left intact === So as you can see, I’m sending a “Connection: keep-alive” request from the client (like Firefox does), protocol is http2, and the response is the javascript file I’ve requested. And when it’s requested in Firefox: https://snaps.hcdn.dk/3uaT06s2RJmAqMu5TqSJYAxBHjSzHOJGiHjfK0qcrV.png Best Regards, Lucas Rolff On 28/12/2017, 11.39, "Willy Tarreau" wrote: On Thu, Dec 28, 2017 at 10:27:28AM +, Lucas Rolff wrote: > In that case, haproxy should be consistent in it's way of handling clients > sending connection-specific headers: > > $ curl 'https://dashboard.domain.com/js/app.js?v=1' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:57.0) Gecko/20100101 Firefox/57.0' --compressed -H 'Connection: keep-alive' -o /dev/null -vvv > % Total% Received % Xferd Average Speed TimeTime Time Current > Dload Upload Total SpentLeft Speed > 0 00 00 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying 178.63.183.40... 
> * TCP_NODELAY set > * Connected to dashboard.domain.com (178.63.183.xxx) port 443 (#0) > * ALPN, offering h2 > * ALPN, offering http/1.1 > * Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH > * successfully set certificate verify locations: > * CAfile: /etc/ssl/cert.pem > CApath: none > * TLSv1.2 (OUT), TLS handshake, Client hello (1): > } [512 bytes data] > * TLSv1.2 (IN), TLS handshake, Server hello (2): > { [93 bytes data] > 0 00 00 0 0 0 --:--:-- --:--:-- --:--:-- 0* TLSv1.2 (IN), TLS handshake, Certificate (11): > { [3000 bytes data] > * TLSv1.2 (IN), TLS handshake, Server key exchange (12): > { [333 bytes data] > * TLSv1.2 (IN), TLS handshake, Server finished (14): > { [4 bytes data] > * TLSv1.2 (OUT), TLS handshake, Client key exchange (16): > } [70 bytes data] > * TLSv1.2 (OUT), TLS change cipher, Client hello (1): > } [1 bytes data] > * TLSv1.2 (OUT), TLS handshake, Finished (20): > } [16 bytes data] > * TLSv1.2 (IN), TLS change cipher, Client hello (1): > { [1 bytes data] > * TLSv1.2 (IN), TLS handshake, Finished (20): > { [16 bytes data] > * SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256 > * ALPN, server did not agree to a protocol > * Server certificate: > * subject: OU=Domain Control Validated; CN=*.domain.com > * start date: Jan 3 11:17:55 2017 GMT > * expire date: Jan 4 11:17:55 2018
Re: HTTP/2 Termination vs. Firefox Quantum
On Thu, Dec 28, 2017 at 10:27:28AM +, Lucas Rolff wrote: > In that case, haproxy should be consistent in it's way of handling clients > sending connection-specific headers: > > $ curl 'https://dashboard.domain.com/js/app.js?v=1' -H 'User-Agent: > Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:57.0) Gecko/20100101 > Firefox/57.0' --compressed -H 'Connection: keep-alive' -o /dev/null -vvv > % Total% Received % Xferd Average Speed TimeTime Time > Current > Dload Upload Total SpentLeft Speed > 0 00 00 0 0 0 --:--:-- --:--:-- --:--:-- > 0* Trying 178.63.183.40... > * TCP_NODELAY set > * Connected to dashboard.domain.com (178.63.183.xxx) port 443 (#0) > * ALPN, offering h2 > * ALPN, offering http/1.1 > * Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH > * successfully set certificate verify locations: > * CAfile: /etc/ssl/cert.pem > CApath: none > * TLSv1.2 (OUT), TLS handshake, Client hello (1): > } [512 bytes data] > * TLSv1.2 (IN), TLS handshake, Server hello (2): > { [93 bytes data] > 0 00 00 0 0 0 --:--:-- --:--:-- --:--:-- > 0* TLSv1.2 (IN), TLS handshake, Certificate (11): > { [3000 bytes data] > * TLSv1.2 (IN), TLS handshake, Server key exchange (12): > { [333 bytes data] > * TLSv1.2 (IN), TLS handshake, Server finished (14): > { [4 bytes data] > * TLSv1.2 (OUT), TLS handshake, Client key exchange (16): > } [70 bytes data] > * TLSv1.2 (OUT), TLS change cipher, Client hello (1): > } [1 bytes data] > * TLSv1.2 (OUT), TLS handshake, Finished (20): > } [16 bytes data] > * TLSv1.2 (IN), TLS change cipher, Client hello (1): > { [1 bytes data] > * TLSv1.2 (IN), TLS handshake, Finished (20): > { [16 bytes data] > * SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256 > * ALPN, server did not agree to a protocol > * Server certificate: > * subject: OU=Domain Control Validated; CN=*.domain.com > * start date: Jan 3 11:17:55 2017 GMT > * expire date: Jan 4 11:17:55 2018 GMT > * subjectAltName: host "dashboard.domain.com" matched cert's 
"*.domain.com" > * issuer: C=BE; O=GlobalSign nv-sa; CN=AlphaSSL CA - SHA256 - G2 > * SSL certificate verify ok. > > GET /js/app.js?v=1 HTTP/1.1 > > Host: dashboard.domain.com > > Accept: */* > > Accept-Encoding: deflate, gzip > > User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:57.0) > > Gecko/20100101 Firefox/57.0 > > Connection: keep-alive > > > < HTTP/1.1 200 OK > < Server: nginx/1.13.5 > < Date: Thu, 28 Dec 2017 10:11:34 GMT > < Content-Type: application/javascript; charset=utf-8 > < Last-Modified: Sun, 25 Jun 2017 17:17:05 GMT > < Transfer-Encoding: chunked > < Vary: Accept-Encoding > < ETag: W/"594ff011-7b7" > < Content-Encoding: gzip > < > { [695 bytes data] > 100 6830 6830 0 3936 0 --:--:-- --:--:-- --:--:-- 3925 > * Connection #0 to host dashboard.domain.com left intact > > > Making a GET request (and sending Connection: keep-alive) continues to work, > and haproxy doesn't handle it as malformed according to the specification. But in this example, you're using HTTP/1.1, The "Connection" header is perfectly valid for 1.1. It's HTTP/2 which forbids it. There is no inconsistency here. > I agree that if the specification says it cannot be done, it shouldn't, but > then at least it should be consistent. It *is*, the code makes no distinction based on the request. Any of the following headers coming from a decoded HTTP/2 request will make it immediately fail : connection, proxy-connection, keep-alive, upgrade, transfer-encoding. > Fyi, in that case - nginx isn't compliant with the http2 specification. I find this a bit strange, maybe they had to relax it during an early implementation to accomodate other incompatible ones. > I'm creating a bug report with Mozilla to see if they can change the > behaviour of their browser to not send a Connection header at all, maybe this > will resolve the issue. OK thank you. 
> > I still have no idea what this "quantum" is by the way ;-) > > It's the 2017 version of "We released a better version of Firefox that is > faster".. Let's call it Firefox Quantum. Ah, so they changed numbering and naming again and again? It becomes quite painful to follow. Considering that Quantum is greater than 52 is, to me, equivalent to saying that Monday is greater than blue. When will someone think of using a countdown of the estimated life left in their project to name releases? You could have version 2 billion to indicate that you estimate 60 years left, for example. > Haproxy 2.0 could be named "haproxy Quantum 1.5" just for the giggles. I'd rather not fall into such idiocies, you see. Cheers, Willy
Re: HTTP/2 Termination vs. Firefox Quantum
> It's normal then, as it's mandated by the HTTP/2 spec to reject requests > containing any connection-specific header fields In that case, haproxy should be consistent in its way of handling clients sending connection-specific headers: $ curl 'https://dashboard.domain.com/js/app.js?v=1' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:57.0) Gecko/20100101 Firefox/57.0' --compressed -H 'Connection: keep-alive' -o /dev/null -vvv % Total% Received % Xferd Average Speed TimeTime Time Current Dload Upload Total SpentLeft Speed 0 00 00 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying 178.63.183.40... * TCP_NODELAY set * Connected to dashboard.domain.com (178.63.183.xxx) port 443 (#0) * ALPN, offering h2 * ALPN, offering http/1.1 * Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH * successfully set certificate verify locations: * CAfile: /etc/ssl/cert.pem CApath: none * TLSv1.2 (OUT), TLS handshake, Client hello (1): } [512 bytes data] * TLSv1.2 (IN), TLS handshake, Server hello (2): { [93 bytes data] 0 00 00 0 0 0 --:--:-- --:--:-- --:--:-- 0* TLSv1.2 (IN), TLS handshake, Certificate (11): { [3000 bytes data] * TLSv1.2 (IN), TLS handshake, Server key exchange (12): { [333 bytes data] * TLSv1.2 (IN), TLS handshake, Server finished (14): { [4 bytes data] * TLSv1.2 (OUT), TLS handshake, Client key exchange (16): } [70 bytes data] * TLSv1.2 (OUT), TLS change cipher, Client hello (1): } [1 bytes data] * TLSv1.2 (OUT), TLS handshake, Finished (20): } [16 bytes data] * TLSv1.2 (IN), TLS change cipher, Client hello (1): { [1 bytes data] * TLSv1.2 (IN), TLS handshake, Finished (20): { [16 bytes data] * SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256 * ALPN, server did not agree to a protocol * Server certificate: * subject: OU=Domain Control Validated; CN=*.domain.com * start date: Jan 3 11:17:55 2017 GMT * expire date: Jan 4 11:17:55 2018 GMT * subjectAltName: host "dashboard.domain.com" matched cert's "*.domain.com" * issuer: 
C=BE; O=GlobalSign nv-sa; CN=AlphaSSL CA - SHA256 - G2 * SSL certificate verify ok. > GET /js/app.js?v=1 HTTP/1.1 > Host: dashboard.domain.com > Accept: */* > Accept-Encoding: deflate, gzip > User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:57.0) > Gecko/20100101 Firefox/57.0 > Connection: keep-alive > < HTTP/1.1 200 OK < Server: nginx/1.13.5 < Date: Thu, 28 Dec 2017 10:11:34 GMT < Content-Type: application/javascript; charset=utf-8 < Last-Modified: Sun, 25 Jun 2017 17:17:05 GMT < Transfer-Encoding: chunked < Vary: Accept-Encoding < ETag: W/"594ff011-7b7" < Content-Encoding: gzip < { [695 bytes data] 100 6830 6830 0 3936 0 --:--:-- --:--:-- --:--:-- 3925 * Connection #0 to host dashboard.domain.com left intact Making a GET request (and sending Connection: keep-alive) continues to work, and haproxy doesn’t handle it as malformed according to the specification. So not sure if POST/PUT is handled differently within haproxy when it comes to the connection header compared to GET requests. I agree that if the specification says it cannot be done, it shouldn’t, but then at least it should be consistent. Fyi, in that case – nginx isn’t compliant with the http2 specification. I’m creating a bug report with Mozilla to see if they can change the behaviour of their browser to not send a Connection header at all, maybe this will resolve the issue. > I still have no idea what this "quantum" is by the way ;-) It’s the 2017 version of “We released a better version of Firefox that is faster”.. Let’s call it Firefox Quantum. Haproxy 2.0 could be named “haproxy Quantum 1.5” just for the giggles. Best Regards, Lucas R On 28/12/2017, 10.27, "Willy Tarreau" wrote: Hi Lucas, On Thu, Dec 28, 2017 at 08:38:52AM +, Lucas Rolff wrote: > It worked as it should, so I started adding more and more headers, until I > hit the culprit: -H "Connection: keep-alive" or -H "Connection: close" (or > even "Connection: test") (...) 
It's normal then, as it's mandated by the HTTP/2 spec to reject requests containing any connection-specific header fields (Connection being the first one): 8.1.2.2. Connection-Specific Header Fields HTTP/2 does not use the Connection header field to indicate connection-specific header fields; in this protocol, connection-specific metadata is conveyed by other means. An endpoint MUST NOT generate an HTTP/2 message containing connection-specific header fields; any message containing connection-specific header fields MUST be treated as malformed (Section 8.1.2.6). > I tried to replicate the issue in haproxy version 1.8.1, 1.8.2 and latest > commit from master - all with the same result, I also tried playing around > with the op
Re: HTTP/2 Termination vs. Firefox Quantum
Hi Lucas, On Thu, Dec 28, 2017 at 08:38:52AM +, Lucas Rolff wrote: > It worked as it should, so I started adding more and more headers, until I > hit the culprit: -H "Connection: keep-alive" or -H "Connection: close" (or > even "Connection: test") (...) It's normal then, as it's mandated by the HTTP/2 spec to reject requests containing any connection-specific header fields (Connection being the first one): 8.1.2.2. Connection-Specific Header Fields HTTP/2 does not use the Connection header field to indicate connection-specific header fields; in this protocol, connection-specific metadata is conveyed by other means. An endpoint MUST NOT generate an HTTP/2 message containing connection-specific header fields; any message containing connection-specific header fields MUST be treated as malformed (Section 8.1.2.6). > I tried to replicate the issue in haproxy version 1.8.1, 1.8.2 and latest > commit from master - all with the same result, I also tried playing around > with the options of forceclose, http-server-close etc on both the frontend > and backend in haproxy, none of them seem to "fix" the issue. That's normal, you don't even reach this step, as it dies while decompressing the request (right after hpack decoding just before conversion from H2 to H1). > However in 1.8.2 I have 100% chance of replicating it using post requests in > Firefox and nghttp, where in 1.8.1 the issue in the majority of the time > works in Firefox and only have the few percentage failure rate. There were so many state issues with 1.8.1 that it's not very surprising; it's possible that some of them would fail differently. > I haven't been able to replicate the issue in other than Firefox and nghttp - Clearly this means that this has to be reported to the Firefox team, as it's expected to break basically everywhere (or to help detect non-compliant servers). 
I still have no idea what this "quantum" is by the way ;-) > Also, sorry for the lengthy email Quite the opposite, it was extremely helpful in spotting the problem's origin. Thanks! Willy
Re: HTTP/2 Termination vs. Firefox Quantum
ther), and the other browsers have a more “slacking” approach towards handling possible compression issues, or some kind of error correction – I’m not sure – but it seems to be related to how headers are compressed, but only when using “keep-alive”, “close”, etc. Simply removing the header (in nghttp) resolves the issue – since I cannot remove the “connection” header in Firefox, I’m not sure if it will actually fix it there as well. Also, sorry for the lengthy email Best Regards, Lucas Rolff From: Lucas Rolff Date: Wednesday, 27 December 2017 at 23.08 To: Lukas Tribus Cc: "haproxy@formilux.org" Subject: Re: HTTP/2 Termination vs. Firefox Quantum My small site is basically an HTML page with two pages (home and about for example), each page contains basic markup, some styling and some JavaScript, switching pages tends to replicate the issue every now and then (differs a bit in how often it happens, but possibly every 20-30 requests or so) I’m a single user able to replicate the issue, with no traffic other than myself So the test scenario is fairly easy to replicate in that sense Tomorrow I’ll check if I can replicate the same issue in other browsers as well I haven’t been able to replicate it with curl yet and haven’t tried with nghttp, I’ll continue to troubleshoot meanwhile, but it’s a bit odd it happens Best regards, Get Outlook for iOS<https://aka.ms/o0ukef> From: lu...@ltri.eu on behalf of Lukas Tribus Sent: Wednesday, December 27, 2017 10:51:01 PM To: Lucas Rolff Cc: haproxy@formilux.org Subject: Re: HTTP/2 Termination vs. Firefox Quantum Hello Lucas, On Wed, Dec 27, 2017 at 9:24 PM, Lucas Rolff wrote: > Can't even compose an email correctly.. > > So: > > I experience the same issue however with nginx as a backend. > > I tried enabling “option httplog” within my frontend, it's rather easy for > me to replicate, it affects a few percent of the traffic. 
So you have this html endpoint and you hit F5 in FF Quantum until you can see the issue, or how is it that you actually reproduce? Does this occur in an idle test environment as well, or do you need production traffic to hit this issue? > I have a site, with a total of 3 requests being performed: > > - The HTML itself > - 1 app.css file > - 1 app.js file Please clarify: - if any of those responses are cached (or if they are uncachable) and if they use any kind of revalidation (If-modified-since --> 304) - if any of those files are compressed, by haproxy or nginx, and which compression is used - the exact uncompressed content-length of each of those responses - the exact client OS - is Quantum a 32 or 64 bit executable on the client? - is haproxy a 32 or 64 bit executable? - can you run this repro in debug mode and show the output when the issue occurs? - RTT and bandwidth between server and client (race conditions may depend on specific network performance - not every issue is reproducible on localhost) - confirm that FF sandboxing is not affecting this issue by lowering security.sandbox.content.level to 2 or 0 in about:config (then restart FF) - don't forget to turn it back on Thanks, Lukas
Re: HTTP/2 Termination vs. Firefox Quantum
My small site is basically an HTML page with two pages (home and about for example), each page contains basic markup, some styling and some JavaScript, switching pages tends to replicate the issue every now and then (differs a bit in how often it happens, but possibly every 20-30 requests or so) I’m a single user able to replicate the issue, with no traffic other than myself So the test scenario is fairly easy to replicate in that sense Tomorrow I’ll check if I can replicate the same issue in other browsers as well I haven’t been able to replicate it with curl yet and haven’t tried with nghttp, I’ll continue to troubleshoot meanwhile, but it’s a bit odd it happens Best regards, Get Outlook for iOS<https://aka.ms/o0ukef> From: lu...@ltri.eu on behalf of Lukas Tribus Sent: Wednesday, December 27, 2017 10:51:01 PM To: Lucas Rolff Cc: haproxy@formilux.org Subject: Re: HTTP/2 Termination vs. Firefox Quantum Hello Lucas, On Wed, Dec 27, 2017 at 9:24 PM, Lucas Rolff wrote: > Can't even compose an email correctly.. > > So: > > I experience the same issue however with nginx as a backend. > > I tried enabling “option httplog” within my frontend, it's rather easy for > me to replicate, it affects a few percent of the traffic. So you have this html endpoint and you hit F5 in FF Quantum until you can see the issue, or how is it that you actually reproduce? Does this occur in an idle test environment as well, or do you need production traffic to hit this issue? > I have a site, with a total of 3 requests being performed: > > - The HTML itself > - 1 app.css file > - 1 app.js file Please clarify: - if any of those responses are cached (or if they are uncachable) and if they use any kind of revalidation (If-modified-since --> 304) - if any of those files are compressed, by haproxy or nginx, and which compression is used - the exact uncompressed content-length of each of those responses - the exact client OS - is Quantum a 32 or 64 bit executable on the client? 
- is haproxy a 32 or 64 bit executable? - can you run this repro in debug mode and show the output when the issue occurs? - RTT and bandwidth between server and client (race conditions may depend on specific network performance - not every issue is reproducible on localhost) - confirm that FF sandboxing is not affecting this issue by lowering security.sandbox.content.level to 2 or 0 in about:config (then restart FF) - don't forget to turn it back on Thanks, Lukas
Re: HTTP/2 Termination vs. Firefox Quantum
Hello Lucas, On Wed, Dec 27, 2017 at 9:24 PM, Lucas Rolff wrote: > Can't even compose an email correctly.. > > So: > > I experience the same issue however with nginx as a backend. > > I tried enabling “option httplog” within my frontend, it's rather easy for > me to replicate, it affects a few percent of the traffic. So you have this html endpoint and you hit F5 in FF Quantum until you can see the issue, or how is it that you actually reproduce? Does this occur in an idle test environment as well, or do you need production traffic to hit this issue? > I have a site, with a total of 3 requests being performed: > > - The HTML itself > - 1 app.css file > - 1 app.js file Please clarify: - if any of those responses are cached (or if they are uncachable) and if they use any kind of revalidation (If-modified-since --> 304) - if any of those files are compressed, by haproxy or nginx, and which compression is used - the exact uncompressed content-length of each of those responses - the exact client OS - is Quantum a 32 or 64 bit executable on the client? - is haproxy a 32 or 64 bit executable? - can you run this repro in debug mode and show the output when the issue occurs? - RTT and bandwidth between server and client (race conditions may depend on specific network performance - not every issue is reproducible on localhost) - confirm that FF sandboxing is not affecting this issue by lowering security.sandbox.content.level to 2 or 0 in about:config (then restart FF) - don't forget to turn it back on Thanks, Lukas
Re: HTTP/2 Termination vs. Firefox Quantum
Can't even compose an email correctly.. So: I experience the same issue however with nginx as a backend. I tried enabling “option httplog” within my frontend, it's rather easy for me to replicate, it affects a few percent of the traffic. I have a site, with a total of 3 requests being performed: - The HTML itself - 1 app.css file - 1 app.js file When checking the console on the browser itself (Firefox in this example), Protocol is empty, remote address is "unknown" and Size is 0 bytes for the app.css file For the app.js file, the protocol is +h2 (where others are seen as HTTP/2.0+h2), the size of the response is 1.93 KB (same as original file) and the content is actually in the response as expected - however, no status code gets returned as well ( https://snaps.hcdn.dk/dWf981KBC0pGZfovP7se.png ) In the haproxy log I see the following: Dec 27 20:12:18 localhost haproxy[27270]: 80.61.160.xxx:50011 [27/Dec/2017:20:12:01.692] https_frontend~ https_frontend/ -1/-1/-1/-1/16796 400 0 - - CR-- 1/1/0/0/0 0/0 "" Dec 27 20:12:18 localhost haproxy[27270]: 80.61.160.xxx:50047 [27/Dec/2017:20:12:18.531] https_frontend~ cdn-backend/mycdn 0/0/0/1/1 200 2242 - - 1/1/0/1/0 0/0 "GET /js/app.js?v=1 HTTP/1.1" So app.css gets the 400 with a CR-- and app.js then again works as it should, however when it arrives at the browser, it doesn't seem to contain all the required information (I'd assume some headers are missing, since the status code and remote address are missing). On the backend server itself, it receives the request for app.js, but not app.css. I'm not really sure how to make strace generate sane output for haproxy, all I see when stracing processes is something like: \27\3\3@\30\330\330n\341\276\205\240o So I won't be able to dig deep into whatever goes on there. Lucas Rolff wrote: I tried enabling “option httplog” within my frontend, I do have the same issue wit
Re: HTTP/2 Termination vs. Firefox Quantum
I tried enabling “option httplog” within my frontend, I do have the same issue wit
Re: HTTP/2 Termination vs. Firefox Quantum
❦ 21 December 2017 09:00 GMT, Maximilian Böhm: > We are using HA-Proxy version 1.8.1-1~bpo8+1 2017/12/04 on Debian 8. On the > backend, jetty 9.3.11.v20160721 with http/1.1 answers requests. > > Since I've enabled http/2 ("alpn h2,http/1.1"), we are facing issues with > Firefox Quantum, both on Windows 10 and macOS. I do not have any complaints > regarding other browsers (yet?). Requested HTML pages are delivered empty or > even cut in the middle. There is no recurring pattern, it's like a lottery, > still, very seldom.. The simple but unsatisfying workaround is to > restart the browser. > > I know the provided information is quite sparse, so my question is > actually, if there is any guideline I can follow to provide you > more information? I've appended some snippets of the proxy > configuration. Also provide "haproxy -vv". I believe in your case, this is: HA-Proxy version 1.8.1-1~bpo8+1 2017/12/04 Copyright 2000-2017 Willy Tarreau Build options : TARGET = linux2628 CPU = generic CC = gcc CFLAGS = -g -O2 -fPIE -fstack-protector-strong -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 OPTIONS = USE_GETADDRINFO=1 USE_ZLIB=1 USE_REGPARM=1 USE_OPENSSL=1 USE_LUA=1 USE_SYSTEMD=1 USE_PCRE=1 USE_PCRE_JIT=1 USE_NS=1 Default settings : maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200 Built with OpenSSL version : OpenSSL 1.0.2l 25 May 2017 Running on OpenSSL version : OpenSSL 1.0.2l 25 May 2017 OpenSSL library supports TLS extensions : yes OpenSSL library supports SNI : yes OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 Built with Lua version : Lua 5.3.3 Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND Encrypted password support via crypt(3): yes Built with multi-threading support. 
Built with PCRE version : 8.35 2014-04-04 Running on PCRE version : 8.35 2014-04-04 PCRE library supports JIT : yes Built with zlib version : 1.2.8 Running on zlib version : 1.2.8 Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip") Built with network namespace support. Available polling systems : epoll : pref=300, test result OK poll : pref=200, test result OK select : pref=150, test result OK Total: 3 (3 usable), will use epoll. Available filters : [SPOE] spoe [COMP] compression [TRACE] trace -- Each module should do one thing well. - The Elements of Programming Style (Kernighan & Plauger)