Hi Luke,

On Tue, Jan 29, 2019 at 10:06:03AM +0000, Luke Seelenbinder wrote:
> I just pulled, compiled, and tested the newly minted 1.9.3, and I'm
> experiencing the same issue with alpn h2 on the backend definition.

Ah sh*t :-(

> I also
> strongly suspect it's not related to maximum streams per connection, because
> the issue happens well before 1000 requests (and consistently at that).

OK, that's useful info.

> > Perhaps the client causing the issues was a red herring for
> > the server-side bugs.
> 
> I believe after the fixes in 1.9.3, this has actually been proven false. I
> can replicate this bug every single time with the following:
> 
> 1. Make set of requests
> 2. Cancel all or subset of requests
> 3. Make another set of requests

I failed to reproduce this case, but I now realize that I only tried with
curl/h2load/nghttp, and that all of these break the connection and the
stream at the same time. The tests I ran in a browser involved only a
single stream. I'll try to make a dummy HTML page to force the browser
to emit multiple streams over the same connection and see if I can
reproduce it like this.

By the way, how do you manage to cancel a single stream in the browser?
I guess pressing Esc might break all of them, so I'm not sure how to
achieve this.
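
One thing that could be tried from a test page is to abort individual
requests from script. The sketch below is only a guess at how to do it:
it assumes the browser turns an aborted fetch into a reset of that one
stream rather than closing the whole connection, which is not verified
here. The names startBatch, cancelSubset and FetchLike are made up for
the illustration.

```typescript
// Minimal shape of the fetch function this sketch relies on (hypothetical
// type, introduced so a fake can be injected when testing).
type FetchLike = (
  url: string,
  init: { signal: AbortSignal }
) => Promise<{ arrayBuffer(): Promise<ArrayBuffer> }>;

interface Pending {
  url: string;
  controller: AbortController; // aborts this request only
  done: Promise<string>;       // resolves to "ok", "aborted" or "failed"
}

// Step 1 and step 3: start one request per URL and keep a handle on each.
function startBatch(urls: string[], fetchFn: FetchLike): Pending[] {
  return urls.map((url) => {
    const controller = new AbortController();
    const done = fetchFn(url, { signal: controller.signal })
      .then((res) => res.arrayBuffer())
      .then(() => "ok")
      .catch(() => (controller.signal.aborted ? "aborted" : "failed"));
    return { url, controller, done };
  });
}

// Step 2: abort every Nth request of the batch and leave the others running.
// Returns how many requests were aborted.
function cancelSubset(batch: Pending[], every: number): number {
  let cancelled = 0;
  batch.forEach((p, i) => {
    if (i % every === 0) {
      p.controller.abort();
      cancelled++;
    }
  });
  return cancelled;
}
```

In a page this would be called as startBatch(urls, (u, i) => fetch(u, i)),
then cancelSubset(batch, 2), then startBatch again for the second set of
requests, with the results of each "done" promise compared against the
haproxy logs.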

> On step 3, every single request fails because something is getting messed up
> by 2, causing the server stream to go away. The log lines are the same
> pattern of C(C|D)-- / SD--.
> 
> Another piece of information, when this happens, Chrome drops this in the
> console, which always correlates to a SD-- line in the haproxy logs:
> 
> Failed to load resource: net::ERR_SPDY_PROTOCOL_ERROR

OK.

> I also just verified this happens under similar circumstances using alpn
> http/1.1 on the backend (this may or may not be new in 1.9.3). 4 requests
> failed on the client side with the following error messages after using the
> same 3 step process (all correlate to a CD-- message in the logs):
> 
> net::ERR_SPDY_PROTOCOL_ERROR
> net::ERR_CONNECTION_CLOSED 200
> net::ERR_CONNECTION_CLOSED 200
> net::ERR_CONNECTION_CLOSED 200

That makes sense, it would then be entirely related to the front connection.
Maybe the RST_STREAM is incorrectly handled and has a side effect on the
connection. The PROTOCOL_ERROR error definitely should not be reported.
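
For reference, here is a small sketch of what a stream cancellation
looks like on the wire according to RFC 7540 (sections 4.1, 6.4 and 7):
a 9-byte frame header followed by a 4-byte error code. The function name
buildRstStream is made up for the illustration and this is not haproxy
code. The point is that such a frame carries a non-zero stream ID and so
is only supposed to terminate that stream, while GOAWAY (type 0x7) is
what applies to the connection.

```typescript
// Frame type and error code values taken from RFC 7540.
const FRAME_RST_STREAM = 0x3; // section 6.4
const FRAME_GOAWAY = 0x7;     // section 6.8, connection-level
const ERR_CANCEL = 0x8;       // section 7, "stream no longer needed"

// Build a RST_STREAM frame for one stream (hypothetical helper).
function buildRstStream(streamId: number, errorCode: number): Uint8Array {
  // Stream 0 designates the connection itself; RST_STREAM on it is a
  // connection error of type PROTOCOL_ERROR per section 6.4.
  if (!Number.isInteger(streamId) || streamId <= 0 || streamId > 0x7fffffff) {
    throw new RangeError("stream ID must be between 1 and 2^31-1");
  }
  const frame = new Uint8Array(13);
  const view = new DataView(frame.buffer);
  frame[0] = 0;                     // 24-bit payload length, always 4 here
  frame[1] = 0;
  frame[2] = 4;
  frame[3] = FRAME_RST_STREAM;      // frame type
  frame[4] = 0;                     // RST_STREAM defines no flags
  view.setUint32(5, streamId);      // reserved bit 0 + 31-bit stream ID, big-endian
  view.setUint32(9, errorCode >>> 0); // 32-bit error code, big-endian
  return frame;
}
```

So a client cancelling stream 5 would send buildRstStream(5, ERR_CANCEL),
and the other streams on the same connection should be unaffected by it.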

> I wonder if HAProxy is interpreting a broken request as a client error and
> going away (but not sending GOAWAY)? I don't know enough about h2 to know if
> this is in the spec or not, but perhaps that's another avenue of
> investigation?

It's very possible, I'll try again with these elements in mind.

> I'm more than happy to help, and while my C is a bit rusty, I'm starting to
> get a feel for the HAProxy source, so I could attempt to debug as well, if
> you have any suggestions in that vein.

It's a bit difficult for me to suggest anything, unfortunately. With a
multiplexed protocol, you have a myriad of possible combinations and
the only thing you can do is try to imagine how this or that event could
have an impact when mixed with this or that other one, either on the
same side or on the other side of the mux. My rule here is to always
keep RFC 7540 open on my desk to compare the behaviour against it, and
sometimes to figure out how far small issues can spread :-/

Thanks!
Willy
