Hi Luke,

On Wed, Jan 30, 2019 at 08:41:03AM +0000, Luke Seelenbinder wrote:
> It works! I'm seeing very, very few CD-- -> SD-- chains now. I did see a few
> in h2<->h2 mode, but precious few, so I'm very happy to say the bug as
> previous manifested is remedied! Thanks for digging so wide and deep for this
> bug fix--kinda funny how it ended up being half an if condition on one line.
> . . software development is the best, ain't it? :-)

Excellent, thanks for the report.

> However, that leads to a second tier of less-important questions:
> 
> 1. Should RST_STREAM (I believe that's what happens on request cancellation)
> also reset the full h2 connection?

It should never happen, unless of course the frame is invalid protocol-wise
and results in a protocol error, but in practice it should not happen when
coming from a browser. Thus I suspect we have another case of GOAWAY.

> I'm seeing the following behavior:
> 
> - Ensure connection established. Zoom twice. Set X is canceled. Set Y
> succeeds, but the first request from Y must renegotiate the connection & TLS,
> which definitely results in a performance hit on our side.
> 
> 2. Sometimes a canceled request results in a 403 because the request is
> malformed (PR-- log line); this makes sense in the usual case, but in this
> specific case, I'm not so sure.

403 is strange, it's forbidden. 400 I could understand. What might happen
is the following scenario :
  - the client sends the request
  - the H2 demux decodes it, creates an H2 stream, then creates an
    application-layer stream, attaches them and wakes the latter up
  - the client sends RST_STREAM
  - the demux places an error flag on the stream (it was aborted)
  - the stream is scheduled, reads the decoded request to process it,
    and finds a request plus an error
  -> 400 bad request could be logged to indicate how this stream was
     aborted. It's not much different from what could happen in HTTP/1
     if the client had aborted the connection just before it was started
     to be parsed.

> This would also result in a broken connection
> on the client side, I'm thinking.

It should not, but as you probably know, everything lies in the difference
between "should" and "can".

> Would there be a way to handle RST_STREAM
> different without compromising security reasons for rejecting malformed /
> broken requests?

I'm really not thinking of any case resulting in this but I'll take a
second look.

> Side note: this seems to be a behavior difference between Firefox and Chrome.
> In Chrome I always get PR-- lines. In Firefox I almost always get C(C|D)--
> lines.

So Chrome likely sends RST_STREAM immediately while Firefox silently
cancels the stream and responds with RST_STREAM when it gets the response
back.

> Would it be possible to remove the tie between stream failure and connection
> failures on the client side? That is, would it be possible to allow
> individual stream failures, such is the case with client-side request
> cancellation, while maintaining the integrity of the whole h2 client
> connection? I know this is delving pretty deep into the nuances of h2 vs
> http/1.1 and resulting issues and may not even be a server-side solution, but
> I figured it's worth asking while the code is already in your head. :-)

Don't get me wrong, I'm extremely careful about killing streams only
as much as possible. There's even an h2spec failure due to this, where
it expects a connection error, but where doing so would result in too
many streams being caught causing this to happen. So as long as we can
identify the exact sequence which causes this, I'm obviously interested
in studying it!

> Thanks again for this fix! I will roll it out to our servers today after a
> bit more testing, since the incidence of bad results is negligible and this
> definitely should improve our e2e timings.

Great, looking forward to hearing about your new findings!

Thanks,
Willy

Reply via email to