Hi,

I am still investigating why Chromium stalls when connecting through
HAProxy. Chromium's connection-level send window gradually decreased
over time until it reached zero, at which point the connection stalled
permanently.

I put a logging point in h2c_send_window_update() and compared it with
Chromium's net log. I found that HAProxy does not send connection-level
WINDOW_UPDATE frames for DATA frames received on streams in the
half-closed (local) or closed state. What happened looks like this:

- HAProxy sends a frame with the END_STREAM flag set to Chromium,
putting the stream in the half-closed (local) state.
- Later, Chromium sends one or more DATA frames on that stream to
HAProxy, decreasing its connection send window by the size of those
DATA frames.
- HAProxy responds with RST_STREAM frames, without accounting for those
DATA frames or sending a WINDOW_UPDATE.

The flow-control credits of those DATA frames are thus leaked forever.
Accumulating leaks then stall the sender.
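
To make the arithmetic concrete, here is a minimal model of the leak.
It is only a sketch: streams_until_stall() is a hypothetical helper, not
HAProxy or Chromium code, and it assumes the default 65535-byte
connection window and that each reset stream leaks one DATA frame of a
fixed size.

```c
#include <assert.h>

/* Hypothetical model of the leak, not HAProxy or Chromium code.
 * window:     the sender's connection-level send window, in bytes
 * late_bytes: size of the DATA frame sent on each stream that the
 *             receiver resets without returning any credit
 * Returns how many such streams it takes before the sender can no
 * longer fit another full DATA frame of that size. */
static int streams_until_stall(long window, long late_bytes)
{
    int streams = 0;

    assert(late_bytes > 0);
    while (window >= late_bytes) {
        window -= late_bytes;  /* sender charges the frame to its window */
        /* receiver sends RST_STREAM but no WINDOW_UPDATE: nothing is
         * ever added back to window */
        streams++;
    }
    return streams;
}
```

With these assumptions, streams_until_stall(65535, 16384) returns 3:
three leaked 16 KB frames leave 16383 bytes, one byte short of a fourth.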

The HTTP/2 RFC (RFC 7540, Section 5.1) has these comments:

> [half-closed (local)] Providing flow-control credit using WINDOW_UPDATE 
> frames is necessary to continue receiving flow-controlled frames.

It is not very clear what needs to be done here, but there is more:

> [closed] Flow-controlled frames (i.e., DATA) received after sending 
> RST_STREAM are counted toward the connection flow-control window. Even though 
> these frames might be ignored, because they are sent before the sender 
> receives the RST_STREAM, the sender will consider the frames to count against 
> the flow-control window.

There is also a MUST in Section 6.9, which seems to say the same thing:

> A receiver that receives a flow-controlled frame MUST always account for its 
> contribution against the connection flow-control window, unless the receiver 
> treats this as a connection error (Section 5.4.1).  This is necessary even if 
> the frame is in error. The sender counts the frame toward the flow-control 
> window, but if the receiver does not, the flow-control window at the sender 
> and receiver can become different.

I think HAProxy should basically send connection-level (stream 0)
WINDOW_UPDATE frames covering all received DATA frames, whatever the
stream state. It need not send one WINDOW_UPDATE per DATA frame,
though; it could acknowledge several at once, like a selective ACK.
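
As a rough illustration of what I mean, the receiver could accumulate
connection-level credit for every DATA frame and flush it in batches.
This is only a sketch under my own assumptions: struct conn_fc,
conn_fc_on_data() and the threshold field are hypothetical names, not
HAProxy's actual structures or API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical receiver-side state; names are illustrative only. */
struct conn_fc {
    uint32_t pending_credit;  /* bytes received, not yet returned to peer */
    uint32_t threshold;       /* flush once this many bytes are pending */
};

/* Account for one received DATA frame at the connection level. This is
 * meant to be called for every DATA frame, whatever the stream state
 * (open, half-closed or closed), including frames that get discarded.
 * frame_len is the full frame payload length, padding included.
 * Returns the increment to advertise in a stream-0 WINDOW_UPDATE, or 0
 * if the credit should keep accumulating for now. */
static uint32_t conn_fc_on_data(struct conn_fc *fc, uint32_t frame_len)
{
    uint32_t inc;

    assert(fc != 0);
    fc->pending_credit += frame_len;
    if (fc->pending_credit < fc->threshold)
        return 0;

    inc = fc->pending_credit;  /* one WINDOW_UPDATE covers several frames */
    fc->pending_credit = 0;
    return inc;
}
```

With a threshold of 32768, two 16384-byte DATA frames would produce a
single WINDOW_UPDATE with an increment of 32768 after the second frame.
A real implementation would also need to flush any remaining credit
eventually, so that small amounts are not held back indefinitely.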

Right now the code appears to count only those DATA frames that are
transferred to the h1 side. My testing showed the same.

-klzgrad
