Hi,

I am still investigating why Chromium stalls when connecting through HAProxy. Chromium's connection send window gradually decreased over time until it reached zero, at which point the connection stalled forever.

I put a logging point in h2c_send_window_update() and compared it with Chromium's net log. I found that HAProxy does not send a connection-level WINDOW_UPDATE for DATA frames received in the half-closed (local) or closed state. What happened looks like this:

- HAProxy sends a frame with END_STREAM to Chromium, entering the half-closed (local) state.
- Later, Chromium sends one or more DATA frames to HAProxy, decreasing its connection send window by the size of the DATA frames.
- HAProxy sends RSTs, without accounting for those DATA frames or sending WINDOW_UPDATE.

The flow-control credits of those DATA frames are thus leaked forever. Accumulating leaks then stall the sender.

The RFC has these comments:

> [half-closed (local)] Providing flow-control credit using WINDOW_UPDATE
> frames is necessary to continue receiving flow-controlled frames.

Not very clear on what needs to be done here, but there's more:

> [closed] Flow-controlled frames (i.e., DATA) received after sending
> RST_STREAM are counted toward the connection flow-control window. Even though
> these frames might be ignored, because they are sent before the sender
> receives the RST_STREAM, the sender will consider the frames to count against
> the flow-control window.

There is also a MUST, which seems to say the same thing:

> A receiver that receives a flow-controlled frame MUST always account for its
> contribution against the connection flow-control window, unless the receiver
> treats this as a connection error (Section 5.4.1). This is necessary even if
> the frame is in error. The sender counts the frame toward the flow-control
> window, but if the receiver does not, the flow-control window at the sender
> and receiver can become different.

I think HAProxy should basically send connection-level (stream 0) WINDOW_UPDATE frames for all DATA frames, whatever the stream state. It could do this like selective ACK, with one WINDOW_UPDATE covering several DATA frames, instead of sending one WINDOW_UPDATE per DATA frame.
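
The accounting proposed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not HAProxy's actual code: struct conn, on_data_frame(), send_conn_window_update() and the threshold field are all hypothetical names, and the "send" is a stand-in that merely records the credit. The point is that the connection-level counter is updated before the stream state is even looked at, and that one WINDOW_UPDATE can cover several DATA frames.

```c
#include <stdbool.h>
#include <stdint.h>

/* All names below are hypothetical; none of this is HAProxy's real API. */

enum stream_state { ST_OPEN, ST_HALF_CLOSED_LOCAL, ST_CLOSED };

struct conn {
    uint32_t unacked;   /* DATA bytes received but not yet credited back */
    uint32_t threshold; /* batch size before a WINDOW_UPDATE is emitted */
    uint32_t credited;  /* total credit returned so far (test stand-in) */
};

/* Stand-in for emitting a WINDOW_UPDATE frame on stream 0. */
static void send_conn_window_update(struct conn *c, uint32_t increment)
{
    c->credited += increment;
}

/*
 * Called for every received DATA frame. Returns true if the payload
 * should be forwarded. In this sketch only an open stream forwards its
 * payload, mirroring the scenario above where the stream is being reset.
 */
static bool on_data_frame(struct conn *c, enum stream_state st, uint32_t len)
{
    /* Connection-level accounting happens first, for every stream state. */
    c->unacked += len;

    /* Batch the credit: one WINDOW_UPDATE may cover several DATA frames. */
    if (c->unacked >= c->threshold) {
        send_conn_window_update(c, c->unacked);
        c->unacked = 0;
    }

    /* Only now decide what to do with the payload itself. */
    return st == ST_OPEN;
}
```

With a threshold of 32768, two 16384-byte DATA frames produce a single 32768-byte credit, even when the second one arrives on a closed stream and its payload is dropped.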

Right now the code appears to count only those DATA frames that are transferred to the h1 side. Testing showed the same.

-klzgrad

