yknoya opened a new pull request, #12910: URL: https://github.com/apache/trafficserver/pull/12910
# Issue

In ATS 9.2.x, we observed an increase in 408/504 (timeout) responses for HTTP/2 POST requests. Although simplified, the issue occurs when all of the following conditions are met:

1. Multiple POST requests are sent over a single HTTP/2 connection.
2. ATS fails to connect to the origin during processing and returns a 502 response.
3. When the second condition occurs, the connection-level window size tracked by the client for ATS becomes 0.

This issue was resolved when the [HTTP/2 to origin](https://github.com/apache/trafficserver/pull/9366) feature was implemented. Therefore, it reproduces on the 9.2.x branch, but not on the following branches:

- master
- 10.1.x

# Root Cause

In conclusion, the issue is caused by a mismatch in the connection-level window size between the client and ATS. The client's tracked window size reaches 0, while ATS internally maintains a value greater than 0.

More specifically, when handling multiple POST requests, the timeout occurs through the following sequence:

1. The client sends a HEADERS frame.
2. ATS receives the HEADERS frame.
3. ATS attempts to connect to the origin but fails.
4. ATS sends a 502 response (HEADERS frame) to the client.
5. ATS closes the stream.
6. ATS receives a DATA frame from the client.
7. Since the stream is already closed, ATS sends a RST_STREAM frame to the client. (**The connection-level window size on the ATS side is not decremented.**)
8. Steps 6 and 7 are repeated multiple times, eventually causing the client's tracked window size for ATS to reach 0.
9. The client stops sending DATA frames because the window size has reached 0.
10. **ATS determines that sufficient window size remains and does not send a WINDOW_UPDATE frame.**
11. The client continues waiting for a WINDOW_UPDATE frame and eventually times out.

Although the stream has already been closed, the root cause is that ATS does not decrement the connection-level window size when it receives a DATA frame for a closed stream.
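
The window mismatch in the sequence above can be illustrated with a small standalone simulation. This is only a sketch and not ATS code: `simulate`, `Result`, and the constants are hypothetical names, the window and frame sizes are the HTTP/2 protocol defaults, and the "replenish once the receiver's view drops to half of the initial window" policy is an assumption made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model; none of these names come from ATS.
constexpr int64_t kInitialWindow = 65535; // HTTP/2 default initial window size
constexpr int64_t kMaxFrame      = 16384; // HTTP/2 default SETTINGS_MAX_FRAME_SIZE

struct Result {
  int64_t client_view;   // bytes the client believes it may still send
  int64_t receiver_view; // bytes the receiver believes the client may still send
  int     frames_sent;   // DATA frames sent before stalling or finishing
};

// Simulates a client sending up to `frames` DATA frames on a stream that the
// receiver has already closed (steps 6 to 11 of the sequence).
Result simulate(bool decrement_before_stream_check, int frames) {
  assert(frames >= 0);
  Result r{kInitialWindow, kInitialWindow, 0};
  for (int i = 0; i < frames; ++i) {
    if (r.client_view == 0) {
      break; // client is blocked, waiting for a WINDOW_UPDATE
    }
    const int64_t len = std::min(kMaxFrame, r.client_view);
    r.client_view -= len; // the sender always charges the connection window
    ++r.frames_sent;

    // Receiver side: the stream is closed, so the frame is answered with
    // RST_STREAM. Whether the connection window is charged depends on the
    // ordering being modelled.
    if (decrement_before_stream_check) {
      r.receiver_view -= len;
    }

    // Assumed policy: send WINDOW_UPDATE once the receiver's own view of
    // the window has dropped to half of the initial size or below.
    if (r.receiver_view <= kInitialWindow / 2) {
      const int64_t increment = kInitialWindow - r.receiver_view;
      r.receiver_view += increment;
      r.client_view   += increment;
    }
  }
  return r;
}
```

Calling `simulate(false, 10)` models the behavior described above: the client stalls after four frames with a window of 0 while the receiver still sees the full window, so no WINDOW_UPDATE is ever sent. Calling `simulate(true, 10)` charges the window before the closed-stream check, so both sides stay in agreement and all ten frames are sent.
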
# Fix

In the 10.1.x branch, the timing of the connection-level window size decrement was moved earlier in the processing flow. The same change has been applied to the 9.2.x branch.

https://github.com/apache/trafficserver/blob/6d36bdbae25b6e70529085cde702f2a30d233d4f/src/proxy/http2/Http2ConnectionState.cc#L104-L111

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
