yknoya opened a new pull request, #12910:
URL: https://github.com/apache/trafficserver/pull/12910

   # Issue
   In ATS 9.2.x, we observed an increase in 408/504 (timeout) responses for 
HTTP/2 POST requests.
   In simplified terms, the issue occurs when all of the following conditions 
are met:
   1. Multiple POST requests are sent over a single HTTP/2 connection.
   2. ATS fails to connect to the origin during processing and returns a 502 
response.
   3. When the second condition occurs, the connection-level window size 
tracked by the client for ATS becomes 0.
   
   This issue was resolved when the [HTTP/2 to 
origin](https://github.com/apache/trafficserver/pull/9366) feature was 
implemented.
   Therefore, it reproduces on the 9.2.x branch, but not on the following 
branches:
   - master
   - 10.1.x
   
   # Root Cause
   In short, the issue is caused by a mismatch in the connection-level window 
size between the client and ATS.
   The window size tracked by the client reaches 0, while ATS internally 
maintains a value greater than 0.
   
   More specifically, when handling multiple POST requests, the timeout occurs 
through the following sequence:
   1. The client sends a HEADERS frame.
   2. ATS receives the HEADERS frame.
   3. ATS attempts to connect to the origin but fails.
   4. ATS sends a 502 response (HEADERS frame) to the client.
   5. ATS closes the stream.
   6. ATS receives a DATA frame from the client.
   7. Since the stream is already closed, ATS sends a RST_STREAM frame to the 
client. (**The connection-level window size on the ATS side is not 
decremented.**)
   8. Steps 6 and 7 are repeated multiple times, eventually causing the 
client's tracked window size for ATS to reach 0.
   9. The client stops sending DATA frames because the window size has reached 
0.
   10. **ATS determines that sufficient window size remains and does not send a 
WINDOW_UPDATE frame.**
   11. The client continues waiting for a WINDOW_UPDATE frame and eventually 
times out.
   
   The root cause is that ATS does not decrement the connection-level window 
size when it receives a DATA frame for a stream that is already closed. The 
client, however, counts every DATA frame it sends against that window, so the 
two views of the window diverge.
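
The sequence above can be replayed with a toy model. This is a minimal sketch, not ATS code: the function name, the 65,535-byte window (the HTTP/2 default), the 16,384-byte frame size, and the rule that the proxy only sends WINDOW_UPDATE once its own view drops below half of the window are all assumptions made for illustration.

```python
INITIAL_WINDOW = 65_535  # HTTP/2 default connection-level window
FRAME_SIZE = 16_384      # assumed DATA frame payload size


def replay_data_on_closed_stream():
    """Replay steps 6-10 for DATA frames that arrive on an already-closed stream."""
    client_view = INITIAL_WINDOW  # window as tracked by the client
    proxy_view = INITIAL_WINDOW   # window as tracked by the proxy (hypothetical)
    frames_sent = 0

    while client_view > 0:
        size = min(FRAME_SIZE, client_view)
        client_view -= size  # the client charges every DATA frame it sends
        frames_sent += 1
        # Steps 6-7: the stream is closed, so the proxy replies with RST_STREAM
        # and, in the buggy flow, leaves proxy_view untouched.

    # Step 10: assumed policy, refill only when the proxy's own view is low.
    proxy_sends_update = proxy_view < INITIAL_WINDOW // 2
    return frames_sent, client_view, proxy_view, proxy_sends_update


print(replay_data_on_closed_stream())  # (4, 0, 65535, False)
```

After four frames the client's view is 0 while the proxy still sees a full window, so no WINDOW_UPDATE is sent and the client stalls until it times out.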
   
   # Fix
   In the 10.1.x branch, the connection-level window size is decremented 
earlier in the processing flow.
   This PR applies the same change to the 9.2.x branch.
   
https://github.com/apache/trafficserver/blob/6d36bdbae25b6e70529085cde702f2a30d233d4f/src/proxy/http2/Http2ConnectionState.cc#L104-L111
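
The effect of moving the decrement earlier can be illustrated with a small sketch. It does not reproduce the linked ATS code: the class, method names, and return strings are hypothetical, and the only point is the order of the accounting relative to the closed-stream check.

```python
class ConnectionWindow:
    """Toy receiver-side connection window; names are hypothetical, not ATS code."""

    def __init__(self, size=65_535):
        self.size = size
        self.open_streams = set()

    def on_data_late_decrement(self, stream_id, payload_len):
        # Pre-fix ordering: the closed-stream check returns before the accounting.
        if stream_id not in self.open_streams:
            return "RST_STREAM"
        self.size -= payload_len
        return "ACCEPTED"

    def on_data_early_decrement(self, stream_id, payload_len):
        # Post-fix ordering: charge the payload first, then check the stream.
        self.size -= payload_len
        if stream_id not in self.open_streams:
            return "RST_STREAM"
        return "ACCEPTED"


before = ConnectionWindow()
after = ConnectionWindow()
# Stream 1 is absent from open_streams, i.e. it was closed after the 502.
print(before.on_data_late_decrement(1, 16_384), before.size)  # RST_STREAM 65535
print(after.on_data_early_decrement(1, 16_384), after.size)   # RST_STREAM 49151
```

With the early decrement, the receiver's view shrinks in step with the client's, so whatever threshold triggers WINDOW_UPDATE is eventually reached.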

