GUI opened a new issue, #10393:
URL: https://github.com/apache/trafficserver/issues/10393

   After upgrading from 9.1.4 to 9.2.x, I've observed requests failing in 
unexpected ways that they didn't in 9.1.4. I'm not certain what's happening, 
so it's difficult to give a summary, but it seems that Traffic Server 9.2.x 
closes connections to the clients in front of it too early in certain cases. 
My only theory is that it's related to larger request bodies (possibly 
specific to PUT requests), and perhaps only when the request body is still 
being streamed after the origin server has generated an (expected) error. But 
again, I'm not really sure.
   
   Here's a more detailed example of how this is pretty reproducible in all 
versions of 9.2.0-9.2.2, and also demonstrates how this didn't happen in 9.1.4. 
The basic reproducible case I've narrowed this down to looks like this:
   
   ```
   [nginx proxy] => [trafficserver] => [nginx server]
   ```
   
   1. The `nginx proxy` layer does *not* have a maximum request body size.
   2. The underlying `nginx server` component *is* set up with a maximum request 
body size. If a client sends a request body that exceeds this size, then nginx 
returns a `413 Request Entity Too Large` error.
   
   The basic issue I'm seeing is that if a client exceeds this request body 
size at the `nginx server` origin layer then Traffic Server 9.2+ seems to 
behave in unexpected ways:
   
   1. **In Traffic Server 9.1.4:** The `nginx proxy` layer (and the client making 
the request) reliably receives the `413 Request Entity Too Large` error that 
the origin `nginx server` layer generates and that is proxied through Traffic 
Server.
   2. **In Traffic Server 9.2.2:** The `nginx proxy` layer receives the expected 
413 error (from the `nginx server` origin) roughly 50% of the time, but the 
other 50% of the time the `nginx proxy` ends up reporting a `502 Bad Gateway` 
error, which nginx generates due to an apparent communication error with 
`trafficserver`. This seems to indicate that Traffic Server is closing the 
connection from `nginx proxy` to `trafficserver` before the `413` error can be 
proxied back successfully.
   
   Here is a repo that contains a minimal reproduction of this along with more 
detailed steps: https://github.com/GUI/trafficserver-debugging. This issue 
appears to be present using all default Traffic Server configuration, so 
there's no custom Traffic Server configuration other than proxying to the 
underlying server. See the repo's README for exact steps to reproduce and more 
examples of the expected output in Traffic Server 9.1.4 versus the new, more 
erratic behavior in Traffic Server 9.2.2.
   
   The short version is that Traffic Server 9.1.4 will always return the 
expected `413 Request Entity Too Large` that is proxied from the underlying 
origin server, but when Traffic Server 9.2.x is in the middle, nginx's 
connections to Traffic Server randomly fail and the `nginx proxy` layer 
generates `502 Bad Gateway` errors.
   
   A few notes I've observed:
   
   - It happens more readily if the request body size is bigger (e.g., more than 
a couple of MB).
   - Strangely, I can reproduce it reliably for PUT requests with a body, but 
not POST requests.
   - In tcpdumps, I've observed TCP RSTs under Traffic Server 9.2.x during these 
situations, whereas there don't appear to be any RSTs in 9.1.x.
   - I've been able to reproduce this in both 9.2.0 and 9.2.2, so it seems like 
it's related to some change between 9.1.4 and 9.2.0.
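
   For illustration, the comparison described above could be sketched as a 
small Python script that repeats a request with a large body and counts the 
outcomes. This is only a rough sketch and not the script from the linked repo: 
the helper names `send_request` and `tally` are made up for this example, and 
the host, port, path, and body size in the usage note are assumptions that 
would need to match the actual `nginx proxy` listener.

```python
import http.client
from collections import Counter


def send_request(method, host, port, path, body_size):
    """Send one request with a body of `body_size` bytes.

    Returns the HTTP status code, or the exception class name if the
    connection itself fails. (Hypothetical helper for illustration.)
    """
    conn = http.client.HTTPConnection(host, port, timeout=30)
    try:
        conn.request(method, path, body=b"x" * body_size)
        return conn.getresponse().status
    except (OSError, http.client.HTTPException) as exc:
        # Covers resets, refused connections, timeouts, and bad responses.
        return type(exc).__name__
    finally:
        conn.close()


def tally(send, attempts):
    """Call `send()` `attempts` times and count each distinct outcome."""
    return Counter(send() for _ in range(attempts))
```

   A run such as `tally(lambda: send_request("PUT", "localhost", 8080, "/", 5 * 
1024 * 1024), 20)` (with an assumed host, port, and path) should, based on the 
behavior described above, count only `413` with Traffic Server 9.1.4 and a mix 
of `413` and `502` with 9.2.x. Swapping `"PUT"` for `"POST"` allows comparing 
the two methods.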
   
   Thanks!

