bryancall opened a new issue, #12958:
URL: https://github.com/apache/trafficserver/issues/12958

   ## Summary
   
   `HttpSM::tunnel_handler` can receive `VC_EVENT_ACTIVE_TIMEOUT`, 
`VC_EVENT_ERROR`, or `VC_EVENT_EOS`, but its assertion permits only 
`HTTP_TUNNEL_EVENT_DONE` and `VC_EVENT_INACTIVITY_TIMEOUT`. Any of the three 
unhandled events therefore triggers a fatal assertion failure and aborts the 
process.
   
   ## Details
   
   After response header parsing completes in 
`state_read_server_response_header`, the server entry's VC handlers are set to 
`tunnel_handler`:
   
   ```cpp
   server_entry->vc_read_handler  = &HttpSM::tunnel_handler;
   server_entry->vc_write_handler = &HttpSM::tunnel_handler;
   ```
   
   The source comment above these assignments says "Any other events to the 
end", but `tunnel_handler` only handles:
   - `VC_EVENT_WRITE_READY` / `VC_EVENT_WRITE_COMPLETE` (only when 
`server_entry->eos` is true)
   - `HTTP_TUNNEL_EVENT_DONE`
   - `VC_EVENT_INACTIVITY_TIMEOUT`
   
   It does **not** handle `VC_EVENT_ACTIVE_TIMEOUT`, `VC_EVENT_ERROR`, or 
`VC_EVENT_EOS`. If any of these arrive on the server connection during this 
window, the assertion fires:
   
   ```
   Fatal: HttpSM.cc:3083: failed assertion `event == HTTP_TUNNEL_EVENT_DONE || 
event == VC_EVENT_INACTIVITY_TIMEOUT`
   ```
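
The dispatch described above can be modelled in a few lines. This is a simplified sketch, not the code in `HttpSM.cc`: the enum values are stand-ins rather than the real ATS event codes, and `ModelEvent`, `Outcome`, and `classify` are hypothetical names introduced only for illustration. It also assumes that a write event arriving while `server_entry->eos` is false falls through to the assertion.

```cpp
// Stand-in event codes for illustration; NOT the real ATS definitions.
enum ModelEvent {
  VC_EVENT_WRITE_READY,
  VC_EVENT_WRITE_COMPLETE,
  VC_EVENT_INACTIVITY_TIMEOUT,
  VC_EVENT_ACTIVE_TIMEOUT,
  VC_EVENT_ERROR,
  VC_EVENT_EOS,
  HTTP_TUNNEL_EVENT_DONE
};

// Hypothetical outcome categories for the model.
enum class Outcome { HandledWrite, TerminateSM, AssertionFailure };

// Hypothetical helper: what the handler, as described in this issue,
// would do with a given event.
constexpr Outcome classify(ModelEvent event, bool server_eos) {
  // Write events are only handled once EOS was seen on the server entry.
  if ((event == VC_EVENT_WRITE_READY || event == VC_EVENT_WRITE_COMPLETE) && server_eos) {
    return Outcome::HandledWrite;
  }
  // The two events permitted by the current assertion.
  if (event == HTTP_TUNNEL_EVENT_DONE || event == VC_EVENT_INACTIVITY_TIMEOUT) {
    return Outcome::TerminateSM;
  }
  // Everything else trips the assertion and aborts the process.
  return Outcome::AssertionFailure;
}

// The three events named in this issue all land in the failure branch.
static_assert(classify(VC_EVENT_ACTIVE_TIMEOUT, false) == Outcome::AssertionFailure, "");
static_assert(classify(VC_EVENT_ERROR, false) == Outcome::AssertionFailure, "");
static_assert(classify(VC_EVENT_EOS, false) == Outcome::AssertionFailure, "");
```

The `static_assert` lines are checked at compile time, so the model documents the failing cases without needing to run anything.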
   
   Every other VC handler in `HttpSM.cc` (e.g., `tunnel_handler_server`, 
`state_watch_for_client_abort`) properly handles these events. This is the only 
handler that is missing them.
   
   ## Crash observed on controller.trafficserver.org
   
   This was observed on `controller.trafficserver.org` running ATS 10.2.0 (ASAN 
build). The crash came through an HTTP/2 path:
   
   ```
   HttpSM::tunnel_handler
   HttpSM::main_handler
   Continuation::handleEvent
   Http2Stream::main_event_handler
   ```
   
   The ASAN build then failed to restart cleanly due to AddressSanitizer memory 
mapping errors, leaving `docs.trafficserver.apache.org` and 
`ci.trafficserver.apache.org` unresponsive.
   
   ## Impact
   
   - Fatal crash terminates the entire `traffic_server` process
   - More likely to occur under HTTP/2 with multiplexed streams where 
timeout/error events can race with tunnel setup
   - Confirmed on 10.2.x, still present on master
   
   ## Fix
   
   The fix is to widen the assertion to accept the additional events. Since 
`tunnel_handler` already sets `terminate_sm = true` for all handled events, 
these additional events should simply terminate the state machine as well.
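
A minimal sketch of what the widened check could look like, written as a standalone model rather than a patch against `HttpSM.cc`. The enum values are stand-ins for the real ATS event codes, and `ModelEvent`, `TunnelModel`, `accepted_after_fix`, and `handle` are hypothetical names; the actual change may be shaped differently.

```cpp
#include <cassert>

// Stand-in event codes for illustration; NOT the real ATS definitions.
enum ModelEvent {
  VC_EVENT_WRITE_READY,
  VC_EVENT_INACTIVITY_TIMEOUT,
  VC_EVENT_ACTIVE_TIMEOUT,
  VC_EVENT_ERROR,
  VC_EVENT_EOS,
  HTTP_TUNNEL_EVENT_DONE
};

// Hypothetical stand-in for the state machine; only the flag matters here.
struct TunnelModel {
  bool terminate_sm = false;
};

// Widened condition: the two originally accepted events plus the three
// that this issue reports as unhandled.
constexpr bool accepted_after_fix(ModelEvent event) {
  return event == HTTP_TUNNEL_EVENT_DONE || event == VC_EVENT_INACTIVITY_TIMEOUT ||
         event == VC_EVENT_ACTIVE_TIMEOUT || event == VC_EVENT_ERROR ||
         event == VC_EVENT_EOS;
}

// Model of the handler tail: assert on the widened set, then terminate.
void handle(TunnelModel &sm, ModelEvent event) {
  assert(accepted_after_fix(event));
  sm.terminate_sm = true;
}
```

In this model, timeout, error, and EOS events take the same path as `HTTP_TUNNEL_EVENT_DONE`: the state machine is flagged for termination instead of the process aborting.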

