yknoya opened a new pull request, #12729: URL: https://github.com/apache/trafficserver/pull/12729
# Problem

https://github.com/apache/trafficserver/pull/9181 introduced an issue where an origin server was marked as down even though a connection had been successfully established. This issue occurs under the following conditions:

1. `proxy.config.http.server_session_sharing.match` is set to a value other than `none` (i.e., server session reuse is enabled).
2. A server session is reused when connecting to the origin.
3. The connection is closed after sending a request to the origin.
4. Condition 3 occurs repeatedly until it reaches the threshold defined by `proxy.config.http.connect_attempts_rr_retries`.

The issue has been confirmed in the following branches/versions (other versions not tested):

- master (90dbc21)
- 10.1.0
- 9.2.11

# Cause

When ATS begins processing an origin connection, it executes `t_state.set_connect_fail(EIO)` to tentatively set `connect_result` to `EIO`:

https://github.com/apache/trafficserver/blob/90dbc21a541986db6b223649cb4f798e190a550f/src/proxy/http/HttpSM.cc#L8054
https://github.com/apache/trafficserver/blob/90dbc21a541986db6b223649cb4f798e190a550f/include/proxy/http/HttpTransact.h#L932

If server session reuse is not possible, `connect_result` is cleared once the connection is established:

https://github.com/apache/trafficserver/blob/90dbc21a541986db6b223649cb4f798e190a550f/src/proxy/http/HttpSM.cc#L1860

However, when a server session is reused, `connect_result` is not cleared and remains set to `EIO`. This regression was triggered by the change introduced in https://github.com/apache/trafficserver/pull/9181. Before that PR was merged, `t_state.set_connect_fail(EIO)` was not executed when a server session was reused; after it, the call is executed regardless of whether a server session is reused.

With `connect_result` incorrectly left as `EIO`, if the connection is closed after sending a request to the origin, the following call chain leads to execution of `HttpSM::mark_host_failure`, causing `fail_count` to be incremented:

1. https://github.com/apache/trafficserver/blob/90dbc21a541986db6b223649cb4f798e190a550f/src/proxy/http/HttpTransact.cc#L3466
2. https://github.com/apache/trafficserver/blob/90dbc21a541986db6b223649cb4f798e190a550f/src/proxy/http/HttpTransact.cc#L3786
3. https://github.com/apache/trafficserver/blob/90dbc21a541986db6b223649cb4f798e190a550f/src/proxy/http/HttpTransact.cc#L3884
4. https://github.com/apache/trafficserver/blob/90dbc21a541986db6b223649cb4f798e190a550f/src/proxy/http/HttpSM.cc#L4630
5. https://github.com/apache/trafficserver/blob/90dbc21a541986db6b223649cb4f798e190a550f/src/proxy/http/HttpSM.cc#L5876

If this happens repeatedly and reaches the threshold defined by `proxy.config.http.connect_attempts_rr_retries`, the origin server is incorrectly marked as down:

https://github.com/apache/trafficserver/blob/90dbc21a541986db6b223649cb4f798e190a550f/src/proxy/http/HttpSM.cc#L5876-L5885

Since the connection to the origin is actually successful, marking it as down is incorrect.
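To make the failure mode concrete, here is a minimal, standalone sketch of the sequence above. All names (`Transaction`, `OriginState`, `begin_origin_connection`, `rr_retries`, etc.) are illustrative placeholders and heavy simplifications, not the actual ATS types, functions, or code paths:

```cpp
// Simplified standalone model of the buggy flow; placeholder names only.
#include <cerrno>
#include <iostream>

struct OriginState {
  int  fail_count  = 0;
  bool marked_down = false;
};

struct Transaction {
  int connect_result = 0; // 0 == no connect error recorded
};

constexpr int rr_retries = 3; // stands in for proxy.config.http.connect_attempts_rr_retries

// Mirrors the tentative failure marking at the start of origin processing.
void begin_origin_connection(Transaction &txn) {
  txn.connect_result = EIO; // t_state.set_connect_fail(EIO)
}

// Connection established. Before the fix, the clear only happened on the
// new-connection path, not when a pooled server session was reused.
void on_connection_ready(Transaction &txn, bool session_reused) {
  if (!session_reused) {
    txn.connect_result = 0;
  }
  // BUG: reused sessions keep connect_result == EIO here.
}

// The origin closed the connection after the request was sent.
void on_origin_closed(Transaction &txn, OriginState &origin) {
  if (txn.connect_result != 0) {
    // Misinterpreted as a connect failure -> mark_host_failure path.
    if (++origin.fail_count >= rr_retries) {
      origin.marked_down = true;
    }
  }
}

int main() {
  OriginState origin;
  for (int i = 0; i < rr_retries; ++i) {
    Transaction txn;
    begin_origin_connection(txn);
    on_connection_ready(txn, /*session_reused=*/true); // session reuse enabled
    on_origin_closed(txn, origin);                     // origin closes after the request
  }
  std::cout << "origin marked down: " << std::boolalpha << origin.marked_down << '\n';
  // Prints "origin marked down: true" even though every connection succeeded.
}
```

Each iteration uses a perfectly good (reused) connection, yet the stale `EIO` causes every close-after-request to be counted as a connect failure until the round-robin retry threshold marks the host down.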
# Fix

Update the logic so that `t_state.set_connect_fail(EIO)` is executed only when establishing a new connection to the origin (i.e., when a server session is not reused), and ensure that `connect_result` is cleared once the connection succeeds.

Additionally, when `multiplexed_origin` is true, `connect_result` was also not being cleared after a successful connection. In this case, although `t_state.set_connect_fail(EIO)` is executed (see below), the lack of a corresponding clear operation results in `connect_result` remaining `EIO`:

https://github.com/apache/trafficserver/blob/90dbc21a541986db6b223649cb4f798e190a550f/src/proxy/http/HttpSM.cc#L5706-L5723

This patch ensures that `connect_result` is cleared whenever the connection succeeds, regardless of whether `multiplexed_origin` is enabled.
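As a rough illustration of the intended behavior, using the same kind of placeholder model as the sketch above (again, not the actual ATS code or the literal patch):

```cpp
// Sketch of the corrected behavior with placeholder names.
#include <cassert>
#include <cerrno>

struct Transaction {
  int connect_result = 0;
};

// Fixed: the tentative EIO is set only when a brand-new origin connection
// is being established, not when a pooled server session is reused.
void begin_origin_connection(Transaction &txn, bool session_reused) {
  if (!session_reused) {
    txn.connect_result = EIO; // t_state.set_connect_fail(EIO)
  }
}

// Fixed: a successful connection always clears connect_result, for new
// connections, reused sessions, and the multiplexed_origin case alike.
void on_connection_ready(Transaction &txn) {
  txn.connect_result = 0;
}

int main() {
  Transaction reused, fresh;
  begin_origin_connection(reused, /*session_reused=*/true);
  begin_origin_connection(fresh, /*session_reused=*/false);
  assert(reused.connect_result == 0);  // no tentative failure for reused sessions
  assert(fresh.connect_result == EIO); // tentative failure only for new connections
  on_connection_ready(fresh);
  assert(fresh.connect_result == 0);   // cleared once the connection succeeds
}
```

With this shape, a later close-after-request no longer finds a stale `EIO` in `connect_result`, so it is not misattributed to a connect failure and `fail_count` is not incremented.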
