bryancall opened a new issue, #12941:
URL: https://github.com/apache/trafficserver/issues/12941

   ## Description
   
   The `CACHE_OPEN_READ_END` milestone is set at the end of 
`state_cache_open_read()` **after** `call_transact_and_set_next_state()` 
returns. However, that call runs the entire transact chain synchronously — 
including `do_http_server_open()` (which sets `SERVER_FIRST_CONNECT`) and 
`do_api_callout_internal()` for `API_SEND_RESPONSE_HDR` (which sets 
`UA_BEGIN_WRITE`).
   
   This means `CACHE_OPEN_READ_END` is stamped **after** milestones that 
logically depend on it, causing negative values in derived timing fields.
   
   ## Impact
   
   Three `msdms` log fields are affected:
   
   | Field | Definition | Symptom | Rate |
   |-------|-----------|---------|------|
   | `dns` | `SERVER_FIRST_CONNECT - CACHE_OPEN_READ_END` | Negative values 
(-10 to -192ms) | 0.007% of requests |
   | `hit_proc` | `UA_BEGIN_WRITE - CACHE_OPEN_READ_END` | Negative values 
(e.g. -2ms on TCP_MEM_HIT) | 0.0003% of requests |
   | `c_proc` / `cache` | Depend on `CACHE_OPEN_READ_BEGIN` / 
`CACHE_OPEN_READ_END` | Report `-1` when milestones are unset | ~17% of cache 
misses |
   
   The negative `dns` cases all show high `c_proc` (39-193ms from slow plugin 
processing) with reused connections where `SERVER_FIRST_CONNECT` is set 
instantly inside the synchronous callout chain.
   
   ## Root Cause
   
   In `HttpSM::state_cache_open_read()` (`src/proxy/http/HttpSM.cc`):
   
   ```cpp
   // line ~2661: runs entire transact chain synchronously
   call_transact_and_set_next_state(HttpTransact::HandleCacheOpenRead);
   
   // line ~2697: milestone set AFTER the above returns
   milestones[TS_MILESTONE_CACHE_OPEN_READ_END] = ink_get_hrtime();
   ```
   
   For cache hits, the synchronous path reaches `SERVE_FROM_CACHE` → 
`API_SEND_RESPONSE_HDR` → `UA_BEGIN_WRITE` is stamped. For cache misses with 
reused connections, the path reaches `do_http_server_open()` → 
`SERVER_FIRST_CONNECT` is stamped. Both happen before `CACHE_OPEN_READ_END`.
   
   ## Fix
   
   Move `CACHE_OPEN_READ_END` before `call_transact_and_set_next_state()` so 
the cache lookup completion is recorded before any transact/API work begins.
   
   ## Data
   
   Observed on 400K production squid.log lines from SWB incubator (e17/e18, ATS 
10.0.0.144).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to