bryancall opened a new issue, #12941: URL: https://github.com/apache/trafficserver/issues/12941
## Description The `CACHE_OPEN_READ_END` milestone is set at the end of `state_cache_open_read()` **after** `call_transact_and_set_next_state()` returns. However, that call runs the entire transact chain synchronously — including `do_http_server_open()` (which sets `SERVER_FIRST_CONNECT`) and `do_api_callout_internal()` for `API_SEND_RESPONSE_HDR` (which sets `UA_BEGIN_WRITE`). This means `CACHE_OPEN_READ_END` is stamped **after** milestones that logically depend on it, causing negative values in derived timing fields. ## Impact Three `msdms` log fields are affected: | Field | Definition | Symptom | Rate | |-------|-----------|---------|------| | `dns` | `SERVER_FIRST_CONNECT - CACHE_OPEN_READ_END` | Negative values (-10 to -192ms) | 0.007% of requests | | `hit_proc` | `UA_BEGIN_WRITE - CACHE_OPEN_READ_END` | Negative values (e.g. -2ms on TCP_MEM_HIT) | 0.0003% of requests | | `c_proc` / `cache` | Depend on `CACHE_OPEN_READ_BEGIN` / `CACHE_OPEN_READ_END` | Report `-1` when milestones are unset | ~17% of cache misses | The negative `dns` cases all show high `c_proc` (39-193ms from slow plugin processing) with reused connections where `SERVER_FIRST_CONNECT` is set instantly inside the synchronous callout chain. ## Root Cause In `HttpSM::state_cache_open_read()` (`src/proxy/http/HttpSM.cc`): ```cpp // line ~2661: runs entire transact chain synchronously call_transact_and_set_next_state(HttpTransact::HandleCacheOpenRead); // line ~2697: milestone set AFTER the above returns milestones[TS_MILESTONE_CACHE_OPEN_READ_END] = ink_get_hrtime(); ``` For cache hits, the synchronous path reaches `SERVE_FROM_CACHE` → `API_SEND_RESPONSE_HDR` → `UA_BEGIN_WRITE` is stamped. For cache misses with reused connections, the path reaches `do_http_server_open()` → `SERVER_FIRST_CONNECT` is stamped. Both happen before `CACHE_OPEN_READ_END`. ## Fix Move `CACHE_OPEN_READ_END` before `call_transact_and_set_next_state()` so the cache lookup completion is recorded before any transact/API work begins. ## Data Observed on 400K production squid.log lines from SWB incubator (e17/e18, ATS 10.0.0.144). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
