Copilot commented on code in PR #12852:
URL: https://github.com/apache/trafficserver/pull/12852#discussion_r2761138007
##########
src/proxy/http/HttpTransact.cc:
##########
@@ -2942,14 +2951,26 @@ HttpTransact::HandleCacheOpenReadHit(State *s)
HttpCacheSM &cache_sm = s->state_machine->get_cache_sm();
TxnDbg(dbg_ctl_http_trans, "CacheOpenRead --- HIT-FRESH read while write
%d", cache_sm.is_readwhilewrite_inprogress());
- if (cache_sm.is_readwhilewrite_inprogress())
+ if (cache_sm.is_readwhilewrite_inprogress()) {
SET_VIA_STRING(VIA_CACHE_RESULT, VIA_IN_CACHE_RWW_HIT);
+ }
+
+ // If serving stale due to write lock failure (actions 2, 3, or 6), adjust
VIA to reflect stale serving.
+ // This ensures correct statistics attribution (cache_hit_stale_served
instead of cache_hit_fresh).
+ if (s->serving_stale_due_to_write_lock) {
+ TxnDbg(dbg_ctl_http_trans, "Serving stale due to write lock failure,
adjusting VIA for statistics");
+ SET_VIA_STRING(VIA_CACHE_RESULT, VIA_IN_CACHE_STALE);
+ SET_VIA_STRING(VIA_SERVER_RESULT, VIA_SERVER_ERROR);
+ }
if (s->cache_lookup_result == CacheLookupResult_t::HIT_WARNING) {
build_response_from_cache(s, HTTPWarningCode::HERUISTIC_EXPIRATION);
} else if (s->cache_lookup_result == CacheLookupResult_t::HIT_STALE) {
Review Comment:
When serving stale due to write lock failure,
`serving_stale_due_to_write_lock` forces `cache_lookup_result` to `HIT_STALE`
(see `HandleCacheOpenReadHitFreshness`). That means this branch will execute
and hit `ink_assert(server_up == false)` even though `send_revalidate` was
explicitly skipped (so `server_up` remains true), causing an assertion/crash.
Also, it will incorrectly add the REVALIDATION_FAILED warning. Consider gating
the `HIT_STALE` branch with `!serving_stale_due_to_write_lock` and/or checking
`serving_stale_due_to_write_lock` before `HIT_STALE` so the no-warning path is
taken.
```suggestion
} else if (s->cache_lookup_result == CacheLookupResult_t::HIT_STALE &&
!s->serving_stale_due_to_write_lock) {
```
##########
doc/admin-guide/files/records.yaml.en.rst:
##########
@@ -2931,6 +2931,12 @@ Dynamic Content & Content Negotiation
Make sure to configure the :ref:`admin-config-read-while-writer`
feature
correctly. Note that this option may result in CACHE_LOOKUP_COMPLETE
HOOK
being called back more than once.
Review Comment:
The docs for option ``5`` still say CACHE_LOOKUP_COMPLETE may be called more
than once, but this PR’s behavior change explicitly defers
CACHE_LOOKUP_COMPLETE for READ_RETRY actions to ensure plugins see only the
final result. Please update this note (for ``5`` and/or ``6``) to match the new
hook behavior.
```suggestion
correctly. With this option, CACHE_LOOKUP_COMPLETE HOOK is deferred
for
read retries so that plugins see only the final cache lookup result.
```
##########
tests/gold_tests/cache/cache-read-retry-stale.test.py:
##########
@@ -0,0 +1,29 @@
+'''
+Test cache_open_write_fail_action = 6 (READ_RETRY with stale fallback)
+'''
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements. See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership. The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+Test.Summary = '''
+Test cache_open_write_fail_action = 6 (READ_RETRY_STALE_ON_REVALIDATE) to
verify:
+1. READ_RETRY behavior (same as action 5)
+2. Stale fallback when read retries are exhausted
+3. System does not crash under write lock contention with stale objects
+'''
Review Comment:
The test summary claims it verifies stale fallback when read retries are
exhausted, but the replay only primes the cache and verifies normal cache hits
(no write-lock contention / read-retry exhaustion / stale serving path is
exercised). Either adjust the summary to describe this as a
smoke/config-acceptance test, or extend the replay to deterministically trigger
the action-6 stale fallback behavior.
##########
src/proxy/http/HttpSM.cc:
##########
@@ -2663,7 +2666,20 @@ HttpSM::state_cache_open_read(int event, void *data)
ink_assert(t_state.transact_return_point == nullptr);
t_state.transact_return_point = HttpTransact::HandleCacheOpenRead;
- setup_cache_lookup_complete_api();
+
+ // For READ_RETRY actions (5 and 6), skip the CACHE_LOOKUP_COMPLETE hook
now.
+ // The hook will fire later with the final result: HIT if stale content is
found
+ // during retry, or MISS if nothing is found (from
HandleCacheOpenReadMiss).
+ // This ensures plugins see only the final cache lookup result, avoiding
issues
+ // like stats double-counting and duplicate hook registrations.
+ if (t_state.txn_conf->cache_open_write_fail_action ==
static_cast<MgmtByte>(CacheOpenWriteFailAction_t::READ_RETRY) ||
+ t_state.txn_conf->cache_open_write_fail_action ==
+
static_cast<MgmtByte>(CacheOpenWriteFailAction_t::READ_RETRY_STALE_ON_REVALIDATE))
{
+ SMDbg(dbg_ctl_http, "READ_RETRY configured, deferring
CACHE_LOOKUP_COMPLETE hook");
+ call_transact_and_set_next_state(nullptr);
+ } else {
+ setup_cache_lookup_complete_api();
+ }
Review Comment:
`state_cache_open_read` defers `setup_cache_lookup_complete_api()` for *all*
cache open-read failures whenever open_write_fail_action is 5/6. That changes
behavior for ordinary cache misses under action 5/6: if
`HandleCacheOpenReadMiss` sets `cache_info.action = NO_ACTION` (e.g., Range not
cacheable / method not cacheable / plugin no-store), there is no later
`TRANSACT_RETURN(API_CACHE_LOOKUP_COMPLETE, ...)`, so the CACHE_LOOKUP_COMPLETE
hook will never fire for those requests. Consider deferring the hook only when
the transaction is actually in the READ_RETRY flow (e.g.,
`t_state.cache_info.write_lock_state == READ_RETRY`), or otherwise ensure a
single MISS hook is always fired even when no cache write will occur.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]