Copilot commented on code in PR #12852:
URL: https://github.com/apache/trafficserver/pull/12852#discussion_r2761138007


##########
src/proxy/http/HttpTransact.cc:
##########
@@ -2942,14 +2951,26 @@ HttpTransact::HandleCacheOpenReadHit(State *s)
 
   HttpCacheSM &cache_sm = s->state_machine->get_cache_sm();
   TxnDbg(dbg_ctl_http_trans, "CacheOpenRead --- HIT-FRESH read while write 
%d", cache_sm.is_readwhilewrite_inprogress());
-  if (cache_sm.is_readwhilewrite_inprogress())
+  if (cache_sm.is_readwhilewrite_inprogress()) {
     SET_VIA_STRING(VIA_CACHE_RESULT, VIA_IN_CACHE_RWW_HIT);
+  }
+
+  // If serving stale due to write lock failure (actions 2, 3, or 6), adjust 
VIA to reflect stale serving.
+  // This ensures correct statistics attribution (cache_hit_stale_served 
instead of cache_hit_fresh).
+  if (s->serving_stale_due_to_write_lock) {
+    TxnDbg(dbg_ctl_http_trans, "Serving stale due to write lock failure, 
adjusting VIA for statistics");
+    SET_VIA_STRING(VIA_CACHE_RESULT, VIA_IN_CACHE_STALE);
+    SET_VIA_STRING(VIA_SERVER_RESULT, VIA_SERVER_ERROR);
+  }
 
   if (s->cache_lookup_result == CacheLookupResult_t::HIT_WARNING) {
     build_response_from_cache(s, HTTPWarningCode::HERUISTIC_EXPIRATION);
   } else if (s->cache_lookup_result == CacheLookupResult_t::HIT_STALE) {

Review Comment:
   When serving stale due to write lock failure, 
`serving_stale_due_to_write_lock` forces `cache_lookup_result` to `HIT_STALE` 
(see `HandleCacheOpenReadHitFreshness`). That means this branch will execute 
and hit `ink_assert(server_up == false)` even though `send_revalidate` was 
explicitly skipped (so `server_up` remains true), causing an assertion/crash. 
Also, it will incorrectly add the REVALIDATION_FAILED warning. Consider gating 
the `HIT_STALE` branch with `!serving_stale_due_to_write_lock` and/or checking 
`serving_stale_due_to_write_lock` before `HIT_STALE` so the no-warning path is 
taken.
   ```suggestion
     } else if (s->cache_lookup_result == CacheLookupResult_t::HIT_STALE && 
!s->serving_stale_due_to_write_lock) {
   ```



##########
doc/admin-guide/files/records.yaml.en.rst:
##########
@@ -2931,6 +2931,12 @@ Dynamic Content & Content Negotiation
          Make sure to configure the :ref:`admin-config-read-while-writer` 
feature
          correctly. Note that this option may result in CACHE_LOOKUP_COMPLETE 
HOOK
          being called back more than once.

Review Comment:
   The docs for option ``5`` still say CACHE_LOOKUP_COMPLETE may be called more 
than once, but this PR’s behavior change explicitly defers 
CACHE_LOOKUP_COMPLETE for READ_RETRY actions to ensure plugins see only the 
final result. Please update this note (for ``5`` and/or ``6``) to match the new 
hook behavior.
   ```suggestion
            correctly. With this option, CACHE_LOOKUP_COMPLETE HOOK is deferred 
for
            read retries so that plugins see only the final cache lookup result.
   ```



##########
tests/gold_tests/cache/cache-read-retry-stale.test.py:
##########
@@ -0,0 +1,29 @@
+'''
+Test cache_open_write_fail_action = 6 (READ_RETRY with stale fallback)
+'''
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+#  limitations under the License.
+
+Test.Summary = '''
+Test cache_open_write_fail_action = 6 (READ_RETRY_STALE_ON_REVALIDATE) to 
verify:
+1. READ_RETRY behavior (same as action 5)
+2. Stale fallback when read retries are exhausted
+3. System does not crash under write lock contention with stale objects
+'''

Review Comment:
   The test summary claims it verifies stale fallback when read retries are 
exhausted, but the replay only primes the cache and verifies normal cache hits 
(no write-lock contention / read-retry exhaustion / stale serving path is 
exercised). Either adjust the summary to describe this as a 
smoke/config-acceptance test, or extend the replay to deterministically trigger 
the action-6 stale fallback behavior.



##########
src/proxy/http/HttpSM.cc:
##########
@@ -2663,7 +2666,20 @@ HttpSM::state_cache_open_read(int event, void *data)
 
     ink_assert(t_state.transact_return_point == nullptr);
     t_state.transact_return_point = HttpTransact::HandleCacheOpenRead;
-    setup_cache_lookup_complete_api();
+
+    // For READ_RETRY actions (5 and 6), skip the CACHE_LOOKUP_COMPLETE hook 
now.
+    // The hook will fire later with the final result: HIT if stale content is 
found
+    // during retry, or MISS if nothing is found (from 
HandleCacheOpenReadMiss).
+    // This ensures plugins see only the final cache lookup result, avoiding 
issues
+    // like stats double-counting and duplicate hook registrations.
+    if (t_state.txn_conf->cache_open_write_fail_action == 
static_cast<MgmtByte>(CacheOpenWriteFailAction_t::READ_RETRY) ||
+        t_state.txn_conf->cache_open_write_fail_action ==
+          
static_cast<MgmtByte>(CacheOpenWriteFailAction_t::READ_RETRY_STALE_ON_REVALIDATE))
 {
+      SMDbg(dbg_ctl_http, "READ_RETRY configured, deferring 
CACHE_LOOKUP_COMPLETE hook");
+      call_transact_and_set_next_state(nullptr);
+    } else {
+      setup_cache_lookup_complete_api();
+    }

Review Comment:
   `state_cache_open_read` defers `setup_cache_lookup_complete_api()` for *all* 
cache open-read failures whenever open_write_fail_action is 5/6. That changes 
behavior for ordinary cache misses under action 5/6: if 
`HandleCacheOpenReadMiss` sets `cache_info.action = NO_ACTION` (e.g., Range not 
cacheable / method not cacheable / plugin no-store), there is no later 
`TRANSACT_RETURN(API_CACHE_LOOKUP_COMPLETE, ...)`, so the CACHE_LOOKUP_COMPLETE 
hook will never fire for those requests. Consider deferring the hook only when 
the transaction is actually in the READ_RETRY flow (e.g., 
`t_state.cache_info.write_lock_state == READ_RETRY`), or otherwise ensure a 
single MISS hook is always fired even when no cache write will occur.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to