[PR] Make the CLFUS RAM cache adapt to a shifting working set [trafficserver]

via GitHub Wed, 03 Jun 2026 16:35:58 -0700


phongn opened a new pull request, #13235:
URL: https://github.com/apache/trafficserver/pull/13235


   ## Summary
   
   Building on #13233 (which restored the CLFUS value metric), this makes the 
CLFUS RAM cache actually follow a working set that changes over time. 
Previously CLFUS captured an initial set of objects and then effectively froze 
on it: on a working-set change it kept serving the stale set and never admitted 
the new one.
   
   > Stacked on #13233. Please review/merge that first; this branch contains the
   > value-metric fix as its base commit.
   
   ## Root cause
   
   Two independent problems, both of which had to be fixed:
   
   1. **Resident frequency never ages.** A resident object's `hits` only ever 
increased, so an object that was hot days ago kept winning replacement long 
after going cold. Aging existed only for the history/ghost list (`_tick()`).
   2. **New candidates can't be admitted.** `_tick()` freed a history (ghost) 
entry the moment its aged `hits` reached 0, so the ghost list stayed at roughly 
one entry. A re-requested key was forgotten before it could accumulate the value 
needed for admission, so the incumbents were restored on every attempt.
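
   The second problem can be illustrated with a toy model. This is a sketch only: `ToyHistory`, `Ghost`, and the `free_at_zero` switch are hypothetical names, not the types in `RamCacheCLFUS.cc`, and it assumes that aging halves `hits`. With `free_at_zero` set, an entry that ages to 0 is dropped at once, so a key seen once is forgotten after a single tick. With it cleared, the entry is re-queued and entries are freed only to hold the list at its target size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical stand-in for a history (ghost) entry; not the real ATS type.
struct Ghost {
  uint64_t key;
  uint32_t hits;
};

// Toy history list. free_at_zero == true models the behavior described in
// the root cause; false models "age and keep, free only to hold the size".
struct ToyHistory {
  std::deque<Ghost> list;
  bool free_at_zero;
  std::size_t target;

  ToyHistory(bool free_at_zero_, std::size_t target_)
    : free_at_zero(free_at_zero_), target(target_)
  {
    assert(target > 0);
  }

  // Record a key that was just evicted or seen for the first time.
  void remember(uint64_t key) { list.push_back(Ghost{key, 1}); }

  // Age the oldest entry once (assumption: aging halves hits).
  void tick()
  {
    if (list.empty()) {
      return;
    }
    Ghost g = list.front();
    list.pop_front();
    g.hits >>= 1;
    if (free_at_zero && g.hits == 0) {
      return; // forgotten immediately
    }
    list.push_back(g); // kept, even at zero hits
    while (list.size() > target) {
      list.pop_front(); // free only to hold the target size
    }
  }

  bool remembers(uint64_t key) const
  {
    for (const Ghost &g : list) {
      if (g.key == key) {
        return true;
      }
    }
    return false;
  }
};
```

   In this model, calling `remember(42)` followed by one `tick()` leaves key 42 absent when `free_at_zero` is true and present when it is false, which is the difference between a key that can never build up admission value and one that can.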
   
   With the value metric fixed, this is stark: on an abrupt 100% working-set 
change CLFUS scored a 0.125 hit rate on the new set vs LRU's 1.0, while 
retaining 100% of the now-cold set.
   
   ## Fix
   
   Two small, complementary changes in `RamCacheCLFUS.cc`:
   
   1. **Admission — keep the history list.** `_tick()` now ages the oldest 
ghost entry and *keeps* it, freeing only to hold the list at its target size, 
so a recently evicted/seen key is remembered long enough to be re-admitted.
   2. **Aging — decay resident counts.** Once per "turnover" (one `Put` per 
resident object) `_age_resident()` halves every resident `hits` *and* 
`_average_value` (the admission bar must fall in step with the values it gates, 
or the decay is invisible to it).
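
   The aging step can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in this PR: `ToyEntry`, `ToyCache`, `on_put`, and the `puts_since_aging` counter are hypothetical, and the real `_age_resident()` operates on the cache's own entry structures. It shows the two points made above: the decay runs once per turnover, and the admission bar is halved together with the counts.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified resident entry; the real entry has more fields.
struct ToyEntry {
  uint64_t key;
  uint32_t hits;
};

struct ToyCache {
  std::vector<ToyEntry> residents;
  double average_value      = 0.0; // stands in for `_average_value`
  uint64_t puts_since_aging = 0;   // hypothetical turnover counter

  // Halve every resident count and the admission bar together, so the bar
  // falls in step with the values it gates.
  void age_resident()
  {
    assert(average_value >= 0.0);
    for (ToyEntry &e : residents) {
      e.hits >>= 1;
    }
    average_value /= 2.0;
  }

  // Call once per Put. One "turnover" is one Put per resident object.
  // Returns true when an aging pass ran.
  bool on_put()
  {
    if (residents.empty()) {
      return false;
    }
    if (++puts_since_aging < residents.size()) {
      return false;
    }
    puts_since_aging = 0;
    age_resident();
    return true;
  }
};
```

   Under halving, an object that accumulated 1000 hits and then went cold reaches 0 after ten turnovers, so it stops winning replacement within a bounded number of turnovers instead of holding its place indefinitely.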
   
   ## Memory
   
   Ghost entries are ~88 bytes each and are **not** counted against 
`proxy.config.cache.ram_cache.size`. A full cache's worth of history would be a 
large unbudgeted cost for caches of many small objects, so the history is 
bounded to `_objects / HISTORY_DIVISOR`, with `HISTORY_DIVISOR` set to 4. 
Testing showed that a quarter preserves adaptivity, while an eighth begins to 
slip; the seen-filter threshold tracks the same bound. Indicative cost for a 
32 GB cache of 1 KB objects: ~700 MB, vs ~2.8 GB unbounded.
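
   The indicative figures can be checked with a few lines of arithmetic. The helper below is a hypothetical illustration (`ghost_history_bytes` is not a function in the PR); it assumes decimal units (1 GB = 10^9 bytes, 1 KB = 10^3 bytes), a cache completely filled with equal-sized objects, and the approximate 88-byte entry size quoted above.

```cpp
#include <cassert>
#include <cstdint>

// Approximate per-entry ghost size quoted in the Memory section.
const uint64_t GHOST_ENTRY_BYTES = 88;

// Bytes of ghost history for a cache of cache_bytes filled with objects of
// object_bytes each, with the history bounded to objects / divisor.
// A divisor of 1 models a full cache's worth of history.
inline uint64_t
ghost_history_bytes(uint64_t cache_bytes, uint64_t object_bytes, uint64_t divisor)
{
  assert(object_bytes > 0 && divisor > 0);
  const uint64_t objects = cache_bytes / object_bytes;
  return (objects / divisor) * GHOST_ENTRY_BYTES;
}
```

   For a 32 GB cache of 1 KB objects, `ghost_history_bytes(32000000000ULL, 1000, 4)` gives 704,000,000 bytes (~700 MB), and a divisor of 1 gives 2,816,000,000 bytes (~2.8 GB), matching the figures above.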
   
   ## Tests
   
   Adds two regression tests in `CacheTest.cc`, each comparing CLFUS to the LRU 
RAM cache (synthetic workloads; higher is better, except A-retained, where 
lower is better):
   
   | test                   | LRU    | CLFUS before | CLFUS after |
   |------------------------|--------|--------------|-------------|
   | gradual-drift hit rate | 0.969  | 0.391        | 0.902       |
   | abrupt B-hit-rate      | 1.000  | 0.125        | 1.000       |
   | abrupt A-retained      | 15/112 | 112/112      | 14/112      |
   | steady-state 16 MB var | 0.795  | 0.790        | 0.839       |
   
   The existing `ram_cache` test still passes; CLFUS now also beats LRU on 
steady-state Zipfian, its intended strength.
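
   For readers unfamiliar with the two abrupt-shift metrics, here is a minimal sketch of how they could be measured. It is not the harness in `CacheTest.cc`: `ToyLru`, `ShiftResult`, and `abrupt_shift` are hypothetical names, the cache is bounded by entry count rather than bytes, and the key sets are arbitrary. The cache is warmed on set A, switched to set B, and then scored on the B hit rate and on how many A keys are still resident.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>
#include <unordered_map>
#include <vector>

// Minimal entry-count LRU, used only as a baseline for this sketch.
class ToyLru
{
public:
  explicit ToyLru(std::size_t capacity) : capacity_(capacity) { assert(capacity > 0); }

  // Returns true on a hit. On a miss, inserts the key and evicts the least
  // recently used entry if the cache is full.
  bool access(uint64_t key)
  {
    auto it = index_.find(key);
    if (it != index_.end()) {
      order_.splice(order_.begin(), order_, it->second); // move to front
      return true;
    }
    if (order_.size() == capacity_) {
      index_.erase(order_.back());
      order_.pop_back();
    }
    order_.push_front(key);
    index_[key] = order_.begin();
    return false;
  }

  bool contains(uint64_t key) const { return index_.count(key) != 0; }

private:
  std::size_t capacity_;
  std::list<uint64_t> order_; // front = most recently used
  std::unordered_map<uint64_t, std::list<uint64_t>::iterator> index_;
};

struct ShiftResult {
  double b_hit_rate;      // hits on B in the measured pass / |B|
  std::size_t a_retained; // A keys still resident afterwards
};

// Warm on A, run `passes` unmeasured passes over B, then measure one pass.
template <typename Cache>
ShiftResult
abrupt_shift(Cache &cache, const std::vector<uint64_t> &a, const std::vector<uint64_t> &b, int passes)
{
  assert(!b.empty() && passes > 0);
  for (uint64_t k : a) {
    cache.access(k);
  }
  for (int p = 0; p < passes; ++p) {
    for (uint64_t k : b) {
      cache.access(k);
    }
  }
  std::size_t hits = 0;
  for (uint64_t k : b) {
    hits += cache.access(k) ? 1 : 0;
  }
  std::size_t retained = 0;
  for (uint64_t k : a) {
    retained += cache.contains(k) ? 1 : 0;
  }
  return ShiftResult{static_cast<double>(hits) / static_cast<double>(b.size()), retained};
}
```

   Any cache exposing `access` and `contains` can be passed to `abrupt_shift`, so the same measurement could in principle be applied to an adaptive policy and to the LRU baseline side by side.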
   
   ## Docs
   
   Updates `doc/developer-guide/cache-architecture/ram-cache.en.rst`: the 
History List section no longer matched the code. Adds the value metric and 
floating admission bar, the CLOCK aging (`_tick`, `_age_resident`), "Following 
a shifting working set," and "Memory overhead."
   
   ## Notes
   
   - Validated on synthetic access patterns, not production traces.
   - A further, unimplemented lever remains if ever needed: relaxing the 
incumbent bias (re-queue second-chance + cost/benefit) on a detected shift — 
not required to pass the tests, so left out to keep the change minimal.
   - Possible follow-ups: budget the ghost RAM against `ram_cache.size`; expose 
`HISTORY_DIVISOR` as a config knob.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
