phongn opened a new pull request, #13255:
URL: https://github.com/apache/trafficserver/pull/13255

   # Add S3-FIFO RAM cache eviction algorithm (`ram_cache.algorithm = 2`)
   
   ## Summary
   
   This adds **S3-FIFO** as a selectable RAM cache eviction policy via 
`proxy.config.cache.ram_cache.algorithm = 2`, alongside the existing CLFUS 
(`0`) and LRU (`1`). The default (LRU) is unchanged.
   
   S3-FIFO (Yang, Zhang, Qiu, Yue & Rashmi, "FIFO queues are all you need for 
cache eviction", SOSP 2023) is a FIFO-based policy: a small admission queue and 
a main queue, plus a ghost queue that remembers the keys of recently evicted 
objects. The small queue and ghost filter one-hit-wonders, which makes the 
policy scan-resistant and yields strong hit rates on CDN and key-value 
workloads — while keeping every operation cheap, since a hit only bumps a 2-bit 
counter and nothing is ever reordered.
   
   ## Motivation
   
   The two current RAM cache policies split sharply on real traffic: LRU favors 
recency and loses on frequency-skewed CDN workloads, while CLFUS favors 
frequency/size, is comparatively expensive, and can collapse on recency-heavy 
traffic. In benchmarking, S3-FIFO beats both LRU and CLFUS on every production 
trace tested (CDN and key-value), with a `put` on par with LRU and far below 
CLFUS and a `get` between the two — making it an attractive general-purpose 
option to offer operators.
   
   ## What's in this PR
   
   - `RamCacheS3FIFO.cc` — the new `RamCache` implementation.
   - `Cache.h`, `CacheProcessor.cc`, `RecordsConfig.cc`, `P_RamCache.h`, 
`CMakeLists.txt` — the algorithm enum, the factory wiring (with a one-line 
`cache_init` log of the selected algorithm), the config range `[0-2]`, the 
declaration, and the build entry.
   - `CacheTest.cc` — S3-FIFO added to the existing `ram_cache` regression 
comparison.
   - Admin docs (`records.yaml.en.rst`, `storage/index.en.rst`).
   
   ## Design
   
   S3-FIFO keeps three FIFO queues: a small admission queue (~10% of capacity), 
a main queue (~90%, run as a 2-bit CLOCK), and a ghost queue holding only keys 
of recently evicted objects. A new object enters the small queue. When an 
object is evicted from the small queue it is promoted to the main queue if it 
was reused (frequency ≥ 2) and otherwise demoted to the ghost; a subsequent 
miss whose key is still in the ghost is admitted straight to the main queue. 
This "quick demotion" of one-hit-wonders is what gives S3-FIFO its scan 
resistance and CDN-friendly behavior.
   
   The policy is byte-budgeted to fit the ATS RAM cache. The ghost stores keys, 
not data, but each remembered key still costs real memory, so the ghost is 
bounded both by its object-size footprint and by an entry-count cap, and that 
metadata is counted against `proxy.config.cache.ram_cache.size` — so total 
resident memory (data plus all eviction metadata) stays within the configured 
budget regardless of object cardinality. Access is serialized per stripe by the 
existing stripe lock, exactly like LRU and CLFUS; no new concurrency primitives 
are introduced.
   
   The implementation follows the original paper and the reference 
implementation in libCacheSim (`S3FIFO.c`).
   
   ## Benchmarks
   
   S3-FIFO was selected after an independent evaluation by @bryancall across 
all five candidate policies (the incumbents plus W-TinyLFU, SIEVE, and 
S3-FIFO), using both an in-process microbenchmark and an end-to-end h2load 
sweep through a real ATS proxy. S3-FIFO was the best all-rounder — top or 
tied-top hit rate on every workload, passing scan resistance and adaptivity, 
with the cheapest `put`. The full writeup and methodology are in @bryancall's 
benchmark on the evaluation PR: https://github.com/phongn/trafficserver/pull/2. 
The tables below focus on S3-FIFO against the two incumbents this change adds 
it alongside.
   
   ### Hit rate — real production traces
   
   Full-trace replay (libCacheSim `oracleGeneral` traces), hit rate over the 
second half after warmup, the same stream for every policy; higher is better.
   
   | trace (cache size) | LRU | CLFUS | S3-FIFO |
   |---|---|---|---|
   | wiki CDN, frequency-heavy (512 MB) | 0.176 | 0.194 | **0.237** |
   | wiki CDN, frequency-heavy (8 GB) | 0.521 | 0.571 | **0.585** |
   | Meta CDN-prn, recency-heavy (512 MB) | 0.324 | 0.168 | **0.340** |
   | Meta CDN-prn, recency-heavy (8 GB) | 0.387 | 0.219 | **0.427** |
   | Meta key-value (4 GB) | 0.926 | 0.916 | **0.942** |
   
   S3-FIFO has the best hit rate on every real trace and at every size measured.
   
   ### Microbenchmark (@bryancall, Ryzen 9950X3D)
   
   Per-operation cost, ns/op, lower is better:
   
   | | LRU | CLFUS | S3-FIFO |
   |---|---:|---:|---:|
   | get | 10.1 | 21.7 | 15.3 |
   | put | 566.3 | 837.8 | **567.2** |
   
   S3-FIFO's `put` is the cheapest measured (on par with LRU, ~35% below 
CLFUS); its `get` sits between LRU and CLFUS. On the correctness suite it 
passes scan resistance (hot-set retention 1.000 under a one-time scan), 
adaptivity after an abrupt working-set shift (new-set hit rate 1.000, stale-set 
retention 15/112), and gradual drift (0.978), and its synthetic-Zipf hit rate 
is top or tied-top at every cache size.
   
   ### End-to-end throughput (@bryancall)
   
   In the h2load sweep through a real proxy, end-to-end req/s barely moved 
between algorithms — 3–5% on 1 KB objects (where S3-FIFO was the top performer) 
and flat on bandwidth-saturated large objects. The reason is the test box's 
tiering: a RAM-cache miss falls through to a fast local NVMe disk-cache hit 
rather than an origin fetch, so the eviction policy only chooses which tier 
serves a hit. The policy's value is in hit ratio under memory pressure, which 
surfaces as throughput only where a RAM miss is expensive (origin-bound or 
slow-tier deployments) — so this change is offered on its hit-ratio and 
per-operation-cost merits, not as a throughput win on every topology.
   
   ## Configuration
   
   ```yaml
   records:
     cache:
       ram_cache:
         algorithm: 2   # 0 = CLFUS, 1 = LRU (default), 2 = S3-FIFO
   ```
   
   The selected algorithm is logged at cache initialization (`cache_init` debug 
tag): `ram_cache algorithm = 2 (S3-FIFO)`.
   
   ## Testing
   
   The NIGHTLY-gated `ram_cache` regression test (`traffic_server -R 2 -r 
ram_cache`) now exercises S3-FIFO alongside LRU and CLFUS on the synthetic 
Zipfian workload. Booting with `ram_cache.algorithm: 2` was verified to 
construct the S3-FIFO cache and log the selection.
   
   ## Notes
   
   - Experimental opt-in; the default remains LRU.
   - S3-FIFO does not use the seen filter (it is scan-resistant by 
construction) and does not support in-RAM compression (which remains 
CLFUS-only).
   
   🤖 Generated with [Claude Code](https://claude.com/claude-code)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to