phongn opened a new pull request, #13233:
URL: https://github.com/apache/trafficserver/pull/13233
## Summary
The CLFUS RAM cache ranks objects by a value density, `value = (hits + 1) /
(size + overhead)`, compared against a running `double` average on every
eviction/admission decision. Since #11733 this has been computed with
**integer** division and truncates to `0` for essentially every object,
silently disabling CLFUS's frequency/size/recency logic and degrading it to
FIFO.
## Root cause
#11733 ("Fix C-style cast in Cache subsystem") modernized the macro:
```diff
-#define CACHE_VALUE_HITS_SIZE(_h, _s) ((float)((_h) + 1) / ((_s) +
ENTRY_OVERHEAD))
+#define CACHE_VALUE_HITS_SIZE(_h, _s) (static_cast<float>(((_h) + 1) /
((_s) + ENTRY_OVERHEAD)))
```
The original C cast bound only to the numerator, so the division was
floating point. The `static_cast<float>(...)` wraps the **whole** quotient, so
`(hits + 1) / (size + overhead)` is now integer division. `hits` is on the
order of tens while `size + overhead` is thousands+, so the result truncates to
`0` for any normal object.
With `value ≡ 0`, `_average_value` stays `0` and every value-gated branch is
dead (`0 > 0` is false):
- no promote-to-MRU on hit (`get()`),
- no CLOCK second-chance / requeue (`put()`),
- no value-based re-admission from the history (ghost) list.
CLFUS effectively becomes FIFO, which generally performs worse than the
simpler LRU it is meant to beat. This has affected every CLFUS deployment since
the regression merged on 2024-08-28.
## Fix
Bind the cast to the numerator only, restoring floating-point division (the
pre-#11733 semantics):
```c
#define CACHE_VALUE_HITS_SIZE(_h, _s) (static_cast<float>((_h) + 1) / ((_s)
+ ENTRY_OVERHEAD))
```
## Test
Adds `ram_cache_clfus_value`, a focused regression test asserting the value
metric is a non-zero fraction and is monotonic in hits and in size. It fails on
the pre-fix macro and passes after:
- buggy macro → `FAILED` ("value metric truncated to zero", etc.)
- fixed macro → `PASSED`
The existing `ram_cache` regression still passes. With the fix, CLFUS's
size-aware behavior is restored — in the bundled `ram_cache` test at the 16 MB
size, its variable-size hit rate rises to ~0.82, slightly ahead of LRU (~0.80)
on the same workload.
## Notes
- Severity: correctness regression affecting all users of the non-default
`proxy.config.cache.ram_cache.algorithm = 0` (CLFUS) since #11733; a backport
may be warranted.
- This restores the intended algorithm. A separate, long-standing limitation
— resident entries' hit counts are only aged on the ghost list, never while
resident — is **not** addressed here and is tracked separately.
Fixes a regression introduced in #11733.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]