moomindani commented on PR #14440:
URL: https://github.com/apache/iceberg/pull/14440#issuecomment-4590842955

   **Follow-up on the freshness check — tightening the methodology**
   
   While re-validating the benchmark against the current head, I want to 
correct one detail in the freshness section above. That check used a 
deliberately short `TTL=200ms` so write-expiration would fire within the test 
window. At that very short TTL, the `access-only` result is actually **not 
deterministic**: Caffeine evaluates `expireAfterAccess` during amortized 
maintenance, and at a 200ms TTL that granularity is coarse enough that the 
entry occasionally expires even under continuous reads (~5–20% of runs in my 
measurements). So my "TTL never fires while reads continue" wording was too 
strong — if you re-run the bench as-is, you may occasionally see `access-only` 
refresh.
   
   I ruled out GC pauses as the cause: the max inter-read gap in the refreshing 
runs was 1–6ms, nowhere near the 200ms TTL. I then re-confirmed the behavior at 
realistic TTLs with a steady read rate (one read every 50ms):
   
   | TTL | access-only (`expireAfterAccess`) | write-only (`expireAfterWrite`) |
   |---|---|---|
   | 1s | **stale 15/15** | refreshed 15/15 |
   | 2s | **stale 15/15** | refreshed 15/15 |
   
   At realistic TTLs (including the 30s default in #14417) the maintenance 
granularity is negligible relative to the TTL, so the staleness bug and the 
write-based fix are both fully deterministic. The 200ms flakiness is purely an 
artifact of the degenerate test TTL, not the cache behavior users actually hit.
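
   To illustrate why the two policies diverge under steady reads, here is a 
minimal sketch of the expiry arithmetic on a fake clock. It is a toy model, not 
Caffeine's implementation and not the benchmark code: the `ExpiryModel` class, 
`Policy` enum, and `staysStale` method are hypothetical names, and the model 
assumes the source changes right after the initial load. It deliberately 
ignores maintenance granularity, so it does not reproduce the 200ms flakiness.

```java
/** Toy model of one cached entry on a fake clock; NOT Caffeine's implementation. */
final class ExpiryModel {
  enum Policy { ACCESS, WRITE }

  /**
   * Simulates one read every readIntervalMs until durationMs and reports
   * whether the cache still holds the originally loaded (now stale) value.
   * Assumption: the source of truth changes right after the initial load.
   */
  static boolean staysStale(Policy policy, long ttlMs, long readIntervalMs, long durationMs) {
    long loadedAt = 0;    // fake-clock time of the last (re)load
    long lastAccess = 0;  // fake-clock time of the last read
    boolean stale = true; // cached value is the pre-change one

    for (long now = readIntervalMs; now <= durationMs; now += readIntervalMs) {
      // Access-based expiry measures age from the last read,
      // write-based expiry measures age from the last load.
      long age = (policy == Policy.ACCESS) ? now - lastAccess : now - loadedAt;
      if (age >= ttlMs) {
        stale = false;    // entry expired, so this read reloads fresh data
        loadedAt = now;
      }
      lastAccess = now;   // every read resets the access timer
    }
    return stale;
  }

  public static void main(String[] args) {
    // TTL=1s, one read every 50ms, observed for 3s (mirrors the table's first row).
    System.out.println("access-only stale: " + staysStale(Policy.ACCESS, 1000, 50, 3000)); // true
    System.out.println("write-only stale:  " + staysStale(Policy.WRITE, 1000, 50, 3000));  // false
  }
}
```

   In the model, each read resets the access timer, so the access-based age 
never exceeds the 50ms read interval and the entry never expires, while the 
write-based age reaches the TTL regardless of read traffic. A real regression 
test against Caffeine could get similar determinism by injecting a fake clock 
through `Caffeine.newBuilder().ticker(...)`, though that is a suggestion and 
not something the benchmark above does.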
   
   Net: the latency/throughput/streaming-overhead numbers and the core 
conclusion are unchanged — `expireAfterAccess` keeps serving stale metadata 
under steady reads, and `expireAfterWrite` fixes it. If the freshness scenario 
gets turned into a regression test, it should use a realistic TTL (1–2s+) 
rather than 200ms to avoid being flaky.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

