moomindani commented on PR #14440: URL: https://github.com/apache/iceberg/pull/14440#issuecomment-4590842955
**Follow-up on the freshness check: tightening the methodology**

While re-validating the benchmark against the current head, I want to correct one detail in the freshness section above. That check used a deliberately short `TTL=200ms` so write-expiration would fire within the test window. At that very short TTL, the `access-only` result is actually **not deterministic**: Caffeine evaluates `expireAfterAccess` during amortized maintenance, and at a 200ms TTL that granularity is coarse enough that the entry occasionally expires even under continuous reads (~5–20% of runs in my measurements). So my "TTL never fires while reads continue" wording was too strong: if you re-run the bench as-is, you may occasionally see `access-only` refresh.

I checked whether this was a GC pause (it isn't: the max inter-read gap in the refreshing runs was 1–6ms, nowhere near the TTL), then re-confirmed the behavior at realistic TTLs with a sane read rate (one read every 50ms):

| TTL | access-only (`expireAfterAccess`) | write-only (`expireAfterWrite`) |
|---|---|---|
| 1s | **stale 15/15** | refreshed 15/15 |
| 2s | **stale 15/15** | refreshed 15/15 |

At realistic TTLs (including the 30s default in #14417) the maintenance granularity is negligible, so the staleness bug and the write-based fix are both fully deterministic. The 200ms flakiness is purely an artifact of the degenerate test TTL, not the cache behavior users actually hit.

Net: the latency/throughput/streaming-overhead numbers and the core conclusion are unchanged. `expireAfterAccess` keeps serving stale metadata under steady reads, and `expireAfterWrite` fixes it. If the freshness scenario gets turned into a regression test, it should use a realistic TTL (1–2s+) rather than 200ms to avoid being flaky.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
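For readers who want to see why the two policies diverge under steady reads, the freshness scenario in the comment above can be modeled with a simulated clock. This is only a rough sketch: it uses no Caffeine APIs and ignores Caffeine's maintenance scheduling, and `ExpiryPolicyModel`, `SingleEntryCache`, `Policy`, and `reloads` are hypothetical names introduced purely for illustration. It assumes one cached entry, a 1s TTL, and one read every 50ms.

```java
public class ExpiryPolicyModel {
    // Hypothetical stand-ins for the two expiration policies being compared.
    enum Policy { AFTER_ACCESS, AFTER_WRITE }

    /** A single-entry cache driven by a simulated millisecond clock (not Caffeine). */
    static final class SingleEntryCache {
        private final Policy policy;
        private final long ttlMillis;
        private long timerStart; // last write, or last access under AFTER_ACCESS
        private int loads;

        SingleEntryCache(Policy policy, long ttlMillis, long now) {
            this.policy = policy;
            this.ttlMillis = ttlMillis;
            this.timerStart = now;
            this.loads = 1; // the initial load
        }

        void read(long now) {
            if (now - timerStart >= ttlMillis) {
                // Entry expired: reload it, which counts as a write.
                loads++;
                timerStart = now;
            } else if (policy == Policy.AFTER_ACCESS) {
                // A read pushes the expiration deadline out again.
                timerStart = now;
            }
        }

        int loads() {
            return loads;
        }
    }

    /** Counts reloads after the initial load when reading at a fixed interval. */
    static int reloads(Policy policy, long ttlMillis, long readIntervalMillis, long durationMillis) {
        SingleEntryCache cache = new SingleEntryCache(policy, ttlMillis, 0);
        for (long now = readIntervalMillis; now <= durationMillis; now += readIntervalMillis) {
            cache.read(now);
        }
        return cache.loads() - 1;
    }

    public static void main(String[] args) {
        System.out.println("access-only reloads: " + reloads(Policy.AFTER_ACCESS, 1000, 50, 5000));
        System.out.println("write-only reloads:  " + reloads(Policy.AFTER_WRITE, 1000, 50, 5000));
    }
}
```

In this idealized model the access-based entry is never reloaded while reads keep arriving faster than the TTL, whereas the write-based entry is reloaded once per TTL. The model is deterministic by construction, so it does not reproduce the 200ms flakiness described above.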
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
