Hi all,

Reviving the cache-policy discussion that Hossein Torabi (@blcksrx) started in October 2025 [1]. That ML thread received zero replies, and PR #14440 [2] was subsequently auto-closed by the stale bot in March, despite directional support from @gaborkaszab, @findepi, and @pvary on the PR itself. I'm starting a fresh thread for broader visibility.

[1] https://lists.apache.org/thread/4hnk0d5bfcw4y5ow5l1n2y4x9m2qgmjh
[2] https://github.com/apache/iceberg/pull/14440

*Problem (recap)*

CachingCatalog uses expireAfterAccess exclusively. In long-running workloads (most concretely, Spark Structured Streaming with a stream-to-static join against a slowly-changing Iceberg reference table), every microbatch read resets the TTL, so the cache never observes new snapshots. The only documented workaround (spark.sql.catalog.<name>.cache-enabled=false) forces a full metadata reload on every microbatch.

*Bench results*

I cherry-picked PR #14440 onto current main and ran a benchmark to give the proposal an empirical footing. Full setup, reproduction steps, and results are in this PR comment: [3]

The headline numbers:

- The staleness bug is reproducible: with expireAfterAccess(200ms) and 300ms of continuous reads, the cache returns 5.8M stale results and never observes the underlying snapshot update.
- expireAfterWrite(200ms) under the same load: 800k reads, all correctly refreshed after the TTL boundary.
- Per-call latency: 0.07us (access-only, today's default) vs 906us (cache-disabled workaround) vs 0.44us (write-only, this PR). The proposed dual policy costs essentially nothing on the hit path.
- Projected for a 1Hz streaming microbatch over 24h: ~78 sec/day of metadata overhead with the workaround vs. ~0.06 sec/day with write-only (TTL=1h). On S3 the absolute saving is on the order of hours/day per cached table.

[3] https://github.com/apache/iceberg/pull/14440#issuecomment-4561571918

*Open question for the community*

The dual-policy approach in #14440 (both expireAfterAccess and expireAfterWrite, independently configurable, default = current behavior) had positive directional feedback from @gaborkaszab, @findepi, and @pvary on the PR. Is there a competing design we should consider before proceeding, for example the pluggable Cache factory suggested in [4]?

Please speak up now; otherwise we will likely move forward with the dual-policy approach. I'll coordinate the PR-level next steps (reviving #14440 vs. opening a successor PR with attribution) with @blcksrx directly on #14440.

[4] https://github.com/apache/iceberg/issues/14417#issuecomment-3451805984

Best,
Noritaka Sekiyama
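
P.S. For anyone who wants to see the mechanism from the problem recap in isolation, here is a minimal sketch. It is a toy single-entry cache driven by an explicit simulated clock, written only with the JDK. It is not Caffeine, not CachingCatalog, and not the benchmark from [3]. All names (TtlPolicyDemo, SingleEntryCache, staleReads) are hypothetical, and it issues one read per simulated millisecond, so the counts are not comparable to the 5.8M / 800k figures above.

```java
import java.util.function.LongSupplier;

/** Toy illustration of access-based vs. write-based TTL; all names are hypothetical. */
final class TtlPolicyDemo {

  enum Policy { EXPIRE_AFTER_ACCESS, EXPIRE_AFTER_WRITE }

  /** Single-entry cache; the caller passes the current time so runs are deterministic. */
  static final class SingleEntryCache {
    private final Policy policy;
    private final long ttlMillis;
    private boolean loaded;
    private long value;       // cached "snapshot id"
    private long timerStart;  // instant from which the TTL is measured

    SingleEntryCache(Policy policy, long ttlMillis) {
      this.policy = policy;
      this.ttlMillis = ttlMillis;
    }

    long get(long nowMillis, LongSupplier loader) {
      boolean expired = !loaded || nowMillis - timerStart >= ttlMillis;
      if (expired) {
        // Miss: reload from the source of truth and restart the TTL.
        value = loader.getAsLong();
        loaded = true;
        timerStart = nowMillis;
      } else if (policy == Policy.EXPIRE_AFTER_ACCESS) {
        // Hit under access-based expiry: the read itself restarts the TTL.
        timerStart = nowMillis;
      }
      // Hit under write-based expiry: the TTL keeps counting from the last load.
      return value;
    }
  }

  /**
   * Issues one read per simulated millisecond for durationMillis. The underlying
   * snapshot id changes from 1 to 2 at updateAtMillis. Returns how many reads
   * returned the old snapshot after the update had happened.
   */
  static int staleReads(Policy policy, long ttlMillis, long durationMillis, long updateAtMillis) {
    SingleEntryCache cache = new SingleEntryCache(policy, ttlMillis);
    long[] currentSnapshot = {1L};
    int stale = 0;
    for (long now = 0; now < durationMillis; now++) {
      if (now == updateAtMillis) {
        currentSnapshot[0] = 2L;
      }
      long seen = cache.get(now, () -> currentSnapshot[0]);
      if (seen != currentSnapshot[0]) {
        stale++;
      }
    }
    return stale;
  }

  public static void main(String[] args) {
    // TTL = 200ms, 300ms of continuous reads, snapshot updated at t = 50ms.
    System.out.println("expireAfterAccess stale reads: "
        + staleReads(Policy.EXPIRE_AFTER_ACCESS, 200, 300, 50)); // prints 250
    System.out.println("expireAfterWrite stale reads: "
        + staleReads(Policy.EXPIRE_AFTER_WRITE, 200, 300, 50));  // prints 150
  }
}
```

In this toy run the access-based cache never expires, so all 250 reads issued after the update are stale. The write-based cache serves the old snapshot only until the TTL boundary at t = 200ms (150 stale reads) and returns the new snapshot from then on.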
