mao-liu opened a new pull request, #8186:
URL: https://github.com/apache/paimon/pull/8186

   ### Purpose
   
   In Paimon v1.3 (prior to 
https://github.com/apache/paimon/commit/960dce1a18cccb3beac1d5ae0c8a1f59414498ae),
 cold-filling the manifest cache caused a significant heap memory spike. This 
problem was raised and discussed in 
https://github.com/apache/paimon/issues/7030 and 
https://github.com/apache/paimon/pull/7031. It is particularly 
evident for highly partitioned tables in jobs with high parallelism.
   
   While the heap spike issue is mostly resolved via 
https://github.com/apache/paimon/commit/960dce1a18cccb3beac1d5ae0c8a1f59414498ae,
 some additional manifest cache options are proposed here to help tune the 
manifest cache for highly partitioned tables in jobs with high parallelism.
   
   When many high-parallelism writers restore at the same time, the Job 
Manager's manifest cache can become a memory bottleneck. The cache holds 
entries with soft references, so under sustained heap pressure the JVM reclaims 
entries that are then immediately re-read and decompressed, driving heap back 
up and triggering further reclamation — a cache-thrash spiral. There was 
previously no way to tune this behavior.
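
   The reclaim-then-reload cycle described above can be illustrated with a toy 
soft-value cache. This is a minimal sketch, not Paimon's `SegmentsCache`: the 
class `SoftValueCacheSketch`, its loader, and `simulateReclaim()` are 
hypothetical names, and the reclaim is simulated by clearing the references 
manually rather than by real heap pressure.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Toy cache holding values through soft references (hypothetical, for illustration only). */
final class SoftValueCacheSketch {
    private final Map<String, SoftReference<byte[]>> entries = new HashMap<>();
    private final Function<String, byte[]> loader;
    private int loads = 0;

    SoftValueCacheSketch(Function<String, byte[]> loader) {
        this.loader = loader;
    }

    byte[] get(String key) {
        SoftReference<byte[]> ref = entries.get(key);
        byte[] value = (ref == null) ? null : ref.get();
        if (value == null) {
            // Either a first-time miss or the referent was reclaimed:
            // the entry is read again and a fresh buffer is allocated.
            value = loader.apply(key);
            loads++;
            entries.put(key, new SoftReference<>(value));
        }
        return value;
    }

    /** Stand-in for the JVM clearing soft references under heap pressure. */
    void simulateReclaim() {
        for (SoftReference<byte[]> ref : entries.values()) {
            ref.clear();
        }
    }

    int loads() {
        return loads;
    }
}
```

   Every reclaim turns the next access into a fresh load and allocation, which 
is the feedback loop that strong references avoid.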
   
   This PR exposes additional manifest-cache controls and a prefetch option to 
make this tunable:
   
   - Added `WriteRestoreScanBenchmark`, a micro-benchmark that reproduces the 
manifest-cache cold-fill memory spike and reports heap/cache footprint across 
cache-disabled vs. cache-enabled (strong-ref) arms. On Paimon v1.3, this 
benchmark would reveal a significant heap memory spike during cold-filling on 
the cache-enabled path. This problem is no longer present after 
https://github.com/apache/paimon/commit/960dce1a18cccb3beac1d5ae0c8a1f59414498ae,
 but the benchmark remains useful for measuring performance and detecting 
regressions in the future.
   
   - `SegmentsCache` now supports a configurable idle TTL 
(`expire-after-access`) and a `soft-values` toggle. Setting `soft-values=false` 
pins the working set with strong references so the thrash spiral cannot start; 
the cache then stays bounded by weight (up to its configured memory). The 
defaults preserve the existing behavior (soft references on).
   
   - New catalog option:
     - `cache.manifest.soft-values` (default `true`) — toggle soft/strong 
references for the catalog manifest cache. The catalog manifest cache continues 
to inherit the catalog-wide `cache.expire-after-access` TTL.
   
   - New writer-coordinator options:
     - `sink.writer-coordinator.cache-soft-values` (default `true`) — same 
soft/strong reference toggle for the coordinator manifest cache.
     - `sink.writer-coordinator.cache-expire-after-access` (default disabled) — 
optional idle TTL for coordinator cache entries; the cache stays bounded by 
`sink.writer-coordinator.cache-memory` regardless.
     - `sink.writer-coordinator.prefetch-manifests` (default `false`) — eagerly 
read all data manifests of the latest snapshot during refresh to warm the 
in-Job-Manager manifest cache once, avoiding many concurrent cold manifest 
reads when writers restore simultaneously.
   
   - Docs: documented the new options and added a "Write Initialize" section in 
`write-performance.md` explaining when these settings help, the failure 
mechanism, and how they resolve it.
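
   To make "bounded by weight" and "idle TTL" concrete, here is a minimal 
sketch of a strong-reference cache with both limits. It is not the 
`SegmentsCache` implementation: `WeightBoundedCacheSketch` and its methods are 
hypothetical, eviction is plain LRU, and the clock is passed in explicitly so 
the behavior is deterministic.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical strong-reference cache bounded by total weight, with an optional idle TTL. */
final class WeightBoundedCacheSketch {
    private static final class Entry {
        final byte[] value;
        long lastAccessMillis;

        Entry(byte[] value, long now) {
            this.value = value;
            this.lastAccessMillis = now;
        }
    }

    // accessOrder = true keeps the least recently used entry first.
    private final LinkedHashMap<String, Entry> entries = new LinkedHashMap<>(16, 0.75f, true);
    private final long maxWeightBytes;
    private final long idleTtlMillis; // <= 0 means no idle expiry
    private long weightBytes = 0;

    WeightBoundedCacheSketch(long maxWeightBytes, long idleTtlMillis) {
        this.maxWeightBytes = maxWeightBytes;
        this.idleTtlMillis = idleTtlMillis;
    }

    void put(String key, byte[] value, long now) {
        Entry old = entries.put(key, new Entry(value, now));
        if (old != null) {
            weightBytes -= old.value.length;
        }
        weightBytes += value.length;
        // Evict least recently used entries until the total weight fits the budget.
        Iterator<Map.Entry<String, Entry>> it = entries.entrySet().iterator();
        while (weightBytes > maxWeightBytes && it.hasNext()) {
            weightBytes -= it.next().getValue().value.length;
            it.remove();
        }
    }

    byte[] get(String key, long now) {
        Entry e = entries.get(key);
        if (e == null) {
            return null;
        }
        if (idleTtlMillis > 0 && now - e.lastAccessMillis >= idleTtlMillis) {
            // Entry sat idle longer than the TTL: drop it and report a miss.
            entries.remove(key);
            weightBytes -= e.value.length;
            return null;
        }
        e.lastAccessMillis = now;
        return e.value;
    }

    long weightBytes() {
        return weightBytes;
    }
}
```

   With strong references the working set is only ever released by these two 
explicit limits, never by the garbage collector.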
   
   
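   As a usage illustration, the new option keys could be collected as plain 
string maps like this. The keys are the ones listed above; the values 
(`false`, `10 min`, `true`) are illustrative assumptions rather than 
recommendations, the class `ManifestCacheOptionsSketch` is hypothetical, and 
how the maps are handed to a catalog or table is deliberately left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical holder for example option maps; values are illustrative only. */
final class ManifestCacheOptionsSketch {

    /** Catalog-level options for the catalog manifest cache. */
    static Map<String, String> catalogOptions() {
        Map<String, String> options = new LinkedHashMap<>();
        // Pin catalog manifest cache entries with strong references.
        options.put("cache.manifest.soft-values", "false");
        // Catalog-wide idle TTL, inherited by the manifest cache (assumed value).
        options.put("cache.expire-after-access", "10 min");
        return options;
    }

    /** Options for the writer coordinator's manifest cache. */
    static Map<String, String> writerCoordinatorOptions() {
        Map<String, String> options = new LinkedHashMap<>();
        // Strong references for the coordinator manifest cache.
        options.put("sink.writer-coordinator.cache-soft-values", "false");
        // Optional idle TTL; disabled unless set (assumed value).
        options.put("sink.writer-coordinator.cache-expire-after-access", "10 min");
        // Warm the cache once during refresh.
        options.put("sink.writer-coordinator.prefetch-manifests", "true");
        return options;
    }
}
```
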
   ### Tests
   
   - `SegmentsCacheTest`: covers defaults (soft refs on, no TTL), getter 
pass-through, `create` returning null on zero memory, and that strong 
references stay bounded by weight-based eviction.
   - `CachingCatalogTest#testManifestCacheOptions`: asserts the catalog 
manifest cache picks up `soft-values` and inherits the catalog idle TTL.
   - `TableWriteCoordinatorTest`: `testBuildManifestCacheOptions` verifies the 
coordinator options map to the cache (default soft refs + no TTL, explicit TTL 
honored, `soft-values=false` switches to strong refs, zero memory disables the 
cache); `testPrefetchManifestsWarmsCache` verifies that constructing the 
coordinator with prefetch enabled warms the cache and that scan results remain 
correct.
   - Regenerated config docs verified by `ConfigOptionsDocsCompletenessITCase`.
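
   The property that `testPrefetchManifestsWarmsCache` checks can be sketched 
in isolation as follows. This is a simplified model under stated assumptions, 
not the `TableWriteCoordinator` code: `PrefetchSketch`, `readManifest`, and the 
manifest names are hypothetical, and a manifest "read" is reduced to a counter 
increment.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical coordinator-side cache that can be warmed before writers restore. */
final class PrefetchSketch {
    private final Map<String, byte[]> cache = new ConcurrentHashMap<>();
    private final AtomicInteger coldReads = new AtomicInteger();

    /** Stand-in for reading and decompressing one manifest file. */
    private byte[] readManifest(String name) {
        coldReads.incrementAndGet();
        return name.getBytes(StandardCharsets.UTF_8);
    }

    /** Eagerly loads every data manifest of the latest snapshot, once, during refresh. */
    void prefetch(List<String> manifestsOfLatestSnapshot) {
        for (String name : manifestsOfLatestSnapshot) {
            cache.computeIfAbsent(name, this::readManifest);
        }
    }

    /** Called on behalf of each restoring writer; only cache misses trigger a cold read. */
    byte[] scan(String name) {
        return cache.computeIfAbsent(name, this::readManifest);
    }

    int coldReads() {
        return coldReads.get();
    }
}
```

   After `prefetch`, any number of writers can scan the same manifests without 
adding cold reads, and each scan still returns the same content it would have 
read directly.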

