Thanks for the FLIP. This capability is needed in production. I would like to add a suggestion: dynamically adjusting the following configuration parameters would also be helpful:

- `execution.checkpointing.max-concurrent-checkpoints`
- `execution.checkpointing.min-pause`
- `execution.checkpointing.tolerable-failed-checkpoints`
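
To make the suggestion concrete, here is a minimal sketch of what a request covering these three options might look like against the proposed `PATCH /jobs/:jobid/checkpoints/configuration` endpoint. Everything beyond the endpoint path is an assumption on my side: the FLIP as quoted below only mentions checkpointInterval and checkpointTimeout, so the JSON body shape, the use of the full config keys as field names, the millisecond unit for `min-pause`, and the class and method names `CheckpointConfigPatch`, `toJson`, and `buildRequest` are all hypothetical. The sketch only builds the request and does not send it.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Flink: builds a PATCH request for the
// endpoint proposed in FLIP-571. The body format is an assumption.
final class CheckpointConfigPatch {

    // Serialize numeric overrides as a flat JSON object (assumed body shape).
    static String toJson(Map<String, Long> overrides) {
        return overrides.entrySet().stream()
                .map(e -> "\"" + e.getKey() + "\":" + e.getValue())
                .collect(Collectors.joining(",", "{", "}"));
    }

    // Build (but do not send) the PATCH request for a given job.
    static HttpRequest buildRequest(String restBase, String jobId, Map<String, Long> overrides) {
        URI uri = URI.create(restBase + "/jobs/" + jobId + "/checkpoints/configuration");
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", "application/json")
                .method("PATCH", HttpRequest.BodyPublishers.ofString(toJson(overrides)))
                .build();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the keys in insertion order in the JSON body.
        Map<String, Long> overrides = new LinkedHashMap<>();
        overrides.put("execution.checkpointing.max-concurrent-checkpoints", 1L);
        overrides.put("execution.checkpointing.min-pause", 5000L); // assumed to be milliseconds
        overrides.put("execution.checkpointing.tolerable-failed-checkpoints", 3L);

        // "<jobid>" is a placeholder, and localhost:8081 is just an example REST address.
        HttpRequest request = buildRequest("http://localhost:8081", "<jobid>", overrides);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(toJson(overrides));
    }
}
```

Whether these options fit the same endpoint or need separate handling (for example, `max-concurrent-checkpoints` may interact with in-flight checkpoints) is of course up to the FLIP design.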

Best,
xingsuo-zbz

At 2026-04-08 15:51:08, "熊饶饶" <[email protected]> wrote:
>Thanks for the FLIP. It is useful for users. I have only one question:
>could JM memory pressure under high-concurrency sampling cause OOM in
>large-scale jobs?
>
>> On March 24, 2026, at 16:29, Jiangang Liu <[email protected]> wrote:
>>
>> Hi everyone,
>>
>> I would like to start a discussion on FLIP-571: Support Dynamically
>> Updating Checkpoint Configuration at Runtime via REST API [1].
>>
>> Currently, checkpoint configuration (checkpointInterval, checkpointTimeout)
>> is immutable after job submission. This creates significant operational
>> challenges for long-running streaming jobs:
>>
>> 1. Cascading checkpoint failures cannot be resolved without restarting
>>    the job, causing data reprocessing delays.
>> 2. Near-complete checkpoints (e.g., 95% persisted) are entirely discarded
>>    on timeout, wasting all I/O work and potentially creating a failure
>>    loop for large-state jobs.
>> 3. Static configuration cannot adapt to variable workloads at runtime.
>>
>> FLIP-571 proposes a new REST API endpoint:
>>
>>     PATCH /jobs/:jobid/checkpoints/configuration
>>
>> Key design points:
>>
>> - Timeout changes apply immediately to in-flight checkpoints by
>>   rescheduling their canceller timers, saving near-complete checkpoints
>>   from being discarded.
>> - Interval changes take effect on the next checkpoint trigger cycle.
>> - Configuration overrides are persisted to ExecutionPlanStore (following
>>   the JobResourceRequirements pattern) and automatically restored after
>>   failover.
>>
>> For more details, please refer to the FLIP [1].
>>
>> Looking forward to your feedback and suggestions!
>>
>> [1]
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-571%3A+Support+Dynamically+Updating+Checkpoint+Configuration+at+Runtime+via+REST+API
>>
>> Best regards,
>> Jiangang Liu
