[
https://issues.apache.org/jira/browse/FLINK-38990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated FLINK-38990:
-----------------------------------
Labels: pull-request-available (was: )
> Support configurable initial delay for first checkpoint trigger
> ---------------------------------------------------------------
>
> Key: FLINK-38990
> URL: https://issues.apache.org/jira/browse/FLINK-38990
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Checkpointing
> Reporter: Liu
> Priority: Major
> Labels: pull-request-available
>
> h1. Summary
> Add a new configuration option execution.checkpointing.initial-delay to allow
> users to configure the initial delay before the first checkpoint is triggered
> after job startup.
> h1. Motivation
> When a Flink streaming job starts consuming from a message queue (e.g.,
> Kafka, Pulsar) with a significant backlog, the job needs time to catch up
> with the accumulated data. During this catch-up phase, triggering checkpoints
> can negatively impact processing performance due to:
> * Memory pressure: Checkpoint barrier alignment and state snapshots consume
> additional memory
> * I/O overhead: Writing state to external storage increases disk/network I/O
> * Reduced throughput: Checkpoint operations compete with data processing
> for resources
> Currently, the initial checkpoint delay is calculated randomly within the
> range [minPauseBetweenCheckpoints, baseInterval + 1) (see
> getRandomInitDelay() in CheckpointCoordinator.java), which:
> * Cannot be directly configured by users
> * May not provide sufficient delay for jobs with large backlogs
> * Has a maximum value limited to baseInterval
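To make the range quoted above concrete, here is a minimal Java sketch of drawing a delay from [minPauseBetweenCheckpoints, baseInterval + 1). It is an illustration based only on the range in the description, not a copy of the code in CheckpointCoordinator.java; the class name, method name, and parameter names are hypothetical.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Illustrative sketch only; not the actual CheckpointCoordinator code. */
public class RandomInitDelaySketch {

    /**
     * Draws a delay uniformly from [minPauseMillis, baseIntervalMillis + 1),
     * so both minPauseMillis and baseIntervalMillis are possible results.
     * Parameter names are assumptions made for this example.
     */
    public static long randomInitDelay(long minPauseMillis, long baseIntervalMillis) {
        // nextLong(origin, bound) excludes the bound, hence the "+ 1".
        return ThreadLocalRandom.current().nextLong(minPauseMillis, baseIntervalMillis + 1L);
    }

    public static void main(String[] args) {
        // Example values: 5 s minimum pause, 60 s checkpoint interval.
        long delay = randomInitDelay(5_000L, 60_000L);
        System.out.println("initial delay (ms): " + delay);
    }
}
```

With these example values the first checkpoint can never be delayed by more than 60 seconds, which is the limitation the third bullet describes.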
> While Flink already provides execution.checkpointing.interval-during-backlog
> (introduced in FLIP-309) to adjust checkpoint intervals during backlog
> processing, there is no dedicated option to delay the first checkpoint
> trigger after job startup.
> h1. Proposed Changes
> Add a new configuration option in
> ExecutionCheckpointingOptions: execution.checkpointing.initial-delay
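One way the proposed option could interact with the existing behavior is sketched below in plain Java. The issue text shown here does not specify the option's type, default, or fallback semantics, so everything in this sketch is an assumption: the option is modeled as an optional Duration, and an unset value falls back to the random range [minPauseBetweenCheckpoints, baseInterval + 1). All class, method, and parameter names are hypothetical and do not come from Flink.

```java
import java.time.Duration;
import java.util.Optional;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical sketch of resolving the first-checkpoint delay; not Flink code. */
public class InitialDelayResolverSketch {

    /**
     * Returns the delay before the first checkpoint, in milliseconds.
     * Assumption: an explicitly configured delay wins; otherwise a random
     * delay in [minPauseMillis, baseIntervalMillis + 1) is used.
     */
    public static long resolveInitialDelayMillis(
            Optional<Duration> configuredInitialDelay,
            long minPauseMillis,
            long baseIntervalMillis) {
        if (configuredInitialDelay.isPresent()) {
            // Stands in for a user-supplied execution.checkpointing.initial-delay.
            return configuredInitialDelay.get().toMillis();
        }
        // Fallback assumed for this sketch: the random range from the description.
        return ThreadLocalRandom.current().nextLong(minPauseMillis, baseIntervalMillis + 1L);
    }

    public static void main(String[] args) {
        // A job with a large backlog delays its first checkpoint by 10 minutes.
        long configured = resolveInitialDelayMillis(
                Optional.of(Duration.ofMinutes(10)), 5_000L, 60_000L);
        System.out.println("configured delay (ms): " + configured);

        // Without the option, the delay stays within the random range.
        long fallback = resolveInitialDelayMillis(Optional.empty(), 5_000L, 60_000L);
        System.out.println("fallback delay (ms): " + fallback);
    }
}
```

Under these assumptions a configured delay may exceed baseInterval, which the random range alone cannot do.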