[ https://issues.apache.org/jira/browse/FLINK-20217 ]
Piotr Nowojski deleted comment on FLINK-20217:
----------------------------------------
was (Author: pnowojski):
Firing timers can now be interrupted to speed up checkpointing. Timers that
were interrupted by a checkpoint will be fired shortly after the checkpoint
completes. By default this feature is disabled. To enable it, set
execution.checkpointing.unaligned.interruptible-timers.enabled
to true.
> More fine-grained timer processing
> ----------------------------------
>
> Key: FLINK-20217
> URL: https://issues.apache.org/jira/browse/FLINK-20217
> Project: Flink
> Issue Type: Improvement
> Components: API / DataStream, Runtime / Task
> Affects Versions: 1.10.2, 1.11.2, 1.12.0
> Reporter: Nico Kruber
> Assignee: Piotr Nowojski
> Priority: Not a Priority
> Labels: auto-deprioritized-major, auto-deprioritized-minor,
> pull-request-available
> Fix For: 1.20.0
>
>
> Timers are currently processed in one big block under the checkpoint lock
> (in {{InternalTimerServiceImpl#advanceWatermark}}). This can be problematic
> in a number of scenarios during checkpointing and can lead to
> checkpoints timing out (even unaligned checkpoints would not help).
> If you have a huge number of timers to process when advancing the watermark
> and the task is also back-pressured, the situation may actually be worse
> since you would block on the checkpoint lock and also wait for
> buffers/credits from the receiver.
> I propose to make this loop more fine-grained so that it is interruptible by
> checkpoints, though there may be other ways to improve this.
> This issue has been observed, for example, here:
> https://lists.apache.org/thread/f6ffk9912fg5j1rfkxbzrh0qmp4w6qry
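
To illustrate the proposal in the description, here is a minimal sketch of a timer loop that can be interrupted between individual timers. It is not Flink's actual implementation: the class InterruptibleTimerLoop, its methods, and the shouldInterrupt callback are all hypothetical names chosen for this example, and timers are reduced to bare timestamps.

```java
import java.util.PriorityQueue;
import java.util.function.BooleanSupplier;
import java.util.function.LongConsumer;

/** Hypothetical sketch only; this is not Flink's InternalTimerServiceImpl. */
public class InterruptibleTimerLoop {
    // Pending timer timestamps, smallest first.
    private final PriorityQueue<Long> timers = new PriorityQueue<>();

    public void register(long timestamp) {
        timers.add(timestamp);
    }

    public int pending() {
        return timers.size();
    }

    /**
     * Fires timers with timestamp <= watermark one at a time. Before each
     * timer, shouldInterrupt is consulted (assumed here to signal a pending
     * checkpoint). Returns true if all due timers fired, false if the loop
     * stopped early; in that case the remaining due timers stay queued and
     * a later call with the same watermark picks them up.
     */
    public boolean advanceWatermark(
            long watermark, LongConsumer onTimer, BooleanSupplier shouldInterrupt) {
        while (!timers.isEmpty() && timers.peek() <= watermark) {
            if (shouldInterrupt.getAsBoolean()) {
                return false;
            }
            onTimer.accept(timers.poll());
        }
        return true;
    }
}
```

The caller is expected to invoke advanceWatermark again with the same watermark once the checkpoint has been handled, which mirrors the behaviour described in the comment: interrupted timers fire shortly after the checkpoint completes.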
--
This message was sent by Atlassian Jira
(v8.20.10#820010)