[ https://issues.apache.org/jira/browse/FLINK-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16441180#comment-16441180 ]
ASF GitHub Bot commented on FLINK-4809:
---------------------------------------
Github user rmetzger commented on a diff in the pull request:
https://github.com/apache/flink/pull/4883#discussion_r182155763
--- Diff: docs/dev/stream/state/checkpointing.md ---
@@ -118,6 +120,9 @@
env.getCheckpointConfig.setMinPauseBetweenCheckpoints(500)
// checkpoints have to complete within one minute, or are discarded
env.getCheckpointConfig.setCheckpointTimeout(60000)
+// prevent the tasks from failing if an error happens in their checkpointing, the checkpoint will just be declined.
+env.getCheckpointConfig.setFailTasksOnCheckpointingErrors(false)
--- End diff --
This added line (the `setFailTasksOnCheckpointingErrors(false)` setting and its comment) is missing from the Java tab.
> Operators should tolerate checkpoint failures
> ---------------------------------------------
>
> Key: FLINK-4809
> URL: https://issues.apache.org/jira/browse/FLINK-4809
> Project: Flink
> Issue Type: Sub-task
> Components: State Backends, Checkpointing
> Reporter: Stephan Ewen
> Assignee: Stefan Richter
> Priority: Major
> Fix For: 1.5.0
>
>
> Operators should try/catch exceptions in the synchronous and asynchronous
> parts of the checkpoint and send a {{DeclineCheckpoint}} message as a result.
> The decline message should have the failure cause attached to it.
> The checkpoint barrier should be sent anyway, as a first step before
> attempting to make a state checkpoint, to make sure that downstream operators
> do not block in alignment.
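
The behaviour described in the issue can be sketched roughly as follows. This is a minimal, self-contained illustration and not Flink's actual implementation: `DeclineCheckpointSketch`, `CheckpointResponse`, `StateSnapshotter`, and `performCheckpoint` are hypothetical names invented for this example, and the downstream channel is modelled as a plain list of strings. It shows the ordering the issue asks for: the barrier is forwarded first, then the snapshot is attempted inside a try/catch, and a failure is turned into a decline response that carries the cause.

```java
import java.util.ArrayList;
import java.util.List;

public class DeclineCheckpointSketch {

    /** Hypothetical stand-in for the acknowledge/decline message sent to the coordinator. */
    static final class CheckpointResponse {
        final long checkpointId;
        final boolean declined;
        final Throwable cause; // null when the checkpoint was acknowledged

        CheckpointResponse(long checkpointId, boolean declined, Throwable cause) {
            this.checkpointId = checkpointId;
            this.declined = declined;
            this.cause = cause;
        }
    }

    /** Hypothetical snapshot hook; it may throw while taking the state snapshot. */
    interface StateSnapshotter {
        void snapshot(long checkpointId) throws Exception;
    }

    /**
     * Forwards the barrier first, then attempts the snapshot.
     * A snapshot failure is reported as a decline with the cause attached
     * instead of being rethrown and failing the task.
     */
    static CheckpointResponse performCheckpoint(
            long checkpointId, StateSnapshotter snapshotter, List<String> downstream) {
        // Step 1: send the barrier unconditionally so downstream alignment is not blocked.
        downstream.add("barrier-" + checkpointId);
        try {
            // Step 2: attempt the state snapshot.
            snapshotter.snapshot(checkpointId);
            return new CheckpointResponse(checkpointId, false, null);
        } catch (Exception e) {
            // Step 3: decline the checkpoint and keep the failure cause.
            return new CheckpointResponse(checkpointId, true, e);
        }
    }

    public static void main(String[] args) {
        List<String> downstream = new ArrayList<>();
        CheckpointResponse response = performCheckpoint(
                7L,
                id -> { throw new IllegalStateException("disk full"); },
                downstream);
        System.out.println("declined=" + response.declined
                + " cause=" + response.cause.getMessage()
                + " barriers=" + downstream);
    }
}
```

Note that the sketch covers only a single synchronous snapshot call; the asynchronous part mentioned in the issue would need the same try/catch treatment wherever it runs.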