[jira] [Commented] (FLINK-4809) Operators should tolerate checkpoint failures

ASF GitHub Bot (JIRA) Mon, 20 Nov 2017 02:10:30 -0800

    [ https://issues.apache.org/jira/browse/FLINK-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16259046#comment-16259046 ]


ASF GitHub Bot commented on FLINK-4809:
---------------------------------------

Github user StefanRRichter commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4883#discussion_r151948416
  
    --- Diff: flink-core/src/main/java/org/apache/flink/api/common/ExecutionConfig.java ---
    @@ -150,6 +150,9 @@
        /** This flag defines if we use compression for the state snapshot data or not. Default: false */
        private boolean useSnapshotCompression = false;
     
    +   /** Determines if a task fails or not if there is an error in writing its checkpoint data. Default: true */
    --- End diff --
    
    After discussing this with @aljoscha, we decided to keep this because it is
    how Flink currently forwards the configuration. We add `@Internal` and
    `@Deprecated` annotations to the methods in `ExecutionConfig` so that users
    call the proper methods on `CheckpointConfig` instead.


> Operators should tolerate checkpoint failures
> ---------------------------------------------
>
>                 Key: FLINK-4809
>                 URL: https://issues.apache.org/jira/browse/FLINK-4809
>             Project: Flink
>          Issue Type: Sub-task
>          Components: State Backends, Checkpointing
>            Reporter: Stephan Ewen
>            Assignee: Stefan Richter
>             Fix For: 1.4.0
>
>
> Operators should try/catch exceptions in the synchronous and asynchronous 
> part of the checkpoint and send a {{DeclineCheckpoint}} message as a result.
> The decline message should have the failure cause attached to it.
> The checkpoint barrier should still be sent as a first step, before 
> attempting the state checkpoint, so that downstream operators 
> do not block in alignment.
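
The behaviour requested in the issue description can be sketched as follows.
This is a minimal illustration under stated assumptions, not Flink's actual
implementation: `DeclineCheckpointSketch`, `SnapshotAction`, and
`TolerantOperatorSketch` are hypothetical names, and only a single snapshot
step is modelled (the asynchronous part would be wrapped in the same
try/catch pattern). The barrier is forwarded first, then the snapshot is
attempted, and a failure produces a decline message carrying the cause
instead of failing the task.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the decline message; not Flink's actual class. */
final class DeclineCheckpointSketch {
    final long checkpointId;
    final Throwable cause;

    DeclineCheckpointSketch(long checkpointId, Throwable cause) {
        this.checkpointId = checkpointId;
        this.cause = cause;
    }
}

/** Hypothetical snapshot step that is allowed to throw. */
interface SnapshotAction {
    void snapshot(long checkpointId) throws Exception;
}

/** Hypothetical operator that tolerates snapshot failures. */
final class TolerantOperatorSketch {
    /** Ordered log of what the operator emitted, for inspection. */
    final List<String> events = new ArrayList<>();
    /** Decline messages that would be sent to the checkpoint coordinator. */
    final List<DeclineCheckpointSketch> declines = new ArrayList<>();

    /** Returns true if the snapshot succeeded, false if it was declined. */
    boolean performCheckpoint(long checkpointId, SnapshotAction action) {
        // Step 1: forward the barrier before touching state, so downstream
        // operators are not left waiting in alignment if the snapshot fails.
        events.add("barrier:" + checkpointId);
        try {
            // Step 2: attempt the state snapshot.
            action.snapshot(checkpointId);
            events.add("ack:" + checkpointId);
            return true;
        } catch (Exception e) {
            // Step 3: decline instead of failing, attaching the failure cause.
            declines.add(new DeclineCheckpointSketch(checkpointId, e));
            events.add("decline:" + checkpointId);
            return false;
        }
    }
}
```

Recording the barrier before entering the try block is the point of the
sketch: the barrier goes out whether or not the snapshot later throws.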



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
