[ 
https://issues.apache.org/jira/browse/FLINK-14653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16982434#comment-16982434
 ] 

Victor Wong commented on FLINK-14653:
-------------------------------------

[~mxm], any progress on this?

I have two possible solutions; would you mind taking a look?

 

*Solution 1:*

Catch any exception thrown by `CheckpointedFunction#snapshotState` and rethrow 
it as an *Error*, as the Beam patch did.
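
A rough sketch of what I mean by Solution 1. This is not Flink code: 
`SnapshotFunction`, `SnapshotErrorWrapper` and `runCheckpoint` are made-up 
stand-ins for `CheckpointedFunction#snapshotState` and the task code that 
currently catches `Exception`.

```java
// Hypothetical stand-in for CheckpointedFunction#snapshotState (not the real Flink interface).
interface SnapshotFunction {
    void snapshotState() throws Exception;
}

final class SnapshotErrorWrapper {

    // Runs the user's snapshot logic and rethrows any Exception as an Error,
    // so a caller that only catches Exception cannot swallow it.
    static void snapshotOrFail(SnapshotFunction fn) {
        try {
            fn.snapshotState();
        } catch (Exception e) {
            throw new Error("snapshotState failed", e);
        }
    }

    // Mimics a caller that treats every Exception as a declined checkpoint.
    // Returns true if the snapshot completed, false if it was declined.
    static boolean runCheckpoint(SnapshotFunction fn) {
        try {
            snapshotOrFail(fn);
            return true;
        } catch (Exception e) {
            // Never reached for user failures, because they arrive as Error.
            return false;
        }
    }
}
```

With this, a failing `snapshotState` propagates out of `runCheckpoint` as an 
`Error` instead of being reported as a declined checkpoint.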

 

*Solution 2:*

Catch any exception thrown by `CheckpointedFunction#snapshotState` and rethrow 
it as a new exception type, e.g. *SnapshotStateException*. Later, catch 
SnapshotStateException and do not mark the CheckpointFailureReason as 
CHECKPOINT_DECLINED, so the failure is not ignored even if the user has 
configured the job to tolerate checkpoint failures.
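
A rough sketch of Solution 2, again not real Flink code. 
`SnapshotStateException` is the proposed (not yet existing) type; `Reason` is 
a stand-in for `CheckpointFailureReason`, and its `JOB_FAILURE` value, 
`UserSnapshot` and `SnapshotClassifier` are invented for illustration only.

```java
// Proposed exception type from Solution 2; it does not exist in Flink today.
class SnapshotStateException extends Exception {
    SnapshotStateException(Throwable cause) {
        super("snapshotState failed", cause);
    }
}

// Stand-in for CheckpointFailureReason; JOB_FAILURE is an invented value.
enum Reason { NONE, CHECKPOINT_DECLINED, JOB_FAILURE }

// Hypothetical stand-in for CheckpointedFunction#snapshotState.
interface UserSnapshot {
    void snapshotState() throws Exception;
}

final class SnapshotClassifier {

    // Wraps any exception from user snapshot code in the dedicated type.
    static void callUserSnapshot(UserSnapshot fn) throws SnapshotStateException {
        try {
            fn.snapshotState();
        } catch (Exception e) {
            throw new SnapshotStateException(e);
        }
    }

    // Classifies the outcome: user failures are kept apart from other
    // checkpointing failures so they are not tolerated as "declined".
    static Reason checkpoint(UserSnapshot fn, Runnable frameworkStep) {
        try {
            callUserSnapshot(fn);
            frameworkStep.run();
            return Reason.NONE;
        } catch (SnapshotStateException e) {
            return Reason.JOB_FAILURE;
        } catch (Exception e) {
            return Reason.CHECKPOINT_DECLINED;
        }
    }
}
```

The point is only the separation of the two catch branches; how the 
non-declined reason is then turned into a job failure is left open here.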

> Job-related errors in snapshotState do not result in job failure
> ----------------------------------------------------------------
>
>                 Key: FLINK-14653
>                 URL: https://issues.apache.org/jira/browse/FLINK-14653
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing
>            Reporter: Maximilian Michels
>            Priority: Minor
>
> When users override {{snapshotState}}, they might include logic there which 
> is crucial for the correctness of their application, e.g. finalizing a 
> transaction and buffering the results of that transaction, or flushing events 
> to an external store. Exceptions occurring should lead to failing the job.
> Currently, users must make sure to throw a {{Throwable}} because any 
> {{Exception}} will be caught by the task and reported as checkpointing error, 
> when it could be an application error.
> It would be helpful to update the documentation and introduce a special 
> exception that can be thrown for job-related failures, e.g. 
> {{ApplicationError}} or similar.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
