[
https://issues.apache.org/jira/browse/FLINK-14653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16982434#comment-16982434
]
Victor Wong commented on FLINK-14653:
-------------------------------------
[~mxm], any progress on this?
I have two possible solutions; would you mind taking a look?
*Solution 1:*
Catch the exception thrown by `CheckpointedFunction#snapshotState` and rethrow it as an
*Error*, as the Beam patch did.
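A minimal sketch of this idea, not the actual Beam patch or Flink code. `SnapshotHook` is a hypothetical stand-in for `CheckpointedFunction#snapshotState`, and `runCheckpoint` is a hypothetical caller that, like the task, only catches `Exception`:

```java
// Hypothetical stand-in for CheckpointedFunction#snapshotState (not the real Flink interface).
interface SnapshotHook {
    void snapshotState() throws Exception;
}

final class RethrowAsError {
    // Runs the hook and converts any Exception into an Error,
    // so that a `catch (Exception e)` further up cannot swallow it.
    static void snapshotOrFail(SnapshotHook hook) {
        try {
            hook.snapshotState();
        } catch (Exception e) {
            throw new Error("snapshotState failed", e);
        }
    }

    // Hypothetical caller that treats every Exception as a declined checkpoint.
    // Returns true if the checkpoint was declined, false if it succeeded.
    static boolean runCheckpoint(SnapshotHook hook) {
        try {
            snapshotOrFail(hook);
            return false;
        } catch (Exception e) {
            return true; // an Error is not an Exception, so it never lands here
        }
    }
}
```

With this wrapper, a failing hook makes `runCheckpoint` propagate an `Error` instead of returning a "declined" result. The drawback is that `Error` is normally reserved for serious JVM-level problems, which is part of the motivation for Solution 2.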
*Solution 2:*
Catch the exception thrown by `CheckpointedFunction#snapshotState` and rethrow it as a
new exception type, e.g. *SnapshotStateException*. Later, catch
SnapshotStateException and do not mark the CheckpointFailureReason as
CHECKPOINT_DECLINED, so the failure is not ignored even if the user has configured
the job to tolerate checkpoint failures.
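A rough sketch of how Solution 2 could look, under stated assumptions. `SnapshotStateException` is the proposed (not yet existing) type; `FailureReason`, its `SNAPSHOT_STATE_FAILURE` value, `Snapshotter`, and `CheckpointSketch` are hypothetical names for illustration and do not mirror Flink's real classes:

```java
// Proposed exception type from this comment; it does not exist in Flink today.
final class SnapshotStateException extends Exception {
    SnapshotStateException(Throwable cause) {
        super("User snapshotState failed", cause);
    }
}

// Hypothetical, simplified stand-in for CheckpointFailureReason.
enum FailureReason { CHECKPOINT_DECLINED, SNAPSHOT_STATE_FAILURE }

// Hypothetical stand-in for CheckpointedFunction#snapshotState.
interface Snapshotter {
    void snapshotState() throws Exception;
}

final class CheckpointSketch {
    // Wraps whatever the user function throws in the dedicated exception type.
    static void callUserSnapshot(Snapshotter fn) throws SnapshotStateException {
        try {
            fn.snapshotState();
        } catch (Exception e) {
            throw new SnapshotStateException(e);
        }
    }

    // Failures from user snapshot code get their own reason instead of CHECKPOINT_DECLINED.
    static FailureReason classify(Throwable t) {
        return (t instanceof SnapshotStateException)
                ? FailureReason.SNAPSHOT_STATE_FAILURE
                : FailureReason.CHECKPOINT_DECLINED;
    }

    // Only declined checkpoints may be tolerated; anything else fails the job.
    static boolean shouldFailJob(FailureReason reason, boolean tolerateCheckpointFailures) {
        return reason != FailureReason.CHECKPOINT_DECLINED || !tolerateCheckpointFailures;
    }
}
```

In this sketch, an exception from the user's snapshot code is classified as `SNAPSHOT_STATE_FAILURE`, so `shouldFailJob` returns true even when checkpoint failures are tolerated.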
> Job-related errors in snapshotState do not result in job failure
> ----------------------------------------------------------------
>
> Key: FLINK-14653
> URL: https://issues.apache.org/jira/browse/FLINK-14653
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Checkpointing
> Reporter: Maximilian Michels
> Priority: Minor
>
> When users override {{snapshotState}}, they might include logic there which
> is crucial for the correctness of their application, e.g. finalizing a
> transaction and buffering the results of that transaction, or flushing events
> to an external store. Exceptions occurring should lead to failing the job.
> Currently, users must make sure to throw a {{Throwable}} because any
> {{Exception}} will be caught by the task and reported as checkpointing error,
> when it could be an application error.
> It would be helpful to update the documentation and introduce a special
> exception that can be thrown for job-related failures, e.g.
> {{ApplicationError}} or similar.