[ https://issues.apache.org/jira/browse/FLINK-14653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969962#comment-16969962 ]
Maximilian Michels commented on FLINK-14653:
--------------------------------------------
I'm aware of that. However, there is a difference between checkpointing-related
failures (e.g. slow tasks, a checkpoint failing to write to disk, etc.) and
application errors which can occur in {{snapshotState}}. I'm OK with tolerating
checkpointing failures, but not OK with sacrificing the correctness of my Flink
job. I'm linking an issue from Beam: BEAM-8566.
> Job-related errors in snapshotState do not result in job failure
> ----------------------------------------------------------------
>
> Key: FLINK-14653
> URL: https://issues.apache.org/jira/browse/FLINK-14653
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Checkpointing
> Reporter: Maximilian Michels
> Priority: Minor
>
> When users override {{snapshotState}}, they might include logic there which
> is crucial for the correctness of their application, e.g. finalizing a
> transaction and buffering the results of that transaction, or flushing events
> to an external store. Exceptions occurring there should cause the job to fail.
> Currently, users must make sure to throw a {{Throwable}} that is not an
> {{Exception}}, because any {{Exception}} will be caught by the task and
> reported as a checkpointing error, even when it is an application error.
> It would be helpful to update the documentation and introduce a special
> exception that can be thrown for job-related failures, e.g.
> {{ApplicationError}} or similar.
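
The behavior described in the issue can be sketched with a small standalone
model. This is not Flink code: {{runCheckpoint}}, {{Snapshottable}} and the
{{ApplicationError}} class below are hypothetical names chosen for
illustration, and the {{catch (Exception e)}} block only approximates what the
issue says the task does. The sketch assumes {{ApplicationError}} extends
{{java.lang.Error}}, so it is a {{Throwable}} that a {{catch (Exception e)}}
does not intercept.

```java
public class SnapshotErrorDemo {

    /** Hypothetical marker for job-related failures. Extends Error, so catch (Exception) misses it. */
    public static class ApplicationError extends Error {
        public ApplicationError(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Hypothetical stand-in for a user function's snapshotState hook. */
    public interface Snapshottable {
        void snapshotState() throws Exception;
    }

    /** Simplified model of the task: any Exception is downgraded to a declined checkpoint. */
    public static String runCheckpoint(Snapshottable fn) {
        try {
            fn.snapshotState();
            return "checkpoint completed";
        } catch (Exception e) {
            // The job keeps running; the failure is only reported as a checkpointing error.
            return "checkpoint declined: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // A plain Exception is swallowed and reported as a checkpointing error.
        System.out.println(runCheckpoint(() -> {
            throw new IllegalStateException("flush failed");
        }));

        // Wrapping the same failure in ApplicationError lets it escape the catch block.
        try {
            runCheckpoint(() -> {
                try {
                    throw new IllegalStateException("flush failed");
                } catch (Exception e) {
                    throw new ApplicationError("transaction could not be finalized", e);
                }
            });
        } catch (ApplicationError e) {
            System.out.println("job failed: " + e.getMessage());
        }
    }
}
```

In this model the first call returns a "declined" result while the caller
carries on, whereas the second propagates out of {{runCheckpoint}}. That is the
distinction the issue asks to make explicit with a dedicated exception type.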
--
This message was sent by Atlassian Jira
(v8.3.4#803005)