curcur commented on a change in pull request #18964:
URL: https://github.com/apache/flink/pull/18964#discussion_r820493182
##########
File path:
flink-state-backends/flink-statebackend-changelog/src/main/java/org/apache/flink/state/changelog/PeriodicMaterializationManager.java
##########
@@ -166,6 +167,10 @@ private void asyncMaterializationPhase(
subtaskName,
upTo);
+ scheduleNextMaterialization();
+ } else if (throwable instanceof CancellationException) {
+ // likely due to task cancellation or abortion notification
+ LOG.info("materialization cancelled", throwable);
scheduleNextMaterialization();
} else {
Review comment:
> > If a task gets a CancellationException, shouldn't it fail the whole job?
>
> No, task cancellation doesn't mean job failure.
This is true.
> > Why checkpoint abortion notification can pass CancellationException to
> > the part of materialization? Materialization should be independent of
> > Checkpointing.
>
> Currently, the issue happens only because of task cancellation; abortion
> notification can not reach the nested backend and therefore materializer. But
> with [FLINK-25850](https://issues.apache.org/jira/browse/FLINK-25850) it will
> be possible. So I added this comment and decided not to react to
> `CancellationException` (e.g. by stopping the materializer).
Shouldn't checkpointing be independent of materialization? Checkpoint abortion
should not affect materialization. That is the main purpose of separating
materialization from the checkpointing procedure.
I cannot agree with this part.
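
To make the behavior under discussion concrete, here is a minimal standalone sketch of the branching in the hunk above. It is not the PR's code: the class `MaterializationOutcomeSketch`, the `Outcome` enum, and both method names are hypothetical, and since the final `else` branch is cut off in the hunk, treating every other throwable as a failure that does not reschedule is an assumption.

```java
import java.util.concurrent.CancellationException;

// Hypothetical helper, for illustration only; none of these names exist in Flink.
class MaterializationOutcomeSketch {

    enum Outcome { COMPLETED, CANCELLED, FAILED }

    // Mirrors the three branches of the callback in the hunk:
    // no throwable, a CancellationException, or anything else.
    static Outcome classify(Throwable throwable) {
        if (throwable == null) {
            return Outcome.COMPLETED;
        } else if (throwable instanceof CancellationException) {
            // In the hunk this case is only logged; the materializer keeps running.
            return Outcome.CANCELLED;
        } else {
            // Assumption: the truncated else branch handles this as a failure.
            return Outcome.FAILED;
        }
    }

    // In the hunk, both the success branch and the cancellation branch
    // call scheduleNextMaterialization().
    static boolean shouldScheduleNext(Outcome outcome) {
        return outcome != Outcome.FAILED;
    }
}
```

With this shape, a cancellation is logged and the next materialization is still scheduled, which is the part I am questioning for the checkpoint-abortion case.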