fredia commented on PR #20420: URL: https://github.com/apache/flink/pull/20420#issuecomment-1202404441
Thanks a lot for reviewing.

> I still not get the idea why the test is unstable. Could you describe it more clearly?

According to the stack traces in [FLINK-27162](https://issues.apache.org/jira/browse/FLINK-27162) and [FLINK-28626](https://issues.apache.org/jira/browse/FLINK-28626), the test fails with a `FileNotFoundException`. There are two causes:

1. `triggerCheckpoint()` is unstable: it can fail with "checkpoint expired before completing"; see the exception stack trace in [FLINK-28529](https://issues.apache.org/jira/browse/FLINK-28529).
2. The checkpoint returned by `getMostRecentCompletedCheckpoint` may be incomplete when unaligned checkpoints are enabled. All of the `FileNotFoundException`s are thrown while reading files under `taskowned`.

To avoid calling `triggerCheckpoint()`, I changed the test to obtain checkpoints by waiting for `notifyCheckpointComplete()`.
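As a rough illustration of waiting for `notifyCheckpointComplete()`, here is a minimal standalone sketch. It is not the code from this PR: `CompletedCheckpointWaiter` and `awaitCompletedCheckpoint` are hypothetical names, and the class only mirrors the shape of the `notifyCheckpointComplete(long)` callback without depending on Flink.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical test helper (assumption, not part of Flink or of this PR). */
final class CompletedCheckpointWaiter {

    // Completed with the ID of the first checkpoint that is reported as complete.
    private final CompletableFuture<Long> firstCompleted = new CompletableFuture<>();

    /** Mirrors the shape of the notifyCheckpointComplete(long) callback. */
    void notifyCheckpointComplete(long checkpointId) {
        // complete() has no effect after the first call, so later checkpoints are ignored.
        firstCompleted.complete(checkpointId);
    }

    /** Blocks until a checkpoint has been confirmed; throws TimeoutException otherwise. */
    long awaitCompletedCheckpoint(long timeout, TimeUnit unit) throws Exception {
        return firstCompleted.get(timeout, unit);
    }
}
```

The test thread would call `awaitCompletedCheckpoint(...)` and only then read the checkpoint files, so it never inspects a checkpoint that has not been confirmed as complete.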
