fredia commented on PR #20420:
URL: https://github.com/apache/flink/pull/20420#issuecomment-1202404441

   Thanks a lot for reviewing.
   > I still not get the idea why the test is unstable. Could you describe it 
more clearly?
   
   According to the stack traces in 
[FLINK-27162](https://issues.apache.org/jira/browse/FLINK-27162) and 
[FLINK-28626](https://issues.apache.org/jira/browse/FLINK-28626), the test 
fails with a `FileNotFoundException`. There are two causes:
   1. `triggerCheckpoint()` is unstable: the checkpoint can fail with 
"checkpoint expired before completing"; see the exception stack trace in 
[FLINK-28529](https://issues.apache.org/jira/browse/FLINK-28529). 
   2. The checkpoint returned by `getMostRecentCompletedCheckpoint` may be 
incomplete when unaligned checkpoints are enabled. We can see that every 
`FileNotFoundException` is thrown while reading files under `taskowned`.
   
   I changed the test to obtain checkpoints by waiting for 
`notifyCheckpointComplete()` instead of calling `triggerCheckpoint()`.
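   
   Roughly, the idea of waiting for the completion callback can be sketched as 
below. This is not the code from the PR, only a minimal illustration: 
`CheckpointListenerLike` is a local stand-in for Flink's `CheckpointListener` 
callback so the sketch compiles without Flink on the classpath, and 
`FirstCompletedCheckpoint`, `await`, and the simulated runtime thread are 
hypothetical names introduced for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class CheckpointWaitSketch {

    /** Local stand-in (assumption) for Flink's checkpoint-complete callback. */
    interface CheckpointListenerLike {
        void notifyCheckpointComplete(long checkpointId) throws Exception;
    }

    /** Hypothetical helper: remembers the ID of the first checkpoint reported as complete. */
    static final class FirstCompletedCheckpoint implements CheckpointListenerLike {
        private final CompletableFuture<Long> firstCompleted = new CompletableFuture<>();

        @Override
        public void notifyCheckpointComplete(long checkpointId) {
            // complete() is a no-op if the future already holds a value,
            // so only the first notification is kept.
            firstCompleted.complete(checkpointId);
        }

        /** Blocks until a checkpoint has been reported complete, or the timeout elapses. */
        long await(long timeout, TimeUnit unit) throws Exception {
            return firstCompleted.get(timeout, unit);
        }
    }

    public static void main(String[] args) throws Exception {
        FirstCompletedCheckpoint listener = new FirstCompletedCheckpoint();

        // Simulates the runtime delivering the callback from another thread.
        Thread runtime = new Thread(() -> listener.notifyCheckpointComplete(7L));
        runtime.start();

        // The test thread waits for the callback instead of triggering a checkpoint itself.
        long checkpointId = listener.await(5, TimeUnit.SECONDS);
        System.out.println("restore from checkpoint " + checkpointId);
        runtime.join();
    }
}
```

   The point of the design is that the test only proceeds once a checkpoint has 
been reported as complete, rather than depending on a manually triggered 
checkpoint finishing before it expires.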


