XComp edited a comment on pull request #18963:
URL: https://github.com/apache/flink/pull/18963#issuecomment-1058378672
I had to revisit the issue because I noticed that the contract of the `FileSystem.delete` method is unclear for the case where the underlying file doesn't exist. `LocalFileSystem` implements `delete` in a way that it returns `false` if it didn't delete the file, since it relies on `java.io.File.delete`. This was probably why [this build](https://dev.azure.com/mapohl/flink/_build/results?buildId=808&view=logs&j=d63a5fc4-24ea-51df-9ade-fa4330af161c&t=977479f1-49ea-5c4c-884c-4646ed1443ab) failed in the e2e tests:
```
2022-03-03 14:30:11,092 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Triggering checkpoint 11 (type=CheckpointType{name='Checkpoint', sharingFilesStrategy=FORWARD_BACKWARD}) @ 1646317811091 for job b570100734a17ad72d8d2ccc712f681d.
2022-03-03 14:30:11,215 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - Triggering stop-with-savepoint for job b570100734a17ad72d8d2ccc712f681d.
2022-03-03 14:30:11,232 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Triggering checkpoint 12 (type=SavepointType{name='Suspend Savepoint', postCheckpointAction=SUSPEND, formatType=CANONICAL}) @ 1646317811228 for job b570100734a17ad72d8d2ccc712f681d.
2022-03-03 14:30:11,259 WARN org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Received late message for now expired checkpoint attempt 11 from task 275909f41c4e9d1635d1c3d3c1f55b4c of job b570100734a17ad72d8d2ccc712f681d at 127.0.0.1:34655-d7bf22 @ localhost (dataPort=37055).
[...]
2022-03-03 14:30:11,282 WARN org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Received late message for now expired checkpoint attempt 11 from task f827493a1120315cebf2c38987fb2709 of job b570100734a17ad72d8d2ccc712f681d at 127.0.0.1:34655-d7bf22 @ localhost (dataPort=37055).
2022-03-03 14:30:11,288 WARN org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Received late message for now expired checkpoint attempt 11 from task bb54c8be2cceb115193c02f53ce3cf3e of job b570100734a17ad72d8d2ccc712f681d at 127.0.0.1:34655-d7bf22 @ localhost (dataPort=37055).
2022-03-03 14:30:11,282 WARN org.apache.flink.runtime.checkpoint.OperatorSubtaskState [] - Error while discarding operator states.
java.io.IOException: /home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-47072687872/savepoint-e2e-test-chckpt-dir/b570100734a17ad72d8d2ccc712f681d/chk-11/73833c1e-bc28-4d68-8752-496d0ea65e8b could not be deleted for unknown reasons.
    at org.apache.flink.runtime.state.filesystem.FileStateHandle.discardState(FileStateHandle.java:86) ~[flink-dist-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
    at org.apache.flink.runtime.state.KeyGroupsStateHandle.discardState(KeyGroupsStateHandle.java:125) ~[flink-dist-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
    at org.apache.flink.util.LambdaUtil.applyToAllWhileSuppressingExceptions(LambdaUtil.java:55) ~[flink-dist-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
    at org.apache.flink.runtime.state.StateUtil.bestEffortDiscardAllStateObjects(StateUtil.java:62) ~[flink-dist-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
    at org.apache.flink.runtime.checkpoint.OperatorSubtaskState.discardState(OperatorSubtaskState.java:211) ~[flink-dist-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
    at org.apache.flink.util.LambdaUtil.applyToAllWhileSuppressingExceptions(LambdaUtil.java:55) [flink-dist-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
    at org.apache.flink.runtime.state.StateUtil.bestEffortDiscardAllStateObjects(StateUtil.java:62) [flink-dist-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
    at org.apache.flink.runtime.checkpoint.TaskStateSnapshot.discardState(TaskStateSnapshot.java:156) [flink-dist-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
    at org.apache.flink.runtime.checkpoint.CheckpointCoordinator$1.run(CheckpointCoordinator.java:2007) [flink-dist-1.15-SNAPSHOT.jar:1.15-SNAPSHOT]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_322]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_322]
    at java.lang.Thread.run(Thread.java:750) [?:1.8.0_322]
```
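To illustrate the ambiguity described above, here is a minimal sketch that uses only the JDK, not Flink code. It shows that `java.io.File.delete` returns `false` when the file is already gone, so a caller that treats `false` as a failure reports an error even though the desired end state (no file) has been reached. The class name `DeleteSemanticsSketch` and the helper `discardIdempotently` are hypothetical names chosen for illustration; this is one possible way to handle the case, not necessarily the change made in this PR.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class DeleteSemanticsSketch {

    /**
     * Hypothetical helper: treats "file is already gone" as success and only
     * fails if the file still exists after the delete attempt.
     */
    static void discardIdempotently(File file) throws IOException {
        if (!file.delete() && file.exists()) {
            throw new IOException(file + " could not be deleted.");
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("chk-sketch", ".state");
        File file = tmp.toFile();

        // First delete removes the file and returns true.
        System.out.println(file.delete());

        // Second delete finds nothing to remove and returns false,
        // which is indistinguishable from a real failure by return value alone.
        System.out.println(file.delete());

        // The helper does not throw here, because the file no longer exists.
        discardIdempotently(file);

        // java.nio alternative: returns false for a missing file and
        // throws IOException only for actual I/O errors.
        System.out.println(Files.deleteIfExists(tmp));
    }
}
```

Run as-is, this prints `true`, `false`, `false`. The point of the sketch is that the second and third results are not errors, which is the case the `FileSystem.delete` contract leaves open.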
