[jira] [Commented] (FLINK-31685) Checkpoint job folder not deleted after job is cancelled
[ https://issues.apache.org/jira/browse/FLINK-31685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762597#comment-17762597 ]

Hangxiang Yu commented on FLINK-31685:
--------------------------------------

[~Zakelly] This makes sense to me. _FsCompletedCheckpointStorageLocation_ has no global view of the checkpoint directory. +1 for deleting the directory only when we know all checkpoint files are deleted.

> Checkpoint job folder not deleted after job is cancelled
>
>
> Key: FLINK-31685
> URL: https://issues.apache.org/jira/browse/FLINK-31685
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Checkpointing
> Affects Versions: 1.16.1
> Reporter: Sergio Sainz
> Priority: Major
>
> When a Flink job is being checkpointed and the job is then cancelled, the
> checkpoint is indeed deleted (as per
> {{execution.checkpointing.externalized-checkpoint-retention:
> DELETE_ON_CANCELLATION}}), but the job-id folder still remains:
>
> [sergio@flink-cluster-54f7fc7c6-k6km8 JobCheckpoints]$ ls
> 01eff17aa2910484b5aeb644bc531172 3a59309ef018541fc0c20856d0d89855
> 78ff2344dd7ef89f9fbcc9789fc0cd79 a6fd7cec89c0af78c3353d4a46a7d273
> dbc957868c08ebeb100d708bbd057593
> 04ff0abb9e860fc85f0e39d722367c3c 3e09166341615b1b4786efd6745a05d6
> 79efc000aa29522f0a9598661f485f67 a8c42bfe158abd78ebcb4adb135de61f
> dc8e04b02c9d8a1bc04b21d2c8f21f74
> 05f48019475de40230900230c63cfe89 3f9fb467c9af91ef41d527fe92f9b590
> 7a6ad7407d7120eda635d71cd843916a a8db748c1d329407405387ac82040be4
> dfb2df1c25056e920d41c94b659dcdab
> 09d30bc0ff786994a6a3bb06abd3 455525b76a1c6826b6eaebd5649c5b6b
> 7b1458424496baaf3d020e9fece525a4 aa2ef9587b2e9c123744e8940a66a287
> All folders in the above list, like {{01eff17aa2910484b5aeb644bc531172}},
> are empty.
>
> *Expected behaviour:*
> The job-id folder should also be deleted.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-31685) Checkpoint job folder not deleted after job is cancelled
[ https://issues.apache.org/jira/browse/FLINK-31685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17761487#comment-17761487 ]

Zakelly Lan commented on FLINK-31685:
-------------------------------------

[~masteryhx] I think there are two issues within this problem:
# Users may not need the job-id directory at all (dropping it would simplify the checkpoint directory layout, especially in CLAIM mode). I will create another ticket to address this.
# The job-id directory should be deleted once all the checkpoint files are deleted.

Unlike [~Wencong Liu], I think it is the {{CompletedCheckpointStore}}'s responsibility to delete the job-id directory, since it has the global view of whether the directory is still needed by any other checkpoint. WDYT?
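The second point above can be sketched as follows. This is only a minimal illustration using plain {{java.nio.file}}, not Flink code: the class {{JobDirectoryCleanupSketch}} and its methods are hypothetical stand-ins for a component that, like the {{CompletedCheckpointStore}}, knows every retained checkpoint of a job. The {{chk-<id>}} subdirectory naming and the use of a local file system are assumptions made for the example.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for a store that tracks all retained checkpoints of one job. */
class JobDirectoryCleanupSketch {
    private final Path jobDir;
    private final Set<Long> retained = new HashSet<>();

    JobDirectoryCleanupSketch(Path jobDir) {
        this.jobDir = jobDir;
    }

    /** Creates an (empty) checkpoint directory and records it as retained. */
    void register(long checkpointId) throws IOException {
        Files.createDirectories(jobDir.resolve("chk-" + checkpointId));
        retained.add(checkpointId);
    }

    /**
     * Discards one checkpoint. The job directory is removed only when no
     * checkpoint is retained any more and the directory is empty.
     *
     * @return true if the job directory was deleted by this call
     */
    boolean discard(long checkpointId) throws IOException {
        if (retained.remove(checkpointId)) {
            Files.deleteIfExists(jobDir.resolve("chk-" + checkpointId));
        }
        if (!retained.isEmpty() || !isEmptyDirectory(jobDir)) {
            return false; // something still lives under the job directory
        }
        Files.delete(jobDir);
        return true;
    }

    private static boolean isEmptyDirectory(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return false;
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            return !entries.iterator().hasNext();
        }
    }
}
```

The design point is that the decision to remove the job directory is made where the full set of retained checkpoints is known, rather than by each checkpoint location on its own.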
[jira] [Commented] (FLINK-31685) Checkpoint job folder not deleted after job is cancelled
[ https://issues.apache.org/jira/browse/FLINK-31685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17757767#comment-17757767 ]

Hangxiang Yu commented on FLINK-31685:
--------------------------------------

I just linked many related tickets. The issue is valid and many users want it resolved.

I think we could just introduce an option that controls whether the job-id directory is generated, and keep both layouts compatible.

As for the job-id layout, I think it is still useful if users want to keep some historical checkpoints in NO_CLAIM mode.

[~tangyun] WDYT?
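To make the proposed option concrete, here is a minimal sketch of how such a switch might affect the directory layout. Everything in it is hypothetical: the class {{CheckpointLayoutSketch}}, the method {{checkpointBaseDir}} and the flag {{createJobIdSubdir}} are invented for illustration and do not correspond to an existing Flink option or API.

```java
import java.nio.file.Path;

/** Hypothetical illustration of an option that toggles the job-id subdirectory. */
class CheckpointLayoutSketch {

    /**
     * Resolves the directory under which a job would write its checkpoints.
     *
     * @param root              the configured checkpoint root directory
     * @param jobId             the job id, e.g. 01eff17aa2910484b5aeb644bc531172
     * @param createJobIdSubdir hypothetical flag: true keeps the current
     *                          root/job-id layout, false writes directly under root
     */
    static Path checkpointBaseDir(Path root, String jobId, boolean createJobIdSubdir) {
        return createJobIdSubdir ? root.resolve(jobId) : root;
    }
}
```

With the flag enabled the layout stays as it is today, which would keep historical checkpoints of different jobs separated; with it disabled no per-job folder is created, so none can be left behind.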
[jira] [Commented] (FLINK-31685) Checkpoint job folder not deleted after job is cancelled
[ https://issues.apache.org/jira/browse/FLINK-31685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17707508#comment-17707508 ]

Wencong Liu commented on FLINK-31685:
-------------------------------------

Hello [~sergiosp], thanks for proposing this ticket! I think the key code path is
{code:java}
FsCompletedCheckpointStorageLocation#disposeStorageLocation
{code}
We could delete the parent folder in this method.
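A rough sketch of that idea, written against plain {{java.nio.file}} rather than Flink's own file system abstraction: the class {{ParentCleanupSketch}} and the method {{disposeAndTryDeleteParent}} are hypothetical and are not the body of {{disposeStorageLocation}}. The sketch assumes a local file system that refuses to delete a non-empty directory, and that the checkpoint directory itself is already empty.

```java
import java.io.IOException;
import java.nio.file.DirectoryNotEmptyException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical illustration of removing the parent (job-id) folder on dispose. */
class ParentCleanupSketch {

    /**
     * Deletes an already-empty checkpoint directory, then tries to delete its
     * parent. The parent survives if it still contains other entries.
     *
     * @return true if the parent directory was removed as well
     */
    static boolean disposeAndTryDeleteParent(Path checkpointDir) throws IOException {
        Files.deleteIfExists(checkpointDir);
        Path parent = checkpointDir.getParent();
        if (parent == null) {
            return false;
        }
        try {
            Files.delete(parent); // fails on the default provider if not empty
            return true;
        } catch (DirectoryNotEmptyException e) {
            return false; // other checkpoints or files still live here
        }
    }
}
```

Note that a single storage location only sees its own directory, so this approach relies on the file system to reject the delete while other checkpoints remain.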