[ 
https://issues.apache.org/jira/browse/FLINK-28984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ChangjiGuo updated FLINK-28984:
-------------------------------
    Description: 
If the checkpoint is aborted, AsyncSnapshotCallable closes the 
snapshotCloseableRegistry when it is canceled. There are two possible situations:
 # The FSDataOutputStream has already been created, and it is closed when 
FsCheckpointStateOutputStream is closed.
 # The FSDataOutputStream has not been created yet, but the closed flag has 
already been set to true. You can see this in the log:
{code:java}
2022-08-16 12:55:44,161 WARN  org.apache.flink.core.fs.SafetyNetCloseableRegistry           - Closing unclosed resource via safety-net: ClosingFSDataOutputStream(org.apache.flink.runtime.fs.hdfs.HadoopDataOutputStream@4ebe8e64) : xxxxx/flink/checkpoint/state/9214a2e302904b14baf2dc1aacbc7933/ae157c5a05a8922a46a179cdb4c86b10/shared/9d8a1e92-2f69-4ab0-8ce9-c1beb149229a
 {code}

In the second situation, the output stream is closed automatically by the 
SafetyNetCloseableRegistry, but the file is not deleted.

The second case usually occurs when the storage system has high latency in 
creating files.
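
To make the second situation concrete, here is a minimal sketch that replays the interleaving sequentially. It is only a model under stated assumptions: LazyStream, passesClosedCheck, finishCreate, safetyNetClose and the recheckAfterCreate switch are hypothetical names, not Flink's actual classes or a proposed patch, and a local temp directory stands in for the remote storage system.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical model of the race; none of these names are Flink's actual classes.
final class LazyCheckpointStreamSketch {

    /** Stream whose backing file is created lazily, on the first flush. */
    static final class LazyStream {
        private final Path target;
        private final boolean recheckAfterCreate; // assumed mitigation, toggled for comparison
        private boolean closed;
        private OutputStream out;

        LazyStream(Path target, boolean recheckAfterCreate) {
            this.target = target;
            this.recheckAfterCreate = recheckAfterCreate;
        }

        /** Abort path: only cleans up if the underlying stream already exists. */
        void close() throws IOException {
            closed = true;
            if (out != null) {
                out.close();
                out = null;
                Files.deleteIfExists(target);
            }
        }

        /** First half of a flush: the closed check that runs before the slow create call. */
        boolean passesClosedCheck() {
            return !closed;
        }

        /** Second half of a flush: the create call returns, possibly after close() has run. */
        void finishCreate() throws IOException {
            out = Files.newOutputStream(target); // the file now exists on disk
            if (recheckAfterCreate && closed) {
                out.close();
                out = null;
                Files.deleteIfExists(target);
            }
        }

        /** Models the safety net: closes the leftover stream but does not delete the file. */
        void safetyNetClose() throws IOException {
            if (out != null) {
                out.close();
                out = null;
            }
        }
    }

    /** Replays "abort arrives while the file is being created"; returns true if the file is left behind. */
    static boolean fileLeaksAfterAbort(boolean recheckAfterCreate) throws IOException {
        Path dir = Files.createTempDirectory("checkpoint-sketch");
        Path target = dir.resolve("shared-state");
        LazyStream stream = new LazyStream(target, recheckAfterCreate);

        boolean proceed = stream.passesClosedCheck(); // flush sees closed == false
        stream.close();                               // abort: flag set, nothing to clean up yet
        if (proceed) {
            stream.finishCreate();                    // slow create call finally returns
        }
        stream.safetyNetClose();

        boolean leaked = Files.exists(target);
        Files.deleteIfExists(target);                 // tidy up the temp files of this demo
        Files.deleteIfExists(dir);
        return leaked;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("without re-check, file leaked: " + fileLeaksAfterAbort(false));
        System.out.println("with re-check, file leaked: " + fileLeaksAfterAbort(true));
    }
}
```

In this model the first call reports true and the second reports false: re-checking the closed flag once the create call returns is one possible way to avoid the leftover file, though the real fix may look different.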

 


> FsCheckpointStateOutputStream is not being released normally
> ------------------------------------------------------------
>
>                 Key: FLINK-28984
>                 URL: https://issues.apache.org/jira/browse/FLINK-28984
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing
>    Affects Versions: 1.11.6, 1.15.1
>            Reporter: ChangjiGuo
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
