ChangjiGuo created FLINK-28984:
----------------------------------
Summary: FsCheckpointStateOutputStream is not being released normally
Key: FLINK-28984
URL: https://issues.apache.org/jira/browse/FLINK-28984
Project: Flink
Issue Type: Bug
Components: Runtime / Checkpointing
Affects Versions: 1.15.1, 1.11.6
Reporter: ChangjiGuo
If a checkpoint is aborted, AsyncSnapshotCallable closes the
snapshotCloseableRegistry when it is canceled. At that point, one of two situations applies:
# The FSDataOutputStream has already been created, so it is closed as part of
closing FsCheckpointStateOutputStream.
# The FSDataOutputStream has not been created yet, but the closed flag has
already been set to true. The stream is then created after the close call, so
the close never reaches it. You can see this in the log:
{code:java}
2022-08-16 12:55:44,161 WARN org.apache.flink.core.fs.SafetyNetCloseableRegistry - Closing unclosed resource via safety-net: ClosingFSDataOutputStream(org.apache.flink.runtime.fs.hdfs.HadoopDataOutputStream@4ebe8e64) : xxxxx/flink/checkpoint/state/9214a2e302904b14baf2dc1aacbc7933/ae157c5a05a8922a46a179cdb4c86b10/shared/9d8a1e92-2f69-4ab0-8ce9-c1beb149229a
{code}
In the second case, the SafetyNetCloseableRegistry eventually closes the output
stream automatically, but the file it was writing is not deleted.
This case usually occurs when the storage system has high latency when
creating files.
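
The interleaving behind the second case can be sketched with a small, self-contained model. This is not Flink's code: FakeFs, LazyStream, and the recheckAfterCreate flag are hypothetical names introduced only for illustration, and the re-check is just one possible mitigation, not necessarily the fix adopted for this issue. The two threads are replayed sequentially so the outcome is deterministic.

```java
import java.util.HashSet;
import java.util.Set;

public class LazyStreamLeakSketch {

    /** Hypothetical in-memory stand-in for a file system; it only tracks which paths exist. */
    static final class FakeFs {
        final Set<String> files = new HashSet<>();

        void create(String path) { files.add(path); }

        void delete(String path) { files.remove(path); }
    }

    /** Simplified model of a checkpoint output stream that creates its file lazily. */
    static final class LazyStream {
        private final FakeFs fs;
        private final String path;
        private final boolean recheckAfterCreate; // assumption: optional mitigation
        private boolean closed;
        private boolean underlyingOpen; // models "the underlying stream exists and is open"

        LazyStream(FakeFs fs, String path, boolean recheckAfterCreate) {
            this.fs = fs;
            this.path = path;
            this.recheckAfterCreate = recheckAfterCreate;
        }

        /** First half of a slow create: the file exists, but the stream is not yet visible. */
        void startCreate() { fs.create(path); }

        /** Second half of the create: the stream reference becomes visible to close(). */
        void finishCreate() {
            underlyingOpen = true;
            if (recheckAfterCreate && closed) {
                releaseUnderlying(); // close() already ran, so clean up here instead
            }
        }

        /** Sets the closed flag and releases the underlying stream only if it already exists. */
        void close() {
            if (closed) {
                return;
            }
            closed = true;
            if (underlyingOpen) {
                releaseUnderlying();
            }
        }

        private void releaseUnderlying() {
            underlyingOpen = false;
            fs.delete(path);
        }
    }

    /** Replays the problematic interleaving and reports whether the file is left behind. */
    static boolean fileLeaks(boolean recheckAfterCreate) {
        FakeFs fs = new FakeFs();
        LazyStream stream = new LazyStream(fs, "shared/state-file", recheckAfterCreate);
        stream.startCreate();  // writer thread: slow create call is in flight
        stream.close();        // cancel thread: the registry closes the stream
        stream.finishCreate(); // writer thread: the create call returns
        return fs.files.contains("shared/state-file");
    }

    public static void main(String[] args) {
        System.out.println("leak without re-check: " + fileLeaks(false));
        System.out.println("leak with re-check: " + fileLeaks(true));
    }
}
```

In this model the file survives whenever close() runs between the two halves of the create call, which is more likely the longer the create takes. Re-checking the closed flag once the create returns removes the leftover file in the sketch.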