fredia commented on code in PR #21822:
URL: https://github.com/apache/flink/pull/21822#discussion_r1170783321
##########
flink-dstl/flink-dstl-dfs/src/main/java/org/apache/flink/changelog/fs/DuplicatingStateChangeFsUploader.java:
##########
@@ -51,14 +52,15 @@
  * <li>Store the meta of files into {@link ChangelogTaskLocalStateStore} by
  * AsyncCheckpointRunnable#reportCompletedSnapshotStates().
  * <li>Pass control of the file to {@link LocalChangelogRegistry#register} when
- * ChangelogKeyedStateBackend#notifyCheckpointComplete() , files of the previous
- * checkpoint will be deleted by {@link LocalChangelogRegistry#discardUpToCheckpoint} at
- * the same time.
+ * FsStateChangelogWriter#persist , files of the previous checkpoint will be deleted by
+ * {@link LocalChangelogRegistry#discardUpToCheckpoint} when the previous checkpoint is
+ * confirmed.
Review Comment:
The current implementation already partially alleviates the accumulation of
local files:
1. If the TM exits normally (i.e. `ChangelogTaskLocalStateStore#dispose()`
gets executed), all local files are cleaned up.
2. If `emptyDir` is used as the local disk, all files are deleted when the
pod exits.

So I think the accumulation of local dstl files is a milder problem than the
accumulation of remote dstl files.
[Pruning](https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/state/TaskStateManagerImpl.java#L210)
the other local dstl files during restore may alleviate this issue further, WDYT?
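
To make the pruning idea concrete, here is a minimal sketch of what it could look like. It is only an illustration under assumptions: `LocalDstlPruner` and `pruneUnreferenced` are hypothetical names that do not exist in Flink, and it assumes the local dstl files live directly in one directory and that the set of files referenced by the restored checkpoint is already known.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper (not a Flink class): prunes stale local changelog files on restore. */
public final class LocalDstlPruner {

    private LocalDstlPruner() {}

    /**
     * Deletes every regular file directly under {@code localDir} that is not in
     * {@code retained} (assumed to hold the files of the restored checkpoint).
     *
     * @return the paths that were selected for deletion
     */
    public static List<Path> pruneUnreferenced(Path localDir, Set<Path> retained)
            throws IOException {
        List<Path> stale;
        // Collect candidates first so that nothing is deleted while the listing is open.
        try (Stream<Path> files = Files.list(localDir)) {
            stale =
                    files.filter(Files::isRegularFile)
                            .filter(file -> !retained.contains(file))
                            .collect(Collectors.toList());
        }
        for (Path file : stale) {
            Files.deleteIfExists(file);
        }
        return stale;
    }
}
```

The paths in `retained` have to be built the same way as the listed ones (e.g. via `localDir.resolve(name)`), otherwise the `contains` check would not match and a file that is still needed would be deleted.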