[
https://issues.apache.org/jira/browse/FLINK-15010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006448#comment-17006448
]
Yun Gao commented on FLINK-15010:
---------------------------------
The cause of this issue appears to be that, in standalone mode, TaskManagers
are shut down by a SIGTERM signal and the cleanup of directories relies on
shutdown hooks, but no shutdown hook is registered for the netty shuffle
environment.
An intuitive fix is to add a shutdown hook directly to
_NettyShuffleEnvironment_. However, that cannot ensure the directories get
cleaned up in all cases, because the directories are created in the constructor
of _FileChannelManagerImpl_, which runs before the shutdown hook is registered
in the _NettyShuffleEnvironment_ constructor. If a TaskManager receives
SIGTERM between the two actions, the directories will not be cleaned up.
Therefore, the current PR enhances _FileChannelManagerImpl_ by allowing
callers to specify whether to register a shutdown hook for the manager, and the
hook is registered before the directories are created.
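To illustrate the ordering, here is a minimal sketch in plain Java. It is not the code from the PR: the class _ShutdownHookTempDir_ and its methods are hypothetical stand-ins for _FileChannelManagerImpl_, and only standard JDK calls are used. The point is that the hook is registered first and tolerates a directory that does not exist yet, so a SIGTERM arriving between the two steps cannot leave a directory behind.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

/** Hypothetical stand-in for FileChannelManagerImpl; not Flink's actual code. */
class ShutdownHookTempDir implements AutoCloseable {

    private final Thread shutdownHook;

    // Stays null until the directory exists; read by the hook thread.
    private volatile Path dir;

    ShutdownHookTempDir(Path parent, String prefix) throws IOException {
        // Step 1: register the cleanup hook BEFORE anything is created on disk.
        shutdownHook = new Thread(this::deleteQuietly, "temp-dir-cleanup");
        Runtime.getRuntime().addShutdownHook(shutdownHook);

        // Step 2: only now create the directory the hook is responsible for.
        dir = Files.createTempDirectory(parent, prefix);
    }

    Path getDir() {
        return dir;
    }

    /** Deletes the directory tree if it exists; safe to call more than once. */
    private void deleteQuietly() {
        Path d = dir;
        if (d == null) {
            return; // hook ran before the directory was created
        }
        try (Stream<Path> paths = Files.walk(d)) {
            // Reverse order so children are deleted before their parents.
            paths.sorted(Comparator.reverseOrder()).forEach(p -> p.toFile().delete());
        } catch (IOException | UncheckedIOException ignored) {
            // Best-effort cleanup: the directory may already be gone.
        }
    }

    /** Regular shutdown path: clean up and unregister the hook. */
    @Override
    public void close() {
        deleteQuietly();
        try {
            Runtime.getRuntime().removeShutdownHook(shutdownHook);
        } catch (IllegalStateException ignored) {
            // The JVM is already shutting down; the hook cannot be removed.
        }
    }
}
```

A caller would create the manager with something like _new ShutdownHookTempDir(parent, "flink-netty-shuffle-")_ and call _close()_ on the normal shutdown path; the hook only matters when the process is terminated by a signal first.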
The same issue also exists for the existing _FileChannelManagerImpl_ usage in
_IOManager_. If the current fix is acceptable, we might also fix the
_IOManager_ case in a similar way.
> Temp directories flink-netty-shuffle-* are not cleaned up
> ---------------------------------------------------------
>
> Key: FLINK-15010
> URL: https://issues.apache.org/jira/browse/FLINK-15010
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Network
> Affects Versions: 1.9.1
> Reporter: Nico Kruber
> Assignee: Yun Gao
> Priority: Major
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Starting a Flink cluster with 2 TMs and stopping it again will leave 2
> temporary directories (and not delete them): flink-netty-shuffle-<uid>