Sejin Hwang created HADOOP-18639:
------------------------------------
Summary: DockerContainerDeletionTask is not removed from the
Nodemanager's statestore when the task is completed.
Key: HADOOP-18639
URL: https://issues.apache.org/jira/browse/HADOOP-18639
Project: Hadoop Common
Issue Type: Bug
Reporter: Sejin Hwang
YARN NodeManager has two types of deletion tasks: the FileDeletionTask for
deleting log files and the DockerContainerDeletionTask for deleting Docker
containers.
The FileDeletionTask is removed from the statestore when the task is completed,
but the DockerContainerDeletionTask is not.
As a result, completed DockerContainerDeletionTask entries accumulate in the
statestore indefinitely.
When the NodeManager restarts, its deletion service runs every accumulated
DockerContainerDeletionTask found in the statestore.
New FileDeletionTask and DockerContainerDeletionTask work is delayed
unnecessarily while this backlog is processed, which can fill the disks in
environments where a large number of containers are allocated and released.
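
The accumulation described above can be illustrated with a minimal sketch.
It is a simplified model, not the actual NodeManager code: the class
DeletionStateStoreSketch, the nested StateStore type, and the method
pendingAfterCompletion are all hypothetical names invented for this example.
It only shows how the number of entries a restart would replay differs
depending on whether a task removes its own record on completion.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified model of the behavior described in this issue.
// None of these names are real Hadoop classes or methods.
class DeletionStateStoreSketch {

    /** Stand-in for the NodeManager statestore: task id -> task description. */
    static final class StateStore {
        private final Map<Integer, String> tasks = new LinkedHashMap<>();

        void storeTask(int id, String description) {
            tasks.put(id, description);
        }

        void removeTask(int id) {
            tasks.remove(id);
        }

        int size() {
            return tasks.size();
        }
    }

    /**
     * Records and completes `count` deletion tasks, then returns how many
     * entries remain in the store, i.e. how many a restart would replay.
     */
    static int pendingAfterCompletion(int count, boolean removeOnCompletion) {
        StateStore store = new StateStore();
        for (int id = 0; id < count; id++) {
            store.storeTask(id, "delete container_" + id);
            // The deletion itself would run here.
            if (removeOnCompletion) {
                // Behavior the issue reports for FileDeletionTask.
                store.removeTask(id);
            }
            // Without the removal, the record stays behind, which is the
            // behavior the issue reports for DockerContainerDeletionTask.
        }
        return store.size();
    }

    public static void main(String[] args) {
        System.out.println("removed on completion: "
            + pendingAfterCompletion(1000, true));
        System.out.println("never removed: "
            + pendingAfterCompletion(1000, false));
    }
}
```

In this model the leftover count grows linearly with the number of completed
tasks, which matches the reported symptom of a long replay after restart on
nodes with high container churn.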