Sejin Hwang created HADOOP-18639:
------------------------------------

             Summary: DockerContainerDeletionTask is not removed from the 
Nodemanager's statestore when the task is completed.
                 Key: HADOOP-18639
                 URL: https://issues.apache.org/jira/browse/HADOOP-18639
             Project: Hadoop Common
          Issue Type: Bug
            Reporter: Sejin Hwang


YARN NodeManager has two types of deletion tasks: the FileDeletionTask for 
deleting log files and the DockerContainerDeletionTask for deleting Docker 
containers.
 
A FileDeletionTask is removed from the statestore once it completes, but a 
DockerContainerDeletionTask is not.
As a result, completed DockerContainerDeletionTask entries accumulate in the 
statestore indefinitely.
 
When the NodeManager restarts, its deletion service reads the accumulated 
DockerContainerDeletionTask entries from the statestore and runs them again.

New FileDeletionTask and DockerContainerDeletionTask work is then delayed 
unnecessarily while the accumulated tasks are processed, which can fill the 
disk in environments where a large number of containers are allocated and 
released.
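
The accumulation described above can be illustrated with a minimal, 
self-contained sketch. It is not the NodeManager code: the StateStore class, 
runTask, and simulate are hypothetical names invented for this example, and 
the state store is modeled as an in-memory map. The only point it shows is 
that a task type which never removes its record leaves one entry behind per 
container, all of which would be replayed after a restart.

```java
import java.util.Map;
import java.util.TreeMap;

public class DeletionTaskSketch {

    /** Hypothetical in-memory stand-in for the NodeManager statestore. */
    static class StateStore {
        private final Map<Integer, String> tasks = new TreeMap<>();

        void store(int id, String kind) { tasks.put(id, kind); }

        void remove(int id) { tasks.remove(id); }

        int pending() { return tasks.size(); }
    }

    /**
     * Records a task, "runs" it, and optionally removes its record.
     * removeOnCompletion models whether the task type cleans up after itself.
     */
    static void runTask(StateStore store, int id, String kind,
                        boolean removeOnCompletion) {
        // Recorded before execution so the task would survive a restart.
        store.store(id, kind);
        // The actual file or container deletion would happen here.
        if (removeOnCompletion) {
            // Completed work is dropped from the statestore.
            store.remove(id);
        }
    }

    /** Returns how many records would be left to replay after a restart. */
    static int simulate(int containers, boolean dockerTaskRemovesRecord) {
        StateStore store = new StateStore();
        int id = 0;
        for (int i = 0; i < containers; i++) {
            runTask(store, id++, "FileDeletionTask", true);
            runTask(store, id++, "DockerContainerDeletionTask",
                    dockerTaskRemovesRecord);
        }
        return store.pending();
    }

    public static void main(String[] args) {
        // Reported behavior: one stale Docker task record per container.
        System.out.println("stale records (reported): " + simulate(1000, false));
        // Expected behavior: nothing left to replay.
        System.out.println("stale records (expected): " + simulate(1000, true));
    }
}
```

In this model, 1000 containers leave 1000 stale records when the Docker task 
does not clean up, and none when it does.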



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
