[jira] [Updated] (HDDS-3892) Datanode initialization is too slow when there are thousands of containers per volume

Sammi Chen (Jira) Mon, 29 Jun 2020 06:19:42 -0700


     [ 
https://issues.apache.org/jira/browse/HDDS-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Sammi Chen updated HDDS-3892:
-----------------------------
    Description: 
It takes hours for a datanode to finish volume and container initialization.  
In the worst case, after 12 hours one datanode is still running the volume 
reader; it has 11 datanode volumes, each with 8000+ containers. 

Before the patch was applied, most of the volume reader threads were waiting 
for the ContainerCache lock.  With the patch applied, the worst-case datanode 
took 17 minutes to finish the volume and container verification process. 
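
This message does not say how the patch removes the contention, so the 
following is only a minimal sketch of one common technique for this kind of 
problem, lock striping: instead of one lock guarding the whole cache, each key 
hashes to one of several locks, so reader threads opening different container 
DBs rarely block each other. The class StripedHandleCache, its methods, and 
the example paths are hypothetical and are not Ozone's ContainerCache API. 

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

/**
 * Hypothetical illustration only; NOT the actual HDDS-3892 patch and NOT
 * Ozone's ContainerCache. Caches one handle per DB path and serializes
 * the slow "open" step per stripe rather than globally.
 */
final class StripedHandleCache<V> {
  private final ReentrantLock[] stripes;
  private final Map<String, V> handles = new ConcurrentHashMap<>();

  StripedHandleCache(int stripeCount) {
    stripes = new ReentrantLock[stripeCount];
    for (int i = 0; i < stripeCount; i++) {
      stripes[i] = new ReentrantLock();
    }
  }

  // Map a key to its stripe; floorMod keeps the index non-negative.
  private ReentrantLock stripeFor(String key) {
    return stripes[Math.floorMod(key.hashCode(), stripes.length)];
  }

  /** Returns the cached handle, opening it at most once per path.
   *  The opener must not return null (ConcurrentHashMap rejects nulls). */
  V getOrOpen(String dbPath, Function<String, V> opener) {
    V cached = handles.get(dbPath);      // fast path, no lock taken
    if (cached != null) {
      return cached;
    }
    ReentrantLock lock = stripeFor(dbPath);
    lock.lock();                         // blocks only threads on this stripe
    try {
      V existing = handles.get(dbPath);  // re-check under the stripe lock
      if (existing == null) {
        existing = opener.apply(dbPath); // the slow step, e.g. opening a DB
        handles.put(dbPath, existing);
      }
      return existing;
    } finally {
      lock.unlock();
    }
  }

  int size() {
    return handles.size();
  }
}
```

With a single global lock, every volume reader thread queues behind whichever 
thread is currently opening a DB; with stripes, only threads whose paths hash 
to the same stripe wait on each other. 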

BTW: why there are so many containers is still under investigation.  It might 
be the result of pipeline closes.  So in the long term, I think we should 
consider reusing healthy but not full containers that were closed because 
their pipelines were closed. 




  was:
It takes hours for a datanode to finish volume and container initialization.  
In the worst case, after 12 hours the datanode is still running the volume 
reader on one datanode, which has 11 datanode volumes, each with 8000+ 
containers. 

Before the patch was applied, most of the volume reader threads were waiting 
for the ContainerCache lock.  With the patch applied, the worst-case datanode 
took 17 minutes to finish the volume and container verification process. 

BTW: why there are so many containers is still under investigation.  It might 
be the result of pipeline closes.  So in the long term, I think we should 
consider reusing healthy but not full containers that were closed because 
their pipelines were closed. 





> Datanode initialization is too slow when there are thousands of containers 
> per volume
> -------------------------------------------------------------------------------------
>
>                 Key: HDDS-3892
>                 URL: https://issues.apache.org/jira/browse/HDDS-3892
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>            Reporter: Sammi Chen
>            Assignee: Sammi Chen
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: jstack-68763-with-patch-applied.log, 
> jstack-datanode-without-patch.log
>
>
> It takes hours for a datanode to finish volume and container 
> initialization.  In the worst case, after 12 hours one datanode is still 
> running the volume reader; it has 11 datanode volumes, each with 8000+ 
> containers. 
> Before the patch was applied, most of the volume reader threads were waiting 
> for the ContainerCache lock.  With the patch applied, the worst-case 
> datanode took 17 minutes to finish the volume and container verification 
> process. 
> BTW: why there are so many containers is still under investigation.  It 
> might be the result of pipeline closes.  So in the long term, I think we 
> should consider reusing healthy but not full containers that were closed 
> because their pipelines were closed. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

