[
https://issues.apache.org/jira/browse/HDDS-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sammi Chen updated HDDS-3892:
-----------------------------
Description:
It takes hours for a datanode to finish volume and container initialization.
In the worst case, one datanode was still running the volume reader after 12
hours; it has 11 datanode volumes, each holding 8000+ containers.
Before the patch was applied, most of the volume reader threads were waiting
for the ContainerCache lock. With the patch applied, the worst-case datanode
took 17 minutes to finish the volume and container verification process.
BTW: why there are so many containers is still under investigation. It might
be the result of pipeline close. So in the long term, I think we should
consider reusing healthy but not full containers that were closed because
their pipelines were closed.
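To illustrate the kind of contention described above, where many volume
reader threads queue on a single ContainerCache lock, here is a minimal
sketch of lock striping. It is only an illustration under stated
assumptions, not the patch itself: the class name StripedCacheSketch, the
getOrLoad method, and the loader callback are all hypothetical and are not
part of the Ozone code base.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.LongFunction;

/**
 * Hypothetical sketch of a cache guarded by striped locks instead of one
 * global lock. This is NOT the actual Ozone ContainerCache.
 */
public class StripedCacheSketch<V> {
  private final ReentrantLock[] stripes;
  private final Map<Long, V> cache = new ConcurrentHashMap<>();

  public StripedCacheSketch(int stripeCount) {
    if (stripeCount <= 0) {
      throw new IllegalArgumentException("stripeCount must be positive");
    }
    stripes = new ReentrantLock[stripeCount];
    for (int i = 0; i < stripeCount; i++) {
      stripes[i] = new ReentrantLock();
    }
  }

  /** Maps a container ID to one of the stripes. */
  private ReentrantLock stripeFor(long containerId) {
    return stripes[(int) Math.floorMod(containerId, (long) stripes.length)];
  }

  /**
   * Returns the cached value, loading it at most once per container ID.
   * Only threads whose IDs map to the same stripe block each other, so
   * slow loads for unrelated containers can proceed in parallel.
   * The loader must not return null.
   */
  public V getOrLoad(long containerId, LongFunction<V> loader) {
    V cached = cache.get(containerId);
    if (cached != null) {
      return cached;
    }
    ReentrantLock lock = stripeFor(containerId);
    lock.lock();
    try {
      // Re-check under the stripe lock: another thread may have loaded it.
      cached = cache.get(containerId);
      if (cached == null) {
        cached = loader.apply(containerId);
        cache.put(containerId, cached);
      }
      return cached;
    } finally {
      lock.unlock();
    }
  }

  public int size() {
    return cache.size();
  }
}
```

With a single lock, every reader thread waits while one thread opens a
container; with N stripes, only threads whose container IDs hash to the same
stripe wait on each other.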
was:
It takes hours for a datanode to finish volume and container initialization.
In the worst case, after 12 hours the datanode was still running the volume
reader on one datanode, which has 11 datanode volumes, each holding 8000+
containers.
Before the patch was applied, most of the volume reader threads were waiting
for the ContainerCache lock. With the patch applied, the worst-case datanode
took 17 minutes to finish the volume and container verification process.
BTW: why there are so many containers is still under investigation. It might
be the result of pipeline close. So in the long term, I think we should
consider reusing healthy but not full containers that were closed because
their pipelines were closed.
> Datanode initialization is too slow when there are thousands of containers
> per volume
> -------------------------------------------------------------------------------------
>
> Key: HDDS-3892
> URL: https://issues.apache.org/jira/browse/HDDS-3892
> Project: Hadoop Distributed Data Store
> Issue Type: Improvement
> Reporter: Sammi Chen
> Assignee: Sammi Chen
> Priority: Major
> Labels: pull-request-available
> Attachments: jstack-68763-with-patch-applied.log,
> jstack-datanode-without-patch.log
>
>
> It takes hours for a datanode to finish volume and container
> initialization. In the worst case, one datanode was still running the
> volume reader after 12 hours; it has 11 datanode volumes, each holding
> 8000+ containers.
> Before the patch was applied, most of the volume reader threads were
> waiting for the ContainerCache lock. With the patch applied, the worst-case
> datanode took 17 minutes to finish the volume and container verification
> process.
> BTW: why there are so many containers is still under investigation. It
> might be the result of pipeline close. So in the long term, I think we
> should consider reusing healthy but not full containers that were closed
> because their pipelines were closed.