[
https://issues.apache.org/jira/browse/HDDS-4656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287147#comment-17287147
]
Yiqun Lin commented on HDDS-4656:
---------------------------------
Thanks for updating the design doc for the Container Balancer,
[~ljain]/[~maobaolong].
I am thinking of one corner case: container replication can take a long time
due to unexpected network issues or slow target nodes. Considering this, can
we additionally add a timeout limit (e.g. 20 minutes) for each
container movement? Once a container is not replicated successfully within the
given timeout, we no longer need to track it. The timeout
limit can improve the efficiency of each container balancing iteration,
since we won't wait for slow-running container moves. By default, though, the
timeout limit would be disabled in the configuration.
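
To make the proposal concrete, here is a minimal sketch of the per-move timeout, assuming the balancer keeps a map of in-flight moves. The names (MoveTracker, move_timeout_s, expire_timed_out) are hypothetical and are not taken from the design doc or from the Ozone code base; a timeout of None stands in for "disabled by default".

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class MoveTracker:
    """Hypothetical tracker for in-flight container moves in one iteration."""

    # Timeout in seconds; None means the limit is disabled (proposed default).
    move_timeout_s: Optional[float] = None
    # Container ID -> time (in seconds) at which the move was scheduled.
    in_flight: Dict[int, float] = field(default_factory=dict)

    def start_move(self, container_id: int, now: float) -> None:
        """Begin tracking a container move."""
        self.in_flight[container_id] = now

    def complete_move(self, container_id: int) -> None:
        """Stop tracking a move that finished successfully."""
        self.in_flight.pop(container_id, None)

    def expire_timed_out(self, now: float) -> List[int]:
        """Stop tracking moves older than the timeout and return their IDs."""
        if self.move_timeout_s is None:
            return []  # limit disabled: keep waiting on every move
        expired = [cid for cid, started in self.in_flight.items()
                   if now - started > self.move_timeout_s]
        for cid in expired:
            del self.in_flight[cid]
        return expired
```

With a 20-minute limit, a move scheduled at t=0 would be dropped from tracking when checked at t=25 minutes, while a move scheduled at t=10 minutes would still be tracked. With the default of None, nothing is ever dropped.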
> Add a container balancer tool or service for HDDS
> -------------------------------------------------
>
> Key: HDDS-4656
> URL: https://issues.apache.org/jira/browse/HDDS-4656
> Project: Apache Ozone
> Issue Type: New Feature
> Components: Ozone Datanode, SCM, Tools
> Affects Versions: 1.1.0
> Reporter: Baolong Mao
> Assignee: Baolong Mao
> Priority: Major
> Attachments: Container Balancer Design.pdf
>
>
> When an existing Ozone cluster is nearly full, we have to add more datanodes
> to the cluster, but there are two issues we must face.
> - When a new allocate-container request comes in, SCM should prefer
> datanodes with low usage; if not, performance will get poor.
> - For read requests, the existing datanodes store lots of blocks, so they are
> responsible for serving the reads and supplying the data stream service,
> while the newly added datanodes cannot help.
> If we had a balancer tool just like the HDFS balancer, we could move blocks or
> containers from high-usage datanodes to low-usage ones. I think this is one of
> the necessary tools for Ozone.
> Container balancer design doc:
> https://docs.google.com/document/d/15PdYaP6aLB18ptbcOK3XWlL4Y1PLfIP-ll30KjNPZ2g/edit?usp=sharing