[ 
https://issues.apache.org/jira/browse/HDDS-4656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17287147#comment-17287147
 ] 

Yiqun Lin commented on HDDS-4656:
---------------------------------

Thanks for updating the design doc for the Container Balancer, 
[~ljain]/[~maobaolong].

I am thinking of one corner case: container replication can take a long time 
due to unexpected network issues or slow target nodes. With this in mind, can 
we additionally add a timeout limit (e.g. 20 minutes) for each container 
move? Once a container has not been replicated successfully within the given 
timeout, we no longer need to track it. The timeout can improve the 
efficiency of each balancing iteration, since we won't wait on slow-running 
container moves. By default, though, the timeout would be disabled in the 
configuration.
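
To make the proposal concrete, here is a minimal sketch in Java of how a 
per-move timeout could be tracked. Everything in it is an assumption made for 
illustration: the class MoveTimeoutSketch, its methods, and the convention 
that a zero duration means "disabled" are hypothetical and do not come from 
the Ozone code base or the design doc.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch only; none of these names exist in Ozone. */
public class MoveTimeoutSketch {
  // Assumption: Duration.ZERO means the timeout is disabled (the default).
  private final Duration moveTimeout;
  // Container ID -> time (in milliseconds) at which its move was started.
  private final Map<Long, Long> inFlight = new HashMap<>();

  public MoveTimeoutSketch(Duration moveTimeout) {
    this.moveTimeout = moveTimeout;
  }

  /** Starts tracking a container move. */
  public void startMove(long containerId, long nowMillis) {
    inFlight.put(containerId, nowMillis);
  }

  /** Stops tracking a move that finished successfully. */
  public void completeMove(long containerId) {
    inFlight.remove(containerId);
  }

  /**
   * Stops tracking every move that has been running longer than the
   * timeout and returns the IDs of those containers. Does nothing when
   * the timeout is disabled.
   */
  public List<Long> expire(long nowMillis) {
    List<Long> expired = new ArrayList<>();
    if (moveTimeout.isZero()) {
      return expired;
    }
    Iterator<Map.Entry<Long, Long>> it = inFlight.entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<Long, Long> entry = it.next();
      if (nowMillis - entry.getValue() > moveTimeout.toMillis()) {
        expired.add(entry.getKey());
        it.remove();
      }
    }
    return expired;
  }

  /** Number of moves still being tracked. */
  public int trackedCount() {
    return inFlight.size();
  }
}
```

With a 20-minute timeout, a balancer iteration would call expire() before 
deciding whether it can finish, so one slow move does not hold up the whole 
iteration. The sketch only stops tracking the move; whether the underlying 
replication should also be cancelled is left open.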

> Add a container balancer tool or service for HDDS
> -------------------------------------------------
>
>                 Key: HDDS-4656
>                 URL: https://issues.apache.org/jira/browse/HDDS-4656
>             Project: Apache Ozone
>          Issue Type: New Feature
>          Components: Ozone Datanode, SCM, Tools
>    Affects Versions: 1.1.0
>            Reporter: Baolong Mao
>            Assignee: Baolong Mao
>            Priority: Major
>         Attachments: Container Balancer Design.pdf
>
>
> When an existing Ozone cluster is nearly full, we have to add more datanodes 
> to the cluster, but there are two issues we must face.
> - When a new container allocation request comes in, SCM should prefer 
> datanodes with low usage; otherwise, performance will become poor.
> - For read requests, the existing datanodes store lots of blocks, so they are 
> responsible for serving the reads and supplying the data stream service, 
> while the newly added datanodes cannot help at all.
> If we had a balancer tool just like the HDFS balancer, we could move blocks or 
> containers from high-usage datanodes to low-usage ones. I think this is a 
> necessary tool for Ozone.
> Container balancer design doc: 
> https://docs.google.com/document/d/15PdYaP6aLB18ptbcOK3XWlL4Y1PLfIP-ll30KjNPZ2g/edit?usp=sharing



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
