[jira] [Commented] (HDFS-13783) Balancer: make balancer to be a long service process for easy to monitor it.

Erik Krogen (JIRA) Thu, 25 Jul 2019 14:09:26 -0700


    [ 
https://issues.apache.org/jira/browse/HDFS-13783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16893146#comment-16893146
 ]


Erik Krogen commented on HDFS-13783:
------------------------------------

[~zhangchen], thanks for the changes, it's looking really good. Besides fixing 
the checkstyle warnings reported by Jenkins, I have a few small comments:
* You still have this typo: {{scheduleInteral}} -> {{scheduleInterval}}
* I don't really think we should catch {{InterruptedException}} here:
{code}
        Thread.sleep(scheduleInteval);
      } catch (InterruptedException ie) {
        if (++tried > retryOnException) {
          throw ie;
        }
{code}
If it was interrupted, we should probably respect that and exit.
* Within {{hdfs-default.xml}}, the description has a few spelling and grammar 
errors. I think it should say:
{quote}
When the balancer is executed as a long-running service, it will retry upon 
encountering an exception. This configuration determines how many times it will 
retry before considering the exception to be fatal and quitting.
{quote}
* In {{testBalancerServiceOnError}}, maybe we can use 
{{GenericTestUtils.LogCapturer}} to verify that the Balancer service actually 
had to retry? Or, add a {{retryCount}} variable that tracks the number of 
retries that were necessary (this could be useful for HDFS-10648 as well). If 
this ends up being too difficult I'm okay with leaving it as-is.
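
To illustrate the two suggestions above about {{InterruptedException}} and a {{retryCount}} variable, here is a minimal sketch. It is not the patch's code: the class {{RetryingService}}, the method {{runWithRetry}}, and the field names are hypothetical, chosen only to mirror the identifiers quoted above. It assumes a task that is retried a bounded number of times, with a sleep between attempts.

```java
import java.util.concurrent.Callable;

/**
 * Hypothetical sketch, not the Balancer implementation: a bounded retry loop
 * that treats interruption as a request to stop and counts its retries.
 */
final class RetryingService {
  private final int retryOnException;     // assumed: max retries before giving up
  private final long scheduleIntervalMs;  // assumed: pause between attempts
  private int retryCount = 0;             // retries actually performed

  RetryingService(int retryOnException, long scheduleIntervalMs) {
    this.retryOnException = retryOnException;
    this.scheduleIntervalMs = scheduleIntervalMs;
  }

  int getRetryCount() {
    return retryCount;
  }

  <T> T runWithRetry(Callable<T> task) throws Exception {
    int tried = 0;
    while (true) {
      try {
        return task.call();
      } catch (InterruptedException ie) {
        // Interruption is not a retryable failure: restore the flag and exit.
        Thread.currentThread().interrupt();
        throw ie;
      } catch (Exception e) {
        if (++tried > retryOnException) {
          throw e;  // retries exhausted, treat the exception as fatal
        }
        retryCount++;  // a test can assert on this instead of parsing logs
      }
      // Deliberately not wrapped in a catch: if the sleep is interrupted,
      // the InterruptedException propagates to the caller.
      Thread.sleep(scheduleIntervalMs);
    }
  }
}
```

With a counter like this, a test could assert that {{getRetryCount()}} is greater than zero after injecting a transient failure, rather than only checking that the service eventually succeeded.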

> Balancer: make balancer to be a long service process for easy to monitor it.
> ----------------------------------------------------------------------------
>
>                 Key: HDFS-13783
>                 URL: https://issues.apache.org/jira/browse/HDFS-13783
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: balancer & mover
>            Reporter: maobaolong
>            Assignee: Chen Zhang
>            Priority: Major
>         Attachments: HDFS-13783-001.patch, HDFS-13783-002.patch, 
> HDFS-13783.003.patch, HDFS-13783.004.patch
>
>
> If we had a long-running balancer service process, like the namenode and 
> datanode, we could collect metrics from the balancer. The metrics could tell 
> us the status of the balancer and the number of blocks it has moved. 
> We could also get or set the balance plan through the balancer web UI. There 
> are many things we could do with a long-running balancer service process.
> So, shall we start planning the new Balancer? I hope this feature can make it 
> into the next release of Hadoop.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
