[
https://issues.apache.org/jira/browse/HDFS-9469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036990#comment-15036990
]
Tsz Wo Nicholas Sze commented on HDFS-9469:
-------------------------------------------
I think we need to change the data model to use mean and variance before adding
the planner. Otherwise, it will be harder to change later.
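As a rough illustration of what a mean-and-variance summary could look like, here is a minimal sketch. It assumes each volume is reduced to a used/capacity ratio in [0, 1]; the class name VolumeStatsSketch and its fields are hypothetical and are not taken from the patch.

```java
import java.util.List;

/** Hypothetical summary of per-volume utilization on one DataNode. */
final class VolumeStatsSketch {
    final double mean;
    final double variance;

    private VolumeStatsSketch(double mean, double variance) {
        this.mean = mean;
        this.variance = variance;
    }

    /**
     * Computes the population mean and variance of the given
     * used/capacity ratios (assumption: one ratio per volume).
     */
    static VolumeStatsSketch of(List<Double> usedRatios) {
        if (usedRatios.isEmpty()) {
            throw new IllegalArgumentException("at least one volume is required");
        }
        double sum = 0.0;
        for (double ratio : usedRatios) {
            sum += ratio;
        }
        double mean = sum / usedRatios.size();

        double squaredDeviations = 0.0;
        for (double ratio : usedRatios) {
            double deviation = ratio - mean;
            squaredDeviations += deviation * deviation;
        }
        return new VolumeStatsSketch(mean, squaredDeviations / usedRatios.size());
    }
}
```

A node whose volumes are evenly filled has a variance near zero, so a planner could use the variance as a measure of how unbalanced the node is.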
Other comments:
- In DiskBalancerCluster.createOutPutDirectory
-* createOutPutDirectory: P should be in lower case.
-* It seems that throwing new IOException is enough. We don't need LOG.fatal.
- computePlan: top is "The total number of nodes to process". Then what is
nodesToProcess.size()? Is it supposed to hold that top >= nodesToProcess.size()?
- computePoolSize returns 0 if nodeCount is 9000. It should not apply " % 100 "
at the end.
- In PlannerFactory.getPlanner,
-* It logs a message per node. Is it needed?
-* Is the planner supposed to be fixed for a single run?
-* What other planners are we going to support?
-* It should throw an exception instead of returning null at the end.
- We should use LOG.error instead of LOG.fatal below.
{code}
try {
  planList.add(f.get());
} catch (InterruptedException e) {
  LOG.fatal("Compute Node plan was cancelled or interrupted : ", e);
} catch (ExecutionException e) {
  LOG.fatal("Unable to compute plan : ", e);
}
{code}
- The GreedyPlanner is an algorithm. The input is a DiskBalancerDataNode. So
node should be a parameter of plan() rather than a field.
- Use StringUtils.TraditionalBinaryPrefix.long2String(..) instead of adding
getSizeString.
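To illustrate the computePoolSize comment above: the patch's actual expression is not quoted here, so the sketch below assumes, purely for illustration, a rule of one thread per ten nodes with a cap of 100. Under that assumption a trailing modulo wraps around to 0 for 9000 nodes, while clamping does not. The names PoolSizeSketch, withModulo, withClamp and MAX_THREADS are hypothetical.

```java
/** Hypothetical illustration of modulo versus clamping for a pool size. */
final class PoolSizeSketch {
    // Assumed upper bound on planner threads; not taken from the patch.
    static final int MAX_THREADS = 100;

    /** Assumed shape of the problem: the modulo wraps large values to 0. */
    static int withModulo(int nodeCount) {
        return (nodeCount / 10) % MAX_THREADS;
    }

    /** Clamps the result to the range [1, MAX_THREADS] instead of wrapping. */
    static int withClamp(int nodeCount) {
        if (nodeCount <= 0) {
            throw new IllegalArgumentException("nodeCount must be positive");
        }
        int size = Math.max(1, nodeCount / 10);
        return Math.min(size, MAX_THREADS);
    }
}
```

With these assumed rules, withModulo(9000) yields 0 and withClamp(9000) yields 100, so the executor always gets at least one thread.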
I did not continue reviewing the computation since it requires the data model
change.
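For the PlannerFactory.getPlanner and GreedyPlanner comments above, the shape I have in mind is roughly the following. This is only a sketch: NodeSketch stands in for DiskBalancerDataNode, the plan is reduced to a list of strings, and every type and method name here is hypothetical rather than taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for DiskBalancerDataNode. */
final class NodeSketch {
    final String name;

    NodeSketch(String name) {
        this.name = name;
    }
}

/** The node is an argument of plan(), so one planner can serve many nodes. */
interface PlannerSketch {
    List<String> plan(NodeSketch node);
}

/** Holds no per-node state; the real algorithm is replaced by a placeholder. */
final class GreedyPlannerSketch implements PlannerSketch {
    @Override
    public List<String> plan(NodeSketch node) {
        List<String> steps = new ArrayList<>();
        steps.add("plan for " + node.name); // placeholder for real move steps
        return steps;
    }
}

final class PlannerFactorySketch {
    /** Throws on an unknown planner name instead of returning null. */
    static PlannerSketch getPlanner(String plannerName) {
        if ("GreedyPlanner".equals(plannerName)) {
            return new GreedyPlannerSketch();
        }
        throw new IllegalArgumentException("Unknown planner: " + plannerName);
    }
}
```

Since the planner keeps no node field, it can be created once per run and reused for every node, which also removes the need to log a message per node in the factory.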
> DiskBalancer : Add Planner
> ---------------------------
>
> Key: HDFS-9469
> URL: https://issues.apache.org/jira/browse/HDFS-9469
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: balancer & mover
> Affects Versions: 2.8.0
> Reporter: Anu Engineer
> Assignee: Anu Engineer
> Attachments: HDFS-9469-HDFS-1312.001.patch,
> HDFS-9469-HDFS-1312.002.patch
>
>
> Disk Balancer reads the cluster data and then creates a plan for the data
> moves based on the snap-shot of the data read from the nodes. This plan is
> later submitted to data nodes for execution.