He Tianyi created HDFS-10290:
--------------------------------
Summary: Move getBlocks calls to DataNode in Balancer
Key: HDFS-10290
URL: https://issues.apache.org/jira/browse/HDFS-10290
Project: Hadoop HDFS
Issue Type: New Feature
Components: balancer & mover
Affects Versions: 2.6.0
Reporter: He Tianyi
In the current implementation, the Balancer asks the NameNode for a list of
blocks on a specific DataNode. This increases the NameNode's workload, and in
our cluster it made the NameNode unstable (flapping) once the average number of
blocks per DataNode reached 1,000,000 (NameNode heap size: 192 GB; CPU: 2 x
Xeon E5-2630).
Recently I investigated whether {{getBlocks}} invocations from the Balancer can
be handled by DataNodes instead, and this turned out to be practical.
The only pitfall is that, since a DataNode has no information about the other
locations of each block it holds, some block moves may fail (because the target
node may already have a replica of that particular block).
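
To make the pitfall concrete, here is a minimal, self-contained sketch of the
idea. It is not actual HDFS code: the class DataNodeGetBlocksSketch, the
SimDataNode type, and the methods getBlocks, receiveBlock, and balance are all
hypothetical names, and a real Balancer would also deal with block sizes,
throttling, and placement policy. The sketch assumes only that the source
DataNode can list its own blocks and that a move is rejected when the target
already holds a replica.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: none of these names are real HDFS classes or APIs.
class DataNodeGetBlocksSketch {

    // Stand-in for a DataNode that knows only the IDs of its own replicas.
    static final class SimDataNode {
        final Set<Long> blockIds = new TreeSet<>();

        SimDataNode(long... ids) {
            for (long id : ids) {
                blockIds.add(id);
            }
        }

        // Assumed DataNode-side counterpart of getBlocks: it returns local
        // blocks only, with no information about other replica locations.
        List<Long> getBlocks() {
            return new ArrayList<>(blockIds);
        }

        // The move is rejected if this node already holds the block.
        boolean receiveBlock(long blockId) {
            return blockIds.add(blockId);
        }
    }

    // Moves up to maxMoves blocks from source to target. A block whose move
    // is rejected is skipped and the next candidate is tried. Returns the
    // IDs of the blocks that were actually moved.
    static List<Long> balance(SimDataNode source, SimDataNode target, int maxMoves) {
        List<Long> moved = new ArrayList<>();
        for (long blockId : source.getBlocks()) {
            if (moved.size() >= maxMoves) {
                break;
            }
            if (target.receiveBlock(blockId)) {
                source.blockIds.remove(blockId);
                moved.add(blockId);
            }
            // Otherwise the attempt was wasted: this is the cost of the
            // Balancer not knowing the replica locations in advance.
        }
        return moved;
    }
}
```

In this sketch a failed move costs only one skipped candidate, so the extra
failures would be traded against the getBlocks load taken off the NameNode.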
I think this may be beneficial for large clusters.
Any suggestions or comments?
Thanks.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)