wangzhixiang created HDFS-15560:
-----------------------------------
Summary: The getMaxNodesPerRack May Cause "Failed to place enough
replicas"
Key: HDFS-15560
URL: https://issues.apache.org/jira/browse/HDFS-15560
Project: Hadoop HDFS
Issue Type: Bug
Reporter: wangzhixiang
Assignee: wangzhixiang
In our HDFS cluster, the number of nodes per rack is extremely uneven,
e.g. rack1=[1 node], rack2=[1 node], rack3=[3 nodes], rack4=[5 nodes],
rack5=[4 nodes], rack6=[4 nodes].
When the getMaxNodesPerRack method is invoked, we get MaxNodesPerRack = 4 from
MaxNodesPerRack = (totalNumOfReplicas-1)/numOfRacks + 2, with
totalNumOfReplicas = 18 and numOfRacks = 6.
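
The arithmetic above can be checked with a minimal sketch. It is not the Hadoop
source; the function and variable names are hypothetical, and it assumes the
division in the formula is integer division and that the requested replication
is capped at the number of nodes in the cluster.

```python
def max_nodes_per_rack(total_replicas: int, num_racks: int) -> int:
    """Per-rack cap from the formula quoted in this report.

    Assumes integer (floor) division, as in Java int arithmetic
    for non-negative operands.
    """
    return (total_replicas - 1) // num_racks + 2


# Rack sizes taken from the report: rack1..rack6.
rack_sizes = [1, 1, 3, 5, 4, 4]
cluster_size = sum(rack_sizes)

# Assumption: a requested replication of 50 is capped at the cluster size.
requested_replication = 50
total_replicas = min(requested_replication, cluster_size)

cap = max_nodes_per_rack(total_replicas, len(rack_sizes))
print(f"totalNumOfReplicas={total_replicas}, MaxNodesPerRack={cap}")
```

With these inputs the cap is (18-1)/6 + 2 = 2 + 2 = 4, which is smaller than
the 5 nodes available in rack4.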
The replication factor of some files in our cluster is set to 50, so they are
allocated 18 replicas (the cluster has only 18 nodes), which requires every
node. However, only 4 nodes can be chosen from rack4 because
MaxNodesPerRack = 4. As a result only 17 (1+1+3+4+4+4) replicas are chosen,
and the warning "Failed to place enough replicas, still in need of 1 to
reach 18" is logged.
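
The shortfall can be reproduced with a small sketch. It only models the
per-rack cap described in this report, not the actual placement policy, and
all names in it are hypothetical.

```python
# Rack sizes and values taken from the report.
rack_sizes = {
    "rack1": 1, "rack2": 1, "rack3": 3,
    "rack4": 5, "rack5": 4, "rack6": 4,
}
max_nodes_per_rack = 4   # value computed in the report
target_replicas = 18     # one replica on every node in the cluster

# Each rack can contribute at most min(rack size, per-rack cap) nodes.
usable = {rack: min(size, max_nodes_per_rack)
          for rack, size in rack_sizes.items()}

placed = min(target_replicas, sum(usable.values()))
shortfall = target_replicas - placed

print(f"usable per rack: {usable}")
print(f"placed {placed}, still in need of {shortfall} "
      f"to reach {target_replicas}")
```

Only rack4 is affected by the cap (5 nodes reduced to 4), which accounts for
the single missing replica.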
In addition, ReplicationMonitor adds the file as ReplicationWork to retry the
placement, and the retry keeps failing in a loop.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)