Kihwal Lee created HDFS-7300:
--------------------------------
Summary: The getMaxNodesPerRack() method in BlockPlacementPolicyDefault is flawed
Key: HDFS-7300
URL: https://issues.apache.org/jira/browse/HDFS-7300
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Kihwal Lee
Priority: Critical
{{getMaxNodesPerRack()}} can produce undesirable results in some cases:
- Three replicas on two racks: the max is 3, so all three replicas can go to
one rack.
- Two replicas on two or more racks: the max is 2, so both replicas can end up
in the same rack.
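The two cases above can be reproduced with a minimal sketch. It assumes the cap is computed as {{(totalReplicas - 1) / numRacks + 2}} with integer division, which matches the numbers quoted in this report but is not copied from the source here. The class name {{MaxNodesPerRackSketch}} and the helper {{minRacksUsed()}} are hypothetical and exist only for illustration.

```java
public class MaxNodesPerRackSketch {

    // Assumed formula (integer division); it yields the caps quoted above.
    static int maxNodesPerRack(int totalReplicas, int numRacks) {
        return (totalReplicas - 1) / numRacks + 2;
    }

    // Hypothetical helper: the fewest racks the replicas can occupy
    // when no rack may hold more than 'cap' replicas (ceiling division).
    static int minRacksUsed(int totalReplicas, int cap) {
        return (totalReplicas + cap - 1) / cap;
    }

    public static void main(String[] args) {
        // Three replicas on two racks: the cap is 3, so one rack can hold them all.
        int cap = maxNodesPerRack(3, 2);
        System.out.println("cap=" + cap + " minRacks=" + minRacksUsed(3, cap));

        // Two replicas on two racks: the cap is 2, so one rack can hold both.
        cap = maxNodesPerRack(2, 2);
        System.out.println("cap=" + cap + " minRacks=" + minRacksUsed(2, cap));

        // Two replicas on five racks: the cap is still 2.
        cap = maxNodesPerRack(2, 5);
        System.out.println("cap=" + cap + " minRacks=" + minRacksUsed(2, cap));
    }
}
```

In each case {{minRacksUsed()}} comes out as 1, meaning the cap alone does not force the replicas onto more than one rack.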
{{BlockManager#isNeededReplication()}} fixes this after the block/file is
closed, because {{blockHasEnoughRacks()}} will return false. This is not only
extra work, but it can also break the favored nodes feature.
When there are two racks and two favored nodes are specified in the same rack,
the NN may allocate the third replica on a node in that same rack, because
{{maxNodesPerRack}} is 3. When the file is closed, the NN moves one replica to
the other rack. Since two of the three replicas are on favored nodes, there is
a 66% chance that the replica moved is one on a favored node. If
{{maxNodesPerRack}} were 2, this would not happen.
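The favored-nodes scenario can be sketched as follows. This is an illustration under stated assumptions, not the NN's actual code: it assumes a rack accepts another replica only while it holds fewer than {{maxNodesPerRack}} replicas, and that the replica chosen for the later move is picked uniformly at random. The class {{FavoredNodeMoveSketch}} and both methods are hypothetical names.

```java
public class FavoredNodeMoveSketch {

    // Assumption: a rack can take one more replica only while it
    // currently holds fewer than maxNodesPerRack replicas.
    static boolean rackHasRoom(int replicasOnRack, int maxNodesPerRack) {
        return replicasOnRack < maxNodesPerRack;
    }

    // Assumption: the replica to move is picked uniformly at random,
    // so the chance of hitting a favored node is favored / total.
    static double chanceFavoredMoved(int favoredReplicas, int totalReplicas) {
        return (double) favoredReplicas / totalReplicas;
    }

    public static void main(String[] args) {
        int onFavoredRack = 2; // both favored nodes sit in the same rack

        // With a cap of 3, the third replica may also land on that rack.
        System.out.println(rackHasRoom(onFavoredRack, 3)); // true

        // With a cap of 2, the third replica is forced onto the other rack.
        System.out.println(rackHasRoom(onFavoredRack, 2)); // false

        // Two of the three replicas are favored, hence roughly 66%.
        System.out.println(chanceFavoredMoved(2, 3));
    }
}
```

With a cap of 2, the third replica goes to the other rack at allocation time, so no replica needs to be moved after the file is closed.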