Stephen O'Donnell created HDFS-14637:
----------------------------------------
Summary: Namenode may not replicate blocks to meet the policy
after enabling upgradeDomain
Key: HDFS-14637
URL: https://issues.apache.org/jira/browse/HDFS-14637
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 3.3.0
Reporter: Stephen O'Donnell
Assignee: Stephen O'Donnell
After changing the network topology or placement policy on a cluster and
restarting the namenode, the namenode will scan all blocks on the cluster at
startup, and check if they meet the current placement policy. If they do not,
they are added to the replication queue and the namenode will arrange for them
to be replicated to ensure the placement policy is used.
If you start with a cluster with no UpgradeDomain, and then enable
UpgradeDomain, then on restart the NN does notice all the blocks violate the
placement policy and it adds them to the replication queue. I believe there are
some issues in the logic that prevents the blocks from replicating depending on
the setup:
With UD enabled, but no racks configured, and possible on a 2 rack cluster, the
queued replication work never makes any progress, as in
blockManager.validateReconstructionWork(), it checks to see if the new replica
increases the number of racks, and if it does not, it skips it and tries again
later.
{code:java}
DatanodeStorageInfo[] targets = rw.getTargets();
if ((numReplicas.liveReplicas() >= requiredRedundancy) &&
(!isPlacementPolicySatisfied(block)) ) {
if (!isInNewRack(rw.getSrcNodes(), targets[0].getDatanodeDescriptor())) {
// No use continuing, unless a new rack in this case
return false;
}
// mark that the reconstruction work is to replicate internal block to a
// new rack.
rw.setNotEnoughRack();
}
{code}
Additionally, in blockManager.scheduleReconstruction() is there some logic that
sets the number of new replicas required to one, if the live replicas >=
requiredReduncancy:
{code:java}
int additionalReplRequired;
if (numReplicas.liveReplicas() < requiredRedundancy) {
additionalReplRequired = requiredRedundancy - numReplicas.liveReplicas()
- pendingNum;
} else {
additionalReplRequired = 1; // Needed on a new rack
}{code}
With UD, it is possible for 2 new replicas to be needed to meet the block
placement policy, if all existing replicas are on a node with the same domain.
For traditional '2 rack redundancy', only 1 new replica would ever have been
needed in this scenario.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]