Derek Dagit created HDFS-4270:
---------------------------------

             Summary: Replications of the highest priority should be allowed to 
choose a source datanode that has reached its max replication limit
                 Key: HDFS-4270
                 URL: https://issues.apache.org/jira/browse/HDFS-4270
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: namenode
    Affects Versions: 0.23.5, 3.0.0
            Reporter: Derek Dagit
            Assignee: Derek Dagit
            Priority: Minor
             Fix For: 3.0.0, 2.0.3-alpha, 0.23.6


Blocks that have been identified as under-replicated are placed on one of 
several priority queues.  The highest priority queue is essentially reserved 
for situations in which only one replica of the block exists, meaning it should 
be replicated ASAP.

The ReplicationMonitor periodically computes replication work, and a call to 
BlockManager#chooseUnderReplicatedBlocks selects a given number of 
under-replicated blocks, choosing blocks from the highest-priority queue first 
and working down to the lowest priority queue.
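
As a rough illustration of that selection order, here is a minimal sketch in
Java. It is not the BlockManager code: the class UnderReplicatedQueuesSketch,
its methods, the use of string block IDs, and the convention that index 0 is
the highest priority are all assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for the under-replicated block queues.
public class UnderReplicatedQueuesSketch {
    // Assumption: index 0 is the highest priority, larger indexes are lower.
    private final List<Deque<String>> queues = new ArrayList<>();

    public UnderReplicatedQueuesSketch(int levels) {
        for (int i = 0; i < levels; i++) {
            queues.add(new ArrayDeque<>());
        }
    }

    public void add(String blockId, int priority) {
        queues.get(priority).addLast(blockId);
    }

    // Take up to maxBlocks blocks, draining higher-priority queues first.
    // The result has one list per priority level, in priority order.
    public List<List<String>> chooseBlocks(int maxBlocks) {
        List<List<String>> chosen = new ArrayList<>();
        int remaining = maxBlocks;
        for (Deque<String> queue : queues) {
            List<String> fromThisLevel = new ArrayList<>();
            while (remaining > 0 && !queue.isEmpty()) {
                fromThisLevel.add(queue.pollFirst());
                remaining--;
            }
            chosen.add(fromThisLevel);
        }
        return chosen;
    }
}
```

With one block queued at each of three levels and maxBlocks set to 2, the
sketch returns the blocks from the two highest levels and leaves the lowest
one queued for a later pass.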

In the subsequent call to BlockManager#computeReplicationWorkForBlocks, a 
source for the replication is chosen from among datanodes that have an 
available copy of the block needed.  This is done in 
BlockManager#chooseSourceDatanode.


chooseSourceDatanode's job is to choose the source datanode for the 
replication.  It picks a datanode at random from among those that hold a 
replica of the block and have not reached their replication limit (preferring 
datanodes that are currently decommissioning).

However, this logic does not take into account which priority queue the block 
came from.  If a datanode holds the last remaining replica of a block and has 
already reached its replication limit, the node is dismissed outright and the 
replication is not scheduled.

In some situations, this could lead to data loss, as the last remaining replica 
could disappear if an opportunity is not taken to schedule a replication.  It 
would be better to waive the max replication limit in cases of highest-priority 
block replication.
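
To make the proposal concrete, here is a minimal sketch in Java of source
selection with and without such a waiver. It is not the actual
chooseSourceDatanode implementation and not a patch: the class
SourceSelectionSketch, the Node type, the HIGHEST_PRIORITY constant, and the
waiveForHighest flag are hypothetical names introduced for illustration, and
the preference for decommissioning datanodes is left out to keep it short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration of the proposed waiver; not HDFS code.
public class SourceSelectionSketch {
    // Assumption: priority 0 denotes the highest-priority queue.
    public static final int HIGHEST_PRIORITY = 0;

    // Hypothetical view of a datanode that holds a replica of the block.
    public static final class Node {
        public final String name;
        public final int activeReplications;

        public Node(String name, int activeReplications) {
            this.name = name;
            this.activeReplications = activeReplications;
        }
    }

    // Picks a random source among the replica holders. A holder at or above
    // maxStreams is skipped unless waiveForHighest is set and the block
    // comes from the highest-priority queue. Returns null if no holder is
    // eligible, in which case no replication would be scheduled.
    public static Node chooseSource(List<Node> holders, int maxStreams,
                                    int priority, boolean waiveForHighest,
                                    Random rng) {
        boolean waived = waiveForHighest && priority == HIGHEST_PRIORITY;
        List<Node> eligible = new ArrayList<>();
        for (Node node : holders) {
            boolean atLimit = node.activeReplications >= maxStreams;
            if (!atLimit || waived) {
                eligible.add(node);
            }
        }
        if (eligible.isEmpty()) {
            return null;
        }
        return eligible.get(rng.nextInt(eligible.size()));
    }
}
```

With a single holder that is already at its limit, the sketch returns null
when waiveForHighest is false (the behavior described in this issue) and
returns that holder when the flag is true and the block is of the highest
priority.  Blocks of lower priority are still subject to the limit.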

