Uma Maheswara Rao G created HDFS-9381:
-----------------------------------------

             Summary: When same block came for replication for Striped mode, we 
can move that block to PendingReplications
                 Key: HDFS-9381
                 URL: https://issues.apache.org/jira/browse/HDFS-9381
             Project: Hadoop HDFS
          Issue Type: Sub-task
          Components: namenode
    Affects Versions: 3.0.0
            Reporter: Uma Maheswara Rao G
            Assignee: Uma Maheswara Rao G


Currently, in the replication flow for striped blocks, we just return null if 
the block already exists in pendingReplications.

{code}
if (block.isStriped()) {
  if (pendingNum > 0) {
    // Wait the previous recovery to finish.
    return null;
  }
{code}

Here, if neededReplications contains only a few blocks (by default, fewer than 
numliveNodes*2), the same blocks can be picked again from neededReplications 
when we just return null, because we are not removing the element from 
neededReplications. Since this replication process has to take the 
FSNamesystem lock, we may spend time unnecessarily in every loop. 
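
To illustrate the concern, here is a minimal toy model (not Hadoop code; the 
class and method names are hypothetical and blocks are plain strings). It 
counts how often a scheduling pass re-picks a block that is already pending 
when the block is never removed from the needed queue:

```java
import java.util.ArrayList;
import java.util.Set;

/** Toy model only: shows an already-pending block being re-picked on every pass. */
class RepeatedPickSketch {
    /**
     * Runs the given number of scheduling passes over the needed queue and
     * returns how many picks were wasted on blocks that are already pending.
     */
    static int countWastedPicks(Set<String> needed, Set<String> pending, int passes) {
        int wasted = 0;
        for (int pass = 0; pass < passes; pass++) {
            // In the real NameNode this scan would run under the namesystem lock.
            for (String block : new ArrayList<>(needed)) {
                if (pending.contains(block)) {
                    // Mirrors "return null": nothing is scheduled and the block
                    // stays in the needed queue, so the next pass sees it again.
                    wasted++;
                }
            }
        }
        return wasted;
    }
}
```

With a block in both sets, the wasted count grows by one on every pass, which 
is the repeated work described above.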

So my suggestion/improvement is:
 Instead of just returning null, how about incrementing pendingReplications for 
this block and removing it from neededReplications? Another point to consider 
here: to add a block to pendingReplications, we generally need a target, which 
is simply the node to which we issued the replication command. Later, after 
the replication succeeds and the DN reports it, the block is removed from 
pendingReplications in NN addBlock. 

 Since this block is newly picked from neededReplications, we would not have 
selected a target yet. So which target should be passed to pendingReplications 
if we add this block? One option I am thinking of: how about just passing 
srcNode itself as the target for this special condition? If the block is really 
missing, srcNode will not report it anyway. So the block will not be removed 
from pendingReplications, and when it times out it will be considered for 
replication again; at that time it will find an actual target to replicate to.
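
A rough sketch of this idea, under stated assumptions: this is a toy model, 
not the BlockManager or PendingReplicationBlocks API. All names 
(PlaceholderTargetSketch, parkInPending, blockReported, expire) are 
hypothetical, blocks and nodes are plain strings, and time is passed in 
explicitly.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of the proposal; every name here is hypothetical. */
class PlaceholderTargetSketch {
    /** Nodes we are waiting on for one block, plus when the entry was last touched. */
    static final class PendingEntry {
        final Set<String> targets = new HashSet<>();
        long timestampMs;
    }

    final Set<String> needed = new LinkedHashSet<>();
    final Map<String, PendingEntry> pending = new HashMap<>();
    final long timeoutMs;

    PlaceholderTargetSketch(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    /** Instead of returning null: park the block in pending with srcNode as a placeholder target. */
    void parkInPending(String block, String srcNode, long nowMs) {
        needed.remove(block);
        PendingEntry entry = pending.computeIfAbsent(block, k -> new PendingEntry());
        entry.targets.add(srcNode);
        entry.timestampMs = nowMs;
    }

    /** A datanode reported the block: stop waiting on that node. */
    void blockReported(String block, String node) {
        PendingEntry entry = pending.get(block);
        if (entry == null) {
            return;
        }
        entry.targets.remove(node);
        if (entry.targets.isEmpty()) {
            pending.remove(block);
        }
    }

    /** Timed-out entries return to the needed queue, where a real target can be chosen. */
    void expire(long nowMs) {
        Iterator<Map.Entry<String, PendingEntry>> it = pending.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, PendingEntry> e = it.next();
            if (nowMs - e.getValue().timestampMs >= timeoutMs) {
                needed.add(e.getKey());
                it.remove();
            }
        }
    }
}
```

In this model, as long as srcNode never reports the block, the entry stays in 
pending until expire() moves it back to needed, so the block is not re-picked 
in every loop in the meantime.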





 





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
