wudeyu created HDFS-16100:
-----------------------------

             Summary: HA: Improve performance of Standby node transition to Active
                 Key: HDFS-16100
                 URL: https://issues.apache.org/jira/browse/HDFS-16100
             Project: Hadoop HDFS
          Issue Type: Wish
          Components: namenode
            Reporter: wudeyu


pendingDNMessages in the Standby Node holds postponed block reports until 
they can be processed. A block report in pendingDNMessages is handled as follows:
 # If the GS (generation stamp) of the replica is in the future, the Standby 
Node processes it when the corresponding edit log op (e.g. add_block) is loaded.
 # If the replica is corrupt, the Standby Node processes it while it 
transitions to Active.
 # If the DataNode is removed, its block reports are removed from 
pendingDNMessages.
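
To make the three paths above concrete, here is a minimal sketch of a postponed-report queue keyed by block ID. It is a simplified model, not the HDFS implementation: the class PendingReportsSketch, its Report type, and all method names are hypothetical and chosen only for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Simplified, hypothetical model of a Standby's postponed block-report queue. */
class PendingReportsSketch {
    /** One postponed report: the reporting DataNode, the block, and the reported GS. */
    static final class Report {
        final String datanodeId;
        final long blockId;
        final long genStamp;

        Report(String datanodeId, long blockId, long genStamp) {
            this.datanodeId = datanodeId;
            this.blockId = blockId;
            this.genStamp = genStamp;
        }
    }

    private final Map<Long, Queue<Report>> byBlock = new HashMap<>();
    private int count;

    /** Postpone a report that cannot be processed yet. */
    void enqueue(Report r) {
        byBlock.computeIfAbsent(r.blockId, k -> new ArrayDeque<>()).add(r);
        count++;
    }

    /** Case 1: the edit for this block has been loaded, so its reports can be processed. */
    Queue<Report> takeForBlock(long blockId) {
        Queue<Report> q = byBlock.remove(blockId);
        if (q == null) {
            return new ArrayDeque<>();
        }
        count -= q.size();
        return q;
    }

    /** Case 2: on transition to Active, everything still queued must be drained. */
    List<Report> takeAll() {
        List<Report> all = new ArrayList<>();
        for (Queue<Report> q : byBlock.values()) {
            all.addAll(q);
        }
        byBlock.clear();
        count = 0;
        return all;
    }

    /** Case 3: the DataNode was removed, so its queued reports are dropped. */
    void removeForDatanode(String datanodeId) {
        Iterator<Queue<Report>> it = byBlock.values().iterator();
        while (it.hasNext()) {
            Queue<Report> q = it.next();
            int before = q.size();
            q.removeIf(r -> r.datanodeId.equals(datanodeId));
            count -= before - q.size();
            if (q.isEmpty()) {
                it.remove();
            }
        }
    }

    int size() {
        return count;
    }
}
```

In this model, the cost of takeAll() grows linearly with the number of reports that were never drained by case 1 or case 3.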

As the number of corrupt replicas grows, the transition takes longer. In our 
case, there were 60 million block reports in pendingDNMessages before the 
transition. Processing them took almost 7 minutes, and the node was killed by 
zkfc. Most of these block reports had replica state RBW with a wrong GS 
(less than that of the stored block in the Standby Node).

In my opinion, the Standby Node could ignore block reports whose replica state 
is RBW with a wrong GS, because the Active node/DataNode will remove those 
replicas later.
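
The proposed check could be sketched as a simple predicate, assuming "wrong GS" means a reported GS lower than the stored block's GS, as described above. The class StaleRbwFilterSketch and the method canIgnore are hypothetical names; the local enum only mirrors HDFS replica state names for illustration and is not the HDFS type.

```java
/** Hypothetical sketch of the proposed filter; not part of HDFS. */
class StaleRbwFilterSketch {
    /** Local stand-in for replica states, defined here so the example is self-contained. */
    enum ReplicaState { FINALIZED, RBW, RWR, RUR, TEMPORARY }

    /**
     * Returns true when a report may be dropped instead of being queued:
     * the replica is RBW and its reported GS is older than the stored block's GS.
     */
    static boolean canIgnore(ReplicaState reportedState,
                             long reportedGenStamp,
                             long storedGenStamp) {
        return reportedState == ReplicaState.RBW
            && reportedGenStamp < storedGenStamp;
    }
}
```

Reports with a GS in the future, or in any other replica state, do not match this predicate and would still be postponed as before.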

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
