[ 
https://issues.apache.org/jira/browse/HDFS-17137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haiyang Hu updated HDFS-17137:
------------------------------
    Description: 
Standby/Observer NameNodes should not run the redundant-replica handling logic 
when the replication factor is decreased.

At present, when setReplication is called to decrease the replication factor:
* The Active NameNode calls BlockManager#processExtraRedundancyBlock to select 
the dn that holds the redundant replica, then adds that replica to 
excessRedundancyMap and to invalidateBlocks (RedundancyMonitor is then 
scheduled to delete the block on that dn).

* The Standby NameNode or Observer NameNode then loads the edit log and applies 
the SetReplicationOp. If the dn holding the replica to be deleted has not yet 
sent its incremental block report, BlockManager#processExtraRedundancyBlock is 
called here as well to select a dn holding a redundant replica and add it to 
excessRedundancyMap (the dn selected here may differ from the dn selected on 
the Active NameNode).

A dn that stays in excessRedundancyMap can interfere with dn decommissioning, 
so that the decommission operation cannot complete on the Standby/Observer 
NameNode.

A specific case is as follows:
For example, a file has 3 replicas (d1, d2, d3) and setReplication is called to 
reduce it to 2 replicas.

* The Active NameNode selects d1 as the dn holding the redundant replica and 
adds it to excessRedundancyMap and invalidateBlocks.
* The Standby NameNode replays the SetReplicationOp (at this time, d1 has not 
yet sent its incremental block report), so the dn it selects may differ from 
the one chosen by the Active NameNode; for example, it selects d2 and adds it 
to excessRedundancyMap.
* d1 then finishes deleting the block and sends its incremental block report.
* The DN list for this block on the Active NameNode contains d2 and d3 (d1 is 
removed from excessRedundancyMap when the incremental block report is 
processed).
* The DN list for this block on the Standby NameNode contains d2 and d3 (d2 
cannot be removed from excessRedundancyMap when the incremental block report 
is processed, since the report concerns d1).
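
The divergence in the steps above can be sketched with a toy model. This is 
not Hadoop code: ExcessDivergenceDemo, BlockView, markExcess and reportDeleted 
are hypothetical names, and the "selection" is hard-coded to the dns named in 
the example rather than computed by any real placement policy.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Toy model only (hypothetical names); it mirrors the example, not BlockManager itself.
class ExcessDivergenceDemo {

    // One NameNode's view of a single block.
    static class BlockView {
        final Set<String> locations = new LinkedHashSet<>(); // dns believed to hold the block
        final Set<String> excess = new LinkedHashSet<>();    // stand-in for excessRedundancyMap

        BlockView(String... dns) {
            for (String dn : dns) {
                locations.add(dn);
            }
        }

        // Stand-in for the selection step: record which dn holds the excess replica.
        void markExcess(String dn) {
            excess.add(dn);
        }

        // Stand-in for an incremental block report saying "dn deleted the block".
        // Only the reporting dn's entries are cleared.
        void reportDeleted(String dn) {
            locations.remove(dn);
            excess.remove(dn);
        }
    }

    static BlockView activeView() {
        BlockView v = new BlockView("d1", "d2", "d3");
        v.markExcess("d1");    // Active selects d1
        v.reportDeleted("d1"); // d1 deletes the block and reports
        return v;
    }

    static BlockView standbyView() {
        BlockView v = new BlockView("d1", "d2", "d3");
        v.markExcess("d2");    // Standby happens to select d2 instead
        v.reportDeleted("d1"); // the same report from d1 cannot clear d2
        return v;
    }

    public static void main(String[] args) {
        BlockView a = activeView();
        BlockView s = standbyView();
        System.out.println("active  locations=" + a.locations + " excess=" + a.excess);
        System.out.println("standby locations=" + s.locations + " excess=" + s.excess);
    }
}
```

Both views end with locations d2 and d3, but only the standby view still has 
d2 recorded as excess.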

At this point, a decommission operation is started on d3.
* The Active NameNode selects a new node d4 to copy the replica to, and d4 
sends an incremental block report.
* The DN list for this block on the Active NameNode contains d2, d3 
(decommissioning status) and d4, so d3 can be decommissioned normally.
* The DN list for this block on the Standby NameNode contains d3 
(decommissioning status), d2 (redundant status) and d4. Since the requirement 
of two live replicas is not met, d3 cannot be decommissioned at this time.
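
The replica counting behind this outcome can be illustrated as follows. This 
is a simplified sketch, not the Hadoop implementation: ToyDecommissionCheck and 
its methods are hypothetical, and it assumes a replica counts as live only when 
its dn is neither decommissioning nor marked as excess, as the case above 
describes.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Toy model only (hypothetical names); the counting rule is an assumption taken from the example.
class ToyDecommissionCheck {

    // Count replicas whose dn is neither decommissioning nor marked as excess.
    static int liveReplicas(List<String> locations, Set<String> decommissioning,
                            Set<String> excess) {
        int live = 0;
        for (String dn : locations) {
            if (!decommissioning.contains(dn) && !excess.contains(dn)) {
                live++;
            }
        }
        return live;
    }

    // Decommission can finish once enough live replicas exist elsewhere.
    static boolean canFinishDecommission(List<String> locations, Set<String> decommissioning,
                                         Set<String> excess, int expectedReplication) {
        return liveReplicas(locations, decommissioning, excess) >= expectedReplication;
    }

    public static void main(String[] args) {
        List<String> locations = Arrays.asList("d2", "d3", "d4");
        Set<String> decommissioning = Collections.singleton("d3");

        // Active view: nothing is left in the excess set.
        boolean onActive = canFinishDecommission(
                locations, decommissioning, Collections.<String>emptySet(), 2);
        // Standby view: d2 is still marked as excess.
        boolean onStandby = canFinishDecommission(
                locations, decommissioning, Collections.singleton("d2"), 2);

        System.out.println("active: " + onActive + ", standby: " + onStandby);
    }
}
```

With these inputs the active view counts two live replicas (d2, d4) and the 
standby view counts one (d4), which matches the case described above.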

Therefore, consider having the Standby NameNode or Observer NameNode skip the 
redundant-replica logic when setReplication is applied.
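
One possible shape for such a guard is sketched below. It is only an 
illustration of the idea, not the actual patch: ToyReplicationHandler, HaState, 
onSetReplication and getExcessSelections are all hypothetical names, and the 
counter merely stands in for the call to 
BlockManager#processExtraRedundancyBlock.

```java
// Toy model only (hypothetical names); it illustrates the proposed guard, not the real change.
class ToyReplicationHandler {

    enum HaState { ACTIVE, STANDBY, OBSERVER }

    private final HaState state;
    private int excessSelections = 0; // how often the excess-replica selection ran

    ToyReplicationHandler(HaState state) {
        this.state = state;
    }

    // Called when a replication change is applied (directly or via edit log replay).
    void onSetReplication(int oldReplication, int newReplication) {
        if (newReplication >= oldReplication) {
            return; // not a decrease, so there are no extra replicas to handle
        }
        if (state != HaState.ACTIVE) {
            return; // proposed behaviour: Standby/Observer skip the selection
        }
        excessSelections++; // stand-in for selecting and recording the excess replica
    }

    int getExcessSelections() {
        return excessSelections;
    }

    public static void main(String[] args) {
        for (HaState s : HaState.values()) {
            ToyReplicationHandler h = new ToyReplicationHandler(s);
            h.onSetReplication(3, 2);
            System.out.println(s + " selections=" + h.getExcessSelections());
        }
    }
}
```

In this sketch only the ACTIVE instance records a selection; the STANDBY and 
OBSERVER instances leave their state untouched.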

>  Standby/Observer NameNode should not handle redundant replica block logic 
> when set decrease replication
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-17137
>                 URL: https://issues.apache.org/jira/browse/HDFS-17137
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Haiyang Hu
>            Assignee: Haiyang Hu
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)