[
https://issues.apache.org/jira/browse/HDFS-17137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Haiyang Hu updated HDFS-17137:
------------------------------
Description:
The Standby/Observer NameNode should not run the redundant-replica handling
logic when setReplication decreases the replication factor.
At present, when setReplication is called to decrease the replication factor:
* The Active NameNode calls BlockManager#processExtraRedundancyBlock to select
the dn holding the redundant replica, then adds it to the excessRedundancyMap
and to invalidateBlocks (RedundancyMonitor is then scheduled to delete the
block on that dn).
* The Standby NameNode or Observer NameNode then loads the editlog and applies
the SetReplicationOp. If the dn of the replica to be deleted has not yet sent
its incremental block report, BlockManager#processExtraRedundancyBlock is also
called here to select the dn of the redundant replica and add it to the
excessRedundancyMap (the dn selected here may differ from the dn selected by
the Active NameNode).
A dn left in the excessRedundancyMap may affect dn decommission, so that the
decommission of a dn cannot complete on the Standby/Observer NameNode.
A specific case is as follows:
For example, a file has 3 replicas (d1, d2, d3) and setReplication is called to
set it to 2 replicas.
* The Active NameNode selects d1 as the redundant replica and adds it to the
excessRedundancyMap and invalidateBlocks.
* The Standby NameNode replays the SetReplicationOp (at this time, d1 has not
yet sent its incremental block report), so the redundant replica dn selected
here may differ from the Active NameNode's choice; for example, d2 is added to
the excessRedundancyMap.
* d1 then finishes deleting the block and sends its incremental block report.
* The DN list for this block in the Active NameNode contains d2 and d3 (d1 is
removed from the excessRedundancyMap when the incremental block report is
processed).
* The DN list for this block in the Standby NameNode contains d2 and d3 (d2
cannot be removed from the excessRedundancyMap when the incremental block
report is processed).
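The steps above can be replayed in a small toy model. This is only a sketch and not Hadoop code: BlockView, markExcess and reportDeleted are hypothetical names, the excess set merely stands in for the excessRedundancyMap, and the Standby's choice of d2 is assumed as in the example.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Toy model of one NameNode's view of a single block. Not Hadoop code.
class BlockView {
    final Set<String> replicas = new LinkedHashSet<>();
    final Set<String> excess = new LinkedHashSet<>(); // stands in for excessRedundancyMap

    BlockView(String... dns) {
        for (String dn : dns) {
            replicas.add(dn);
        }
    }

    // Applying the replication decrease: mark the chosen dn as holding an excess replica.
    void markExcess(String dn) {
        excess.add(dn);
    }

    // Incremental block report: dn says it has deleted its replica.
    void reportDeleted(String dn) {
        replicas.remove(dn);
        excess.remove(dn);
    }
}

class ExcessDivergenceDemo {
    public static void main(String[] args) {
        BlockView active = new BlockView("d1", "d2", "d3");
        BlockView standby = new BlockView("d1", "d2", "d3");

        active.markExcess("d1");  // Active picks d1 and schedules its deletion
        standby.markExcess("d2"); // Standby independently picks d2 (assumed choice)

        // d1 deletes the block and reports the deletion to both NameNodes.
        active.reportDeleted("d1");
        standby.reportDeleted("d1");

        System.out.println("active  replicas=" + active.replicas + " excess=" + active.excess);
        System.out.println("standby replicas=" + standby.replicas + " excess=" + standby.excess);
    }
}
```

In this model both views end with replicas d2 and d3, but only the Standby still has an entry (d2) in its excess set, because no deletion report for d2 will ever arrive.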
At this time, the decommission operation is executed on d3.
* The Active NameNode selects a new node d4 to copy the replica to, and d4 sends
an incremental block report.
* The DN list for this block in the Active NameNode contains d2, d3
(decommissioning status) and d4, so d3 can be decommissioned normally.
* The DN list for this block in the Standby NameNode is d3 (decommissioning
status), d2 (redundant status) and d4. Since the requirement for two live
replicas is not met, d3 cannot be decommissioned at this time.
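The difference between the two views can be illustrated with a simplified live-replica count. This is a sketch under the assumption that a replica counts as live only if its dn is neither decommissioning nor marked redundant. The real NameNode check is more involved, and Replica, liveCount and canFinishDecommission are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

// Toy replica record: which dn holds it and how that NameNode classifies it.
class Replica {
    final String dn;
    final boolean decommissioning;
    final boolean excess;

    Replica(String dn, boolean decommissioning, boolean excess) {
        this.dn = dn;
        this.decommissioning = decommissioning;
        this.excess = excess;
    }
}

class DecommissionCheck {
    // Assumption: only replicas that are neither decommissioning nor excess are live.
    static long liveCount(List<Replica> replicas) {
        return replicas.stream()
                .filter(r -> !r.decommissioning && !r.excess)
                .count();
    }

    // Decommission can finish once enough live replicas exist elsewhere.
    static boolean canFinishDecommission(List<Replica> replicas, int expectedReplication) {
        return liveCount(replicas) >= expectedReplication;
    }

    public static void main(String[] args) {
        List<Replica> activeView = Arrays.asList(
                new Replica("d2", false, false),
                new Replica("d3", true, false),
                new Replica("d4", false, false));
        List<Replica> standbyView = Arrays.asList(
                new Replica("d3", true, false),
                new Replica("d2", false, true), // still marked redundant on the Standby
                new Replica("d4", false, false));

        System.out.println("active can finish:  " + canFinishDecommission(activeView, 2));
        System.out.println("standby can finish: " + canFinishDecommission(standbyView, 2));
    }
}
```

With a target of two replicas, the Active view has two live replicas (d2, d4), while the Standby view has only one (d4).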
Therefore, consider having the Standby NameNode or Observer NameNode skip the
redundant-replica logic when applying setReplication.
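One way to picture the proposal is a guard on the NameNode's HA state. The following is a minimal sketch in a toy model, not the actual patch: HaState, ReplicationReplay and onReplicationDecreased are hypothetical names, and the real change would live in the NameNode's own code paths.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of a NameNode reacting to a replication decrease. Not Hadoop code.
class ReplicationReplay {
    enum HaState { ACTIVE, STANDBY, OBSERVER }

    final HaState state;
    final Set<String> excess = new HashSet<>(); // stands in for excessRedundancyMap

    ReplicationReplay(HaState state) {
        this.state = state;
    }

    // Hypothetical handler invoked when the replication factor is lowered.
    void onReplicationDecreased(String candidateDn) {
        if (state != HaState.ACTIVE) {
            // Standby/Observer: do not pick an excess replica; rely on the
            // block reports that follow the Active NameNode's decision.
            return;
        }
        excess.add(candidateDn);
    }

    public static void main(String[] args) {
        ReplicationReplay active = new ReplicationReplay(HaState.ACTIVE);
        ReplicationReplay standby = new ReplicationReplay(HaState.STANDBY);
        active.onReplicationDecreased("d1");
        standby.onReplicationDecreased("d2");
        System.out.println("active excess=" + active.excess);
        System.out.println("standby excess=" + standby.excess);
    }
}
```

In this sketch only the Active NameNode records an excess replica, so the Standby and Observer cannot end up with a stale entry that blocks decommission.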
> Standby/Observer NameNode should not handle redundant replica block logic
> when set decrease replication
> ----------------------------------------------------------------------------------------------------------
>
> Key: HDFS-17137
> URL: https://issues.apache.org/jira/browse/HDFS-17137
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Haiyang Hu
> Assignee: Haiyang Hu
> Priority: Major