[
https://issues.apache.org/jira/browse/HDFS-15200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17049202#comment-17049202
]
Surendra Singh Lilhore commented on HDFS-15200:
-----------------------------------------------
I think we can delete the corrupt replica immediately, since there is no chance of it being corrected.
The replicas on stale storage should be reported as live in the next block report (BR), hopefully :).
[~arp], [~aajisaka], [~weichiu], any thoughts on this?
> Delete Corrupt Replica Immediately Irrespective of Replicas On Stale Storage
> -----------------------------------------------------------------------------
>
> Key: HDFS-15200
> URL: https://issues.apache.org/jira/browse/HDFS-15200
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Ayush Saxena
> Assignee: Ayush Saxena
> Priority: Critical
>
> Presently, before adding a replica to invalidates, {{invalidateBlock(..)}}
> checks whether any replica of the block is on stale storage. If one is, it
> postpones deletion of the replica.
> Here:
> {code:java}
> // Check how many copies we have of the block
> if (nr.replicasOnStaleNodes() > 0) {
>   blockLog.debug("BLOCK* invalidateBlocks: postponing " +
>       "invalidation of {} on {} because {} replica(s) are located on " +
>       "nodes with potentially out-of-date block reports", b, dn,
>       nr.replicasOnStaleNodes());
>   postponeBlock(b.getCorrupted());
>   return false;
> {code}
>
> In the case of a corrupt replica, we can skip this logic and delete the
> replica immediately, as a corrupt replica can't be corrected.
> One outcome of the present behavior is that the namenodes show different block
> states after failover:
> When a replica is reported corrupt, the Active NN marks it as corrupt,
> marks it for deletion, and removes it from corruptReplicas and
> excessRedundancyMap.
> Suppose failover happens before the replica is deleted.
> The Standby Namenode marks all storages as stale
> and then starts processing IBRs. Since the replicas are now on stale
> storage, it skips the deletion and the removal from corruptReplicas.
> Hence the two namenodes show different numbers and different corrupt
> replicas.
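
The divergence described in the issue can be illustrated with a small toy model. This is only a sketch, not Hadoop code: ToyNameNode, its fields, and the deleteCorruptImmediately flag are hypothetical names introduced here, and the single if statement stands in for the stale-storage check quoted in the description.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of how a NameNode might track one corrupt replica. Not Hadoop code. */
public class StaleInvalidationSketch {

    /** Hypothetical stand-in for one NameNode's view of a single block. */
    static class ToyNameNode {
        final Set<String> corruptReplicas = new HashSet<>();
        final Set<String> pendingDeletion = new HashSet<>();
        final Set<String> postponed = new HashSet<>();
        final int replicasOnStaleStorage;
        // Assumption: a switch that bypasses the stale-storage check for corrupt replicas.
        final boolean deleteCorruptImmediately;

        ToyNameNode(int replicasOnStaleStorage, boolean deleteCorruptImmediately) {
            this.replicasOnStaleStorage = replicasOnStaleStorage;
            this.deleteCorruptImmediately = deleteCorruptImmediately;
        }

        /** Marks the replica corrupt, then either postpones or schedules its deletion. */
        boolean invalidateCorruptReplica(String replica) {
            corruptReplicas.add(replica);
            if (replicasOnStaleStorage > 0 && !deleteCorruptImmediately) {
                // Same decision as the quoted snippet: postpone while any replica is stale.
                postponed.add(replica);
                return false;
            }
            pendingDeletion.add(replica);
            corruptReplicas.remove(replica);
            return true;
        }
    }

    public static void main(String[] args) {
        // Active NN: no storage is stale, so the corrupt replica is scheduled for deletion.
        ToyNameNode active = new ToyNameNode(0, false);
        active.invalidateCorruptReplica("dn1");

        // Standby NN after failover: every storage is stale, so deletion is postponed.
        ToyNameNode standby = new ToyNameNode(3, false);
        standby.invalidateCorruptReplica("dn1");

        // Same standby state, but with the proposed behavior switched on.
        ToyNameNode proposed = new ToyNameNode(3, true);
        proposed.invalidateCorruptReplica("dn1");

        System.out.println("active corrupt count:   " + active.corruptReplicas.size());
        System.out.println("standby corrupt count:  " + standby.corruptReplicas.size());
        System.out.println("proposed corrupt count: " + proposed.corruptReplicas.size());
    }
}
```

In this model the active and standby instances end up with different corrupt-replica sets for the same event, while the instance with the flag enabled matches the active one. How the real check should be bypassed is for the patch to decide.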