[
https://issues.apache.org/jira/browse/HADOOP-2065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591764#action_12591764
]
lohit vijayarenu commented on HADOOP-2065:
------------------------------------------
Another approach, suggested by Konstantin, is to keep a global map of
corruptBlocks. This has two advantages:
- In NumReplicas, instead of looking up corruptBlocks on each
DatanodeDescriptor, we fetch the list of nodes holding corrupt replicas from
the global map and match it locally against the NodeIterator returned by
blocksMap.
- If we ever need the list of corruptBlocks in the FileSystem, all the
information is in one place.
Extending BlockInfo and swapping it back in seems complicated. If we move the
list into a global map, all we have to handle is a new data structure returned
via getBlockLocations. We could have something like LocatedReplicas instead of
DatanodeInfo inside LocatedBlock. I think this solves the whole problem.
Thoughts?
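As a rough sketch of the global-map idea (not the actual patch: the class and
method names here are hypothetical, keys are plain strings for brevity, and
real namenode code would key on Block and DatanodeDescriptor objects with
proper locking), it could look something like this:

```java
import java.util.*;

// Hypothetical sketch: instead of tagging each DatanodeDescriptor with its
// corrupt blocks, keep one global map from a block to the set of nodes known
// to hold a corrupt replica of it.
class CorruptReplicasMap {
    private final Map<String, Set<String>> corruptReplicas = new HashMap<>();

    // Record that `node` holds a corrupt replica of `blockId`.
    void addCorruptReplica(String blockId, String node) {
        corruptReplicas.computeIfAbsent(blockId, k -> new HashSet<>())
                       .add(node);
    }

    // Nodes known to hold corrupt replicas of `blockId` (empty if none).
    Set<String> corruptNodes(String blockId) {
        return corruptReplicas.getOrDefault(blockId, Collections.emptySet());
    }

    // Count live (non-corrupt) replicas, given the full node list that the
    // blocksMap iterator would return for this block.
    int numLiveReplicas(String blockId, Collection<String> allNodes) {
        Set<String> corrupt = corruptNodes(blockId);
        int live = 0;
        for (String node : allNodes) {
            if (!corrupt.contains(node)) {
                live++;
            }
        }
        return live;
    }
}
```

With this shape, a NumReplicas-style count does one map lookup per block and
matches the result locally against the blocksMap iterator, rather than
consulting per-datanode state.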
> Replication policy for corrupted block
> ---------------------------------------
>
> Key: HADOOP-2065
> URL: https://issues.apache.org/jira/browse/HADOOP-2065
> Project: Hadoop Core
> Issue Type: Bug
> Components: dfs
> Affects Versions: 0.14.1
> Reporter: Koji Noguchi
> Assignee: lohit vijayarenu
> Fix For: 0.18.0
>
> Attachments: HADOOP-2065.patch
>
>
> Thanks to HADOOP-1955, even if one of the replicas is corrupted, the block
> should get replicated from a good replica relatively fast.
> Created this ticket to continue the discussion from
> http://issues.apache.org/jira/browse/HADOOP-1955#action_12531162.
> bq. 2. Delete corrupted source replica
> bq. 3. If all replicas are corrupt, stop replication.
> For (2), it'll be nice if the namenode can delete the corrupted block if
> there's a good replica on other nodes.
> For (3), I prefer if the namenode can still replicate the block.
> Before 0.14, if a file was corrupted, users were still able to pull the
> data and decide whether they wanted to delete those files. (HADOOP-2063)
> In 0.14 and later, we cannot/don't replicate these blocks, so they are
> eventually lost.
> To make matters worse, if the corrupted file is accessed, all the corrupted
> replicas except one are deleted, and the block stays at a replication
> factor of 1 forever.
>