[
https://issues.apache.org/jira/browse/HADOOP-1300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501720
]
Hairong Kuang commented on HADOOP-1300:
---------------------------------------
> This might be helpful because this algorithm might be heavy on CPU usage
I do not think this algorithm is CPU heavy. Basically, it scans the replicas of
a block at most 3 times, and when there are excess replicas, the most common
case is only 4 or 5 replicas per block, so each scan is short.
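To make the cost concrete, here is a minimal sketch of this kind of bounded-pass
selection (the class and method names are mine for illustration, not the ones in
excessDel.patch): one pass buckets the replicas by rack, a second pass picks a
victim from any rack holding more than one replica, so the work is linear in the
replica count.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RackAwareExcessChooser {
  // Minimal stand-in for a datanode descriptor; illustrative only.
  static class Replica {
    final String host;
    final String rack;
    Replica(String host, String rack) { this.host = host; this.rack = rack; }
  }

  /** Pick one replica to delete without shrinking the set of racks covered. */
  static Replica chooseExcessReplica(List<Replica> replicas) {
    // Pass 1: bucket the replicas by rack.
    Map<String, List<Replica>> byRack = new HashMap<>();
    for (Replica r : replicas) {
      byRack.computeIfAbsent(r.rack, k -> new ArrayList<>()).add(r);
    }
    // Pass 2: prefer a victim on a rack holding more than one replica,
    // so every rack keeps at least one copy of the block.
    for (List<Replica> onRack : byRack.values()) {
      if (onRack.size() > 1) {
        return onRack.get(0);
      }
    }
    // Every rack holds exactly one replica; any deletion drops a rack.
    return replicas.get(0);
  }
}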
> are there any reasons to adopt different policies for allocation and deletion?
For allocation, where a replica is placed affects read and write performance. For
deletion, the cost of removing any replica is the same; the policy only needs to
avoid collapsing all the replicas onto one rack.
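Applied to the scenario reported below (3 replicas on one rack, 1 on another), a
rack-aware victim choice keeps the lone off-rack copy. A usage sketch against the
hypothetical chooser above:

public class Demo {
  public static void main(String[] args) {
    java.util.List<RackAwareExcessChooser.Replica> replicas = java.util.List.of(
        new RackAwareExcessChooser.Replica("dn1", "/rack1"),
        new RackAwareExcessChooser.Replica("dn2", "/rack1"),
        new RackAwareExcessChooser.Replica("dn3", "/rack1"),
        new RackAwareExcessChooser.Replica("dn4", "/rack2"));
    // The victim comes from /rack1, so /rack2 keeps its only copy
    // and a /rack1 outage cannot make the block unavailable.
    System.out.println(RackAwareExcessChooser.chooseExcessReplica(replicas).host);
  }
}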
> deletion of excess replicas does not take into account 'rack-locality'
> ----------------------------------------------------------------------
>
> Key: HADOOP-1300
> URL: https://issues.apache.org/jira/browse/HADOOP-1300
> Project: Hadoop
> Issue Type: Bug
> Components: dfs
> Reporter: Koji Noguchi
> Assignee: Hairong Kuang
> Attachments: excessDel.patch
>
>
> One rack went down today, resulting in one missing block/file.
> Looking at the log, this block was originally over-replicated.
> 3 replicas on one rack and 1 replica on another.
> Namenode decided to delete the latter, leaving 3 replicas on the same rack.
> It would be nice if the deletion were also rack-aware.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.