[ https://issues.apache.org/jira/browse/HADOOP-1300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12501720 ]

Hairong Kuang commented on HADOOP-1300:
---------------------------------------

> This might be helpful because this algorithm might be heavy on CPU usage
I do not think this algorithm is CPU heavy. It scans all replicas of a block
at most 3 times, and when there are excess replicas, the most common case is
only 4 or 5 replicas per block.
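
For illustration only, here is a minimal Java sketch of such a bounded-pass, rack-aware chooser. The Replica type, the method names, and the least-free-space tie-break are assumptions for the example, not the actual namenode code:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical replica descriptor; the real code works with datanode descriptors. */
record Replica(String datanode, String rack, long freeBytes) {}

class ExcessReplicaChooser {
  /**
   * Choose one excess replica to delete while preserving rack diversity.
   * Three passes over the replica list, which stays cheap because an
   * over-replicated block typically holds only 4 or 5 replicas.
   */
  static Replica chooseVictim(List<Replica> replicas) {
    // Pass 1: count replicas per rack.
    Map<String, Integer> perRack = new HashMap<>();
    for (Replica r : replicas) {
      perRack.merge(r.rack(), 1, Integer::sum);
    }

    // Pass 2: only replicas on racks holding more than one copy are
    // candidates; if every rack holds exactly one copy, deleting any
    // replica reduces rack diversity equally, so all are candidates.
    List<Replica> candidates = new ArrayList<>();
    for (Replica r : replicas) {
      if (perRack.get(r.rack()) > 1) {
        candidates.add(r);
      }
    }
    if (candidates.isEmpty()) {
      candidates = replicas;
    }

    // Pass 3: among the candidates, pick the node with the least free
    // space (one plausible tie-break; the real heuristic may differ).
    Replica victim = candidates.get(0);
    for (Replica r : candidates) {
      if (r.freeBytes() < victim.freeBytes()) {
        victim = r;
      }
    }
    return victim;
  }
}
```

On the scenario reported in this issue (3 replicas on one rack, 1 on another), the candidates are the three co-located replicas, so the lone off-rack copy survives.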

> are there any reasons to adopt different policies for allocation and deletion?
For allocation, where a replica is placed affects performance. But for
deletion, the cost of deleting any replica is the same; what matters is only
which replicas remain, so the two decisions call for different policies.
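
To make the asymmetry concrete: a placement policy has to rank candidate targets, while a deletion policy only needs to check the set of survivors. A minimal sketch, with a hypothetical helper name rather than a real namenode API:

```java
import java.util.HashSet;
import java.util.List;

class DeletionCheck {
  /**
   * Since every replica costs the same to delete, the only thing a
   * rack-aware deletion policy must guarantee is that the surviving
   * replicas still span more than one rack (when they did before).
   */
  static boolean preservesRackDiversity(List<String> survivorRacks) {
    return new HashSet<>(survivorRacks).size() >= 2;
  }
}
```

In this issue's scenario, deleting the lone off-rack replica leaves all survivors on a single rack, so the check fails; deleting any of the three co-located replicas passes.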

> deletion of excess replicas does not take into account 'rack-locality'
> ----------------------------------------------------------------------
>
>                 Key: HADOOP-1300
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1300
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Koji Noguchi
>            Assignee: Hairong Kuang
>         Attachments: excessDel.patch
>
>
> One rack went down today, resulting in one missing block/file.
> Looking at the log, this block was originally over-replicated. 
> 3 replicas on one rack and 1 replica on another.
> Namenode decided to delete the latter, leaving 3 replicas on the same rack.
> It'll be nice if the deletion is also rack-aware.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
