[ 
https://issues.apache.org/jira/browse/HADOOP-3799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12709016#action_12709016
 ] 

dhruba borthakur commented on HADOOP-3799:
------------------------------------------

After sleeping on it, I think it is necessary to ensure that the balancer 
does at least the bare minimum to work gracefully with an external block 
placement policy.

> if a cluster has to support multiple replication policies, it could be the 
> plugin-code's responsibility to decide which policy to use based on the file 
> owner/permissions/filename for the block

That's my plan. One of my ideas is to change the block placement policy for a 
file directory based on access patterns. The plugin will analyze a set of past 
access patterns (stored in an external db) to decide what type of placement is 
"currently" best for a dataset.
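
To make the idea concrete, here is a rough sketch of how such a plugin might 
pick a placement type per directory from recorded access statistics. Every 
name in it (PlacementKind, AccessHistory, InMemoryAccessHistory, 
PolicySelector, the reads-per-hour metric and the threshold) is hypothetical 
and chosen only for illustration; none of it is taken from the attached patch 
or from existing Hadoop code, and the in-memory map merely stands in for the 
external db.

```java
import java.util.HashMap;
import java.util.Map;

// All types below are hypothetical illustrations, not Hadoop APIs.
enum PlacementKind { DEFAULT, SPREAD_WIDE }

/** Assumed view of past access patterns, e.g. backed by an external db. */
interface AccessHistory {
    /** Average reads per hour for a directory; negative if nothing is recorded. */
    double readsPerHour(String directory);
}

/** In-memory stand-in for the external db, used only to keep the sketch runnable. */
final class InMemoryAccessHistory implements AccessHistory {
    private final Map<String, Double> stats = new HashMap<>();

    void record(String directory, double readsPerHour) {
        stats.put(directory, readsPerHour);
    }

    @Override
    public double readsPerHour(String directory) {
        return stats.getOrDefault(directory, -1.0);
    }
}

/** Chooses a placement type for a file from the history of its parent directory. */
final class PolicySelector {
    private final AccessHistory history;
    private final double hotThreshold; // assumed tuning knob

    PolicySelector(AccessHistory history, double hotThreshold) {
        this.history = history;
        this.hotThreshold = hotThreshold;
    }

    PlacementKind choose(String filePath) {
        double reads = history.readsPerHour(parentOf(filePath));
        if (reads < 0) {
            return PlacementKind.DEFAULT; // no history yet: keep the default placement
        }
        return reads >= hotThreshold ? PlacementKind.SPREAD_WIDE : PlacementKind.DEFAULT;
    }

    /** Returns the parent directory of a slash-separated path, or "/" at the root. */
    static String parentOf(String path) {
        int slash = path.lastIndexOf('/');
        return slash <= 0 ? "/" : path.substring(0, slash);
    }
}
```

Because the decision is recomputed from the history on every call, the 
placement type for a dataset can change over time as its access pattern 
changes, which is the "currently best" behaviour I have in mind. Blocks that 
were placed under an older decision would still need the balancer to respect 
whatever policy applies to them.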

> Design a pluggable interface to place replicas of blocks in HDFS
> ----------------------------------------------------------------
>
>                 Key: HADOOP-3799
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3799
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: BlockPlacementPluggable.txt
>
>
> The current HDFS code typically places one replica on local rack, the second 
> replica on remote random rack and the third replica on a random node of that 
> remote rack. This algorithm is baked in the NameNode's code. It would be nice 
> to make the block placement algorithm a pluggable interface. This will allow 
> experimentation with different placement algorithms based on workloads, 
> availability guarantees and failure models.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
