[
https://issues.apache.org/jira/browse/HADOOP-3799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708766#action_12708766
]
dhruba borthakur commented on HADOOP-3799:
------------------------------------------
> Especially since in order to be stable under the rebalancer
Oh guys, you are going too far! I am talking about a faster cycle of innovation and
iteration. A pluggable interface allows the Hadoop community to experiment
with newer methods of block placement. Only once such a placement algorithm proves
beneficial and helpful does the related question of "how to make the balancer
work with the new placement policy" come to my mind. If experiments show
that there isn't any viable alternative pluggable policy, then the question of
"does the balancer work with a pluggable policy" is moot.
> hdfs probably needs to store metadata with the files or blocks
I do not like this approach. It makes HDFS heavy, clunky, and difficult to
maintain. Have you seen what happened to other file systems that tried to do
everything inside them, e.g. DCE-DFS? It is possible that HDFS might allow
generic blobs to be stored with files (aka extended file attributes)
where application-specific data can be kept. But that should be disassociated
from a "requirement" that the archival policy be stored with file metadata.
Again folks, I agree completely with you that a "finished product" needs to
encompass the "balancer". But to start experimenting to figure out whether a
different placement policy is beneficial at all, I need the pluggability
feature; otherwise I have to keep changing my Hadoop source code every time I
want to experiment. My experiments will probably take three to six months,
especially because I want to benchmark results at large scale.
For installations that go with the default policy, there is no impact at all.
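To make concrete what such pluggability could look like, here is a minimal Java sketch. The interface and class names are hypothetical illustrations, not the API proposed in the attached patch: an experimental policy could be swapped in without touching the NameNode's baked-in logic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical pluggable placement interface (illustrative only;
 *  not the actual HDFS API from the attached patch). */
interface BlockPlacementPolicy {
    /** Choose datanodes to hold the replicas of a new block. */
    List<String> chooseTargets(String path, int replicas, List<String> liveNodes);
}

/** Stand-in for the current default behavior: pick the first N live
 *  nodes (the real code does rack-aware selection in the NameNode). */
class DefaultPolicy implements BlockPlacementPolicy {
    public List<String> chooseTargets(String path, int replicas, List<String> liveNodes) {
        return new ArrayList<>(liveNodes.subList(0, Math.min(replicas, liveNodes.size())));
    }
}

public class PlacementDemo {
    public static void main(String[] args) {
        // Swap in an experimental policy here without changing NameNode code.
        BlockPlacementPolicy policy = new DefaultPolicy();
        List<String> targets = policy.chooseTargets(
                "/user/data/file1", 3, Arrays.asList("dn1", "dn2", "dn3", "dn4"));
        System.out.println(targets); // prints [dn1, dn2, dn3]
    }
}
```

Installations that keep the default policy would simply get the default implementation; only experimenters would plug in something else.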
> Design a pluggable interface to place replicas of blocks in HDFS
> ----------------------------------------------------------------
>
> Key: HADOOP-3799
> URL: https://issues.apache.org/jira/browse/HADOOP-3799
> Project: Hadoop Core
> Issue Type: Improvement
> Components: dfs
> Reporter: dhruba borthakur
> Assignee: dhruba borthakur
> Attachments: BlockPlacementPluggable.txt
>
>
> The current HDFS code typically places one replica on local rack, the second
> replica on remote random rack and the third replica on a random node of that
> remote rack. This algorithm is baked in the NameNode's code. It would be nice
> to make the block placement algorithm a pluggable interface. This will allow
> experimentation with different placement algorithms based on workloads,
> availability guarantees and failure models.