Regarding your question about a pluggable module to control placement of
data, take a look at the abstract class BlockPlacementPolicy and its
default implementation, BlockPlacementPolicyDefault.
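
To give a rough idea of the pattern, here is a minimal, self-contained
sketch of a pluggable policy: an abstract class, a default implementation,
and a factory that instantiates whichever class is named in configuration.
Everything in it is a hypothetical stand-in (PlacementPolicy,
DefaultPlacementPolicy, ReversePlacementPolicy, and the
example.placement.policy.class key are made up for illustration). The real
BlockPlacementPolicy has a different and larger method signature, so treat
this as the shape of the idea, not the Hadoop API. As far as I know, the
NameNode picks its policy class from the dfs.block.replicator.classname
setting, but please check that against the version you are running.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical stand-in for an abstract placement policy; NOT the Hadoop class.
abstract class PlacementPolicy {
    // Pick up to `replicas` target nodes from `nodes` for one block.
    abstract List<String> chooseTargets(List<String> nodes, int replicas);

    // Instantiate the policy class named in the config, or the default if unset.
    // The property key is an assumption made up for this sketch.
    static PlacementPolicy fromConfig(Properties conf) throws Exception {
        String name = conf.getProperty("example.placement.policy.class",
                                       DefaultPlacementPolicy.class.getName());
        return Class.forName(name)
                    .asSubclass(PlacementPolicy.class)
                    .getDeclaredConstructor()
                    .newInstance();
    }
}

// Default behaviour: take the first `replicas` nodes in the order given.
class DefaultPlacementPolicy extends PlacementPolicy {
    @Override
    List<String> chooseTargets(List<String> nodes, int replicas) {
        int n = Math.min(replicas, nodes.size());
        return new ArrayList<>(nodes.subList(0, n));
    }
}

// A custom policy for experiments: walk the node list from the end instead.
class ReversePlacementPolicy extends PlacementPolicy {
    @Override
    List<String> chooseTargets(List<String> nodes, int replicas) {
        List<String> out = new ArrayList<>();
        for (int i = nodes.size() - 1; i >= 0 && out.size() < replicas; i--) {
            out.add(nodes.get(i));
        }
        return out;
    }
}

public class PlacementDemo {
    public static void main(String[] args) throws Exception {
        List<String> nodes = Arrays.asList("dn1", "dn2", "dn3", "dn4");
        Properties conf = new Properties();

        // No key set: the factory falls back to the default policy.
        System.out.println(PlacementPolicy.fromConfig(conf).chooseTargets(nodes, 2));

        // Swap in the custom policy purely through configuration.
        conf.setProperty("example.placement.policy.class",
                         ReversePlacementPolicy.class.getName());
        System.out.println(PlacementPolicy.fromConfig(conf).chooseTargets(nodes, 2));
    }
}
```

The point of the design is that the caller only ever sees the abstract
type, so an experimental policy can be swapped in by changing one
configuration value instead of patching the calling code.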

On branch-1, you can find these classes at
src/hdfs/org/apache/hadoop/hdfs/server/namenode. On trunk, the package
structure is different, and these classes are at
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement.

Best of luck with your research!

--Chris


On Fri, Feb 22, 2013 at 11:17 AM, Harsh J <ha...@cloudera.com> wrote:

> There are no filesystem (i.e. client) level APIs to do this, but the
> Balancer tool of HDFS does exactly this. Reading its sources should
> let you understand what kind of calls you need to make to reuse the
> balancer protocol and achieve what you need.
>
> In trunk, the balancer is at
>
> hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Balancer.java
>
> HTH, and feel free to ask any relevant follow up questions.
>
> On Fri, Feb 22, 2013 at 11:43 PM, Karthiek C <karthi...@gmail.com> wrote:
> > Hi,
> >
> > Are there any APIs to move data blocks in HDFS from one node to another
> > *after* they have been added to HDFS? Also, can we write some sort of
> > pluggable module (like a scheduler) that controls how data gets placed
> > in a hadoop cluster? I am working with hadoop-1.0.3 and I couldn't find
> > any filesystem APIs available to do that.
> >
> > PS: I am working on a research project where we want to investigate
> > how to optimally place data in hadoop.
> >
> > Thanks,
> > Karthiek
>
>
>
> --
> Harsh J
>
