You might be interested in https://issues.apache.org/jira/browse/HDFS-385, where there is discussion about how to add pluggable block placement to HDFS.
Cheers,
Tom

On Tue, Jun 23, 2009 at 5:50 PM, Alex Loddengaard <a...@cloudera.com> wrote:
> Hi Hyunsik,
>
> Unfortunately you can't control the servers that blocks go on. Hadoop does
> block allocation for you, and it tries its best to distribute data evenly
> among the cluster, so long as replicated blocks reside on different
> machines, on different racks (assuming you've made Hadoop rack-aware).
>
> Hope this clears things up.
>
> Alex
>
> 2009/6/23 Hyunsik Choi <c0d3h...@gmail.com>
>
>> Hi all,
>>
>> I would like to control data locality. In other words, I want to place
>> certain data blocks on one machine. In some problems, subsets of an
>> entire dataset need one another to produce an answer. Most graph
>> problems are good examples.
>>
>> Is it possible? If it is not, can you advise on an alternative?
>>
>> Thank you in advance.
>>
>> - Hyunsik Choi -
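
For readers of this thread, the constraint Alex describes (replicas on different machines, spread across racks) can be illustrated with a toy model. This is a sketch only, not HDFS code and not the actual HDFS placement algorithm: the function name `place_replicas`, the `nodes_by_rack` mapping, and the round-robin strategy are all assumptions made for illustration.

```python
import random


def place_replicas(nodes_by_rack, replication=3, rng=None):
    """Toy model: pick `replication` distinct nodes, spreading them over racks.

    `nodes_by_rack` is a hypothetical mapping of rack name -> list of node
    names. This is NOT how HDFS chooses targets; it only demonstrates the
    "different machines, different racks" constraint from the thread.
    """
    rng = rng or random.Random()

    # Shuffled copy of each non-empty rack's node list, so the caller's
    # data is left untouched and the choice within a rack is random.
    pools = {
        rack: rng.sample(nodes, len(nodes))
        for rack, nodes in nodes_by_rack.items()
        if nodes
    }
    if replication > sum(len(pool) for pool in pools.values()):
        raise ValueError("not enough nodes for the requested replication")

    racks = list(pools)
    rng.shuffle(racks)

    chosen = []
    # Visit racks round-robin, taking one unused node per visit, so the
    # replicas cover as many racks as possible before any rack is reused.
    while len(chosen) < replication:
        for rack in racks:
            if len(chosen) == replication:
                break
            if pools[rack]:
                chosen.append((rack, pools[rack].pop()))
    return chosen


if __name__ == "__main__":
    cluster = {"rack1": ["node1", "node2"], "rack2": ["node3", "node4"]}
    for rack, node in place_replicas(cluster, replication=3):
        print(rack, node)
```

Note that the caller has no say in which node receives a block, which is the limitation Hyunsik ran into; a pluggable policy as discussed in HDFS-385 would be the place to change that.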