According to this page: http://hortonworks.com/blog/webhdfs-%E2%80%93-http-rest-access-to-hdfs/
> *Data Locality*: The file read and file write calls are redirected to the
> corresponding datanodes. It uses the full bandwidth of the Hadoop cluster
> for streaming data.
>
> *A HDFS Built-in Component*: WebHDFS is a first class built-in component
> of HDFS. It runs inside Namenodes and Datanodes, therefore, it can use all
> HDFS functionalities. It is a part of HDFS - there are no additional
> servers to install

So it looks like data locality is built into WebHDFS: the client is
redirected to the appropriate data node automatically, so requests do not
all have to be served by a single server.

On Mon, Mar 17, 2014 at 6:07 AM, RJ Nowling <rnowl...@gmail.com> wrote:

> Hi all,
>
> I'm writing up a Google Summer of Code proposal to add HDFS support to
> Disco, an Erlang MapReduce framework.
>
> We're interested in using WebHDFS. I have two questions:
>
> 1) Does WebHDFS allow querying data locality information?
>
> 2) If the data locality information is known, can data on specific data
> nodes be accessed via Web HDFS? Or do all Web HDFS requests have to go
> through a single server?
>
> Thanks,
> RJ
>
> --
> em rnowl...@gmail.com
> c 954.496.2314
>

--
Cheers
-MJ
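
P.S. To illustrate the redirect described above, here is a minimal Python sketch of the two-step read. It assumes the documented WebHDFS OPEN operation, where the NameNode answers with a 307 redirect whose Location header points at a DataNode. The host name, port, and file path are hypothetical placeholders, and the helper names are my own, not part of any library. I have not run this against a real cluster.

```python
import http.client
from typing import Optional
from urllib.parse import quote, urlencode, urlsplit


def build_open_path(hdfs_path: str, offset: int = 0,
                    length: Optional[int] = None) -> str:
    """Build the request path for a WebHDFS OPEN call (sent to the NameNode)."""
    params = {"op": "OPEN", "offset": offset}
    if length is not None:
        params["length"] = length
    return f"/webhdfs/v1{quote(hdfs_path)}?{urlencode(params)}"


def datanode_from_redirect(location: str) -> str:
    """Extract host:port of the DataNode from a redirect Location header."""
    return urlsplit(location).netloc


def locate_datanode(namenode_host: str, namenode_port: int,
                    hdfs_path: str, offset: int = 0) -> Optional[str]:
    """Ask the NameNode to open a file, but stop at the redirect.

    http.client does not follow redirects, so we can inspect the
    Location header and see which DataNode the NameNode picked.
    Returns None if the response is not a 307 redirect.
    """
    conn = http.client.HTTPConnection(namenode_host, namenode_port)
    try:
        conn.request("GET", build_open_path(hdfs_path, offset=offset))
        resp = conn.getresponse()
        location = resp.getheader("Location")
        if resp.status == 307 and location:
            return datanode_from_redirect(location)
        return None
    finally:
        conn.close()


if __name__ == "__main__":
    # Hypothetical cluster address and file; adjust for a real deployment.
    print(locate_datanode("namenode.example.com", 50070, "/user/rj/data.txt"))
```

Passing a different offset should let the NameNode redirect to a node holding the block at that position, which may be enough for a scheduler that wants to read each block from a node that stores it.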