1. All the specifics of Hadoop's operations are hidden in the source.
   That's a get-out clause of OSS, I know, but sometimes it's the clearest.
2. For WebHDFS I suspect it picks a local node with the data; you'd have
   to experiment to make sure.
3. If WebHDFS is missing features, I'm sure patches would be welcome.
4. Hadoop 2.2+ uses protobuf for stable, cross-platform IPC. The
   listLocatedStatus() call on a filesystem will give you all the locations of
   blocks. Using that is another option, and probably higher performance, but
   it is going to require more upfront engineering than GET calls. Sticking to
   WebHDFS, and extending it, is probably simpler.
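
The experiment in point 2 could be sketched roughly as below: issue a
WebHDFS OPEN with an offset against the namenode, do not follow the
redirect, and look at which host the Location header names. This is only a
sketch under assumptions. The function names, the example host names and
the file path are made up for illustration; the namenode HTTP port (50070)
and the 128 MB block size in the usage comment are common Hadoop 2 defaults
and may well differ on your cluster.

```python
import http.client
from urllib.parse import quote, urlencode, urlsplit


def build_open_path(hdfs_path, offset=0, user=None):
    """Return the WebHDFS request path for an OPEN at the given byte offset."""
    params = {"op": "OPEN", "offset": offset}
    if user is not None:
        # user.name is the WebHDFS pseudo-authentication parameter.
        params["user.name"] = user
    return "/webhdfs/v1" + quote(hdfs_path) + "?" + urlencode(params)


def host_from_location(location):
    """Extract the host name from a redirect Location header value."""
    return urlsplit(location).hostname


def probe_redirect(namenode_host, namenode_port, hdfs_path, offset, user=None):
    """Ask the namenode to OPEN at `offset`; return the host it redirects to.

    http.client never follows redirects on its own, so no file data is
    transferred: only the namenode's 307 response is inspected.
    """
    conn = http.client.HTTPConnection(namenode_host, namenode_port, timeout=10)
    try:
        conn.request("GET", build_open_path(hdfs_path, offset, user))
        resp = conn.getresponse()
        resp.read()  # drain the (normally empty) redirect body
        if resp.status != 307:
            raise RuntimeError("expected a 307 redirect, got %d" % resp.status)
        return host_from_location(resp.getheader("Location"))
    finally:
        conn.close()


# Hypothetical usage, assuming a 128 MB block size and a multi-block file:
#   block = 128 * 1024 * 1024
#   for i in range(3):
#       print(i, probe_redirect("namenode.example.com", 50070,
#                               "/data/input.txt", i * block))
```

If the reported host changes as the offset moves from block to block, the
redirect is following block locality; if it stays on the first block's
datanode, it is not.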



On 17 March 2014 17:29, RJ Nowling <rnowl...@gmail.com> wrote:

> Hi all,
>
> I sent an email to user@ but no one there was able to answer my question.
>  I hope you don't mind me emailing hdfs-dev@ about it.
>
> I'm submitting a proposal to Google Summer of Code to add support for HDFS
> to Disco, an Erlang MapReduce system.  We're looking at using WebHDFS.  As
> with Hadoop, we need information about the locality of the file blocks so
> that we can schedule tasks accordingly.
>
> WebHDFS does seem to provide some information about data locality.  When
> you make a request for a file to the namenode, you are redirected to the
> datanode containing the first block of that file.
>
> 1) But what happens if you specify an offset in the third block?  Are you
> redirected to the datanode containing that block or are you still
> redirected to the datanode containing the file's first block?
>
> 2) Is there any reason that WebHDFS does not support requesting the block
> locations?
>
> 3) Would the HDFS community be interested in a patch that adds support for
> a) reporting block locations and b) enables requesting blocks from the
> appropriate data nodes (if it is not already there)?  I believe this would
> be of interest to other projects that are using WebHDFS.
>
> Thank you!
>
> RJ
>
> --
> em rnowl...@gmail.com
> c 954.496.2314
>

