[ https://issues.apache.org/jira/browse/HADOOP-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510818 ]

Doug Cutting commented on HADOOP-1563:
--------------------------------------

I think we should implement a servlet (sketched below) that:
1. Treats everything after HttpServletRequest#getContextPath() as a path.
2. If that path names an HDFS file, sets its attributes as HTTP headers and, 
if the request is a HEAD, returns an empty body; if a GET, returns the file's 
content; otherwise returns an error.
3. If it's a HEAD or GET of a non-slash-terminated directory name, redirects 
to the slash-terminated directory.
4. If it's a HEAD or GET of a slash-terminated directory name, sets attributes 
and, if a GET, returns HTML containing links to that directory's files.
5. Otherwise returns an error.
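
A minimal sketch of such a servlet, assuming a servlet container wired up with 
access to the filesystem. The class name is made up, and the FileStatus / 
listStatus method names follow the current FileSystem API for illustration, 
not a finalized design:

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PrintWriter;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsHttpServlet extends HttpServlet {

  protected void doHead(HttpServletRequest req, HttpServletResponse rsp)
      throws IOException {
    handle(req, rsp, false);                       // headers only, empty body
  }

  protected void doGet(HttpServletRequest req, HttpServletResponse rsp)
      throws IOException {
    handle(req, rsp, true);
  }

  // Methods other than GET/HEAD fall through to HttpServlet's default
  // service(), which returns an error (rule 5).

  private void handle(HttpServletRequest req, HttpServletResponse rsp,
      boolean sendBody) throws IOException {
    // Rule 1: everything after the context path names an HDFS path.
    String p = req.getRequestURI().substring(req.getContextPath().length());
    if (p.length() == 0) { p = "/"; }
    // Assumes the default filesystem in the configuration is HDFS.
    FileSystem fs = FileSystem.get(new Configuration());
    Path path = new Path(p);
    if (!fs.exists(path)) {
      rsp.sendError(HttpServletResponse.SC_NOT_FOUND, p);
      return;
    }
    FileStatus stat = fs.getFileStatus(path);
    if (!stat.isDir()) {
      // Rule 2: a file; expose attributes as headers, stream content on GET.
      rsp.setContentType("application/octet-stream");
      rsp.setHeader("Content-Length", Long.toString(stat.getLen()));
      rsp.setDateHeader("Last-Modified", stat.getModificationTime());
      if (sendBody) {
        InputStream in = fs.open(path);
        OutputStream out = rsp.getOutputStream();
        byte[] buf = new byte[4096];
        for (int n; (n = in.read(buf)) != -1; ) {
          out.write(buf, 0, n);
        }
        in.close();
      }
    } else if (!p.endsWith("/")) {
      // Rule 3: redirect a non-slash-terminated directory name to dir/.
      rsp.sendRedirect(req.getRequestURI() + "/");
    } else {
      // Rule 4: a slash-terminated directory; list it as HTML links.
      rsp.setContentType("text/html");
      if (sendBody) {
        PrintWriter out = rsp.getWriter();
        out.println("<html><body><ul>");
        for (FileStatus s : fs.listStatus(path)) {
          String name = s.getPath().getName() + (s.isDir() ? "/" : "");
          out.println("<li><a href=\"" + name + "\">" + name + "</a></li>");
        }
        out.println("</ul></body></html>");
      }
    }
  }
}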

Then we should try to use this as a source for MapReduce and distcp and see how 
it fares.  The HTTP client may need to be replaced, file statuses may need to 
be cached, etc.  But this simple approach will get us up and running, and avoid 
investing too much time in designing a schema, parsing XML, etc., when that may 
not be required.
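
For example (illustrative only: the scheme name and ports are assumptions, not 
part of this proposal), if the new filesystem registered under an hftp:// 
scheme backed by this servlet, a cross-version copy might look like:

  hadoop distcp hftp://old-namenode:50070/user/foo hdfs://new-namenode:9000/user/foo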

Thoughts?

> Create FileSystem implementation to read HDFS data via http
> -----------------------------------------------------------
>
>                 Key: HADOOP-1563
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1563
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: fs
>    Affects Versions: 0.14.0
>            Reporter: Owen O'Malley
>            Assignee: Chris Douglas
>         Attachments: httpfs.patch
>
>
> There should be a FileSystem implementation that can read from a Namenode's 
> http interface. This would have a couple of useful capabilities:
>   1. Copy using distcp between different versions of HDFS.
>   2. Read map/reduce inputs from a different version of HDFS. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
