[ https://issues.apache.org/jira/browse/HADOOP-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510818 ]
Doug Cutting commented on HADOOP-1563:
--------------------------------------
I think we should implement a servlet that (roughly sketched after this list):
1. Treats everything after HttpServletRequest#getContextPath() as a path.
2. If the path names an HDFS file, sets its attributes as HTTP headers and, if
the request is HEAD, returns an empty page; if GET, returns the content;
otherwise returns an error.
3. If it's a HEAD or GET of a non-slash-terminated directory name, redirects to
the slash-terminated directory.
4. If it's a HEAD or GET of a slash-terminated directory name, sets attributes
and, if GET, returns HTML containing links to that directory's files.
5. Otherwise returns an error.
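Roughly, something like the following. This is only a minimal sketch to make
the steps concrete: the class name is made up, the servlet is assumed to be
mapped at the context root ("/*"), and the getFileStatus/listStatus calls
assume the current FileSystem API rather than anything in the attached patch.

    import java.io.FileNotFoundException;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.PrintWriter;
    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;

    // Illustrative only; class name and header choices are assumptions.
    public class HdfsHttpServlet extends HttpServlet {
      private FileSystem fs;

      public void init() throws ServletException {
        try {
          fs = FileSystem.get(new Configuration());  // the default HDFS
        } catch (IOException e) {
          throw new ServletException(e);
        }
      }

      protected void service(HttpServletRequest req, HttpServletResponse res)
          throws ServletException, IOException {
        boolean get = "GET".equals(req.getMethod());
        if (!get && !"HEAD".equals(req.getMethod())) {
          res.sendError(HttpServletResponse.SC_METHOD_NOT_ALLOWED); // (2)/(5)
          return;
        }
        // (1) everything after the context path names the HDFS path;
        // assumes the servlet itself is mapped at "/*".
        String path =
            req.getRequestURI().substring(req.getContextPath().length());
        FileStatus status;
        try {
          status = fs.getFileStatus(new Path(path));
        } catch (FileNotFoundException e) {
          res.sendError(HttpServletResponse.SC_NOT_FOUND);          // (5)
          return;
        }
        if (!status.isDir()) {                                      // (2) file
          // int cast truncates for >2GB files; a real servlet would
          // set the Content-Length header directly.
          res.setContentLength((int) status.getLen());
          res.setDateHeader("Last-Modified", status.getModificationTime());
          if (get) {                        // HEAD stops after the headers
            InputStream in = fs.open(status.getPath());
            try {
              IOUtils.copyBytes(in, res.getOutputStream(), 4096, false);
            } finally {
              in.close();
            }
          }
        } else if (!path.endsWith("/")) {                           // (3)
          res.sendRedirect(req.getRequestURI() + "/");
        } else {                                                    // (4)
          res.setContentType("text/html");
          if (get) {
            PrintWriter out = res.getWriter();
            out.println("<html><body>");
            for (FileStatus child : fs.listStatus(status.getPath())) {
              String name =
                  child.getPath().getName() + (child.isDir() ? "/" : "");
              out.println("<a href=\"" + name + "\">" + name + "</a><br>");
            }
            out.println("</body></html>");
          }
        }
      }
    }

Serving listings as plain HTML links keeps the wire format trivially
browsable, and a client can parse the same page it would show a person.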
Then we should try to use this as a source for MapReduce and distcp and see how
it fares. The HTTP client may need to be replaced, file status may need to be
cached (see the HEAD probe sketched below), etc. But this simple approach will
get us up and running, and avoids investing too much time in designing a
schema, parsing XML, etc., when that may not be required.
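To make the caching point concrete: each file-status lookup against the servlet
above is a full HTTP round trip. A hypothetical probe (HttpStatusProbe and its
header-to-status mapping are my own illustration, not part of the patch):

    import java.io.IOException;
    import java.net.HttpURLConnection;
    import java.net.URL;

    // Hypothetical: derive file status from a HEAD of the servlet above.
    public class HttpStatusProbe {
      /** Returns {length, mtime} for path; one HTTP round trip per call. */
      public static long[] headStatus(String base, String path)
          throws IOException {
        HttpURLConnection conn =
            (HttpURLConnection) new URL(base + path).openConnection();
        conn.setRequestMethod("HEAD");               // headers only, no body
        if (conn.getResponseCode() != HttpURLConnection.HTTP_OK) {
          throw new IOException("HEAD " + path + ": " + conn.getResponseCode());
        }
        long[] status = { conn.getContentLength(),   // Content-Length header
                          conn.getLastModified() };  // Last-Modified header
        conn.disconnect();
        return status;
      }
    }

Doing that once per input file during job setup is exactly the overhead that a
status cache, or a smarter HTTP client, would amortize.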
Thoughts?
> Create FileSystem implementation to read HDFS data via http
> -----------------------------------------------------------
>
> Key: HADOOP-1563
> URL: https://issues.apache.org/jira/browse/HADOOP-1563
> Project: Hadoop
> Issue Type: New Feature
> Components: fs
> Affects Versions: 0.14.0
> Reporter: Owen O'Malley
> Assignee: Chris Douglas
> Attachments: httpfs.patch
>
>
> There should be a FileSystem implementation that can read from a Namenode's
> HTTP interface. This would provide a couple of useful abilities:
> 1. Copying with distcp between different versions of HDFS (see the example below).
> 2. Reading map/reduce inputs from a different version of HDFS.
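> A hypothetical invocation, assuming a FileSystem gets registered under an
> http: scheme as this issue proposes (host names and ports are illustrative):
>
>     bin/hadoop distcp http://old-nn:50070/data/logs hdfs://new-nn:8020/data/logs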