[
https://issues.apache.org/jira/browse/HADOOP-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12510998
]
Tom White commented on HADOOP-1563:
-----------------------------------
bq. Then we should try to use this as a source for MapReduce and distcp and see
how it fares. The HTTP client may need to be replaced, file status may need to
be cached, etc. But this simple approach will get us up and going, and avoid
investing too much time designing a schema, parsing XML, etc. when that may not
be required.
+1
A couple of points regarding the patch:
In HttpFileSystem#initialize the name field is assigned to itself, so it always
remains null.
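To illustrate the kind of self-assignment bug described above, here is a minimal sketch (class, field, and parameter names are illustrative, not taken from the actual patch):

```java
public class HttpFileSystemSketch {
  private String name;  // stays null because of the bug below

  public void initialize(String name) {
    name = name;  // bug: the parameter is assigned to itself; the field is never set
    // intended: this.name = name;
  }

  public String getName() {
    return name;
  }

  public static void main(String[] args) {
    HttpFileSystemSketch fs = new HttpFileSystemSketch();
    fs.initialize("http://namenode:50070");
    System.out.println(fs.getName());  // prints "null"
  }
}
```

Most compilers and static analyzers flag self-assignment as a warning, which is one reason a unit test exercising initialize would catch this quickly.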
Removing getDefaultBlockSize() from S3FileSystem means the property
"fs.s3.block.size" is no longer honored (though it's still in
hadoop-default.xml). This looks like a change made earlier as part of the
checksumming work, so it's probably fine in the context of this patch.
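For reference, a minimal sketch of how a getDefaultBlockSize() override typically consults a configuration key like "fs.s3.block.size" (the Configuration stand-in and the 64 MB default are assumptions for illustration, not the actual Hadoop code):

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for Hadoop's Configuration class.
class ConfSketch {
  private final Map<String, String> props = new HashMap<>();

  void set(String key, String value) {
    props.put(key, value);
  }

  long getLong(String key, long defaultValue) {
    String v = props.get(key);
    return v == null ? defaultValue : Long.parseLong(v);
  }
}

class S3FileSystemSketch {
  private final ConfSketch conf;

  S3FileSystemSketch(ConfSketch conf) {
    this.conf = conf;
  }

  // Without this override, the "fs.s3.block.size" key would never be
  // consulted, even though it still appears in hadoop-default.xml.
  long getDefaultBlockSize() {
    return conf.getLong("fs.s3.block.size", 64L * 1024 * 1024);
  }
}
```

With the override removed, a value set in hadoop-default.xml silently has no effect, which is the mismatch noted above.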
Finally, some unit tests would be good. Otherwise, it looks good.
> Create FileSystem implementation to read HDFS data via http
> -----------------------------------------------------------
>
> Key: HADOOP-1563
> URL: https://issues.apache.org/jira/browse/HADOOP-1563
> Project: Hadoop
> Issue Type: New Feature
> Components: fs
> Affects Versions: 0.14.0
> Reporter: Owen O'Malley
> Assignee: Chris Douglas
> Attachments: httpfs.patch
>
>
> There should be a FileSystem implementation that can read from a Namenode's
> http interface. This would have a couple of useful abilities:
> 1. Copy using distcp between different versions of HDFS.
> 2. Use map/reduce inputs from a different version of HDFS.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.