[ https://issues.apache.org/jira/browse/HADOOP-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663475#action_12663475 ]

Doug Cutting commented on HADOOP-5010:
--------------------------------------

> There is currently no external way to push data to an HDFS or pull from an 
> HDFS using an existing standard; instead anyone wishing to do so must install 
> HDFS clients on computers that do not otherwise run Hadoop software.

Or simply run something like 'ssh foo distcp ...', where foo is a host in the 
cluster.  It would be better to know more about the requirement.
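The pattern suggested above can be sketched as a one-liner. The gateway host name and both URIs below are placeholders, not endpoints from this issue:

```shell
# Sketch of the "ssh + distcp" pattern: log into a host inside the
# cluster and run distcp there, so the outside machine needs only ssh,
# not an HDFS client. "gateway" and both URIs are hypothetical.
CLUSTER_HOST=gateway
SRC=hdfs://namenode:8020/user/data
DST=file:///mnt/export/data
# Echo the command rather than running it, since no cluster exists here.
echo ssh "$CLUSTER_HOST" hadoop distcp "$SRC" "$DST"
```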

> add a pure HTTP support for retrieving files using standard HTTP clients like 
> curl

https://issues.apache.org/jira/browse/HADOOP-1563?focusedCommentId=12510760#action_12510760

In that comment I suggest a convention for encoding directory listings as links 
in the HTML of slash-ending URLs.  I also provided a patch there that 
implements a client for this.  Here the focus seems to be on a servlet that 
implements the server side of this for HDFS.  That seems reasonable.  It would 
also be browsable, which is nice.
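A client of that convention only needs to pull the hrefs out of the HTML returned for a slash-ending URL. A minimal sketch, run against a made-up listing rather than a live server (the entry names are invented):

```shell
# Toy HTML body in the shape the convention describes: a directory URL
# ending in "/" whose entries appear as links. Entry names are made up.
listing='<a href="logs/">logs/</a><a href="part-00000">part-00000</a>'
# Extract each href value, one entry per line.
printf '%s\n' "$listing" | grep -o 'href="[^"]*"' | sed 's/^href="//; s/"$//'
```

Entries whose names end in "/" would themselves be directories, fetchable the same way.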

> we actually have a requirement for it in Yahoo

Can you say more about the requirement?  Are directory listings required?  Is 
other file status information required?  Some file status maps naturally onto 
HTTP (e.g., the Last-Modified header), but some has no natural place (e.g., 
owner, group & permissions).
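To make the header question concrete, here is what a response for a single file might carry. Last-Modified and Content-Length are standard; the X-HDFS-* names below are invented stand-ins for the status fields that have no standard slot:

```shell
# Hypothetical HTTP response headers for one HDFS file. Only
# Content-Length and Last-Modified have standard meanings; the
# X-HDFS-* header names are made up to show where owner, group, and
# permissions would have to go.
cat <<'EOF'
HTTP/1.1 200 OK
Content-Length: 1048576
Last-Modified: Tue, 13 Jan 2009 18:20:00 GMT
X-HDFS-Owner: hadoop
X-HDFS-Group: users
X-HDFS-Permission: 0644
EOF
```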

> Replace HFTP/HSFTP with plain HTTP/HTTPS
> ----------------------------------------
>
>                 Key: HADOOP-5010
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5010
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/hdfsproxy
>    Affects Versions: 0.18.0
>            Reporter: Marco Nicosia
>
> In HADOOP-1563, [~cutting] wrote:
> bq. The URI for this should be something like hftp://host:port/a/b/c, since, 
> while HTTP will be used as the transport, this will not be a FileSystem for 
> arbitrary HTTP urls.
> Recently, we've been talking about implementing an HDFS proxy (HADOOP-4575) 
> which would be a secure way to make HFTP/HSFTP available. In so doing, we may 
> even remove HFTP/HSFTP from being offered on the HDFS itself (that's another 
> discussion).
> In the case of the HDFS proxy, does it make sense to do away with the 
> artificial HFTP/HSFTP protocols and instead simply offer standard HTTP and 
> HTTPS? That would allow non-HDFS-specific clients, as well as the use of 
> standard HTTP infrastructure such as load balancers.
> NB, to the best of my knowledge, HFTP is only documented on the 
> [distcp|http://hadoop.apache.org/core/docs/current/distcp.html] page, and 
> HSFTP is not documented at all?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
