Derek Young wrote:
Reading http://issues.apache.org/jira/browse/HADOOP-341 it sounds like this should be supported, but the http URLs are not working for me. Are http source URLs still supported?

No. They used to be supported, but when distcp was converted to accept any Path this stopped working, since there is no FileSystem implementation mapped to http: paths. Implementing an HttpFileSystem that supports read-only access to files and no directory listings is fairly trivial, but without directory listings, distcp would not work well.
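
To illustrate how little the read-only case involves, here is a rough sketch in Python rather than against Hadoop's actual FileSystem API. The class name ReadOnlyHttpFS, its method names, and the injectable opener parameter are all hypothetical, chosen only for illustration.

```python
import urllib.request
from urllib.parse import urlsplit


class ReadOnlyHttpFS:
    """Hypothetical read-only view of http: URLs (not Hadoop's FileSystem API)."""

    def __init__(self, opener=urllib.request.urlopen):
        # The opener is injectable so the class can be exercised without a network.
        self._opener = opener

    def open(self, url):
        """Return a readable binary stream for the file at url."""
        if urlsplit(url).scheme not in ("http", "https"):
            raise ValueError("not an http(s) URL: " + url)
        return self._opener(url)

    def list_status(self, url):
        # Plain HTTP has no standard directory-listing operation.
        raise NotImplementedError("directory listings are not supported")

    def create(self, url):
        # Writes are out of scope for a read-only filesystem.
        raise PermissionError("read-only filesystem")
```

A recursive copy tool has nothing to walk here, since list_status always fails; only copies of individually named files could work.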

https://issues.apache.org/jira/browse/HADOOP-1563 includes a now long-stale patch that implements an HTTP filesystem. It supports directory listings by assuming that:
  - directories are represented by slash-terminated URLs;
  - a GET of a directory returns a page containing the URLs of its children.
These assumptions hold for the directory listings returned by many HTTP servers.
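
The two assumptions can be sketched roughly as follows. This is not the code from the HADOOP-1563 patch; it is a Python illustration, and the names list_children and _HrefCollector are hypothetical. It takes the HTML of an already-fetched listing page, treats every link that sits exactly one level below the directory URL as a child, and marks slash-terminated children as directories.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _HrefCollector(HTMLParser):
    """Collects the href attribute of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def list_children(dir_url, html):
    """Return (url, is_dir) pairs for the direct children of dir_url."""
    if not dir_url.endswith("/"):
        raise ValueError("directory URLs must end with a slash")
    collector = _HrefCollector()
    collector.feed(html)
    children = []
    for href in collector.hrefs:
        # Resolve relative links, then drop any fragment or query string.
        url = urljoin(dir_url, href).split("#")[0].split("?")[0]
        if not url.startswith(dir_url):
            continue  # parent directory or unrelated link
        name = url[len(dir_url):].rstrip("/")
        if not name or "/" in name:
            continue  # the directory itself, or something deeper than one level
        entry = (url, url.endswith("/"))
        if entry not in children:
            children.append(entry)
    return children
```

Sort links and "Parent Directory" entries, which many server-generated index pages include, are filtered out by the prefix and depth checks. A server whose listings link to children by some other scheme would need different handling.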

Perhaps someone can update this patch, and, if folks find it useful, we can include it.

Doug
