Christopher Tubbs created HADOOP-19816:
------------------------------------------
Summary: Hadoop should ship with a URLStreamHandlerProvider to
handle hdfs: URLs
Key: HADOOP-19816
URL: https://issues.apache.org/jira/browse/HADOOP-19816
Project: Hadoop Common
Issue Type: Wish
Reporter: Christopher Tubbs

Many of Hadoop's APIs seem to be centered around the use of a URI. However,
`hdfs:` URIs are actually locations (URLs) and can be treated as such.
Unfortunately, Hadoop doesn't ship with a URLStreamHandlerProvider, so
applications (including the Hadoop client API itself) cannot use these URLs
directly: `hdfs:` is not a built-in URL scheme in Java, and no provider in the
system is capable of interpreting these URIs as URLs.
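
To illustrate the gap, here is a minimal sketch using only the JDK. The class
name `HdfsUrlCheck` and the host name are made up for the example. It parses
an `hdfs:` string as a URI, then checks whether the running JVM can turn it
into a URL.

```java
import java.net.MalformedURLException;
import java.net.URI;

// Hypothetical helper class, for illustration only.
public class HdfsUrlCheck {

    // Returns true if the running JVM has a URL handler for the scheme of
    // the given absolute URI string, false if the scheme is unknown to it.
    public static boolean canCreateUrl(String spec) {
        try {
            URI.create(spec).toURL();
            return true;
        } catch (MalformedURLException e) {
            // Thrown when no handler is available for the URI's scheme.
            return false;
        }
    }

    public static void main(String[] args) {
        // Made-up NameNode address and path.
        String spec = "hdfs://namenode.example.com:8020/user/alice/data.txt";
        // Parsing as a URI needs no handler, so this part always works.
        System.out.println("URI scheme: " + URI.create(spec).getScheme());
        System.out.println("usable as URL: " + canCreateUrl(spec));
    }
}
```

On a JVM with no `hdfs:` handler installed, the check should report false for
the `hdfs:` string and true for a `file:` string such as `file:///tmp/x`.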

The Apache Accumulo project wrote a simple implementation to handle URLs with
the `hdfs:` scheme. It does not handle other Hadoop file system schemes, such
as `viewfs:`, which may be nice to support. The `file:` URLs that Hadoop also
supports are probably best left to Java's built-in `file:` handler, rather
than Hadoop's LocalFileSystem or RawLocalFileSystem implementations.
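
For readers unfamiliar with the mechanism, the following is a rough sketch of
what such a provider can look like. It is not the Accumulo implementation. The
names `SketchHdfsUrlProvider`, `StreamOpener`, and `fetch` are hypothetical,
and the actual HDFS read is replaced by a pluggable hook so that the example
runs with only the JDK (Java 9 or later). A real provider would open the file
through Hadoop's FileSystem API at that point.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.net.spi.URLStreamHandlerProvider;
import java.nio.charset.StandardCharsets;

// Hypothetical provider; all names here are illustrative assumptions.
public class SketchHdfsUrlProvider extends URLStreamHandlerProvider {

    // Hypothetical hook standing in for the Hadoop client call.
    public interface StreamOpener {
        InputStream open(URL url) throws IOException;
    }

    private final StreamOpener opener;

    // ServiceLoader requires a public no-argument constructor.
    public SketchHdfsUrlProvider() {
        this(url -> { throw new IOException("no HDFS client wired in: " + url); });
    }

    public SketchHdfsUrlProvider(StreamOpener opener) {
        this.opener = opener;
    }

    @Override
    public URLStreamHandler createURLStreamHandler(String protocol) {
        // Returning null leaves every other scheme to other providers.
        if (!"hdfs".equals(protocol)) {
            return null;
        }
        return new URLStreamHandler() {
            @Override
            protected URLConnection openConnection(URL url) {
                return new URLConnection(url) {
                    @Override
                    public void connect() {
                        connected = true;
                    }

                    @Override
                    public InputStream getInputStream() throws IOException {
                        return opener.open(getURL());
                    }
                };
            }
        };
    }

    // Reads the whole stream behind an hdfs: URL using the given provider.
    @SuppressWarnings("deprecation") // this URL constructor is deprecated in newer JDKs
    public static String fetch(SketchHdfsUrlProvider provider, String spec) {
        try {
            // Passing the handler explicitly avoids needing service registration.
            URL url = new URL(null, spec, provider.createURLStreamHandler("hdfs"));
            try (InputStream in = url.openStream()) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Fake opener that echoes the URL path instead of contacting HDFS.
        SketchHdfsUrlProvider provider = new SketchHdfsUrlProvider(
            url -> new ByteArrayInputStream(url.getPath().getBytes(StandardCharsets.UTF_8)));
        System.out.println(fetch(provider, "hdfs://namenode.example.com:8020/tmp/hello.txt"));
    }
}
```

To have the JVM pick up a provider like this automatically, its class name
would be listed in a `META-INF/services/java.net.spi.URLStreamHandlerProvider`
file on the class path. The sketch skips that step and passes the handler to
the URL constructor directly.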

I think this would be better homed inside the Hadoop project than left as an
exercise for the user. For reference, our implementation is located at:
[https://github.com/apache/accumulo-classloaders/blob/21626c06236d6a96ed0c53ef7fc9c38c3d449a9a/modules/hdfs-urlstreamhandler-provider/README.md]