[
https://issues.apache.org/jira/browse/BEAM-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977537#comment-15977537
]
Stephen Sisk commented on BEAM-2005:
------------------------------------
There are a couple of possible ways we could adapt to this:
* Potentially we could give different connections different URI schemes, but
that falls apart if someone wants to use URIs generated elsewhere
* Start passing in the FileSystem object as an option on the read transform
(like Hadoop does). This also incidentally solves the "how will people know
if hdfs is in the set of modules loaded on your system" problem that was
discussed above: they'll need to create the instance themselves, and they'll
go through their normal discovery mechanism for doing so.
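
To make the second option concrete, here is a rough sketch of what passing the
file system explicitly could look like. Every name in it (SimpleFileSystem,
InMemoryFileSystem, ReadTransform, withFileSystem) is a hypothetical stand-in
invented for illustration; none of it is Beam's or Hadoop's actual API. It only
shows that two connections can use identical paths without needing distinct
schemes.

```java
import java.util.HashMap;
import java.util.Map;

public class ExplicitFileSystemSketch {

    /** Hypothetical file system abstraction; NOT Beam's or Hadoop's real interface. */
    interface SimpleFileSystem {
        String read(String path);
    }

    /** In-memory stand-in for a user-configured connection to one cluster. */
    static final class InMemoryFileSystem implements SimpleFileSystem {
        private final Map<String, String> files = new HashMap<>();

        InMemoryFileSystem put(String path, String contents) {
            files.put(path, contents);
            return this;
        }

        @Override
        public String read(String path) {
            String contents = files.get(path);
            if (contents == null) {
                throw new IllegalArgumentException("No such file: " + path);
            }
            return contents;
        }
    }

    /** Hypothetical read transform that takes the file system as an explicit option. */
    static final class ReadTransform {
        private final String path;
        private final SimpleFileSystem fileSystem;

        private ReadTransform(String path, SimpleFileSystem fileSystem) {
            this.path = path;
            this.fileSystem = fileSystem;
        }

        static ReadTransform from(String path) {
            return new ReadTransform(path, null);
        }

        ReadTransform withFileSystem(SimpleFileSystem fs) {
            return new ReadTransform(path, fs);
        }

        String expand() {
            // No registry lookup by scheme: the caller must have supplied the instance.
            if (fileSystem == null) {
                throw new IllegalStateException("No file system supplied for " + path);
            }
            return fileSystem.read(path);
        }
    }

    public static void main(String[] args) {
        // The user builds each connection through their own discovery mechanism.
        SimpleFileSystem clusterA = new InMemoryFileSystem().put("/logs/part-0", "from A");
        SimpleFileSystem clusterB = new InMemoryFileSystem().put("/logs/part-0", "from B");

        // The same path resolves against whichever instance was passed in.
        System.out.println(ReadTransform.from("/logs/part-0").withFileSystem(clusterA).expand());
        System.out.println(ReadTransform.from("/logs/part-0").withFileSystem(clusterB).expand());
    }
}
```

The tradeoff in this sketch is that the transform cannot resolve a path on its
own: a read with no file system supplied fails instead of falling back to a
scheme-based lookup.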
> Add a Hadoop FileSystem implementation of Beam's FileSystem
> -----------------------------------------------------------
>
> Key: BEAM-2005
> URL: https://issues.apache.org/jira/browse/BEAM-2005
> Project: Beam
> Issue Type: New Feature
> Components: sdk-java-extensions
> Reporter: Stephen Sisk
> Assignee: Stephen Sisk
> Fix For: First stable release
>
>
> Beam's FileSystem creates an abstraction for reading from files in many
> different places.
> We should add a Hadoop FileSystem implementation
> (https://hadoop.apache.org/docs/r2.8.0/api/org/apache/hadoop/fs/FileSystem.html)
> - that would enable us to read from any file system that implements
> Hadoop's FileSystem (including HDFS, Azure, S3, etc.)
> I'm investigating this now.