[
https://issues.apache.org/jira/browse/BEAM-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16239745#comment-16239745
]
ASF GitHub Bot commented on BEAM-2500:
--------------------------------------
GitHub user jacobmarble opened a pull request:
https://github.com/apache/beam/pull/4080
[BEAM-2500] Add S3 FileSystem to Java SDK
This is my first contribution; I will submit the ICLA later today.
Ran `mvn clean verify` against sdks/java/io only, because I'm currently in
a place with limited power and bandwidth.
There is some technical discussion in the JIRA.
https://issues.apache.org/jira/browse/BEAM-2500
Please note that there is one persistent thread pool, and that each copy
operation can create its own new thread pool. Happy to hear feedback and
discuss.
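To make the thread-pool arrangement described above concrete, here is a minimal sketch in plain Java. It is not the code from the pull request: the class name `ThreadPoolSketch`, the methods `runShared` and `copyInParts`, the pool sizes, and the stand-in `copyPart` are all hypothetical and exist only for illustration. It assumes a copy is split into independent parts that can run in parallel.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical illustration only; not taken from the pull request. */
class ThreadPoolSketch {
    // One long-lived pool, shared by ordinary operations (size 4 is an assumption).
    private final ExecutorService sharedPool = Executors.newFixedThreadPool(4);

    /** Runs a task on the shared, persistent pool and waits for its result. */
    <T> T runShared(Callable<T> task) throws Exception {
        return sharedPool.submit(task).get();
    }

    /** Copies partCount parts on a pool created for this copy and shut down afterwards. */
    int copyInParts(int partCount) throws Exception {
        // Pool size is capped at 8 (an assumption) and must be at least 1.
        int threads = Math.max(1, Math.min(partCount, 8));
        ExecutorService copyPool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (int i = 0; i < partCount; i++) {
                final int part = i;
                Callable<Integer> task = () -> copyPart(part);
                pending.add(copyPool.submit(task));
            }
            int copied = 0;
            for (Future<Integer> f : pending) {
                copied += f.get(); // wait for each part to finish
            }
            return copied;
        } finally {
            copyPool.shutdown(); // the per-copy pool does not outlive the copy
        }
    }

    /** Stand-in for copying one part; a real implementation would call the storage service. */
    private int copyPart(int part) {
        return 1;
    }

    /** Releases the persistent pool. */
    void close() {
        sharedPool.shutdown();
    }
}
```

The trade-off in this arrangement is that a per-copy pool keeps a large copy from starving other operations on the shared pool, at the cost of creating and tearing down threads for every copy.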
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/Kochava/beam s3
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/beam/pull/4080.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #4080
----
commit b7964d1420f9f80da254960f332c077b65642667
Author: Jacob Marble <[email protected]>
Date: 2017-10-01T23:31:25Z
Add S3FileSystem to SDKs/Java/IO
This module was originally modeled after GcsFileSystem and related, but
has diverged quite a bit.
commit fc4cd7564de68b40d92c04a02657428fc62aee2e
Author: Jacob Marble <[email protected]>
Date: 2017-11-05T22:12:01Z
bump Beam version to 2.3.0-SNAPSHOT
----
> Add support for S3 as an Apache Beam FileSystem
> -----------------------------------------------
>
> Key: BEAM-2500
> URL: https://issues.apache.org/jira/browse/BEAM-2500
> Project: Beam
> Issue Type: Improvement
> Components: sdk-java-extensions
> Reporter: Luke Cwik
> Assignee: Jacob Marble
> Priority: Minor
> Attachments: hadoop_fs_patch.patch
>
>
> Note that this is for providing direct integration with S3 as an Apache Beam
> FileSystem.
> There is already support for using the Hadoop S3 connector by depending on
> the Hadoop File System module[1] and configuring HadoopFileSystemOptions[2]
> with an S3 configuration[3].
> 1: https://github.com/apache/beam/tree/master/sdks/java/io/hadoop-file-system
> 2: https://github.com/apache/beam/blob/master/sdks/java/io/hadoop-file-system/src/main/java/org/apache/beam/sdk/io/hdfs/HadoopFileSystemOptions.java#L53
> 3: https://wiki.apache.org/hadoop/AmazonS3