Yes, you can use the HadoopFileSystem together with the Hadoop S3A connector.

Documentation about options/configuration for S3 Hadoop connectors:
https://wiki.apache.org/hadoop/AmazonS3

Build a valid Hadoop configuration for S3 and set it on the
HadoopFileSystemOptions:
Configuration s3Configuration = // load from file or create programmatically
PipelineOptions options = ...
options.as(HadoopFileSystemOptions.class)
    .setHdfsConfiguration(Arrays.asList(s3Configuration));
TextIO.read().from("s3a://my/path");
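For the "create programmatically" part, a minimal sketch could look like the
following. The fs.s3a.* property names are standard Hadoop S3A settings, but
the credential values, the class name, and the surrounding Pipeline
boilerplate are placeholders I'm adding for illustration, not something from
the original reply:

import java.util.Arrays;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.io.hdfs.HadoopFileSystemOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.hadoop.conf.Configuration;

public class S3ReadExample {
  public static void main(String[] args) {
    // Hadoop configuration carrying the S3A settings.
    // The credential values below are placeholders.
    Configuration s3Configuration = new Configuration();
    s3Configuration.set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem");
    s3Configuration.set("fs.s3a.access.key", "YOUR_ACCESS_KEY");
    s3Configuration.set("fs.s3a.secret.key", "YOUR_SECRET_KEY");

    // Hand the configuration to Beam's HadoopFileSystem via
    // HadoopFileSystemOptions, then read using the s3a:// scheme.
    HadoopFileSystemOptions options =
        PipelineOptionsFactory.fromArgs(args).as(HadoopFileSystemOptions.class);
    options.setHdfsConfiguration(Arrays.asList(s3Configuration));

    Pipeline p = Pipeline.create(options);
    p.apply(TextIO.read().from("s3a://my/path"));
    p.run();
  }
}

You also need the hadoop-aws module (which provides the S3A connector) on the
classpath alongside the Beam HDFS file system module.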

See HadoopFileSystemOptions for more details on how it can automatically
detect a Hadoop installation and configure itself, and on how it can be
configured from the command line:
https://github.com/apache/beam/blob/master/sdks/java/io/hadoop-file-system/src/main/java/org/apache/beam/sdk/io/hdfs/HadoopFileSystemOptions.java
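As a rough illustration of the command-line route (the option name follows
from the getHdfsConfiguration getter and the JSON-list-of-maps shape is my
assumption about how the option is parsed, so double-check against the class
above), something like:

  --hdfsConfiguration='[{"fs.s3a.access.key": "YOUR_ACCESS_KEY", "fs.s3a.secret.key": "YOUR_SECRET_KEY"}]'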

On Tue, Jun 13, 2017 at 8:53 AM, André Pinto <[email protected]>
wrote:

> Hi,
>
> I have seen in the documentation that you have support for:
>
> TextIO.read().from("gs://some/inputData.txt")
>
> Is a similar S3 support planned too?
>
> Thanks,
> André
>
