[
https://issues.apache.org/jira/browse/BEAM-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hideyuki Okada updated BEAM-3958:
---------------------------------
Environment:
Hardware Overview:
Model Name: MacBook Pro
Model Identifier: MacBookPro14,3
Processor Name: Intel Core i7
Processor Speed: 2.8 GHz
maven-compiler-plugin: 3.6.1
- source: 1.8
- target: 1.8
maven-shade-plugin: 3.1.0
exec-maven-plugin: 1.5.0
slf4j-api: 1.7.14
slf4j-jdk14: 1.7.14
google-cloud-dataflow-java-sdk-all: 2.3.0
google-cloud-bigquery: 0.26.0-beta
grpc-google-common-protos: 1.0.0
beam-sdks-java-io-amazon-web-services: 2.3.0 and 2.4.0
was:
maven-compiler-plugin: 3.6.1
- source: 1.8
- target: 1.8
maven-shade-plugin: 3.1.0
exec-maven-plugin] 1.5.0
google-cloud-dataflow-java-sdk-all: 2.3.0
google-cloud-bigquery: 0.26.0-beta
grpc-google-common-protos: 1.0.0
beam-sdks-java-io-amazon-web-services: 2.3.0 and 2.4.0
> beam-sdks-java-io-amazon-web-services may be global pollution.
> --------------------------------------------------------------
>
> Key: BEAM-3958
> URL: https://issues.apache.org/jira/browse/BEAM-3958
> Project: Beam
> Issue Type: Bug
> Components: io-java-aws, io-java-gcp, runner-dataflow
> Affects Versions: 2.3.0
> Environment: Hardware Overview:
> Model Name: MacBook Pro
> Model Identifier: MacBookPro14,3
> Processor Name: Intel Core i7
> Processor Speed: 2.8 GHz
> maven-compiler-plugin: 3.6.1
> - source: 1.8
> - target: 1.8
> maven-shade-plugin: 3.1.0
> exec-maven-plugin: 1.5.0
> slf4j-api: 1.7.14
> slf4j-jdk14: 1.7.14
> google-cloud-dataflow-java-sdk-all: 2.3.0
> google-cloud-bigquery: 0.26.0-beta
> grpc-google-common-protos: 1.0.0
> beam-sdks-java-io-amazon-web-services: 2.3.0 and 2.4.0
> Reporter: Hideyuki Okada
> Assignee: Ismaël Mejía
> Priority: Major
> Fix For: 2.3.0
>
>
> Note:
> I am sorry if it is difficult to read this report because I am not good at
> English.
> Thank you for implementing S3FileSystem.
> I implemented a program that performs FileIO against AWS S3 on Dataflow,
> and it works.
> However, another Dataflow pipeline that ran correctly before I added the SDK
> to its dependencies no longer works.
> Specifically, the following message is logged after the failing program
> starts executing:
> `Info: The AWS S3 Beam extension was included in this build, but the
> awsRegion flag was not specified. If you do not plan to use S3, then ignore
> this message. [Date]`
> In practice, the jobs created on Dataflow never finish. They keep running
> without emitting any errors or logs.
> If you pass 'awsRegion' as an argument, the pipeline runs successfully,
> but that is a strange workaround.
> It means the AWS SDK is requesting connection information from a program
> that does not access S3. Isn't that contamination?
> As far as I have investigated, this message appears to be logged here:
> https://github.com/apache/beam/blob/7fa6292a21564744011fe94a7e50f7e074564b71/sdks/java/io/amazon-web-services/src/main/java/org/apache/beam/sdk/io/aws/s3/S3FileSystem.java#L108-L112
> Is it mandatory to pass the region as an argument?
> Please tell me if I am wrong. If this is contamination, I hope the
> problem will be fixed.
> The SDK versions that I tried:
> google-cloud-dataflow-java-sdk-all: 2.3.0
> beam-sdks-java-io-amazon-web-services: 2.3.0 and 2.4.0
> Thank you for reading.
>
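The workaround mentioned in the report (passing 'awsRegion' as an argument) could be sketched roughly as follows. This is only an illustration under assumptions: the helper name `withAwsRegion` and the region value `us-east-1` are hypothetical and do not come from the report, and the flag is assumed to use the `--awsRegion=<value>` form. In a real pipeline the resulting array would presumably be handed to `PipelineOptionsFactory.fromArgs(...)`; that call is omitted here so the sketch stays free of Beam dependencies.

```java
import java.util.Arrays;

class AwsRegionArgs {
    // Hypothetical helper (not part of Beam): returns the arguments unchanged
    // when an --awsRegion=... flag is already present, otherwise returns a
    // copy with --awsRegion=<region> appended.
    static String[] withAwsRegion(String[] args, String region) {
        for (String arg : args) {
            if (arg.startsWith("--awsRegion=")) {
                return args;
            }
        }
        String[] result = Arrays.copyOf(args, args.length + 1);
        result[args.length] = "--awsRegion=" + region;
        return result;
    }

    public static void main(String[] args) {
        // "us-east-1" is an assumed placeholder region.
        String[] patched = withAwsRegion(args, "us-east-1");
        System.out.println(String.join(" ", patched));
    }
}
```

This only masks the symptom described above. A pipeline that never touches S3 would still carry an unrelated flag, which is the concern the report raises.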
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)