[
https://issues.apache.org/jira/browse/BEAM-10335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17270754#comment-17270754
]
Tao Li edited comment on BEAM-10335 at 1/23/21, 9:00 PM:
---------------------------------------------------------
Hi [~jalmeida], thanks for contributing this nice feature to enable assume
role. I am testing the feature by specifying an IAM role and accessing S3
data. I am using:
# Beam 2.25
# Direct runner for testing
# ParquetIO to read Parquet files
Below is the command.
java -cp <jar file> --inputPath="s3://output/path/*.parquet"
--awsCredentialsProvider='{"@type":"STSAssumeRoleSessionCredentialsProvider",
"roleArn":"<iam role name>", "roleSessionName":"<session name>"}'
Is this the right way to specify the role? I am seeing a potential problem with
this command. I have made sure the specified IAM role has read access to the S3
files (I can access these files by using the AWS SDK directly), but I am seeing
the error below with the java command above. Please advise. Thanks!
{noformat}
Exception in thread "main" org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.io.IOException: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: 8E7E2BFE1F794A36; S3 Extended Request ID: w5Vqpj3rzt3OxRQXyhXNxpJLI/AoVn2Q9v0vvQFHFvKAh3yBvtOGEFiC9m3CeZlDZ3fVhKv/qBQ=), S3 Extended Request ID: w5Vqpj3rzt3OxRQXyhXNxpJLI/AoVn2Q9v0vvQFHFvKAh3yBvtOGEFiC9m3CeZlDZ3fVhKv/qBQ=
    at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:353)
    at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:321)
    at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:216)
    at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:67)
    at org.apache.beam.sdk.Pipeline.run(Pipeline.java:317)
    at org.apache.beam.sdk.Pipeline.run(Pipeline.java:303)
    at com.zillow.pipeler.utils.ExecutionContext.runPipeline(ExecutionContext.java:44)
    at com.zillow.pipeler.orchestrator.BaseOrchestrator.run(BaseOrchestrator.java:65)
    at com.zillow.pipeler.orchestrator.transform.DatasetFlattenerOrchestrator.main(DatasetFlattenerOrchestrator.java:105)
Caused by: java.io.IOException: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: 8E7E2BFE1F794A36; S3 Extended Request ID: w5Vqpj3rzt3OxRQXyhXNxpJLI/AoVn2Q9v0vvQFHFvKAh3yBvtOGEFiC9m3CeZlDZ3fVhKv/qBQ=), S3 Extended Request ID: w5Vqpj3rzt3OxRQXyhXNxpJLI/AoVn2Q9v0vvQFHFvKAh3yBvtOGEFiC9m3CeZlDZ3fVhKv/qBQ=
    at org.apache.beam.sdk.io.aws.s3.S3FileSystem.getPathContentEncoding(S3FileSystem.java:346)
    at org.apache.beam.sdk.io.aws.s3.S3FileSystem.lambda$matchGlobPaths$1(S3FileSystem.java:199)
    at org.apache.beam.sdk.util.MoreFutures.lambda$supplyAsync$0(MoreFutures.java:104)
    at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1640)
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: 8E7E2BFE1F794A36; S3 Extended Request ID: w5Vqpj3rzt3OxRQXyhXNxpJLI/AoVn2Q9v0vvQFHFvKAh3yBvtOGEFiC9m3CeZlDZ3fVhKv/qBQ=), S3 Extended Request ID: w5Vqpj3rzt3OxRQXyhXNxpJLI/AoVn2Q9v0vvQFHFvKAh3yBvtOGEFiC9m3CeZlDZ3fVhKv/qBQ=
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1742)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1371)
{noformat}
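One thing worth ruling out for the command above: the AWS STS AssumeRole call expects "roleArn" to be a full role ARN (arn:aws:iam::<account id>:role/<role name>), not a bare role name. The following is a minimal sketch, not part of Beam, that checks the shape of the two values and assembles the --awsCredentialsProvider option string. The class and method names, the example account ID and role name, and the regular expressions (an approximation of the formats documented by AWS) are all assumptions for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Beam or the AWS SDK.
class AssumeRoleOption {
    // Approximate shape of an IAM role ARN:
    // arn:<partition>:iam::<12-digit account id>:role/<optional path and role name>
    private static final Pattern ROLE_ARN =
        Pattern.compile("^arn:aws[a-z-]*:iam::\\d{12}:role/[\\w+=,.@/-]+$");

    // Approximate shape of an STS role session name (2 to 64 characters).
    private static final Pattern SESSION_NAME =
        Pattern.compile("^[\\w+=,.@-]{2,64}$");

    public static boolean isRoleArn(String value) {
        return value != null && ROLE_ARN.matcher(value).matches();
    }

    public static boolean isSessionName(String value) {
        return value != null && SESSION_NAME.matcher(value).matches();
    }

    // Builds the pipeline option using the JSON keys from the command above.
    // Both values are validated first, so no JSON escaping is needed.
    public static String awsCredentialsProviderOption(String roleArn, String roleSessionName) {
        if (!isRoleArn(roleArn)) {
            throw new IllegalArgumentException("Not a full IAM role ARN: " + roleArn);
        }
        if (!isSessionName(roleSessionName)) {
            throw new IllegalArgumentException("Invalid role session name: " + roleSessionName);
        }
        return "--awsCredentialsProvider={\"@type\":\"STSAssumeRoleSessionCredentialsProvider\","
            + "\"roleArn\":\"" + roleArn + "\","
            + "\"roleSessionName\":\"" + roleSessionName + "\"}";
    }

    public static void main(String[] args) {
        // Example values are placeholders, not real resources.
        System.out.println(awsCredentialsProviderOption(
            "arn:aws:iam::123456789012:role/example-read-role", "beam-test-session"));
    }
}
```

If the ARN check passes and the 403 persists, the cause may lie elsewhere, for example in the role's trust policy or in the bucket policy.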
was (Author: sekiforever): the same comment text as above, without the noformat block around the stack trace.
> Add STS Assume role credentials provider to AwsModule
> -----------------------------------------------------
>
> Key: BEAM-10335
> URL: https://issues.apache.org/jira/browse/BEAM-10335
> Project: Beam
> Issue Type: Improvement
> Components: io-java-aws
> Reporter: Julius Almeida
> Assignee: Julius Almeida
> Priority: P2
> Labels: starter
> Fix For: 2.24.0
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> To perform multi-account S3 writes, we need to assume a role.
> The current implementation of AwsModule has no option to serialize and
> deserialize credentials provided by STSAssumeRoleSessionCredentialsProvider.
> We need to add support for STSAssumeRoleSessionCredentialsProvider in AwsModule.
> AwsModule.class :
> [https://github.com/apache/beam/blob/master/sdks/java/io/amazon-web-services/src/main/java/org/apache/beam/sdk/io/aws/options/AwsModule.java]
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)