[ https://issues.apache.org/jira/browse/HADOOP-18095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17494687#comment-17494687 ]

Daniel Carl Jones commented on HADOOP-18095:
--------------------------------------------

There are quite a few items to tackle in one patch, so I'd be inclined to 
split it into more manageable chunks.

Below are some rough ideas for how to break this down, each with a code sketch 
after the list. I'm new to the project and codebase, so feedback is most 
welcome.
 * Simple fixes to the SSE tests so they do not assume there will be "aws" in 
the ARN. Keep it simple with a wildcard? e.g. "arn:aws*:kms:" (sketch 1 below).
 * Add an (unstable?) API to StoreContext for querying the partition (sketch 2 
below).
 * Introduction of PublicDatasetTestUtils, as proposed previously in some of 
the ideas for refactoring S3A incrementally (sketch 3 below). Some of its 
responsibilities:
 ** Source of truth for getting the URI of a public data set.
 ** Maybe keep the methods specific to their purpose where possible? We might 
need landsat files specifically, but other tests may just need a bucket with a 
bunch of keys.
 ** Introduce test assumptions about the S3 endpoint or AWS partition: if 
we're not looking at the 'aws' partition, skip the test (see sketch 2).
 ** Ideally allow for future extension, providing easy ways to override the 
bucket if a tester has an alternative source? I see 
"fs.s3a.scale.test.csvfile" already has a little bit of this.
 * Potentially need to tackle DelegationTokens; I'm not sure if this is 
covered by the default test config. I can see the need to change RolePolicies 
a bit to take in the partition, and to allow StoreContext to provide bucket 
ARNs, as [the current ARNs hard-code the aws 
partition|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/auth/RolePolicies.java#L328-L346] 
(sketch 4 below).
 * Take another look at the Access Point probing logic. There appear to be 
more 403 errors than [the assumed "inaccessible" error 
message|https://github.com/bogthe/hadoop/blob/0aa3bec35fe2fcca50d2f5cb481ebac6611989cc/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L789-L792] 
covers: in this case, we encountered a 403 for invalid credentials. Possibly 
[take the SDK 
code|https://github.com/aws/aws-sdk-java/blob/fd409dee8ae23fb8953e0bb4dbde65536a7e0514/aws-java-sdk-s3/src/main/java/com/amazonaws/services/s3/AmazonS3Client.java#L1402-L1419] 
and add the extra case for Access Points (sketch 5 below).
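
Sketch 1: the SSE test fix could be as simple as loosening the assertion from 
a literal prefix to a pattern. This is a generic illustration, not copied from 
the actual tests:

{code:java}
import static org.junit.Assert.assertTrue;

/** Sketch only: partition-agnostic ARN assertion for the SSE-KMS tests. */
public final class KmsArnAssertion {

  private KmsArnAssertion() {
  }

  /** Accept a key ARN from any partition: aws, aws-cn, aws-us-gov, ... */
  public static void assertIsKmsKeyArn(String keyArn) {
    assertTrue("unexpected KMS key ARN " + keyArn,
        keyArn.matches("arn:aws[a-z-]*:kms:.*"));
  }
}
{code}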
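
Sketch 2: the partition lookup and the matching test assumption. All names 
here ({{partitionOf}}, {{assumePartition}}) are hypothetical, and the 
prefix-based mapping is deliberately simplified; the SDK's region metadata may 
be a better source of truth for the partition:

{code:java}
import org.junit.Assume;

/** Sketch only: hypothetical partition helpers for StoreContext and tests. */
public final class PartitionSupport {

  private PartitionSupport() {
  }

  /**
   * Derive the AWS partition from a region name. A real implementation
   * could use the SDK's region metadata rather than prefix checks.
   */
  public static String partitionOf(String region) {
    if (region.startsWith("cn-")) {
      return "aws-cn";
    }
    if (region.startsWith("us-gov-")) {
      return "aws-us-gov";
    }
    return "aws";
  }

  /** Skip the calling test unless the bucket is in the expected partition. */
  public static void assumePartition(String expected, String region) {
    Assume.assumeTrue(
        "test requires partition " + expected + " but region "
            + region + " is in partition " + partitionOf(region),
        expected.equals(partitionOf(region)));
  }
}
{code}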
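
Sketch 3: a rough shape for {{PublicDatasetTestUtils}}. Only 
{{fs.s3a.scale.test.csvfile}} and its landsat default exist today; the other 
config key and the method names are made up:

{code:java}
import org.apache.hadoop.conf.Configuration;

/** Sketch only: a single source of truth for public test datasets. */
public final class PublicDatasetTestUtils {

  /** Existing override point, already used by the scale tests. */
  private static final String KEY_CSVTEST_FILE = "fs.s3a.scale.test.csvfile";

  /** Default landsat CSV; only resolvable in the "aws" partition. */
  private static final String DEFAULT_CSVTEST_FILE =
      "s3a://landsat-pds/scene_list.gz";

  /** Hypothetical override for tests which just need lots of keys. */
  private static final String KEY_BUCKET_WITH_MANY_KEYS =
      "fs.s3a.test.bucket.many.keys";

  private PublicDatasetTestUtils() {
  }

  /** A multi-MB CSV file; testers can point this at an alternative source. */
  public static String getExternalCsvFile(Configuration conf) {
    return conf.getTrimmed(KEY_CSVTEST_FILE, DEFAULT_CSVTEST_FILE);
  }

  /** Purpose-specific lookup: any bucket with a bunch of keys will do. */
  public static String getBucketWithManyKeys(Configuration conf) {
    return conf.getTrimmed(KEY_BUCKET_WITH_MANY_KEYS, "s3a://landsat-pds/");
  }
}
{code}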
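
Sketch 4: for RolePolicies, the idea would be to thread the partition through 
ARN construction instead of hard-coding "aws". StoreContext could then hand 
its partition (and the bucket ARN) to callers so no production code ever 
concatenates "arn:aws:" directly. Method names are illustrative:

{code:java}
/** Sketch only: partition-aware ARN construction for RolePolicies. */
public final class PartitionArns {

  private PartitionArns() {
  }

  /** KMS wildcard ARN for a given partition, e.g. "aws-cn". */
  public static String allKmsKeys(String partition) {
    return String.format("arn:%s:kms:*", partition);
  }

  /** ARN of a bucket in the given partition. */
  public static String bucketArn(String partition, String bucket) {
    return String.format("arn:%s:s3:::%s", partition, bucket);
  }
}
{code}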
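
Sketch 5: for the access point probe, check the S3 error code rather than the 
message text, so a 403 from bad credentials is not misread as an inaccessible 
access point. This mirrors the linked SDK logic but is not the actual 
S3AFileSystem code:

{code:java}
import com.amazonaws.AmazonServiceException;

/** Sketch only: broaden the 403 handling in the access point probe. */
final class AccessPointProbe {

  private AccessPointProbe() {
  }

  /**
   * Distinguish "access point exists but is inaccessible" from other
   * 403s such as invalid credentials.
   */
  static boolean isInaccessibleAccessPoint(AmazonServiceException e) {
    if (e.getStatusCode() != 403) {
      return false;
    }
    String code = e.getErrorCode();
    // InvalidAccessKeyId / SignatureDoesNotMatch mean bad credentials,
    // not an inaccessible access point.
    return !"InvalidAccessKeyId".equals(code)
        && !"SignatureDoesNotMatch".equals(code);
  }
}
{code}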

Does this sound reasonable and, if so, how would we model this with Jira?

> s3a connector to fully support AWS partitions,
> ----------------------------------------------
>
>                 Key: HADOOP-18095
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18095
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.3.2
>            Reporter: Steve Loughran
>            Priority: Minor
>
> There are some minor issues in using the S3A connector's more advanced 
> features in China.
> see https://docs.aws.amazon.com/general/latest/gr/aws-arns-and-namespaces.html
> Specifically, that "arn:aws:" prefix we use for all ARNs needs to be 
> configurable so that aws-cn can be used instead.
> This means finding where we create and use these in production code 
> (dynamically creating IAM role policies) and in tests, and making it 
> configurable.  
> proposed
> * add an option {{fs.s3a.aws.partition}}, default aws.
> * new StoreContext methods to query this, and create the arn for the current 
> bucket (string concat or from the bucket's ARN if created with an AP ARN)
> * docs
> I remember ABFS had a problem with oauth endpoints, that was a lot more 
> serious.
> Can't think of real tests for this, other than verifying that if you create 
> an invalid partition "aws-mars" some things break.
> someone needs to run all our existing tests in China, including those with 
> IAM roles and SSE-KMS.


