[ https://issues.apache.org/jira/browse/HADOOP-19044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812019#comment-17812019 ]
ASF GitHub Bot commented on HADOOP-19044:
-----------------------------------------
virajjasani commented on code in PR #6479:
URL: https://github.com/apache/hadoop/pull/6479#discussion_r1470040175
##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/DefaultS3ClientFactory.java:
##########
@@ -289,17 +290,35 @@ private <BuilderT extends S3BaseClientBuilder<BuilderT, ClientT>, ClientT> void
builder.fipsEnabled(fipsEnabled);
if (endpoint != null) {
+ boolean overrideEndpoint = true;
checkArgument(!fipsEnabled,
"%s : %s", ERROR_ENDPOINT_WITH_FIPS, endpoint);
- builder.endpointOverride(endpoint);
// No region was configured, try to determine it from the endpoint.
if (region == null) {
- region = getS3RegionFromEndpoint(parameters.getEndpoint());
+ boolean endpointEndsWithCentral =
Review Comment:
> What if we never override if endpoint is s3.amazonaws.com?
That sounds right; let me test various combinations of endpoint and region
before making the changes (a sketch of such a test follows the list below):
e.g.
1. endpoint central and region null
2. endpoint central and region anything other than us-east-2
3. endpoint central and region us-east-2
4. endpoint null and region null
5. endpoint s3-us-east-2.amazonaws.com and region us-east-2 (and null)
6. endpoint s3-us-east-1.amazonaws.com and region us-east-1 (and null)
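A minimal sketch of how one of these combinations could be exercised through
the S3A connector; "example-bucket" is a placeholder, and this shows only the
client-side configuration, not the assertions a real ITest would make:

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class EndpointRegionComboSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Combination 1 from the list above: central endpoint, no region.
    conf.set("fs.s3a.endpoint", "s3.amazonaws.com");
    conf.unset("fs.s3a.endpoint.region");
    // A bucket outside us-east-1 is what exercises the cross-region path.
    try (FileSystem fs = FileSystem.get(URI.create("s3a://example-bucket/"), conf)) {
      fs.exists(new Path("/")); // issues a request that must reach the bucket's region
    }
  }
}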
> AWS SDK V2 - Update S3A region logic
> -------------------------------------
>
> Key: HADOOP-19044
> URL: https://issues.apache.org/jira/browse/HADOOP-19044
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.4.0
> Reporter: Ahmar Suhail
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
>
> If both fs.s3a.endpoint & fs.s3a.endpoint.region are empty, Spark will set
> fs.s3a.endpoint to s3.amazonaws.com here:
> [https://github.com/apache/spark/blob/9a2f39318e3af8b3817dc5e4baf52e548d82063c/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala#L540]
>
>
> HADOOP-18908 updated the region logic so that if fs.s3a.endpoint.region is
> set, or if a region can be parsed from fs.s3a.endpoint (which will happen in
> this case, yielding US_EAST_1), cross-region access is not enabled. This
> causes 400 errors if the bucket is not in US_EAST_1.
>
> Proposed: update the logic so that if the endpoint is the global
> s3.amazonaws.com, cross-region access is enabled.
>
>
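A minimal sketch of the proposed behaviour, written against the public AWS SDK
v2 builder API rather than the actual DefaultS3ClientFactory internals; the
buildClient helper and the https scheme are assumptions for illustration:

import java.net.URI;

import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.S3ClientBuilder;

public class CentralEndpointSketch {

  private static final String CENTRAL_ENDPOINT = "s3.amazonaws.com";

  // Hypothetical helper, not the Hadoop factory method.
  static S3Client buildClient(String endpoint, String region) {
    S3ClientBuilder builder = S3Client.builder();
    boolean central = endpoint != null && endpoint.endsWith(CENTRAL_ENDPOINT);
    if (endpoint != null && !central) {
      // Only non-central endpoints are treated as real overrides.
      builder.endpointOverride(URI.create("https://" + endpoint));
    }
    if (region != null) {
      builder.region(Region.of(region));
    } else {
      // Central (or absent) endpoint with no region: start in us-east-1 and
      // let the SDK redirect to the bucket's actual region instead of
      // failing with a 400.
      builder.region(Region.US_EAST_1).crossRegionAccessEnabled(true);
    }
    return builder.build();
  }
}

With crossRegionAccessEnabled(true), a request that lands in the wrong region
is retried by the SDK against the bucket's actual region, which is what avoids
the 400 described above.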