[
https://issues.apache.org/jira/browse/HADOOP-19044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17809495#comment-17809495
]
ASF GitHub Bot commented on HADOOP-19044:
-----------------------------------------
ahmarsuhail commented on code in PR #6482:
URL: https://github.com/apache/hadoop/pull/6482#discussion_r1461990948
##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/DefaultS3ClientFactory.java:
##########
@@ -291,15 +291,25 @@ private <BuilderT extends S3BaseClientBuilder<BuilderT,
ClientT>, ClientT> void
if (endpoint != null) {
checkArgument(!fipsEnabled,
"%s : %s", ERROR_ENDPOINT_WITH_FIPS, endpoint);
- builder.endpointOverride(endpoint);
- // No region was configured, try to determine it from the endpoint.
- if (region == null) {
- region = getS3RegionFromEndpoint(parameters.getEndpoint());
- if (region != null) {
- origin = "endpoint";
+ if(parameters.getEndpoint().equals(CENTRAL_ENDPOINT)){
Review Comment:
If someone sets `fs.s3a.endpoint` to `s3.amazonaws.com` and
`fs.s3a.endpoint.region` to `eu-west-1`, this code will ignore the value of
`fs.s3a.endpoint.region` and enable cross-region access for everything. I
think this could be risky: people may well have `fs.s3a.endpoint` set to
`s3.amazonaws.com` in their core-site.xml, since it makes no difference as
long as the region is set correctly.
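
To make the concern concrete, here is a minimal sketch of one precedence order that would avoid it: an explicitly configured region wins, and the central endpoint only triggers cross-region access when no region is set. This is not the PR's code. The class `RegionResolutionSketch`, the `Decision` type, the regex-based endpoint parsing, and the `FALLBACK_REGION` value are all hypothetical, and the value of `CENTRAL_ENDPOINT` is assumed from the discussion above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical, simplified model of the region decision; not the PR's code. */
class RegionResolutionSketch {

  /** Assumed value of the CENTRAL_ENDPOINT constant referenced in the diff. */
  static final String CENTRAL_ENDPOINT = "s3.amazonaws.com";

  /** Placeholder region used when cross-region access is enabled (assumption). */
  static final String FALLBACK_REGION = "us-east-2";

  /** Simplified matcher for endpoints such as s3.eu-west-1.amazonaws.com. */
  private static final Pattern REGIONAL =
      Pattern.compile("^s3[.-]([a-z0-9-]+)\\.amazonaws\\.com$");

  /** Outcome of the decision: region, cross-region flag, and where the region came from. */
  static final class Decision {
    final String region;
    final boolean crossRegionAccess;
    final String origin;

    Decision(String region, boolean crossRegionAccess, String origin) {
      this.region = region;
      this.crossRegionAccess = crossRegionAccess;
      this.origin = origin;
    }

    @Override
    public String toString() {
      return "region=" + region + ", crossRegionAccess=" + crossRegionAccess
          + ", origin=" + origin;
    }
  }

  static Decision resolve(String endpoint, String configuredRegion) {
    // 1. An explicitly configured region always wins, even with the central endpoint.
    if (configuredRegion != null && !configuredRegion.isEmpty()) {
      return new Decision(configuredRegion, false, "fs.s3a.endpoint.region");
    }
    // 2. No region configured: try to parse one from a non-central endpoint.
    if (endpoint != null && !endpoint.isEmpty()
        && !CENTRAL_ENDPOINT.equals(endpoint)) {
      Matcher m = REGIONAL.matcher(endpoint);
      if (m.matches()) {
        return new Decision(m.group(1), false, "endpoint");
      }
    }
    // 3. Central endpoint, no endpoint, or an endpoint this sketch cannot parse:
    //    fall back to cross-region access.
    return new Decision(FALLBACK_REGION, true, "cross-region fallback");
  }

  public static void main(String[] args) {
    // Central endpoint plus explicit region: the configured region is kept.
    System.out.println(resolve("s3.amazonaws.com", "eu-west-1"));
    // Central endpoint alone (the Spark default case): cross-region access is enabled.
    System.out.println(resolve("s3.amazonaws.com", null));
    // Regional endpoint without a region: the region is parsed from the endpoint.
    System.out.println(resolve("s3.eu-west-1.amazonaws.com", null));
  }
}
```

With this ordering, the Spark default case from the issue (central endpoint, no region) still gets cross-region access, while a core-site.xml that sets both the central endpoint and a region keeps the configured region.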
> AWS SDK V2 - Update S3A region logic
> -------------------------------------
>
> Key: HADOOP-19044
> URL: https://issues.apache.org/jira/browse/HADOOP-19044
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.4.0
> Reporter: Ahmar Suhail
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
>
> If both fs.s3a.endpoint and fs.s3a.endpoint.region are empty, Spark will set
> fs.s3a.endpoint to s3.amazonaws.com here:
> [https://github.com/apache/spark/blob/9a2f39318e3af8b3817dc5e4baf52e548d82063c/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala#L540]
>
>
> HADOOP-18908 updated the region logic so that cross-region access is not
> enabled if fs.s3a.endpoint.region is set, or if a region can be parsed from
> fs.s3a.endpoint. The latter happens in this case: the parsed region is
> US_EAST_1, which will cause 400 errors if the bucket is not in US_EAST_1.
>
> Proposed: update the logic so that if the endpoint is the global
> s3.amazonaws.com, cross-region access is enabled.
>
>