[
https://issues.apache.org/jira/browse/HADOOP-19044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812710#comment-17812710
]
ASF GitHub Bot commented on HADOOP-19044:
-----------------------------------------
steveloughran commented on code in PR #6479:
URL: https://github.com/apache/hadoop/pull/6479#discussion_r1472852413
##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/DefaultS3ClientFactory.java:
##########
@@ -290,35 +290,34 @@ private <BuilderT extends S3BaseClientBuilder<BuilderT,
ClientT>, ClientT> void
builder.fipsEnabled(fipsEnabled);
if (endpoint != null) {
- boolean overrideEndpoint = true;
checkArgument(!fipsEnabled,
"%s : %s", ERROR_ENDPOINT_WITH_FIPS, endpoint);
boolean endpointEndsWithCentral =
endpointStr.endsWith(CENTRAL_ENDPOINT);
- // No region was configured or the endpoint is central,
+
+ // No region was configured,
// determine the region from the endpoint.
- if (region == null || endpointEndsWithCentral) {
+ if (region == null) {
region = getS3RegionFromEndpoint(endpointStr,
endpointEndsWithCentral);
if (region != null) {
origin = "endpoint";
- if (endpointEndsWithCentral) {
- // No need to override endpoint with "s3.amazonaws.com".
- // Let the client take care of endpoint resolution. Overriding
- // the endpoint with "s3.amazonaws.com" causes 400 Bad Request
- // errors for non-existent buckets and objects.
- // ref: https://github.com/aws/aws-sdk-java-v2/issues/4846
- overrideEndpoint = false;
- builder.crossRegionAccessEnabled(true);
- origin = "origin with cross region access";
- LOG.debug("Enabling cross region access for endpoint {}",
- endpointStr);
- }
}
}
- if (overrideEndpoint) {
+
+ // No need to override endpoint with "s3.amazonaws.com".
+ // Let the client take care of endpoint resolution. Overriding
+ // the endpoint with "s3.amazonaws.com" causes 400 Bad Request
+ // errors for non-existent buckets and objects.
+ // ref: https://github.com/aws/aws-sdk-java-v2/issues/4846
+ if (!endpointEndsWithCentral) {
builder.endpointOverride(endpoint);
LOG.debug("Setting endpoint to {}", endpoint);
+ } else {
+ builder.crossRegionAccessEnabled(true);
+ origin = "origin with cross region access";
Review Comment:
"central endpoint with cross region access"
##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3AEndpointRegion.java:
##########
@@ -257,6 +283,83 @@ public void testWithVPCE() throws Throwable {
expectInterceptorException(client);
}
+ @Test
+ public void testCentralEndpointWithUSWest2Region() throws Throwable {
+ describe("Access bucket using central endpoint and us-west-2 region");
+ final Configuration conf = getConfiguration();
+ removeBaseAndBucketOverrides(conf, ENDPOINT, AWS_REGION);
+
+ final Configuration newConf = new Configuration(conf);
+
+ newConf.set(ENDPOINT, CENTRAL_ENDPOINT);
+ newConf.set(AWS_REGION, US_WEST_2);
+
+ newFS = new S3AFileSystem();
+ newFS.initialize(getFileSystem().getUri(), newConf);
+
+ assertOpsUsingNewFs();
+ }
+
+ @Test
+ public void testCentralEndpointWithEUWest2Region() throws Throwable {
Review Comment:
let's go with a public bucket
##########
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/aws_sdk_v2_changelog.md:
##########
@@ -93,6 +93,20 @@ the bucket. The behaviour has now been changed to:
response. Otherwise it will throw an error with status code 301 permanently
moved. This error
contains the region of the bucket in its header, which we can then use to
configure the client.
Review Comment:
* put it in `connecting.md`, link in index.md as "how to connect"
* Highlight that it is complex and improving, so may be considered unstable.
* Link to `third_party_stores.html` to make clear if you are working with
third party stores to look there.
> AWS SDK V2 - Update S3A region logic
> -------------------------------------
>
> Key: HADOOP-19044
> URL: https://issues.apache.org/jira/browse/HADOOP-19044
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.4.0
> Reporter: Ahmar Suhail
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
>
> If both fs.s3a.endpoint & fs.s3a.endpoint.region are empty, Spark will set
> fs.s3a.endpoint to
> s3.amazonaws.com here:
> [https://github.com/apache/spark/blob/9a2f39318e3af8b3817dc5e4baf52e548d82063c/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala#L540]
>
>
> HADOOP-18908, updated the region logic such that if fs.s3a.endpoint.region is
> set, or if a region can be parsed from fs.s3a.endpoint (which will happen in
> this case, region will be US_EAST_1), cross region access is not enabled.
> This will cause 400 errors if the bucket is not in US_EAST_1.
>
> Proposed: Updated the logic so that if the endpoint is the global
> s3.amazonaws.com , cross region access is enabled.
>
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]