[
https://issues.apache.org/jira/browse/HADOOP-18908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steve Loughran updated HADOOP-18908:
------------------------------------
Description:
The S3A region logic has been improved for better inference and
for compatibility with previous releases:
1. If you are using an AWS S3 AccessPoint, its region is determined
from the ARN itself.
2. If fs.s3a.endpoint.region is set and non-empty, it is used.
3. If fs.s3a.endpoint is an s3.*.amazonaws.com URL,
the region is determined by parsing the URL.
Note: vpce endpoints are not handled by this.
4. If fs.s3a.endpoint.region==null, and none could be determined
from the endpoint, use us-east-2 as default.
5. If fs.s3a.endpoint.region=="" then it is handed off to
the default AWS SDK resolution process.
Consult the AWS SDK documentation for the details on its resolution
process, knowing that it is complicated and may use environment variables,
entries in ~/.aws/config, IAM instance information within
EC2 deployments and possibly even JSON resources on the classpath.
Put differently: it is somewhat brittle across deployments.
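
The resolution order above can be sketched roughly as follows. This is
an illustration only, not the S3A implementation: the class name, method
signature and regular expression are hypothetical, and it assumes the
steps are applied strictly in the order listed. An empty Optional stands
for "hand off to the AWS SDK".

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of the resolution order; not the S3A code. */
final class RegionResolutionSketch {

  /** Fallback region named in step 4. */
  static final String DEFAULT_REGION = "us-east-2";

  /**
   * Simplified matcher for s3.REGION.amazonaws.com with an optional
   * scheme. Anything else, including vpce endpoints, does not match.
   */
  private static final Pattern AWS_S3_ENDPOINT =
      Pattern.compile("^(?:https?://)?s3\\.([a-z0-9-]+)\\.amazonaws\\.com/?$");

  private RegionResolutionSketch() {
  }

  /**
   * @param accessPointArn access point ARN, or null if none is in use
   * @param endpointRegion value of fs.s3a.endpoint.region; null if unset
   * @param endpoint value of fs.s3a.endpoint; null if unset
   * @return the region, or empty to defer to the AWS SDK
   */
  static Optional<String> resolve(String accessPointArn,
      String endpointRegion, String endpoint) {
    // 1. ARN layout is arn:partition:service:region:account:resource,
    //    so the region is the fourth colon-separated field.
    if (accessPointArn != null) {
      String[] parts = accessPointArn.split(":", 6);
      if (parts.length == 6 && !parts[3].isEmpty()) {
        return Optional.of(parts[3]);
      }
      // Malformed ARN: this sketch simply falls through.
    }
    // 2. An explicit, non-empty region setting wins.
    if (endpointRegion != null && !endpointRegion.isEmpty()) {
      return Optional.of(endpointRegion);
    }
    // 3. Try to parse the region out of a standard AWS S3 endpoint.
    if (endpoint != null) {
      Matcher m = AWS_S3_ENDPOINT.matcher(endpoint);
      if (m.matches()) {
        return Optional.of(m.group(1));
      }
    }
    // 4. Region unset and nothing inferred: use the default.
    if (endpointRegion == null) {
      return Optional.of(DEFAULT_REGION);
    }
    // 5. Region explicitly set to "": leave it to the SDK.
    return Optional.empty();
  }
}
```

For example, resolve(null, null, "https://s3.eu-west-1.amazonaws.com")
yields eu-west-1, while resolve(null, "", "https://storage.example.org")
yields an empty Optional, meaning the SDK decides.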
was:
S3A now requires fs.s3a.endpoint.region to be set. While it can
determine the region from a network call, this takes time and doesn't work
for third-party stores.
Proposed:
* reinstate parsing of the fs.s3a.endpoint URL to automatically determine the
region from well-known endpoints (and vplink ones)
* don't try to talk to AWS S3 if the endpoint isn't an AWS one: in that case
the caller must declare the region (HADOOP-18673)
* document this in the v2 migration guide, including stack traces of failures
> Improve s3a region handling, including determining from endpoint
> ----------------------------------------------------------------
>
> Key: HADOOP-18908
> URL: https://issues.apache.org/jira/browse/HADOOP-18908
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.4.0
> Reporter: Steve Loughran
> Assignee: Ahmar Suhail
> Priority: Major
> Labels: pull-request-available
>