bogthe commented on a change in pull request #3260: URL: https://github.com/apache/hadoop/pull/3260#discussion_r684147891
##########
File path: hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md
##########
@@ -1576,6 +1576,81 @@ Why explicitly declare a bucket bound to the central endpoint? It ensures that if the default endpoint is changed to a new region, data store in US-east is still reachable.
+## <a name="accesspoints"></a>Configuring S3 AccessPoints usage with S3a
+S3a now supports [S3 Access Point](https://aws.amazon.com/s3/features/access-points/) usage which
+improves VPC integration with S3 and simplifies your data's permission model because different
+policies can be applied now on the Access Point level. For more information about why to use them
+make sure to read the official documentation.
+
+Accessing data through an access point, is done by using its ARN, as opposed to just the bucket name.
+You can set the Access Point ARN property using the following configuration property:
+```xml
+<property>
+  <name>fs.s3a.accesspoint.arn</name>
+  <value> {ACCESSPOINT_ARN_HERE} </value>
+  <description>Configure S3a traffic to use this AccessPoint</description>
+</property>
+```
+
+Be mindful that this configures **all access** to S3a, and in turn S3, to go through that ARN.
+So for example `s3a://yourbucket/key` will now use your configured ARN when getting data from S3
+instead of your bucket. The flip side to this is that if you're working with multiple buckets
+`s3a://yourbucket` and `s3a://yourotherbucket` both of their requests will go through the same
+Access Point ARN. To configure different Access Point ARNs, per bucket overrides can be used with
+access point names instead of bucket names as such:
+
+- Let's assume you have an existing workflow with the following paths `s3a://data-bucket`,
+`s3a://output-bucket` and you want to work with a new Access Point called `finance-accesspoint`. All
+you would then need to add is the following per bucket configuration change:
+```xml
+<property>
+  <name>fs.s3a.bucket.finance-accesspoint.accesspoint.arn</name>
+  <value> arn:aws:s3:eu-west-1:123456789101:accesspoint/finance-accesspoint </value>
+</property>
+```
+
+While keeping the global `accesspoint.arn` property set to empty `" "` which is the default.

Review comment:
Yeah, my fault for mixing it up. I thought the default was `" "` not `""` for properties.
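
For readers following the thread, the per-bucket override in the quoted documentation can be sketched roughly as below. This is a minimal illustration in plain Java, not the S3A implementation: the class `AccessPointLookup` and its methods are hypothetical, a `Map` stands in for the Hadoop configuration, and the rule that values are trimmed and that a blank per-bucket value falls back to the global one is an assumption made for the example. Only the two property names and the sample ARN come from the diff.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of hadoop-aws. */
public class AccessPointLookup {
    // Global property name taken from the quoted documentation.
    public static final String GLOBAL_KEY = "fs.s3a.accesspoint.arn";

    /** Builds the per-bucket property name for a bucket or access point name. */
    public static String perBucketKey(String name) {
        return "fs.s3a.bucket." + name + ".accesspoint.arn";
    }

    /**
     * Returns the ARN for the given name, preferring the per-bucket property.
     * Assumption: values are trimmed, and a blank value counts as unset.
     */
    public static String resolveArn(Map<String, String> conf, String name) {
        String value = conf.get(perBucketKey(name));
        if (value == null || value.trim().isEmpty()) {
            // Fall back to the global property; "" means no access point.
            value = conf.getOrDefault(GLOBAL_KEY, "");
        }
        return value.trim();
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(GLOBAL_KEY, ""); // global default left empty
        conf.put(perBucketKey("finance-accesspoint"),
                " arn:aws:s3:eu-west-1:123456789101:accesspoint/finance-accesspoint ");

        // Only the access point name picks up an ARN; plain buckets do not.
        System.out.println("[" + resolveArn(conf, "finance-accesspoint") + "]");
        System.out.println("[" + resolveArn(conf, "data-bucket") + "]");
    }
}
```

Under the trimming assumption above, a global value of `" "` and a global value of `""` resolve to the same thing, which is why the distinction raised in this review is about what the documentation states as the default rather than about behaviour in this sketch.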
