[
https://issues.apache.org/jira/browse/HADOOP-18154?focusedWorklogId=745954&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-745954
]
ASF GitHub Bot logged work on HADOOP-18154:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 22/Mar/22 17:22
Start Date: 22/Mar/22 17:22
Worklog Time Spent: 10m
Work Description: steveloughran commented on pull request #4070:
URL: https://github.com/apache/hadoop/pull/4070#issuecomment-1075401037
Looking at the `WebIdentityTokenCredentialsProvider` I see that if it
doesn't get the parameters then it will fall back to environment variables. We
absolutely do not want to be picking up env vars, as that will only create support
issues where configurations only work on certain machines. (Actually, we can
ignore the session name settings, as they are harmless.)
I'm going to propose we go with @dannycjones's suggestion and support the
whole set of values, with the prefix `fs.s3a.webidentity` for all of them.
For the ARN, we could have a property `fs.s3a.webidentity.role.arn`,
but what should we do if it isn't set?
1. Fail to initialize.
2. Have that null value force the env var lookup.
I don't see any way to completely block the environment variable
resolution, which is a pain.
I also see in the internal library classes that roles are sometimes set up
with an external ID, but that is not possible here. Is that an issue?
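To make option 1 concrete, here is a minimal sketch of what "fail to initialize" could look like. It is not the code in the pull request: a plain `Map` stands in for the Hadoop `Configuration`, the class name `WebIdentitySettings` is hypothetical, and only `fs.s3a.webidentity.role.arn` comes from the proposal above. The token file and session name keys are assumed names under the same prefix.

```java
import java.util.Map;
import java.util.Optional;

/**
 * Illustrative only: strict resolution of web identity settings.
 * A Map stands in for the Hadoop Configuration object.
 */
final class WebIdentitySettings {
    // Property name proposed in the comment above.
    static final String ROLE_ARN = "fs.s3a.webidentity.role.arn";
    // Assumed names under the same prefix; not confirmed by the source.
    static final String TOKEN_FILE = "fs.s3a.webidentity.token.file";
    static final String SESSION_NAME = "fs.s3a.webidentity.session.name";

    final String roleArn;
    final String tokenFile;
    final Optional<String> sessionName;

    private WebIdentitySettings(String roleArn, String tokenFile,
                                Optional<String> sessionName) {
        this.roleArn = roleArn;
        this.tokenFile = tokenFile;
        this.sessionName = sessionName;
    }

    /** Option 1: reject a missing value instead of passing null downstream. */
    static WebIdentitySettings fromConf(Map<String, String> conf) {
        String arn = required(conf, ROLE_ARN);
        String token = required(conf, TOKEN_FILE);
        // The session name is treated as optional, since it is described as harmless.
        Optional<String> session = Optional.ofNullable(conf.get(SESSION_NAME))
                .map(String::trim)
                .filter(s -> !s.isEmpty());
        return new WebIdentitySettings(arn, token, session);
    }

    private static String required(Map<String, String> conf, String key) {
        String value = conf.get(key);
        if (value == null || value.trim().isEmpty()) {
            // Failing here means no null ever reaches the SDK provider,
            // so its environment variable lookup is never triggered.
            throw new IllegalArgumentException(
                    "Missing required option " + key
                    + "; environment variables are not consulted");
        }
        return value.trim();
    }
}
```

With a check like this, a configuration that omits the ARN fails the same way on every machine, rather than working only where the environment variables happen to be set. Option 2 would drop the check and hand the possibly-null value to the SDK provider.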
Issue Time Tracking
-------------------
Worklog Id: (was: 745954)
Time Spent: 2h 50m (was: 2h 40m)
> S3A Authentication to support WebIdentity
> -----------------------------------------
>
> Key: HADOOP-18154
> URL: https://issues.apache.org/jira/browse/HADOOP-18154
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs/s3
> Affects Versions: 2.10.1
> Reporter: Ju Clarysse
> Assignee: Ju Clarysse
> Priority: Major
> Labels: pull-request-available
> Time Spent: 2h 50m
> Remaining Estimate: 0h
>
> We are using the latest version of
> [delta-sharing|https://github.com/delta-io/delta-sharing] which takes
> advantage of
> [hadoop-aws|https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html]
> (S3A) connector in [Hadoop release version
> 2.10.1|https://github.com/apache/hadoop/tree/rel/release-2.10.1] to mount an
> AWS S3 File System. In our particular setup, all services are operated in
> Amazon Elastic Kubernetes Service (EKS) and need to comply with the AWS
> security concept [IAM roles for service
> accounts|https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html]
> (IRSA).
> As [Delta sharing S3 connection|https://github.com/delta-io/delta-sharing#s3]
> doesn't offer any corresponding support, we patched hadoop-aws-2.10.1 to
> address this need via a new credentials provider class
> org.apache.hadoop.fs.s3a.OIDCTokenCredentialsProvider. We also upgraded
> dependency aws-java-sdk-bundle to its latest version 1.12.167 as [AWS
> WebIdentityTokenCredentialsProvider
> class|https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/auth/WebIdentityTokenCredentialsProvider.html%E2%80%A6]
> was not yet available in original version 1.11.271.
> We believe that other delta-sharing users could benefit from this short-term
> contribution. Then sooner or later, delta-sharing owners will have to upgrade
> their project to a more recent version of hadoop-aws that is probably more
> widely used. The effort to promote this change is probably low.
> Additional note: AWS WebIdentityTokenCredentialsProvider class is directly
> supported by Spark applications submitted with configuration properties
> `spark.hadoop.fs.s3a.aws.credentials.provider` and
> `spark.kubernetes.authenticate.submission.oauthToken`
> ([doc|https://spark.apache.org/docs/latest/running-on-kubernetes.html#spark-properties]).
> So bringing this support to Hadoop will primarily be interesting for
> non-Spark users.