[ 
https://issues.apache.org/jira/browse/HADOOP-18154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ju Clarysse updated HADOOP-18154:
---------------------------------
    Description: 
We are using the latest version of 
[delta-sharing|https://github.com/delta-io/delta-sharing] which takes advantage 
of the 
[hadoop-aws|https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html]
 (S3A) connector in [Hadoop release version 
2.10.1|https://github.com/apache/hadoop/tree/rel/release-2.10.1] to mount an 
AWS S3 File System. In our particular setup, all services are operated in 
Amazon Elastic Kubernetes Service (EKS) and need to comply with the AWS security 
concept [IAM roles for service 
accounts|https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html]
 (IRSA).

As [Delta sharing S3 connection|https://github.com/delta-io/delta-sharing#s3] 
doesn't offer any corresponding support, we patched hadoop-aws-2.10.1 to 
address this need via a new credentials provider class 
org.apache.hadoop.fs.s3a.OIDCTokenCredentialsProvider. We also upgraded 
dependency aws-java-sdk-bundle to its latest version 1.12.167 as [AWS 
WebIdentityTokenCredentialsProvider 
class|https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/auth/WebIdentityTokenCredentialsProvider.html%E2%80%A6]
 was not yet available in the original version 1.11.271.
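
For context, IRSA hands the role and the OIDC token to a pod through the environment variables `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE`, which a web-identity credentials provider then reads. The following is only a minimal sketch of checking those inputs before S3A is configured; the class `IrsaEnvCheck` and its method are hypothetical names for illustration and are not part of hadoop-aws or of the patch described here.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of hadoop-aws): checks the inputs IRSA injects into a pod. */
final class IrsaEnvCheck {
    // Environment variable names set for pods that use IAM roles for service accounts.
    static final String ROLE_ARN = "AWS_ROLE_ARN";
    static final String TOKEN_FILE = "AWS_WEB_IDENTITY_TOKEN_FILE";

    /** Returns a list of problems; an empty list means both inputs look usable. */
    static List<String> problems(Map<String, String> env) {
        List<String> out = new ArrayList<>();
        String role = env.get(ROLE_ARN);
        if (role == null || role.isEmpty()) {
            out.add(ROLE_ARN + " is not set");
        }
        String tokenFile = env.get(TOKEN_FILE);
        if (tokenFile == null || tokenFile.isEmpty()) {
            out.add(TOKEN_FILE + " is not set");
        } else if (!Files.isReadable(Paths.get(tokenFile))) {
            out.add(TOKEN_FILE + " points to an unreadable path: " + tokenFile);
        }
        return out;
    }

    public static void main(String[] args) {
        // Inspect the real process environment when run inside a pod.
        List<String> found = problems(System.getenv());
        System.out.println(found.isEmpty() ? "IRSA inputs present" : String.join("; ", found));
    }
}
```

Running such a check inside the pod before mounting the file system can help tell a misconfigured service account apart from a credentials provider problem.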

We believe that other delta-sharing users could benefit from this short-term 
contribution. Sooner or later, delta-sharing owners will have to upgrade 
their project to a more recent version of hadoop-aws, which is probably more 
widely used. The effort to promote this change is probably low.

Additional note: the AWS WebIdentityTokenCredentialsProvider class is directly 
supported by Spark applications submitted with the configuration properties 
`spark.hadoop.fs.s3a.aws.credentials.provider` and 
`spark.kubernetes.authenticate.submission.oauthToken` 
([doc|https://spark.apache.org/docs/latest/running-on-kubernetes.html#spark-properties]).
 Bringing this support to Hadoop will therefore primarily be of interest to 
non-Spark users.
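
As an illustration of the Spark route mentioned in the note, the two properties could be rendered as `spark-submit` arguments roughly as follows. This is a sketch under assumptions: the helper class `SparkIrsaConf` is hypothetical, the provider value is the fully qualified name of the AWS SDK class linked above, and the token value is a placeholder supplied by the caller.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: renders the two properties named in the note as spark-submit arguments. */
final class SparkIrsaConf {
    /** Builds the property map; the oauthToken argument is a caller-supplied placeholder. */
    static Map<String, String> properties(String oauthToken) {
        Map<String, String> conf = new LinkedHashMap<>(); // keeps insertion order
        conf.put("spark.hadoop.fs.s3a.aws.credentials.provider",
                 "com.amazonaws.auth.WebIdentityTokenCredentialsProvider");
        conf.put("spark.kubernetes.authenticate.submission.oauthToken", oauthToken);
        return conf;
    }

    /** Turns each property into a "--conf key=value" argument pair. */
    static List<String> toSubmitArgs(Map<String, String> conf) {
        List<String> args = new ArrayList<>();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            args.add("--conf");
            args.add(e.getKey() + "=" + e.getValue());
        }
        return args;
    }

    public static void main(String[] args) {
        // "<token>" is a placeholder, not a real credential.
        System.out.println(String.join(" ", toSubmitArgs(properties("<token>"))));
    }
}
```

Non-Spark users have no equivalent switch in hadoop-aws 2.10.1, which is the gap this issue describes.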

  was:
We are using the latest version of 
[delta-sharing|https://github.com/delta-io/delta-sharing] which takes advantage 
of 
[hadoop-aws|https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html]
 (S3A) connector in [Hadoop release version 
2.10.1|https://github.com/apache/hadoop/tree/rel/release-2.10.1] to mount an 
AWS S3 File System. In our particular setup, all services are operated in 
Amazon Elastic Kubernetes Service (EKS) and need to comply with the AWS security 
concept [IAM roles for service 
accounts|https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html]
 (IRSA).

As [Delta sharing S3 connection|https://github.com/delta-io/delta-sharing#s3] 
doesn't offer any corresponding support, we patched hadoop-aws-2.10.1 to 
address this need via a new credentials provider class 
org.apache.hadoop.fs.s3a.OIDCTokenCredentialsProvider. We also upgraded 
dependency aws-java-sdk-bundle to its latest version 1.12.167 as [AWS 
WebIdentityTokenCredentialsProvider 
class|https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/auth/WebIdentityTokenCredentialsProvider.html%E2%80%A6]
 was not yet available in original version 1.11.271.

We believe that other delta-sharing users could benefit from this short-term 
contribution. Sooner or later, delta-sharing owners will then have to upgrade 
to a more recent version of hadoop-aws that is probably more widely used. The 
effort to promote this change could be limited while the opportunity to make 
other folks happy could be great.


> Extend S3A to WebIdentity
> -------------------------
>
>                 Key: HADOOP-18154
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18154
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/s3
>    Affects Versions: 2.10.1
>            Reporter: Ju Clarysse
>            Assignee: Ju Clarysse
>            Priority: Major
>
> We are using the latest version of 
> [delta-sharing|https://github.com/delta-io/delta-sharing] which takes 
> advantage of 
> [hadoop-aws|https://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html]
>  (S3A) connector in [Hadoop release version 
> 2.10.1|https://github.com/apache/hadoop/tree/rel/release-2.10.1] to mount an 
> AWS S3 File System. In our particular setup, all services are operated in 
> Amazon Elastic Kubernetes Service (EKS) and need to comply with the AWS 
> security concept [IAM roles for service 
> accounts|https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html]
>  (IRSA).
> As [Delta sharing S3 connection|https://github.com/delta-io/delta-sharing#s3] 
> doesn't offer any corresponding support, we patched hadoop-aws-2.10.1 to 
> address this need via a new credentials provider class 
> org.apache.hadoop.fs.s3a.OIDCTokenCredentialsProvider. We also upgraded 
> dependency aws-java-sdk-bundle to its latest version 1.12.167 as [AWS 
> WebIdentityTokenCredentialsProvider 
> class|https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/auth/WebIdentityTokenCredentialsProvider.html%E2%80%A6]
>  was not yet available in the original version 1.11.271.
> We believe that other delta-sharing users could benefit from this short-term 
> contribution. Sooner or later, delta-sharing owners will have to upgrade 
> their project to a more recent version of hadoop-aws, which is probably more 
> widely used. The effort to promote this change is probably low.
> Additional note: the AWS WebIdentityTokenCredentialsProvider class is directly 
> supported by Spark applications submitted with the configuration properties 
> `spark.hadoop.fs.s3a.aws.credentials.provider` and 
> `spark.kubernetes.authenticate.submission.oauthToken` 
> ([doc|https://spark.apache.org/docs/latest/running-on-kubernetes.html#spark-properties]).
>  Bringing this support to Hadoop will therefore primarily be of interest to 
> non-Spark users.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

