Re: Does Spark support role-based authentication and access to Amazon S3? (Kubernetes cluster deployment)

2023-12-13 Thread Koert Kuipers
Yes, it does, using IAM roles for service accounts (IRSA).
See:
https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html

I also wrote a bit about this here:
https://technotes.tresata.com/spark-on-k8s/
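
To make that concrete, here is a minimal sketch of what the S3A side can look like when the driver and executor pods run under a Kubernetes service account annotated with an IAM role (so the EKS webhook injects AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE into the pods). It assumes hadoop-aws 3.3.4 with its bundled AWS SDK v1 on the classpath; the app name, bucket, and prefix below are placeholders, not values from this thread.

import org.apache.spark.sql.SparkSession;

public class S3aIrsaSketch {
  public static void main(String[] args) {
    // With IRSA, the web-identity token mounted into the pod already maps to the
    // IAM role, so S3A only needs a credentials provider that reads
    // AWS_ROLE_ARN / AWS_WEB_IDENTITY_TOKEN_FILE from the environment.
    SparkSession spark = SparkSession.builder()
        .appName("s3a-irsa-sketch")                        // placeholder app name
        .config("spark.hadoop.fs.s3a.aws.credentials.provider",
            "com.amazonaws.auth.WebIdentityTokenCredentialsProvider")
        .getOrCreate();

    // Any s3a:// access now authenticates with the service-account role.
    spark.read().text("s3a://my-bucket/my-prefix/").show(); // placeholder bucket/prefix

    spark.stop();
  }
}

The point being: the web-identity token already resolves directly to the role's credentials, so the extra fs.s3a.assumed.role.* / AssumedRoleCredentialProvider layer is usually not needed in this setup.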

On Wed, Dec 13, 2023 at 7:52 AM Atul Patil  wrote:

> Hello Team,
>
>
>
> Does Spark support role-based authentication and access to Amazon S3 for
> Kubernetes deployment?
>
> *Note: we have deployed our Spark application on a Kubernetes cluster.*
>
>
>
> Below are the Hadoop-AWS dependencies we are using:
>
> <dependency>
>    <groupId>org.apache.hadoop</groupId>
>    <artifactId>hadoop-aws</artifactId>
>    <version>3.3.4</version>
> </dependency>
>
>
>
> We are using the following configuration when creating the Spark session,
> but it is not working:
>
> sparkSession.sparkContext().hadoopConfiguration().set("fs.s3a.aws.credentials.provider",
>     "org.apache.hadoop.fs.s3a.auth.AssumedRoleCredentialProvider");
> sparkSession.sparkContext().hadoopConfiguration().set("fs.s3a.assumed.role.arn",
>     System.getenv("AWS_ROLE_ARN"));
> sparkSession.sparkContext().hadoopConfiguration().set("fs.s3a.assumed.role.credentials.provider",
>     "com.amazonaws.auth.WebIdentityTokenCredentialsProvider");
> sparkSession.sparkContext().hadoopConfiguration().set("fs.s3a.assumed.role.sts.endpoint",
>     "s3.eu-central-1.amazonaws.com");
> sparkSession.sparkContext().hadoopConfiguration().set("fs.s3a.assumed.role.sts.endpoint.region",
>     Regions.EU_CENTRAL_1.getName());
> sparkSession.sparkContext().hadoopConfiguration().set("fs.s3a.web.identity.token.file",
>     System.getenv("AWS_WEB_IDENTITY_TOKEN_FILE"));
> sparkSession.sparkContext().hadoopConfiguration().set("fs.s3a.assumed.role.session.duration",
>     "30m");
>
>
>
> Thank you!
>
>
>
> Regards,
>
> Atul
>



Does Spark support role-based authentication and access to Amazon S3? (Kubernetes cluster deployment)

2023-12-13 Thread Atul Patil
Hello Team,



Does Spark support role-based authentication and access to Amazon S3 for
Kubernetes deployment?

*Note: we have deployed our Spark application on a Kubernetes cluster.*



Below are the Hadoop-AWS dependencies we are using:


<dependency>
   <groupId>org.apache.hadoop</groupId>
   <artifactId>hadoop-aws</artifactId>
   <version>3.3.4</version>
</dependency>



We are using the following configuration when creating the Spark session,
but it is not working:

sparkSession.sparkContext().hadoopConfiguration().set("fs.s3a.aws.credentials.provider",
    "org.apache.hadoop.fs.s3a.auth.AssumedRoleCredentialProvider");
sparkSession.sparkContext().hadoopConfiguration().set("fs.s3a.assumed.role.arn",
    System.getenv("AWS_ROLE_ARN"));
sparkSession.sparkContext().hadoopConfiguration().set("fs.s3a.assumed.role.credentials.provider",
    "com.amazonaws.auth.WebIdentityTokenCredentialsProvider");
sparkSession.sparkContext().hadoopConfiguration().set("fs.s3a.assumed.role.sts.endpoint",
    "s3.eu-central-1.amazonaws.com");
sparkSession.sparkContext().hadoopConfiguration().set("fs.s3a.assumed.role.sts.endpoint.region",
    Regions.EU_CENTRAL_1.getName());
sparkSession.sparkContext().hadoopConfiguration().set("fs.s3a.web.identity.token.file",
    System.getenv("AWS_WEB_IDENTITY_TOKEN_FILE"));
sparkSession.sparkContext().hadoopConfiguration().set("fs.s3a.assumed.role.session.duration",
    "30m");



Thank you!



Regards,

Atul

