Re: Does Spark support role-based authentication and access to Amazon S3? (Kubernetes cluster deployment)
yes it does, using IAM roles for service accounts. see:
https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html

i wrote a little bit about this also here: https://technotes.tresata.com/spark-on-k8s/

On Wed, Dec 13, 2023 at 7:52 AM Atul Patil wrote:

> Hello Team,
>
> Does Spark support role-based authentication and access to Amazon S3 for
> Kubernetes deployment?
>
> Note: we have deployed our Spark application in the Kubernetes cluster.
>
> Below is the Hadoop-AWS dependency we are using:
>
>     <dependency>
>       <groupId>org.apache.hadoop</groupId>
>       <artifactId>hadoop-aws</artifactId>
>       <version>3.3.4</version>
>     </dependency>
>
> We are using the following configuration when creating the Spark session,
> but it is not working:
>
>     sparkSession.sparkContext().hadoopConfiguration().set("fs.s3a.aws.credentials.provider",
>         "org.apache.hadoop.fs.s3a.auth.AssumedRoleCredentialProvider");
>     sparkSession.sparkContext().hadoopConfiguration().set("fs.s3a.assumed.role.arn",
>         System.getenv("AWS_ROLE_ARN"));
>     sparkSession.sparkContext().hadoopConfiguration().set("fs.s3a.assumed.role.credentials.provider",
>         "com.amazonaws.auth.WebIdentityTokenCredentialsProvider");
>     sparkSession.sparkContext().hadoopConfiguration().set("fs.s3a.assumed.role.sts.endpoint",
>         "s3.eu-central-1.amazonaws.com");
>     sparkSession.sparkContext().hadoopConfiguration().set("fs.s3a.assumed.role.sts.endpoint.region",
>         Regions.EU_CENTRAL_1.getName());
>     sparkSession.sparkContext().hadoopConfiguration().set("fs.s3a.web.identity.token.file",
>         System.getenv("AWS_WEB_IDENTITY_TOKEN_FILE"));
>     sparkSession.sparkContext().hadoopConfiguration().set("fs.s3a.assumed.role.session.duration",
>         "30m");
>
> Thank you!
>
> Regards,
> Atul
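In case it helps: with IAM roles for service accounts, the Kubernetes service account is annotated with the role ARN, and EKS injects the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE environment variables into the pod. The S3A connector can then pick those up directly if fs.s3a.aws.credentials.provider points straight at WebIdentityTokenCredentialsProvider, rather than going through AssumedRoleCredentialProvider. A minimal sketch of the setup follows; the service account name, namespace, and role ARN are placeholders, not values from the original thread:

```yaml
# ServiceAccount annotated for IRSA (role ARN is a placeholder)
apiVersion: v1
kind: ServiceAccount
metadata:
  name: spark-sa
  namespace: default
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/spark-s3-role
```

and the corresponding Spark configuration, e.g. in spark-defaults.conf or as --conf flags:

```
# run driver/executor pods under the annotated service account
spark.kubernetes.authenticate.driver.serviceAccountName=spark-sa
# let S3A use the injected web-identity token directly
spark.hadoop.fs.s3a.aws.credentials.provider=com.amazonaws.auth.WebIdentityTokenCredentialsProvider
```

With this arrangement the fs.s3a.assumed.role.* settings should not be needed, since the role assumption is handled by the SDK via the injected token rather than by the S3A assumed-role provider.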