On 1 Oct 2023, at 11:13, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:



It seems that Workload Identity is not available on AWS. Workload Identity replaces the need to use metadata concealment on exposed storage such as S3 and GCS; the sensitive metadata that metadata concealment protects is also protected by Workload Identity.

Both Google Kubernetes Engine (GKE) and Azure Kubernetes Service (AKS) support Workload Identity. Quoting Google Cloud: "Workload Identity is the recommended way for your workloads running on Google Kubernetes Engine (GKE) to access Google Cloud services in a secure and manageable way."
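
A minimal sketch of what this looks like from the Spark side, assuming the GKE cluster has Workload Identity enabled and a Kubernetes service account (placeholder name spark-sa) has already been bound to a Google service account with access to the bucket; no key files appear anywhere in the configuration:

from pyspark.sql import SparkSession

# Assumption: the Kubernetes service account "spark-sa" (placeholder) is bound to a
# Google service account via Workload Identity and can read gs://example-bucket.
spark = (
    SparkSession.builder
    .appName("workload-identity-demo")
    # Run the driver (and hence the executor) pods under the bound service account.
    .config("spark.kubernetes.authenticate.driver.serviceAccountName", "spark-sa")
    .getOrCreate()
)

# The GCS connector obtains credentials from the pod's identity via the metadata
# server, so no JSON key file or HMAC keys are configured anywhere.
df = spark.read.parquet("gs://example-bucket/data/")  # example-bucket is a placeholder
df.show(5)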


HTH


Mich Talebzadeh,
Distinguished Technologist, Solutions Architect & Engineer
London
United Kingdom


   view my LinkedIn profile


 https://en.everybodywiki.com/Mich_Talebzadeh

 

Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.

 



On Sun, 1 Oct 2023 at 06:36, Jayabindu Singh <jayabi...@gmail.com> wrote:
Hi Jon,

Using IAM as suggested by Jörn is the best approach.
We recently moved our Spark workload from HDP to Spark on K8s and are using IAM.
It will save you from secret management headaches and also gives you a lot more flexibility in access control, including the option to access multiple S3 buckets from the same pod; see the sketch below.
We have implemented this across Azure, Google and AWS. Azure does require some extra work to get it working.
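
To make the multi-bucket point concrete, here is a rough PySpark sketch using Hadoop S3A per-bucket overrides, assuming the pod already carries an IAM role (e.g. via IRSA on EKS); the bucket names and role ARN are placeholders, not our actual setup:

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("multi-bucket-demo")
    # Default for all buckets: use the pod's web-identity (IRSA) credentials.
    .config("spark.hadoop.fs.s3a.aws.credentials.provider",
            "com.amazonaws.auth.WebIdentityTokenCredentialsProvider")
    # Per-bucket override: reach "finance-data" by assuming a narrower role,
    # using the pod identity as the base credentials for the STS call.
    .config("spark.hadoop.fs.s3a.bucket.finance-data.aws.credentials.provider",
            "org.apache.hadoop.fs.s3a.auth.AssumedRoleCredentialProvider")
    .config("spark.hadoop.fs.s3a.bucket.finance-data.assumed.role.arn",
            "arn:aws:iam::111122223333:role/finance-readonly")
    .config("spark.hadoop.fs.s3a.bucket.finance-data.assumed.role.credentials.provider",
            "com.amazonaws.auth.WebIdentityTokenCredentialsProvider")
    .getOrCreate()
)

raw = spark.read.json("s3a://raw-events/2023/09/30/")    # pod's default IRSA role
fin = spark.read.parquet("s3a://finance-data/ledger/")   # assumed, narrower role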

On Sat, Sep 30, 2023 at 12:05 PM Jörn Franke <jornfra...@gmail.com> wrote:
Don’t use static IAM (S3) credentials. It is an outdated, insecure method - even AWS recommends against using it for anything (cf. e.g. https://docs.aws.amazon.com/cli/latest/userguide/cli-authentication-user.html).
It is almost a guarantee that your data will be stolen and your account manipulated.

If you need to use Kubernetes (which has its own very problematic security issues), then assign AWS IAM roles with minimal permissions to the pods (for EKS this means using OIDC, cf. https://docs.aws.amazon.com/eks/latest/userguide/service_IAM_role.html).
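
A minimal PySpark sketch of that IRSA pattern, assuming a Kubernetes service account (placeholder name spark-irsa) has been annotated with an IAM role ARN through the cluster's OIDC provider; no access or secret keys appear in the configuration:

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("irsa-demo")
    # Driver and executor pods run under the IRSA-annotated service account.
    .config("spark.kubernetes.authenticate.driver.serviceAccountName", "spark-irsa")
    # S3A exchanges the pod's projected web-identity token for temporary credentials
    # scoped to whatever minimal permissions the role grants.
    .config("spark.hadoop.fs.s3a.aws.credentials.provider",
            "com.amazonaws.auth.WebIdentityTokenCredentialsProvider")
    .getOrCreate()
)

spark.read.text("s3a://example-bucket/logs/").show(3)  # example-bucket is a placeholder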

On 30 Sep 2023, at 03:41, Jon Rodríguez Aranguren <jon.r.arangu...@gmail.com> wrote:


Dear Spark Community Members,

I trust this message finds you all in good health and spirits.

I'm reaching out to the collective expertise of this esteemed community with a query regarding Spark on Kubernetes. As a newcomer, I have always admired the depth and breadth of knowledge shared within this forum, and it is my hope that some of you might have insights on a specific challenge I'm facing.

I am currently trying to configure multiple Kubernetes secrets, notably multiple S3 keys, at the SparkConf level for a Spark application. My objective is to understand the best approach or methods to ensure that these secrets can be smoothly accessed by the Spark application.
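
For concreteness, here is a rough sketch of the kind of configuration I have in mind, using Spark's built-in Kubernetes secret options; all secret, key and path names below are placeholders:

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("multi-secret-demo")
    # Mount a whole secret into the driver and executor pods as files.
    .config("spark.kubernetes.driver.secrets.s3-secret-a", "/etc/secrets/a")
    .config("spark.kubernetes.executor.secrets.s3-secret-a", "/etc/secrets/a")
    # Expose individual keys of a second secret as environment variables.
    .config("spark.kubernetes.driver.secretKeyRef.AWS_ACCESS_KEY_ID",
            "s3-secret-b:access-key")
    .config("spark.kubernetes.driver.secretKeyRef.AWS_SECRET_ACCESS_KEY",
            "s3-secret-b:secret-key")
    .config("spark.kubernetes.executor.secretKeyRef.AWS_ACCESS_KEY_ID",
            "s3-secret-b:access-key")
    .config("spark.kubernetes.executor.secretKeyRef.AWS_SECRET_ACCESS_KEY",
            "s3-secret-b:secret-key")
    .getOrCreate()
)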

If any of you have previously encountered this scenario or possess relevant insights on the matter, your guidance would be highly beneficial.

Thank you for your time and consideration. I'm eager to learn from the experiences and knowledge present within this community.

Warm regards,
Jon
