Hi all, I'd like to propose a SPIP for adding delegated credential propagation to Spark on K8s, generalizing the existing delegation-token model to OIDC/OAuth-based environments.

SPIP Doc: https://docs.google.com/document/d/1usJKncCPMiyFUg7aIdpZ0HQsklXIHow_sU_6dfFMjN0/edit?usp=sharing

## Problem

Spark on K8s has no mechanism to take a user/session identity available at the driver, exchange it for short-lived storage credentials, and propagate those credentials to dynamically created executors. All jobs access cloud storage as the pod's service account, making per-user authorization and audit logging impossible without workarounds. This is the same gap that Kerberos and delegation tokens close on YARN + HDFS, but for K8s + cloud storage.

## Proposal

Introduce a CredentialProvider SPI and propagation mechanism that:

1. Reads an OIDC identity token from a configured file path on the driver
2. Exchanges it for short-lived service credentials (via STS or a compatible service)
3. Distributes only those credentials (not the raw token) to executors via a new RPC
4. Automatically refreshes credentials for long-running jobs

The raw identity token never leaves the driver. Executors receive only short-lived delegated service credentials, mirroring how Kerberos propagates delegation tokens rather than the TGT.

The design mirrors the existing HadoopDelegationTokenManager / UpdateDelegationTokens pattern and coexists with Kerberos. It is gated behind spark.security.credentials.enabled, which defaults to false.

A reference provider for S3/STS-compatible storage (AWS, MinIO, Ceph) is included. Azure/GCP providers and Spark Connect integration are out of scope, but the SPI is designed to accommodate them without changes.

Both workload-level SA tokens and per-user identity tokens are supported. With per-user tokens, STS trust policies can enforce access control based on the user's identity, enabling true per-user authorization that is impossible with IRSA/Pod Identity alone.

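
To make the four steps in the proposal concrete, here is a minimal Python sketch of the driver-side flow. It is not the proposed API: the names `ServiceCredentials`, `CredentialProvider.obtain`, `FakeStsProvider`, `DriverCredentialManager`, and the `renewal_ratio` value are all hypothetical and chosen only for illustration, the STS call is stubbed out, and the "RPC" is a plain callback. The actual SPI is defined in the design document.

```python
import time
from dataclasses import dataclass
from pathlib import Path
from typing import Callable


@dataclass(frozen=True)
class ServiceCredentials:
    """Short-lived delegated credentials: the only thing executors receive."""
    access_key: str
    secret_key: str
    session_token: str
    expires_at: float  # epoch seconds


class CredentialProvider:
    """Hypothetical stand-in for the proposed SPI (not the real interface)."""

    def obtain(self, identity_token: str) -> ServiceCredentials:
        raise NotImplementedError


class FakeStsProvider(CredentialProvider):
    """Toy provider that pretends to call an STS-compatible endpoint."""

    def __init__(self, lifetime_s: float = 3600.0,
                 clock: Callable[[], float] = time.time):
        self.lifetime_s = lifetime_s
        self.clock = clock

    def obtain(self, identity_token: str) -> ServiceCredentials:
        if not identity_token:
            raise ValueError("empty identity token")
        # A real provider would exchange the token here; this stub returns
        # fixed demo values and never copies the token into the result.
        return ServiceCredentials("AK-demo", "SK-demo", "session-demo",
                                  self.clock() + self.lifetime_s)


class DriverCredentialManager:
    """Driver-side loop body: read token, exchange, distribute, schedule."""

    def __init__(self, token_path: str, provider: CredentialProvider,
                 send_to_executors: Callable[[ServiceCredentials], None],
                 renewal_ratio: float = 0.75,  # illustrative value only
                 clock: Callable[[], float] = time.time):
        self.token_path = Path(token_path)
        self.provider = provider
        self.send_to_executors = send_to_executors
        self.renewal_ratio = renewal_ratio
        self.clock = clock

    def refresh_once(self) -> float:
        """Run steps 1-3 once and return the delay until the next refresh."""
        # Step 1: re-read the file each time, since the token may be rotated.
        token = self.token_path.read_text().strip()
        now = self.clock()
        # Step 2: exchange the identity token for short-lived credentials.
        creds = self.provider.obtain(token)
        # Step 3: ship only the delegated credentials, never the raw token.
        self.send_to_executors(creds)
        # Step 4: refresh after a fraction of the remaining lifetime.
        return max(0.0, (creds.expires_at - now) * self.renewal_ratio)
```

A caller would invoke `refresh_once()` on a timer, sleeping for the returned delay between calls. Refreshing at a fraction of the credential lifetime, rather than at expiry, leaves headroom for a slow or failed exchange to be retried before executors lose access.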
## Key design decisions

- Core SPI is cloud-agnostic (no AWS/Azure/GCP SDK in core)
- Reference provider lives in connector/credential-aws
- Raw identity token stays on the driver; executors get only delegated service credentials
- Works with any STS-compatible endpoint (not just AWS)
- @DeveloperApi annotation allows SPI evolution
- Platform-agnostic core, with K8s as the primary target

Full details, architecture diagram, and sequence diagram are in the design document linked above. I welcome any feedback on the approach.

Thanks,
Kousuke
