[ 
https://issues.apache.org/jira/browse/HADOOP-19750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HADOOP-19750:
----------------------------------
    Description: 
Given the existing ZooKeeper-based delegation token manager: 
[ZKDelegationTokenSecretManager|https://github.com/apache/hadoop/blob/7b8b3e4e2324f42b37d981ca060bec79b1e9fe3c/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/ZKDelegationTokenSecretManager.java#L62-L68]

The proposal is to create a Kubernetes-based secret manager for delegation 
tokens.
The motivation is to let workloads that originate from the Hadoop ecosystem 
run on Kubernetes clusters (without actually using YARN or HDFS) without 
requiring ZooKeeper to store tokens, master keys, or related metadata. This 
hybrid approach suits workloads that are moving to Kubernetes but are not yet 
ready to eliminate Kerberos entirely, which would be a much larger undertaking 
than building this secret manager.

Key points of the migration:

Master Key → Kubernetes Secret
The master keys (the current key plus previous ones, as in the example below) 
would be stored in a Kubernetes Secret instead of in ZooKeeper.

{code}
apiVersion: v1
kind: Secret
metadata:
  name: delegation-master-keys
type: Opaque
data:
  currentKeyId: OTE=        # "91" base64-encoded
  keys.json: ewogICI5MSI6ICJiYXNlNjQtc2VjcmV0LWtleSIsCiAgIjkwIjogIm9sZC1rZXkiCn0=
  # keys.json = { "91": "<base64-secret>", "90": "<base64-secret>" }
{code}

Secrets are bound to a specific namespace, which by design provides good 
separation in a larger cluster.
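
As a rough illustration of the Secret layout above, the following Java sketch 
decodes its two data fields. It assumes the base64 values have already been 
fetched from the Kubernetes API (no client library is used here). The class and 
method names are hypothetical and not part of Hadoop, and the regex is a 
deliberately naive stand-in for a JSON parser that only handles the flat 
string-to-string map shown in the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: decodes the fields of the delegation-master-keys Secret. */
public class MasterKeySecretSketch {

  // Matches flat "key": "value" pairs only; this is not a general JSON parser.
  private static final Pattern ENTRY =
      Pattern.compile("\"([^\"]+)\"\\s*:\\s*\"([^\"]*)\"");

  /** Decodes one base64-encoded Secret data value into a UTF-8 string. */
  public static String decodeField(String base64Value) {
    return new String(Base64.getDecoder().decode(base64Value), StandardCharsets.UTF_8);
  }

  /** Parses the decoded keys.json into a map of key ID to stored key material. */
  public static Map<String, String> parseKeys(String base64KeysJson) {
    Map<String, String> keys = new LinkedHashMap<>();
    Matcher m = ENTRY.matcher(decodeField(base64KeysJson));
    while (m.find()) {
      keys.put(m.group(1), m.group(2));
    }
    return keys;
  }

  public static void main(String[] args) {
    // Values copied from the example Secret in this description.
    String currentKeyId = decodeField("OTE=");
    Map<String, String> keys = parseKeys(
        "ewogICI5MSI6ICJiYXNlNjQtc2VjcmV0LWtleSIsCiAgIjkwIjogIm9sZC1rZXkiCn0=");
    System.out.println("current key id: " + currentKeyId);
    System.out.println("known key ids: " + keys.keySet());
  }
}
```

With the example values, currentKeyId decodes to "91" and keys.json yields 
entries for key IDs "91" and "90". A real implementation would more likely use 
a proper JSON library and a Kubernetes client.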


Delegation Token → Kubernetes Custom Resource (defined by a CRD)
Delegation tokens would be represented as Kubernetes custom resources.
Through the Kubernetes API server, these tokens would be stored in the 
cluster’s backing store (etcd), replacing ZooKeeper as the persistence layer.

An example DelegationToken instance:
{code}
apiVersion: security.example.com/v1
kind: DelegationToken
metadata:
  name: dt-9f2c1a87e4b94c5a  # corresponds to the token's "seqNum"/ID
spec:
  owner: alice
  renewer: yarn
  issueTime: "2025-12-08T09:00:00Z"
  maxExpiryTime: "2025-12-15T09:00:00Z"
  currentExpiryTime: "2025-12-09T09:00:00Z"
  masterKeyId: "91"
status:
  phase: Active
{code}
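
The three timestamps in spec suggest a renewal rule in which a renewal pushes 
currentExpiryTime forward but never past maxExpiryTime. A minimal Java sketch 
of that rule follows. The class, its field names, the checks it performs, and 
the 24-hour renew interval are assumptions derived from the example values. 
They are not an existing Hadoop or Kubernetes API, and persisting the updated 
resource through the API server is left out.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical in-memory view of the DelegationToken custom resource spec. */
public class DelegationTokenSketch {
  private final String owner;
  private final String renewer;
  private final Instant maxExpiryTime;
  private Instant currentExpiryTime;

  public DelegationTokenSketch(String owner, String renewer,
                               Instant currentExpiryTime, Instant maxExpiryTime) {
    this.owner = owner;
    this.renewer = renewer;
    this.currentExpiryTime = currentExpiryTime;
    this.maxExpiryTime = maxExpiryTime;
  }

  /** True while the token has not yet passed its current expiry time. */
  public boolean isActive(Instant now) {
    return now.isBefore(currentExpiryTime);
  }

  /**
   * Renews the token on behalf of the designated renewer.
   * The new expiry is min(now + renewInterval, maxExpiryTime).
   */
  public Instant renew(String caller, Instant now, Duration renewInterval) {
    if (!renewer.equals(caller)) {
      throw new SecurityException(caller + " is not the renewer of this token");
    }
    if (!isActive(now)) {
      throw new IllegalStateException("token expired at " + currentExpiryTime);
    }
    Instant candidate = now.plus(renewInterval);
    currentExpiryTime = candidate.isAfter(maxExpiryTime) ? maxExpiryTime : candidate;
    return currentExpiryTime;
  }

  public static void main(String[] args) {
    // Timestamps copied from the example resource in this description.
    DelegationTokenSketch token = new DelegationTokenSketch("alice", "yarn",
        Instant.parse("2025-12-09T09:00:00Z"), Instant.parse("2025-12-15T09:00:00Z"));
    Instant renewed = token.renew("yarn",
        Instant.parse("2025-12-09T08:00:00Z"), Duration.ofHours(24));
    System.out.println("renewed until " + renewed);
  }
}
```

In this sketch a renewal requested shortly before maxExpiryTime is capped at 
that value, so the token can never outlive its maximum lifetime.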

Schema of the DelegationToken CRD:
{code}
...
later
...
{code}

Custom Resource instances can be bound to a specific namespace, which by design 
provides good separation in a larger cluster.


Users of this delegation token manager should accept that storing tokens 
through the Kubernetes control plane may perform differently from ZooKeeper 
storage.

  was:
Given the zookeeper based delegation token manager that we have: 
[ZKDelegationTokenSecretManager|https://github.com/apache/hadoop/blob/7b8b3e4e2324f42b37d981ca060bec79b1e9fe3c/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/ZKDelegationTokenSecretManager.java#L62-L68]

The proposal is to create a Kubernetes-based secret manager for delegation 
tokens.
The motivation is to enable running Hadoop workloads on Kubernetes clusters 
(without Hadoop/YARN components) without requiring ZooKeeper to store tokens, 
master keys, or related metadata. This hybrid approach is valid for workloads 
that are moving to Kubernetes but are not yet ready to eliminate Kerberos 
entirely — which would be a much larger undertaking than building this secret 
manager.

Key points of the migration:

Master Key → Kubernetes Secret
The master key would be stored as a Kubernetes Secret instead of in ZooKeeper.

{code}
apiVersion: v1
kind: Secret
metadata:
  name: delegation-master-keys
type: Opaque
data:
  currentKeyId: OTE=        # "91" base64-encoded
  keys.json: ewogICI5MSI6ICJiYXNlNjQtc2VjcmV0LWtleSIsCiAgIjkwIjogIm9sZC1rZXkiCn0=
  # keys.json = { "91": "<base64-secret>", "90": "<base64-secret>" }
{code}

Secrets are bound to a specific namespace, which by design provides good 
separation in a larger cluster.


Delegation Token → Kubernetes Custom Resource (CRD)
Delegation tokens would be represented as Kubernetes custom resources.
Through the Kubernetes API server, these tokens would be stored in the 
cluster’s backing store (etcd), replacing ZooKeeper as the persistence layer.

An instance of a Delegation token:
{code}
apiVersion: security.example.com/v1
kind: DelegationToken
metadata:
  name: dt-9f2c1a87e4b94c5a  # this is your "seqNum"/ID
spec:
  owner: alice
  renewer: yarn
  issueTime: "2025-12-08T09:00:00Z"
  maxExpiryTime: "2025-12-15T09:00:00Z"
  currentExpiryTime: "2025-12-09T09:00:00Z"
  masterKeyId: "91"
status:
  phase: Active
{code}

Schema of the DelegationToken CRD:
{code}
...
later
...
{code}

Custom Resource instances can be bound to a specific namespace, which by design 
provides good separation in a larger cluster.


Users of this delegation token manager should accept the possible performance 
characteristics of using K8s control plane vs. Zookeeper storage.


> Delegation token secret manager for Kubernetes
> ----------------------------------------------
>
>                 Key: HADOOP-19750
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19750
>             Project: Hadoop Common
>          Issue Type: New Feature
>            Reporter: László Bodor
>            Priority: Major