[ https://issues.apache.org/jira/browse/HADOOP-19750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
László Bodor updated HADOOP-19750:
----------------------------------
Description:
Given the ZooKeeper-based delegation token manager that we have:
[ZKDelegationTokenSecretManager|https://github.com/apache/hadoop/blob/7b8b3e4e2324f42b37d981ca060bec79b1e9fe3c/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/ZKDelegationTokenSecretManager.java#L62-L68]
The proposal is to create a Kubernetes-based secret manager for delegation
tokens.
The motivation is to let workloads that originate in the Hadoop ecosystem run
on Kubernetes clusters without YARN or HDFS, and without requiring ZooKeeper to
store tokens, master keys, or related metadata. This hybrid approach suits
workloads that are moving to Kubernetes but are not yet ready to eliminate
Kerberos entirely, which would be a much larger undertaking than building this
secret manager.
Key points of the migration:
Master Key → Kubernetes Secret
The master key would be stored as a Kubernetes Secret instead of in ZooKeeper.
{code}
apiVersion: v1
kind: Secret
metadata:
  name: delegation-master-keys
type: Opaque
data:
  currentKeyId: OTE= # "91" base64-encoded
  keys.json: ewogICI5MSI6ICJiYXNlNjQtc2VjcmV0LWtleSIsCiAgIjkwIjogIm9sZC1rZXkiCn0=
  # keys.json = { "91": "<base64-secret>", "90": "<base64-secret>" }
{code}
Secrets are bound to a specific namespace, which by design provides good
separation in a larger cluster.
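As a rough illustration of how a consumer might read this Secret, the following Python sketch decodes the two `data` fields from the example. The function name `decode_master_keys` is hypothetical and not part of Hadoop. The sketch assumes only that Kubernetes base64-encodes every value under `data` and that `keys.json` maps key IDs to key material, as the comment in the manifest suggests. The example values decode to placeholder strings, not real key material.

```python
import base64
import json

# The `data` section of the example Secret, copied from the manifest.
SECRET_DATA = {
    "currentKeyId": "OTE=",
    "keys.json": "ewogICI5MSI6ICJiYXNlNjQtc2VjcmV0LWtleSIsCiAgIjkwIjogIm9sZC1rZXkiCn0=",
}


def decode_master_keys(data):
    """Return (current_key_id, {key_id: key_material}) from a Secret's data map."""
    # Every value under `data` is base64-encoded by Kubernetes.
    current_key_id = base64.b64decode(data["currentKeyId"]).decode("utf-8")
    keys = json.loads(base64.b64decode(data["keys.json"]).decode("utf-8"))
    # Sanity check (an assumption of this sketch): the current key must be present.
    if current_key_id not in keys:
        raise ValueError(f"current key {current_key_id!r} missing from keys.json")
    return current_key_id, keys


if __name__ == "__main__":
    current, keys = decode_master_keys(SECRET_DATA)
    print("current key id:", current)
    print("known key ids:", sorted(keys))
```

Keeping older keys next to the current one in the same Secret would let tokens signed with a previous master key still be verified after a key roll, which is presumably why the example lists both "91" and "90".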
Delegation Token → Kubernetes Custom Resource
Delegation tokens would be represented as Kubernetes custom resources, defined
by a DelegationToken CustomResourceDefinition (CRD). Through the Kubernetes API
server, these tokens would be stored in the cluster’s backing store (etcd),
replacing ZooKeeper as the persistence layer.
An example DelegationToken instance:
{code}
apiVersion: security.example.com/v1
kind: DelegationToken
metadata:
  name: dt-9f2c1a87e4b94c5a # this is your "seqNum"/ID
spec:
  owner: alice
  renewer: yarn
  issueTime: "2025-12-08T09:00:00Z"
  maxExpiryTime: "2025-12-15T09:00:00Z"
  currentExpiryTime: "2025-12-09T09:00:00Z"
  masterKeyId: "91"
status:
  phase: Active
{code}
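To make the relationship between the three timestamps concrete, here is a minimal Python sketch of a renewal step over the `spec` fields above. It is not the proposed implementation. The helper names (`parse_ts`, `format_ts`, `renew`) are hypothetical, and the one-day renew interval is an assumption inferred from the gap between `issueTime` and `currentExpiryTime` in the example. The rule it applies is that a renewal extends `currentExpiryTime` but never past `maxExpiryTime`.

```python
from datetime import datetime, timedelta, timezone

TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_ts(value):
    """Parse the UTC 'Z' timestamps used in the example resource."""
    return datetime.strptime(value, TS_FORMAT).replace(tzinfo=timezone.utc)


def format_ts(moment):
    """Format a timezone-aware datetime back into the same UTC form."""
    return moment.astimezone(timezone.utc).strftime(TS_FORMAT)


def renew(spec, now, renew_interval=timedelta(days=1)):
    """Return a copy of spec with currentExpiryTime extended, capped at maxExpiryTime."""
    if now >= parse_ts(spec["currentExpiryTime"]):
        raise ValueError("token has already expired and cannot be renewed")
    new_expiry = min(now + renew_interval, parse_ts(spec["maxExpiryTime"]))
    return {**spec, "currentExpiryTime": format_ts(new_expiry)}


if __name__ == "__main__":
    spec = {
        "owner": "alice",
        "renewer": "yarn",
        "issueTime": "2025-12-08T09:00:00Z",
        "maxExpiryTime": "2025-12-15T09:00:00Z",
        "currentExpiryTime": "2025-12-09T09:00:00Z",
        "masterKeyId": "91",
    }
    renewed = renew(spec, now=parse_ts("2025-12-08T21:00:00Z"))
    print(renewed["currentExpiryTime"])
```

In a real manager the updated `spec` would be written back through the Kubernetes API. That step is left out here because the CRD schema is not yet defined.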
Schema of the DelegationToken CRD:
{code}
...
later
...
{code}
Custom Resource instances can be bound to a specific namespace, which by design
provides good separation in a larger cluster.
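The namespace separation shows up directly in the REST paths that the Kubernetes API server exposes for namespaced custom resources. The sketch below builds such a path. The helper `custom_resource_path`, the plural name `delegationtokens`, and the namespace `team-a` are assumptions made for illustration, since the CRD itself is not yet specified. Only the general path layout for namespaced custom resources is standard Kubernetes behaviour.

```python
def custom_resource_path(group, version, namespace, plural, name=None):
    """Build the API server path for a namespaced custom resource.

    Without `name` the path addresses the whole collection in that namespace;
    with `name` it addresses a single object.
    """
    path = f"/apis/{group}/{version}/namespaces/{namespace}/{plural}"
    return f"{path}/{name}" if name else path


if __name__ == "__main__":
    # Hypothetical plural and namespace; the token name comes from the example.
    print(custom_resource_path(
        "security.example.com", "v1", "team-a",
        "delegationtokens", "dt-9f2c1a87e4b94c5a"))
```

Because the namespace is part of every path, access to one namespace's tokens can be granted through namespaced RBAC rules without exposing tokens that belong to other namespaces.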
Users of this delegation token manager should be prepared to accept the
performance characteristics of the Kubernetes control plane (API server and
etcd), which may differ from those of ZooKeeper storage.
was:
Given the ZooKeeper-based delegation token manager that we have:
[ZKDelegationTokenSecretManager|https://github.com/apache/hadoop/blob/7b8b3e4e2324f42b37d981ca060bec79b1e9fe3c/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/ZKDelegationTokenSecretManager.java#L62-L68]
The proposal is to create a Kubernetes-based secret manager for delegation
tokens.
The motivation is to enable running Hadoop workloads on Kubernetes clusters
(without Hadoop/YARN components) without requiring ZooKeeper to store tokens,
master keys, or related metadata. This hybrid approach is valid for workloads
that are moving to Kubernetes but are not yet ready to eliminate Kerberos
entirely — which would be a much larger undertaking than building this secret
manager.
Key points of the migration:
Master Key → Kubernetes Secret
The master key would be stored as a Kubernetes Secret instead of in ZooKeeper.
{code}
apiVersion: v1
kind: Secret
metadata:
  name: delegation-master-keys
type: Opaque
data:
  currentKeyId: OTE= # "91" base64-encoded
  keys.json: ewogICI5MSI6ICJiYXNlNjQtc2VjcmV0LWtleSIsCiAgIjkwIjogIm9sZC1rZXkiCn0=
  # keys.json = { "91": "<base64-secret>", "90": "<base64-secret>" }
{code}
Secrets are bound to a specific namespace, which by design provides good
separation in a larger cluster.
Delegation Token → Kubernetes Custom Resource (CRD)
Delegation tokens would be represented as Kubernetes custom resources.
Through the Kubernetes API server, these tokens would be stored in the
cluster’s backing store (etcd), replacing ZooKeeper as the persistence layer.
An instance of a Delegation token:
{code}
apiVersion: security.example.com/v1
kind: DelegationToken
metadata:
  name: dt-9f2c1a87e4b94c5a # this is your "seqNum"/ID
spec:
  owner: alice
  renewer: yarn
  issueTime: "2025-12-08T09:00:00Z"
  maxExpiryTime: "2025-12-15T09:00:00Z"
  currentExpiryTime: "2025-12-09T09:00:00Z"
  masterKeyId: "91"
status:
  phase: Active
{code}
Schema of the DelegationToken CRD:
{code}
...
later
...
{code}
Custom Resource instances can be bound to a specific namespace, which by design
provides good separation in a larger cluster.
Users of this delegation token manager should accept the possible performance
characteristics of using K8s control plane vs. ZooKeeper storage.
> Delegation token secret manager for Kubernetes
> ----------------------------------------------
>
> Key: HADOOP-19750
> URL: https://issues.apache.org/jira/browse/HADOOP-19750
> Project: Hadoop Common
> Issue Type: New Feature
> Reporter: László Bodor
> Priority: Major