[
https://issues.apache.org/jira/browse/FLINK-39274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ruiliang updated FLINK-39274:
-----------------------------
Priority: Minor (was: Major)
> TM cannot bypass the KDC login, and the token issued by the JM is not
> actually used.
> ---------------------------------------------------------------------------------------------------------------
>
> Key: FLINK-39274
> URL: https://issues.apache.org/jira/browse/FLINK-39274
> Project: Flink
> Issue Type: Bug
> Affects Versions: 1.17.2
> Environment: flink on yarn
> Reporter: ruiliang
> Priority: Minor
>
> According to the documentation, the Kerberos configuration does not
> distinguish between JM and TM.
> flink-conf.yaml
> {code:java}
> security.kerberos.login.keytab: xx.keytab
> security.kerberos.login.principal: xx_principal{code}
> launch_container.sh
> {code:java}
> # The AM has clearly issued the token here.
> export HADOOP_TOKEN_FILE_LOCATION="/data2/hadoop/yarn/local/usercache/hiidoagent/appcache/application_1773803886076_15646/container_e268_1773803886076_15646_01_000003/container_tokens"
> ..
> # But the keytab file is still downloaded here.
> export _REMOTE_KEYTAB_PATH="hdfs://xx/user/hiidoagent/.flink/application_1773803886076_15646/hiidoagent.keytab"
> export HADOOP_USER_NAME="[email protected]"
> export _LOCAL_KEYTAB_PATH="krb5.keytab"
> export _KEYTAB_PRINCIPAL="hiidoagent"{code}
> TM log
> {code:java}
> 2026-03-18 17:49:23,394 INFO
> org.apache.flink.runtime.state.changelog.StateChangelogStorageLoader [] -
> StateChangelogStorageLoader initialized with shortcut names
> {memory,filesystem}.
> 2026-03-18 17:49:23,441 INFO
> org.apache.flink.runtime.security.token.hadoop.KerberosLoginProvider [] -
> Attempting to login to KDC using principal: hiidoagent keytab:
> /data2/hadoop/yarn/local/usercache/hiidoagent/appcache/application_1773803886076_15646/container_e268_1773803886076_15646_01_000003/krb5.keytab
> 2026-03-18 17:49:23,717 INFO org.apache.hadoop.security.UserGroupInformation
> [] - Login successful for user hiidoagent using keytab file
> /data2/hadoop/yarn/local/usercache/hiidoagent/appcache/application_1773803886076_15646/container_e268_1773803886076_15646_01_000003/krb5.keytab
> 2026-03-18 17:49:23,717 INFO
> org.apache.flink.runtime.security.token.hadoop.KerberosLoginProvider [] -
> Successfully logged into KDC
> 2026-03-18 17:49:23,719 INFO
> org.apache.flink.runtime.security.modules.HadoopModule [] - Starting
> TGT renewal task
> 2026-03-18 17:49:23,719 INFO
> org.apache.flink.runtime.security.modules.HadoopModule [] - TGT renewal
> task started and reoccur in 60000 ms
> 2026-03-18 17:49:23,719 INFO
> org.apache.flink.runtime.security.modules.HadoopModule [] - Hadoop user
> set to [email protected] (auth:KERBEROS)
> 2026-03-18 17:49:23,720 INFO
> org.apache.flink.runtime.security.modules.HadoopModule [] - Kerberos
> security is enabled.
> 2026-03-18 17:49:23,720 INFO
> org.apache.flink.runtime.security.modules.HadoopModule [] - Kerberos
> credentials are valid.
> 2026-03-18 17:49:23,726 INFO
> org.apache.flink.runtime.security.modules.JaasModule [] - Jaas file
> will be created as
> /data1/hadoop/yarn/local/usercache/hiidoagent/appcache/application_1773803886076_15646/jaas-7581660068545285667.conf.
> ...
> 2026-03-18 17:49:25,228 INFO
> org.apache.flink.runtime.externalresource.ExternalResourceUtils [] - Enabled
> external resources: []
> 2026-03-18 17:49:25,229 INFO
> org.apache.flink.runtime.security.token.DelegationTokenReceiverRepository []
> - Loading delegation token receivers
> 2026-03-18 17:49:25,232 INFO
> org.apache.flink.runtime.security.token.DelegationTokenReceiverRepository []
> - Delegation token receiver hadoopfs loaded and initialized
> 2026-03-18 17:49:25,233 INFO
> org.apache.flink.runtime.security.token.DelegationTokenReceiverRepository []
> - Delegation token receiver hbase loaded and initialized {code}
> Code:
> [https://github.com/apache/flink/blob/6fc5c97ec3a89975ee44b1b084efc8fbc25c73ee/flink-yarn/src/main/java/org/apache/flink/yarn/YarnTaskExecutorRunner.java#L132]
> The source code has no configuration option or conditional logic at this
> point: the TM always logs in to the KDC with the keytab. This behavior
> should be configurable rather than hard-coded.
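The kind of switch requested here could look roughly like the sketch below. It is plain Java with no Flink or Hadoop classes, and it is not how YarnTaskExecutorRunner works today. The option name `security.kerberos.login.skip-on-taskmanager` and the class `TmKerberosLoginDecision` are hypothetical; only `HADOOP_TOKEN_FILE_LOCATION` comes from the launch_container.sh excerpt in the description.

```java
import java.util.Map;

// Sketch only: a standalone decision helper, not part of Flink.
final class TmKerberosLoginDecision {

    // Hypothetical option name; this is NOT an existing Flink configuration key.
    static final String SKIP_OPTION = "security.kerberos.login.skip-on-taskmanager";

    // Environment variable seen in launch_container.sh in the issue description.
    static final String TOKEN_FILE_ENV = "HADOOP_TOKEN_FILE_LOCATION";

    /** Returns true if the TM should still log in to the KDC with its keytab. */
    static boolean shouldLoginWithKeytab(Map<String, String> conf, Map<String, String> env) {
        // Opt-in: the default keeps the current behavior (always log in).
        boolean skipRequested = Boolean.parseBoolean(conf.getOrDefault(SKIP_OPTION, "false"));

        // Tokens count as available when the container exports a non-empty token file path.
        String tokenFile = env.get(TOKEN_FILE_ENV);
        boolean tokensProvided = tokenFile != null && !tokenFile.isEmpty();

        // Skip the keytab login only when the user opted in AND tokens were handed over.
        return !(skipRequested && tokensProvided);
    }

    public static void main(String[] args) {
        Map<String, String> conf = Map.of(SKIP_OPTION, "true");
        // Uses the real process environment; the result depends on where this runs.
        System.out.println(shouldLoginWithKeytab(conf, System.getenv()));
    }
}
```

Making the switch opt-in keeps existing deployments unchanged. A real change would also have to make sure the TM can keep working from the tokens it receives for the whole lifetime of the job.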
> KDC
> Number of concurrent KDC logins = number of Flink apps * number of
> containers per app.
> With a large number of short-lived Flink jobs, this puts severe pressure on
> the KDC: it becomes very slow, which affects the security and stability of
> the whole cluster.
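To make the formula concrete, here is a small worked example. It reads "total number of containers" as containers per app, which is an assumption, and the figures (500 apps, 20 containers each) are made up for illustration rather than taken from the reporter's cluster.

```java
// Sketch only: arithmetic for the KDC load formula in the issue description.
final class KdcLoginEstimate {

    /** Keytab logins hitting the KDC if every container of every app logs in once. */
    static long keytabLogins(long flinkApps, long containersPerApp) {
        // multiplyExact throws ArithmeticException on overflow instead of wrapping silently.
        return Math.multiplyExact(flinkApps, containersPerApp);
    }

    public static void main(String[] args) {
        // Hypothetical workload: 500 short-lived apps, 20 containers each.
        System.out.println(keytabLogins(500, 20)); // prints 10000
    }
}
```

If only the JM logged in with the keytab and the TMs relied on the issued tokens, the same hypothetical workload would produce one login per app, i.e. 500 instead of 10000.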
>
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)