[ 
https://issues.apache.org/jira/browse/HDFS-17849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18035308#comment-18035308
 ] 

ASF GitHub Bot commented on HDFS-17849:
---------------------------------------

Hexiaoqiao commented on code in PR #8054:
URL: https://github.com/apache/hadoop/pull/8054#discussion_r2490088948


##########
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenSecretManager.java:
##########
@@ -858,7 +858,11 @@ private void removeExpiredToken() throws IOException {
         long renewDate = entry.getValue().getRenewDate();
         if (renewDate < now) {
           expiredTokens.add(entry.getKey());
-          removeTokenForOwnerStats(entry.getKey());
+          try {
+            removeTokenForOwnerStats(entry.getKey());

Review Comment:
   Thanks @arunreddyav for your report and contribution. I am a little 
confused: could the token leak when an exception is thrown here? I think the 
smoother approach is to configure `hadoop.security.auth_to_local` when changing 
the realm. What do you think? Thanks again.
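
To make the question concrete, here is a minimal, self-contained sketch of the guarded cleanup loop from the diff. It is not Hadoop code: the class `ExpiredTokenSweepSketch`, the methods `resolveOwner` and `sweep`, the realm names, and the use of the principal string as the map key are all hypothetical simplifications. `resolveOwner` stands in for the owner lookup that throws `IllegalArgumentException` in the stack trace.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified model of the expired-token sweep; not Hadoop code. */
public class ExpiredTokenSweepSketch {

    /** Stand-in for owner resolution: rejects principals outside the known realm. */
    static String resolveOwner(String principal, String knownRealm) {
        if (!principal.endsWith("@" + knownRealm)) {
            throw new IllegalArgumentException("Illegal principal name " + principal);
        }
        // Short name = text before the first '/' (or before '@' if there is no '/').
        int cut = principal.indexOf('/');
        if (cut < 0) {
            cut = principal.indexOf('@');
        }
        return principal.substring(0, cut);
    }

    /** Removes tokens whose renew date has passed and returns the removed keys. */
    static List<String> sweep(Map<String, Long> renewDates,
                              Map<String, Integer> ownerStats,
                              long now, String knownRealm) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Long> entry : renewDates.entrySet()) {
            if (entry.getValue() < now) {
                // As in the diff, the token is queued for removal before stats are touched.
                expired.add(entry.getKey());
                try {
                    String owner = resolveOwner(entry.getKey(), knownRealm);
                    ownerStats.merge(owner, -1, Integer::sum);
                } catch (IllegalArgumentException e) {
                    // Assumption: log and continue so one bad principal cannot stop the sweep.
                    System.err.println("Skipping owner stats for " + entry.getKey()
                        + ": " + e.getMessage());
                }
            }
        }
        renewDates.keySet().removeAll(expired);
        return expired;
    }

    public static void main(String[] args) {
        Map<String, Long> renewDates = new LinkedHashMap<>();
        renewDates.put("spark/host1@OLD.REALM", 100L); // expired, unresolvable owner
        renewDates.put("hive/host2@NEW.REALM", 100L);  // expired, resolvable owner
        renewDates.put("yarn/host3@NEW.REALM", 900L);  // still valid
        Map<String, Integer> ownerStats = new HashMap<>();
        ownerStats.put("hive", 1);
        ownerStats.put("yarn", 1);

        System.out.println("removed:   " + sweep(renewDates, ownerStats, 500L, "NEW.REALM"));
        System.out.println("remaining: " + renewDates.keySet());
        System.out.println("stats:     " + ownerStats);
    }
}
```

In this sketch the token with the unresolvable principal is still removed, because it is added to `expired` before the guarded call; only its owner-stats update is skipped. Whether skipping that update leaves a stale per-owner count in the real secret manager is the part that would need checking against the actual code.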





> Namenode crashed while cleaning up Expired Delegation tokens
> ------------------------------------------------------------
>
>                 Key: HDFS-17849
>                 URL: https://issues.apache.org/jira/browse/HDFS-17849
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 3.4.1
>            Reporter: Kanaka Kumar Avvaru
>            Priority: Major
>
> We are seeing the NameNode crash during token cleanup after updating the 
> Kerberos auth rules to pick up a new realm configuration in place of the 
> existing one.
>  
> Here is the stack trace
> {noformat}
> 2025-08-11 02:28:06,448 ERROR delegation.AbstractDelegationTokenSecretManager 
> (AbstractDelegationTokenSecretManager.java:run(856)) - ExpiredTokenRemover 
> thread received unexpected exception
> java.lang.IllegalArgumentException: Illegal principal name 
> spark/<hostname>@<old_realm>: 
> org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: 
> No rules applied to spark/<hostname>@<old_realm>
>         at org.apache.hadoop.security.User.<init>(User.java:51)
>         at org.apache.hadoop.security.User.<init>(User.java:43)
>         at 
> org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1458)
>         at 
> org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1441)
>         at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenIdentifier.getUser(AbstractDelegationTokenIdentifier.java:80)
>         at 
> org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenIdentifier.getUser(DelegationTokenIdentifier.java:81)
>         at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.getTokenRealOwner(AbstractDelegationTokenSecretManager.java:914)
>         at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.removeTokenForOwnerStats(AbstractDelegationTokenSecretManager.java:936)
>         at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.removeExpiredToken(AbstractDelegationTokenSecretManager.java:773)
>         at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.access$400(AbstractDelegationTokenSecretManager.java:71)
>         at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager$ExpiredTokenRemover.run(AbstractDelegationTokenSecretManager.java:846)
>         at java.lang.Thread.run(Thread.java:750)
> Caused by: 
> org.apache.hadoop.security.authentication.util.KerberosName$NoMatchingRule: 
> No rules applied to spark/<hostname>@<old_realm>
>         at 
> org.apache.hadoop.security.authentication.util.KerberosName.getShortName(KerberosName.java:429)
>         at org.apache.hadoop.security.User.<init>(User.java:48)
>         ... 11 more
> 2025-08-11 02:28:06,450 INFO  provider.AuditProviderFactory 
> (AuditProviderFactory.java:run(537)) - ==> JVMShutdownHook.run()
> {noformat}
>  
> HDFS-17138 attempted to avoid the crash during token logging, but the 
> getTokenRealOwner call that updates the token owner stats still fails in 3.4.1.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
