[ 
https://issues.apache.org/jira/browse/HIVE-28042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812151#comment-17812151
 ] 

Vikram Ahuja edited comment on HIVE-28042 at 1/30/24 8:56 AM:
--------------------------------------------------------------

*Another instance of this issue:*

 
{code:java}
2024-01-24T02:11:21,324 ERROR [TThreadPoolServer WorkerProcess-760394]: transport.TSaslTransport (TSaslTransport.java:open) - SASL negotiation failure
javax.security.sasl.SaslException: DIGEST-MD5: IO error acquiring password
        at com.sun.security.sasl.digest.DigestMD5Server.validateClientResponse(DigestMD5Server.java)
        at com.sun.security.sasl.digest.DigestMD5Server.evaluateResponse(DigestMD5Server.java)
        at org.apache.thrift.transport.TSaslTransport$SaslParticipant.evaluateChallengeOrResponse(TSaslTransport.java)
        at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java)
        at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java)
        at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java)
        at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java)
        at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java)
        at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java)
        at java.lang.Thread.run(Thread.java)
Caused by: org.apache.hadoop.security.token.SecretManager$InvalidToken: token expired or does not exist: HIVE_DELEGATION_TOKEN owner=***, renewer=***, realUser=*****************, issueDate=1705973286139, maxDate=1706578086139, sequenceNumber=3294063, masterKeyId=7601
        at org.apache.hadoop.hive.metastore.security.TokenStoreDelegationTokenSecretManager.retrievePassword(TokenStoreDelegationTokenSecretManager.java)
        at org.apache.hadoop.hive.metastore.security.TokenStoreDelegationTokenSecretManager.retrievePassword(TokenStoreDelegationTokenSecretManager.java)
        at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$SaslDigestCallbackHandler.getPassword(HadoopThriftAuthBridge.java)
        at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$SaslDigestCallbackHandler.handle(HadoopThriftAuthBridge.java)
        at com.sun.security.sasl.digest.DigestMD5Server.validateClientResponse(DigestMD5Server.java)
        ... 15 more
{code}
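
For reference, the issueDate and maxDate fields in the InvalidToken message are epoch milliseconds. The sketch below decodes them. The class name TokenTimeline and the 24-hour renew interval are assumptions made for illustration only; the renew interval actually configured in this deployment is not stated here.

```java
import java.time.Duration;
import java.time.Instant;

public class TokenTimeline {
    // Values copied from the InvalidToken message in the log above.
    static final long ISSUE_DATE_MS = 1705973286139L;
    static final long MAX_DATE_MS = 1706578086139L;

    // Assumption: a 24-hour renew interval. The real setting is not given in this report.
    static final Duration ASSUMED_RENEW_INTERVAL = Duration.ofHours(24);

    static Instant issueDate() {
        return Instant.ofEpochMilli(ISSUE_DATE_MS);
    }

    static Instant maxDate() {
        return Instant.ofEpochMilli(MAX_DATE_MS);
    }

    // Time between issue and the last moment the token may be renewed.
    static Duration maxLifetime() {
        return Duration.between(issueDate(), maxDate());
    }

    // Renewal deadline if the token was never renewed, under the assumed interval.
    static Instant assumedRenewalDeadline() {
        return issueDate().plus(ASSUMED_RENEW_INTERVAL);
    }

    public static void main(String[] args) {
        System.out.println("issueDate (UTC)          = " + issueDate());   // 2024-01-23T01:28:06.139Z
        System.out.println("maxDate (UTC)            = " + maxDate());     // 2024-01-30T01:28:06.139Z
        System.out.println("max lifetime in days     = " + maxLifetime().toDays()); // 7
        System.out.println("assumed renewal deadline = " + assumedRenewalDeadline());
    }
}
```

With these values maxDate falls exactly 7 days after issueDate, so at the time of the error on 2024-01-24 the token was still well within its maximum lifetime, whatever time zone the log uses.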
 

 

*Analysis of the issue:*

This issue occurs only when HS2 opens a new DIGEST-MD5 based Thrift 
TSaslClientTransport for a session that has been open for a long time. Whenever 
a new transport is opened, it has to authenticate, which triggers a 
retrievePassword call for the token held in the tokenStore. The tokenStore has 
ZooKeeper, DB and memory based implementations, but the issue occurs regardless 
of which implementation is used.

HS2 reuses the same metaStoreClient object, embedded in Hive.java, across all 
connections, but in some cases we have observed it creating a new 
metaStoreClient with a fresh connection (TSaslClientTransport). The two use 
cases I found that lead to this issue are:
 # MSCK repair
 # RetryingMetaStoreClient in case of any HMS issues (applicable to any SQL 
query that interacts with the HMS)
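
The failure mode described above can be illustrated with a small simulation. This is only a sketch: InMemoryTokenStore, openNewTransport and simulate are hypothetical names, not Hive's TokenStore or Thrift APIs, and the removal step merely stands in for whatever deletes the token on the HMS side.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class TokenLookupSketch {
    /** Hypothetical in-memory store keyed by token sequence number; not a Hive class. */
    static final class InMemoryTokenStore {
        private final Map<Integer, byte[]> passwords = new ConcurrentHashMap<>();

        void add(int sequenceNumber, byte[] password) {
            passwords.put(sequenceNumber, password);
        }

        void remove(int sequenceNumber) {
            passwords.remove(sequenceNumber);
        }

        // Mirrors the shape of the failure in the log: a missing token cannot yield a password.
        byte[] retrievePassword(int sequenceNumber) {
            byte[] password = passwords.get(sequenceNumber);
            if (password == null) {
                throw new IllegalStateException(
                    "token expired or does not exist: sequenceNumber=" + sequenceNumber);
            }
            return password;
        }
    }

    /** Simulates the server side of opening a new transport: authentication needs the password. */
    static boolean openNewTransport(InMemoryTokenStore store, int sequenceNumber) {
        try {
            store.retrievePassword(sequenceNumber);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    /** Returns {result before removal, result after removal}. */
    static boolean[] simulate() {
        InMemoryTokenStore store = new InMemoryTokenStore();
        int sequenceNumber = 3294063; // value taken from the log
        store.add(sequenceNumber, new byte[] {1, 2, 3});

        boolean firstOpen = openNewTransport(store, sequenceNumber);
        store.remove(sequenceNumber); // stands in for the token being deleted on the server
        boolean reopen = openNewTransport(store, sequenceNumber);
        return new boolean[] {firstOpen, reopen};
    }

    public static void main(String[] args) {
        boolean[] results = simulate();
        System.out.println("first open succeeded: " + results[0]);
        System.out.println("reopen succeeded:     " + results[1]);
    }
}
```

In this simulation only the second call fails, which matches the observation that the error appears when a long-lived session has to create a fresh connection rather than on the original one.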

 

*Root cause of this issue:*

A background thread called ExpiredTokenRemover runs in HMS (class: 
TokenStoreDelegationTokenSecretManager.java). This thread removes a token from 
the tokenStore once its renewal time has passed, as well as after its expiry 
time. It should remove the token only after the expiry time, because the token 
can still be renewed until then.
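
The difference between the two removal conditions can be sketched as follows. This is not the code in TokenStoreDelegationTokenSecretManager: StoredToken and both method names are hypothetical, and the renew date used in main assumes a 24-hour renew interval purely for illustration.

```java
public class ExpiredTokenRemovalSketch {
    /** Hypothetical stand-in for a stored token; not a Hive or Hadoop class. */
    static final class StoredToken {
        final long renewDateMs; // time by which the token must next be renewed
        final long maxDateMs;   // time after which the token can no longer be renewed

        StoredToken(long renewDateMs, long maxDateMs) {
            this.renewDateMs = renewDateMs;
            this.maxDateMs = maxDateMs;
        }
    }

    /** Removal as described in the root cause: either deadline passing triggers deletion. */
    static boolean removedAsDescribed(StoredToken token, long nowMs) {
        return nowMs > token.maxDateMs || nowMs > token.renewDateMs;
    }

    /** Removal as this comment proposes: only the expiry (max) date triggers deletion. */
    static boolean removedAsProposed(StoredToken token, long nowMs) {
        return nowMs > token.maxDateMs;
    }

    public static void main(String[] args) {
        long issueDateMs = 1705973286139L; // from the log
        long maxDateMs = 1706578086139L;   // from the log
        long renewDateMs = issueDateMs + 24L * 60 * 60 * 1000; // assumed 24-hour renew interval
        StoredToken token = new StoredToken(renewDateMs, maxDateMs);

        long oneHourPastRenewal = renewDateMs + 60L * 60 * 1000;
        System.out.println("as described: " + removedAsDescribed(token, oneHourPastRenewal));
        System.out.println("as proposed:  " + removedAsProposed(token, oneHourPastRenewal));
    }
}
```

One hour past the assumed renewal deadline, the first condition already deletes the token while the second keeps it until maxDate.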

 

I will raise a fix that changes the code so that the token is no longer deleted 
as soon as the renewal time has passed.


> DigestMD5 token expired or does not exist error while opening a new 
> connection to HMS
> -------------------------------------------------------------------------------------
>
>                 Key: HIVE-28042
>                 URL: https://issues.apache.org/jira/browse/HIVE-28042
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Vikram Ahuja
>            Assignee: Vikram Ahuja
>            Priority: Major
>
> Hello,
> In our deployment we are seeing the following exception in the HMS logs when 
> an HMS connection is opened from HS2 for a session that has been open for a 
> long time, leading to query failures:
> {code:java}
> 2024-01-24T02:11:21,324 ERROR [TThreadPoolServer WorkerProcess-760394]: transport.TSaslTransport (TSaslTransport.java:open) - SASL negotiation failure
> javax.security.sasl.SaslException: DIGEST-MD5: IO error acquiring password
>         at com.sun.security.sasl.digest.DigestMD5Server.validateClientResponse(DigestMD5Server.java)
>         at com.sun.security.sasl.digest.DigestMD5Server.evaluateResponse(DigestMD5Server.java)
>         at org.apache.thrift.transport.TSaslTransport$SaslParticipant.evaluateChallengeOrResponse(TSaslTransport.java)
>         at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java)
>         at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java)
>         at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java)
>         at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java)
>         at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java)
>         at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java)
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java)
>         at java.lang.Thread.run(Thread.java)
> Caused by: org.apache.hadoop.security.token.SecretManager$InvalidToken: token expired or does not exist: HIVE_DELEGATION_TOKEN owner=***, renewer=***, realUser=*****************, issueDate=1705973286139, maxDate=1706578086139, sequenceNumber=3294063, masterKeyId=7601
>         at org.apache.hadoop.hive.metastore.security.TokenStoreDelegationTokenSecretManager.retrievePassword(TokenStoreDelegationTokenSecretManager.java)
>         at org.apache.hadoop.hive.metastore.security.TokenStoreDelegationTokenSecretManager.retrievePassword(TokenStoreDelegationTokenSecretManager.java)
>         at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$SaslDigestCallbackHandler.getPassword(HadoopThriftAuthBridge.java)
>         at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$SaslDigestCallbackHandler.handle(HadoopThriftAuthBridge.java)
>         at com.sun.security.sasl.digest.DigestMD5Server.validateClientResponse(DigestMD5Server.java)
>         ... 15 more
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
