[ 
https://issues.apache.org/jira/browse/SPARK-46566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-46566:
-----------------------------------
    Labels: pull-request-available  (was: )

> Session level config was not loaded when isolation is enable.
> -------------------------------------------------------------
>
>                 Key: SPARK-46566
>                 URL: https://issues.apache.org/jira/browse/SPARK-46566
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.5.0
>            Reporter: Chenyu Zheng
>            Priority: Major
>              Labels: pull-request-available
>
> I set up a Thrift server based on v3.5.0. Executing a command throws this
> error:
> {code:java}
> 15:10:53.400 [HiveServer2-Handler-Pool: Thread-293] ERROR 
> org.apache.thrift.transport.TSaslTransport - SASL negotiation failure
> javax.security.sasl.SaslException: GSS initiate failed
> ## ignore very long stack trace ## {code}
> After some debugging and analysis, I found that the proxy user should use a
> token to access the metastore, but it actually uses Kerberos. The direct
> cause is that "hive.metastore.token.signature" is lost.
> In fact, "hive.metastore.token.signature" is set to
> "HiveServer2ImpersonationToken" in the config when HiveSessionImplwithUGI is
> constructed, and that config is stored in
> HiveSessionImplwithUGI::sessionHive and HiveSessionImplwithUGI::sessionState.
> When a session is acquired, sessionState and sessionHive should be set as
> thread-level variables, so that the statements being executed use their own
> sessionHive and sessionState and therefore the right config.
> But if isolation is enabled, a new SessionState and a new Hive are
> constructed using the specified Hive version. The config is not passed from
> HiveSessionImplwithUGI::sessionState to this SessionState, nor from
> HiveSessionImplwithUGI::sessionHive to the new Hive. So
> hive.metastore.token.signature is missing.
> How to fix?
> When `spark.sql.hive.metastore.jars` is 'builtin', we can directly obtain the
> session-level config, which is a thread-local variable, via
> SessionState.get() or Hive.get().
> When `spark.sql.hive.metastore.jars` is 'maven' or 'path',
> IsolatedClientLoader is used to reload the Hive metastore related classes,
> which means the thread-local variables in SessionState and Hive will be
> missing. So we need a new structure to store the thread-local config.
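
The "new structure" proposed above could look roughly like the following. This is only a minimal sketch, not the change made in the actual pull request: the class name `SessionConfHolder` and its methods are hypothetical, plain `Map<String, String>` stands in for HiveConf, and it assumes the holder class is loaded by a class loader shared between the Thrift server code and the isolated client, so that both sides see the same thread-local.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical holder for session-level settings (not a Spark or Hive class).
 * It keeps the per-session overrides in a thread-local that does not live
 * inside SessionState or Hive, so reloading those classes does not hide it.
 */
final class SessionConfHolder {
    private static final ThreadLocal<Map<String, String>> SESSION_CONF =
        ThreadLocal.withInitial(HashMap::new);

    private SessionConfHolder() {}

    /** Record the session's overrides on the current handler thread. */
    static void set(Map<String, String> overrides) {
        SESSION_CONF.set(new HashMap<>(overrides));
    }

    /** Drop the overrides so a pooled thread does not leak them to the next session. */
    static void clear() {
        SESSION_CONF.remove();
    }

    /** Return a copy of the base config with the current thread's overrides applied on top. */
    static Map<String, String> applyTo(Map<String, String> base) {
        Map<String, String> merged = new HashMap<>(base);
        merged.putAll(SESSION_CONF.get());
        return merged;
    }
}
```

Under these assumptions, the session would call `SessionConfHolder.set(...)` with "hive.metastore.token.signature" when it is acquired, the code that builds the isolated SessionState and Hive would call `applyTo(...)` on its base config, and `clear()` would run when the session is released.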



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
