pan3793 commented on code in PR #49814:
URL: https://github.com/apache/spark/pull/49814#discussion_r1947554775
##########
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala:
##########
@@ -171,6 +172,11 @@ private[hive] class HiveClientImpl(
private def newState(): SessionState = {
val hiveConf = newHiveConf(sparkConf, hadoopConf, extraConfig,
Some(initClassLoader))
val state = new SessionState(hiveConf)
+ // When SessionState is initialized, the caller context is overridden by hive
+ // so we need to reset it back to the DRIVER
Review Comment:
> If we find there are broader problems with Spark's Hive clients leaking
state, then an alternative solution might be to isolate
org.apache.hadoop.ipc.CallerContext via the Hive client IsolatedClientLoader.
Currently it treats all the Hadoop classes as shared.
@cnauroth Since SPARK-42539, Spark no longer uses IsolatedClientLoader by
default. The current Hive 2.3.10 is compatible with HMS 1.2 to 4.0, and in
practice IsolatedClientLoader is rarely used because of its setup complexity
and some known issues (see SPARK-42539).
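
For reference, the save-and-restore idea behind the reset can be sketched roughly as below. This is only an illustration under stated assumptions: `FakeCallerContext`, `initSessionLikeHive`, and `withContextRestored` are hypothetical stand-ins, not Hadoop's `org.apache.hadoop.ipc.CallerContext` API and not the code in this PR. The PR resets the context to the driver value, whereas this sketch restores whatever value was set before the initialization ran.

```scala
// Hypothetical stand-in for a thread-scoped caller context holder.
// It only models the override/restore pattern; it is NOT Hadoop's API.
object FakeCallerContext {
  private val current = new InheritableThreadLocal[String]
  def get: Option[String] = Option(current.get())
  def set(ctx: String): Unit = current.set(ctx)
}

object CallerContextResetSketch {
  // Hypothetical stand-in for an initializer that overwrites the
  // caller context as a side effect (as the diff comment describes).
  def initSessionLikeHive(): Unit = FakeCallerContext.set("hive-session")

  // Run `init`, then put back the context that was set before it ran.
  // If no context was set beforehand, the overridden value is left as-is.
  def withContextRestored[T](init: => T): T = {
    val saved = FakeCallerContext.get
    try init
    finally saved.foreach(FakeCallerContext.set)
  }

  def main(args: Array[String]): Unit = {
    FakeCallerContext.set("SPARK_DRIVER")
    withContextRestored(initSessionLikeHive())
    // The driver context survives the initialization.
    assert(FakeCallerContext.get.contains("SPARK_DRIVER"))
  }
}
```

With a shared (non-isolated) client, a reset of this kind at the call site is the simpler option, since isolating the class would depend on IsolatedClientLoader being in use.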
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]