Re: [PR] [SPARK-51095][CORE][SQL] Include caller context for hdfs audit logs for calls from driver [spark]

via GitHub Fri, 07 Feb 2025 21:35:26 -0800


pan3793 commented on code in PR #49814:
URL: https://github.com/apache/spark/pull/49814#discussion_r1947485021



##########
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala:
##########
@@ -171,6 +172,11 @@ private[hive] class HiveClientImpl(
   private def newState(): SessionState = {
     val hiveConf = newHiveConf(sparkConf, hadoopConf, extraConfig, Some(initClassLoader))
     val state = new SessionState(hiveConf)
+    // When SessionState is initialized, the caller context is overridden by hive
+    // so we need to reset it back to the DRIVER

Review Comment:
   @sririshindra Hive usage in Spark is somewhat complex, and the 
isolated classloader does not always take effect. For example, you can set 
the following configurations to make the Spark session catalog use a different 
version of the HMS client:
   
   ```
   spark.sql.hive.metastore.version
   spark.sql.hive.metastore.jars
   ```
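   
   As an illustration only, a `spark-defaults.conf` fragment along these lines 
would select a non-built-in HMS client. The version and the `maven` jar source 
are assumed example values, not settings taken from this PR:
   
   ```
   # Assumed example values; pick the version that matches your metastore.
   spark.sql.hive.metastore.version  3.1.3
   spark.sql.hive.metastore.jars     maven
   ```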
   
   However, Iceberg's HMS client does not respect these settings; it always 
uses the compiled Hive classes.
   
   `spark-sql` and the Spark Thrift Server also behave differently ...
   
   > I don't have a hadoop cluster with upstream versions of spark/hive.
   
   This project might be helpful: 
https://github.com/awesome-kyuubi/hadoop-testing



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

