sririshindra commented on code in PR #49814:
URL: https://github.com/apache/spark/pull/49814#discussion_r1955487024
##########
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala:
##########
@@ -171,6 +172,11 @@ private[hive] class HiveClientImpl(
private def newState(): SessionState = {
val hiveConf = newHiveConf(sparkConf, hadoopConf, extraConfig,
Some(initClassLoader))
val state = new SessionState(hiveConf)
+ // When SessionState is initialized, the caller context is overridden by hive
+ // so we need to reset it back to the DRIVER
Review Comment:
@pan3793, @cnauroth I was finally able to test this properly on the upstream
version, in a Docker-based cluster running the current branch. It looks like the
change in HiveClientImpl is not needed in Spark 4. I checked whether the caller
context is set during SessionState initialization in HiveClientImpl, and it
looks like it is not. So once the CallerContext is set inside the SparkContext
class, it is not overridden by anything else in the driver process.
```
2025-02-14 02:26:23,249 INFO FSNamesystem.audit: allowed=true ugi=root (auth:SIMPLE) ip=/192.168.97.4 cmd=getfileinfo src=/warehouse/sample dst=null perm=null proto=rpc callerContext=SPARK_DRIVER_application_1739496632907_0005
2025-02-14 02:26:23,265 INFO FSNamesystem.audit: allowed=true ugi=root (auth:SIMPLE) ip=/192.168.97.4 cmd=listStatus src=/warehouse/sample dst=null perm=null proto=rpc callerContext=SPARK_DRIVER_application_1739496632907_0005
2025-02-14 02:26:25,519 INFO FSNamesystem.audit: allowed=true ugi=root (auth:SIMPLE) ip=/192.168.97.5 cmd=open src=/warehouse/sample/part-00000-dd473344-76b1-4179-91ae-d15a8da4a888-c000 dst=null perm=null proto=rpc callerContext=SPARK_TASK_application_1739496632907_0005_JId_0_SId_0_0_TId_0_0
2025-02-14 02:26:26,345 INFO FSNamesystem.audit: allowed=true ugi=root (auth:SIMPLE) ip=/192.168.97.5 cmd=open src=/warehouse/sample/part-00000-dd473344-76b1-4179-91ae-d15a8da4a888-c000 dst=null perm=null proto=rpc callerContext=SPARK_TASK_application_1739496632907_0005_JId_1_SId_1_0_TId_1_0
```
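For anyone who wants to repeat this check on a larger audit log, here is a minimal sketch of how the `cmd` and `callerContext` fields could be pulled out of each entry. It is not part of the PR: `AuditCallerContext`, `parse`, and `isSparkContext` are hypothetical names I made up for illustration, and it assumes each audit entry sits on a single line with whitespace-separated `key=value` fields, as in the excerpt above.

```scala
// Hypothetical helper, not part of the PR: extracts the cmd and callerContext
// fields from HDFS audit log entries (one entry per line is assumed).
object AuditCallerContext {
  // Each field is assumed to be a whitespace-delimited key=value pair.
  private val Cmd = """\bcmd=(\S+)""".r
  private val Ctx = """\bcallerContext=(\S+)""".r

  /** Returns (cmd, callerContext), or None if either field is missing. */
  def parse(entry: String): Option[(String, String)] =
    for {
      cmd <- Cmd.findFirstMatchIn(entry).map(_.group(1))
      ctx <- Ctx.findFirstMatchIn(entry).map(_.group(1))
    } yield (cmd, ctx)

  /** True if the context carries one of the Spark prefixes seen in the log above. */
  def isSparkContext(ctx: String): Boolean =
    ctx.startsWith("SPARK_DRIVER_") || ctx.startsWith("SPARK_TASK_")

  def main(args: Array[String]): Unit = {
    // Read audit entries from stdin and print any whose caller context
    // is present but was not set by Spark.
    scala.io.Source.stdin.getLines()
      .flatMap(parse)
      .filterNot { case (_, ctx) => isSparkContext(ctx) }
      .foreach { case (cmd, ctx) => println(s"cmd=$cmd callerContext=$ctx") }
  }
}
```

If the caller context were being overridden on the driver, entries with a non-Spark `callerContext` would show up in the output. Entries with no `callerContext` field at all are skipped by this sketch, so they would need a separate check.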
I wasn't able to test with Iceberg, though, because Iceberg doesn't support
Spark 4 yet. Once an Iceberg release with Spark 4 support is available, I will
retest and make any needed changes in a separate PR. For now, I have removed
the change that was in HiveClientImpl.scala.