LuciferYang commented on code in PR #44153:
URL: https://github.com/apache/spark/pull/44153#discussion_r1413534980


##########
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala:
##########
@@ -3270,7 +3270,10 @@ class HiveDDLSuite
       val jarName = "TestUDTF.jar"
       val jar = 
spark.asInstanceOf[TestHiveSparkSession].getHiveFile(jarName).toURI.toString
       spark.sparkContext.allAddedJars.keys.find(_.contains(jarName))
-        .foreach(spark.sparkContext.addedJars("default").remove)
+        .foreach(k => spark.sparkContext.addedJars.get("default") match {

Review Comment:
   @HyukjinKwon 
   
   I noticed that this logic was added in https://github.com/apache/spark/pull/41495. If I'm not mistaken, it is a defensive deletion that avoids leaving `TestUDTF.jar` registered under `default` during testing. However, I found that running only this test:
   
   ```
   build/sbt "hive/testOnly org.apache.spark.sql.hive.execution.HiveDDLSuite" 
-Phive
   ```
   
   throws an error:
   
   ```
   [info] - SPARK-34261: Avoid side effect if create exists temporary function 
*** FAILED *** (4 milliseconds)
   [info]   java.util.NoSuchElementException: key not found: default
   [info]   at scala.collection.MapOps.default(Map.scala:274)
   [info]   at scala.collection.MapOps.default$(Map.scala:273)
   [info]   at scala.collection.AbstractMap.default(Map.scala:405)
   [info]   at scala.collection.MapOps.apply(Map.scala:176)
   [info]   at scala.collection.MapOps.apply$(Map.scala:175)
   [info]   at scala.collection.AbstractMap.apply(Map.scala:405)
   [info]   at 
org.apache.spark.sql.hive.execution.HiveDDLSuite.$anonfun$new$445(HiveDDLSuite.scala:3275)
   [info]   at 
org.apache.spark.sql.test.SQLTestUtilsBase.withUserDefinedFunction(SQLTestUtils.scala:256)
   [info]   at 
org.apache.spark.sql.test.SQLTestUtilsBase.withUserDefinedFunction$(SQLTestUtils.scala:254)
   [info]   at 
org.apache.spark.sql.execution.command.DDLSuite.withUserDefinedFunction(DDLSuite.scala:326)
   [info]   at 
org.apache.spark.sql.hive.execution.HiveDDLSuite.$anonfun$new$444(HiveDDLSuite.scala:3267)
   ```
   
   I manually printed the contents of `spark.sparkContext.addedJars` and it's 
an empty Map.
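   For context, the `NoSuchElementException` in the stack trace is easy to reproduce with plain Scala collections (this is just an illustration, not Spark code): `Map.apply` throws on a missing key, while `Map.get` returns an `Option`.

   ```scala
   // Stand-in for spark.sparkContext.addedJars when nothing has been added yet;
   // the nested-map shape here is an assumption for illustration only.
   val addedJars: Map[String, Map[String, Long]] = Map.empty

   // Safe: Map.get returns None when the key is absent.
   val viaGet: Option[Map[String, Long]] = addedJars.get("default")

   // Unsafe: Map.apply raises NoSuchElementException("key not found: default").
   val applyThrows: Boolean =
     try { addedJars("default"); false }
     catch { case _: NoSuchElementException => true }
   ```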
   
   But when I execute:
   
   ```
   build/sbt "hive/testOnly org.apache.spark.sql.hive.execution.SQLQuerySuite 
org.apache.spark.sql.hive.execution.HiveDDLSuite" -Phive
   ```
   
   all tests pass, and the content of `spark.sparkContext.addedJars` is:
   
   ```
   Map(default -> Map(spark://localhost:54875/jars/SPARK-21101-1.0.jar -> 
1701676986594, spark://localhost:54875/jars/hive-contrib-2.3.9.jar -> 
1701676944590, spark://localhost:54875/jars/TestUDTF.jar -> 1701676921340))
   ```
   
   In the GitHub Actions tests, `SQLQuerySuite` does indeed execute before `HiveDDLSuite`, so the failure does not reproduce there.
   
   So in this PR, I added a pattern match so that `remove` is only executed when `addedJars.get("default")` is not `None`.
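   A self-contained sketch of the guarded removal (the quoted hunk elides the match body, so the exact arms are my assumption; a plain `TrieMap` stands in for Spark's internal state):

   ```scala
   import scala.collection.concurrent.TrieMap

   // Stand-in for SparkContext.addedJars; the nested-TrieMap shape
   // (session key -> (jar key -> timestamp)) is an assumption for illustration.
   val addedJars = TrieMap.empty[String, TrieMap[String, Long]]

   // Only remove when the "default" entry actually exists; a bare
   // addedJars("default") would throw NoSuchElementException on an empty map.
   def removeDefaultJar(jarKey: String): Unit =
     addedJars.get("default") match {
       case Some(jars) => jars.remove(jarKey)
       case None       => // nothing registered yet: no-op
     }
   ```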
   
   In this PR I didn't directly clear `spark.sparkContext.allAddedJars` after `SQLQuerySuite`, because I'm not sure what side effects that would have.
   
   
   Do you think this is ok? Or do you have any better suggestions?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

