MonkeyCanCode commented on code in PR #4588:
URL: https://github.com/apache/polaris/pull/4588#discussion_r3359957594
##########
plugins/spark/v3.5/integration/build.gradle.kts:
##########
@@ -162,3 +171,69 @@ tasks.named<Test>("intTest").configure {
// For Spark integration tests
addSparkJvmOptions()
}
+
+// Bundle-jar sanity test
+testing {
+ suites {
+ register<JvmTestSuite>("sparkBundleTest") {
+ useJUnitJupiter()
+ dependencies {
+ implementation("org.apache.spark:spark-sql_${scalaVersion}:${spark35Version}") {
+ exclude("org.apache.logging.log4j", "log4j-slf4j2-impl")
+ exclude("org.apache.logging.log4j", "log4j-1.2-api")
+ exclude("org.apache.logging.log4j", "log4j-core")
+ exclude("org.slf4j", "jul-to-slf4j")
+ }
+ implementation(
+ "org.apache.iceberg:iceberg-spark-runtime-${sparkMajorVersion}_${scalaVersion}:${icebergVersion}"
+ )
+ implementation(enforcedPlatform("org.scala-lang:scala-library:${scalaLibraryVersion}"))
+ implementation(enforcedPlatform("org.scala-lang:scala-reflect:${scalaLibraryVersion}"))
+ implementation(libs.antlr4.runtime)
+ implementation(libs.javax.servlet.api)
+ runtimeOnly("org.apache.logging.log4j:log4j-core:2.26.0")
+
+ // Bundle jar as a file artifact so the shadow jar is the only source of
+ // polaris-spark/polaris-core classes
+ runtimeOnly(files(sparkBundleJarTask.flatMap { it.archiveFile }))
Review Comment:
Hi @dimas-b and @gh-yzou, I ran some tests, and it appears that neither
`spark.jars` nor `spark.driver.extraClassPath` works here. Both are read by
spark-submit when it builds the driver JVM's launch command, and since we call
`SparkSession.builder()` directly, there is no launcher to act on them. Looking
at
[SparkSubmit.getSubmitClassLoader](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L939),
we can replicate what it does by wrapping the bundle jar in a `URLClassLoader`
and setting it as the thread's context classloader before calling
`getOrCreate()`. Should we proceed with this route?
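
For reference, here is a minimal sketch of that route in plain Java. The class
and method names (`BundleClassLoaderSketch`, `withBundleOnContextClassLoader`)
are hypothetical and not part of this PR, and the Spark call appears only as a
comment. The sketch assumes the session is created and stopped inside the
callback, because the loader is closed once the callback returns.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;
import java.util.concurrent.Callable;

// Hypothetical helper, not taken from the PR.
final class BundleClassLoaderSketch {
  private BundleClassLoaderSketch() {}

  /**
   * Runs {@code action} with a URLClassLoader over {@code bundleJar} installed as
   * the current thread's context class loader, then restores the previous loader.
   */
  static <T> T withBundleOnContextClassLoader(Path bundleJar, Callable<T> action)
      throws Exception {
    Thread current = Thread.currentThread();
    ClassLoader original = current.getContextClassLoader();
    URL[] urls = {bundleJar.toUri().toURL()};

    // The previous context loader becomes the parent, so classes already on the
    // test classpath resolve first and only missing ones come from the bundle jar.
    try (URLClassLoader bundleLoader = new URLClassLoader(urls, original)) {
      current.setContextClassLoader(bundleLoader);
      // In the real test, the callback would build the session here, e.g.
      // SparkSession.builder()...getOrCreate(), run its checks, and stop it.
      return action.call();
    } finally {
      // Always restore the original loader, even if the action throws.
      current.setContextClassLoader(original);
    }
  }
}
```

Restoring the original loader in `finally` keeps the change scoped to this one
test, so other suites running in the same JVM would not see the bundle jar.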