snazy commented on code in PR #4588:
URL: https://github.com/apache/polaris/pull/4588#discussion_r3386443211


##########
plugins/spark/v3.5/integration/build.gradle.kts:
##########
@@ -134,6 +136,13 @@ dependencies {
   testImplementation(libs.antlr4.runtime)
 }
 
+evaluationDependsOn(":polaris-spark-${sparkMajorVersion}_${scalaVersion}")
+
+val sparkBundleJarTask =
+  project(":polaris-spark-${sparkMajorVersion}_${scalaVersion}")
+    .tasks
+    .named<ShadowJar>("createPolarisSparkJar")

Review Comment:
   What you want here is a Gradle configuration and outgoing artifacts.
   
   Referencing Gradle tasks in other projects is a major anti-pattern (see also 
below), because it breaks [project 
isolation](https://docs.gradle.org/current/userguide/isolated_projects.html).
   Referencing other project tasks is also fragile due to configuration 
ordering, leading to flaky builds.
   It can also slow down the build.
   
   Tasks within a Gradle project are a project-internal concern, like an 
implementation detail, not like a public contract.
   
   Solution: [Reference outgoing 
artifacts](https://docs.gradle.org/current/userguide/multi_project_builds.html#declaring_dependencies_between_subprojects)
 from another project



##########
plugins/spark/v3.5/integration/build.gradle.kts:
##########
@@ -162,3 +171,76 @@ tasks.named<Test>("intTest").configure {
   // For Spark integration tests
   addSparkJvmOptions()
 }
+
+// Bundle-jar sanity test
+testing {
+  suites {
+    register<JvmTestSuite>("sparkBundleTest") {
+      useJUnitJupiter()
+      dependencies {
+        
implementation("org.apache.spark:spark-sql_${scalaVersion}:${spark35Version}") {
+          exclude("org.apache.logging.log4j", "log4j-slf4j2-impl")
+          exclude("org.apache.logging.log4j", "log4j-1.2-api")
+          exclude("org.apache.logging.log4j", "log4j-core")
+          exclude("org.slf4j", "jul-to-slf4j")
+        }
+        implementation(
+          
"org.apache.iceberg:iceberg-spark-runtime-${sparkMajorVersion}_${scalaVersion}:${icebergVersion}"
+        )
+        
implementation(enforcedPlatform("org.scala-lang:scala-library:${scalaLibraryVersion}"))
+        
implementation(enforcedPlatform("org.scala-lang:scala-reflect:${scalaLibraryVersion}"))
+        implementation(libs.antlr4.runtime)
+        implementation(libs.javax.servlet.api)
+        runtimeOnly("org.apache.logging.log4j:log4j-core:2.26.0")
+
+        implementation(project(":polaris-api-management-model"))
+        implementation(testFixtures(project(":polaris-runtime-service")))
+
+        implementation(platform(libs.quarkus.bom))
+        implementation("io.quarkus:quarkus-junit")
+        implementation(libs.assertj.core)
+      }
+      targets.all {
+        testTask.configure {
+          systemProperty("build.output.directory", 
layout.buildDirectory.asFile.get())
+          dependsOn(tasks.named("quarkusBuild"))
+          dependsOn(sparkBundleJarTask)
+          systemProperty(
+            "polaris.spark.bundle.jar",
+            sparkBundleJarTask.flatMap { it.archiveFile 
}.get().asFile.absolutePath,
+          )
+          systemProperty("polaris.version", project.version.toString())
+          systemProperty("polaris.scala.version", scalaVersion)
+          dependsOn(":publishToMavenLocal")
+          dependsOn(":polaris-core:publishMavenPublicationToMavenLocal")
+          dependsOn(
+            
":polaris-spark-${sparkMajorVersion}_${scalaVersion}:publishMavenPublicationToMavenLocal"
+          )

Review Comment:
   Those cross-project task dependencies can break Gradle's project isolation.
   These tasks also change a user's local Maven repository, which we should 
avoid, because it is a single and global resource.
   
   Using a separate Gradle configuration (`FileCollection`) with polaris-core 
and the spark-jar and pass the resolved path to the test should be enough.
   
   Publication tasks are also generally not cachable and never up-to-date, so 
this breaks Gradle's build cache functionality here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to