zhenlineo commented on code in PR #40564:
URL: https://github.com/apache/spark/pull/40564#discussion_r1151233941
##########
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/ClientE2ETestSuite.scala:
##########
@@ -683,7 +770,7 @@ class ClientE2ETestSuite extends RemoteSparkSession with SQLHelper {
checkSameResult(list.asScala, session.createDataset(list))
checkSameResult(
list.asScala.map(kv => Row(kv.key, kv.value)),
- session.createDataFrame(list.asScala.toSeq))
+ session.createDataFrame(list.asScala))
Review Comment:
You need to revert this change to pass the Scala 2.13 build:
```
[error] /home/runner/work/spark/spark/connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/ClientE2ETestSuite.scala:773:36: type mismatch;
[error]  found   : scala.collection.mutable.Buffer[org.apache.spark.sql.KV]
[error]  required: Seq[?]
[error]         session.createDataFrame(list.asScala))
```
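For context (an illustrative sketch, not part of the PR diff): on Scala 2.13, `asScala` on a `java.util.List` yields a `mutable.Buffer`, and the default `Seq` alias is now `immutable.Seq`, so the explicit `.toSeq` is required:
```
import scala.jdk.CollectionConverters._  // scala.collection.JavaConverters._ on 2.12

val list: java.util.List[String] = java.util.Arrays.asList("a", "b")

// asScala wraps the Java list as a scala.collection.mutable.Buffer ...
val buf: scala.collection.mutable.Buffer[String] = list.asScala

// ... which no longer conforms to Seq (now immutable.Seq) on 2.13:
// val bad: Seq[String] = buf   // type mismatch, as in the CI error above

// The explicit conversion compiles on both 2.12 and 2.13:
val ok: Seq[String] = buf.toSeq
```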
##########
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/client/util/RemoteSparkSession.scala:
##########
@@ -58,10 +58,12 @@ object SparkConnectServerUtils {
private lazy val sparkConnect: Process = {
debug("Starting the Spark Connect Server...")
- val jar = findJar(
+ val connectJar = findJar(
"connector/connect/server",
"spark-connect-assembly",
"spark-connect").getCanonicalPath
+ val catalystTestJar =
+ findJar("sql/catalyst", "spark-catalyst", "spark-catalyst", test =
true).getCanonicalPath
Review Comment:
This would not work with an SBT-only build, right?
Does this mean all SBT users have to build with Maven once? Can you make sure the following case works:
```
build/sbt clean
build/sbt "testOnly org.apache.spark.sql.ClientE2ETestSuite"
```
Can you point me to the test jar built by SBT?
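For reference, a sketch of the SBT-only path (the project id `catalyst` and the output location are assumptions, not verified against this PR): SBT's `Test/package` task should produce the catalyst test jar without a Maven build:
```
build/sbt "catalyst/Test/package"
# assumed output location: sql/catalyst/target/scala-2.12/spark-catalyst_2.12-<version>-tests.jar
```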
##########
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/client/util/IntegrationTestUtils.scala:
##########
@@ -65,29 +65,38 @@ object IntegrationTestUtils {
* @return
* the jar
*/
- private[sql] def findJar(path: String, sbtName: String, mvnName: String): File = {
+ private[sql] def findJar(
+ path: String,
+ sbtName: String,
+ mvnName: String,
+ test: Boolean): File = {
val targetDir = new File(new File(sparkHome, path), "target")
assert(
targetDir.exists(),
s"Fail to locate the target folder: '${targetDir.getCanonicalPath}'. " +
s"SPARK_HOME='${new File(sparkHome).getCanonicalPath}'. " +
"Make sure the spark project jars has been built (e.g. using build/sbt
package)" +
"and the env variable `SPARK_HOME` is set correctly.")
+ val suffix = if (test) "-tests.jar" else ".jar"
val jars = recursiveListFiles(targetDir).filter { f =>
// SBT jar
(f.getParentFile.getName == scalaDir &&
- f.getName.startsWith(sbtName) && f.getName.endsWith(".jar")) ||
+ f.getName.startsWith(sbtName) && f.getName.endsWith(suffix)) ||
// Maven Jar
(f.getParent.endsWith("target") &&
f.getName.startsWith(mvnName) &&
- f.getName.endsWith(s"${org.apache.spark.SPARK_VERSION}.jar"))
+ f.getName.endsWith(s"${org.apache.spark.SPARK_VERSION}$suffix"))
}
// It is possible we found more than one: one built by maven, and another by SBT
assert(jars.nonEmpty, s"Failed to find the jar inside folder: ${targetDir.getCanonicalPath}")
debug("Using jar: " + jars(0).getCanonicalPath)
jars(0) // return the first jar found
}
private[sql] def findJar(path: String, sbtName: String, mvnName: String): File = {
Review Comment:
nit: Can we just set a default `test = false` and remove this extra method?
```
private[sql] def findJar(
path: String,
sbtName: String,
mvnName: String,
test: Boolean = false): File = {
```
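With the default in place, both call sites from this PR should compile unchanged (a sketch based on the two lookups in the diff above):
```
// three-arg lookup, relying on the default test = false:
val connectJar = findJar("connector/connect/server", "spark-connect-assembly", "spark-connect")
// test-jar lookup, passing the flag explicitly:
val catalystTestJar = findJar("sql/catalyst", "spark-catalyst", "spark-catalyst", test = true)
```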
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]