LuciferYang commented on PR #39850: URL: https://github.com/apache/spark/pull/39850#issuecomment-1413477716
I wrote a demo outside of the Spark project to test SPARK-42283; the demo code is at https://github.com/LuciferYang/SparkConnectDemo.

Build the Spark Connect client, then start a Connect server with the following command:

```
bin/spark-shell \
  --jars spark-connect_2.12-3.5.0-SNAPSHOT.jar \
  --driver-class-path spark-connect_2.12-3.5.0-SNAPSHOT.jar \
  --conf spark.plugins=org.apache.spark.sql.connect.SparkConnectPlugin \
  --conf spark.connect.grpc.binding.port=15102
```

Then, when I run `ConnectDemo` or `ConnectDemoSuite`, I see the following error on the server side:

```
scala> 23/02/02 18:02:46 ERROR SparkConnectService: Error during: execute
java.lang.ClassNotFoundException: personal.code.ConnectDemo$
	at scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:72)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:348)
	at org.apache.spark.util.Utils$$anon$1.resolveClass(Utils.scala:142)
	at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1988)
	at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1852)
	at java.io.ObjectInputStream.readClass(ObjectInputStream.java:1815)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1640)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2431)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2355)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2213)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1669)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:503)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:461)
	at org.apache.spark.util.Utils$.deserialize(Utils.scala:146)
	at org.apache.spark.sql.connect.planner.SparkConnectPlanner.transformScalarScalaUDF(SparkConnectPlanner.scala:855)
	at org.apache.spark.sql.connect.planner.SparkConnectPlanner.transformCommonInlineUserDefinedFunction(SparkConnectPlanner.scala:837)
	at org.apache.spark.sql.connect.planner.SparkConnectPlanner.transformExpression(SparkConnectPlanner.scala:748)
	at org.apache.spark.sql.connect.planner.SparkConnectPlanner.$anonfun$transformProject$1(SparkConnectPlanner.scala:698)
	at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
	at scala.collection.Iterator.foreach(Iterator.scala:943)
	at scala.collection.Iterator.foreach$(Iterator.scala:943)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
	at scala.collection.IterableLike.foreach(IterableLike.scala:74)
	at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
	at scala.collection.TraversableLike.map(TraversableLike.scala:286)
	at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
	at scala.collection.AbstractTraversable.map(Traversable.scala:108)
	at org.apache.spark.sql.connect.planner.SparkConnectPlanner.transformProject(SparkConnectPlanner.scala:698)
	at org.apache.spark.sql.connect.planner.SparkConnectPlanner.transformRelation(SparkConnectPlanner.scala:72)
	at org.apache.spark.sql.connect.service.SparkConnectStreamHandler.handlePlan(SparkConnectStreamHandler.scala:58)
	at org.apache.spark.sql.connect.service.SparkConnectStreamHandler.handle(SparkConnectStreamHandler.scala:49)
	at org.apache.spark.sql.connect.service.SparkConnectService.executePlan(SparkConnectService.scala:135)
	at org.apache.spark.connect.proto.SparkConnectServiceGrpc$MethodHandlers.invoke(SparkConnectServiceGrpc.java:306)
	at org.sparkproject.connect.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:182)
	at org.sparkproject.connect.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:352)
	at org.sparkproject.connect.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:866)
	at org.sparkproject.connect.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
	at org.sparkproject.connect.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)
```
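For context, my reading of the trace (an assumption on my part, not verified): the serialized UDF references the module class `personal.code.ConnectDemo$`, which only exists on the client, so the server-side `Utils.deserialize` cannot resolve it. A minimal sketch of why the enclosing object ends up inside the serialized function, assuming standard Scala 2.12 lambda semantics:

```scala
package personal.code

object ConnectDemo {
  def test(): Unit = {
    def dummyUdf(x: Int): Int = x + 5
    // In Scala 2.12, `dummyUdf _` eta-expands to a serializable lambda
    // whose implementation method is compiled into the enclosing class,
    // here the module class personal.code.ConnectDemo$. Deserializing
    // the lambda therefore requires loading ConnectDemo$, which is
    // exactly the class the server fails to find.
    val fn: Int => Int = dummyUdf _
    assert(fn(1) == 6)
  }
}
```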
Is there a problem with the way I'm testing? @vicennial @hvanhovell

The test code is similar to the `simple udf test`:

```scala
val spark = SparkSession.builder()
  .client(SparkConnectClient.builder().port(15102).build())
  .build()
try {
  def dummyUdf(x: Int): Int = x + 5
  val myUdf = udf(dummyUdf _)
  val df = spark.range(5).select(myUdf(Column("id")))
  val result = df.collectResult()
  assert(result.length == 5)
  result.toArray.zipWithIndex.foreach { case (v, idx) =>
    assert(v.getInt(0) == idx + 5)
  }
} finally {
  spark.close()
}
```
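For what it's worth, if this really is just a classpath issue, one thing that might rule it out (untested, and `connect-demo.jar` is a hypothetical name for the jar containing `personal.code.ConnectDemo`) would be restarting the server with the demo classes visible to the driver as well:

```
# connect-demo.jar (hypothetical name) is the jar built from the demo project
bin/spark-shell \
  --jars spark-connect_2.12-3.5.0-SNAPSHOT.jar,connect-demo.jar \
  --driver-class-path spark-connect_2.12-3.5.0-SNAPSHOT.jar:connect-demo.jar \
  --conf spark.plugins=org.apache.spark.sql.connect.SparkConnectPlugin \
  --conf spark.connect.grpc.binding.port=15102
```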
