juliuszsompolski commented on code in PR #42069:
URL: https://github.com/apache/spark/pull/42069#discussion_r1278299618
##########
core/src/main/scala/org/apache/spark/internal/config/package.scala:
##########
@@ -2547,4 +2547,18 @@ package object config {
.version("3.5.0")
.booleanConf
.createWithDefault(false)
+
+  private[spark] val CONNECT_SCALA_UDF_STUB_CLASSES =
+    ConfigBuilder("spark.connect.scalaUdf.stubClasses")
+      .internal()
+      .doc("""
+        |Comma-separated list of binary names of classes/packages that should be stubbed during
+        |the Scala UDF serde and execution if not found on the server classpath.
+        |An empty list effectively disables stubbing for all missing classes.
+        |By default, the server stubs classes from the Scala client package.
+        |""".stripMargin)
Review Comment:
So by default we will stub classes when some Spark Connect client code is
pulled into the UDF, but not when serialization pulls in some other class
that is unrelated to the client and not actually needed by the UDF, merely
referenced by the enclosing class in a way that drags it into the closure?
In that case, would the user also get a ClassNotFoundException?
And would we then expect the user to ship that class via addArtifact, even
though it might be unclear to them why that class is relevant to the UDF at
all?
What are the disadvantages of just stubbing everything?
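
For concreteness, here is a rough client-side sketch of the scenario I have
in mind (the class name, jar path, and endpoint below are hypothetical
placeholders, not from this PR):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

// Connect to a Spark Connect server (endpoint is a placeholder).
val spark = SparkSession.builder().remote("sc://localhost").getOrCreate()

// Suppose the closure of the UDF below transitively references
// com.example.util.Helper (hypothetical): the UDF never calls it at runtime,
// it is merely referenced by the enclosing class. If that class is missing
// on the server and not covered by spark.connect.scalaUdf.stubClasses, UDF
// deserialization would fail with a ClassNotFoundException unless the user
// ships the class explicitly:
spark.addArtifact("/path/to/helper-classes.jar") // placeholder path

val plusOne = udf((x: Long) => x + 1)
spark.range(5).select(plusOne(col("id"))).show()
```

The alternative would be for the server operator to extend
`spark.connect.scalaUdf.stubClasses` to cover `com.example.util`, but that
assumes they can predict which classes client closures will drag in.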