ahshahid commented on code in PR #48252:
URL: https://github.com/apache/spark/pull/48252#discussion_r1844712588
##########
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/SparkSession.scala:
##########
@@ -137,8 +137,16 @@ class SparkSession private[sql] (
/** @inheritdoc */
   def createDataFrame(data: java.util.List[_], beanClass: Class[_]): DataFrame = {
-    val encoder = JavaTypeInference.encoderFor(beanClass.asInstanceOf[Class[Any]])
-    createDataset(encoder, data.iterator().asScala).toDF()
+    JavaTypeInference.setSparkClientFlag()
+    val encoderTry = Try {
+      JavaTypeInference.encoderFor(beanClass.asInstanceOf[Class[Any]])
+    }
+    JavaTypeInference.unsetSparkClientFlag()
Review Comment:
Sure. The thing is that @hvanhovell mentioned that the KryoSerializer is not supported for the Spark Connect client.
If the class for which the encoder is being created implements KryoSerializable, a Kryo-based encoder is normally preferred over a Java-serialization-based one. But if the thread creating the encoder originates from a Connect client, the Kryo-based encoder should not be created; a Java-serialization-based encoder should be created instead.
Rather than adding an extra function parameter, this information is passed via a thread local, which is unset once the encoder has been created.
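To make the intent concrete, here is a minimal Java sketch of the thread-local handshake described above. Only the names `setSparkClientFlag`/`unsetSparkClientFlag` come from the diff; the class name, the `encoderFor` signature, and the string return values are illustrative stand-ins, not Spark's actual implementation.

```java
// Hypothetical sketch of the thread-local flag pattern; not Spark code.
public class EncoderResolver {

    // Thread-local flag: set while the calling thread is creating an
    // encoder on behalf of a Spark Connect client.
    private static final ThreadLocal<Boolean> FROM_CONNECT_CLIENT =
        ThreadLocal.withInitial(() -> false);

    public static void setSparkClientFlag() {
        FROM_CONNECT_CLIENT.set(true);
    }

    public static void unsetSparkClientFlag() {
        FROM_CONNECT_CLIENT.set(false);
    }

    // Kryo is preferred for KryoSerializable classes, except when the
    // request originates from a Connect client (Kryo unsupported there).
    public static String encoderFor(boolean implementsKryoSerializable) {
        if (implementsKryoSerializable && !FROM_CONNECT_CLIENT.get()) {
            return "kryo";
        }
        return "javaSerialization";
    }

    // Caller side, mirroring the diff: set the flag, create the encoder,
    // then always clear the flag (finally plays the role Try plays in Scala).
    public static String resolveForConnectClient(boolean kryoSerializable) {
        setSparkClientFlag();
        try {
            return encoderFor(kryoSerializable);
        } finally {
            unsetSparkClientFlag();
        }
    }
}
```

The `try`/`finally` (or `Try` plus an explicit unset, as in the diff) matters: the flag must be cleared even when encoder creation throws, since the thread may be reused for requests that do not come from a Connect client.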
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]