ahshahid commented on code in PR #48252:
URL: https://github.com/apache/spark/pull/48252#discussion_r1844712588
##########
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/SparkSession.scala:
##########
@@ -137,8 +137,16 @@ class SparkSession private[sql] (
/** @inheritdoc */
   def createDataFrame(data: java.util.List[_], beanClass: Class[_]): DataFrame = {
-    val encoder = JavaTypeInference.encoderFor(beanClass.asInstanceOf[Class[Any]])
-    createDataset(encoder, data.iterator().asScala).toDF()
+    JavaTypeInference.setSparkClientFlag()
+    val encoderTry = Try {
+      JavaTypeInference.encoderFor(beanClass.asInstanceOf[Class[Any]])
+    }
+    JavaTypeInference.unsetSparkClientFlag()
Review Comment:
Sure. The thing is that @hvanhovell mentioned that the KryoSerializer is not supported for the Spark Connect client.
If the class for which the encoder is being created implements KryoSerializable, a Kryo-based encoder is normally preferred over a Java-serialization-based one. But if the thread creating the encoder originates from a Connect client, the Kryo-based encoder should not be created; a Java-serialization-based encoder should be created instead.
Rather than adding an extra function parameter, this information is passed via a thread local, which is unset once the encoder has been created.
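To make the intent concrete, here is a minimal Java sketch of the thread-local handshake described above. Only the names `setSparkClientFlag`/`unsetSparkClientFlag` come from the diff; the class name, the `encoderFor` signature, and the string return values are illustrative stand-ins, not Spark's actual implementation.

```java
// Hypothetical sketch of the thread-local flag pattern; not Spark code.
public class EncoderResolver {

    // Thread-local flag: set while the calling thread is creating an
    // encoder on behalf of a Spark Connect client.
    private static final ThreadLocal<Boolean> FROM_CONNECT_CLIENT =
        ThreadLocal.withInitial(() -> false);

    public static void setSparkClientFlag() {
        FROM_CONNECT_CLIENT.set(true);
    }

    public static void unsetSparkClientFlag() {
        FROM_CONNECT_CLIENT.set(false);
    }

    // Kryo is preferred for KryoSerializable classes, except when the
    // request originates from a Connect client (Kryo unsupported there).
    public static String encoderFor(boolean implementsKryoSerializable) {
        if (implementsKryoSerializable && !FROM_CONNECT_CLIENT.get()) {
            return "kryo";
        }
        return "javaSerialization";
    }

    // Caller side, mirroring the diff: set the flag, create the encoder,
    // then always clear the flag (finally plays the role Try plays in Scala).
    public static String resolveForConnectClient(boolean kryoSerializable) {
        setSparkClientFlag();
        try {
            return encoderFor(kryoSerializable);
        } finally {
            unsetSparkClientFlag();
        }
    }
}
```

The `try`/`finally` (or `Try` plus an explicit unset, as in the diff) matters: the flag must be cleared even when encoder creation throws, since the thread may be reused for requests that do not come from a Connect client.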
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]