bschell commented on a change in pull request #1760:
URL: https://github.com/apache/hudi/pull/1760#discussion_r444566173
##########
File path: hudi-spark/src/main/scala/org/apache/hudi/AvroConversionUtils.scala
##########
@@ -78,4 +79,21 @@ object AvroConversionUtils {
def convertAvroSchemaToStructType(avroSchema: Schema): StructType = {
SchemaConverters.toSqlType(avroSchema).dataType.asInstanceOf[StructType]
}
+
+  private def deserializeRow(encoder: ExpressionEncoder[Row], internalRow: InternalRow): Row = {
+    // First attempt to use spark2 API for deserialization, otherwise attempt with spark3 API
+    try {
+      val spark2method = encoder.getClass.getMethods.filter(method => method.getName.equals("fromRow")).last
+      spark2method.invoke(encoder, internalRow).asInstanceOf[Row]
+    } catch {
+      case e: NoSuchElementException => spark3Deserialize(encoder, internalRow)
Review comment:
I think checking the Spark version would be a good idea to avoid the
failed reflective call every time; let me look into it. I don't believe the
reflection overhead is significant overall, especially if we can determine
the Spark version upfront. We do something similar here:
https://github.com/apache/hudi/pull/1638/commits/9f1284374e72717222c51ea681dc2a5ceb696a50
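As a rough sketch of the "check the version upfront" idea: the branch could be decided once at class-load time instead of catching the failure on every call. This assumes `org.apache.spark.SPARK_VERSION` is available and that the Spark 3 path goes through `ExpressionEncoder#createDeserializer`; neither detail is settled in this PR, so treat this as an illustration only.

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder

object RowDeserializer {
  // Decided once at class-load time, so no failed reflective call per row.
  // Assumes org.apache.spark.SPARK_VERSION is visible on the classpath.
  private val isSpark2: Boolean = org.apache.spark.SPARK_VERSION.startsWith("2.")

  def deserializeRow(encoder: ExpressionEncoder[Row], internalRow: InternalRow): Row = {
    if (isSpark2) {
      // Spark 2.x: ExpressionEncoder#fromRow(InternalRow), looked up reflectively
      // so this source compiles against either Spark version.
      val fromRow = encoder.getClass.getMethods.find(_.getName == "fromRow").get
      fromRow.invoke(encoder, internalRow).asInstanceOf[Row]
    } else {
      // Spark 3.x (assumed shape): encoder.createDeserializer().apply(internalRow),
      // also invoked reflectively for the same reason.
      val createDeserializer =
        encoder.getClass.getMethods.find(_.getName == "createDeserializer").get
      val deserializer = createDeserializer.invoke(encoder)
      val apply = deserializer.getClass.getMethods
        .find(m => m.getName == "apply" && m.getParameterCount == 1).get
      apply.invoke(deserializer, internalRow).asInstanceOf[Row]
    }
  }
}
```

The per-row cost then reduces to the reflective `invoke` itself; the method lookups could additionally be cached in `lazy val`s if that still shows up in profiles.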
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]