[GitHub] [spark] gengliangwang commented on a change in pull request #31490: [SPARK-34365][AVRO] Add support for positional Catalyst-to-Avro schema matching

GitBox Mon, 28 Jun 2021 10:14:27 -0700


gengliangwang commented on a change in pull request #31490:
URL: https://github.com/apache/spark/pull/31490#discussion_r659970793




##########
File path: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala
##########
@@ -335,36 +336,32 @@ private[sql] class AvroDeserializer(
       avroPath: Seq[String],
       catalystPath: Seq[String],
       applyFilters: Int => Boolean): (CatalystDataUpdater, GenericRecord) => Boolean = {
-    val validFieldIndexes = ArrayBuffer.empty[Int]
-    val fieldWriters = ArrayBuffer.empty[(CatalystDataUpdater, Any) => Unit]
-
-    val avroSchemaHelper = new AvroUtils.AvroSchemaHelper(avroType, avroPath)
-    val length = catalystType.length
-    var i = 0
-    while (i < length) {
-      val catalystField = catalystType.fields(i)
-      avroSchemaHelper.getFieldByName(catalystField.name) match {
-        case Some(avroField) =>
-          validFieldIndexes += avroField.pos()
 
+    val avroSchemaHelper =
+      new AvroUtils.AvroSchemaHelper(avroType, catalystType, avroPath, positionalFieldMatch)
+
+    avroSchemaHelper.getCatalystFieldsWithoutMatch.filterNot(_.nullable) match {

Review comment:
   Shall we make this improvement in a separate PR? There we can add test cases for incompatible schemas under both by-name and by-position matching.
   Currently I find this PR a bit complicated, and there are no test cases for this change.
   Supporting positional matching alone is enough for this PR.



