gengliangwang commented on a change in pull request #31490:
URL: https://github.com/apache/spark/pull/31490#discussion_r659970793
##########
File path: external/avro/src/main/scala/org/apache/spark/sql/avro/AvroDeserializer.scala
##########
@@ -335,36 +336,32 @@ private[sql] class AvroDeserializer(
avroPath: Seq[String],
catalystPath: Seq[String],
applyFilters: Int => Boolean): (CatalystDataUpdater, GenericRecord) => Boolean = {
- val validFieldIndexes = ArrayBuffer.empty[Int]
- val fieldWriters = ArrayBuffer.empty[(CatalystDataUpdater, Any) => Unit]
-
- val avroSchemaHelper = new AvroUtils.AvroSchemaHelper(avroType, avroPath)
- val length = catalystType.length
- var i = 0
- while (i < length) {
- val catalystField = catalystType.fields(i)
- avroSchemaHelper.getFieldByName(catalystField.name) match {
- case Some(avroField) =>
- validFieldIndexes += avroField.pos()
+ val avroSchemaHelper =
+ new AvroUtils.AvroSchemaHelper(avroType, catalystType, avroPath, positionalFieldMatch)
+
+ avroSchemaHelper.getCatalystFieldsWithoutMatch.filterNot(_.nullable) match {
Review comment:
Shall we move this improvement to a separate PR? There we can add test cases
for incompatible schemas under both by-name and by-position matching.
Currently I find this PR a bit complicated, and there are no test cases for
this check.
Supporting positional matching in this PR is enough.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]