zzzzming95 commented on code in PR #38068:
URL: https://github.com/apache/spark/pull/38068#discussion_r985168093
##########
connector/avro/src/test/scala/org/apache/spark/sql/avro/AvroSerdeSuite.scala:
##########
@@ -49,6 +49,22 @@ class AvroSerdeSuite extends SparkFunSuite {
}
}
+ test("Test byte conversion") {
+ withFieldMatchType { fieldMatch =>
+ val (top, nest) = fieldMatch match {
+ case BY_NAME => ("foo", "bar")
+ case BY_POSITION => ("NOTfoo", "NOTbar")
+ }
+ val avro = createNestedAvroSchemaWithFields(top, _.optionalInt(nest))
+ val record = new GenericRecordBuilder(avro)
+ .set(top, new GenericRecordBuilder(avro.getField(top).schema()).set(nest, -128).build())
+ .build()
+ val serializer = Serializer.create(CATALYST_STRUCT_WITH_BYTE, avro, fieldMatch)
+ val deserializer = Deserializer.create(CATALYST_STRUCT_WITH_BYTE, avro, fieldMatch)
+ assert(serializer.serialize(deserializer.deserialize(record).get) === record)
Review Comment:
Thanks for your review @amaliujia
Writing directly to the Avro file is fine, because Spark automatically
widens the byte type to int on write, so reading the data back as int is
expected behavior.
With saveAsTable, however, an additional copy of the schema is recorded in
the Hive metastore, and that extra metadata is what triggers the bug.
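As a minimal sketch of the conversion being described (not Spark's actual implementation; the helper names `byteToAvroInt` and `avroIntToByte` are hypothetical): Avro has no 1-byte integer type, so a Catalyst ByteType value is widened to an Avro int on write and must be narrowed back within byte range on read, which is what the -128 boundary value in the test exercises.

```scala
// Hedged sketch of the byte <-> Avro int round trip; helper names are
// illustrative, not from the Spark codebase.

// Widening on write: Avro stores the byte as a 4-byte int.
def byteToAvroInt(b: Byte): Int = b.toInt

// Narrowing on read: only values within the byte range are valid.
def avroIntToByte(i: Int): Byte = {
  require(i >= Byte.MinValue && i <= Byte.MaxValue, s"$i is out of byte range")
  i.toByte
}
```

The round trip `avroIntToByte(byteToAvroInt(b))` returns the original byte for every value in [-128, 127], including the -128 boundary used in the test above.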
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]