jonvex commented on code in PR #11802:
URL: https://github.com/apache/hudi/pull/11802#discussion_r1724177428
##########
hudi-client/hudi-spark-client/src/main/scala/org/apache/hudi/AvroConversionUtils.scala:
##########
@@ -230,12 +230,23 @@ object AvroConversionUtils {
  private def resolveUnion(schema: Schema, dataType: DataType): (Schema, Boolean) = {
    val innerFields = schema.getTypes.asScala
    val containsNullSchema = innerFields.foldLeft(false)((nullFieldEncountered, schema) => nullFieldEncountered | schema.getType == Schema.Type.NULL)
-    (if (containsNullSchema) {
-      Schema.createUnion((List(Schema.create(Schema.Type.NULL)) ++ innerFields.filter(innerSchema => !(innerSchema.getType == Schema.Type.NULL))
-        .map(innerSchema => getAvroSchemaWithDefaults(innerSchema, dataType))).asJava)
-    } else {
-      Schema.createUnion(schema.getTypes.asScala.map(innerSchema => getAvroSchemaWithDefaults(innerSchema, dataType)).asJava)
-    }, containsNullSchema)
+    dataType match {
Review Comment:
Kinda. It's because in the MDT we have a union of a bunch of different
types, but otherwise we expect a union to have only two types, one of them
being null. This is Spark-specific because this code converts Spark schemas,
but we will probably hit a similar issue if other engines use a schema
representation other than Avro.
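To make the distinction concrete: a minimal sketch (not Hudi's actual code) of the null-handling contract the diff is discussing, modeling an Avro union as a plain list of branch type names. The names `resolveUnion`, `record_a`, etc. are illustrative; the real method operates on `org.apache.avro.Schema` objects.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class UnionResolver {
    // Sketch of the resolveUnion contract: detect a "null" branch and, if
    // present, rebuild the union with "null" first (Avro's convention for
    // nullable fields); also report whether a null branch was found.
    static Map.Entry<List<String>, Boolean> resolveUnion(List<String> branches) {
        boolean containsNull = branches.contains("null");
        List<String> resolved;
        if (containsNull) {
            resolved = new ArrayList<>();
            resolved.add("null"); // null branch moved to the front
            for (String b : branches) {
                if (!b.equals("null")) {
                    resolved.add(b); // non-null branches keep their order
                }
            }
        } else {
            // MDT-style union: many branches, none of them null,
            // so the union passes through unchanged.
            resolved = new ArrayList<>(branches);
        }
        return new AbstractMap.SimpleEntry<>(resolved, containsNull);
    }

    public static void main(String[] args) {
        // Typical Spark-derived union: exactly two branches, one null.
        System.out.println(resolveUnion(Arrays.asList("string", "null")));
        // MDT-style union: several record branches, no null.
        System.out.println(resolveUnion(Arrays.asList("record_a", "record_b", "record_c")));
    }
}
```

The two calls show the two cases the review comment contrasts: the nullable two-branch union most Spark schemas produce, and the multi-branch union the metadata table uses.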
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]