jonvex commented on code in PR #11802:
URL: https://github.com/apache/hudi/pull/11802#discussion_r1724177428
##########
hudi-client/hudi-spark-client/src/main/scala/org/apache/hudi/AvroConversionUtils.scala:
##########
@@ -230,12 +230,23 @@ object AvroConversionUtils {
  private def resolveUnion(schema: Schema, dataType: DataType): (Schema, Boolean) = {
    val innerFields = schema.getTypes.asScala
    val containsNullSchema = innerFields.foldLeft(false)((nullFieldEncountered, schema) => nullFieldEncountered | schema.getType == Schema.Type.NULL)
-    (if (containsNullSchema) {
-      Schema.createUnion((List(Schema.create(Schema.Type.NULL)) ++ innerFields.filter(innerSchema => !(innerSchema.getType == Schema.Type.NULL))
-        .map(innerSchema => getAvroSchemaWithDefaults(innerSchema, dataType))).asJava)
-    } else {
-      Schema.createUnion(schema.getTypes.asScala.map(innerSchema => getAvroSchemaWithDefaults(innerSchema, dataType)).asJava)
-    }, containsNullSchema)
+    dataType match {
Review Comment:
Kinda. It's because in the MDT we have a union of a bunch of different
types, but otherwise we expect a union to have only two types, one of them
being null. This is Spark-specific because this code converts Spark schemas,
but we will probably hit a similar issue if other engines use a schema
representation other than Avro.
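To make the distinction concrete: a minimal sketch (not Hudi's actual code) of the null-handling contract the diff is discussing, modeling an Avro union as a plain list of branch type names. The names `resolveUnion`, `record_a`, etc. are illustrative; the real method operates on `org.apache.avro.Schema` objects.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class UnionResolver {
    // Sketch of the resolveUnion contract: detect a "null" branch and, if
    // present, rebuild the union with "null" first (Avro's convention for
    // nullable fields); also report whether a null branch was found.
    static Map.Entry<List<String>, Boolean> resolveUnion(List<String> branches) {
        boolean containsNull = branches.contains("null");
        List<String> resolved;
        if (containsNull) {
            resolved = new ArrayList<>();
            resolved.add("null"); // null branch moved to the front
            for (String b : branches) {
                if (!b.equals("null")) {
                    resolved.add(b); // non-null branches keep their order
                }
            }
        } else {
            // MDT-style union: many branches, none of them null,
            // so the union passes through unchanged.
            resolved = new ArrayList<>(branches);
        }
        return new AbstractMap.SimpleEntry<>(resolved, containsNull);
    }

    public static void main(String[] args) {
        // Typical Spark-derived union: exactly two branches, one null.
        System.out.println(resolveUnion(Arrays.asList("string", "null")));
        // MDT-style union: several record branches, no null.
        System.out.println(resolveUnion(Arrays.asList("record_a", "record_b", "record_c")));
    }
}
```

The two calls show the two cases the review comment contrasts: the nullable two-branch union most Spark schemas produce, and the multi-branch union the metadata table uses.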
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]