voonhous commented on code in PR #18938:
URL: https://github.com/apache/hudi/pull/18938#discussion_r3464878815


##########
hudi-common/src/main/java/org/apache/hudi/avro/VariantShreddingProvider.java:
##########
@@ -63,4 +86,24 @@ GenericRecord shredVariantRecord(
       GenericRecord unshreddedVariant,
       Schema shreddedSchema,
       HoodieSchema.Variant variantSchema);
+
+  /**
+   * Reconstruct an unshredded variant GenericRecord from a shredded one (the inverse of
+   * {@link #shredVariantRecord}).
+   * <p>
+   * Used on the read path: records read from an already-shredded base file (compaction/clustering)
+   * arrive with {@code typed_value} populated. This rebuilds the full variant binary so the record
+   * presents the standard unshredded {@code {metadata, value}} shape before it reaches the
+   * merger/writer.
+   *
+   * @param shreddedVariant  GenericRecord with {value, metadata, typed_value} read from a shredded base file
+   * @param shreddedSchema   the Avro schema of {@code shreddedVariant} (carries typed_value)
+   * @param unshreddedSchema target Avro schema with {value: ByteBuffer, metadata: ByteBuffer}
+   * @return a GenericRecord conforming to {@code unshreddedSchema} with the full reconstructed
+   *         variant binary in {@code value}, or null if the input metadata is missing

Review Comment:
   Good catch, fixed - updated the `@return` to "null when `shreddedVariant` is 
null" and added an `@throws HoodieException` for the missing-`metadata` case, 
matching the Spark4 impl.
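
   For reference, a minimal sketch of the contract the updated Javadoc describes. This is not the code in the PR: `VariantContractSketch` and `unshred` are hypothetical names, a `Map` stands in for `GenericRecord`, and `IllegalStateException` stands in for `HoodieException` so the snippet compiles without Hudi or Avro on the classpath. It only illustrates the null-input and missing-`metadata` behaviour; the real reconstruction of the variant binary from `typed_value` is not shown.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

public class VariantContractSketch {

  // Hypothetical stand-in for the interface method: a Map replaces GenericRecord and
  // IllegalStateException replaces HoodieException, so the sketch has no dependencies.
  static Map<String, ByteBuffer> unshred(Map<String, ByteBuffer> shreddedVariant) {
    if (shreddedVariant == null) {
      // Documented @return: null when shreddedVariant is null.
      return null;
    }
    ByteBuffer metadata = shreddedVariant.get("metadata");
    if (metadata == null) {
      // Documented @throws: missing metadata is an error, not a null result.
      throw new IllegalStateException("Shredded variant is missing metadata");
    }
    Map<String, ByteBuffer> unshredded = new HashMap<>();
    unshredded.put("metadata", metadata);
    // Placeholder only: a real implementation rebuilds the full variant binary
    // from typed_value here instead of copying the residual value field.
    unshredded.put("value", shreddedVariant.get("value"));
    return unshredded;
  }

  public static void main(String[] args) {
    // Null input yields a null result.
    System.out.println("null input -> " + unshred(null));

    // A record without metadata is rejected with an exception.
    Map<String, ByteBuffer> noMetadata = new HashMap<>();
    noMetadata.put("value", ByteBuffer.wrap(new byte[] {1}));
    try {
      unshred(noMetadata);
    } catch (IllegalStateException e) {
      System.out.println("missing metadata -> " + e.getMessage());
    }
  }
}
```

   The point of separating the two cases is that callers can treat a null result as "no variant present" while a missing `metadata` field surfaces as a hard failure.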


