Re: [PR] feat(variant): support reading shredded variant base files via the AVRO reader [hudi]

via GitHub Tue, 23 Jun 2026 17:55:38 -0700


voonhous commented on code in PR #18938:
URL: https://github.com/apache/hudi/pull/18938#discussion_r3463846891

##########
hudi-hadoop-common/src/main/java/org/apache/hudi/io/storage/hadoop/HoodieVariantReconstruction.java:
##########

Review Comment:
Good catch, this found a bug.

Added `TestHoodieVariantReconstructionRoundTrip` (in hudi-spark4-common, so
the real provider is auto-detected): a `create` -> `reconstruct` round-trip
over a record with a shredded variant column plus a non-variant column,
asserting the variant is rebuilt, the non-variant column passes through, and
the field positions line up.

It failed on first run: `create` reused the requested field instances for
non-target columns when assembling the intermediate read schema, and those Avro
`Field`s are already bound to the requested record - so `Schema.setFields`
throws `Field already used` as soon as a non-variant column sits alongside a
shredded variant (i.e. every real table). The target branch only avoided it
because `withSchema()` builds a fresh `Field`. Fixed by doing the same for
non-target fields. Both in 50ab440.

It stayed latent because nothing exercised this path: Spark compaction reads
base files via the InternalRow reader, and the Java/Flink AVRO path fails fast
first (no provider).

##########
hudi-hadoop-common/src/main/java/org/apache/hudi/io/storage/hadoop/HoodieVariantReconstruction.java:
##########

Review Comment:
Good catch, this found a bug.

It stayed latent because nothing exercised this path: Spark compaction reads
base files via the InternalRow reader, and the Java/Flink AVRO path fails fast
first (no provider).

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] feat(variant): support reading shredded variant base files via the AVRO reader [hudi]

Reply via email to