Re: [PR] feat(variant): support reading shredded variant base files via the AVRO reader [hudi]

via GitHub Sat, 20 Jun 2026 07:28:11 -0700


wombatu-kun commented on code in PR #18938:
URL: https://github.com/apache/hudi/pull/18938#discussion_r3446636163



##########
hudi-spark-datasource/hudi-spark/src/test/scala/org/apache/spark/sql/hudi/dml/schema/TestVariantDataType.scala:
##########
@@ -142,6 +150,8 @@ class TestVariantDataType extends HoodieSparkSqlTestBase {
            |  primaryKey = 'id',
            |  type = 'mor',
            |  preCombineField = 'ts',
+           |  hoodie.parquet.variant.write.shredding.enabled = 'true',

Review Comment:
   This test does not exercise the AVRO reconstruction it is meant to cover. 
For a non-metadata Spark MOR table, compaction reads base files through 
`SparkReaderContextFactory` 
(`HoodieSparkEngineContext.getReaderContextFactory`), i.e. the native 
InternalRow reader, regardless of the merger. `withRecordType` only swaps the 
record merger and the log block format, not the base-file reader, so 
`HoodieAvroParquetReader` (where `HoodieVariantReconstruction` runs) is never 
reached for either record type. `rebuildVariantRecord`, the 
`AvroVariantRow`/`AvroObjectRow`/`AvroArrayRow` accessors, and both new 
fail-fast branches are reachable only through `HoodieAvroReaderContext` 
(Java/Flink compaction, MDT, or `LOG_COMPACT`), which this PR does not test. 
Consider adding a Java-client compaction round-trip over a shredded base file, 
or a direct unit test of `rebuildVariantRecord` / `HoodieVariantReconstruction`. 
Either should also cover array, decimal, and numeric variant shapes; the 
current data is a single string-valued object field.
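
A rough sketch of what broader variant test data could look like. It assumes 
the test table's columns are `(id, v, ts)` in that order and that rows are 
built with Spark SQL's `parse_json`. `VariantShapeFixtures`, `shapes`, and 
`insertValues` are hypothetical names for illustration, not part of this PR:

```scala
// Hypothetical fixture helper; names and column order are assumptions.
object VariantShapeFixtures {

  // (id, JSON payload) pairs intended to go beyond a single
  // string-valued object field: arrays, fractional numbers,
  // integers, nested objects, booleans, and nulls.
  val shapes: Seq[(Int, String)] = Seq(
    1 -> """{"key":"value"}""",                    // current shape: string field
    2 -> """{"nums":[1,2,3]}""",                   // array of integers
    3 -> """{"price":12.34}""",                    // fractional number
    4 -> """{"count":42,"ratio":1.5e10}""",        // integer plus exponent form
    5 -> """[{"a":1},{"a":"x"}]""",                // top-level array, mixed types
    6 -> """{"nested":{"flag":true,"none":null}}""" // nested object
  )

  // Renders the VALUES list for an INSERT, assuming columns (id, v, ts).
  // The payloads above contain no single quotes, so no escaping is done here.
  def insertValues(ts: Long): String =
    shapes
      .map { case (id, json) => s"($id, parse_json('$json'), $ts)" }
      .mkString(",\n")
}
```

The output of `insertValues` could be spliced into an `insert into ... values` 
statement in the test. This only varies the data; it does not by itself route 
compaction through `HoodieAvroReaderContext`, so it would need to be paired 
with one of the test approaches suggested above.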



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
