Re: [PR] feat(variant): Add support to write shredded variants for HoodieRecordType.AVRO [hudi]

via GitHub Thu, 04 Jun 2026 00:48:33 -0700


voonhous commented on code in PR #18065:
URL: https://github.com/apache/hudi/pull/18065#discussion_r3354349887



##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/io/storage/row/HoodieRowParquetWriteSupport.java:
##########
@@ -129,6 +142,16 @@ public HoodieRowParquetWriteSupport(Configuration conf, StructType structType, O
     hadoopConf.set("spark.sql.parquet.writeLegacyFormat", writeLegacyFormatEnabled);
     hadoopConf.set("spark.sql.parquet.outputTimestampType", config.getStringOrDefault(HoodieStorageConfig.PARQUET_OUTPUT_TIMESTAMP_TYPE));
     hadoopConf.set("spark.sql.parquet.fieldId.write.enabled", config.getStringOrDefault(PARQUET_FIELD_ID_WRITE_ENABLED));
+
+    // Variant shredding configs
+    this.variantWriteShreddingEnabled = config.getBooleanOrDefault(PARQUET_VARIANT_WRITE_SHREDDING_ENABLED);
+    this.variantForceShreddingSchemaForTest = config.getString(PARQUET_VARIANT_FORCE_SHREDDING_SCHEMA_FOR_TEST);

Review Comment:
   This config mirrors Spark's own test-only 
`spark.sql.variant.forceShreddingSchemaForTest`. It is currently the only way 
to force an unshredded input into a shredded layout, which is what lets the 
tests exercise the write path. In normal writes, shredding applies only when 
the input schema already declares `typed_value`, so there is not yet a 
schema-inference path that could drive it. The config is marked with 
`markAdvanced()` and its key ends in `.for.test`.
   
   I have removed the unused 
`HoodieStorageConfig.Builder.parquetVariantForceShreddingSchemaForTest` method, 
so the config is no longer part of the first-class builder API, and clarified 
in the documentation that it is test-only and not for production. The 
`ConfigProperty` itself has to remain, since the write path reads it by key 
and the tests set it via SQL.
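
As a rough illustration of the behavior described in the comment above, the choice of layout can be sketched as a small decision function. This is not the code from the PR: the class `VariantShreddingDecision`, the `Layout` enum, and `chooseLayout` are hypothetical names, the schema is reduced to a list of field names, and the forced-schema string is a placeholder. The sketch also assumes that the forced test schema only takes effect when write shredding is enabled and that it takes precedence over the input schema; the comment does not confirm either point.

```java
import java.util.Arrays;
import java.util.List;

public class VariantShreddingDecision {

    /** Hypothetical result type, introduced only for this sketch; not part of Hudi. */
    enum Layout { UNSHREDDED, SHREDDED_FROM_INPUT_SCHEMA, SHREDDED_FORCED_FOR_TEST }

    /**
     * Decides which layout a variant column would be written in.
     *
     * @param writeShreddingEnabled       value of the write-shredding flag
     * @param variantGroupFieldNames      field names declared by the input variant group
     * @param forceShreddingSchemaForTest test-only forced schema, or null when unset
     */
    static Layout chooseLayout(boolean writeShreddingEnabled,
                               List<String> variantGroupFieldNames,
                               String forceShreddingSchemaForTest) {
        if (!writeShreddingEnabled) {
            return Layout.UNSHREDDED;
        }
        // Assumption: the forced schema is honored only when shredding is enabled,
        // and it wins over whatever the input schema declares.
        if (forceShreddingSchemaForTest != null && !forceShreddingSchemaForTest.trim().isEmpty()) {
            return Layout.SHREDDED_FORCED_FOR_TEST;
        }
        // Normal writes: shred only when the input already declares typed_value.
        if (variantGroupFieldNames.contains("typed_value")) {
            return Layout.SHREDDED_FROM_INPUT_SCHEMA;
        }
        // No schema inference yet, so an unshredded input stays unshredded.
        return Layout.UNSHREDDED;
    }

    public static void main(String[] args) {
        List<String> unshreddedInput = Arrays.asList("metadata", "value");
        List<String> shreddedInput = Arrays.asList("metadata", "value", "typed_value");
        String placeholderSchema = "placeholder-forced-schema"; // not a real schema string

        System.out.println(chooseLayout(true, unshreddedInput, null));
        System.out.println(chooseLayout(true, shreddedInput, null));
        System.out.println(chooseLayout(true, unshreddedInput, placeholderSchema));
        System.out.println(chooseLayout(false, shreddedInput, placeholderSchema));
    }
}
```

The third call in `main` is the case the test-only config exists for: an input without `typed_value` that still ends up in a shredded layout.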



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
