Re: [PR] Spark: Support writing shredded variant in Iceberg-Spark [iceberg]

via GitHub Wed, 24 Dec 2025 09:43:35 -0800


yguy-ryft commented on PR #14297:
URL: https://github.com/apache/iceberg/pull/14297#issuecomment-3690300751


   This PR is super exciting!
   Does this rely on variant shredding support in Spark? Is that already 
supported in Spark 4.1, or is it planned for a future release?
   
   Regarding the heuristics: I'd like to propose adding table properties as 
hints for variant shredding.
   Similar to the properties used for bloom filters, it could be useful to 
introduce something like `write.parquet.variant-shredding-enabled.column.col1`, 
which would hint to the writer that this column should be prioritized for 
shredding.
   Many variants have important fields for which shredding should be enforced, 
and other fields which are less central and can be managed with simpler 
heuristics.
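   To make the idea concrete, here is a minimal sketch (in Python, purely 
illustrative) of how a writer could collect such hints from a table's 
properties. The property prefix is the hypothetical one proposed above, and 
the helper name `shredding_hints` is made up; both are assumptions for 
illustration, not an existing API.

```python
# Hypothetical prefix from the proposal above; it mirrors the naming of the
# existing bloom filter property (write.parquet.bloom-filter-enabled.column.<name>).
SHRED_PREFIX = "write.parquet.variant-shredding-enabled.column."


def shredding_hints(table_properties):
    """Return the set of column names whose shredding hint is set to true."""
    hinted = set()
    for key, value in table_properties.items():
        # Only keys under the hypothetical prefix are considered hints.
        if key.startswith(SHRED_PREFIX) and value.strip().lower() == "true":
            hinted.add(key[len(SHRED_PREFIX):])
    return hinted


# Example table properties: one unrelated bloom filter setting and two hints.
props = {
    "write.parquet.bloom-filter-enabled.column.id": "true",
    "write.parquet.variant-shredding-enabled.column.col1": "true",
    "write.parquet.variant-shredding-enabled.column.col2": "false",
}
hints = shredding_hints(props)
print(sorted(hints))
```

   Columns without a hint would simply fall back to whatever heuristics the 
writer already applies.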
   Would love to hear your thoughts!
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


