amogh-jahagirdar commented on issue #14914:
URL: https://github.com/apache/iceberg/issues/14914#issuecomment-3687858095
Iceberg does require explicit materialization of columns in the data file, even those that are used in partitioning schemes. Ultimately partitioning is a transformation or derivation on a column (or columns) but the materialization in data files is helpful in case metadata gets corrupted.

@JerAguilon Here's the current spec language: https://iceberg.apache.org/spec/#writing-data-files

```
All columns must be written to data files even if they introduce redundancy with metadata stored in manifest files (e.g. columns with identity partition transforms). Writing all columns provides a backup in case of corruption or bugs in the metadata layer.
```
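
To make the spec rule concrete, here is a minimal sketch in plain Python. It does not use any Iceberg API: `write_partitioned`, `recover_partition_values`, and the sample `region` column are hypothetical names chosen for illustration, and in-memory lists stand in for data files. It only shows the idea that, when the identity-partitioned column is kept in every row, the partition value of each file can be rebuilt from the data alone if the metadata is lost.

```python
from collections import defaultdict


def write_partitioned(rows, partition_col):
    """Group rows by an identity partition value.

    Every column, including the partition column, is kept in each row,
    mirroring the "all columns must be written" rule. The dict keys play
    the role of partition metadata; the lists play the role of data files.
    """
    files = defaultdict(list)
    for row in rows:
        # Copy the full row so the partition column stays in the "data file".
        files[row[partition_col]].append(dict(row))
    return dict(files)


def recover_partition_values(files, partition_col):
    """Rebuild each file's partition value from its rows only.

    The dict keys (the "metadata") are deliberately ignored here.
    """
    recovered = {}
    for file_id, file_rows in enumerate(files.values()):
        values = {r[partition_col] for r in file_rows}
        # With an identity transform, one file holds exactly one value.
        assert len(values) == 1
        recovered[file_id] = values.pop()
    return recovered


rows = [
    {"id": 1, "region": "us", "amount": 10},
    {"id": 2, "region": "eu", "amount": 20},
    {"id": 3, "region": "us", "amount": 30},
]
files = write_partitioned(rows, "region")
recovered = recover_partition_values(files, "region")
```

If the partition column were dropped from the rows at write time, `recover_partition_values` would have nothing to read, which is the failure mode the spec text guards against.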
