Gerrrr opened a new pull request, #16713:
URL: https://github.com/apache/iceberg/pull/16713
Closes #16599
Previously, Iceberg's Parquet writer exposed only a global
dictionary-encoding toggle (`write.parquet.dict-enabled`), with no way to disable
dictionary encoding for a single column. parquet-java already supports
per-column control via
`ParquetProperties.Builder.withDictionaryEncoding(String columnPath, boolean)`;
this PR threads that through Iceberg.
Changes
1. New table property prefix `write.parquet.dict-enabled.column.<col>`,
matching the existing per-column conventions used by
`write.parquet.bloom-filter-enabled.column.*` and
`write.parquet.stats-enabled.column.*`. The per-column setting overrides the
global `write.parquet.dict-enabled` value for that column; absent entries fall
back to the global default.
2. `Context.dataContext` parses the new prefix into
`columnDictionaryEnabled` and exposes it via a getter. `Context.deleteContext`
keeps an empty map, mirroring how it handles the existing per-column
bloom-filter settings.
3. `Parquet.WriteBuilder.withDictionaryEncoding(String, boolean)` writes
into the same config map that `setAll` and `set` already use, giving
programmatic callers a path that does not require building the
property-prefix string by hand.
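
The override-and-fallback behaviour described in items 1 and 2 can be sketched in plain Java. This is a minimal illustration, not the code in this PR: the class `DictEncodingProps` and its methods are hypothetical, only the two property names come from the description above, and the global default of `true` is an assumption made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of per-column dictionary settings; not Iceberg code.
public class DictEncodingProps {
  // Property names as given in the PR description.
  static final String GLOBAL_KEY = "write.parquet.dict-enabled";
  static final String COLUMN_PREFIX = "write.parquet.dict-enabled.column.";

  // Assumed global default, used only when GLOBAL_KEY is absent (assumption for this sketch).
  static final boolean ASSUMED_GLOBAL_DEFAULT = true;

  // Collects every "<prefix><col>" entry into a column -> enabled map.
  static Map<String, Boolean> columnOverrides(Map<String, String> props) {
    Map<String, Boolean> overrides = new HashMap<>();
    for (Map.Entry<String, String> entry : props.entrySet()) {
      String key = entry.getKey();
      if (key.startsWith(COLUMN_PREFIX)) {
        String column = key.substring(COLUMN_PREFIX.length());
        overrides.put(column, Boolean.parseBoolean(entry.getValue()));
      }
    }
    return overrides;
  }

  // A per-column entry wins; otherwise the global value (or the assumed default) applies.
  static boolean dictionaryEnabled(Map<String, String> props, String column) {
    boolean global =
        props.containsKey(GLOBAL_KEY)
            ? Boolean.parseBoolean(props.get(GLOBAL_KEY))
            : ASSUMED_GLOBAL_DEFAULT;
    return columnOverrides(props).getOrDefault(column, global);
  }

  public static void main(String[] args) {
    Map<String, String> props = new HashMap<>();
    props.put(GLOBAL_KEY, "true");
    props.put(COLUMN_PREFIX + "payload", "false"); // "payload" is a made-up column name

    System.out.println("payload: " + dictionaryEnabled(props, "payload"));
    System.out.println("id: " + dictionaryEnabled(props, "id"));
  }
}
```

In this sketch the `payload` column takes its own setting while `id`, which has no entry, falls back to the global value, which is the resolution order item 1 describes.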
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]