alamb opened a new issue, #4693: URL: https://github.com/apache/arrow-rs/issues/4693
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**

We are implementing configurable parquet writing in DataFusion.

We want to allow users to specify the parquet writing options (like compression) via a string like

```
set parquet.writer_version = 2.0
set parquet.compression = zstd(5)
```

**Describe the solution you'd like**

Implement [`FromStr`](https://doc.rust-lang.org/std/str/trait.FromStr.html) for the following structures, with some tests.

- [ ] [Encoding](https://docs.rs/parquet/45.0.0/parquet/format/struct.Encoding.html)
- [ ] [Compression](https://docs.rs/parquet/45.0.0/parquet/basic/enum.Compression.html#)
- [ ] [WriterVersion](https://docs.rs/parquet/45.0.0/parquet/file/properties/enum.WriterVersion.html#)
- [ ] [EnabledStatistics](https://docs.rs/parquet/45.0.0/parquet/file/properties/enum.EnabledStatistics.html#)

Each of these can be done via a separate PR.

The basic code can probably be ported from DataFusion, with some unit tests added: https://github.com/apache/arrow-datafusion/blob/ed85abbb878ef3d60e43797376cb9a40955cd89a/datafusion/core/src/datasource/file_format/parquet.rs#L13

Bonus points for good error messages that give example values (like "Invalid encoding. Valid values: plain_dictionary, rle, etc.").

**Describe alternatives you've considered**

**Additional context**

@devinjdangelo implemented parsing for these in https://github.com/apache/arrow-datafusion/pull/7244; however, I think these features could be more generally useful to others.
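
To make the request concrete, here is a minimal sketch of what a `FromStr` implementation with a helpful error message might look like. It does not use the parquet crate: `Codec`, `ParseCodecError`, `split_level`, and `VALID` are hypothetical names invented for illustration, and the set of accepted values and the rule that `gzip` and `zstd` require an explicit level are assumptions of this sketch, not the crate's behavior.

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical stand-in for a compression enum; NOT the parquet crate's type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Codec {
    Uncompressed,
    Snappy,
    Gzip(u32),
    Zstd(i32),
}

/// Hypothetical error type carrying a human-readable message.
#[derive(Debug, PartialEq, Eq)]
pub struct ParseCodecError(String);

impl fmt::Display for ParseCodecError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.0)
    }
}

impl std::error::Error for ParseCodecError {}

/// Example values listed in every error message (an assumption of this sketch).
const VALID: &str = "uncompressed, snappy, gzip(level), zstd(level)";

/// Splits "zstd(5)" into ("zstd", Some("5")) and "snappy" into ("snappy", None).
fn split_level(s: &str) -> Result<(&str, Option<&str>), ParseCodecError> {
    match s.split_once('(') {
        None => Ok((s, None)),
        Some((name, rest)) => match rest.strip_suffix(')') {
            Some(level) => Ok((name, Some(level))),
            None => Err(ParseCodecError(format!(
                "Invalid compression '{}': missing closing ')'. Valid values: {}",
                s, VALID
            ))),
        },
    }
}

impl FromStr for Codec {
    type Err = ParseCodecError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Accept surrounding whitespace and any letter case.
        let lowered = s.trim().to_ascii_lowercase();
        let (name, level) = split_level(&lowered)?;
        let invalid = || {
            ParseCodecError(format!(
                "Invalid compression '{}'. Valid values: {}",
                s, VALID
            ))
        };
        match (name.trim(), level) {
            ("uncompressed", None) => Ok(Codec::Uncompressed),
            ("snappy", None) => Ok(Codec::Snappy),
            // In this sketch, gzip and zstd must carry an explicit level.
            ("gzip", Some(l)) => l.trim().parse::<u32>().map(Codec::Gzip).map_err(|_| invalid()),
            ("zstd", Some(l)) => l.trim().parse::<i32>().map(Codec::Zstd).map_err(|_| invalid()),
            _ => Err(invalid()),
        }
    }
}
```

With something like this in place, a settings layer could call `"zstd(5)".parse::<Codec>()` directly, and an unrecognized value would produce an error that lists the accepted spellings.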
