alamb commented on issue #10806: URL: https://github.com/apache/datafusion/issues/10806#issuecomment-2154976973
Thanks @marvinlanhenke 🙏

To write the relevant structures into Parquet, the [statistics_enabled](https://docs.rs/parquet/latest/parquet/file/properties/struct.WriterProperties.html#method.statistics_enabled) setting needs to be [Page](https://docs.rs/parquet/latest/parquet/file/properties/enum.EnabledStatistics.html#variant.Page).

To read them back, the reader needs to be configured with [with_page_index](https://docs.rs/parquet/latest/parquet/arrow/arrow_reader/struct.ArrowReaderOptions.html#method.with_page_index), I think.

Also, I have a proposed change to the Statistics code in https://github.com/apache/datafusion/pull/10802

If that gets merged, then the API for extracting the mins from data pages might look like

```rust
// get relevant index statistics somehow
let data_page_statistics: Vec<&Statistics> = todo!();

let converter = StatisticsConverter::try_new(
    column_name,
    reader.schema(),
    reader.parquet_schema(),
)
.unwrap(); // assuming `try_new` returns a `Result`, as the name suggests

// get mins from the ColumnIndex
let mins = converter.column_index_mins(data_page_statistics).unwrap();
```