alamb commented on issue #10806:
URL: https://github.com/apache/datafusion/issues/10806#issuecomment-2154976973

   Thanks @marvinlanhenke 🙏 
   
   To write the relevant structures into Parquet, the 
[statistics_enabled](https://docs.rs/parquet/latest/parquet/file/properties/struct.WriterProperties.html#method.statistics_enabled)
 writer property needs to be set to 
[Page](https://docs.rs/parquet/latest/parquet/file/properties/enum.EnabledStatistics.html#variant.Page)
   
   To read them back, the reader needs to be configured with 
[with_page_index](https://docs.rs/parquet/latest/parquet/arrow/arrow_reader/struct.ArrowReaderOptions.html#method.with_page_index)
 I think
   
   Also I have a proposed change to the Statistics code in 
https://github.com/apache/datafusion/pull/10802
   
   If that gets merged, then the API for extracting the mins from data pages 
might look like
   
   ```rust
           // get relevant index statistics somehow
           let data_page_statistics: Vec<&Statistics> = todo!();
           let converter = StatisticsConverter::try_new(
               column_name,
               reader.schema(),
               reader.parquet_schema(),
           )
           .unwrap();
           // get mins from the ColumnIndex
           // (column_index_mins is the proposed API, not merged yet)
           let mins = converter.column_index_mins(data_page_statistics).unwrap();
   ```
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

