marvinlanhenke commented on issue #10806: URL: https://github.com/apache/datafusion/issues/10806#issuecomment-2156435325
@alamb ...just to confirm my current understanding.

Since we can have more than one page per column per row group, I can get multiple statistics. Or, put differently, one ArrayRef per page. For example, if we have a column A with row group R1 and pages P1 & P2, I would retrieve statistics for both P1 and P2?

- A.R1.P1 {min: Some(2), max: Some(5), ...}
- A.R1.P2 {min: Some(1), max: Some(4), ...}

Thus, instead of extracting a single `Result<ArrayRef>` in the `StatisticsExtractor`, we should return a `Result<Vec<ArrayRef>>`?

I think we have to do this in order to use the new API within `prune_pages_in_one_row_group`, which creates a `Vec<ArrayRef>` when pruning against the predicate [here](https://github.com/apache/datafusion/blob/main/datafusion/core/src/physical_optimizer/pruning.rs#L885)?
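
To make the shape of the question concrete, here is a minimal sketch of the A.R1.P1 / A.R1.P2 example using only plain Rust types. It deliberately does not use the Arrow or DataFusion APIs: `PageStats`, `page_mins`, `page_maxes`, and `pages_to_keep_gt` are hypothetical names, a `Vec<Option<i32>>` stands in for an Arrow array with one slot per page, and the predicate `A > threshold` is an assumption chosen only for illustration.

```rust
/// Min/max statistics for a single data page.
/// Hypothetical stand-in for values read from Parquet page-level statistics.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct PageStats {
    pub min: Option<i32>,
    pub max: Option<i32>,
}

/// Per-page statistics for column A in row group R1, matching the example above.
pub fn column_a_row_group_1() -> Vec<PageStats> {
    vec![
        PageStats { min: Some(2), max: Some(5) }, // A.R1.P1
        PageStats { min: Some(1), max: Some(4) }, // A.R1.P2
    ]
}

/// One minimum per page. The Vec plays the role of an array with one slot per page.
pub fn page_mins(pages: &[PageStats]) -> Vec<Option<i32>> {
    pages.iter().map(|p| p.min).collect()
}

/// One maximum per page.
pub fn page_maxes(pages: &[PageStats]) -> Vec<Option<i32>> {
    pages.iter().map(|p| p.max).collect()
}

/// Keep-mask for the assumed predicate `A > threshold`.
/// A page may be skipped only when its max is known and not above the threshold;
/// a page with unknown statistics must be kept.
pub fn pages_to_keep_gt(maxes: &[Option<i32>], threshold: i32) -> Vec<bool> {
    maxes
        .iter()
        .map(|max| match max {
            Some(m) => *m > threshold,
            None => true,
        })
        .collect()
}
```

With these values, `pages_to_keep_gt(&page_maxes(&column_a_row_group_1()), 4)` keeps P1 (max 5) and skips P2 (max 4). Whether the real extractor should hand back one array with a slot per page, or a `Vec<ArrayRef>`, is exactly the open question in the comment; the sketch only shows that page-level pruning needs one min and one max value per page.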