alamb commented on issue #4328: URL: https://github.com/apache/arrow-rs/issues/4328#issuecomment-2100283992
> I had a chance to go through the code on a high level, thanks @alamb for the pointers, it helped me to get started. What will call this new function please? Just trying to understand the whole flow if that's ok. Thanks

The major use case I have initially is to implement the [PruningStatistics API](https://docs.rs/datafusion/latest/datafusion/physical_optimizer/pruning/trait.PruningStatistics.html) in DataFusion, which supports pruning (skipping) row groups based on a range analysis of min/max values, documented [here](https://docs.rs/datafusion/latest/datafusion/physical_optimizer/pruning/struct.PruningPredicate.html#introduction).

So for example, given a filter in a query such as `a = 5`, DataFusion would use the min and max values of `a` in each row group to determine whether any rows in that row group could match. If `5` falls outside the `[min, max]` range, the row group can be skipped without being read.

Does that make sense?
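
To make the `a = 5` example concrete, here is a minimal standalone sketch of the range check. It is not the DataFusion or arrow-rs API: `RowGroupStats`, `may_contain_eq`, and `row_groups_to_scan` are hypothetical names made up for illustration, and it assumes a single `i64` column whose min/max may be missing.

```rust
/// Hypothetical per-row-group statistics for one i64 column `a`.
/// `None` means the statistic was not recorded for that row group.
#[derive(Debug, Clone, Copy)]
pub struct RowGroupStats {
    pub min: Option<i64>,
    pub max: Option<i64>,
}

/// Returns true if the row group *might* contain a row where `a == value`.
/// A `false` result proves no row can match, so the row group can be skipped.
pub fn may_contain_eq(stats: &RowGroupStats, value: i64) -> bool {
    match (stats.min, stats.max) {
        // Both bounds known: a match is only possible inside [min, max].
        (Some(min), Some(max)) => min <= value && value <= max,
        // Missing statistics: nothing can be ruled out, so keep the row group.
        _ => true,
    }
}

/// Indices of the row groups that still have to be read for `a = value`.
pub fn row_groups_to_scan(stats: &[RowGroupStats], value: i64) -> Vec<usize> {
    stats
        .iter()
        .enumerate()
        .filter(|(_, s)| may_contain_eq(s, value))
        .map(|(i, _)| i)
        .collect()
}
```

Note that the check is conservative: it only ever proves that a row group cannot match, so a row group with missing statistics is always kept.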
