Dandandan commented on code in PR #8126:
URL: https://github.com/apache/arrow-datafusion/pull/8126#discussion_r1389869661
##########
datafusion/physical-plan/src/filter.rs:
##########
@@ -194,11 +194,13 @@ impl ExecutionPlan for FilterExec {
fn statistics(&self) -> Result<Statistics> {
let predicate = self.predicate();
+ let input_stats = self.input.statistics()?;
let schema = self.schema();
if !check_support(predicate, &schema) {
- return Ok(Statistics::new_unknown(&schema));
+ // assume worst case, that the filter is not selective at all and
+ // returns all the rows from its input
+ return Ok(input_stats.clone().into_inexact());
Review Comment:
I wonder if we could assume a default selectivity here instead, which would
give a better estimate, e.g. each filter returning 50% or 20% of its input rows?
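
To make the suggestion concrete, here is a rough, standalone sketch of what a default-selectivity estimate could look like. It does not use DataFusion's types: `RowEstimate`, `DEFAULT_SELECTIVITY` and `estimate_filtered_rows` are hypothetical names made up for illustration, and the 20% default is just one of the two values floated above, not a measured figure.

```rust
/// Hypothetical stand-in for a row-count estimate; not DataFusion's own type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum RowEstimate {
    /// The row count is known precisely.
    Exact(usize),
    /// The row count is only an estimate.
    Inexact(usize),
    /// Nothing is known about the row count.
    Unknown,
}

/// Assumed default selectivity (20%), chosen only for illustration.
const DEFAULT_SELECTIVITY: f64 = 0.2;

/// Scale the input row count by `selectivity` when the predicate cannot be
/// analyzed. The result is never exact, because the factor is a guess.
fn estimate_filtered_rows(input: RowEstimate, selectivity: f64) -> RowEstimate {
    // Keep the factor within [0, 1] so the estimate never exceeds the input.
    let selectivity = selectivity.clamp(0.0, 1.0);
    match input {
        RowEstimate::Exact(n) | RowEstimate::Inexact(n) => {
            RowEstimate::Inexact((n as f64 * selectivity).round() as usize)
        }
        // Without an input count there is nothing to scale.
        RowEstimate::Unknown => RowEstimate::Unknown,
    }
}
```

With this sketch, an input of `RowEstimate::Exact(1000)` and the assumed default yields `RowEstimate::Inexact(200)`, while passing a selectivity of 1.0 reproduces the "returns all the rows" behavior in the diff.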