Riza Suminto created IMPALA-13542:
-------------------------------------

             Summary: Raise default selectivity for HAVING predicates
                 Key: IMPALA-13542
                 URL: https://issues.apache.org/jira/browse/IMPALA-13542
             Project: IMPALA
          Issue Type: Improvement
          Components: Frontend
            Reporter: Riza Suminto
         Attachments: TPCDS-Q11_iter007-baseline.txt, 
TPCDS-Q11_iter007-test.txt, TPCDS-Q74_iter007-baseline.txt, 
TPCDS-Q74_iter007-test.txt, performance_result.txt

In my recent perf-AB-test, I found that the 10% default selectivity can regress 
the output cardinality estimate of an aggregation node that has a HAVING predicate.
https://gerrit.cloudera.org/c/22032/2/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q04.test#138

Attached are performance_result.txt and some profiles for comparison.

This was not an issue until now because NDV-based cardinality estimation often 
overestimates already, such that applying the 10% default selectivity still 
results in a higher estimate than the actual runtime cardinality. But as the 
cardinality estimate gets better, this 10% default selectivity can in turn cause 
an underestimation. We should consider raising the default selectivity above 
10% for HAVING predicates.
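
To illustrate the arithmetic, here is a minimal Python sketch. All numbers are 
made up for illustration and are not taken from the attached profiles, and the 
function names are hypothetical rather than Impala frontend code. It only 
assumes that the planner multiplies the aggregation's estimated group count by 
the default selectivity.

```python
# Hypothetical illustration only: numbers and names are assumptions,
# not Impala code and not values from the attached profiles.
DEFAULT_SELECTIVITY = 0.10  # the 10% default selectivity discussed above


def estimate_having_output(agg_cardinality_estimate, selectivity=DEFAULT_SELECTIVITY):
    """Estimated rows leaving an aggregation node after its HAVING predicate."""
    return agg_cardinality_estimate * selectivity


def describe(estimate, actual):
    """Classify an estimate relative to the actual runtime cardinality."""
    if estimate < actual:
        return "underestimate"
    if estimate > actual:
        return "overestimate"
    return "exact"


# Assumed runtime behavior: 1,000,000 groups, of which HAVING keeps half.
actual_output = 500_000

# Old NDV-based estimate: assumed to overestimate the group count 8x.
old_agg_estimate = 8_000_000
# Improved estimate: assumed to be within 10% of the real group count.
new_agg_estimate = 1_100_000

old_output = estimate_having_output(old_agg_estimate)
new_output = estimate_having_output(new_agg_estimate)

print("old:", round(old_output), describe(old_output, actual_output))
print("new:", round(new_output), describe(new_output, actual_output))
```

In this made-up case the inflated group count masks the low default 
selectivity, while the more accurate group count exposes it as an 
underestimate of the HAVING output.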



--
This message was sent by Atlassian Jira
(v8.20.10#820010)