[
https://issues.apache.org/jira/browse/DRILL-7187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Aman Sinha updated DRILL-7187:
------------------------------
Labels: ready-to-commit (was: )
> Improve selectivity estimates for range predicates when using histogram
> -----------------------------------------------------------------------
>
> Key: DRILL-7187
> URL: https://issues.apache.org/jira/browse/DRILL-7187
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Aman Sinha
> Assignee: Aman Sinha
> Priority: Major
> Labels: ready-to-commit
> Fix For: 1.17.0
>
>
> Two selectivity estimation improvements are needed:
> 1. For range predicates on the same column, we need to collect all such
> predicates into one group and do a single histogram lookup for the group.
> For instance:
> {noformat}
> WHERE a > 10 AND b < 20 AND c = 100 AND a <= 50 AND b < 50
> {noformat}
> Currently, Drill treats each conjunct independently and multiplies the
> individual selectivities. That assumes the conjuncts are independent, which
> does not hold for predicates on the same column, so the estimate is
> inaccurate. Here, we want to group the predicates on 'a' together and do a
> single lookup, and similarly for 'b'.
> 2. The histogram does not track NULLs, but when computing selectivity it
> should use totalRowCount as the denominator rather than the non-null
> count.