[
https://issues.apache.org/jira/browse/IMPALA-8042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Quanlong Huang updated IMPALA-8042:
-----------------------------------
Fix Version/s: Impala 4.4.1
> Better selectivity estimate for BETWEEN
> ---------------------------------------
>
> Key: IMPALA-8042
> URL: https://issues.apache.org/jira/browse/IMPALA-8042
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Affects Versions: Impala 3.1.0
> Reporter: Paul Rogers
> Assignee: Riza Suminto
> Priority: Minor
> Fix For: Impala 4.5.0, Impala 4.4.1
>
>
> The analyzer rewrites a BETWEEN expression into a pair of inequalities.
> IMPALA-8037 explains that the planner then groups all such non-quality
> conditions together and assigns a selectivity of 0.1. IMPALA-8031 explains
> that the analyzer should handle inequalities better.
> BETWEEN is a special case and informs the final result. If we assume a
> selectivity of s for inequality, then BETWEEN should be something like s/2.
> The intuition is that if c >= x includes, say, ⅓ of values, and c <= y
> includes a third of values, then c BETWEEN x AND y should be a narrower set
> of values, say ⅙.
> [Ramakrishnan an
> Gherke|http://pages.cs.wisc.edu/~dbbook/openAccess/Minibase/optimizer/costformula.html\
> recommend 0.4 for between, 0.3 for inequality, and 0.3^2 = 0.09 for the
> general expression x <= c AND c <= Y. Note the discrepancy between the
> compound inequality case and the BETWEEN case, likely reflecting the
> additional information we obtain when the user chooses to use BETWEEN.
> To implement a special BETWEEN selectivity in Impala, we must remember the
> selectivity of BETWEEN during the rewrite to a compound inequality.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]