[
https://issues.apache.org/jira/browse/IMPALA-8030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17874359#comment-17874359
]
Michael Smith edited comment on IMPALA-8030 at 8/16/24 7:33 PM:
----------------------------------------------------------------
Inconsistent analysis of new predicates created by rewrite rules causes later
rules not to be applied. IMPALA-13302 addresses the simple case of
{code:java}
where c.c_custkey = 10 OR 10 = c.c_custkey {code}
via NormalizeBinaryPredicatesRule -> ExtractCommonConjunctRule. But the
existing rewrite rules aren't able to handle {{(10, 20, 30, 30, 10, 20)}}.
was (Author: JIRAUSER288956):
Inconsistency analyzing new predicates in rewrite rules leads to failing to
apply later rules.
> Remove duplicate in-clause values for selectivity calcs
> -------------------------------------------------------
>
> Key: IMPALA-8030
> URL: https://issues.apache.org/jira/browse/IMPALA-8030
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Affects Versions: Impala 3.1.0
> Reporter: Paul Rogers
> Priority: Minor
>
> If an IN clause has duplicate values, they should be removed so that
> selectivity estimates are based only on unique values.
> {noformat}
> select *
> from tpch.customer c
> where c.c_custkey in (10, 20, 30, 30, 10, 20)
> ---- PLAN
> PLAN-ROOT SINK
> |
> 00:SCAN HDFS [tpch.customer c]
> partitions=1/1 files=1 size=23.08MB row-size=218B cardinality=6
> predicates: c.c_custkey IN (10, 20, 30, 30, 10, 20)
> {noformat}
> Expected:
> {noformat}
> 00:SCAN HDFS [tpch.customer c]
> partitions=1/1 files=1 size=23.08MB row-size=218B cardinality=3
> {noformat}
> Notice that the current version treats each value, duplicate or not, as a
> match (cardinality=6). In the expected result, duplicate values match only
> once, so the estimate counts only the unique values (cardinality=3).
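
To make the arithmetic in the description concrete, here is a minimal Java sketch. It is not Impala's implementation: the class and method names are hypothetical, and the selectivity model (number of IN-list values divided by the column's NDV, capped at 1) is an assumption chosen because it reproduces the cardinalities shown in the plans. The 150,000 row count and NDV are likewise assumed values for tpch.customer, not figures taken from this issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical sketch, not Impala code: shows how removing duplicate
// IN-list values changes a simple cardinality estimate.
public class InListDedupSketch {

  // Drop duplicate literals while keeping first-seen order.
  static List<Long> dedup(List<Long> values) {
    return new ArrayList<>(new LinkedHashSet<>(values));
  }

  // Assumed model: selectivity = numValues / ndv, capped at 1.0,
  // and cardinality = numRows * selectivity, rounded.
  static long estimateCardinality(int numValues, long ndv, long numRows) {
    double selectivity = Math.min(1.0, (double) numValues / ndv);
    return Math.round(numRows * selectivity);
  }

  public static void main(String[] args) {
    List<Long> inList = Arrays.asList(10L, 20L, 30L, 30L, 10L, 20L);
    List<Long> unique = dedup(inList);

    // Assumed statistics for tpch.customer.c_custkey (a unique key).
    long ndv = 150_000L;
    long numRows = 150_000L;

    System.out.println(unique);                                           // [10, 20, 30]
    System.out.println(estimateCardinality(inList.size(), ndv, numRows)); // 6
    System.out.println(estimateCardinality(unique.size(), ndv, numRows)); // 3
  }
}
```

Under these assumptions, the raw list yields an estimate of 6 and the deduplicated list yields 3, matching the current and expected plans above.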