[ https://issues.apache.org/jira/browse/IMPALA-8030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17874359#comment-17874359 ]

Michael Smith edited comment on IMPALA-8030 at 8/16/24 7:33 PM:
----------------------------------------------------------------

Inconsistent analysis of new predicates in rewrite rules causes later rules 
not to be applied. IMPALA-13302 addresses the simple case of
{code:java}
where c.c_custkey = 10 OR 10 = c.c_custkey
{code}
via NormalizeBinaryPredicatesRule -> ExtractCommonConjunctRule. But the 
existing rewrite rules cannot handle the duplicate values in an IN list such 
as {{(10, 20, 30, 30, 10, 20)}}.
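
To illustrate why the normalize-then-deduplicate order matters for the simple OR case, here is a toy sketch. It does not use Impala's Expr classes or the actual rule implementations: the Eq record, isLiteral, normalize, and simplifyDisjunction are all hypothetical names invented for this example, and treating an operand as a literal when it looks like an integer is a simplifying assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class DisjunctRewriteSketch {
  /** Toy equality predicate "left = right"; operands are plain strings. */
  record Eq(String left, String right) {}

  /** Assumption for this sketch: an operand is a literal if it is an integer. */
  static boolean isLiteral(String operand) {
    return operand.matches("-?\\d+");
  }

  /** Puts the non-literal operand on the left: "10 = c" becomes "c = 10". */
  static Eq normalize(Eq p) {
    boolean literalOnLeft = isLiteral(p.left()) && !isLiteral(p.right());
    return literalOnLeft ? new Eq(p.right(), p.left()) : p;
  }

  /** Normalizes each disjunct of an OR, then drops disjuncts that became identical. */
  static List<Eq> simplifyDisjunction(List<Eq> disjuncts) {
    Set<Eq> unique = new LinkedHashSet<>();
    for (Eq p : disjuncts) {
      unique.add(normalize(p));
    }
    return new ArrayList<>(unique);
  }

  public static void main(String[] args) {
    List<Eq> or = List.of(new Eq("c.c_custkey", "10"), new Eq("10", "c.c_custkey"));
    System.out.println(simplifyDisjunction(or));
  }
}
```

Without the normalization step the two disjuncts compare as different and both survive, which mirrors how a later rule can miss a simplification when an earlier rewrite has not been applied consistently.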


was (Author: JIRAUSER288956):
Inconsistency analyzing new predicates in rewrite rules leads to failing to 
apply later rules.

> Remove duplicate in-clause values for selectivity calcs
> -------------------------------------------------------
>
>                 Key: IMPALA-8030
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8030
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>    Affects Versions: Impala 3.1.0
>            Reporter: Paul Rogers
>            Priority: Minor
>
> If an IN clause has duplicate values, they should be removed so that 
> selectivity estimates are based only on unique values.
> {noformat}
> select *
> from tpch.customer c
> where c.c_custkey in (10, 20, 30, 30, 10, 20)
> ---- PLAN
> PLAN-ROOT SINK
> |
> 00:SCAN HDFS [tpch.customer c]
>    partitions=1/1 files=1 size=23.08MB row-size=218B cardinality=6
>    predicates: c.c_custkey IN (10, 20, 30, 30, 10, 20)
> {noformat}
> Expected:
> {noformat}
> 00:SCAN HDFS [tpch.customer c]
>    partitions=1/1 files=1 size=23.08MB row-size=218B cardinality=3
> {noformat}
> In the current version, we count every listed value, duplicate or not, as 
> a match. In the expected result, duplicate values match only once, so the 
> estimate covers only the unique values.
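
The requested behavior can be sketched as follows. This is a minimal illustration, not Impala's planner code: the class and method names are hypothetical, the estimate assumes values are uniformly distributed over the column's distinct values, and the row count and NDV of 150,000 used in main are assumed figures for a unique key column.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

final class InListDedupSketch {
  /** Returns the IN-list values with duplicates removed, keeping first-seen order. */
  static List<Long> distinctValues(List<Long> inListValues) {
    return new ArrayList<>(new LinkedHashSet<>(inListValues));
  }

  /**
   * Estimates matching rows for "col IN (...)" under a uniform-distribution
   * assumption: selectivity = numInValues / ndv, capped at 1.0.
   */
  static long estimateCardinality(int numInValues, long rowCount, long ndv) {
    if (rowCount <= 0 || ndv <= 0) {
      return 0;
    }
    double selectivity = Math.min(1.0, (double) numInValues / ndv);
    return Math.round(selectivity * rowCount);
  }

  public static void main(String[] args) {
    List<Long> raw = List.of(10L, 20L, 30L, 30L, 10L, 20L);
    List<Long> unique = distinctValues(raw);
    long rowCount = 150_000L; // assumed table size
    long ndv = 150_000L;      // assumed: key column, every value distinct
    System.out.println("with duplicates: " + estimateCardinality(raw.size(), rowCount, ndv));
    System.out.println("deduplicated:    " + estimateCardinality(unique.size(), rowCount, ndv));
  }
}
```

Under these assumptions the raw list yields an estimate of 6 and the deduplicated list yields 3, matching the current and expected cardinalities in the plans above.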


