[
https://issues.apache.org/jira/browse/CALCITE-6887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yu Xu updated CALCITE-6887:
---------------------------
Description:
Currently the IN operator does not deduplicate its values.
For example, it would be better to transform *in
(1,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,1)* into *in (1,2,3)*, but currently
the list stays as *in (1,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,1)* with the
duplicates kept.
Consider the following case:
*test case:*
{code:java}
@Test void testReduceExpressionsWithIn() {
  final String sql = "select deptno, sal "
      + "from emp "
      + "where deptno in (1,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,1) ";
  sql(sql).withRule(CoreRules.PROJECT_REDUCE_EXPRESSIONS)
      .check();
}
{code}
*plan would be:*
{code:java}
LogicalProject(DEPTNO=[$7], SAL=[$5])
  LogicalFilter(condition=[IN($7, {
LogicalValues(tuples=[[{ 1 }, { 1 }, { 2 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 3 }, { 1 }]])
})])
    LogicalTableScan(table=[[CATALOG, SALES, EMP]])
{code}
*We should deduplicate the Values. A more generic and simpler approach is to
add an AggregateValueReduceRule that makes the Values distinct,*
*which can convert the plan to:*
{code:java}
LogicalProject(DEPTNO=[$7], SAL=[$5])
  LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
    LogicalJoin(condition=[=($7, $9)], joinType=[inner])
      LogicalTableScan(table=[[CATALOG, SALES, EMP]])
      LogicalValues(tuples=[[{ 1 }, { 3 }]])
{code}
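As a rough illustration of the deduplication step only (not the proposed rule itself), the sketch below collapses duplicate rows of a constant Values list while keeping first-occurrence order. The class and method names (DistinctValuesSketch, distinctTuples) are hypothetical, and it uses plain java.util collections instead of Calcite's own tuple and literal types.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

/** Hypothetical sketch; not part of Calcite. */
public class DistinctValuesSketch {

  /**
   * Returns the rows with duplicates removed, keeping the order in which
   * each distinct row first appears.
   */
  static <T> List<List<T>> distinctTuples(List<List<T>> tuples) {
    // List.equals/hashCode compare element by element, so two rows with the
    // same values are treated as the same entry by the LinkedHashSet.
    return new ArrayList<>(new LinkedHashSet<>(tuples));
  }

  public static void main(String[] args) {
    // Single-column rows, shaped like the tuples of a Values node.
    List<List<Integer>> rows = new ArrayList<>();
    for (int v : new int[] {1, 1, 2, 1, 1, 3, 1}) {
      rows.add(List.of(v));
    }
    System.out.println(distinctTuples(rows));
  }
}
```

A real rule would presumably need to compare literals by both value and type; this sketch relies only on the element type's equals method.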
was:
Currently IN operator would not distinct values.
for example *in (1,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,1)* transform to *in
(1,2,3)* is better, but currently would be *in
(1,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,1)* without distinct.
Like case follow:
*test case:*
@Test void testReduceExpressionsWithIn()
{ final String sql = "select deptno, sal " + "from emp " + "where deptno in
(1,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,3,1) ";
sql(sql).withRule(CoreRules.PROJECT_REDUCE_EXPRESSIONS) .check(); }
*plan would be:*
LogicalProject(DEPTNO=[$7], SAL=[$5])
LogicalFilter(condition=[IN($7,
{ LogicalValues(tuples=[[
{ 1 }, { 1 }, { 2 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 },
{ 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 1 }, { 3 },
{ 1 }]])
})])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
we should distinct Values, consider a
**
> ReduceExpressionsRule applied to 'IN subquery' should make the values
> distinct if the subquery is a constant Values
> -------------------------------------------------------------------------------------------------------------------
>
> Key: CALCITE-6887
> URL: https://issues.apache.org/jira/browse/CALCITE-6887
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.38.0
> Reporter: Yu Xu
> Assignee: Yu Xu
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.40.0
>
>