[ https://issues.apache.org/jira/browse/CALCITE-4958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

hujiahua updated CALCITE-4958:
------------------------------
    Description: 
When an IN-list predicate is used in a WHERE clause and 
DEFAULT_IN_SUB_QUERY_THRESHOLD is set below the number of IN-list elements, the 
IN-list predicate is converted to a join. When the resulting VALUES clause also 
contains dynamic parameters (e.g. "x IN (?, ?, ... ?)"), I found that each 
dynamic parameter is converted to its own LogicalValues. A large number of 
dynamic parameters leads to poor performance.

Here is my test, with DEFAULT_IN_SUB_QUERY_THRESHOLD set to 2:
{code:java}
      final String sql =
          "select * from \"TEST\".\"DEPTS\" where \"NAME\" in ( ?, ?, ?)";
      final PreparedStatement statement2 =
              calciteConnection.prepareStatement(sql);

      statement2.setString(1, "Sales");
      statement2.setString(2, "Sales2");
      statement2.setString(3, "Sales3");
      final ResultSet resultSet1 = statement2.executeQuery();{code}
The logical plan then looks like this:
{noformat}
LogicalProject(DEPTNO=[$0], NAME=[$1])
  LogicalJoin(condition=[=($1, $2)], joinType=[inner])
    LogicalTableScan(table=[[TEST, DEPTS]])
    LogicalAggregate(group=[{0}])
      LogicalUnion(all=[true])
        LogicalProject(EXPR$0=[?0])
          LogicalValues(tuples=[[{ 0 }]])
        LogicalProject(EXPR$0=[?1])
          LogicalValues(tuples=[[{ 0 }]])
        LogicalProject(EXPR$0=[?2])
          LogicalValues(tuples=[[{ 0 }]])
{noformat}
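
To reproduce this with a longer IN-list, a helper along the following lines might be used. It is only a sketch: InListSql and its methods are hypothetical names, not part of Calcite, and the node count assumes the plan keeps the shape shown above (one LogicalUnion plus one LogicalProject/LogicalValues pair per dynamic parameter).

```java
import java.util.Collections;

/** Hypothetical helper for reproducing the issue with larger IN-lists. */
final class InListSql {
  private InListSql() {}

  /** Returns n comma-separated "?" placeholders, e.g. "?, ?, ?" for n = 3. */
  static String placeholders(int n) {
    if (n < 1) {
      throw new IllegalArgumentException("IN-list needs at least one element");
    }
    return String.join(", ", Collections.nCopies(n, "?"));
  }

  /** Builds the test query from this report with n dynamic parameters. */
  static String query(int n) {
    return "select * from \"TEST\".\"DEPTS\" where \"NAME\" in ("
        + placeholders(n) + ")";
  }

  /**
   * Counts the relational nodes below LogicalAggregate, ASSUMING the plan
   * shape from this report: one LogicalUnion plus a
   * LogicalProject/LogicalValues pair for each dynamic parameter.
   */
  static int nodesBelowAggregate(int n) {
    return 1 + 2 * n;
  }
}
```

Under that assumption, InListSql.nodesBelowAggregate(3) gives 7, which matches the three-parameter plan above, and the count grows linearly with the number of parameters.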

  was:
When an IN-list predicate is used in a WHERE clause and 
DEFAULT_IN_SUB_QUERY_THRESHOLD is set below the number of IN-list elements, the 
IN-list predicate is converted to a join. When the resulting VALUES clause also 
contains dynamic parameters, I found that each dynamic parameter is converted 
to its own LogicalValues. A large number of dynamic parameters leads to poor 
performance.

First I set SqlToRelConverter.DEFAULT_IN_SUB_QUERY_THRESHOLD = 2.
Then I use dynamic parameters in a query like this:
{noformat}
select * from DEPTS where NAME in ( ?, ?, ?)
{noformat}
The IN-list is converted to a union of three projects:
{noformat}
LogicalProject(DEPTNO=[$0], NAME=[$1])
  LogicalJoin(condition=[=($1, $2)], joinType=[inner])
    LogicalTableScan(table=[[TEST, DEPTS]])
    LogicalAggregate(group=[{0}])
      LogicalUnion(all=[true])
        LogicalProject(EXPR$0=[?0])
          LogicalValues(tuples=[[{ 0 }]])
        LogicalProject(EXPR$0=[?1])
          LogicalValues(tuples=[[{ 0 }]])
        LogicalProject(EXPR$0=[?2])
          LogicalValues(tuples=[[{ 0 }]])
{noformat}
But suppose I do not use dynamic parameters, as in this query:
{noformat}
select * from DEPTS where NAME in ( 'a', 'b', 'c') 
{noformat}
The IN-list becomes a single LogicalValues, which is what I wanted:
{noformat}
LogicalProject(DEPTNO=[$0], NAME=[$1])
  LogicalJoin(condition=[=($1, $2)], joinType=[inner])
    LogicalTableScan(table=[[TEST, DEPTS]])
    LogicalAggregate(group=[{0}])
      LogicalValues(tuples=[[{ 'a' }, { 'b' }, { 'c' }]])
{noformat}
Here is my test, with DEFAULT_IN_SUB_QUERY_THRESHOLD set to 2:
{code:java}
      final String sql =
          "select * from \"TEST\".\"DEPTS\" where \"NAME\" in ( ?, ?, ?)";
      final PreparedStatement statement2 =
              calciteConnection.prepareStatement(sql);

      statement2.setString(1, "Sales");
      statement2.setString(2, "Sales2");
      statement2.setString(3, "Sales3");
      final ResultSet resultSet1 = statement2.executeQuery();
{code}


> Poor performance execute plan when use dynamic parameters in IN-list clause
> ---------------------------------------------------------------------------
>
>                 Key: CALCITE-4958
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4958
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: hujiahua
>            Priority: Minor
>
> When an IN-list predicate is used in a WHERE clause and 
> DEFAULT_IN_SUB_QUERY_THRESHOLD is set below the number of IN-list elements, 
> the IN-list predicate is converted to a join. When the resulting VALUES 
> clause also contains dynamic parameters (e.g. "x IN (?, ?, ... ?)"), I found 
> that each dynamic parameter is converted to its own LogicalValues. A large 
> number of dynamic parameters leads to poor performance.
> Here is my test, with DEFAULT_IN_SUB_QUERY_THRESHOLD set to 2:
> {code:java}
>       final String sql =
>           "select * from \"TEST\".\"DEPTS\" where \"NAME\" in ( ?, ?, ?)";
>       final PreparedStatement statement2 =
>               calciteConnection.prepareStatement(sql);
>       statement2.setString(1, "Sales");
>       statement2.setString(2, "Sales2");
>       statement2.setString(3, "Sales3");
>       final ResultSet resultSet1 = statement2.executeQuery();{code}
> The logical plan then looks like this:
> {noformat}
> LogicalProject(DEPTNO=[$0], NAME=[$1])
>   LogicalJoin(condition=[=($1, $2)], joinType=[inner])
>     LogicalTableScan(table=[[TEST, DEPTS]])
>     LogicalAggregate(group=[{0}])
>       LogicalUnion(all=[true])
>         LogicalProject(EXPR$0=[?0])
>           LogicalValues(tuples=[[{ 0 }]])
>         LogicalProject(EXPR$0=[?1])
>           LogicalValues(tuples=[[{ 0 }]])
>         LogicalProject(EXPR$0=[?2])
>           LogicalValues(tuples=[[{ 0 }]])
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
