[jira] [Updated] (CALCITE-2683) ProjectMergeRule should not be performed when Nondeterministic udf has been referenced more than once

Hequn Cheng (JIRA) Sun, 18 Nov 2018 00:44:13 -0800


     [ https://issues.apache.org/jira/browse/CALCITE-2683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Hequn Cheng updated CALCITE-2683:
---------------------------------
    Description: 
Calcite currently has several merge rules for projections, such as 
{{CalcMergeRule}}, {{ProjectMergeRule}}, and {{ProjectCalcMergeRule}}. These 
rules should not fire when a non-deterministic expression in the bottom 
project is referenced more than once by the top project. Take the following 
test as an example:
{code:java}
  @Test public void testProjectMergeCalcMergeWithNonDeterministic() throws Exception {
    HepProgram program = new HepProgramBuilder()
            .addRuleInstance(FilterProjectTransposeRule.INSTANCE)
            .addRuleInstance(ProjectMergeRule.INSTANCE)
            .build();

    checkPlanning(program,
            "select name, a as a1, a as a2 from (\n"
                    + "  select *, rand() as a\n"
                    + "  from dept)\n"
                    + "where deptno = 10\n");
  }
{code}
The inner select computes `a` from `rand()`, and the outer select derives 
`a1` and `a2` from `a`. According to the SQL, `a1` must equal `a2`.
The resulting plan is:
{code:java}
LogicalProject(NAME=[$1], A1=[RAND()], A2=[RAND()])
  LogicalFilter(condition=[=($0, 10)])
    LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
{code}
In this plan, {color:#FF0000}a1{color} may not equal {color:#FF0000}a2{color} 
because merging the two projects inlines `RAND()` into both output columns, so 
it is evaluated twice per row. This contradicts the SQL, where a1 equals a2.
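
To make the difference concrete, here is a minimal standalone sketch in plain Java. It does not use any Calcite API; the class and method names are hypothetical, and {{java.util.Random}} merely stands in for `RAND()`. It evaluates one row under each plan shape:

```java
import java.util.Random;

// Hypothetical illustration only; not part of Calcite.
final class RandMergeDemo {
  /** Unmerged plan: the bottom project evaluates RAND() once, the top project reads it twice. */
  static double[] unmerged(Random rnd) {
    double a = rnd.nextDouble();   // A=[RAND()]
    return new double[] {a, a};    // A1=[$2], A2=[$2]
  }

  /** Merged plan: the reference is inlined, so RAND() is evaluated once per output column. */
  static double[] merged(Random rnd) {
    return new double[] {rnd.nextDouble(), rnd.nextDouble()};  // A1=[RAND()], A2=[RAND()]
  }

  public static void main(String[] args) {
    double[] u = unmerged(new Random(42));
    double[] m = merged(new Random(42));
    System.out.println("unmerged: a1 == a2 ? " + (u[0] == u[1]));
    System.out.println("merged:   a1 == a2 ? " + (m[0] == m[1]));
  }
}
```

In the unmerged shape the two columns are always equal; in the merged shape they come from two independent draws.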

One option is to disable these merge rules in such cases, so that the 
resulting plan would be:
{code:java}
LogicalProject(NAME=[$1], A1=[$2], A2=[$2])
  LogicalProject(DEPTNO=[$0], NAME=[$1], A=[RAND()])
    LogicalFilter(condition=[=($0, 10)])
      LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
{code}
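
The condition for skipping the merge could look roughly like the sketch below. This is only an illustration under simplifying assumptions, not the actual rule code: the bottom project is modelled as an array of "is deterministic" flags, the top project as a flat array of the bottom-field indexes it references, and {{canMerge}} is a hypothetical helper name.

```java
// Hypothetical sketch of the guard; not Calcite code.
final class MergeGuardSketch {
  /**
   * Returns false if any non-deterministic bottom expression is referenced
   * more than once by the top project, true otherwise.
   *
   * @param bottomDeterministic flag per bottom-project expression (false for RAND())
   * @param topRefs index of the bottom expression read by each top-project reference
   */
  static boolean canMerge(boolean[] bottomDeterministic, int[] topRefs) {
    int[] refCount = new int[bottomDeterministic.length];
    for (int ref : topRefs) {
      refCount[ref]++;
      if (!bottomDeterministic[ref] && refCount[ref] > 1) {
        return false;  // inlining would evaluate the expression more than once
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // Bottom project: DEPTNO=[$0], NAME=[$1], A=[RAND()]
    boolean[] bottom = {true, true, false};
    // Top project "name, a as a1, a as a2" references fields 1, 2, 2.
    System.out.println(canMerge(bottom, new int[] {1, 2, 2}));
    // Top project "name, a" references the non-deterministic field only once.
    System.out.println(canMerge(bottom, new int[] {1, 2}));
  }
}
```

With this check, the query in the test above would keep both projects, while a top project that reads `a` only once could still be merged.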
Any suggestions are greatly appreciated.

  was:
Currently, there are some merge rules for project, such as {{CalcMergeRule}}, 
{{ProjectMergeRule}}, and {{ProjectCalcMergeRule}}. I found that these merge 
rules should not be performed when Nondeterministic expression of the bottom 
project has been referenced more than once by the top project. Take the 
following test as an example:
{code:java}
  @Test public void testProjectMergeCalcMergeWithNonDeterministic() throws Exception {
    HepProgram program = new HepProgramBuilder()
            .addRuleInstance(FilterProjectTransposeRule.INSTANCE)
            .addRuleInstance(ProjectMergeRule.INSTANCE)
            .build();

    checkPlanning(program,
            "select name, a as a1, a as a2 from (\n"
                    + "  select *, rand() as a\n"
                    + "  from dept)\n"
                    + "where deptno = 10\n");
  }
{code}
The result plan is 
{code:java}
LogicalProject(NAME=[$1], {color:red}A1{color}=[RAND()], 
{color:red}A2{color}=[RAND()])
  LogicalFilter(condition=[=($0, 10)])
    LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
{code}
In the plan, {color:red}A1{color} may not equal to {color:red}A2{color} since 
the two projects are merged which is against the SQL.
One option to solve the problem is to disable these merge rules in such cases. 
What do you guys think? Any suggestions are greatly appreciated.



> ProjectMergeRule should not be performed when Nondeterministic udf has been 
> referenced more than once
> -----------------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-2683
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2683
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Hequn Cheng
>            Assignee: Hequn Cheng
>            Priority: Major
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
