[jira] [Commented] (CALCITE-2683) ProjectMergeRule should not be performed when Nondeterministic udf has been referenced more than once

Julian Hyde (JIRA) Sun, 18 Nov 2018 13:42:41 -0800


    [ 
https://issues.apache.org/jira/browse/CALCITE-2683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16691102#comment-16691102
 ]


Julian Hyde commented on CALCITE-2683:
--------------------------------------

It's difficult to know how to handle non-deterministic UDFs. You can't please 
everyone. But I do find your argument compelling, that if the rewritten version 
contains the same number of calls to the UDF, it should be OK.

There are other possible semantics. For instance, you could allow a rewrite 
only if the UDF is guaranteed to be called the same number of times and in the 
same order.

Perhaps there could be variants of this rule, one for each semantics, and the 
semantics could be chosen via a connection property. Other rules would be 
affected too, but together they would ensure the semantics that the user wants.
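
To make the two candidate semantics concrete, here is a minimal sketch in plain 
Java. It is not Calcite code: the class UdfRewriteCheck, the Semantics enum and 
the function name MY_UDF are all hypothetical, and each plan is reduced to the 
list of non-deterministic calls it makes, in evaluation order. A real rule 
would have to derive those lists from the expressions it rewrites.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch; none of these names exist in Calcite. */
final class UdfRewriteCheck {
  /** The two candidate semantics from the discussion (names are made up). */
  enum Semantics { SAME_COUNT, SAME_COUNT_AND_ORDER }

  /**
   * Returns whether a rewrite is acceptable under the given semantics.
   * Each list names the non-deterministic calls of a plan, in evaluation order.
   */
  static boolean allows(Semantics semantics, List<String> callsBefore,
      List<String> callsAfter) {
    if (semantics == Semantics.SAME_COUNT_AND_ORDER) {
      // Same calls, same number of times, same order.
      return callsBefore.equals(callsAfter);
    }
    // SAME_COUNT: compare as multisets by sorting copies of both lists.
    List<String> before = new ArrayList<>(callsBefore);
    List<String> after = new ArrayList<>(callsAfter);
    Collections.sort(before);
    Collections.sort(after);
    return before.equals(after);
  }

  public static void main(String[] args) {
    List<String> original = Arrays.asList("RAND", "MY_UDF");
    List<String> reordered = Arrays.asList("MY_UDF", "RAND");
    List<String> duplicated = Arrays.asList("RAND", "RAND", "MY_UDF");
    for (Semantics s : Semantics.values()) {
      System.out.println(s + " reordered: " + allows(s, original, reordered));
      System.out.println(s + " duplicated: " + allows(s, original, duplicated));
    }
  }
}
```

In this sketch the semantics is passed as an argument; under the proposal above 
it would instead come from a connection property, and every affected rule would 
consult the same setting.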

Can you start a discussion on the dev list? I'd like to hear what other 
people's requirements are.

> ProjectMergeRule should not be performed when Nondeterministic udf has been 
> referenced more than once
> -----------------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-2683
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2683
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Hequn Cheng
>            Assignee: Hequn Cheng
>            Priority: Major
>
> Currently, there are several merge rules for projects, such as 
> {{CalcMergeRule}}, {{ProjectMergeRule}}, and {{ProjectCalcMergeRule}}. I 
> found that these merge rules should not be applied when a non-deterministic 
> expression of the bottom project is referenced more than once by the top 
> project. Take the following test as an example:
> {code:java}
>   @Test public void testProjectMergeCalcMergeWithNonDeterministic() throws Exception {
>     HepProgram program = new HepProgramBuilder()
>             .addRuleInstance(FilterProjectTransposeRule.INSTANCE)
>             .addRuleInstance(ProjectMergeRule.INSTANCE)
>             .build();
>     checkPlanning(program,
>             "select name, a as a1, a as a2 from (\n"
>                     + "  select *, rand() as a\n"
>                     + "  from dept)\n"
>                     + "where deptno = 10\n");
>   }
> {code}
> The first select generates `a` from `rand()` and the second select generates 
> `a1` and `a2` from `a`. According to the SQL, `a1` should equal `a2`.
> Let's take a look at the resulting plan:
> {code:java}
> LogicalProject(NAME=[$1], A1=[RAND()], A2=[RAND()])
>   LogicalFilter(condition=[=($0, 10)])
>     LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {code}
> In the plan, a1 may not equal a2 because the two projects were merged and 
> each column now evaluates RAND() separately, which contradicts the SQL (a1 
> equals a2).
> One option to solve the problem is to disable these merge rules in such 
> cases, so that the result plan will be:
> {code:java}
> LogicalProject(NAME=[$1], A1=[$2], A2=[$2])
>   LogicalProject(DEPTNO=[$0], NAME=[$1], A=[RAND()])
>     LogicalFilter(condition=[=($0, 10)])
>       LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {code}
> Any suggestions are greatly appreciated.
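
The condition described in the issue can be sketched as a small stand-alone 
check. This is an illustration only, assuming a simplified model: it does not 
use Calcite's RexNode or rule API, and ProjectMergeCheck, BottomExpr and 
canMerge are hypothetical names. The bottom project is a list of expressions 
flagged as deterministic or not, and the top project is reduced to the list of 
bottom field indexes it reads.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; does not use Calcite's RexNode or rule API. */
final class ProjectMergeCheck {
  /** One output expression of the bottom project. */
  static final class BottomExpr {
    final String text;
    final boolean deterministic;

    BottomExpr(String text, boolean deterministic) {
      this.text = text;
      this.deterministic = deterministic;
    }
  }

  /**
   * Returns false if any non-deterministic bottom expression is read more
   * than once by the top project; merging would then duplicate the call.
   *
   * @param bottom  expressions of the bottom project, by field index
   * @param topRefs bottom field index for every input reference in the top project
   */
  static boolean canMerge(List<BottomExpr> bottom, List<Integer> topRefs) {
    int[] refCount = new int[bottom.size()];
    for (int ref : topRefs) {
      refCount[ref]++;
    }
    for (int i = 0; i < bottom.size(); i++) {
      if (!bottom.get(i).deterministic && refCount[i] > 1) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // Bottom project from the issue: DEPTNO=[$0], NAME=[$1], A=[RAND()].
    List<BottomExpr> bottom = Arrays.asList(
        new BottomExpr("$0", true),
        new BottomExpr("$1", true),
        new BottomExpr("RAND()", false));
    // Top project from the issue: NAME=[$1], A1=[$2], A2=[$2].
    System.out.println(canMerge(bottom, Arrays.asList(1, 2, 2)));
    // A top project that reads A only once.
    System.out.println(canMerge(bottom, Arrays.asList(1, 2)));
  }
}
```

For the query in the issue the check returns false, because RAND() is read 
twice, so the two projects would be kept separate. The sketch does not cover a 
non-deterministic expression that the top project never reads; merging would 
drop that call, which also changes the number of calls.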



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
