[jira] [Commented] (CALCITE-2348) Handling non-deterministic operator in rules
[ https://issues.apache.org/jira/browse/CALCITE-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17040298#comment-17040298 ]

Julian Hyde commented on CALCITE-2348:
--------------------------------------

I added some suggestions to CALCITE-3760 - namely to ensure that non-deterministic function calls only occur at the top of an expression in a Project - that I think would be useful here.

For the record, I think this PR is pretty good. It could be re-worked to be consistent with the 'non-deterministic always on top' rule.

I'm still not sure whether non-deterministic functions can be pushed down. But I am inclined to believe [~godfreyhe] that they can.

> Handling non-deterministic operator in rules
> --------------------------------------------
>
>                 Key: CALCITE-2348
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2348
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.17.0
>            Reporter: godfrey he
>            Priority: Major
>              Labels: pull-request-available
>
> Currently, rules do not handle non-deterministic operator,
> e.g. FilterAggregateTransposeRule can't push down a non-deterministic filter
> through an aggregate.
> {code:java}
> // rand_substr is a non-deterministic udf
> @Test public void testPushFilterPastAggWithNondeterministicFilter() {
>   final String sql = "select ename, empno, c from\n"
>       + " (select ename, empno, count(*) as c from emp group by ename, empno) t\n"
>       + " where rand_substr(ename, 1, 3) = 'Tom' and empno = 10";
>   checkPlanning(FilterAggregateTransposeRule.INSTANCE, sql);
> }{code}
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
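The 'non-deterministic always on top' idea from the comment above could be checked roughly as follows. This is only a sketch on a toy expression tree, not Calcite's RexNode API: the class `TopOnlyCheckDemo`, the `Node` type, the operator set, and the `UPPER` call are all hypothetical, and the exact rule proposed in CALCITE-3760 may differ.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy check that a non-deterministic call appears only at the root of a Project expression. */
final class TopOnlyCheckDemo {
  /** Minimal expression node: an operator name and its arguments (empty for leaves). */
  static final class Node {
    final String op;
    final List<Node> args;
    Node(String op, Node... args) {
      this.op = op;
      this.args = Arrays.asList(args);
    }
  }

  /** Hypothetical set of operators treated as non-deterministic. */
  static final Set<String> NON_DETERMINISTIC =
      new HashSet<>(Arrays.asList("RAND", "RAND_SUBSTR"));

  /** True if neither the node nor any node below it is non-deterministic. */
  static boolean isDeterministic(Node n) {
    if (NON_DETERMINISTIC.contains(n.op)) {
      return false;
    }
    for (Node arg : n.args) {
      if (!isDeterministic(arg)) {
        return false;
      }
    }
    return true;
  }

  /** The root may be non-deterministic, but every argument subtree must be deterministic. */
  static boolean nonDeterministicOnlyOnTop(Node projectExpr) {
    for (Node arg : projectExpr.args) {
      if (!isDeterministic(arg)) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    Node onTop = new Node("RAND_SUBSTR", new Node("ENAME"), new Node("1"), new Node("3"));
    Node nested = new Node("UPPER", onTop);
    System.out.println(nonDeterministicOnlyOnTop(onTop));
    System.out.println(nonDeterministicOnlyOnTop(nested));
  }
}
```

Under this reading, an expression such as `UPPER(RAND_SUBSTR(ENAME, 1, 3))` would have to be split into two Projects so that the non-deterministic call sits at the top of its own expression.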
[jira] [Commented] (CALCITE-2348) handling non-deterministic operator in rules
[ https://issues.apache.org/jira/browse/CALCITE-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16501251#comment-16501251 ]

godfrey he commented on CALCITE-2348:
-------------------------------------

Sorry, I did not express myself clearly. Yes, I totally agree with you.

There are two examples that I am thinking of now for different scenarios:

1. A case where the non-deterministic operator should not be pushed down:
{code:java}
// rand_substr is a non-deterministic udf
select ename, deptno from
  (select rand_substr(ename, 1, 3) as ename, deptno from emp) t
  where deptno > 10 and ename <> 'Tom';

before FilterProjectTransposeRule applied:
LogicalProject(ENAME=[$0], DEPTNO=[$1])
  LogicalFilter(condition=[AND(>($1, 10), <>($0, 'Tom'))])
    LogicalProject(ENAME=[RAND_SUBSTR($1, 1, 3)], DEPTNO=[$7])
      LogicalTableScan(table=[[CATALOG, SALES, EMP]])

after FilterProjectTransposeRule applied:
LogicalProject(ENAME=[$0], DEPTNO=[$1])
  LogicalProject(ENAME=[RAND_SUBSTR($1, 1, 3)], DEPTNO=[$7])
    LogicalFilter(condition=[AND(>($7, 10), <>(RAND_SUBSTR($1, 1, 3), 'Tom'))])
      LogicalTableScan(table=[[CATALOG, SALES, EMP]])
{code}
The values of 'ename' should not contain 'Tom'. However, after FilterProjectTransposeRule is applied, 'Tom' may be in the result, because the filter and the project each evaluate rand_substr separately.

2. A case where the non-deterministic operator can be pushed down:
{code:java}
// rand_substr is a non-deterministic udf
select ename, deptno from
  (select ename, deptno from emp) t
  where deptno > 10 and rand_substr(ename, 1, 3) <> 'Tom';

before FilterProjectTransposeRule applied:
LogicalProject(ENAME=[$0], DEPTNO=[$1])
  LogicalFilter(condition=[AND(>($1, 10), <>(RAND_SUBSTR($0, 1, 3), 'Tom'))])
    LogicalProject(ENAME=[$1], DEPTNO=[$7])
      LogicalTableScan(table=[[CATALOG, SALES, EMP]])

after FilterProjectTransposeRule applied:
LogicalProject(ENAME=[$0], DEPTNO=[$1])
  LogicalProject(ENAME=[$1], DEPTNO=[$7])
    LogicalFilter(condition=[AND(<>(RAND_SUBSTR($1, 1, 3), 'Tom'), >($7, 10))])
      LogicalTableScan(table=[[CATALOG, SALES, EMP]])
{code}

> handling non-deterministic operator in rules
> --------------------------------------------
>
>                 Key: CALCITE-2348
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2348
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.17.0
>            Reporter: godfrey he
>            Assignee: Julian Hyde
>            Priority: Major
>
> Currently, rules do not handle non-deterministic operator,
> e.g. FilterAggregateTransposeRule can't push down a non-deterministic filter
> through an aggregate.
> {code:java}
> // rand_substr is a non-deterministic udf
> @Test public void testPushFilterPastAggWithNondeterministicFilter() {
>   final String sql = "select ename, empno, c from\n"
>       + " (select ename, empno, count(*) as c from emp group by ename, empno) t\n"
>       + " where rand_substr(ename, 1, 3) = 'Tom' and empno = 10";
>   checkPlanning(FilterAggregateTransposeRule.INSTANCE, sql);
> }{code}
>

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
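The first scenario in the comment above can be simulated without Calcite. The sketch below is an illustration only: `FlakySubstr` is a hypothetical stand-in for `rand_substr` that alternates between two values so the effect is reproducible, and the two methods hand-code the plans before and after the rule fires rather than running the planner.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Simulates a filter on a projected non-deterministic column, before and after push-down. */
final class NonDeterministicPushDownDemo {
  /** Hypothetical stand-in for rand_substr: returns "Bob", "Tom", "Bob", ... on successive calls. */
  static final class FlakySubstr {
    private int calls = 0;
    String apply(String ename) {
      return (calls++ % 2 == 0) ? "Bob" : "Tom";
    }
  }

  /** Original plan: the Project computes the value once and the Filter tests that same value. */
  static List<String> filterAboveProject(List<String> enames) {
    FlakySubstr f = new FlakySubstr();
    List<String> out = new ArrayList<>();
    for (String ename : enames) {
      String projected = f.apply(ename);    // Project
      if (!projected.equals("Tom")) {       // Filter on the projected column
        out.add(projected);
      }
    }
    return out;
  }

  /** Rewritten plan: the Filter evaluates the call itself, then the Project evaluates it again. */
  static List<String> filterBelowProject(List<String> enames) {
    FlakySubstr f = new FlakySubstr();
    List<String> out = new ArrayList<>();
    for (String ename : enames) {
      if (!f.apply(ename).equals("Tom")) {  // Filter: first evaluation
        out.add(f.apply(ename));            // Project: second, independent evaluation
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<String> enames = Arrays.asList("SMITH", "ALLEN", "WARD");
    System.out.println(filterAboveProject(enames));
    System.out.println(filterBelowProject(enames));
  }
}
```

In the original plan no output row can be 'Tom'. In the rewritten plan the value that passed the filter is not the value that is returned, so 'Tom' can appear.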
[jira] [Commented] (CALCITE-2348) handling non-deterministic operator in rules
[ https://issues.apache.org/jira/browse/CALCITE-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16500589#comment-16500589 ]

Julian Hyde commented on CALCITE-2348:
--------------------------------------

Not sure what you mean by "take care of". If it's not valid to push an operator down (because the operator is non-deterministic), the rules don't push it down.

Maybe there are some kinds of non-determinism where it would be OK to push an operator down. In which case, I'm happy to talk about other kinds of non-determinism.
[jira] [Commented] (CALCITE-2348) handling non-deterministic operator in rules
[ https://issues.apache.org/jira/browse/CALCITE-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16499701#comment-16499701 ]

godfrey he commented on CALCITE-2348:
-------------------------------------

Does that mean the rules don't take care of non-deterministic operators?

Maybe we can skip the transformation when the operator is non-deterministic, to guarantee that a RelOptRule only performs equivalent transformations, just like
[HiveFilterAggregateTransposeRule|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveFilterAggregateTransposeRule.java],
[HiveFilterJoinRule|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveFilterJoinRule.java],
[HiveFilterProjectTransposeRule|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveFilterProjectTransposeRule.java] and
[HiveFilterSetOpTransposeRule|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/rules/HiveFilterSetOpTransposeRule.java]
in Hive.

PR for this: [https://github.com/apache/calcite/pull/717]

Looking forward to your advice, many thanks!
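The "skip the transformation when the operator is non-deterministic" idea in the comment above amounts to a guard at the start of a rule. The sketch below shows that guard on a toy expression tree. It is not the code from the PR or from the Hive rules: `DeterminismGuardDemo`, `Expr`, `mayPushDown`, and the operator set are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of a rule that declines to fire on a non-deterministic filter condition. */
final class DeterminismGuardDemo {
  /** Minimal expression node: an operator name and its operands (empty for leaves). */
  static final class Expr {
    final String op;
    final List<Expr> operands;
    Expr(String op, Expr... operands) {
      this.op = op;
      this.operands = Arrays.asList(operands);
    }
  }

  /** Hypothetical set of operators treated as non-deterministic. */
  static final Set<String> NON_DETERMINISTIC =
      new HashSet<>(Arrays.asList("RAND", "RAND_SUBSTR"));

  /** An expression is deterministic only if it and all of its operands are. */
  static boolean isDeterministic(Expr e) {
    if (NON_DETERMINISTIC.contains(e.op)) {
      return false;
    }
    for (Expr operand : e.operands) {
      if (!isDeterministic(operand)) {
        return false;
      }
    }
    return true;
  }

  /** Guard a transpose rule would consult before rewriting the plan. */
  static boolean mayPushDown(Expr filterCondition) {
    return isDeterministic(filterCondition);
  }

  public static void main(String[] args) {
    Expr safe = new Expr("=", new Expr("EMPNO"), new Expr("10"));
    Expr unsafe = new Expr("=",
        new Expr("RAND_SUBSTR", new Expr("ENAME"), new Expr("1"), new Expr("3")),
        new Expr("'Tom'"));
    System.out.println(mayPushDown(safe));
    System.out.println(mayPushDown(unsafe));
  }
}
```

A guard this coarse is conservative: it also blocks rewrites that might be safe, such as the second scenario discussed elsewhere in this thread.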
[jira] [Commented] (CALCITE-2348) handling non-deterministic operator in rules
[ https://issues.apache.org/jira/browse/CALCITE-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16499133#comment-16499133 ]

Julian Hyde commented on CALCITE-2348:
--------------------------------------

This is by design. If the operator is non-deterministic and you push it down (or otherwise transform the query in such a way that it receives different calls, or the same calls in a different order) then the query will give different results.
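The point about a function receiving different calls can be shown with a small simulation. This is a sketch under stated assumptions, not Calcite code: `Sequence` is a hypothetical stateful function used in place of a truly random one so that the difference is reproducible, and the two methods hand-code the two operator orders.

```java
import java.util.ArrayList;
import java.util.List;

/** Shows that changing how often a stateful function is called changes the query result. */
final class CallOrderDemo {
  /** Hypothetical non-deterministic function: returns 1, 2, 3, ... on successive calls. */
  static final class Sequence {
    private int next = 1;
    int nextValue() {
      return next++;
    }
  }

  /** Project(seq) runs on every input row, then Filter(deptno > 10) drops some rows. */
  static List<Integer> projectThenFilter(int[] deptnos) {
    Sequence seq = new Sequence();
    List<Integer> out = new ArrayList<>();
    for (int deptno : deptnos) {
      int value = seq.nextValue();   // called once per input row
      if (deptno > 10) {
        out.add(value);
      }
    }
    return out;
  }

  /** Filter(deptno > 10) runs first, so the function is called only for surviving rows. */
  static List<Integer> filterThenProject(int[] deptnos) {
    Sequence seq = new Sequence();
    List<Integer> out = new ArrayList<>();
    for (int deptno : deptnos) {
      if (deptno > 10) {
        out.add(seq.nextValue());    // called once per surviving row
      }
    }
    return out;
  }

  public static void main(String[] args) {
    int[] deptnos = {10, 20, 30};
    System.out.println(projectThenFilter(deptnos));
    System.out.println(filterThenProject(deptnos));
  }
}
```

Both plans would be equivalent for a deterministic expression. Here the function sees three calls in one plan and two in the other, so the returned values differ.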