Re: Use of noDag parameter in HepPlanner
Stamatis, Just FYI, maybe it will be useful for you, Drill uses *noDAG: true *as default value for HepPlanner [1]. After changing it to false, a lot of Drill unit tests failed [2]. [1] https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java#L416 [2] https://travis-ci.org/vdiravka/drill/jobs/494499462 Kind regards Vitalii On Fri, Feb 15, 2019 at 2:43 PM Stamatis Zampetakis wrote: > FYI, what I concluded by going through the code and the various test cases > is the following. > > By allowing DAGs the planner can detect common sub expressions in queries > and re-use an existing result without re-applying a rule if that is not > necessary. This should lead to fewer object creations and rule > applications, which may in turn lead to improved performance. In the > existing use cases noDag=false should appear more often since it is the > default value for two out of three constructors in the HepPlanner. > > In principle it seems that using or not using DAGs should give the same > expression in the end so I would say that using DAGs is always a better > option. I tried setting noDag to be always true but various test fail with > StackOverflowError so it seems there are rules who tend to execute infinite > number of times as a result of this change. I would tend to thing that this > is a bug but I didn't look further. > > Στις Τρί, 12 Φεβ 2019 στις 9:35 μ.μ., ο/η Julian Hyde > έγραψε: > > > I don’t recall. > > > > Could you review the tests and see whether tests tend to use noDag=true > or > > false most of the time? Are there any tests that use the less popular > > value, and if so, is there a particular reason that those tests use that > > option? > > > > Julian > > > > > > > On Feb 12, 2019, at 6:47 AM, Stamatis Zampetakis > > wrote: > > > > > > Hi all, > > > > > > I don't understand what is the correct way to set the noDag [1] > parameter > > > in HepPlanner. I understand what it does (internal query graph becomes > a > > > tree or a DAG) but I don't see why should I use the one or the other > and > > > when. > > > > > > Is it performance related? > > > Are there implications on the rules that can be used with the planner? > > > Does it limit the class of queries that need to be transformed? > > > > > > Thanks in advance, > > > Stamatis > > > > > > [1] > > > > > > https://github.com/apache/calcite/blob/883666929478aabe07ee5b9e572c43a6f1a703e2/core/src/main/java/org/apache/calcite/plan/hep/HepPlanner.java#L131 > > > > >
Re: ANALYZE TABLE
It looks that it can be useful for Calcite based projects just to follow the common style for similar purposes, in our case for gathering statistics. But looks like Phoenix has other style [1]: UPDATE STATISTICS my_table ALL But Hive has [2]: ANALYZE TABLE Table1 PARTITION(ds='2008-04-09', hr=11) COMPUTE STATISTICS; So it could be difficult to find the common style here. [1] https://phoenix.apache.org/update_statistics.html [2] https://cwiki-test.apache.org/confluence/display/Hive/StatsDev#StatsDev-ExistingTables%E2%80%93ANALYZE On Thu, Sep 20, 2018 at 9:29 PM Julian Hyde wrote: > Yeah, we should probably go with the de facto standard. > > But a bigger question is: what would this command do? Calcite has nowhere > to store statistics currently. Are people asking for this just so they have > a template that they can copy-paste into their own Calcite-based project? > > Julian > > > > On Sep 20, 2018, at 11:25 AM, Gautam Parai wrote: > > > > Oracle no longer recommends using ANALYZE TABLE except for certain cases > > and for preserving backwards compatibility. Instead, they now have a > > DBMS_STATS package with several methods for collecting statistics. [1] > > > > ALTER TABLE is usually associated with DDLs. If several projects/vendors > > already use ANALYZE for gathering statistics it may still be worthwhile > > since a majority of folks would be familiar with it? > > > > [1] > > > https://docs.oracle.com/cd/B28359_01/server.111/b28310/general002.htm#ADMIN11524 > < > https://docs.oracle.com/cd/B28359_01/server.111/b28310/general002.htm#ADMIN11524 > > > > > > Gautam > > > > On Thu, Sep 20, 2018 at 11:14 AM Julian Hyde jh...@apache.org>> wrote: > > > >> I can’t believe that Microsoft’s command is “UPDATE STATISTICS”. > >> Especially considering STATISTICS is not an ISO reserved word, so some > >> folks might actually have a table called STATISTICS. > >> > >> In every other database, UPDATE STATISTICS would be a DML command. > >> > >>> On Sep 20, 2018, at 11:07 AM, Julian Hyde wrote: > >>> > >>> The Babel parser doesn’t really do DDL (because there is too much > >> variation among dialects). > >>> > >>> The “server” parser might be a better place for this. It has a few, > >> Calcite-specific DDL statements. It could have ANALYZE too. > >>> > >>> In my opinion, Oracle made a mistake when they introduced ANALYE TABLE. > >> A "ALTER TABLE … COMPUTE STATISTICS” command makes just as much sense. > >>> > >>> Julian > >>> > >>> > >>>> On Sep 20, 2018, at 7:18 AM, Vitalii Diravka > >> wrote: > >>>> > >>>> ANALYZE TABLE statement is commonly used by different SQL engines for > >>>> collecting table statistics: PostgeSQL [1], MySQL [2], ORACLE [3], > >>>> Microsoft Transact-SQL - UPDATE STATISTICS - [4], SPARK SQL [5], Hive > >> [6]. > >>>> But I didn't find ANALYZE TABLE statement in SQL specification. > >>>> > >>>> Is there any sense to add it to Calcite (for instance for query > >>>> validation)? > >>>> Or maybe it can be part of the "babel" parser? If so what dialect > >> should be > >>>> selected? > >>>> > >>>> [1] > >> > https://urldefense.proofpoint.com/v2/url?u=https-3A__www.postgresql.org_docs_8.1_static_sql-2Danalyze.html=DwIFaQ=cskdkSMqhcnjZxdQVpwTXg=sNneUV1qeI9SCmBsUtXeUfboyubppHxYtOoUl6KCGdI=PkqrIAegua3OOSuznxIeNaCL0IDLXgeaKRGxKADXRoE=LJ-hvGyGPSMjPWwJqlJTMhosJXRswUIChATBjl_7o8o= > < > https://urldefense.proofpoint.com/v2/url?u=https-3A__www.postgresql.org_docs_8.1_static_sql-2Danalyze.html=DwIFaQ=cskdkSMqhcnjZxdQVpwTXg=sNneUV1qeI9SCmBsUtXeUfboyubppHxYtOoUl6KCGdI=PkqrIAegua3OOSuznxIeNaCL0IDLXgeaKRGxKADXRoE=LJ-hvGyGPSMjPWwJqlJTMhosJXRswUIChATBjl_7o8o= > > > >>>> [2] > >> > https://urldefense.proofpoint.com/v2/url?u=https-3A__dev.mysql.com_doc_refman_8.0_en_analyze-2Dtable.html=DwIFaQ=cskdkSMqhcnjZxdQVpwTXg=sNneUV1qeI9SCmBsUtXeUfboyubppHxYtOoUl6KCGdI=PkqrIAegua3OOSuznxIeNaCL0IDLXgeaKRGxKADXRoE=_40GURZMv45K-ZjtBniCfnQbaEUViMyxSA-yCiLYcNg= > < > https://urldefense.proofpoint.com/v2/url?u=https-3A__dev.mysql.com_doc_refman_8.0_en_analyze-2Dtable.html=DwIFaQ=cskdkSMqhcnjZxdQVpwTXg=sNneUV1qeI9SCmBsUtXeUfboyubppHxYtOoUl6KCGdI=PkqrIAegua3OOSuznxIeNaCL0IDLXgeaKRGxKADXRoE=_40GURZMv45K-ZjtBniCfnQbaEUViMyxSA-yCiLYcNg= > > > >>>> [3] > >>>> > >> > https://urldefense.proofpoint.com/v2/url?u=https-3A__docs.o
ANALYZE TABLE
ANALYZE TABLE statement is commonly used by different SQL engines for collecting table statistics: PostgeSQL [1], MySQL [2], ORACLE [3], Microsoft Transact-SQL - UPDATE STATISTICS - [4], SPARK SQL [5], Hive [6]. But I didn't find ANALYZE TABLE statement in SQL specification. Is there any sense to add it to Calcite (for instance for query validation)? Or maybe it can be part of the "babel" parser? If so what dialect should be selected? [1] https://www.postgresql.org/docs/8.1/static/sql-analyze.html [2] https://dev.mysql.com/doc/refman/8.0/en/analyze-table.html [3] https://docs.oracle.com/cd/B28359_01/server.111/b28310/general002.htm#ADMIN11524 [4] https://docs.microsoft.com/en-us/sql/t-sql/statements/update-statistics-transact-sql?view=sql-server-2017 [5] https://docs.microsoft.com/en-us/sql/t-sql/statements/update-statistics-transact-sql?view=sql-server-2017 [6] https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-ANALYZETABLE%3Ctable1%3ECACHEMETADATA
[jira] [Created] (CALCITE-2410) Enable ProjectSetOpTransposeRule for UNION (distinct) SetOp operator
Vitalii Diravka created CALCITE-2410: Summary: Enable ProjectSetOpTransposeRule for UNION (distinct) SetOp operator Key: CALCITE-2410 URL: https://issues.apache.org/jira/browse/CALCITE-2410 Project: Calcite Issue Type: Improvement Components: core Affects Versions: 1.16.0 Reporter: Vitalii Diravka Assignee: Vitalii Diravka Fix For: 1.17.0 Currently {{ProjectSetOpTransposeRule}} rule can transpose UnionAll SetOp and Project RelNodes. It disabled for Union (distinct) SetOp operator. {code} // cannot push project past a distinct if (!setOp.all) { return; } {code} After removing this check the rule works fine and transposes Union (distinct) SetOp and Project RelNodes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: JoinPushThroughJoinRule not applied on plan
Check yours intermediate RelNode plan, which is going to be improved by this rule. The operands and their ordering are important in matching of the rules. https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/rules/JoinPushThroughJoinRule.java#L87 >From your case the rule should match. Could you share please the plan? Kind regards Vitalii On Tue, May 8, 2018 at 4:24 PM aishwaryaa...@gmail.com < aishwaryaa...@gmail.com> wrote: > Okay let me rephrase my question. > > I am executing a query having inner and left join. The plan generated by > calcite for that query is like > >inner > / \ > left C > /\ >A B > > But it would be better if it proceeds inner join first, as the row count > will be reduced further to proceed left join. So the plan needed is, > > left > / \ >inner B > /\ >A C > > So I read through the documentation and found JoinPushThroughJoinRule will > do so. So I tried applying them (Till now, I am giving only row count to > statistics). But no change in the plan. > > Then I found onMatch() in JoinPushThroughJoinRule.java has > not been called at all. > > How can this be resolved? > > > On 2018/05/04 14:23:00, Michael Miorwrote: > > I have no immediate answers, but it would be helpful if you could > provide a > > complete working example of code that exhibits the problem. > > > > -- > > Michael Mior > > mm...@uwaterloo.ca > > > > > > Le ven. 4 mai 2018 à 03:58, Valli Annamalai a > > écrit : > > > > > I wanted to apply JoinPushThroughJoinRule.LEFT and > > > JoinPushThroughJoinRule.RIGHT to the plan for optimization. So I added > > > those 2 rules to Program and ran it as a sequence of programs. > > > > > > But I am not getting plan applied with these rules > > > > > > I did read [here > > > < > > > > https://issues.apache.org/jira/browse/CALCITE-457?focusedCommentId=14205221=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14205221 > > > >] > > > that adding rules does not mean that the plan will be better. So sanity > > > checks may fail. In my case, during onMatch() checks (in matchRecurse() > > > from ValcanoRuleCall.java), it fails because of no parents for the > subsets > > > (getParentRels() returns empty list) . > > > > > > Thus there are no successors and the recursion has not been trigerred. > So > > > the rules has not been fired (onMatch() in > JoinPushThroughJoinRule.java has > > > not been called at all). What can be done for that? > > > > > > Any other flags need to be enabled or rules to be added?? > > > > > > > > > Thanks in advance > > > > > >
[jira] [Created] (CALCITE-2296) Extra logic to derive additional filters from for FilterJoinRule
Vitalii Diravka created CALCITE-2296: Summary: Extra logic to derive additional filters from for FilterJoinRule Key: CALCITE-2296 URL: https://issues.apache.org/jira/browse/CALCITE-2296 Project: Calcite Issue Type: Improvement Components: core Affects Versions: 1.16.0 Reporter: Vitalii Diravka Assignee: Vitalii Diravka Fix For: next There is necessary to add logic to derive additional filters from for FilterJoinRule in the case of using disjunction of expressions: https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/rules/FilterJoinRule.java#L155 Also it can solve CALCITE-2241. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2275) Using logical NOT operator in Join condition leads to mistakenly push down this condition.
Vitalii Diravka created CALCITE-2275: Summary: Using logical NOT operator in Join condition leads to mistakenly push down this condition. Key: CALCITE-2275 URL: https://issues.apache.org/jira/browse/CALCITE-2275 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.16.0 Reporter: Vitalii Diravka Assignee: Vitalii Diravka Fix For: 1.17.0 Using logical NOT operator in Join condition leads to mistakenly push down this condition. Then LogicalProject is created with RexCall expression. It prevents for further inferring predicates (by using JoinPushTransitivePredicatesRule, for instance). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2274) Filter predicates aren't inferred while using dynamic star in subquery
Vitalii Diravka created CALCITE-2274: Summary: Filter predicates aren't inferred while using dynamic star in subquery Key: CALCITE-2274 URL: https://issues.apache.org/jira/browse/CALCITE-2274 Project: Calcite Issue Type: Improvement Affects Versions: 1.16.0 Reporter: Vitalii Diravka Assignee: Vitalii Diravka Fix For: next Filter predicates are not inferred (in RelMdPredicates#getPredicates(), for example by using JoinPushTransitivePredicatesRule), in case when above LogicalProject has dynamic star. Query example: {code:java} SELECT * FROM sales.emp d JOIN (SELECT * FROM sales.emp WHERE deptno = 4) e ON e.deptno = d.deptno {code} By using JoinPushTransitivePredicatesRule the expected result is: {code:java} LogicalProject(**=[$0]) LogicalJoin(condition=[=($16, $7)], joinType=[inner]) LogicalFilter(condition=[=($7, 4)]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) LogicalProject(**=[$0]) LogicalFilter(condition=[=($7, 4)]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code} and actual result: {code:java} LogicalProject(**=[$0]) LogicalJoin(condition=[=($16, $7)], joinType=[inner]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) LogicalProject(**=[$0]) LogicalFilter(condition=[=($7, 4)]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2257) Combination of predicates can be proved to be always true
Vitalii Diravka created CALCITE-2257: Summary: Combination of predicates can be proved to be always true Key: CALCITE-2257 URL: https://issues.apache.org/jira/browse/CALCITE-2257 Project: Calcite Issue Type: Improvement Components: core Affects Versions: 1.16.0 Reporter: Vitalii Diravka Assignee: Vitalii Diravka Fix For: 1.17.0 I have found the case, when Filter operator is not necessary since filter condition is always true, but that is not detected by current version of Calcite. {code} select SAL from EMPNULLABLES_20 where SAL IS NOT NULL OR SAL is null {code} {code} LogicalProject(SAL=[$5]) LogicalFilter(condition=[OR(IS NOT NULL($5), IS NULL($5))]) LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], SLACKER=[$8]) LogicalFilter(condition=[AND(=($7, 20), >($5, 1000))]) LogicalTableScan(table=[[CATALOG, SALES, EMPNULLABLES]]) {code} But filter condition _OR(IS NOT NULL($5), IS NULL($5))_ can be proved to be always true. I have tried _ReduceExpressionsRule_, but it doesn't give effect. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2242) Using custom RelBuilder for FilterRemoveIsNotDistinctFromRule. Test case for the rule.
Vitalii Diravka created CALCITE-2242: Summary: Using custom RelBuilder for FilterRemoveIsNotDistinctFromRule. Test case for the rule. Key: CALCITE-2242 URL: https://issues.apache.org/jira/browse/CALCITE-2242 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.16.0 Reporter: Vitalii Diravka Assignee: Vitalii Diravka Fix For: 1.17.0 FilterRemoveIsNotDistinctFromRule doesn't leverage custom RelBuilder. There is no test case for this rule. {code} LogicalFilter(condition=[IS NOT DISTINCT FROM($7, 20)]) LogicalTableScan(table=[[scott, EMP]]) {code} --> {code} LogicalFilter(condition=[CASE(IS NULL($7), IS NULL(20), =(CAST($7):TINYINT NOT NULL, 20))]) LogicalTableScan(table=[[scott, EMP]]) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2241) Join equality condition should be considered while pushing down filter conditions with expressions disjunction for both tables
Vitalii Diravka created CALCITE-2241: Summary: Join equality condition should be considered while pushing down filter conditions with expressions disjunction for both tables Key: CALCITE-2241 URL: https://issues.apache.org/jira/browse/CALCITE-2241 Project: Calcite Issue Type: Improvement Components: core Affects Versions: 1.16.0 Reporter: Vitalii Diravka Assignee: Julian Hyde Fix For: 1.17.0 The filter condition on Join with expressions for both join inputs can be pushed further to inputs only if that condition is conjunction of expressions. It will be good if filter condition with disjunction of such expressions will be pushed past a Join by taking into account equality inference of join condition. Query example: {code:java} SELECT t1.deptno FROM sales.emp t1 join sales.emp t2 ON t1.deptno = t2.deptno WHERE t1.deptno = 1 OR t2.deptno = 4 {code} Expected plan: {code:java} LogicalProject(DEPTNO=[$7]) LogicalJoin(condition=[AND(=($7, $16))], joinType=[inner]) LogicalFilter(condition=[OR(=($7, 1), =($7, 4))]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) LogicalFilter(condition=[OR(=($16, 1), =($16, 4))]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code} Actual plan: {code:java} LogicalProject(DEPTNO=[$7]) LogicalJoin(condition=[AND(=($7, $16), OR(=($7, 1), =($16, 4)))], joinType=[inner]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) {code} The changes can be done for _RelMdPredicates.JoinConditionBasedPredicateInference_ code and will be applied by using _JoinPushTransitivePredicatesRule_ -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Re: Infinite loop with JoinPushTransitivePredicatesRule
The cases is similar to mine. I have described it in CALCITE-2205 ticket. Thank you. Kind regards Vitalii On Wed, Mar 7, 2018 at 9:29 AM, Julian Hyde <jh...@apache.org> wrote: > Do you mean, for example, given > > Filter(x > 5, Scan(Foo)) > > does it make sense to add the same filter, as follows: > > Filter(x > 5, Filter(x > 5, Scan(Foo))) > > > The second Filter has no effect, because all of its incoming rows already > obey the predicate “x > 5”. So clearly it doesn’t make sense to add it. > > In https://issues.apache.org/jira/browse/CALCITE-1995 < > https://issues.apache.org/jira/browse/CALCITE-1995> we added that > capability to the RelBuilder.filter method. I’m not sure why it isn’t > working in your case. Maybe you’re not using RelBuilder.filter, or maybe > you didn’t provide an executor, or maybe RelOptPredicateList does not have > the correct value. > > Julian > > > > On Mar 6, 2018, at 12:25 PM, Vitalii Diravka <vitalii.dira...@gmail.com> > wrote: > > > > Hi all. > > > > I found two issues which are related to this: CALCITE-2200 (is resolved > > already) and CALCITE-2205. > > > > I have general question: does it makes sense to create LogicalFilter with > > particular condition for some HepRelVertex with currentRel, > > which is actually the same desired LogicalFilter? > > > > Kind regards > > Vitalii > > > > On Fri, Mar 2, 2018 at 7:42 PM, Chunhui Shi <c...@mapr.com> wrote: > > > >> Could you file correspondent JIRAs to both Calcite and Drill with > detailed > >> repro steps? This sounds a bug to me. I think this is an issue. We are > >> seeing some stuck planning when run on latest Calcite with Drill too - > not > >> sure if this is relevant though. We could exchange details directly. > >> > >> > >> From: Vitalii Diravka <vitalii.dira...@gmail.com> > >> Sent: Thursday, March 1, 2018 4:10:31 PM > >> To: dev@calcite.apache.org > >> Subject: Infinite loop with JoinPushTransitivePredicatesRule > >> > >> Hi all! > >> > >> I got the infinite loop while using FilterIntoJoinRule + > >> JoinPushTransitivePredicatesRule in HEP planner for some correlated > >> queries, for instance: > >> *select d.deptno from sales.emp d where d.deptno IN (select e.deptno > from > >> sales.emp e where e.deptno = d.deptno or e.deptno = 4)* > >> > >> I have a reproduce in Calcite. We always get HepRelVertex for one side > of > >> LogicalJoin relNode or LogicalFilter, therefore the rule is fired every > >> time and infinitely in result. > >> > >> I've noticed that the rule works fine after JoinToCorrelateRule. But I > >> think this is not a decision for using JoinPushTransitivePredicatesRu > le, > >> even from the performance perspective. > >> > >> How I can resolve it or maybe I missed something else? > >> > >> Kind regards > >> Vitalii > >> > >
Re: Infinite loop with JoinPushTransitivePredicatesRule
Hi all. I found two issues which are related to this: CALCITE-2200 (is resolved already) and CALCITE-2205. I have general question: does it makes sense to create LogicalFilter with particular condition for some HepRelVertex with currentRel, which is actually the same desired LogicalFilter? Kind regards Vitalii On Fri, Mar 2, 2018 at 7:42 PM, Chunhui Shi <c...@mapr.com> wrote: > Could you file correspondent JIRAs to both Calcite and Drill with detailed > repro steps? This sounds a bug to me. I think this is an issue. We are > seeing some stuck planning when run on latest Calcite with Drill too - not > sure if this is relevant though. We could exchange details directly. > > ________ > From: Vitalii Diravka <vitalii.dira...@gmail.com> > Sent: Thursday, March 1, 2018 4:10:31 PM > To: dev@calcite.apache.org > Subject: Infinite loop with JoinPushTransitivePredicatesRule > > Hi all! > > I got the infinite loop while using FilterIntoJoinRule + > JoinPushTransitivePredicatesRule in HEP planner for some correlated > queries, for instance: > *select d.deptno from sales.emp d where d.deptno IN (select e.deptno from > sales.emp e where e.deptno = d.deptno or e.deptno = 4)* > > I have a reproduce in Calcite. We always get HepRelVertex for one side of > LogicalJoin relNode or LogicalFilter, therefore the rule is fired every > time and infinitely in result. > > I've noticed that the rule works fine after JoinToCorrelateRule. But I > think this is not a decision for using JoinPushTransitivePredicatesRule, > even from the performance perspective. > > How I can resolve it or maybe I missed something else? > > Kind regards > Vitalii >
[jira] [Created] (CALCITE-2205) One more Infinite loop for JoinPushTransitivePredicatesRule
Vitalii Diravka created CALCITE-2205: Summary: One more Infinite loop for JoinPushTransitivePredicatesRule Key: CALCITE-2205 URL: https://issues.apache.org/jira/browse/CALCITE-2205 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.15.0 Reporter: Vitalii Diravka Assignee: Julian Hyde Fix For: 1.16.0 CALCITE-2200 resolves some cases of infinite loop via stopping of recursion in HepPlanner#applyRules, when newVertex is the same as vertex. In this jira one more case of infinite loop is described: JoinPushTransitivePredicatesRule#onMatch generates new right or left inputs via using RelBuilder#filter method on top of LogicalFilter RelNode with the same condition. In this case a new RelNode shouldn't be created. Possible fix to change logic in onMatch method of the rule or the logic of RelBuilder#filter method. TestCase for reproduce: {code} @Test public void testJoinPushTransitivePredicatesRule2() { HepProgramBuilder builder = new HepProgramBuilder(); builder.addRuleInstance(JoinPushTransitivePredicatesRule.INSTANCE); HepProgram build = builder.build(); HepPlanner hepPlanner = new HepPlanner(build); final String sql = "select n1.SAL from EMPNULLABLES_20 n1 where n1.SAL\n" + "IN (select n2.SAL from EMPNULLABLES_20 n2 " + "where n1.SAL = n2.SAL or n1.SAL = 4)"; sql(sql) .withDecorrelation(true) .with(hepPlanner) .check(); } {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (CALCITE-2200) Infinite loop for JoinPushTransitivePredicatesRule
Vitalii Diravka created CALCITE-2200: Summary: Infinite loop for JoinPushTransitivePredicatesRule Key: CALCITE-2200 URL: https://issues.apache.org/jira/browse/CALCITE-2200 Project: Calcite Issue Type: Bug Components: core Affects Versions: 1.15.0 Reporter: Vitalii Diravka Assignee: Julian Hyde Fix For: next Infinite loop is obtained while using FilterIntoJoinRule + JoinPushTransitivePredicatesRule in HEP planner for some correlated queries, for instance: select d.deptno from sales.emp d where d.deptno IN (select e.deptno from sales.emp e where e.deptno = d.deptno or e.deptno = 4) {code} @Test public void testJoinPushTransitivePredicatesRule() { HepProgram preProgram = new HepProgramBuilder() .addRuleInstance(FilterJoinRule.FILTER_ON_JOIN) .addRuleInstance(FilterJoinRule.JOIN) .addRuleInstance(JoinPushTransitivePredicatesRule.INSTANCE) .build(); HepProgramBuilder builder = new HepProgramBuilder(); HepPlanner hepPlanner = new HepPlanner(builder.build()); final String sql = "select d.deptno from sales.emp d where d.deptno\n" + "IN (select e.deptno from sales.emp e " + "where e.deptno = d.deptno or e.deptno = 4)"; TesterImpl tester = new TesterImpl(getDiffRepos(), true, false, true, false, null, null); checkPlanning(tester, preProgram, hepPlanner, sql); } {code} {code} java.lang.StackOverflowError at org.apache.calcite.util.mapping.Mappings$PartialFunctionImpl.(Mappings.java:1568) at org.apache.calcite.util.mapping.Mappings.create(Mappings.java:76) at org.apache.calcite.rel.metadata.RelMdPredicates.getPredicates(RelMdPredicates.java:185) at GeneratedMetadataHandler_Predicates.getPredicates_$(Unknown Source) at GeneratedMetadataHandler_Predicates.getPredicates(Unknown Source) at GeneratedMetadataHandler_Predicates.getPredicates_$(Unknown Source) at GeneratedMetadataHandler_Predicates.getPredicates(Unknown Source) at org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:802) at org.apache.calcite.rel.metadata.RelMdPredicates.getPredicates(RelMdPredicates.java:344) at GeneratedMetadataHandler_Predicates.getPredicates_$(Unknown Source) at GeneratedMetadataHandler_Predicates.getPredicates(Unknown Source) at GeneratedMetadataHandler_Predicates.getPredicates_$(Unknown Source) at GeneratedMetadataHandler_Predicates.getPredicates(Unknown Source) at org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:802) at org.apache.calcite.rel.metadata.RelMdPredicates.getPredicates(RelMdPredicates.java:318) at GeneratedMetadataHandler_Predicates.getPredicates_$(Unknown Source) ... {code} Note1: for the same query, but with AND operator the rule works fine. Note2: the rile works also after firing JoinToCorrelateRule rule. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
Infinite loop with JoinPushTransitivePredicatesRule
Hi all! I got the infinite loop while using FilterIntoJoinRule + JoinPushTransitivePredicatesRule in HEP planner for some correlated queries, for instance: *select d.deptno from sales.emp d where d.deptno IN (select e.deptno from sales.emp e where e.deptno = d.deptno or e.deptno = 4)* I have a reproduce in Calcite. We always get HepRelVertex for one side of LogicalJoin relNode or LogicalFilter, therefore the rule is fired every time and infinitely in result. I've noticed that the rule works fine after JoinToCorrelateRule. But I think this is not a decision for using JoinPushTransitivePredicatesRule, even from the performance perspective. How I can resolve it or maybe I missed something else? Kind regards Vitalii
Re: [ANNOUNCE] New committer: Volodymyr Vysotskyi
Good news! Congratulations, Vova! Kind regards Vitalii On Fri, Dec 22, 2017 at 8:53 AM, Arina Ielchiievawrote: > Congratulations, Vova! Well deserved. > > Kind regards > Arina > > On Fri, Dec 22, 2017 at 5:01 AM, Zhiqiang He > wrote: > > > Congratulations and Welcome! > > > > Regards > > James > > > > > > From: Julian Hyde > > Sent: Friday, December 22, 2017 07:45 > > To: dev@calcite.apache.org > > Cc: Vova Vysotskyi > > Subject: [ANNOUNCE] New committer: Volodymyr Vysotskyi > > > > Apache Calcite's Project Management Committee (PMC) has invited Volodymyr > > (Vova) Vysotskyi to become a committer, and we are pleased to announce > that > > he has accepted. > > > > Volodymyr has made a steady stream of high-quality contributions over the > > past few months. I believe that he is is working to get Drill onto the > > latest Calcite release. The improvements he has made to Calcite will help > > Drill and many other users. > > > > Volodymr, > > > > Welcome, thank you for your contributions, and we look forward your > > further interactions with the community! > > > > If you wish, please feel free to tell us more about yourself and what you > > are working on. > > > > Julian (on behalf of the Apache Calcite PMC) > > > > >
Re: Adding a new keyword "IF" for sql parser
Hmm...in code I found "IF" in non-reserved list. But anyway "IF" is function name in hive. That's why it works. Isn't it the good option to add "IF" into function names list in calcite? And what about suggestion to make changes in parser? Kind regards Vitalii 2016-06-27 16:06 GMT+00:00 Julian Hyde <jhyde.apa...@gmail.com>: > IF is reserved in Hive: > https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL > > Julian > > > On Jun 27, 2016, at 06:30, Vitalii Diravka <vitalii.dira...@gmail.com> > wrote: > > > > Thank's Julian > > > > This isn't drill UDF, this is hive UDF. Dril can use hive UDF while query > > hive tables. That's why most likely I can't use IIF as UDF name. > > Can I make changes in calcite parser in next manner: by default IF - > > reserved word but in case when IF appears after TABLE or VIEW - > recognize > > IF as function name? > > Also I found that hive uses SQL2011 and IF is a non-reserved word there. > > > > Kind regards > > Vitalii > > > > 2016-06-17 16:35 GMT+00:00 Julian Hyde <jh...@apache.org>: > > > >> Ah, you're right. It's reserved in 2011. However, I'm looking at a > >> draft of 2014 and it's gone. > >> > >> Calcite attempts to follow the latest version of the standard, but we > >> don't do so slavishly. People want something that complies with all > >> versions of the standard (at least in the features that they care > >> about). > >> > >> Given that 4 versions of the SQL standard treat "IF" as a reserved > >> word, I think it would be foolish to create a function called "IF". > >> Most databases that have such a function call it "IIF". Could you use > >> that instead? > >> > >> On Fri, Jun 17, 2016 at 5:25 AM, Vitalii Diravka > >> <vitalii.dira...@gmail.com> wrote: > >>> Adding "IF" to a non-reserved keywords list can resolve that issue. > >>> > >>> But I saw that in Parser.jj SQL:2003 is used. Also I found "IF" keyword > >> as > >>> a reserved word in ISO/IEC 9075-4 (Fourth edition 2011-12-15). > >>> > >>> So what standard actually calcite uses? Can we add anyway "IF" to the > >>> non-reserved keywords list? > >>> > >>> > >>> Kind regards > >>> Vitalii > >>> > >>> 2016-06-15 18:07 GMT+00:00 Julian Hyde <jh...@apache.org>: > >>> > >>>> There are two kinds of keywords: reserved and non-reserved. A > >> non-reserved > >>>> keyword can be used as a keyword in the right context, and as an > >>>> identifier elsewhere. > >>>> > >>>> IF was reserved in SQL:92, SQL:99, SQL:2003 but is no longer reserved > in > >>>> SQL:2011 (see table in SqlParserTest). So you could add it to Calcite > >> as a > >>>> non-reserved keyword. > >>>> > >>>> > >>>>> On Jun 15, 2016, at 8:11 AM, Vitalii Diravka < > >> vitalii.dira...@gmail.com> > >>>> wrote: > >>>>> > >>>>> Hi all! > >>>>> > >>>>> I need to add a new keyword "IF" for sql parser. > >>>>> How can I use one keyword for two purposes: as a part of "IF EXISTS" > >>>>> statement and as a function name? > >>>>> > >>>>> Actually I added "IF" as a new keyword in FMPP configuration file > >> which > >>>>> allows to extend Calcite's SQL parser. > >>>>> After this "IF" isn't considered as an IDENTIFIER and function name. > >>>>> > >>>>> > >>>>> > >>>>> Kind regards > >>>>> Vitalii > >> >
Re: Adding a new keyword "IF" for sql parser
Thank's Julian This isn't drill UDF, this is hive UDF. Dril can use hive UDF while query hive tables. That's why most likely I can't use IIF as UDF name. Can I make changes in calcite parser in next manner: by default IF - reserved word but in case when IF appears after TABLE or VIEW - recognize IF as function name? Also I found that hive uses SQL2011 and IF is a non-reserved word there. Kind regards Vitalii 2016-06-17 16:35 GMT+00:00 Julian Hyde <jh...@apache.org>: > Ah, you're right. It's reserved in 2011. However, I'm looking at a > draft of 2014 and it's gone. > > Calcite attempts to follow the latest version of the standard, but we > don't do so slavishly. People want something that complies with all > versions of the standard (at least in the features that they care > about). > > Given that 4 versions of the SQL standard treat "IF" as a reserved > word, I think it would be foolish to create a function called "IF". > Most databases that have such a function call it "IIF". Could you use > that instead? > > On Fri, Jun 17, 2016 at 5:25 AM, Vitalii Diravka > <vitalii.dira...@gmail.com> wrote: > > Adding "IF" to a non-reserved keywords list can resolve that issue. > > > > But I saw that in Parser.jj SQL:2003 is used. Also I found "IF" keyword > as > > a reserved word in ISO/IEC 9075-4 (Fourth edition 2011-12-15). > > > > So what standard actually calcite uses? Can we add anyway "IF" to the > > non-reserved keywords list? > > > > > > Kind regards > > Vitalii > > > > 2016-06-15 18:07 GMT+00:00 Julian Hyde <jh...@apache.org>: > > > >> There are two kinds of keywords: reserved and non-reserved. A > non-reserved > >> keyword can be used as a keyword in the right context, and as an > >> identifier elsewhere. > >> > >> IF was reserved in SQL:92, SQL:99, SQL:2003 but is no longer reserved in > >> SQL:2011 (see table in SqlParserTest). So you could add it to Calcite > as a > >> non-reserved keyword. > >> > >> > >> > On Jun 15, 2016, at 8:11 AM, Vitalii Diravka < > vitalii.dira...@gmail.com> > >> wrote: > >> > > >> > Hi all! > >> > > >> > I need to add a new keyword "IF" for sql parser. > >> > How can I use one keyword for two purposes: as a part of "IF EXISTS" > >> > statement and as a function name? > >> > > >> > Actually I added "IF" as a new keyword in FMPP configuration file > which > >> > allows to extend Calcite's SQL parser. > >> > After this "IF" isn't considered as an IDENTIFIER and function name. > >> > > >> > > >> > > >> > Kind regards > >> > Vitalii > >> > >> >
Re: Adding a new keyword "IF" for sql parser
Adding "IF" to a non-reserved keywords list can resolve that issue. But I saw that in Parser.jj SQL:2003 is used. Also I found "IF" keyword as a reserved word in ISO/IEC 9075-4 (Fourth edition 2011-12-15). So what standard actually calcite uses? Can we add anyway "IF" to the non-reserved keywords list? Kind regards Vitalii 2016-06-15 18:07 GMT+00:00 Julian Hyde <jh...@apache.org>: > There are two kinds of keywords: reserved and non-reserved. A non-reserved > keyword can be used as a keyword in the right context, and as an > identifier elsewhere. > > IF was reserved in SQL:92, SQL:99, SQL:2003 but is no longer reserved in > SQL:2011 (see table in SqlParserTest). So you could add it to Calcite as a > non-reserved keyword. > > > > On Jun 15, 2016, at 8:11 AM, Vitalii Diravka <vitalii.dira...@gmail.com> > wrote: > > > > Hi all! > > > > I need to add a new keyword "IF" for sql parser. > > How can I use one keyword for two purposes: as a part of "IF EXISTS" > > statement and as a function name? > > > > Actually I added "IF" as a new keyword in FMPP configuration file which > > allows to extend Calcite's SQL parser. > > After this "IF" isn't considered as an IDENTIFIER and function name. > > > > > > > > Kind regards > > Vitalii > >
Adding a new keyword "IF" for sql parser
Hi all! I need to add a new keyword "IF" for sql parser. How can I use one keyword for two purposes: as a part of "IF EXISTS" statement and as a function name? Actually I added "IF" as a new keyword in FMPP configuration file which allows to extend Calcite's SQL parser. After this "IF" isn't considered as an IDENTIFIER and function name. Kind regards Vitalii
Re: Several quotes symbols in sql parsing
Julian, Are there any plans about the implementation of this improvement in the future calcite versions? Kind regards Vitalii 2016-06-02 18:03 GMT+00:00 Vitalii Diravka <vitalii.dira...@gmail.com>: > I see. Thanks for replying > > Kind regards > Vitalii > > 2016-06-01 19:25 GMT+00:00 Julian Hyde <jh...@apache.org>: > >> You can switch among quoting styles (DOUBLE_QUOTE, BACK_TICK, BRACKET) >> within a session but it can only use one at a time. >> >> > On Jun 1, 2016, at 10:03 AM, Vitalii Diravka <vitalii.dira...@gmail.com> >> wrote: >> > >> > Hi all! >> > >> > I am interested in adding ANSI_QUOTES (double quotes) into Apache Drill >> to >> > recognize identifiers. Also it would be great if default back tick >> quotes >> > will be work at the same time. >> > Since drill uses calcite parser - >> > Can calcite use several quotes/quoting at the same time while parse sql >> > statement? >> > >> > Thank's. >> > >> > >> > Kind regards >> > Vitalii >> >> >
Re: Several quotes symbols in sql parsing
I see. Thanks for replying Kind regards Vitalii 2016-06-01 19:25 GMT+00:00 Julian Hyde <jh...@apache.org>: > You can switch among quoting styles (DOUBLE_QUOTE, BACK_TICK, BRACKET) > within a session but it can only use one at a time. > > > On Jun 1, 2016, at 10:03 AM, Vitalii Diravka <vitalii.dira...@gmail.com> > wrote: > > > > Hi all! > > > > I am interested in adding ANSI_QUOTES (double quotes) into Apache Drill > to > > recognize identifiers. Also it would be great if default back tick quotes > > will be work at the same time. > > Since drill uses calcite parser - > > Can calcite use several quotes/quoting at the same time while parse sql > > statement? > > > > Thank's. > > > > > > Kind regards > > Vitalii > >
Several quotes symbols in sql parsing
Hi all! I am interested in adding ANSI_QUOTES (double quotes) into Apache Drill to recognize identifiers. Also it would be great if default back tick quotes will be work at the same time. Since drill uses calcite parser - Can calcite use several quotes/quoting at the same time while parse sql statement? Thank's. Kind regards Vitalii