I'm no Calcite expert (yet), but I have a few suggestions based on my own experience using Planner and digging through its code. Keep in mind that there are surely better people around here to explain this, but I'll do my best based on what I've learned...
When using Planner, you shouldn't need to access the underlying RelOptPlanner (or VolcanoPlanner) directly. Instead, you add Programs that use specific planners, e.g. Programs.ofRules(rules) for Volcano or Programs.hep for heuristic rules; check out the Programs class for all the options. Use the FrameworkConfig builder's programs() and traitDefs() methods, then call Planner.transform(int, RelTraitSet, RelNode) to apply rules in multiple phases. If you add two Programs to the Planner, the first will be run by transform(0, traits, rel) and the second by transform(1, traits, rel). The Planner internally handles calling changeTraits and findBestExp. You can find examples of this usage of Planner in the tests.

I haven't seen great documentation on conventions, but my understanding is that each RelNode has a set of input and output traits, one of which can be a calling Convention. For one node to be the input of another, that node's output traits must match the other node's input traits, and the goal of the planner is to make the root node's output trait set match the desired trait set by applying rules to the tree. The reason EnumerableConvention works is that there are Enumerable converter rules associated with that convention which handle converting each RelNode to that convention, so that all input/output conventions are compatible with one another and the root node ends up with the Enumerable convention.

Some rules operate on Logical nodes (e.g. filter push-down, join reordering), while others convert to or between calling conventions. Were you to set a different, custom calling convention, that convention would need its own set of converter rules capable of converting each node in the tree for planning to be successful; otherwise you could not convert from Convention.NONE to your custom convention. Some rules can also convert between two calling conventions.
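To make the multi-phase Planner usage concrete, here is a minimal sketch (untested, and not tied to any particular Calcite version; `rootSchema`, the SQL string, and the specific rule choices are placeholders, not recommendations): a heuristic program runs first, then a Volcano-driven program converts the plan to the Enumerable convention.

```java
import org.apache.calcite.adapter.enumerable.EnumerableConvention;
import org.apache.calcite.plan.RelTraitSet;
import org.apache.calcite.rel.RelNode;
import org.apache.calcite.rel.metadata.DefaultRelMetadataProvider;
import org.apache.calcite.rel.rules.FilterSetOpTransposeRule;
import org.apache.calcite.schema.SchemaPlus;
import org.apache.calcite.sql.SqlNode;
import org.apache.calcite.tools.FrameworkConfig;
import org.apache.calcite.tools.Frameworks;
import org.apache.calcite.tools.Planner;
import org.apache.calcite.tools.Programs;

import java.util.Collections;

public class TwoPhasePlanning {
  RelNode plan(SchemaPlus rootSchema, String sql) throws Exception {
    FrameworkConfig config = Frameworks.newConfigBuilder()
        .defaultSchema(rootSchema)
        // Program 0: heuristic (Hep) pass for logical rewrites.
        // Program 1: cost-based (Volcano) pass over a rule set that
        // includes the Enumerable converter rules.
        .programs(
            Programs.hep(
                Collections.singletonList(FilterSetOpTransposeRule.INSTANCE),
                false, DefaultRelMetadataProvider.INSTANCE),
            Programs.ofRules(Programs.RULE_SET))
        .build();

    Planner planner = Frameworks.getPlanner(config);
    SqlNode parsed = planner.parse(sql);
    SqlNode validated = planner.validate(parsed);
    RelNode logical = planner.rel(validated).project();

    // Phase 0: run the Hep program, keeping the current traits.
    RelNode rewritten = planner.transform(0, logical.getTraitSet(), logical);

    // Phase 1: run the Volcano program, requesting the Enumerable
    // convention at the root; the Planner calls changeTraits and
    // findBestExp internally.
    RelTraitSet desired = rewritten.getTraitSet()
        .replace(EnumerableConvention.INSTANCE).simplify();
    return planner.transform(1, desired, rewritten);
  }
}
```

The point of the two-program split is that each transform(i, ...) call is an independent phase, so logical rewrites can be applied exhaustively before the cost-based search begins.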
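For a custom calling convention, the converter rules look roughly like the sketch below. MyConvention, MyRel, and MyFilter are hypothetical names I'm inventing for illustration (they do not exist in Calcite); the pattern mirrors what the Enumerable and Jdbc rules do for their conventions.

```java
import org.apache.calcite.plan.Convention;
import org.apache.calcite.rel.RelNode;
import org.apache.calcite.rel.convert.ConverterRule;
import org.apache.calcite.rel.logical.LogicalFilter;

// Hypothetical calling convention; MyRel would be the interface that
// every physical node of this convention implements.
public class MyConvention extends Convention.Impl {
  public static final MyConvention INSTANCE = new MyConvention();

  private MyConvention() {
    super("MY", MyRel.class);
  }
}

// Converter rule that moves a LogicalFilter from Convention.NONE into
// the MY convention. Without a rule like this for each kind of node in
// the tree, the planner cannot satisfy a root trait set that asks for
// MyConvention.INSTANCE, and planning fails with a CannotPlanException.
class MyFilterRule extends ConverterRule {
  static final MyFilterRule INSTANCE = new MyFilterRule();

  private MyFilterRule() {
    super(LogicalFilter.class, Convention.NONE, MyConvention.INSTANCE,
        "MyFilterRule");
  }

  @Override public RelNode convert(RelNode rel) {
    final LogicalFilter filter = (LogicalFilter) rel;
    // MyFilter is the hypothetical physical counterpart of LogicalFilter.
    // The input is also converted to MY, so input/output conventions match.
    return new MyFilter(
        filter.getCluster(),
        filter.getTraitSet().replace(MyConvention.INSTANCE),
        convert(filter.getInput(),
            filter.getInput().getTraitSet().replace(MyConvention.INSTANCE)),
        filter.getCondition());
  }
}
```

Registering one such rule per node type (scan, project, filter, ...) is what makes the whole-tree conversion reachable for the planner.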
For example, the JdbcRules include a JdbcToEnumerableConverterRule that converts from the JDBC convention to Enumerable. In practice, this represents the bridge between a JDBC query and Calcite's enumerables: the JdbcToEnumerableConverter is the point where the JDBC query is compiled and run, and the results are passed to the EnumerableRels to complete any query logic that could not be pushed down. Even if an entire query can be pushed down, the root will still have a JdbcToEnumerableConverter, since the planner expects the Enumerable convention trait at the root of the tree.

Hope my limited knowledge helps :-)

> On Jun 27, 2016, at 10:10 PM, Ravikumar CS <[email protected]> wrote:
>
> Hi,
>
> I am trying to get the VolcanoPlanner working. I took the simple planner
> from Milinda[1] and built a modified planner[2] which uses the Volcano
> planner for optimization.
>
> The table that I am using is CSVFilterableTable[3]. However, the Volcano
> planner fails to optimize, with the following error[4].
>
> Questions:
>
> 1. It works when I explicitly set the EnumerableConvention (lines 95-98).
> In that case the rules seem to fire, and I get back a plan in the
> Enumerable convention. Is that expected?
>
> 2. If I want to take the initial logical plan and generate an optimized
> logical plan, how can I achieve this using the VolcanoPlanner (just the
> way it worked using the HepPlanner)?
>
> 3. Am I missing any crucial planner rules?
>
> 4. I want to understand more about the Convention concept and how it
> relates to the Planner. Is there documentation that I can go through?
>
> ~Ravi
>
> [1] https://github.com/milinda/calcite-tutorial/blob/master/src/main/java/org/pathirage/calcite/tutorial/planner/SimpleQueryPlanner.java
>
> [2] BasicQueryPlanner with Volcano:
>     Script: https://gist.github.com/ravikumarcs/724b7cbb1053a1650664aabc6eeb7271
>     Output: https://gist.github.com/ravikumarcs/d0d50c414cae47be18f45e57a58749dd
>
> [3] Model: https://github.com/apache/calcite/blob/master/example/csv/src/test/resources/filterable-model.json
>
> [4] VolcanoPlanner failure: https://gist.github.com/ravikumarcs/10b53d47ad0bd1037436eab7c342c048
>
>> On Thu, Jun 2, 2016 at 2:33 PM, Ravikumar CS <[email protected]> wrote:
>>
>> You are right. I changed the order of the rules and it worked. Thanks Julian.
>>
>> Rule order: FilterSetOpTransposeRule -> AggregateReduceFunctionsRule ->
>> AggregateUnionTransposeRule
>>
>> New plan:
>>
>> LogicalProject(id=[$0], EXPR$1=[CAST(/($1, $2)):INTEGER NOT NULL])
>>   LogicalAggregate(group=[{0}], agg#0=[$SUM0($1)], agg#1=[$SUM0($2)])
>>     LogicalUnion(all=[true])
>>       LogicalAggregate(group=[{0}], agg#0=[$SUM0($1)], agg#1=[COUNT()])
>>         LogicalFilter(condition=[=($0, 1)])
>>           LogicalProject(id=[$0], units=[$2])
>>             LogicalTableScan(table=[[SALES, Orders1]])
>>       LogicalAggregate(group=[{0}], agg#0=[$SUM0($1)], agg#1=[COUNT()])
>>         LogicalFilter(condition=[=($0, 1)])
>>           LogicalProject(id=[$0], units=[$2])
>>             LogicalTableScan(table=[[SALES, Orders2]])
>>
>>> On Thu, Jun 2, 2016 at 2:07 PM, Julian Hyde <[email protected]> wrote:
>>>
>>> Why do you want to enable AggregateUnionAggregateRule[1]? It is doing
>>> the opposite of AggregateUnionTransposeRule.
>>>
>>> Julian
>>>
>>> [1] https://calcite.apache.org/apidocs/org/apache/calcite/rel/rules/AggregateUnionAggregateRule.html
>>>
>>> On Thu, Jun 2, 2016 at 2:03 PM, Ravikumar CS <[email protected]> wrote:
>>>>
>>>> Thanks Julian. FilterSetOpTransposeRule worked for pushing the filter
>>>> into the union.
>>>>
>>>> However, the partial aggregate logic doesn't seem to work even after
>>>> adding the rules AggregateUnionTransposeRule and AggregateUnionAggregateRule.
>>>>
>>>> *Query:* SELECT id, SUM(units) FROM (SELECT id, units FROM Orders1 UNION
>>>> ALL SELECT id, units FROM Orders2) WHERE id=1 GROUP BY id
>>>>
>>>> *Plan:*
>>>>
>>>> LogicalAggregate(group=[{0}], EXPR$1=[$SUM0($1)])
>>>>   LogicalUnion(all=[true])
>>>>     LogicalFilter(condition=[=($0, 1)])
>>>>       LogicalProject(id=[$0], units=[$2])
>>>>         LogicalTableScan(table=[[SALES, Orders1]])
>>>>     LogicalFilter(condition=[=($0, 1)])
>>>>       LogicalProject(id=[$0], units=[$2])
>>>>         LogicalTableScan(table=[[SALES, Orders2]])
>>>>
>>>> ~Ravi
>>>>
>>>>> On Thu, Jun 2, 2016 at 12:11 PM, Julian Hyde <[email protected]> wrote:
>>>>>
>>>>> I logged https://issues.apache.org/jira/browse/CALCITE-1271.
>>>>>
>>>>>> On Thu, Jun 2, 2016 at 12:11 PM, Julian Hyde <[email protected]> wrote:
>>>>>>
>>>>>> By the way, I noticed that FilterSetOpTransposeRule and
>>>>>> AggregateUnionTransposeRule are not part of the default rule set. We
>>>>>> should fix that.
>>>>>>
>>>>>>> On Thu, Jun 2, 2016 at 10:20 AM, Julian Hyde <[email protected]> wrote:
>>>>>>>
>>>>>>> You need to push the Filter into the Union. Otherwise the Aggregate
>>>>>>> is on top of a Filter, not a Union. Use FilterSetOpTransposeRule.
>>>>>>>
>>>>>>> Julian
>>>>>>>
>>>>>>>> On Jun 2, 2016, at 9:54 AM, Ravikumar CS <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I am trying to come up with an optimized relational expression which
>>>>>>>> does predicate push-downs and partial aggregates in preparation for
>>>>>>>> a distributed execution of the query.
>>>>>>>>
>>>>>>>> SQL of interest:
>>>>>>>>
>>>>>>>> SELECT col1, SUM(col2)
>>>>>>>> FROM (SELECT col1, col2 FROM Orders1
>>>>>>>>       UNION ALL
>>>>>>>>       SELECT col1, col2 FROM Orders2)
>>>>>>>> WHERE col1=1
>>>>>>>> GROUP BY col1;
>>>>>>>>
>>>>>>>> Relational expression - initial:
>>>>>>>>
>>>>>>>> LogicalAggregate(group=[{0}], EXPR$1=[AVG($1)])
>>>>>>>>   LogicalFilter(condition=[=($0, 1)])
>>>>>>>>     LogicalUnion(all=[true])
>>>>>>>>       LogicalProject(id=[$0], units=[$2])
>>>>>>>>         LogicalTableScan(table=[[SALES, Orders1]])
>>>>>>>>       LogicalProject(id=[$0], units=[$2])
>>>>>>>>         LogicalTableScan(table=[[SALES, Orders2]])
>>>>>>>>
>>>>>>>> Query optimization rules applied:
>>>>>>>>
>>>>>>>> HepProgram program = new HepProgramBuilder()
>>>>>>>>     .addRuleInstance(AggregateUnionAggregateRule.INSTANCE)
>>>>>>>>     .addRuleInstance(AggregateUnionTransposeRule.INSTANCE)
>>>>>>>>     .addRuleInstance(AggregateReduceFunctionsRule.INSTANCE)
>>>>>>>>     .build();
>>>>>>>> HepPlanner planner = new HepPlanner(program);
>>>>>>>> planner.setRoot(oldLogicalPlan);
>>>>>>>> RelNode newLogicalPlan = planner.findBestExp();
>>>>>>>>
>>>>>>>> Relational expression after query optimization:
>>>>>>>>
>>>>>>>> LogicalProject(id=[$0], EXPR$1=[CAST(/($1, $2)):INTEGER NOT NULL])
>>>>>>>>   LogicalAggregate(group=[{0}], agg#0=[$SUM0($1)], agg#1=[COUNT()])
>>>>>>>>     LogicalFilter(condition=[=($0, 1)])
>>>>>>>>       LogicalUnion(all=[true])
>>>>>>>>         LogicalProject(id=[$0], units=[$2])
>>>>>>>>           LogicalTableScan(table=[[SALES, Orders1]])
>>>>>>>>         LogicalProject(id=[$0], units=[$2])
>>>>>>>>           LogicalTableScan(table=[[SALES, Orders2]])
>>>>>>>>
>>>>>>>> Questions:
>>>>>>>>
>>>>>>>> 1. Are there any rules to push the predicate (col1=1) down to the
>>>>>>>> subqueries?
>>>>>>>>
>>>>>>>> 2. How can I rewrite the query such that the partial aggregates are
>>>>>>>> computed within each branch of the union (as below)? I tried
>>>>>>>> AggregateUnionTransposeRule and AggregateJoinTransposeRule; maybe I
>>>>>>>> missed something.
>>>>>>>>
>>>>>>>> SELECT col1, SUM(partialCol2) AS c
>>>>>>>> FROM (SELECT col1, SUM(col2) AS partialCol2 FROM Orders1
>>>>>>>>       WHERE col1=1 GROUP BY deptno
>>>>>>>>       UNION ALL
>>>>>>>>       SELECT col1, SUM(col2) AS partialCol2 FROM Orders2
>>>>>>>>       WHERE col1=1 GROUP BY deptno
>>>>>>>> ) GROUP BY col1;
