[ https://issues.apache.org/jira/browse/CALCITE-1440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15576622#comment-15576622 ]
Julian Hyde commented on CALCITE-1440:
--------------------------------------

I definitely think it should be done in VolcanoPlanner - dynamic programming is even more important for multi-root trees.

I initially thought we could change the {{RelOptPlanner}} methods
{code}
void setRoot(RelNode rel);
RelNode findBestExp();
{code}
to
{code}
void setRoots(List<RelNode> rels);
List<RelNode> findBestExps();
{code}
The idea being to optimize several relational expressions at the same time.

But then I realized we can combine the relational expressions using a new relational operator:
{code}
public class Combine extends AbstractRelNode {
  protected final ImmutableList<RelNode> inputs;

  public Combine(RelOptCluster cluster, RelTraitSet traitSet, List<RelNode> inputs) { ... }
}
{code}
This lets us pass multiple relational operators into and out of the planner, so we don't need to change {{setRoot}} and {{findBestExp}}.

{{Combine}} is similar to {{Union}} except that it doesn't require the inputs to have the same row type. Some more points about it:
* In order to execute, all of the inputs need to be executed, and therefore they all contribute to its cost.
* DML operations (insert, update, delete) are modeled as relational operators (albeit they return a single row with a single "row count" column) and therefore {{Combine}} can be used to represent a query that consists of multiple queries and DML statements.
** We'd need to take care if one statement executes after another and is intended to see the data that it produced. For example, in the following, the statements are seeing different {{emp}} relations:{code}
UPDATE emp SET sal = sal * 2 WHERE deptno = 10;
SELECT * FROM emp WHERE sal > 1000;
{code}
* A concrete implementation of {{Combine}} would make all of the constituent relational expressions accessible (say, as JDBC ResultSets), but we're mainly interested in it as a "binder" for planning purposes.
* Different variants of {{Combine}} might specify that the constituent queries run in series, or parallel, or some more complex order, or just say "I don't care". Sequencing matters a lot when we get to physical optimization (e.g. allocating scarce memory), but I don't think it matters much during logical optimization.
* The interesting plans produced for a {{Combine}} query almost certainly involve a {{Spool}} operator (see CALCITE-481). The {{Combine}} and {{Spool}} operators have a similar purpose: they both aim to make "difficult" graphs (forests and DAGs, respectively) look like trees.
* There would be changes to the various metadata classes. E.g. we'd add a {{public RelOptCost getCumulativeCost(Combine rel, RelMetadataQuery mq)}} method to one of the metadata providers.

> Implement planner for converting multiple SQL statements to unified RelNode Tree
> --------------------------------------------------------------------------------
>
>                 Key: CALCITE-1440
>                 URL: https://issues.apache.org/jira/browse/CALCITE-1440
>             Project: Calcite
>          Issue Type: New Feature
>            Reporter: Chinmay Kolhatkar
>            Assignee: Julian Hyde
>
> This can be implemented as a separate planner or in {{VolcanoPlanner}}
> itself. The planner should take multiple SQL statements as input and return a
> unified {{RelNode}} tree.
> Example of above is as follows:
> {{SELECT COL1, COL2 FROM TABLE WHERE COL3 > 10;}}
> {{SELECT COL1, COL2 FROM TABLE WHERE COL4 = 'abc';}}
> The above 2 statements have a common path and hence can provide a unified
> {{RelNode}} tree as follows:
> {noformat}
> [Scan] -> [Project (COL1, COL2)] -> [Filter (COL4 = 'abc')] -> [Delta]
>                    |
>                    V
>          [Filter (COL3 > 10)]
>                    |
>                    v
>                [Delta]
> {noformat}
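
As a rough illustration of two points from the comment (every input of a {{Combine}} contributes to its cost, and sharing a common sub-plan is what makes the combined plan interesting), here is a minimal, self-contained Java sketch. It deliberately does not use Calcite classes: {{Node}}, {{treeCost}}, {{dagCost}} and all cost figures are hypothetical, chosen only to mirror the two-filter example in the issue description. Counting a shared node once is an assumption standing in for whatever a {{Spool}}-style plan would actually achieve.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

final class CombineSketch {
    /** Hypothetical stand-in for a relational expression; NOT Calcite's RelNode. */
    static final class Node {
        final String name;
        final double selfCost; // made-up cost of this operator alone
        final List<Node> inputs;

        Node(String name, double selfCost, Node... inputs) {
            this.name = name;
            this.selfCost = selfCost;
            this.inputs = Arrays.asList(inputs);
        }
    }

    /** Tree-style cumulative cost: a shared input is paid for once per parent. */
    static double treeCost(Node node) {
        double total = node.selfCost;
        for (Node input : node.inputs) {
            total += treeCost(input);
        }
        return total;
    }

    /** DAG-style cumulative cost: each distinct node is paid for exactly once. */
    static double dagCost(Node node) {
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        return dagCost(node, seen);
    }

    private static double dagCost(Node node, Set<Node> seen) {
        if (!seen.add(node)) {
            return 0d; // already counted through another parent
        }
        double total = node.selfCost;
        for (Node input : node.inputs) {
            total += dagCost(input, seen);
        }
        return total;
    }

    /** Builds the two-query example: both filters read the same Scan -> Project. */
    static Node example() {
        Node scan = new Node("Scan", 100);
        Node project = new Node("Project(COL1, COL2)", 10, scan);
        Node filterA = new Node("Filter(COL3 > 10)", 20, project);
        Node filterB = new Node("Filter(COL4 = 'abc')", 20, project);
        // The combining node adds no cost of its own in this sketch.
        return new Node("Combine", 0, filterA, filterB);
    }

    public static void main(String[] args) {
        Node root = example();
        System.out.println("tree cost = " + treeCost(root)); // 260.0
        System.out.println("dag cost  = " + dagCost(root));  // 150.0
    }
}
```

With these invented numbers the shared Scan and Project are charged twice under the tree view and once under the DAG view, which is the gap a planner working on the combined expression could try to exploit.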