[
https://issues.apache.org/jira/browse/CALCITE-481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225398#comment-14225398
]
Julian Hyde commented on CALCITE-481:
-------------------------------------
A good motivating example: in Lattice test, add
{code}
@Test public void testSpool() {
  foodmartModel()
      .query("select count(*) as c from (select * from \"days\"), "
          + "(select * from \"days\")")
      .explainContains("Aggregate...\n"
          + " Join...\n"
          + " Spool(spoolId: 1)\n"
          + " Scan(days)\n"
          + " Spool(spoolId: 1)\n"
          + " Scan(days)\n")
      .returnsUnordered("C=49");
}
{code}
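The plan above shows two Spool nodes with the same spoolId, which suggests that both would read one shared result. A minimal sketch of that idea follows. It is not Calcite code: the `Spool` class, its `read()` method and the `evaluations` counter are hypothetical names invented for illustration, and the seven-row input is an assumption that is merely consistent with the expected C=49.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

final class SpoolSketch {
  /** Hypothetical spool: evaluates its input once, then replays the buffer. */
  static final class Spool<T> {
    private final Supplier<List<T>> input;
    private List<T> buffer;   // null until the first reader arrives
    int evaluations = 0;      // how many times the input was actually run

    Spool(Supplier<List<T>> input) {
      this.input = input;
    }

    /** Returns the spooled rows, populating the buffer on first use only. */
    List<T> read() {
      if (buffer == null) {
        buffer = new ArrayList<>(input.get());
        evaluations++;
      }
      return buffer;
    }
  }

  /** Counts the rows of a cross join whose two sides read the same spool. */
  static long crossJoinCount(Spool<String> spool) {
    long count = 0;
    for (String left : spool.read()) {
      for (String right : spool.read()) {
        count++;
      }
    }
    return count;
  }

  public static void main(String[] args) {
    // Assumption: seven rows, standing in for the "days" table.
    Spool<String> days = new Spool<>(() -> Arrays.asList(
        "Sunday", "Monday", "Tuesday", "Wednesday",
        "Thursday", "Friday", "Saturday"));
    System.out.println("C=" + crossJoinCount(days));
    System.out.println("evaluations=" + days.evaluations);
  }
}
```

In this sketch both sides of the join call `read()`, but the input supplier runs only once; a real physical operator could equally use a temporary table or some other mechanism.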
> Add "Spool" operator, to allow re-use of relational expressions
> ---------------------------------------------------------------
>
> Key: CALCITE-481
> URL: https://issues.apache.org/jira/browse/CALCITE-481
> Project: Calcite
> Issue Type: Bug
> Reporter: Julian Hyde
> Assignee: Julian Hyde
>
> If a sub-tree occurs more than once in a query, an efficient plan would
> probably evaluate it once and have two readers read the same data. We propose a
> "Spool" relational expression for this purpose.
> Spool would have one input, the expression that populates it.
> In the VolcanoPlanner, any RelNode can already have multiple consumers (each
> of which sees the same row type and the same data) but an optimal plan does
> not typically include multiple uses of the same node, so most implementors
> (e.g. EnumerableRelImplementor) would just not notice, and generate the same
> code twice. Having an explicit Spool would alert the implementor to re-use
> the result.
> We do not prescribe a mechanism for implementing Spool as a physical
> operator. A job that populates a temporary table is one possible mechanism.
> As part of this case, we should implement Spool in Enumerable convention, and
> use it to evaluate some test queries.
> The other reason to implement Spool is costing. The cost of a Spool with N
> consumers is typically something like A + B * N, where A, the fixed cost, is
> significantly larger than B, the re-play cost.
> Volcano's dynamic programming model does not make it easy to account for
> re-use. There are approaches in academia based on integer linear programming;
> see e.g. http://www.slideshare.net/INRIA-OAK/plreuse
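
The cost formula in the description can be sketched as follows. This is only an illustration: the method names are hypothetical, the numbers are made up, and comparing against "evaluate the input once per consumer" is an assumption about what the alternative plan would cost, not something the issue specifies.

```java
final class SpoolCostSketch {
  /** Cost of a spool with n consumers: fixed cost A plus replay cost B per consumer. */
  static double spoolCost(double fixedCost, double replayCost, int consumers) {
    return fixedCost + replayCost * consumers;
  }

  /** Assumed alternative: evaluate the input separately for each consumer. */
  static double recomputeCost(double inputCost, int consumers) {
    return inputCost * consumers;
  }

  /** True if spooling is cheaper than re-evaluating the input per consumer. */
  static boolean spoolIsCheaper(double fixedCost, double replayCost,
      double inputCost, int consumers) {
    return spoolCost(fixedCost, replayCost, consumers)
        < recomputeCost(inputCost, consumers);
  }

  public static void main(String[] args) {
    // Made-up figures: A = 100 (evaluate and materialize), B = 10 (replay),
    // and 80 for evaluating the input once without a spool.
    double a = 100, b = 10, inputCost = 80;
    for (int n = 1; n <= 3; n++) {
      System.out.println("N=" + n
          + " spool=" + spoolCost(a, b, n)
          + " recompute=" + recomputeCost(inputCost, n)
          + " spoolIsCheaper=" + spoolIsCheaper(a, b, inputCost, n));
    }
  }
}
```

With these figures the spool loses for a single consumer and wins from two consumers onward, which is why the planner needs to know N before it can cost the node.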