Are you suggesting this only for UNION? I assume not, but I'm not sure I
see how the approach you're suggesting will generalize. Of course, we also
need to consider *how* to reuse the results of the expression.

The approach sounds like it could be reasonable, but I don't fully
understand what you're proposing. In any case, I'm happy to see the
discussion starting again on this!

--
Michael Mior
[email protected]


Le mar. 26 mars 2019 à 08:02, Roger Shi <[email protected]> a
écrit :

> Hi,
>
> I'm investigating how to consider relational expression reuse in Volcano
> cost model. CALCITE-481<https://jira.apache.org/jira/browse/CALCITE-481>
> is a good start point and the JIRA description mentions integer linear
> programming, whose compute complexity is too high in some cases. So I
> propose a simpler method, and let's discuss it here.
>
> For instance, here's example a Union with two inputs.
>
>
> UnionAll
>
>     input(0) ==  Filter - Scan
>     input(1) ==  Filter - Scan
>
> In currently implementation the cumulative cost of Union is Cost(UnionAll)
> + Cost(Filter) * 2 + Cost(Scan) * 2. If filter and the scan is the same,
> the real cost is Cost(UnionAll) + Cost(Filter) + Cost(Scan). The key point
> is how to detect the reused expression. We could resolve it by recording
> the RelNodes used in cost calculating, and it's easy to find Filter and
> Scan is computed twice.
>
> On one hand, the method should be faster than integer linear programming.
> On the other hand it may miss the "real" optimal plan because it considers
> only local cheapest cost in subset cost computing.
>
> What do you think about the method? Comments are welcome. 🙂
>

Reply via email to