Trees are immutable, and TreeNode takes care of copying unchanged parts of
the tree when you are doing transformations.  As a result, even if you do
construct a DAG with the Dataset API, the first transformation will turn it
back into a tree.
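
As a rough illustration of why immutability makes this safe, here is a minimal toy model. It is not Catalyst's actual TreeNode; the names Plan, Scan, Project, Limit, Union, withNewChildren and Example are all hypothetical, and only transformUp borrows its name from the real API. The sketch assumes that a rule builds new nodes rather than mutating existing ones, so pushing a Limit down under one parent leaves the node seen by the other parent untouched.

```scala
// Toy immutable plan tree (hypothetical; not Spark's TreeNode).
sealed trait Plan {
  def children: Seq[Plan]
  def withNewChildren(newChildren: Seq[Plan]): Plan

  // Bottom-up rewrite: transform the children first, then apply the rule here.
  // Nodes are never mutated; a node is copied only if one of its children changed.
  def transformUp(rule: PartialFunction[Plan, Plan]): Plan = {
    val newChildren = children.map(_.transformUp(rule))
    val changed = newChildren.zip(children).exists { case (a, b) => a ne b }
    val rebuilt = if (changed) withNewChildren(newChildren) else this
    rule.applyOrElse(rebuilt, identity[Plan])
  }
}

final case class Scan(table: String) extends Plan {
  def children: Seq[Plan] = Nil
  def withNewChildren(newChildren: Seq[Plan]): Plan = this
}

final case class Project(column: String, child: Plan) extends Plan {
  def children: Seq[Plan] = Seq(child)
  def withNewChildren(newChildren: Seq[Plan]): Plan = copy(child = newChildren.head)
}

final case class Limit(n: Int, child: Plan) extends Plan {
  def children: Seq[Plan] = Seq(child)
  def withNewChildren(newChildren: Seq[Plan]): Plan = copy(child = newChildren.head)
}

final case class Union(left: Plan, right: Plan) extends Plan {
  def children: Seq[Plan] = Seq(left, right)
  def withNewChildren(newChildren: Seq[Plan]): Plan = Union(newChildren(0), newChildren(1))
}

object Example {
  // One Project instance referenced by two parents: a diamond-like shape.
  val shared: Plan = Project("a", Scan("t"))
  val diamond: Plan = Union(Limit(10, shared), shared)

  // Push a Limit below a Project by constructing new nodes.
  val pushLimit: PartialFunction[Plan, Plan] = {
    case Limit(n, Project(column, child)) => Project(column, Limit(n, child))
  }

  val rewritten: Plan = diamond.transformUp(pushLimit)
}
```

In this sketch the left branch of Example.rewritten is a freshly built Project over a Limit, while the right branch is still the original shared node, so the second parent never observes the pushed-down limit.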

The only exception to this rule is when we share the results of plans after
an Exchange operator.  This is the last step before execution and sometimes
turns the query into a DAG to avoid redundant computation.

On Tue, Mar 15, 2016 at 9:01 AM, Koert Kuipers <ko...@tresata.com> wrote:

> i am trying to understand some parts of the catalyst optimizer. but i
> struggle with one bigger picture issue:
>
> LogicalPlan extends TreeNode, which makes sense since the optimizations
> rely on tree transformations like transformUp and transformDown.
>
> but how can a LogicalPlan be a tree? isn't it really a DAG? if it is
> possible to create diamond-like operator dependencies, then assumptions
> made in tree transformations could be wrong? for example pushing a limit
> operator down into a child sounds safe, but if that same child is also used
> by another operator (so it has another parent, no longer a tree) then it's
> not safe at all.
>
> what am i missing here?
> thanks! koert
>
