Re: Re: Re: Volcano's problem with trait propagation: current state and future

Vladimir Ozerov Tue, 10 Dec 2019 05:51:08 -0800

HI Stamatis,

This paper doesn't explain the Calcite integration in detail required for
the purpose of our discussion. From what I see in the code, VolcanoPlanner
is only used for view materialization planning, which is also mentioned in
the paper. All other optimizations stages, including join rewrite, use HEP
approach. In other words, there is no mixed logical-physical planning in a
single optimization run.


This is my current understanding of the Hive's codebase. It would be nice
if Hive maintainers would confirm or disprove that.

Regards,
Vladimir.

вт, 10 дек. 2019 г. в 02:22, Stamatis Zampetakis <[email protected]>:

> It's been some time that I didn't look into the code but the most recent
> Hive paper [1] mostly talks about Calcite in the query optimization section
> so I have to say I am a bit surprised.
>
> [1] https://arxiv.org/pdf/1903.10970.pdf
>
> On Mon, Dec 9, 2019 at 6:21 PM Vladimir Ozerov <[email protected]> wrote:
>
> > After looking at Hive implementation I have the impression that it
> doesn't
> > use Apache Calcite for physical planning, hence it doesn't have the
> > problems mentioned in this topic.
> >
> > вс, 8 дек. 2019 г. в 18:55, Vladimir Ozerov <[email protected]>:
> >
> > > Hi Stamatis,
> > >
> > > Thank you for the idea about Hive. I looked at it some time ago and the
> > > codebase was substantially more complex to understand for me than in
> > other
> > > projects, so I gave up. I'll try to do the analysis again.
> > > I'd like to mention that I also had a thought that maybe the
> > > implementation of a top-down optimization is not a concern of
> > > VolcanoPlanner, and the brand new planner may play well here. But from
> a
> > > practical perspective, of course, I keep a hope that we will find a
> less
> > > intrusive way to introduce efficient physical optimization into
> > > VolcanoPlanner :-)
> > >
> > > Regards,
> > > Vladimir.
> > >
> > > вс, 8 дек. 2019 г. в 12:42, Stamatis Zampetakis <[email protected]>:
> > >
> > >> Thanks Vladimir for this great summary. It is really helpful to know
> how
> > >> the different projects use the optimizer and it certainly helps to
> > >> identify
> > >> limitations on our implementation.
> > >>
> > >> I cannot provide any valuable feedback at the moment since I have to
> > find
> > >> some time to read more carefully your analysis.
> > >>
> > >> In the meantime, I know that Hive is also using Calcite for quite some
> > >> time
> > >> now so maybe you can get some new ideas (or complete your background
> > >> study)
> > >> by looking in their code.
> > >>
> > >> @Haisheng: I think many people did appreciate the discussion for pull
> up
> > >> traits so I wouldn't say that we abandoned it. I had the impression
> that
> > >> we
> > >> were waiting a design doc.
> > >>
> > >> In general it may not be feasible to cover all use cases with a single
> > >> optimizer. I wouldn't find it bad to introduce another planner if
> there
> > >> are
> > >> enough reasons to do so.
> > >>
> > >> Best,
> > >> Stamatis
> > >>
> > >>
> > >> On Fri, Dec 6, 2019, 11:00 AM Vladimir Ozerov <[email protected]>
> > wrote:
> > >>
> > >> > "all we know is their *collations*" -> "all we know is their
> *traits*"
> > >> >
> > >> > пт, 6 дек. 2019 г. в 12:57, Vladimir Ozerov <[email protected]>:
> > >> >
> > >> > > Hi Haisheng,
> > >> > >
> > >> > > Thank you for your response. Let me elaborate my note on join
> > planning
> > >> > > first - what I was trying to say is not that rules on their own
> have
> > >> some
> > >> > > deficiencies. What I meant is that with current planner
> > >> implementation,
> > >> > > users tend to separate join planning from the core optimization
> > >> process
> > >> > > like this in the pseudo-code below. As a result, only one join
> > >> > permutation
> > >> > > is considered during physical planning, even though join rule may
> > >> > > potentially generate multiple plans worth exploring:
> > >> > >
> > >> > > RelNode optimizedLogicalNode = doJoinPlanning(logicalNode);
> > >> > > RelNode physicalNode = doPhysicalPlanning(optimizedLogicalNode);
> > >> > >
> > >> > > Now back to the main question. I re-read your thread about
> on-demand
> > >> > trait
> > >> > > propagation [1] carefully. I'd like to admit that when I was
> reading
> > >> it
> > >> > for
> > >> > > the first time about a month ago, I failed to understand some
> > details
> > >> due
> > >> > > to poor knowledge of different optimizer architectures. Now I
> > >> understand
> > >> > it
> > >> > > much better, and we definitely concerned with exactly the same
> > >> problem. I
> > >> > > feel that trait pull-up might be a step in the right direction,
> > >> however,
> > >> > it
> > >> > > seems to me that it is not the complete solution. Let me try to
> > >> explain
> > >> > why
> > >> > > I think so.
> > >> > >
> > >> > > The efficient optimizer should try to save CPU as much as possible
> > >> > because
> > >> > > it allows us to explore more plans in a sensible amount of time.
> To
> > >> > achieve
> > >> > > that we should avoid redundant operations, and detect and prune
> > >> > inefficient
> > >> > > paths aggressively. As far as I understand the idea of trait
> > pull-up,
> > >> we
> > >> > > essentially explore the space of possible physical properties of
> > >> children
> > >> > > nodes without forcing their implementation. But after that, the
> > >> Calcite
> > >> > > will explore that nodes again, now in order to execute
> > implementation
> > >> > > rules. I.e. we will do two dives - one to enumerate the nodes
> (trait
> > >> > > pull-up API), and the other one to implement them (implementation
> > >> rules),
> > >> > > while in Cascades one dive should be sufficient since exploration
> > >> invokes
> > >> > > the implementation rules as it goes. This is the first issue I
> see.
> > >> > >
> > >> > > The second one is more important - how to prune inefficient plans?
> > >> > > Currently, nodes are implemented independently and lack of context
> > >> > doesn't
> > >> > > allow us to estimates children's costs when implementing the
> parent,
> > >> > hence
> > >> > > branch-and-bound is not possible. Can trait pull-up API
> > >> > "List<RelTraitSet>
> > >> > > deriveTraitSets(RelNode, RelMetadataQuery)" help us with this? If
> > the
> > >> > > children nodes are not implemented before the pull-up, all we know
> > is
> > >> > their
> > >> > > collations, but not their costs. And without costs, pruning is not
> > >> > > possible. Please let me know if I missed something from the
> > proposal.
> > >> > >
> > >> > > The possible architecture I had in mind after reading multiple
> > papers,
> > >> > > which may answer all our questions, could look like this:
> > >> > > 1) We have a queue of nodes requiring optimization. Not a queue of
> > >> rules.
> > >> > > initial queue state is formed from the initial tree, top-down.
> > >> > > 2) The node is popped from the queue, and we enter
> > >> > > "node.optimize(maxCost)" call. It checks for matching rules,
> > >> prioritizes
> > >> > > them, and start their execution on by one. Execution of rules may
> > >> > re-insert
> > >> > > the current node into the queue, in which case this step is
> > repeated,
> > >> > > possibly many times
> > >> > > 3) Logical-logical rules (transformations) produce new logical
> nodes
> > >> and
> > >> > > put them into the queue for further optimization
> > >> > > 4) Logical-physical rules (implementation) do the following:
> > >> > > 4.1) Costs of logical children are estimated. The cost of a
> logical
> > >> node
> > >> > > should be less than any cost of a possible physical node that may
> be
> > >> > > produced out of it. If the logical cost exceeds "maxCost", we stop
> > and
> > >> > > return. The whole logical subspace is pruned even before
> > exploration.
> > >> > > 4.2) Recursively call "childNode.optimize(maxCost -
> > >> currentLogicalCost)"
> > >> > > method, which returns a set of possible physical implementations
> of
> > a
> > >> > > child. Returned physical children are already registered in proper
> > >> > > set/subset, but are not used for any pattern-matching, and doesn't
> > >> > trigger
> > >> > > more rule calls!
> > >> > > 4.3) Implementation rule checks the cost of the physical child. If
> > it
> > >> is
> > >> > > greater than any other already observed child with the same
> traits,
> > or
> > >> > > exceeds the "maxCost", it is discarded. Otherwise, the physical
> > >> > > implementation of the current node is produced and registered in
> the
> > >> > > optimizer.
> > >> > >
> > >> > > The pseudocode for physical implementation flow for join (two
> > inputs):
> > >> > >
> > >> > > Collection<RelNode> optimizePhysical(Cost maxCost) {
> > >> > >     // Estimated minimal self-cost. Any physical implementation of
> > >> this
> > >> > > node should have greater self-cost
> > >> > >     Cost logicalSelfCost = optimizer.getCost(this);
> > >> > >
> > >> > >     // *Pruning #1*: whatever children we implement, the total
> cost
> > >> will
> > >> > > be greater than the passed maxCost, so do not explore further
> > >> > >     Cost maxChildCost = maxCost - logicalSelfCost;
> > >> > >
> > >> > >     Cost logicalILeftCost = optimizer.getCost(leftLogicalNode);
> > >> > >     Cost logicalRightCost = optimizer.getCost(rightLogicalNode);
> > >> > >
> > >> > >     if (logicalLeftCost + logicalRightCost > maxChildCost) {
> > >> > >         return;
> > >> > >     }
> > >> > >
> > >> > >     // This is our equivalence set.
> > >> > >     RelSet equivalenceSet = this.getSet();
> > >> > >
> > >> > >     // Get promising physical implementations of child nodes
> > >> recursively
> > >> > >     List<RelNode> leftPhysicalNodes =
> > >> > > leftLogicalNode.optimizePhysical(maxChildCost);
> > >> > >     List<RelNode> rightPhysicalNodes =
> > >> > > rightLigicalNode.optimizePhysical(maxChildCost);
> > >> > >
> > >> > >     for (RelNode leftPhysicalNode : leftPhysicalNodes) {
> > >> > >         for (RelNode rightPhysicalNode : rightPhysicalNodes) {
> > >> > >             // *Pruning #2*: Combination of physical input costs
> is
> > >> > > already too expensive, give up
> > >> > >             Cost physicalLeftCost =
> > >> optimizer.getCost(leftPhysicalNode);
> > >> > >             Cost physicalRightCost =
> > >> > optimizer.getCost(rightPhysicalNode);
> > >> > >
> > >> > >             if (logicalILeftCost + logicalRightCost >
> maxChildCost)
> > {
> > >> > >                 continue.
> > >> > >             }
> > >> > >
> > >> > >             // Implement possible physical nodes for the given
> pair
> > of
> > >> > > inputs (maybe more than one)
> > >> > >             List<RelNode> physicalJoins =
> > implement(leftPhysicalNode,
> > >> > > rightPhysicalNode);
> > >> > >
> > >> > >             for (RelOptRule physicalJoin : physicalJoins) {
> > >> > >                // *Pruning #3*: Do not consider implementation if
> we
> > >> have
> > >> > > another one with the same trait set and smaller cost)
> > >> > >                 Cost physicalJoinCost =
> > >> optimizer.getCost(physicalJoin);
> > >> > >                 Cost bestCostForTraitSet =
> > >> > > equivalenceSet.getBestCost(physicalJoin.getTraitSet());
> > >> > >
> > >> > >                 if (physicalJoinCost > bestCostForTraitSet) {
> > >> > >                     continue.
> > >> > >                 }
> > >> > >
> > >> > >                 // This is a good implementation. Register it in
> the
> > >> set,
> > >> > > updating per-traitset best costs.
> > >> > >                 equivalenceSet.register(physicalJoin);
> > >> > >             }
> > >> > >         }
> > >> > >     }
> > >> > >
> > >> > >     // Return the best registered expressions with different
> > traitsets
> > >> > > from the current set.
> > >> > >     return equivalenceSet.getBestExps();
> > >> > > }
> > >> > >
> > >> > > This is a very rough pseudo-code, only to demonstrate the basic
> idea
> > >> on
> > >> > > how proper bottom-up propagation not only helps us set proper
> traits
> > >> for
> > >> > > the new physical node but also ensures that not optimal plans are
> > >> pruned
> > >> > as
> > >> > > early as possible. Real implementation should be better abstracted
> > and
> > >> > > accept enforcers as well.
> > >> > >
> > >> > > Also, please notice that the pseudo-code doesn't show when logical
> > >> rules
> > >> > > are fired. This is a separate question. One possible
> straightforward
> > >> way
> > >> > is
> > >> > > to add the aforementioned physical routine to normal Volcano flow:
> > >> > > 1) Fire logical rule on a node and register new nodes
> > >> > > 2) Fire physical optimization as shown above, then invoke
> > >> > > "call.transformTo()" for every returned physical rel
> > >> > > 3) Re-trigger the process for newly created nodes and their
> parents
> > >> > >
> > >> > > A better approach is to interleave logical and physical
> > >> optimizations, so
> > >> > > they trigger each other recursively. But this would require a
> > serious
> > >> > > redesign of a "rule queue" concept.
> > >> > >
> > >> > > Does it have any common points with your proposal?
> > >> > >
> > >> > > Regards,
> > >> > > Vladimir.
> > >> > >
> > >> > > [1]
> > >> > >
> > >> >
> > >>
> >
> https://ponymail-vm.apache.org/_GUI_/thread.html/79dac47ea50b5dfbd3f234e368ed61d247fb0eb989f87fe01aedaf25@%3Cdev.calcite.apache.org%3E
> > >> > >
> > >> > >
> > >> > > пт, 6 дек. 2019 г. в 05:41, Haisheng Yuan <[email protected]
> >:
> > >> > >
> > >> > >> Oh, I forgot to mention that the join planning/reordering is not
> a
> > >> big
> > >> > >> problem. Calcite just use the rule to generate a single
> alternative
> > >> > plan,
> > >> > >> which is not ideal.  But we can't say Calcite is doing wrong.
> > >> > >>
> > >> > >> Ideally, we expect it generates multiple (neither all, nor
> single)
> > >> > >> bipartie graphs. The join reordering rule will cut each part into
> > >> > bipartie
> > >> > >> recursively and apply JoinCommutativity rule to generate more
> > >> > alternatives
> > >> > >> for each RelSet. It is just a different strategy. We can modify
> the
> > >> > rule,
> > >> > >> or create new join reordering rule to generate multiple plan
> > >> > >> alternatives.
> > >> > >>
> > >> > >> - Haisheng
> > >> > >>
> > >> > >>
> ------------------------------------------------------------------
> > >> > >> 发件人：Haisheng Yuan<[email protected]>
> > >> > >> 日 期：2019年12月06日 09:07:43
> > >> > >> 收件人：Vladimir Ozerov<[email protected]>; [email protected]
> (
> > >> > >> [email protected])<[email protected]>
> > >> > >> 主 题：Re: Re: Volcano's problem with trait propagation: current
> state
> > >> and
> > >> > >> future
> > >> > >>
> > >> > >> Generally agree with what Vladimir said. I think what Calcite has
> > is
> > >> > >> logical optimization or exploration, there are almost no physical
> > >> > >> optimization, Calcite leaves it to third party implementators.
> One
> > >> of my
> > >> > >> friends at University of Wisconsin–Madison database research
> group
> > >> told
> > >> > me
> > >> > >> that they gave up the idea of using Calcite in their project due
> to
> > >> this
> > >> > >> reason.
> > >> > >>
> > >> > >> Currently physical properties are requested in implementation
> > rules,
> > >> or
> > >> > >> even logical exploration rules, But each rule is independent, the
> > >> > >> pattern-matched expression is not aware of what does the parent
> > >> operator
> > >> > >> want. Using AbstractConverter doesn't help, and is not promising.
> > >> > >>
> > >> > >> >> You shouldn't regiester all logical rules in the planner
> > >> > >> simultaneously,... as Drill does.
> > >> > >> That is because Calcite does too many redundant or duplicate rule
> > >> > >> matching, like all kinds of transpose (can't be avoided due to
> > >> current
> > >> > >> design), matching physical operators.
> > >> > >>
> > >> > >> >> decoupling the logical planning from the physical one looks
> > >> > >> a bit weird to me because it violates the idea of Cascades
> > framework.
> > >> > >> Orca optimizer fully adopted the design principle of Cascades
> > >> framework
> > >> > >> except that it separates into 3 phases: logical exploration,
> > physical
> > >> > >> implementation, and optimization (property enforcing). And it
> might
> > >> be
> > >> > >> easier if we want to enable parallel optimization by seperating
> > into
> > >> 3
> > >> > >> phases. Orca does branch-and-bound in optimization phase, before
> > >> actual
> > >> > >> property derivation and enforcement, IIRC. It is highly
> efficient,
> > >> works
> > >> > >> pretty well, and battlefield-tested by many large financial and
> > >> > insurance
> > >> > >> companies.
> > >> > >>
> > >> > >> In my last thread about on-demand trait request, I gave the
> > >> high-level
> > >> > >> general API for physical operators to derive and require physical
> > >> > >> properties, which is similar to Orca's design. But seems like the
> > >> > proposal
> > >> > >> of API change gets no love.
> > >> > >>
> > >> > >> - Haisheng
> > >> > >>
> > >> > >>
> ------------------------------------------------------------------
> > >> > >> 发件人：Vladimir Ozerov<[email protected]>
> > >> > >> 日 期：2019年12月05日 22:22:43
> > >> > >> 收件人：[email protected] ([email protected])<
> > >> > >> [email protected]>
> > >> > >> 主 题：Re: Volcano's problem with trait propagation: current state
> and
> > >> > future
> > >> > >>
> > >> > >> AbstractConverter-s are attractive because they effectively
> emulate
> > >> > >> straightforward recursive top-down optimization
> (Volcano/Cascades).
> > >> But
> > >> > >> instead of doing it with a recursive method call, which preserves
> > the
> > >> > >> context, we do this in Calcite as a sequence of unrelated rule
> > calls,
> > >> > thus
> > >> > >> losing the context. So with my current understanding, it could be
> > >> > thought
> > >> > >> of not as a search space explosion, but rather than the
> inefficient
> > >> > >> implementation of an otherwise straightforward
> > parent->child->parent
> > >> > >> navigation, since we achieve this navigation indirectly through
> the
> > >> rule
> > >> > >> queue, rather than through a normal method call. In any case, the
> > net
> > >> > >> result is wasted CPU. Perhaps this is not exponential waste, but
> > some
> > >> > >> multiplication of otherwise optimal planning. As I mentioned, in
> > our
> > >> > >> experiments with TPC-H, we observed a constant factor between
> 6-9x
> > >> > between
> > >> > >> the number of joins and the number of join implementation rule
> > >> > >> invocations.
> > >> > >> It doesn't growth past 9 even for complex queries, so I hope that
> > >> this
> > >> > is
> > >> > >> not an exponent :-)
> > >> > >>
> > >> > >> Speaking of logical vs physical optimization, IMO it makes sense
> in
> > >> some
> > >> > >> cases. E.g. when doing predicate pushdown, you do not want to
> > >> consider
> > >> > >> intermediate logical tree states for implementation, until the
> > >> predicate
> > >> > >> reaches its final position. So running separate logical planning
> > >> phase
> > >> > >> with
> > >> > >> Volcano optimizer makes total sense to me, because it effectively
> > >> > prunes a
> > >> > >> lot of not optimal logical plans before they reach the physical
> > >> planning
> > >> > >> stage. The real problem to me is that we forced to remove join
> > >> planning
> > >> > >> from the physical optimization stage. Because the goal of join
> > >> planning
> > >> > >> not
> > >> > >> to generate a single optimal plan, like with predicate pushdown,
> > but
> > >> > >> rather
> > >> > >> to generate a set of logical plans all of which should be
> > implemented
> > >> > and
> > >> > >> estimated. And with AbstractConverter-s this is not possible
> > because
> > >> of
> > >> > >> their multiplicator increases the rate of search space growth,
> > making
> > >> > join
> > >> > >> planning inapplicable even for the small number of relations. So
> we
> > >> have
> > >> > >> to
> > >> > >> move them to the logical planning stage and pick only one
> > permutation
> > >> > for
> > >> > >> physical planning.
> > >> > >>
> > >> > >>
> > >> > >> чт, 5 дек. 2019 г. в 15:35, Roman Kondakov
> > >> <[email protected]
> > >> > >:
> > >> > >>
> > >> > >> > Vladimir,
> > >> > >> >
> > >> > >> > thank you for bringing it up. We are facing the same problems
> in
> > >> > Apache
> > >> > >> > Ignite project
> > >> > >> > and it would be great if Apache Calcite community will propose
> a
> > >> > >> > solution for this
> > >> > >> > issue.
> > >> > >> >
> > >> > >> > From my point of view an approach with abstract converters
> looks
> > >> more
> > >> > >> > promising, but as
> > >> > >> > you mentioned it suffers from polluting the search space. The
> > >> latter
> > >> > can
> > >> > >> > be mitigated by
> > >> > >> > splitting a planning stage into the several phases: you
> shouldn't
> > >> > >> > register all logical rules in the planner simultaneously - it
> > looks
> > >> > like
> > >> > >> > it is better to have several iterations of planning stage with
> > >> > different
> > >> > >> > sets of rules, as Drill does.
> > >> > >> >
> > >> > >> > Also I'd like to mention that decoupling the logical planning
> > from
> > >> the
> > >> > >> > physical one looks
> > >> > >> > a bit weird to me because it violates the idea of Cascades
> > >> framework.
> > >> > >> > Possibly this decoupling is the consequence of some performance
> > >> > issues.
> > >> > >> >
> > >> > >> >
> > >> > >> > --
> > >> > >> > Kind Regards
> > >> > >> > Roman Kondakov
> > >> > >> >
> > >> > >> > On 05.12.2019 14:24, Vladimir Ozerov wrote:
> > >> > >> > > Hi,
> > >> > >> > >
> > >> > >> > > As I mentioned before, we are building a distributed SQL
> engine
> > >> that
> > >> > >> uses
> > >> > >> > > Apache Calcite for query optimization. The key problem we
> faced
> > >> is
> > >> > the
> > >> > >> > > inability to pull the physical traits of child relations
> > >> > efficiently.
> > >> > >> I'd
> > >> > >> > > like to outline my understanding of the problem (I guess it
> was
> > >> > >> already
> > >> > >> > > discussed multiple times) and ask the community to prove or
> > >> disprove
> > >> > >> the
> > >> > >> > > existence of that problem and its severity for the products
> > which
> > >> > uses
> > >> > >> > > Apache Calcite and ask for ideas on how it could be improved
> in
> > >> the
> > >> > >> > future.
> > >> > >> > >
> > >> > >> > > I'll start with the simplified problem description and
> > mentioned
> > >> > more
> > >> > >> > > complex use cases then. Consider that we have a logical tree
> > and
> > >> a
> > >> > >> set of
> > >> > >> > > implementation rules. Our goal is to find the optimal
> physical
> > >> tree
> > >> > by
> > >> > >> > > applying these rules. The classical Cascades-based approach
> > >> directs
> > >> > >> the
> > >> > >> > > optimization process from the top to the bottom (hence
> > >> "top-down").
> > >> > >> > > However, the actual implementation of tree nodes still
> happens
> > >> > >> bottom-up.
> > >> > >> > > For the tree L1 <- L2, we enter "optimize(L1)", which
> > recursively
> > >> > >> > delegates
> > >> > >> > > to "optimize(L2)". We then implement children nodes L1 <-
> [P2',
> > >> > P2''],
> > >> > >> > and
> > >> > >> > > return back to the parent, which is now able to pick
> promising
> > >> > >> > > implementations of the children nodes and reject bad ones
> with
> > >> the
> > >> > >> > > branch-and-bound approach. AFAIK Pivotal's Orca works this
> way.
> > >> > >> > >
> > >> > >> > > The Apache Calcite is very different because it doesn't allow
> > the
> > >> > >> > recursion
> > >> > >> > > so that we lose the context on which node requested the child
> > >> > >> > > transformation. This loss of context leads to the following
> > >> > problems:
> > >> > >> > > 1) The parent node cannot deduce it's physical properties
> > during
> > >> the
> > >> > >> > > execution of the implementation rule, because Calcite expects
> > the
> > >> > >> > > transformation to be applied before children nodes are
> > >> implemented.
> > >> > >> That
> > >> > >> > is
> > >> > >> > > if we are optimizing LogicalProject <- LogicalScan, we cannot
> > set
> > >> > >> proper
> > >> > >> > > distribution and collation for the to be created
> > PhysicalProject,
> > >> > >> because
> > >> > >> > > it depends on the distribution and collation of the children
> > >> which
> > >> > is
> > >> > >> yet
> > >> > >> > > to be resolved.
> > >> > >> > > 2) The branch-and-bound cannot be used because it requires at
> > >> least
> > >> > >> one
> > >> > >> > > fully-built physical subtree.
> > >> > >> > >
> > >> > >> > > As a result of this limitation, products which rely on Apache
> > >> > Calcite
> > >> > >> for
> > >> > >> > > query optimization, use one or several workarounds:
> > >> > >> > >
> > >> > >> > > *1) Guess the physical properties of parent nodes before
> > logical
> > >> > >> children
> > >> > >> > > are implemented*
> > >> > >> > > *Apache Flink* uses this strategy. The strategy is bad
> because
> > of
> > >> > the
> > >> > >> > > number of combinations of traits growth exponentially with
> the
> > >> > number
> > >> > >> of
> > >> > >> > > attributes in the given RelNode, so you either explode the
> > search
> > >> > >> space
> > >> > >> > or
> > >> > >> > > give up optimization opportunities. Consider the following
> > tree:
> > >> > >> > > LogicalSort[a ASC] <- LogicalFilter <- LogicalScan
> > >> > >> > > The optimal implementation of the LogicalFilter is
> > >> > >> > PhysicalFilter[collation=a
> > >> > >> > > ASC] because it may eliminate the parent sort. But such
> > >> optimization
> > >> > >> > should
> > >> > >> > > happen only if we know that there is a physical
> implementation
> > of
> > >> > scan
> > >> > >> > > allowing for this sort order, e.g.
> > PhysicalIndexScan[collation=a
> > >> > ASC].
> > >> > >> > I.e.
> > >> > >> > > we need to know the child physical properties first.
> Otherwise
> > we
> > >> > >> > fallback
> > >> > >> > > to speculative approaches. With the *optimistic* approach, we
> > >> emit
> > >> > all
> > >> > >> > > possible combinations of physical properties, with the hope
> > that
> > >> the
> > >> > >> > child
> > >> > >> > > will satisfy some of them, thus expanding the search space
> > >> > >> exponentially.
> > >> > >> > > With the *pessimistic* approach, we just miss this
> optimization
> > >> > >> > opportunity
> > >> > >> > > even if the index exists. Apache Flink uses the pessimistic
> > >> > approach.
> > >> > >> > >
> > >> > >> > > *2) Use AbstractConverters*
> > >> > >> > > *Apache Drill* uses this strategy. The idea is to "glue"
> > logical
> > >> and
> > >> > >> > > physical operators, so that implementation of a physical
> child
> > >> > >> > re-triggers
> > >> > >> > > implementation rule of a logical parent. The flow is as
> > follows:
> > >> > >> > > - Invoke parent implementation rule - it either doesn't
> produce
> > >> new
> > >> > >> > > physical nodes or produce not optimized physical nodes (like
> in
> > >> the
> > >> > >> > Apache
> > >> > >> > > Flink case)
> > >> > >> > > - Invoke children implementation rules and create physical
> > >> children
> > >> > >> > > - Then converters kick-in and re-trigger parent
> implementation
> > >> rule
> > >> > >> > through
> > >> > >> > > the creation of an abstract converter
> > >> > >> > > - Finally, the parent implementation rule is fired again and
> > now
> > >> it
> > >> > >> > > produces optimized node(s) since at least some of the
> physical
> > >> > >> > > distributions of children nodes are implemented.
> > >> > >> > >
> > >> > >> > > Note that this is essentially a hack to simulate the Cascades
> > >> flow!
> > >> > >> The
> > >> > >> > > problem is that AbstractConverters increase the complexity of
> > >> > planning
> > >> > >> > > because they do not have any context, so parent rules are
> just
> > >> > >> > re-triggered
> > >> > >> > > blindly. Consider the optimization of the following tree:
> > >> > >> > > LogicalJoin <- [LogicalScan1, LogicalScan2]
> > >> > >> > > With the converter approach, the join implementation rule
> will
> > be
> > >> > >> fired
> > >> > >> > at
> > >> > >> > > least 3 times, while in reality, one call should be
> sufficient.
> > >> In
> > >> > our
> > >> > >> > > experiments with TPC-H queries, the join rule implemented
> that
> > >> way
> > >> > is
> > >> > >> > > typically called 6-9 times more often than expected.
> > >> > >> > >
> > >> > >> > > *3) Transformations (i.e. logical optimization) are decoupled
> > >> from
> > >> > >> > > implementation (i.e. physical optimization)*
> > >> > >> > > Normally, you would like to mix both logical and physical
> rules
> > >> in a
> > >> > >> > single
> > >> > >> > > optimization program, because it is required for proper
> > planning.
> > >> > That
> > >> > >> > is,
> > >> > >> > > you should consider both (Ax(BxC)) and ((AxB)xC) join
> ordering
> > >> > during
> > >> > >> > > physical optimization, because you do not know which one will
> > >> > produce
> > >> > >> the
> > >> > >> > > better plan in advance.
> > >> > >> > > But in some practical implementations of Calcite-based
> > >> optimizers,
> > >> > >> this
> > >> > >> > is
> > >> > >> > > not the case, and join planning is performed as a separate
> HEP
> > >> > stage.
> > >> > >> > > Examples are Apache Drill and Apache Flink.
> > >> > >> > > I believe that lack of Cascades-style flow and
> branch-and-bound
> > >> are
> > >> > >> among
> > >> > >> > > the main reasons for this. At the very least for Apache
> Drill,
> > >> since
> > >> > >> it
> > >> > >> > > uses converters, so additional logical permutations will
> > >> > exponentially
> > >> > >> > > multiply the number of fired rules, which is already very
> big.
> > >> > >> > >
> > >> > >> > > Given all these problems I'd like to ask the community to
> share
> > >> > >> current
> > >> > >> > > thoughts and ideas on the future improvement of the Calcite
> > >> > optimizer.
> > >> > >> > One
> > >> > >> > > of the ideas being discussed in the community is "Pull-up
> > >> Traits",
> > >> > >> which
> > >> > >> > > should allow the parent node to get physical properties of
> the
> > >> > >> children
> > >> > >> > > nodes. But in order to do this, you effectively need to
> > implement
> > >> > >> > children,
> > >> > >> > > which IMO makes this process indistinguishable from the
> > classical
> > >> > >> > recursive
> > >> > >> > > Cascades algorithm.
> > >> > >> > >
> > >> > >> > > Have you considered recursive transformations as an
> alternative
> > >> > >> solution
> > >> > >> > to
> > >> > >> > > that problem? I.e. instead of trying to guess or pull the
> > >> physical
> > >> > >> > > properties of non-existent physical nodes, go ahead and
> > actually
> > >> > >> > implement
> > >> > >> > > them directly from within the parent rule? This may resolve
> the
> > >> > >> problem
> > >> > >> > > with trait pull-up, as well as allow for branch-and-bound
> > >> > >> optimization.
> > >> > >> > >
> > >> > >> > > I would appreciate your feedback or any hints for future
> > >> research.
> > >> > >> > >
> > >> > >> > > Regards,
> > >> > >> > > Vladimir.
> > >> > >> > >
> > >> > >> >
> > >> > >>
> > >> > >>
> > >> > >>
> > >> >
> > >>
> > >
> >
>

Re: Re: Re: Volcano's problem with trait propagation: current state and future

Reply via email to