Re: Re: Re: Volcano's problem with trait propagation: current state and future

Vladimir Ozerov Mon, 09 Dec 2019 09:21:11 -0800

After looking at Hive implementation I have the impression that it doesn't
use Apache Calcite for physical planning, hence it doesn't have the
problems mentioned in this topic.


вс, 8 дек. 2019 г. в 18:55, Vladimir Ozerov <[email protected]>:

> Hi Stamatis,
>
> Thank you for the idea about Hive. I looked at it some time ago and the
> codebase was substantially more complex to understand for me than in other
> projects, so I gave up. I'll try to do the analysis again.
> I'd like to mention that I also had a thought that maybe the
> implementation of a top-down optimization is not a concern of
> VolcanoPlanner, and the brand new planner may play well here. But from a
> practical perspective, of course, I keep a hope that we will find a less
> intrusive way to introduce efficient physical optimization into
> VolcanoPlanner :-)
>
> Regards,
> Vladimir.
>
> вс, 8 дек. 2019 г. в 12:42, Stamatis Zampetakis <[email protected]>:
>
>> Thanks Vladimir for this great summary. It is really helpful to know how
>> the different projects use the optimizer and it certainly helps to
>> identify
>> limitations on our implementation.
>>
>> I cannot provide any valuable feedback at the moment since I have to find
>> some time to read more carefully your analysis.
>>
>> In the meantime, I know that Hive is also using Calcite for quite some
>> time
>> now so maybe you can get some new ideas (or complete your background
>> study)
>> by looking in their code.
>>
>> @Haisheng: I think many people did appreciate the discussion for pull up
>> traits so I wouldn't say that we abandoned it. I had the impression that
>> we
>> were waiting a design doc.
>>
>> In general it may not be feasible to cover all use cases with a single
>> optimizer. I wouldn't find it bad to introduce another planner if there
>> are
>> enough reasons to do so.
>>
>> Best,
>> Stamatis
>>
>>
>> On Fri, Dec 6, 2019, 11:00 AM Vladimir Ozerov <[email protected]> wrote:
>>
>> > "all we know is their *collations*" -> "all we know is their *traits*"
>> >
>> > пт, 6 дек. 2019 г. в 12:57, Vladimir Ozerov <[email protected]>:
>> >
>> > > Hi Haisheng,
>> > >
>> > > Thank you for your response. Let me elaborate my note on join planning
>> > > first - what I was trying to say is not that rules on their own have
>> some
>> > > deficiencies. What I meant is that with current planner
>> implementation,
>> > > users tend to separate join planning from the core optimization
>> process
>> > > like this in the pseudo-code below. As a result, only one join
>> > permutation
>> > > is considered during physical planning, even though join rule may
>> > > potentially generate multiple plans worth exploring:
>> > >
>> > > RelNode optimizedLogicalNode = doJoinPlanning(logicalNode);
>> > > RelNode physicalNode = doPhysicalPlanning(optimizedLogicalNode);
>> > >
>> > > Now back to the main question. I re-read your thread about on-demand
>> > trait
>> > > propagation [1] carefully. I'd like to admit that when I was reading
>> it
>> > for
>> > > the first time about a month ago, I failed to understand some details
>> due
>> > > to poor knowledge of different optimizer architectures. Now I
>> understand
>> > it
>> > > much better, and we definitely concerned with exactly the same
>> problem. I
>> > > feel that trait pull-up might be a step in the right direction,
>> however,
>> > it
>> > > seems to me that it is not the complete solution. Let me try to
>> explain
>> > why
>> > > I think so.
>> > >
>> > > The efficient optimizer should try to save CPU as much as possible
>> > because
>> > > it allows us to explore more plans in a sensible amount of time. To
>> > achieve
>> > > that we should avoid redundant operations, and detect and prune
>> > inefficient
>> > > paths aggressively. As far as I understand the idea of trait pull-up,
>> we
>> > > essentially explore the space of possible physical properties of
>> children
>> > > nodes without forcing their implementation. But after that, the
>> Calcite
>> > > will explore that nodes again, now in order to execute implementation
>> > > rules. I.e. we will do two dives - one to enumerate the nodes (trait
>> > > pull-up API), and the other one to implement them (implementation
>> rules),
>> > > while in Cascades one dive should be sufficient since exploration
>> invokes
>> > > the implementation rules as it goes. This is the first issue I see.
>> > >
>> > > The second one is more important - how to prune inefficient plans?
>> > > Currently, nodes are implemented independently and lack of context
>> > doesn't
>> > > allow us to estimates children's costs when implementing the parent,
>> > hence
>> > > branch-and-bound is not possible. Can trait pull-up API
>> > "List<RelTraitSet>
>> > > deriveTraitSets(RelNode, RelMetadataQuery)" help us with this? If the
>> > > children nodes are not implemented before the pull-up, all we know is
>> > their
>> > > collations, but not their costs. And without costs, pruning is not
>> > > possible. Please let me know if I missed something from the proposal.
>> > >
>> > > The possible architecture I had in mind after reading multiple papers,
>> > > which may answer all our questions, could look like this:
>> > > 1) We have a queue of nodes requiring optimization. Not a queue of
>> rules.
>> > > initial queue state is formed from the initial tree, top-down.
>> > > 2) The node is popped from the queue, and we enter
>> > > "node.optimize(maxCost)" call. It checks for matching rules,
>> prioritizes
>> > > them, and start their execution on by one. Execution of rules may
>> > re-insert
>> > > the current node into the queue, in which case this step is repeated,
>> > > possibly many times
>> > > 3) Logical-logical rules (transformations) produce new logical nodes
>> and
>> > > put them into the queue for further optimization
>> > > 4) Logical-physical rules (implementation) do the following:
>> > > 4.1) Costs of logical children are estimated. The cost of a logical
>> node
>> > > should be less than any cost of a possible physical node that may be
>> > > produced out of it. If the logical cost exceeds "maxCost", we stop and
>> > > return. The whole logical subspace is pruned even before exploration.
>> > > 4.2) Recursively call "childNode.optimize(maxCost -
>> currentLogicalCost)"
>> > > method, which returns a set of possible physical implementations of a
>> > > child. Returned physical children are already registered in proper
>> > > set/subset, but are not used for any pattern-matching, and doesn't
>> > trigger
>> > > more rule calls!
>> > > 4.3) Implementation rule checks the cost of the physical child. If it
>> is
>> > > greater than any other already observed child with the same traits, or
>> > > exceeds the "maxCost", it is discarded. Otherwise, the physical
>> > > implementation of the current node is produced and registered in the
>> > > optimizer.
>> > >
>> > > The pseudocode for physical implementation flow for join (two inputs):
>> > >
>> > > Collection<RelNode> optimizePhysical(Cost maxCost) {
>> > >     // Estimated minimal self-cost. Any physical implementation of
>> this
>> > > node should have greater self-cost
>> > >     Cost logicalSelfCost = optimizer.getCost(this);
>> > >
>> > >     // *Pruning #1*: whatever children we implement, the total cost
>> will
>> > > be greater than the passed maxCost, so do not explore further
>> > >     Cost maxChildCost = maxCost - logicalSelfCost;
>> > >
>> > >     Cost logicalILeftCost = optimizer.getCost(leftLogicalNode);
>> > >     Cost logicalRightCost = optimizer.getCost(rightLogicalNode);
>> > >
>> > >     if (logicalLeftCost + logicalRightCost > maxChildCost) {
>> > >         return;
>> > >     }
>> > >
>> > >     // This is our equivalence set.
>> > >     RelSet equivalenceSet = this.getSet();
>> > >
>> > >     // Get promising physical implementations of child nodes
>> recursively
>> > >     List<RelNode> leftPhysicalNodes =
>> > > leftLogicalNode.optimizePhysical(maxChildCost);
>> > >     List<RelNode> rightPhysicalNodes =
>> > > rightLigicalNode.optimizePhysical(maxChildCost);
>> > >
>> > >     for (RelNode leftPhysicalNode : leftPhysicalNodes) {
>> > >         for (RelNode rightPhysicalNode : rightPhysicalNodes) {
>> > >             // *Pruning #2*: Combination of physical input costs is
>> > > already too expensive, give up
>> > >             Cost physicalLeftCost =
>> optimizer.getCost(leftPhysicalNode);
>> > >             Cost physicalRightCost =
>> > optimizer.getCost(rightPhysicalNode);
>> > >
>> > >             if (logicalILeftCost + logicalRightCost > maxChildCost) {
>> > >                 continue.
>> > >             }
>> > >
>> > >             // Implement possible physical nodes for the given pair of
>> > > inputs (maybe more than one)
>> > >             List<RelNode> physicalJoins = implement(leftPhysicalNode,
>> > > rightPhysicalNode);
>> > >
>> > >             for (RelOptRule physicalJoin : physicalJoins) {
>> > >                // *Pruning #3*: Do not consider implementation if we
>> have
>> > > another one with the same trait set and smaller cost)
>> > >                 Cost physicalJoinCost =
>> optimizer.getCost(physicalJoin);
>> > >                 Cost bestCostForTraitSet =
>> > > equivalenceSet.getBestCost(physicalJoin.getTraitSet());
>> > >
>> > >                 if (physicalJoinCost > bestCostForTraitSet) {
>> > >                     continue.
>> > >                 }
>> > >
>> > >                 // This is a good implementation. Register it in the
>> set,
>> > > updating per-traitset best costs.
>> > >                 equivalenceSet.register(physicalJoin);
>> > >             }
>> > >         }
>> > >     }
>> > >
>> > >     // Return the best registered expressions with different traitsets
>> > > from the current set.
>> > >     return equivalenceSet.getBestExps();
>> > > }
>> > >
>> > > This is a very rough pseudo-code, only to demonstrate the basic idea
>> on
>> > > how proper bottom-up propagation not only helps us set proper traits
>> for
>> > > the new physical node but also ensures that not optimal plans are
>> pruned
>> > as
>> > > early as possible. Real implementation should be better abstracted and
>> > > accept enforcers as well.
>> > >
>> > > Also, please notice that the pseudo-code doesn't show when logical
>> rules
>> > > are fired. This is a separate question. One possible straightforward
>> way
>> > is
>> > > to add the aforementioned physical routine to normal Volcano flow:
>> > > 1) Fire logical rule on a node and register new nodes
>> > > 2) Fire physical optimization as shown above, then invoke
>> > > "call.transformTo()" for every returned physical rel
>> > > 3) Re-trigger the process for newly created nodes and their parents
>> > >
>> > > A better approach is to interleave logical and physical
>> optimizations, so
>> > > they trigger each other recursively. But this would require a serious
>> > > redesign of a "rule queue" concept.
>> > >
>> > > Does it have any common points with your proposal?
>> > >
>> > > Regards,
>> > > Vladimir.
>> > >
>> > > [1]
>> > >
>> >
>> https://ponymail-vm.apache.org/_GUI_/thread.html/79dac47ea50b5dfbd3f234e368ed61d247fb0eb989f87fe01aedaf25@%3Cdev.calcite.apache.org%3E
>> > >
>> > >
>> > > пт, 6 дек. 2019 г. в 05:41, Haisheng Yuan <[email protected]>:
>> > >
>> > >> Oh, I forgot to mention that the join planning/reordering is not a
>> big
>> > >> problem. Calcite just use the rule to generate a single alternative
>> > plan,
>> > >> which is not ideal.  But we can't say Calcite is doing wrong.
>> > >>
>> > >> Ideally, we expect it generates multiple (neither all, nor single)
>> > >> bipartie graphs. The join reordering rule will cut each part into
>> > bipartie
>> > >> recursively and apply JoinCommutativity rule to generate more
>> > alternatives
>> > >> for each RelSet. It is just a different strategy. We can modify the
>> > rule,
>> > >> or create new join reordering rule to generate multiple plan
>> > >> alternatives.
>> > >>
>> > >> - Haisheng
>> > >>
>> > >> ------------------------------------------------------------------
>> > >> 发件人：Haisheng Yuan<[email protected]>
>> > >> 日 期：2019年12月06日 09:07:43
>> > >> 收件人：Vladimir Ozerov<[email protected]>; [email protected] (
>> > >> [email protected])<[email protected]>
>> > >> 主 题：Re: Re: Volcano's problem with trait propagation: current state
>> and
>> > >> future
>> > >>
>> > >> Generally agree with what Vladimir said. I think what Calcite has is
>> > >> logical optimization or exploration, there are almost no physical
>> > >> optimization, Calcite leaves it to third party implementators. One
>> of my
>> > >> friends at University of Wisconsin–Madison database research group
>> told
>> > me
>> > >> that they gave up the idea of using Calcite in their project due to
>> this
>> > >> reason.
>> > >>
>> > >> Currently physical properties are requested in implementation rules,
>> or
>> > >> even logical exploration rules, But each rule is independent, the
>> > >> pattern-matched expression is not aware of what does the parent
>> operator
>> > >> want. Using AbstractConverter doesn't help, and is not promising.
>> > >>
>> > >> >> You shouldn't regiester all logical rules in the planner
>> > >> simultaneously,... as Drill does.
>> > >> That is because Calcite does too many redundant or duplicate rule
>> > >> matching, like all kinds of transpose (can't be avoided due to
>> current
>> > >> design), matching physical operators.
>> > >>
>> > >> >> decoupling the logical planning from the physical one looks
>> > >> a bit weird to me because it violates the idea of Cascades framework.
>> > >> Orca optimizer fully adopted the design principle of Cascades
>> framework
>> > >> except that it separates into 3 phases: logical exploration, physical
>> > >> implementation, and optimization (property enforcing). And it might
>> be
>> > >> easier if we want to enable parallel optimization by seperating into
>> 3
>> > >> phases. Orca does branch-and-bound in optimization phase, before
>> actual
>> > >> property derivation and enforcement, IIRC. It is highly efficient,
>> works
>> > >> pretty well, and battlefield-tested by many large financial and
>> > insurance
>> > >> companies.
>> > >>
>> > >> In my last thread about on-demand trait request, I gave the
>> high-level
>> > >> general API for physical operators to derive and require physical
>> > >> properties, which is similar to Orca's design. But seems like the
>> > proposal
>> > >> of API change gets no love.
>> > >>
>> > >> - Haisheng
>> > >>
>> > >> ------------------------------------------------------------------
>> > >> 发件人：Vladimir Ozerov<[email protected]>
>> > >> 日 期：2019年12月05日 22:22:43
>> > >> 收件人：[email protected] ([email protected])<
>> > >> [email protected]>
>> > >> 主 题：Re: Volcano's problem with trait propagation: current state and
>> > future
>> > >>
>> > >> AbstractConverter-s are attractive because they effectively emulate
>> > >> straightforward recursive top-down optimization (Volcano/Cascades).
>> But
>> > >> instead of doing it with a recursive method call, which preserves the
>> > >> context, we do this in Calcite as a sequence of unrelated rule calls,
>> > thus
>> > >> losing the context. So with my current understanding, it could be
>> > thought
>> > >> of not as a search space explosion, but rather than the inefficient
>> > >> implementation of an otherwise straightforward parent->child->parent
>> > >> navigation, since we achieve this navigation indirectly through the
>> rule
>> > >> queue, rather than through a normal method call. In any case, the net
>> > >> result is wasted CPU. Perhaps this is not exponential waste, but some
>> > >> multiplication of otherwise optimal planning. As I mentioned, in our
>> > >> experiments with TPC-H, we observed a constant factor between 6-9x
>> > between
>> > >> the number of joins and the number of join implementation rule
>> > >> invocations.
>> > >> It doesn't growth past 9 even for complex queries, so I hope that
>> this
>> > is
>> > >> not an exponent :-)
>> > >>
>> > >> Speaking of logical vs physical optimization, IMO it makes sense in
>> some
>> > >> cases. E.g. when doing predicate pushdown, you do not want to
>> consider
>> > >> intermediate logical tree states for implementation, until the
>> predicate
>> > >> reaches its final position. So running separate logical planning
>> phase
>> > >> with
>> > >> Volcano optimizer makes total sense to me, because it effectively
>> > prunes a
>> > >> lot of not optimal logical plans before they reach the physical
>> planning
>> > >> stage. The real problem to me is that we forced to remove join
>> planning
>> > >> from the physical optimization stage. Because the goal of join
>> planning
>> > >> not
>> > >> to generate a single optimal plan, like with predicate pushdown, but
>> > >> rather
>> > >> to generate a set of logical plans all of which should be implemented
>> > and
>> > >> estimated. And with AbstractConverter-s this is not possible because
>> of
>> > >> their multiplicator increases the rate of search space growth, making
>> > join
>> > >> planning inapplicable even for the small number of relations. So we
>> have
>> > >> to
>> > >> move them to the logical planning stage and pick only one permutation
>> > for
>> > >> physical planning.
>> > >>
>> > >>
>> > >> чт, 5 дек. 2019 г. в 15:35, Roman Kondakov
>> <[email protected]
>> > >:
>> > >>
>> > >> > Vladimir,
>> > >> >
>> > >> > thank you for bringing it up. We are facing the same problems in
>> > Apache
>> > >> > Ignite project
>> > >> > and it would be great if Apache Calcite community will propose a
>> > >> > solution for this
>> > >> > issue.
>> > >> >
>> > >> > From my point of view an approach with abstract converters looks
>> more
>> > >> > promising, but as
>> > >> > you mentioned it suffers from polluting the search space. The
>> latter
>> > can
>> > >> > be mitigated by
>> > >> > splitting a planning stage into the several phases: you shouldn't
>> > >> > register all logical rules in the planner simultaneously - it looks
>> > like
>> > >> > it is better to have several iterations of planning stage with
>> > different
>> > >> > sets of rules, as Drill does.
>> > >> >
>> > >> > Also I'd like to mention that decoupling the logical planning from
>> the
>> > >> > physical one looks
>> > >> > a bit weird to me because it violates the idea of Cascades
>> framework.
>> > >> > Possibly this decoupling is the consequence of some performance
>> > issues.
>> > >> >
>> > >> >
>> > >> > --
>> > >> > Kind Regards
>> > >> > Roman Kondakov
>> > >> >
>> > >> > On 05.12.2019 14:24, Vladimir Ozerov wrote:
>> > >> > > Hi,
>> > >> > >
>> > >> > > As I mentioned before, we are building a distributed SQL engine
>> that
>> > >> uses
>> > >> > > Apache Calcite for query optimization. The key problem we faced
>> is
>> > the
>> > >> > > inability to pull the physical traits of child relations
>> > efficiently.
>> > >> I'd
>> > >> > > like to outline my understanding of the problem (I guess it was
>> > >> already
>> > >> > > discussed multiple times) and ask the community to prove or
>> disprove
>> > >> the
>> > >> > > existence of that problem and its severity for the products which
>> > uses
>> > >> > > Apache Calcite and ask for ideas on how it could be improved in
>> the
>> > >> > future.
>> > >> > >
>> > >> > > I'll start with the simplified problem description and mentioned
>> > more
>> > >> > > complex use cases then. Consider that we have a logical tree and
>> a
>> > >> set of
>> > >> > > implementation rules. Our goal is to find the optimal physical
>> tree
>> > by
>> > >> > > applying these rules. The classical Cascades-based approach
>> directs
>> > >> the
>> > >> > > optimization process from the top to the bottom (hence
>> "top-down").
>> > >> > > However, the actual implementation of tree nodes still happens
>> > >> bottom-up.
>> > >> > > For the tree L1 <- L2, we enter "optimize(L1)", which recursively
>> > >> > delegates
>> > >> > > to "optimize(L2)". We then implement children nodes L1 <- [P2',
>> > P2''],
>> > >> > and
>> > >> > > return back to the parent, which is now able to pick promising
>> > >> > > implementations of the children nodes and reject bad ones with
>> the
>> > >> > > branch-and-bound approach. AFAIK Pivotal's Orca works this way.
>> > >> > >
>> > >> > > The Apache Calcite is very different because it doesn't allow the
>> > >> > recursion
>> > >> > > so that we lose the context on which node requested the child
>> > >> > > transformation. This loss of context leads to the following
>> > problems:
>> > >> > > 1) The parent node cannot deduce it's physical properties during
>> the
>> > >> > > execution of the implementation rule, because Calcite expects the
>> > >> > > transformation to be applied before children nodes are
>> implemented.
>> > >> That
>> > >> > is
>> > >> > > if we are optimizing LogicalProject <- LogicalScan, we cannot set
>> > >> proper
>> > >> > > distribution and collation for the to be created PhysicalProject,
>> > >> because
>> > >> > > it depends on the distribution and collation of the children
>> which
>> > is
>> > >> yet
>> > >> > > to be resolved.
>> > >> > > 2) The branch-and-bound cannot be used because it requires at
>> least
>> > >> one
>> > >> > > fully-built physical subtree.
>> > >> > >
>> > >> > > As a result of this limitation, products which rely on Apache
>> > Calcite
>> > >> for
>> > >> > > query optimization, use one or several workarounds:
>> > >> > >
>> > >> > > *1) Guess the physical properties of parent nodes before logical
>> > >> children
>> > >> > > are implemented*
>> > >> > > *Apache Flink* uses this strategy. The strategy is bad because of
>> > the
>> > >> > > number of combinations of traits growth exponentially with the
>> > number
>> > >> of
>> > >> > > attributes in the given RelNode, so you either explode the search
>> > >> space
>> > >> > or
>> > >> > > give up optimization opportunities. Consider the following tree:
>> > >> > > LogicalSort[a ASC] <- LogicalFilter <- LogicalScan
>> > >> > > The optimal implementation of the LogicalFilter is
>> > >> > PhysicalFilter[collation=a
>> > >> > > ASC] because it may eliminate the parent sort. But such
>> optimization
>> > >> > should
>> > >> > > happen only if we know that there is a physical implementation of
>> > scan
>> > >> > > allowing for this sort order, e.g. PhysicalIndexScan[collation=a
>> > ASC].
>> > >> > I.e.
>> > >> > > we need to know the child physical properties first. Otherwise we
>> > >> > fallback
>> > >> > > to speculative approaches. With the *optimistic* approach, we
>> emit
>> > all
>> > >> > > possible combinations of physical properties, with the hope that
>> the
>> > >> > child
>> > >> > > will satisfy some of them, thus expanding the search space
>> > >> exponentially.
>> > >> > > With the *pessimistic* approach, we just miss this optimization
>> > >> > opportunity
>> > >> > > even if the index exists. Apache Flink uses the pessimistic
>> > approach.
>> > >> > >
>> > >> > > *2) Use AbstractConverters*
>> > >> > > *Apache Drill* uses this strategy. The idea is to "glue" logical
>> and
>> > >> > > physical operators, so that implementation of a physical child
>> > >> > re-triggers
>> > >> > > implementation rule of a logical parent. The flow is as follows:
>> > >> > > - Invoke parent implementation rule - it either doesn't produce
>> new
>> > >> > > physical nodes or produce not optimized physical nodes (like in
>> the
>> > >> > Apache
>> > >> > > Flink case)
>> > >> > > - Invoke children implementation rules and create physical
>> children
>> > >> > > - Then converters kick-in and re-trigger parent implementation
>> rule
>> > >> > through
>> > >> > > the creation of an abstract converter
>> > >> > > - Finally, the parent implementation rule is fired again and now
>> it
>> > >> > > produces optimized node(s) since at least some of the physical
>> > >> > > distributions of children nodes are implemented.
>> > >> > >
>> > >> > > Note that this is essentially a hack to simulate the Cascades
>> flow!
>> > >> The
>> > >> > > problem is that AbstractConverters increase the complexity of
>> > planning
>> > >> > > because they do not have any context, so parent rules are just
>> > >> > re-triggered
>> > >> > > blindly. Consider the optimization of the following tree:
>> > >> > > LogicalJoin <- [LogicalScan1, LogicalScan2]
>> > >> > > With the converter approach, the join implementation rule will be
>> > >> fired
>> > >> > at
>> > >> > > least 3 times, while in reality, one call should be sufficient.
>> In
>> > our
>> > >> > > experiments with TPC-H queries, the join rule implemented that
>> way
>> > is
>> > >> > > typically called 6-9 times more often than expected.
>> > >> > >
>> > >> > > *3) Transformations (i.e. logical optimization) are decoupled
>> from
>> > >> > > implementation (i.e. physical optimization)*
>> > >> > > Normally, you would like to mix both logical and physical rules
>> in a
>> > >> > single
>> > >> > > optimization program, because it is required for proper planning.
>> > That
>> > >> > is,
>> > >> > > you should consider both (Ax(BxC)) and ((AxB)xC) join ordering
>> > during
>> > >> > > physical optimization, because you do not know which one will
>> > produce
>> > >> the
>> > >> > > better plan in advance.
>> > >> > > But in some practical implementations of Calcite-based
>> optimizers,
>> > >> this
>> > >> > is
>> > >> > > not the case, and join planning is performed as a separate HEP
>> > stage.
>> > >> > > Examples are Apache Drill and Apache Flink.
>> > >> > > I believe that lack of Cascades-style flow and branch-and-bound
>> are
>> > >> among
>> > >> > > the main reasons for this. At the very least for Apache Drill,
>> since
>> > >> it
>> > >> > > uses converters, so additional logical permutations will
>> > exponentially
>> > >> > > multiply the number of fired rules, which is already very big.
>> > >> > >
>> > >> > > Given all these problems I'd like to ask the community to share
>> > >> current
>> > >> > > thoughts and ideas on the future improvement of the Calcite
>> > optimizer.
>> > >> > One
>> > >> > > of the ideas being discussed in the community is "Pull-up
>> Traits",
>> > >> which
>> > >> > > should allow the parent node to get physical properties of the
>> > >> children
>> > >> > > nodes. But in order to do this, you effectively need to implement
>> > >> > children,
>> > >> > > which IMO makes this process indistinguishable from the classical
>> > >> > recursive
>> > >> > > Cascades algorithm.
>> > >> > >
>> > >> > > Have you considered recursive transformations as an alternative
>> > >> solution
>> > >> > to
>> > >> > > that problem? I.e. instead of trying to guess or pull the
>> physical
>> > >> > > properties of non-existent physical nodes, go ahead and actually
>> > >> > implement
>> > >> > > them directly from within the parent rule? This may resolve the
>> > >> problem
>> > >> > > with trait pull-up, as well as allow for branch-and-bound
>> > >> optimization.
>> > >> > >
>> > >> > > I would appreciate your feedback or any hints for future
>> research.
>> > >> > >
>> > >> > > Regards,
>> > >> > > Vladimir.
>> > >> > >
>> > >> >
>> > >>
>> > >>
>> > >>
>> >
>>
>

Re: Re: Re: Volcano's problem with trait propagation: current state and future

Reply via email to