Internally we have a multi-inputs merge join, for each input there maybe a 
collation permutations.

Best,
Danny Chan
在 2020年1月8日 +0800 AM1:20,Xiening Dai <[email protected]>,写道:
> Danny, I am not sure how this would affect join re-order. Could you elaborate?
>
>
> > On Jan 7, 2020, at 1:29 AM, Danny Chan <[email protected]> wrote:
> >
> > Hi, guys, it seems that the discussion silent now, so do we have some 
> > conclusion that can contribute to current code, i.e. the suggested API 
> > change or new abstraction ?
> >
> > Or better, can someone give a design doc so that we can push and make that 
> > implemented ?
> >
> > Personally I was always looking forward to the result, because Apache Flink 
> > suffers for the bad planning performance for Join re-order or traits 
> > auto-adapter.
> >
> > Best,
> > Danny Chan
> > 在 2019年11月20日 +0800 AM2:14,Vladimir Ozerov <[email protected]>,写道:
> > > HI Igor,
> > >
> > > Thank you for the details. Meanwhile, I solved it with separation of
> > > conversion rules from the physical optimization rules. So the first pass
> > > creates physical nodes with unknown physical properties (top-bottom), 
> > > while
> > > subsequent processing of the leaf nodes triggers rules which convert "bad"
> > > physical nodes to "good" physical nodes with know distribution and
> > > collation.
> > >
> > > Regards,
> > > Vladimir.
> > >
> > > пн, 18 нояб. 2019 г. в 13:43, Seliverstov Igor <[email protected]>:
> > >
> > > > Vladimir,
> > > >
> > > > Hope it may help you.
> > > >
> > > > Currently we applied the next way (just rough description):
> > > >
> > > > 1) We created an API to derive possible traits permutations on the basis
> > > > of children traits (quite similar to one, described in «On Demand trait
> > > > set request» topic)
> > > >
> > > > 2) added a general rule that copies Logical nodes, but requesting our
> > > > convention from their children (IGNITE convention, ANY distribution, 
> > > > EMPTY
> > > > collation) and sets importance of old Logical nodes to zero - so, we 
> > > > have a
> > > > Logical parent which input satisfies any possible distribution and no 
> > > > rules
> > > > matched to previous logical node any more.
> > > >
> > > > 3) Physical rules to create physical rel nodes only if physical traits 
> > > > may
> > > > be derived (there is no «barrier», described in one of previous 
> > > > messages) -
> > > > derived traits are a collection, we don’t create a physical rel node for
> > > > each possible traits set, also we may set zero importance for previously
> > > > created rel nodes to decrease search space.
> > > >
> > > > Now we know actual and required distribution, we don’t need
> > > > AbstractConverters and able just call TraitDef.convert() method inside a
> > > > rule.
> > > > A rule still able to produce the same output several times, but
> > > > «memorization" inside the planner solves it for us.
> > > >
> > > > preliminary tests show almost zero overhead of the approach.
> > > >
> > > > Regards,
> > > > Igor
> > > >
> > > >
> > > > > 14 нояб. 2019 г., в 14:49, Vladimir Ozerov <[email protected]>
> > > > написал(а):
> > > > >
> > > > > Hi Xing,
> > > > >
> > > > > Thanks for your suggestion. Yes, this may help, and if I get your idea
> > > > > right, I already had it in my reproducer:
> > > > > 1) Create the converted physical input:
> > > > >
> > > > https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L49
> > > > > 2) Use it in case no physical children were found:
> > > > >
> > > > https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L79
> > > > >
> > > > > This idea is borrowed from Apache Drill physical rules. But the 
> > > > > problem
> > > > is
> > > > > that this approach leads to a suboptimal plan - parent node doesn't 
> > > > > know
> > > > > the future distribution of a child node. And as a result, it doesn't 
> > > > > know
> > > > > it's own distribution. So the final plan is constructed in that way:
> > > > > 1.1) Root enforced "SINGLETON" on its child:
> > > > > -> PhysicalRoot[SINGLETON]
> > > > > -> Converter[SINGLETON]
> > > > > -> PhysicalProject[*ANY*]
> > > > > -> PhysicalScan[REPLICATED]
> > > > >
> > > > > 1.2) But since the child (PhysicalProject) failed to infer 
> > > > > distribution
> > > > > during rule call, it falls back to ANY distribution. In order to 
> > > > > satisfy
> > > > > SINGLETON distribution of a parent, we inject an exchange in the final
> > > > plan:
> > > > > -> PhysicalRoot[SINGLETON]
> > > > > * -> Exchange[SINGLETON]*
> > > > > -> PhysicalProject[*ANY*]
> > > > > -> PhysicalScan[REPLICATED]
> > > > >
> > > > > 2) But this is a suboptimal plan. The optimal plan is:
> > > > > -> PhysicalRoot[SINGLETON]
> > > > > -> PhysicalProject[REPLICATED]
> > > > > -> PhysicalScan[REPLICATED]
> > > > >
> > > > > You may observe it in my tests:
> > > > > 1)
> > > > >
> > > > https://github.com/devozerov/calcite-optimizer/blob/master/src/test/java/devozerov/OptimizerTest.java#L46
> > > > > -
> > > > > works as you described and produces not optimal plan with exchange
> > > > > 2)
> > > > >
> > > > https://github.com/devozerov/calcite-optimizer/blob/master/src/test/java/devozerov/OptimizerTest.java#L30
> > > > > -
> > > > > rely on AbstractConverter-s and produce an optimal plan with bottom-up
> > > > > trait propagation at the cost of significantly increased planning time
> > > > >
> > > > > Regards,
> > > > > Vladimir.
> > > > >
> > > > > пт, 8 нояб. 2019 г. в 16:15, XING JIN <[email protected]>:
> > > > >
> > > > > > Hi Vladimir,
> > > > > >
> > > > > > I think the way PlannerTests#GoodSingleRule and EnumerableXXXRule 
> > > > > > work
> > > > may
> > > > > > help you ~
> > > > > > They work by a top-down fashion, but when matching parent, they 
> > > > > > convert
> > > > the
> > > > > > children explicitly.
> > > > > > You may try below steps:
> > > > > > 1. Construct a rule LogicalParentRule to match the LogicalParent 
> > > > > > without
> > > > > > distribution/physical requirement for the LogicalChild;
> > > > > > 2. In this rule, you call 'planner.changeTraits' on the 
> > > > > > LogicalChild to
> > > > > > build a new child with physical convention. Note that at this moment
> > > > only
> > > > > > an empty RelSubset is created and no PhysicalChild exists.
> > > > > > 3. Then set the RelNode to be the new input of LogicalParent;
> > > > > >
> > > > > > By above steps, you can build a parent-child relationship between
> > > > > > LogicalParent and PhysicalChild, and at last the PhysicalParentRule
> > > > will be
> > > > > > fired based on this relationship.
> > > > > >
> > > > > > I have a commit to illustrate my idea, check 
> > > > > > VolcanoPlannerTest#testDEV
> > > > in
> > > > > > below link, hope it may help you ~
> > > > > > https://github.com/jinxing64/calcite/tree/demo
> > > > > >
> > > > > > Also I'm +1 with Seliverstov that to get all parents of a set, which
> > > > > > against the current check in RelSubset#getParentRels
> > > > > >
> > > > > > Best,
> > > > > > Jin
> > > > > >
> > > > > > Vladimir Ozerov <[email protected]> 于2019年11月5日周二 下午6:41写道:
> > > > > >
> > > > > > > Hi Xiening,
> > > > > > >
> > > > > > > I read the thread about on-demand trait requests. It seems pretty
> > > > similar
> > > > > > > to what I am trying to achieve, as it facilitates the bottom-up
> > > > > > propagation
> > > > > > > of physical traits. In fact, both your and my strategy propagate 
> > > > > > > traits
> > > > > > > bottom-up, but I do this through rules, which also fire bottom-up,
> > > > while
> > > > > > in
> > > > > > > your case only the traits are propagated bottom-up, while rules
> > > > continue
> > > > > > > working in a top-down fashion.
> > > > > > >
> > > > > > > However, I am thinking of how I would potentially implement my
> > > > optimizer
> > > > > > > with your approach, and it feels like with on-demand traits 
> > > > > > > resulting
> > > > > > > implementation of metadata queries may become very complex to that
> > > > point
> > > > > > > that it will look like another set of rules, parallel to the 
> > > > > > > already
> > > > > > > existing ruleset. For example, consider that I have a couple of
> > > > > > distributed
> > > > > > > tables in an OLTP application. These tables have a number of 
> > > > > > > indexes,
> > > > > > and I
> > > > > > > would like to join them. First, I have a number of choices on how 
> > > > > > > to
> > > > join
> > > > > > > tables with respect to distribution. Then, I have a number of 
> > > > > > > choices
> > > > on
> > > > > > > which access method to use. Because sometimes it is beneficial to 
> > > > > > > pick
> > > > > > > index scans instead of table scans even without index conditions, 
> > > > > > > for
> > > > > > > example, to preserve a comfortable collation. So when my logical 
> > > > > > > scan
> > > > > > > receives such metadata request, it typically cannot return all 
> > > > > > > possible
> > > > > > > combinations, because there are too many of them. Instead, some
> > > > > > heuristical
> > > > > > > or cost-based logic will be used to calculate a couple of most
> > > > > > prospective
> > > > > > > ones. But it seems that we will have to duplicate the same logic 
> > > > > > > in the
> > > > > > > corresponding rule, aren't we?
> > > > > > >
> > > > > > > I would love to read your design because this is a really 
> > > > > > > interesting
> > > > > > > topic, and it is of great importance for the distributed engines
> > > > > > developed
> > > > > > > on top of Calcite since proper use of distribution and collation 
> > > > > > > is the
> > > > > > key
> > > > > > > success factor for efficient query optimization.
> > > > > > >
> > > > > > > Regards,
> > > > > > > Vladimir.
> > > > > > >
> > > > > > > пт, 1 нояб. 2019 г. в 00:40, Xiening Dai <[email protected]>:
> > > > > > >
> > > > > > > > Actually we solved this problem in our setup using a mechanism 
> > > > > > > > called
> > > > > > > > “Pull-Up Traits”, which explores the possible trait set of 
> > > > > > > > children’s
> > > > > > > input
> > > > > > > > to decide parent’s physical properties. In order to determine 
> > > > > > > > child
> > > > > > input
> > > > > > > > trait, you would have to look at child’s children, and all the 
> > > > > > > > way to
> > > > > > the
> > > > > > > > leaves nodes or a barrier. A barrier is a rel node which cannot 
> > > > > > > > derive
> > > > > > > any
> > > > > > > > traits regardless the input. A good example would be a user 
> > > > > > > > define
> > > > > > > function
> > > > > > > > which would throw off any distribution or collation. Then we 
> > > > > > > > realize
> > > > > > just
> > > > > > > > pulling up is not enough, sometimes we would need to look at 
> > > > > > > > parent’s
> > > > > > > > requirement as well. So we try to solve this in a unified 
> > > > > > > > framework,
> > > > > > > which
> > > > > > > > we call “On Demand Trait” and implement it as part of the 
> > > > > > > > framework so
> > > > > > > > anyone can be benefited. I hope Haisheng can share a design doc 
> > > > > > > > once
> > > > we
> > > > > > > > have more concrete ideas.
> > > > > > > >
> > > > > > > >
> > > > > > > > > On Oct 31, 2019, at 11:37 AM, Jinfeng Ni <[email protected]> 
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > Hi Vladimir,
> > > > > > > > >
> > > > > > > > > The SubsetTransformer interface and the iterating over the 
> > > > > > > > > RelNodes
> > > > > > > > > within a RelSubset in Drill is exactly implemented to do the 
> > > > > > > > > trait
> > > > > > > > > propagation. We also had to rely on AbstractConverter to fire
> > > > > > > > > necessary rule to avoid the CanNotPlan issue. At some point, 
> > > > > > > > > Calcite
> > > > > > > > > community chooses to remove AbstractConverter and Drill had 
> > > > > > > > > to add it
> > > > > > > > > back, which is probably one of the main reasons for us to 
> > > > > > > > > continue
> > > > > > > > > using a Calcite fork. I still remember we constantly had to 
> > > > > > > > > deal
> > > > > > with
> > > > > > > > > the dilemma between "CanNotPlan" and long planing time due to
> > > > > > explored
> > > > > > > > > search space.
> > > > > > > > >
> > > > > > > > > Glad to see more people are joining the effort to solve this 
> > > > > > > > > long
> > > > > > > > > overdue issue, something missing in Calcite's core optimizer
> > > > > > framework
> > > > > > > > > "since before Calcite was Calcite" (Jacques's words).
> > > > > > > > >
> > > > > > > > > Jinfeng
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, Oct 31, 2019 at 3:38 AM Vladimir Ozerov 
> > > > > > > > > <[email protected]>
> > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > Hi Danny,
> > > > > > > > > >
> > > > > > > > > > Thank you very much for the links. What is described here 
> > > > > > > > > > is pretty
> > > > > > > much
> > > > > > > > > > similar to the problem I describe. Especially the 
> > > > > > > > > > discussion about
> > > > > > > trait
> > > > > > > > > > propagation, as this is basically what I need - to explore 
> > > > > > > > > > potential
> > > > > > > > traits
> > > > > > > > > > of children before optimizing parents. And this is 
> > > > > > > > > > basically what
> > > > > > > Drill
> > > > > > > > > > already does with it's SubsetTransformer:
> > > > > > > > > > 1) There is a SubsetTransformer interface, which iterates 
> > > > > > > > > > over
> > > > > > > physical
> > > > > > > > > > relations of the given subset [1]
> > > > > > > > > > 2) If you want to make a physical project, you iterate over 
> > > > > > > > > > physical
> > > > > > > > > > relations of the input subset and create possible physical 
> > > > > > > > > > projects
> > > > > > > [2]
> > > > > > > > > > 3) But if you cannot find any physical input, then you 
> > > > > > > > > > trigger
> > > > > > > creation
> > > > > > > > of
> > > > > > > > > > a "bad" physical project, which is very likely to have poor 
> > > > > > > > > > cost
> > > > > > > > because it
> > > > > > > > > > cannot take advantage of input's distribution information 
> > > > > > > > > > [3]
> > > > > > > > > > So, step 2 - is a trait set propagation which is needed by 
> > > > > > > > > > many
> > > > > > > > > > distributed engines. Step 3 - an attempt to workaround 
> > > > > > > > > > current
> > > > > > > > > > VolcanoPlanner behavior, when a parent rule is fired only if
> > > > > > produced
> > > > > > > > child
> > > > > > > > > > node has compatible trait set.
> > > > > > > > > >
> > > > > > > > > > I do not know Calcite's architecture that good but at first 
> > > > > > > > > > glance,
> > > > > > > the
> > > > > > > > > > proposed ability to re-fire rules of a specific Rel seems 
> > > > > > > > > > good
> > > > > > enough.
> > > > > > > > It
> > > > > > > > > > doesn't expand search space, because no new nodes are 
> > > > > > > > > > created, and
> > > > > > it
> > > > > > > > seems
> > > > > > > > > > to be relatively simple to implement.
> > > > > > > > > >
> > > > > > > > > > [1]
> > > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/SubsetTransformer.java
> > > > > > > > > > [2]
> > > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/ProjectPrule.java#L66
> > > > > > > > > > [3]
> > > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/ProjectPrule.java#L69
> > > > > > > > > >
> > > > > > > > > > чт, 31 окт. 2019 г. в 12:21, Danny Chan 
> > > > > > > > > > <[email protected]>:
> > > > > > > > > >
> > > > > > > > > > > Thanks Vladimir for bringing up this discussion !
> > > > > > > > > > >
> > > > > > > > > > > Did you notice that there is a JIRA issue about this 
> > > > > > > > > > > problem ? [1]
> > > > > > > > Also a
> > > > > > > > > > > discussion about how to propagate the traits [2]
> > > > > > > > > > >
> > > > > > > > > > > [1] https://issues.apache.org/jira/browse/CALCITE-2970
> > > > > > > > > > > [2]
> > > > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > https://ponymail-vm.apache.org/_GUI_/thread.html/79dac47ea50b5dfbd3f234e368ed61d247fb0eb989f87fe01aedaf25@%3Cdev.calcite.apache.org%3E
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Danny Chan
> > > > > > > > > > > 在 2019年10月31日 +0800 PM3:56,Vladimir Ozerov 
> > > > > > > > > > > <[email protected]
> > > > > > > ,写道:
> > > > > > > > > > > > Hi colleagues,
> > > > > > > > > > > >
> > > > > > > > > > > > I would like to discuss with the community the 
> > > > > > > > > > > > possibility of
> > > > > > > adding a
> > > > > > > > > > > new
> > > > > > > > > > > > public method to VolcanoPlanner which will forcefully 
> > > > > > > > > > > > re-trigger
> > > > > > the
> > > > > > > > > > > rules
> > > > > > > > > > > > for the specific rel. This is a follow up of a 
> > > > > > > > > > > > discussion started
> > > > > > in
> > > > > > > > the
> > > > > > > > > > > > other thread [1].
> > > > > > > > > > > >
> > > > > > > > > > > > **Problem statement**
> > > > > > > > > > > > When converting between conventions during optimization
> > > > > > > VolcanoPlanner
> > > > > > > > > > > > prefers the top-bottom approach, so that the nodes are 
> > > > > > > > > > > > converted
> > > > > > > from
> > > > > > > > the
> > > > > > > > > > > > root. But in some cases, the intermediate node must be 
> > > > > > > > > > > > converted
> > > > > > > after
> > > > > > > > > > > its
> > > > > > > > > > > > children. This is especially true for distributed SQL 
> > > > > > > > > > > > engines,
> > > > > > which
> > > > > > > > rely
> > > > > > > > > > > > on distribution traits during the optimization process: 
> > > > > > > > > > > > it is not
> > > > > > > > > > > possible
> > > > > > > > > > > > to efficiently choose a proper physical implementation 
> > > > > > > > > > > > of a parent
> > > > > > > > node
> > > > > > > > > > > > unless the physical representation of a child node is 
> > > > > > > > > > > > known.
> > > > > > > > > > > >
> > > > > > > > > > > > It seems that presently VolcanoPlanner cannot address 
> > > > > > > > > > > > such cases
> > > > > > by
> > > > > > > > > > > > default. Consider that we have two nodes and associated 
> > > > > > > > > > > > rules
> > > > > > which
> > > > > > > > > > > convert
> > > > > > > > > > > > them to physical counterparts:
> > > > > > > > > > > > [LogicalParent <- LogicalChild]
> > > > > > > > > > > > The parent should be converted after the child. When 
> > > > > > > > > > > > the child is
> > > > > > > > > > > > converted, the physical node is created:
> > > > > > > > > > > > [LogicalParent <- {LogicalChild, PhysicalChild}]
> > > > > > > > > > > > In order to finish the optimization process, we need to 
> > > > > > > > > > > > convert
> > > > > > the
> > > > > > > > > > > parent.
> > > > > > > > > > > > But parent rules are not fired, because PhysicalChild 
> > > > > > > > > > > > has traits
> > > > > > > > > > > > incompatible with LogicalParent.
> > > > > > > > > > > >
> > > > > > > > > > > > Presently the problem could be solved in two ways:
> > > > > > > > > > > > 1) Always produce conversions when going top-down. In 
> > > > > > > > > > > > this case, I
> > > > > > > > first
> > > > > > > > > > > > create a physical parent, then a physical child. The 
> > > > > > > > > > > > problem is
> > > > > > that
> > > > > > > > > > > > created parent is not optimal because it didn't know 
> > > > > > > > > > > > child
> > > > > > > > distribution
> > > > > > > > > > > at
> > > > > > > > > > > > the time of creation. So the full flow would be: create 
> > > > > > > > > > > > a bad
> > > > > > > parent,
> > > > > > > > > > > > create a good child, create a good parent.
> > > > > > > > > > > > 1.1) [LogicalParent <- LogicalChild]
> > > > > > > > > > > > 1.2) [{LogicalParent, PhysicalParentBad} <- 
> > > > > > > > > > > > LogicalChild]
> > > > > > > > > > > > 1.3) [{LogicalParent, PhysicalParentBad} <- 
> > > > > > > > > > > > {LogicalChild,
> > > > > > > > > > > PhysicalChild}]
> > > > > > > > > > > > 1.4) [{LogicalParent, PhysicalParentBad, 
> > > > > > > > > > > > PhysicalParentGood} <-
> > > > > > > > > > > > {LogicalChild, PhysicalChild}]
> > > > > > > > > > > > What is worse, the creation of a not optimal parent 
> > > > > > > > > > > > will trigger
> > > > > > > > rules on
> > > > > > > > > > > > its parent, which in turn may create a not optimal 
> > > > > > > > > > > > parent-parent
> > > > > > > node,
> > > > > > > > > > > etc.
> > > > > > > > > > > >
> > > > > > > > > > > > 2) Make sure that your convention returns true for
> > > > > > > > > > > > Convention.canConvertConvention. In this case, every 
> > > > > > > > > > > > time a new
> > > > > > rel
> > > > > > > is
> > > > > > > > > > > > added to a RelSet, its traitset will be compared to 
> > > > > > > > > > > > traitsets of
> > > > > > all
> > > > > > > > > > > other
> > > > > > > > > > > > rels in the set. For every pair of traitset we may ask 
> > > > > > > > > > > > the engine
> > > > > > to
> > > > > > > > > > > create
> > > > > > > > > > > > a relevant AbstractConverter. The net effect is that
> > > > > > > > > > > "physical-to-logical"
> > > > > > > > > > > > converter is created, which re-triggers the rule on the 
> > > > > > > > > > > > logical
> > > > > > > parent
> > > > > > > > > > > > since their conventions are compatible:
> > > > > > > > > > > > 2.1) [LogicalParent <- LogicalChild]
> > > > > > > > > > > > 2.2) [LogicalParent <- {LogicalChild, PhysicalChild}]
> > > > > > > > > > > > 2.3) [LogicalParent <- {LogicalChild, PhysicalChild,
> > > > > > > > > > > > PhysicalToLogicalConverter}]
> > > > > > > > > > > > 2.4) [{LogicalParent, PhysicalParent} <- {LogicalChild,
> > > > > > > PhysicalChild,
> > > > > > > > > > > > PhysicalToLogicalConverter}]
> > > > > > > > > > > >
> > > > > > > > > > > > The problem with that approach is that it is too 
> > > > > > > > > > > > coarse-grained
> > > > > > > since
> > > > > > > > we
> > > > > > > > > > > > operate on traitsets rather than rels. As a result, 
> > > > > > > > > > > > additional
> > > > > > > memory
> > > > > > > > and
> > > > > > > > > > > > CPU pressure are introduced because usually too many 
> > > > > > > > > > > > converters
> > > > > > are
> > > > > > > > > > > > created, which triggers the rules over and over again.
> > > > > > > > > > > >
> > > > > > > > > > > > **Affected products**
> > > > > > > > > > > > At the moment two distributed engines are being 
> > > > > > > > > > > > developed for
> > > > > > > > Hazelcast
> > > > > > > > > > > and
> > > > > > > > > > > > Apache Ignite. Both require bottom-up optimization and 
> > > > > > > > > > > > currently
> > > > > > > rely
> > > > > > > > on
> > > > > > > > > > > > the second workaround.
> > > > > > > > > > > > Another example is Apache Drill. I do not know whether 
> > > > > > > > > > > > its
> > > > > > community
> > > > > > > > is
> > > > > > > > > > > > concerned with the issue, but it also uses bottom-up 
> > > > > > > > > > > > optimization
> > > > > > > for
> > > > > > > > > > > many
> > > > > > > > > > > > rules and employs both the aforementioned workarounds. 
> > > > > > > > > > > > As a
> > > > > > result,
> > > > > > > I
> > > > > > > > > > > guess
> > > > > > > > > > > > that Drill's optimizer also creates too many rels during
> > > > > > > optimization
> > > > > > > > and
> > > > > > > > > > > > suffer from huge search space. Please correct me if I 
> > > > > > > > > > > > am wrong.
> > > > > > > > > > > >
> > > > > > > > > > > > **Proposal**
> > > > > > > > > > > > The key problem is that there is no way to re-fire 
> > > > > > > > > > > > rules on a
> > > > > > > specific
> > > > > > > > > > > > node. The original problem could have been solved, if 
> > > > > > > > > > > > it would be
> > > > > > > > > > > possible
> > > > > > > > > > > > to re-fire rules on a LogicalParent without creating any
> > > > > > additional
> > > > > > > > rels.
> > > > > > > > > > > > That would lead to a clear optimization flow:
> > > > > > > > > > > > 2.1) [LogicalParent <- LogicalChild]
> > > > > > > > > > > > 2.2) [LogicalParent <- {LogicalChild, PhysicalChild}]
> > > > > > > > > > > > 2.3) [{LogicalParent, PhysicalParent} <- {LogicalChild,
> > > > > > > > PhysicalChild}]
> > > > > > > > > > > >
> > > > > > > > > > > > We can add the following method to VolcanoPlanner 
> > > > > > > > > > > > (RelOptPlanner?)
> > > > > > > > > > > > interface:
> > > > > > > > > > > > void fireRules(RelNode rel)
> > > > > > > > > > > >
> > > > > > > > > > > > This method will fire the rules on a passed node in a 
> > > > > > > > > > > > deferred
> > > > > > mode
> > > > > > > > as if
> > > > > > > > > > > > it was the new node just added to the optimizer. This 
> > > > > > > > > > > > would
> > > > > > require
> > > > > > > > > > > slight
> > > > > > > > > > > > changes to RuleQueue.addMatch method, and possibly some 
> > > > > > > > > > > > other
> > > > > > > places.
> > > > > > > > > > > >
> > > > > > > > > > > > Usage example:
> > > > > > > > > > > > class PhysicalChildRule extends RelOptRule {
> > > > > > > > > > > > void onMatch(RelOptRuleCall call) {
> > > > > > > > > > > > LogicalChild logicalRel = call.get(0);
> > > > > > > > > > > > PhysicalChild physicalRel = ...;
> > > > > > > > > > > >
> > > > > > > > > > > > call.transformTo(physicalRel);
> > > > > > > > > > > > optimizer.fireRules(logicalRel);
> > > > > > > > > > > > }
> > > > > > > > > > > > }
> > > > > > > > > > > >
> > > > > > > > > > > > What does the community think about such a method? Does 
> > > > > > > > > > > > it make
> > > > > > any
> > > > > > > > sense
> > > > > > > > > > > > to you? If not, do you aware of any other ways on how 
> > > > > > > > > > > > to organize
> > > > > > > > > > > bottom-up
> > > > > > > > > > > > optimization with VolcanoPlanner without the creation of
> > > > > > additional
> > > > > > > > rels?
> > > > > > > > > > > >
> > > > > > > > > > > > If the community is OK in general, I can create try to 
> > > > > > > > > > > > create a PR
> > > > > > > > with a
> > > > > > > > > > > > prototype.
> > > > > > > > > > > >
> > > > > > > > > > > > Would appreciate your feedback.
> > > > > > > > > > > >
> > > > > > > > > > > > Regards,
> > > > > > > > > > > > Vladimir.
> > > > > > > > > > > >
> > > > > > > > > > > > [1]
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > https://ponymail-vm.apache.org/_GUI_/thread.html/13e7ab2040bfa4902db6647992ec4203ceb0262cfcb28d38ef7e9e32@%3Cdev.calcite.apache.org%3E
> > > > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> > > >
>

Reply via email to