Danny, I am not sure how this would affect join re-order. Could you elaborate?
> On Jan 7, 2020, at 1:29 AM, Danny Chan <yuzhao....@gmail.com> wrote:
>
> Hi guys, it seems that the discussion has gone silent now, so do we have
> some conclusion that can be contributed to the current code, i.e. the
> suggested API change or a new abstraction?
>
> Or better, can someone write a design doc so that we can push this forward
> and get it implemented?
>
> Personally, I was always looking forward to the result, because Apache
> Flink suffers from bad planning performance for join re-ordering and trait
> auto-adaptation.
>
> Best,
> Danny Chan
>
> On Nov 20, 2019, 2:14 AM +0800, Vladimir Ozerov <ppoze...@gmail.com> wrote:
>> Hi Igor,
>>
>> Thank you for the details. Meanwhile, I solved it with a separation of the
>> conversion rules from the physical optimization rules. So the first pass
>> creates physical nodes with unknown physical properties (top-down), while
>> subsequent processing of the leaf nodes triggers rules which convert "bad"
>> physical nodes to "good" physical nodes with known distribution and
>> collation.
>>
>> Regards,
>> Vladimir.
>>
>> On Mon, Nov 18, 2019 at 13:43, Seliverstov Igor <gvvinbl...@gmail.com> wrote:
>>>
>>> Vladimir,
>>>
>>> Hope it may help you.
>>>
>>> Currently we apply the following approach (just a rough description):
>>>
>>> 1) We created an API to derive the possible trait permutations on the
>>> basis of the children's traits (quite similar to the one described in the
>>> «On Demand trait set request» topic).
>>>
>>> 2) We added a general rule that copies logical nodes, but requests our
>>> convention from their children (IGNITE convention, ANY distribution,
>>> EMPTY collation) and sets the importance of the old logical nodes to
>>> zero - so we have a logical parent whose input satisfies any possible
>>> distribution, and no rules match the previous logical node any more.
>>> 3) Physical rules create physical rel nodes only if physical traits may
>>> be derived (there is no «barrier», described in one of the previous
>>> messages). Derived traits are a collection; we don't create a physical
>>> rel node for each possible trait set, and we may also set zero importance
>>> for previously created rel nodes to decrease the search space.
>>>
>>> Now that we know the actual and the required distribution, we don't need
>>> AbstractConverters and are able to just call the TraitDef.convert()
>>> method inside a rule. A rule is still able to produce the same output
>>> several times, but "memoization" inside the planner solves that for us.
>>>
>>> Preliminary tests show almost zero overhead of the approach.
>>>
>>> Regards,
>>> Igor
>>>
>>>> On Nov 14, 2019, at 14:49, Vladimir Ozerov <ppoze...@gmail.com> wrote:
>>>>
>>>> Hi Xing,
>>>>
>>>> Thanks for your suggestion. Yes, this may help, and if I get your idea
>>>> right, I already had it in my reproducer:
>>>> 1) Create the converted physical input:
>>>> https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L49
>>>> 2) Use it in case no physical children were found:
>>>> https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L79
>>>>
>>>> This idea is borrowed from Apache Drill's physical rules. But the
>>>> problem is that this approach leads to a suboptimal plan - the parent
>>>> node doesn't know the future distribution of a child node, and as a
>>>> result it doesn't know its own distribution. So the final plan is
>>>> constructed this way:
>>>> 1.1) Root enforces "SINGLETON" on its child:
>>>> -> PhysicalRoot[SINGLETON]
>>>>  -> Converter[SINGLETON]
>>>>   -> PhysicalProject[*ANY*]
>>>>    -> PhysicalScan[REPLICATED]
>>>>
>>>> 1.2) But since the child (PhysicalProject) failed to infer its
>>>> distribution during the rule call, it falls back to ANY distribution.
>>>> In order to satisfy the SINGLETON distribution of the parent, we inject
>>>> an exchange into the final plan:
>>>> -> PhysicalRoot[SINGLETON]
>>>>  -> *Exchange[SINGLETON]*
>>>>   -> PhysicalProject[*ANY*]
>>>>    -> PhysicalScan[REPLICATED]
>>>>
>>>> 2) But this is a suboptimal plan. The optimal plan is:
>>>> -> PhysicalRoot[SINGLETON]
>>>>  -> PhysicalProject[REPLICATED]
>>>>   -> PhysicalScan[REPLICATED]
>>>>
>>>> You may observe it in my tests:
>>>> 1) https://github.com/devozerov/calcite-optimizer/blob/master/src/test/java/devozerov/OptimizerTest.java#L46
>>>> - works as you described and produces a non-optimal plan with an exchange
>>>> 2) https://github.com/devozerov/calcite-optimizer/blob/master/src/test/java/devozerov/OptimizerTest.java#L30
>>>> - relies on AbstractConverter-s and produces an optimal plan with
>>>> bottom-up trait propagation, at the cost of significantly increased
>>>> planning time
>>>>
>>>> Regards,
>>>> Vladimir.
>>>>
>>>> On Fri, Nov 8, 2019 at 16:15, XING JIN <jinxing.co...@gmail.com> wrote:
>>>>>
>>>>> Hi Vladimir,
>>>>>
>>>>> I think the way PlannerTests#GoodSingleRule and the EnumerableXXXRule-s
>>>>> work may help you ~
>>>>> They work in a top-down fashion, but when matching the parent, they
>>>>> convert the children explicitly.
>>>>> You may try the steps below:
>>>>> 1. Construct a rule LogicalParentRule to match the LogicalParent
>>>>> without a distribution/physical requirement for the LogicalChild;
>>>>> 2. In this rule, call 'planner.changeTraits' on the LogicalChild to
>>>>> build a new child with the physical convention. Note that at this
>>>>> moment only an empty RelSubset is created and no PhysicalChild exists
>>>>> yet;
>>>>> 3. Then set this RelNode to be the new input of the LogicalParent.
>>>>>
>>>>> By the above steps you can build a parent-child relationship between
>>>>> the LogicalParent and the PhysicalChild, and at last the
>>>>> PhysicalParentRule will be fired based on this relationship.
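The three steps above might look roughly like the following as a Calcite rule. This is an untested sketch: LogicalParent, LogicalParentRule, and PHYSICAL_CONVENTION are placeholder names used in this thread, not classes that ship with Calcite.

```java
// Untested sketch of Jin's steps 1-3. LogicalParent and PHYSICAL_CONVENTION
// are placeholders from this discussion, not real Calcite classes.
public class LogicalParentRule extends RelOptRule {
  public LogicalParentRule() {
    // Step 1: match LogicalParent with any child, no physical requirement.
    super(operand(LogicalParent.class, any()), "LogicalParentRule");
  }

  @Override public void onMatch(RelOptRuleCall call) {
    final LogicalParent parent = call.rel(0);
    final RelNode child = parent.getInput();

    // Step 2: request the child in the physical convention. At this point
    // only an empty RelSubset is registered; no PhysicalChild exists yet.
    final RelNode newChild = call.getPlanner().changeTraits(
        child, child.getTraitSet().replace(PHYSICAL_CONVENTION));

    // Step 3: make the (future) physical child the new input of the parent,
    // so that the PhysicalParentRule can later fire on this relationship.
    call.transformTo(
        parent.copy(parent.getTraitSet(), ImmutableList.of(newChild)));
  }
}
```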
>>>>> I have a commit to illustrate my idea; check VolcanoPlannerTest#testDEV
>>>>> in the link below, hope it may help you ~
>>>>> https://github.com/jinxing64/calcite/tree/demo
>>>>>
>>>>> Also I'm +1 with Seliverstov on getting all parents of a set, which
>>>>> goes against the current check in RelSubset#getParentRels.
>>>>>
>>>>> Best,
>>>>> Jin
>>>>>
>>>>> On Tue, Nov 5, 2019 at 6:41 PM, Vladimir Ozerov <ppoze...@gmail.com> wrote:
>>>>>>
>>>>>> Hi Xiening,
>>>>>>
>>>>>> I read the thread about on-demand trait requests. It seems pretty
>>>>>> similar to what I am trying to achieve, as it facilitates the
>>>>>> bottom-up propagation of physical traits. In fact, both your strategy
>>>>>> and mine propagate traits bottom-up, but I do this through rules,
>>>>>> which also fire bottom-up, while in your case only the traits are
>>>>>> propagated bottom-up, while the rules continue working in a top-down
>>>>>> fashion.
>>>>>>
>>>>>> However, I am thinking of how I would potentially implement my
>>>>>> optimizer with your approach, and it feels like with on-demand traits
>>>>>> the resulting implementation of the metadata queries may become so
>>>>>> complex that it will look like another set of rules, parallel to the
>>>>>> already existing rule set. For example, consider that I have a couple
>>>>>> of distributed tables in an OLTP application. These tables have a
>>>>>> number of indexes, and I would like to join them. First, I have a
>>>>>> number of choices on how to join the tables with respect to
>>>>>> distribution. Then, I have a number of choices of which access method
>>>>>> to use, because sometimes it is beneficial to pick index scans instead
>>>>>> of table scans even without index conditions, for example, to preserve
>>>>>> a convenient collation. So when my logical scan receives such a
>>>>>> metadata request, it typically cannot return all possible
>>>>>> combinations, because there are too many of them.
>>>>>> Instead, some heuristic or cost-based logic will be used to calculate
>>>>>> a couple of the most promising ones. But it seems that we would have
>>>>>> to duplicate the same logic in the corresponding rule, wouldn't we?
>>>>>>
>>>>>> I would love to read your design, because this is a really interesting
>>>>>> topic, and it is of great importance for the distributed engines
>>>>>> developed on top of Calcite, since proper use of distribution and
>>>>>> collation is the key success factor for efficient query optimization.
>>>>>>
>>>>>> Regards,
>>>>>> Vladimir.
>>>>>>
>>>>>> On Fri, Nov 1, 2019 at 00:40, Xiening Dai <xndai....@gmail.com> wrote:
>>>>>>>
>>>>>>> Actually we solved this problem in our setup using a mechanism called
>>>>>>> "Pull-Up Traits", which explores the possible trait sets of a child's
>>>>>>> input to decide the parent's physical properties. In order to
>>>>>>> determine the child's input trait, you have to look at the child's
>>>>>>> children, and all the way down to the leaf nodes or to a barrier. A
>>>>>>> barrier is a rel node which cannot derive any traits regardless of
>>>>>>> its input; a good example would be a user-defined function which
>>>>>>> throws off any distribution or collation. Then we realized that just
>>>>>>> pulling up is not enough; sometimes we also need to look at the
>>>>>>> parent's requirement. So we are trying to solve this in a unified
>>>>>>> framework, which we call "On Demand Trait", and implement it as part
>>>>>>> of the framework so anyone can benefit from it. I hope Haisheng can
>>>>>>> share a design doc once we have more concrete ideas.
>>>>>>>
>>>>>>>> On Oct 31, 2019, at 11:37 AM, Jinfeng Ni <j...@apache.org> wrote:
>>>>>>>>
>>>>>>>> Hi Vladimir,
>>>>>>>>
>>>>>>>> The SubsetTransformer interface and the iteration over the RelNodes
>>>>>>>> within a RelSubset in Drill are implemented exactly to do this trait
>>>>>>>> propagation.
>>>>>>>> We also had to rely on AbstractConverter to fire the necessary rules
>>>>>>>> and avoid the CannotPlan issue. At some point, the Calcite community
>>>>>>>> chose to remove AbstractConverter, and Drill had to add it back,
>>>>>>>> which is probably one of the main reasons for us to continue using a
>>>>>>>> Calcite fork. I still remember we constantly had to deal with the
>>>>>>>> dilemma between "CannotPlan" and long planning time due to the
>>>>>>>> explored search space.
>>>>>>>>
>>>>>>>> Glad to see more people joining the effort to solve this
>>>>>>>> long-overdue issue, something missing in Calcite's core optimizer
>>>>>>>> framework "since before Calcite was Calcite" (Jacques's words).
>>>>>>>>
>>>>>>>> Jinfeng
>>>>>>>>
>>>>>>>> On Thu, Oct 31, 2019 at 3:38 AM Vladimir Ozerov <ppoze...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Hi Danny,
>>>>>>>>>
>>>>>>>>> Thank you very much for the links. What is described there is
>>>>>>>>> pretty much the problem I describe, especially the discussion about
>>>>>>>>> trait propagation, as this is basically what I need - to explore
>>>>>>>>> the potential traits of children before optimizing their parents.
>>>>>>>>> And this is basically what Drill already does with its
>>>>>>>>> SubsetTransformer:
>>>>>>>>> 1) There is a SubsetTransformer interface, which iterates over the
>>>>>>>>> physical relations of the given subset [1]
>>>>>>>>> 2) If you want to make a physical project, you iterate over the
>>>>>>>>> physical relations of the input subset and create the possible
>>>>>>>>> physical projects [2]
>>>>>>>>> 3) But if you cannot find any physical input, then you trigger the
>>>>>>>>> creation of a "bad" physical project, which is very likely to have
>>>>>>>>> a poor cost because it cannot take advantage of the input's
>>>>>>>>> distribution information [3]
>>>>>>>>> So, step 2 is the trait set propagation which is needed by many
>>>>>>>>> distributed engines.
>>>>>>>>> Step 3 is an attempt to work around the current VolcanoPlanner
>>>>>>>>> behavior, where a parent rule is fired only if the produced child
>>>>>>>>> node has a compatible trait set.
>>>>>>>>>
>>>>>>>>> I do not know Calcite's architecture that well, but at first glance
>>>>>>>>> the proposed ability to re-fire the rules of a specific rel seems
>>>>>>>>> good enough. It doesn't expand the search space, because no new
>>>>>>>>> nodes are created, and it seems relatively simple to implement.
>>>>>>>>>
>>>>>>>>> [1] https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/SubsetTransformer.java
>>>>>>>>> [2] https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/ProjectPrule.java#L66
>>>>>>>>> [3] https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/ProjectPrule.java#L69
>>>>>>>>>
>>>>>>>>> On Thu, Oct 31, 2019 at 12:21, Danny Chan <yuzhao....@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Thanks Vladimir for bringing up this discussion!
>>>>>>>>>>
>>>>>>>>>> Did you notice that there is a JIRA issue about this problem?
>>>>>>>>>> [1] Also a discussion about how to propagate the traits [2]
>>>>>>>>>>
>>>>>>>>>> [1] https://issues.apache.org/jira/browse/CALCITE-2970
>>>>>>>>>> [2] https://ponymail-vm.apache.org/_GUI_/thread.html/79dac47ea50b5dfbd3f234e368ed61d247fb0eb989f87fe01aedaf25@%3Cdev.calcite.apache.org%3E
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Danny Chan
>>>>>>>>>>
>>>>>>>>>> On Oct 31, 2019, 3:56 PM +0800, Vladimir Ozerov <ppoze...@gmail.com> wrote:
>>>>>>>>>>> Hi colleagues,
>>>>>>>>>>>
>>>>>>>>>>> I would like to discuss with the community the possibility of
>>>>>>>>>>> adding a new public method to VolcanoPlanner which will
>>>>>>>>>>> forcefully re-trigger the rules for a specific rel. This is a
>>>>>>>>>>> follow-up of a discussion started in the other thread [1].
>>>>>>>>>>>
>>>>>>>>>>> **Problem statement**
>>>>>>>>>>> When converting between conventions during optimization,
>>>>>>>>>>> VolcanoPlanner prefers a top-down approach, so that the nodes are
>>>>>>>>>>> converted starting from the root. But in some cases an
>>>>>>>>>>> intermediate node must be converted after its children. This is
>>>>>>>>>>> especially true for distributed SQL engines, which rely on
>>>>>>>>>>> distribution traits during the optimization process: it is not
>>>>>>>>>>> possible to efficiently choose a proper physical implementation
>>>>>>>>>>> of a parent node unless the physical representation of its child
>>>>>>>>>>> node is known.
>>>>>>>>>>>
>>>>>>>>>>> It seems that presently VolcanoPlanner cannot address such cases
>>>>>>>>>>> by default. Consider that we have two nodes and associated rules
>>>>>>>>>>> which convert them to physical counterparts:
>>>>>>>>>>> [LogicalParent <- LogicalChild]
>>>>>>>>>>> The parent should be converted after the child.
>>>>>>>>>>> When the child is converted, the physical node is created:
>>>>>>>>>>> [LogicalParent <- {LogicalChild, PhysicalChild}]
>>>>>>>>>>> In order to finish the optimization process, we need to convert
>>>>>>>>>>> the parent. But the parent rules are not fired, because
>>>>>>>>>>> PhysicalChild has traits incompatible with LogicalParent.
>>>>>>>>>>>
>>>>>>>>>>> Presently the problem can be solved in two ways:
>>>>>>>>>>> 1) Always produce conversions when going top-down. In this case,
>>>>>>>>>>> I first create a physical parent, then a physical child. The
>>>>>>>>>>> problem is that the created parent is not optimal, because it
>>>>>>>>>>> didn't know the child's distribution at the time of creation. So
>>>>>>>>>>> the full flow would be: create a bad parent, create a good child,
>>>>>>>>>>> create a good parent:
>>>>>>>>>>> 1.1) [LogicalParent <- LogicalChild]
>>>>>>>>>>> 1.2) [{LogicalParent, PhysicalParentBad} <- LogicalChild]
>>>>>>>>>>> 1.3) [{LogicalParent, PhysicalParentBad} <- {LogicalChild, PhysicalChild}]
>>>>>>>>>>> 1.4) [{LogicalParent, PhysicalParentBad, PhysicalParentGood} <- {LogicalChild, PhysicalChild}]
>>>>>>>>>>> What is worse, the creation of a non-optimal parent will trigger
>>>>>>>>>>> rules on its parent, which in turn may create a non-optimal
>>>>>>>>>>> parent-parent node, etc.
>>>>>>>>>>>
>>>>>>>>>>> 2) Make sure that your convention returns true for
>>>>>>>>>>> Convention.canConvertConvention. In this case, every time a new
>>>>>>>>>>> rel is added to a RelSet, its trait set will be compared to the
>>>>>>>>>>> trait sets of all other rels in the set. For every pair of trait
>>>>>>>>>>> sets we may ask the engine to create a relevant AbstractConverter.
>>>>>>>>>>> The net effect is that a "physical-to-logical" converter is
>>>>>>>>>>> created, which re-triggers the rule on the logical parent, since
>>>>>>>>>>> their conventions are compatible:
>>>>>>>>>>> 2.1) [LogicalParent <- LogicalChild]
>>>>>>>>>>> 2.2) [LogicalParent <- {LogicalChild, PhysicalChild}]
>>>>>>>>>>> 2.3) [LogicalParent <- {LogicalChild, PhysicalChild, PhysicalToLogicalConverter}]
>>>>>>>>>>> 2.4) [{LogicalParent, PhysicalParent} <- {LogicalChild, PhysicalChild, PhysicalToLogicalConverter}]
>>>>>>>>>>>
>>>>>>>>>>> The problem with this approach is that it is too coarse-grained,
>>>>>>>>>>> since we operate on trait sets rather than on rels. As a result,
>>>>>>>>>>> additional memory and CPU pressure is introduced, because usually
>>>>>>>>>>> too many converters are created, which triggers the rules over
>>>>>>>>>>> and over again.
>>>>>>>>>>>
>>>>>>>>>>> **Affected products**
>>>>>>>>>>> At the moment two distributed engines are being developed for
>>>>>>>>>>> Hazelcast and Apache Ignite. Both require bottom-up optimization
>>>>>>>>>>> and currently rely on the second workaround.
>>>>>>>>>>> Another example is Apache Drill. I do not know whether its
>>>>>>>>>>> community is concerned with the issue, but it also uses bottom-up
>>>>>>>>>>> optimization for many rules and employs both of the
>>>>>>>>>>> aforementioned workarounds. As a result, I guess that Drill's
>>>>>>>>>>> optimizer also creates too many rels during optimization and
>>>>>>>>>>> suffers from a huge search space. Please correct me if I am wrong.
>>>>>>>>>>>
>>>>>>>>>>> **Proposal**
>>>>>>>>>>> The key problem is that there is no way to re-fire rules on a
>>>>>>>>>>> specific node.
>>>>>>>>>>> The original problem could be solved if it were possible to
>>>>>>>>>>> re-fire rules on a LogicalParent without creating any additional
>>>>>>>>>>> rels. That would lead to a clear optimization flow:
>>>>>>>>>>> 2.1) [LogicalParent <- LogicalChild]
>>>>>>>>>>> 2.2) [LogicalParent <- {LogicalChild, PhysicalChild}]
>>>>>>>>>>> 2.3) [{LogicalParent, PhysicalParent} <- {LogicalChild, PhysicalChild}]
>>>>>>>>>>>
>>>>>>>>>>> We can add the following method to the VolcanoPlanner
>>>>>>>>>>> (RelOptPlanner?) interface:
>>>>>>>>>>> void fireRules(RelNode rel)
>>>>>>>>>>>
>>>>>>>>>>> This method will fire the rules on the passed node in a deferred
>>>>>>>>>>> mode, as if it were a new node just added to the optimizer. This
>>>>>>>>>>> would require slight changes to the RuleQueue.addMatch method,
>>>>>>>>>>> and possibly to some other places.
>>>>>>>>>>>
>>>>>>>>>>> Usage example:
>>>>>>>>>>> class PhysicalChildRule extends RelOptRule {
>>>>>>>>>>>   void onMatch(RelOptRuleCall call) {
>>>>>>>>>>>     LogicalChild logicalRel = call.rel(0);
>>>>>>>>>>>     PhysicalChild physicalRel = ...;
>>>>>>>>>>>
>>>>>>>>>>>     call.transformTo(physicalRel);
>>>>>>>>>>>     call.getPlanner().fireRules(logicalRel);
>>>>>>>>>>>   }
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> What does the community think about such a method? Does it make
>>>>>>>>>>> any sense to you? If not, are you aware of any other ways to
>>>>>>>>>>> organize bottom-up optimization with VolcanoPlanner without the
>>>>>>>>>>> creation of additional rels?
>>>>>>>>>>>
>>>>>>>>>>> If the community is OK in general, I can try to create a PR with
>>>>>>>>>>> a prototype.
>>>>>>>>>>>
>>>>>>>>>>> Would appreciate your feedback.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Vladimir.
>>>>>>>>>>>
>>>>>>>>>>> [1] https://ponymail-vm.apache.org/_GUI_/thread.html/13e7ab2040bfa4902db6647992ec4203ceb0262cfcb28d38ef7e9e32@%3Cdev.calcite.apache.org%3E
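Workaround 2 from the proposal above hinges on the convention opting into AbstractConverter creation. A minimal sketch of such a convention follows, assuming a placeholder PHYSICAL convention and a PhysicalRel marker interface (an untested sketch, not part of any of the projects discussed):

```java
// Sketch only: "PHYSICAL" and PhysicalRel are placeholder names for an
// engine-specific convention, not classes that exist in Calcite itself.
public static final Convention PHYSICAL =
    new Convention.Impl("PHYSICAL", PhysicalRel.class) {
      @Override public boolean canConvertConvention(Convention toConvention) {
        // Returning true lets the planner register AbstractConverters
        // between trait sets of the same RelSet, which re-triggers parent
        // rules - at the cost of a larger search space.
        return true;
      }

      @Override public boolean useAbstractConvertersForConversion(
          RelTraitSet fromTraits, RelTraitSet toTraits) {
        // Ask for an AbstractConverter whenever the trait sets differ.
        return !fromTraits.equals(toTraits);
      }
    };
```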