Hi Roman,

Excellent! This is definitely a helpful contribution to the Calcite community.
Thank you for your efforts.

Haisheng

On 2020/04/26 19:25:00, Roman Kondakov <[email protected]> wrote: 
> Hi everyone!
> 
> Haisheng, thank you for bringing this subject up. A new Cascades-style
> optimizer should definitely be the next step for Apache Calcite. Many
> projects suffer from the lack of this kind of optimizer.
> 
> That was the reason why, several weeks ago, I started working on a
> prototype of a Cascades optimizer for Apache Calcite. I was not sure
> that I would build something useful without much experience in this
> area, but now I'd like to share my work with the community. You can
> find the Cascades prototype in PR [1]. This prototype is based on the
> well-known paper [2].
> 
> What the prototype can do:
> - Top-down trait request
> - Convert traits without Abstract converters
> - Top-down rule apply
> - Bottom-up trait derivation
> 
> What is not supported yet:
> - Search space pruning (but I'm going to add it in the next commits)
> - Materialized views
> - Planner hooks
> - Hints
> - Backward compatibility with Volcano planner (some research needed)
> 
> I prepared a design doc for this planner [3], you can find many details
> there. I also opened it for comments.
> 
> I've written several basic test cases in
> org.apache.calcite.plan.cascades.CascadesPlannerTest, including the one
> that was discussed previously in this thread in the context of trait requests:
> 
> >  MergeJoin hash[a] 
> >           |---- TableScan R hash[a] (RelSubset)
> >           +---- TableScan S hash[a] (RelSubset)
> 
> 
> Haisheng, this is very similar to what you propose. Answering your question:
> 
> > There are 2 ways:
> > a) Modify on current VolcanoPlanner.
> >   Pros: code reuse, existing numerous test cases and infrastructure, fast 
> > integration
> >   Cons: changing code always brings risk
> > 
> > b) Add a new XXXXPlanner
> >   Pros: no risk, no tech debt, no need to worry about backward compatibility
> >   Cons: separate test cases for new planner, one more planner to maintain
> 
> I've chosen the second approach, because I don't have a clear
> understanding of how to fix the Volcano planner gradually.
> 
> This new planner is very far from perfect, but I think it can be a good
> starting point for the community.
> 
> Please, share your thoughts about this planner.
> 
> 
> [1] PR: https://github.com/apache/calcite/pull/1948
> [2] Paper:
> https://15721.courses.cs.cmu.edu/spring2019/papers/22-optimizer1/xu-columbia-thesis1998.pdf
> [3] Design doc:
> https://docs.google.com/document/d/1qaV3eSKTw4gfLuBR3XB_LVIc51hxIng9k1J4VD6qnRg/edit?usp=sharing
> 
> 
> 
> -- 
> Kind Regards
> Roman Kondakov
> 
> 
> On 22.04.2020 09:52, Danny Chan wrote:
> >> Is there any recommended approach to make that happen smoothly besides
> >> coding and testing work? We need to be aware that the new planner might
> >> co-exist with VolcanoPlanner for 5 or more years, or even never replace
> >> VolcanoPlanner.
> > 
> > If that is true, I would say the new planner probably has a design that
> > is not good enough. We should identify in advance the cases/reasons for
> > which users would keep the old VolcanoPlanner, and we *must* provide
> > solutions for those problems in the new design.
> > 
> > I was expecting that migrating to a new planner would take at least a
> > year of development. If that is true, modifying the current planner
> > directly means that for the next 3~4 Calcite versions, each release
> > would bring huge plan changes/bugs, which I believe no user of Calcite
> > wants to see. And no one can guarantee that modifying directly keeps
> > good stability and compatibility; only the test set can.
> > 
> > From the experience of the Alibaba Blink planner, which has been
> > contributed to Apache Flink: yes, the old and new planner would co-exist
> > for at least 2 years, because the new and old planners have different
> > abilities in some corner cases.
> > 
> > From my point of view, we should at least:
> > - Provide a convincing test set for the new planner that makes us
> > believe the new planner is stable and powerful enough. Obviously, the
> > current rule tests are far from enough to support the new planner.
> > - Provide a more detailed design doc about the new planner, especially
> > about the interface changes and any change that would bring in
> > compatibility problems. Then we can make a more accurate decision about
> > how much work the new planner would bring in, and decide whether
> > switching to pure new-planner development is a good idea or whether to
> > modify the existing one.
> > 
> > 
> > Haisheng Yuan <[email protected]> 于2020年4月22日周三 上午9:45写道:
> > 
> >> Hi Andrii,
> >>
> >>> Obviously, from what is written here, I could guess that this would
> >> require me to change my physical planning rules, even if only by
> >> implementing a marker interface.
> >> You don't need to change your physical rules; they will be treated the
> >> same as logical rules and applied together with the real logical rules,
> >> with no more logical/physical rule difference. This is also how the
> >> current VolcanoPlanner works.
> >>
> >>> I don't want you to think that I somehow resent the changes you are
> >> pushing.
> >> Don't get me wrong. I am seriously thinking of reverting these changes.
> >> Since most people like the idea of adding a new planner, why don't we
> >> make all the plan changes in the new planner, instead of forcing people
> >> to change test cases for code changes in VolcanoPlanner that they might
> >> not need during an upgrade?
> >>
> >> I didn't intend to replace VolcanoPlanner; I thought of just changing
> >> the search strategy and adding a trait derivation mechanism, because
> >> most of the code in VolcanoPlanner can be reused. But since many agree
> >> on adding a new planner, with replacing VolcanoPlanner as the final
> >> goal, I won't go against most people's decision.
> >>
> >> Is there any recommended approach to make that happen smoothly besides
> >> coding and testing work? We need to be aware that the new planner might
> >> co-exist with VolcanoPlanner for 5 or more years, or even never replace
> >> VolcanoPlanner.
> >>
> >> More thoughts are welcome.
> >>
> >> Haisheng
> >>
> >> On 2020/04/21 19:56:25, Андрей Цвелодуб <[email protected]> wrote:
> >>> Hello Haisheng,
> >>>
> >>>> To keep backward compatibility, all the un-marked rules will be
> >>>> treated as logical rules, except rules that use AbstractConverter as
> >>>> a rule operand; these rules still need to be applied top-down, or in
> >>>> random order.
> >>> Obviously, from what is written here, I could guess that this would
> >> require
> >>> me to change my physical planning rules, even if only by implementing a
> >>> marker interface. I am not saying this is a bad thing, but this is a
> >> thing
> >>> that should be communicated and planned ahead in case the VolcanoPlanner
> >> is
> >>> modified.
> >>>
> >>>> Looks like I have to revert changes in CALCITE-2970 and CALCITE-3753,
> >>>> because they will cause another ton of plan changes.
> >>> I see you are still bitter due to all the discussions on this list
> >> lately,
> >>> I'm sorry. I don't want you to think that I somehow resent the changes
> >> you
> >>> are pushing, au contraire I support them and would be happy to help if I
> >>> can. I just want the process of these changes to be executed in the best
> >>> possible way.
> >>> As I see there are already several opinions in this thread that basically
> >>> align with what I am saying, so I guess I am not the crazy guy running
> >>> around and yelling "the end is nigh!".
> >>>
> >>> Thank you for taking these mumbled thoughts into account.
> >>>
> >>> Bestest Regards,
> >>> Andrii Tsvielodub
> >>>
> >>> On Tue, 21 Apr 2020 at 21:08, Haisheng Yuan <[email protected]> wrote:
> >>>
> >>>> Hi Andrii,
> >>>>
> >>>>> I guess changing the planner would lead to changes in tons of rules
> >> and
> >>>> even more tests.
> >>>> Obviously you didn't read through my email. You are not required to
> >>>> make any changes to your rules if you don't want to; if you do, you
> >>>> just need to mark the rule to tell the planner whether it is a
> >>>> physical rule or not, simply by implementing an empty interface.
> >>>>
> >>>>> many on this list already experienced problems with upgrading even
> >>>> between the minor versions of Calcite.
> >>>> Sorry to see the problem you have experienced when upgrading Calcite.
> >>>> Looks like I have to revert changes in CALCITE-2970 and CALCITE-3753,
> >>>> because they will cause another ton of plan changes.
> >>>>
> >>>> But I will see if I can add a setting to use the old search strategy,
> >>>> which can be left untouched.
> >>>>
> >>>> Haisheng
> >>>>
> >>>> On 2020/04/21 06:33:08, Андрей Цвелодуб <[email protected]> wrote:
> >>>>> Hello everyone,
> >>>>>
> >>>>> First of all, thanks for this great effort of improving the core
> >> parts of
> >>>>> the framework we all are using,
> >>>>> I believe this is long overdue and hope this will have benefits both
> >> for
> >>>>> the maintainers and users of the library.
> >>>>>
> >>>>> I don't have anything to say about the general idea at the moment,
> >>>>> but I want to make a point that maintaining the old implementation of
> >>>>> VolcanoPlanner during
> >>>>> the initial stages of implementing the new planner is absolutely
> >>>> CRITICAL.
> >>>>> As a lot of users of Calcite do various customizations to the
> >> engine, to
> >>>>> the rules
> >>>>> and all that is there in between, I believe changing the
> >> implementation
> >>>> of
> >>>>> the core component
> >>>>> would have a huge impact on most users of the library. I think many
> >> on
> >>>> this
> >>>>> list
> >>>>> already experienced problems with upgrading even between the minor
> >>>> versions
> >>>>> of Calcite,
> >>>>> so I guess changing the planner would lead to changes in tons of
> >> rules
> >>>> and
> >>>>> even more tests.
> >>>>>
> >>>>> I don't have anything against replacing VolcanoPlanner as a final
> >> goal of
> >>>>> this effort,
> >>>>> but I don't think that modifying it directly and merging it to
> >> master is
> >>>> a
> >>>>> viable development approach.
> >>>>> While I understand how burdensome it is to maintain several parallel
> >> core
> >>>>> components at once
> >>>>> (we did this while moving the engine of our product to Calcite), we
> >>>> should
> >>>>> still respect those who depend
> >>>>> on it and not introduce the risks related to the development of a new
> >>>>> component into existing processing flows.
> >>>>>
> >>>>> A good model to try to follow would be the way new Garbage
> >> Collectors are
> >>>>> introduced in Java.
> >>>>> First, add it as an experimental option, then make it generally
> >>>> available,
> >>>>> then after everyone agrees
> >>>>> this is the best option - make it the default one.
> >>>>> With this approach, everyone can then move to the new planner at
> >> their
> >>>> own
> >>>>> pace, guaranteeing a smooth transition overall.
> >>>>> Yes, this could take some time, maybe even a year, but this is the
> >> price
> >>>> of
> >>>>> doing major changes in a popular framework.
> >>>>>
> >>>>> Again, thank you for initiating this discussion and leading this
> >> effort.
> >>>>>
> >>>>> Best Regards,
> >>>>> Andrii Tsvielodub
> >>>>>
> >>>>> On Tue, 21 Apr 2020 at 07:51, Jinpeng Wu <[email protected]>
> >> wrote:
> >>>>>
> >>>>>> Hi, Xiening.
> >>>>>>
> >>>>>> Regarding calculating the logical cost, here are some ways I
> >>>>>> thought of:
> >>>>>> 1. Logical rels may implement their own computeSelfCost method.
> >>>>>> Some rels can provide such information; for example,
> >>>>>> LogicalProject/LogicalFilter contain nearly the same information as
> >>>>>> their physical implementations. If we don't have enough confidence,
> >>>>>> just returning zero cost is also OK, as it only affects pruning.
> >>>>>> 2. A logical rel tells its parents what its physical input could be
> >>>>>> after implementation. Then the problem comes back to calculating
> >>>>>> the lower bound of a physical rel.
> >>>>>> There should always be ways. The only problem is how to find a
> >>>>>> pretty one.
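For illustration only, way 1 could look roughly like this. The class and method names below are made up for the sketch, not Calcite API; the zero-cost fallback mirrors the "just return zero cost" suggestion:

```java
// Illustrative-only sketch: a logical rel supplies a rough self cost
// when it is confident, and zero cost otherwise. The fallback is safe
// because an under-estimate only weakens pruning, never correctness.
class LogicalSelfCost {
    static double selfCost(String relType, double rowCount) {
        switch (relType) {
            case "LogicalProject":
            case "LogicalFilter":
                // Nearly the same cost shape as the physical implementation.
                return rowCount;
            default:
                return 0.0; // no confidence: never over-estimates
        }
    }
}
```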
> >>>>>>
> >>>>>> Regarding the risk, a new planner does have a different risk
> >>>>>> profile. It is not that a new planner could stop us from doing
> >>>>>> something wrong, but that we can decide when to use the new one.
> >>>>>> Some scenarios:
> >>>>>> 1. If we modify VolcanoPlanner directly, the only way users can
> >>>>>> control the risk is not to upgrade the Calcite version until it is
> >>>>>> considered stable. You know, that is quite different from keeping
> >>>>>> Calcite updated and switching to the new planner at a proper time.
> >>>>>> 2. It is very important for SLA control. For important business and
> >>>>>> jobs, we may keep using the old and stable planner, and use the new
> >>>>>> one only for jobs that have fault tolerance. This also helps test
> >>>>>> the new planner in actual scenarios.
> >>>>>> 3. It is helpful when upgrading online services. If the new planner
> >>>>>> happens to have some bugs, we can switch to the old planner
> >>>>>> directly without rolling back the whole service.
> >>>>>> 4. With all these ways to prevent issues from becoming disasters,
> >>>>>> we are less vulnerable to mistakes. This not only enables faster
> >>>>>> iterations but also gives us enough time to resolve big bugs, like
> >>>>>> considering them in detail and applying a time-consuming
> >>>>>> refactoring. Working around a critical bug in tricky ways usually
> >>>>>> introduces more issues.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Jinpeng
> >>>>>>
> >>>>>> On Tue, Apr 21, 2020 at 2:04 AM Xiening Dai <[email protected]>
> >>>> wrote:
> >>>>>>
> >>>>>>> Hi Jinpeng,
> >>>>>>>
> >>>>>>> Regarding this comment - I believe there are ways to calculate the
> >>>>>>> logical cost, but I think it's not as simple as "cardinality *
> >>>>>>> unit_copy_cost". Would you provide more details of the other ways?
> >>>>>>> Just an algorithm description or pseudocode would help us
> >>>>>>> understand. Thanks.
> >>>>>>>
> >>>>>>> Regarding the approach of creating a new planner, I don't think a
> >>>>>>> new planner would lower the risk. We don't know what we don't
> >>>>>>> know. If we introduced an issue while modifying the planner, most
> >>>>>>> likely we would do the same with a new planner class. A new
> >>>>>>> planner doesn't necessarily prevent the issue from happening, but
> >>>>>>> just delays its surfacing, which is worse IMO.
> >>>>>>>
> >>>>>>> One obvious benefit of a new planner is that we can provide some
> >>>>>>> sort of isolation so the change won't cause test baseline updates,
> >>>>>>> which can be painful at times. We should see if we could use
> >>>>>>> planner config to achieve the same if we decide to just modify the
> >>>>>>> current planner.
> >>>>>>>
> >>>>>>>
> >>>>>>>> On Apr 20, 2020, at 8:32 AM, Haisheng Yuan <[email protected]>
> >>>> wrote:
> >>>>>>>>
> >>>>>>>> Igor,
> >>>>>>>>
> >>>>>>>> That's great.
> >>>>>>>>
> >>>>>>>> On 2020/04/20 11:17:49, Seliverstov Igor <[email protected]> wrote:
> >>>>>>>>> Haisheng, Xiening,
> >>>>>>>>>
> >>>>>>>>> Ok, Now I see how it should work.
> >>>>>>>>>
> >>>>>>>>> Thanks for your replies.
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>> Igor
> >>>>>>>>>
> >>>>>>>>>> On 20 Apr 2020, at 09:56, Seliverstov Igor <[email protected]> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Haisheng, Xiening,
> >>>>>>>>>>
> >>>>>>>>>> Thanks for clarifying.
> >>>>>>>>>>
> >>>>>>>>>> "In this proposal, we are not trying to split logical and
> >>>>>>>>>> physical planning entirely." - actually, I was doubtful about
> >>>>>>>>>> the idea of entirely splitting the logical and physical phases;
> >>>>>>>>>> if you aren't going to, I have no objections.
> >>>>>>>>>>
> >>>>>>>>>> But this returns me to my first question: how will we propagate
> >>>>>>>>>> traits in a bottom-up manner using the proposed approach? (See
> >>>>>>>>>> "[DISCUSS] Proposal to add API to force rules matching specific
> >>>>>>>>>> rels" for details)
> >>>>>>>>>>
> >>>>>>>>>> One of the inconveniences of the current VolcanoPlanner
> >>>>>>>>>> implementation is the number of tricks we need to get the
> >>>>>>>>>> desired behaviour. It would be great if some of those issues
> >>>>>>>>>> (or all of them) were solved in the new approach.
> >>>>>>>>>>
> >>>>>>>>>> Regards,
> >>>>>>>>>> Igor
> >>>>>>>>>>
> >>>>>>>>>> On Mon, 20 Apr 2020 at 7:02, Xiening Dai <[email protected]> wrote:
> >>>>>>>>>> Hi Igor,
> >>>>>>>>>>
> >>>>>>>>>> Your comment - "because actual cost may be calculated correctly
> >>>>>>>>>> using physical operators only. So we won't be able to implement
> >>>>>>>>>> Branch and Bound Space Pruning." - is actually not true. In the
> >>>>>>>>>> Cascades lower-bound / upper-bound pruning algorithm, you can
> >>>>>>>>>> get the cost lower bound of an input RelNode using cardinality
> >>>>>>>>>> * unit_copy_cost. The unit_copy_cost is a constant, which
> >>>>>>>>>> stands for the minimal cost of processing a tuple from the
> >>>>>>>>>> input. So this will be the minimal cost the input RelNode can
> >>>>>>>>>> achieve, and if this is indeed larger than the current best
> >>>>>>>>>> cost, the input node can be pruned.
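That pruning test is small enough to sketch. Everything below (the names, the constant's value) is illustrative only, not Calcite code:

```java
// Minimal sketch of lower-bound pruning: the cheapest any input RelNode
// can possibly be is cardinality * unit_copy_cost, so if even that
// floor exceeds the best complete plan found so far, the input can be
// pruned without ever being implemented.
class LowerBoundPruning {
    static final double UNIT_COPY_COST = 1.0; // assumed constant per tuple

    /** Minimal conceivable cost of an input producing `cardinality` rows. */
    static double costLowerBound(double cardinality) {
        return cardinality * UNIT_COPY_COST;
    }

    /** True if the input cannot possibly beat the current best plan. */
    static boolean canPrune(double cardinality, double currentBestCost) {
        return costLowerBound(cardinality) > currentBestCost;
    }
}
```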
> >>>>>>>>>>
> >>>>>>>>>> In this proposal, we are not trying to split logical and
> >>>>>>>>>> physical planning entirely. But for any given equivalence set,
> >>>>>>>>>> we would need to finish exploration before implementation. For
> >>>>>>>>>> the entire memo tree, each set could be in a different planning
> >>>>>>>>>> stage, and an equivalence set can be pruned even before it's
> >>>>>>>>>> implemented. A textbook example of the aforementioned "lower
> >>>>>>>>>> bound / upper bound pruning" would be the join ordering case.
> >>>>>>>>>>
> >>>>>>>>>> Regarding #3, I think we can still achieve that (partially)
> >>>>>>>>>> through this proposal. Remember that every time we start the
> >>>>>>>>>> optimization, we pass down an upper bound limit. Initially this
> >>>>>>>>>> upper bound for the root is infinite - meaning that no feasible
> >>>>>>>>>> plan is available. Every time we find a physical plan we update
> >>>>>>>>>> this upper bound, then start the search again. We could stop
> >>>>>>>>>> the search when the cost is less than a pre-defined threshold -
> >>>>>>>>>> which gives you a "good enough" plan with early termination.
> >>>>>>>>>> Still, this wouldn't avoid the logical exploration. For that,
> >>>>>>>>>> you would probably achieve it through rule configurations, and
> >>>>>>>>>> avoid some expensive transformations to keep the cost down.
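The upper-bound loop with early termination can be sketched as follows; this is an illustration only (the method name and the flat cost array are assumptions, real search interleaves with exploration):

```java
// Sketch of the described search: start with an infinite upper bound,
// tighten it whenever a cheaper complete plan is found, and stop early
// once the bound drops to a "good enough" threshold.
class EarlyTermination {
    static double search(double[] completePlanCosts, double goodEnough) {
        double upperBound = Double.POSITIVE_INFINITY; // no feasible plan yet
        for (double cost : completePlanCosts) {
            if (cost >= upperBound) {
                continue; // pruned: cannot improve on the current bound
            }
            upperBound = cost; // found a cheaper complete plan
            if (upperBound <= goodEnough) {
                break; // early termination with a good enough plan
            }
        }
        return upperBound;
    }
}
```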
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> On Apr 19, 2020, at 7:30 PM, Haisheng Yuan <[email protected]> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Igor,
> >>>>>>>>>>>
> >>>>>>>>>>> a) Given Calcite's current stats derivation strategy, mixing
> >>>>>>>>>>> logical and physical planning won't make it better. I hope you
> >>>>>>>>>>> went through my email to the end: currently, operators inside
> >>>>>>>>>>> a memo group don't share stats info, each operator's stats may
> >>>>>>>>>>> differ from the others', and the RelSubset's best node and
> >>>>>>>>>>> stats may vary during optimization. So logical and physical
> >>>>>>>>>>> rule pruning are not safe at the moment; doing it would almost
> >>>>>>>>>>> amount to silently changing downstream projects' cost models.
> >>>>>>>>>>>
> >>>>>>>>>>> On the other hand, ensuring that child group exploration tasks
> >>>>>>>>>>> finish first will make rule mutual-exclusivity checks
> >>>>>>>>>>> possible: e.g., the result of ReduceExpressionRule won't need
> >>>>>>>>>>> to trigger the same rule again, and the join result of
> >>>>>>>>>>> JoinCommuteRule won't need to trigger JoinCommuteRule and
> >>>>>>>>>>> ReduceExpressionRule again.
> >>>>>>>>>>>
> >>>>>>>>>>> More importantly, most if not all of the long-planning queries
> >>>>>>>>>>> in Calcite are not caused by too many alternatives, but by
> >>>>>>>>>>> mutual or cyclic rule triggering, which can be avoided by
> >>>>>>>>>>> customizing the rules instead of using the default ones.
> >>>>>>>>>>> Unless you use dynamic programming (Calcite uses heuristics)
> >>>>>>>>>>> on join reordering (left-deep, right-deep, bushy), space
> >>>>>>>>>>> pruning won't make a long / infinitely running query faster.
> >>>>>>>>>>>
> >>>>>>>>>>> b) No evidence shows that the current version of Calcite will
> >>>>>>>>>>> return the most promising plan in the first planning
> >>>>>>>>>>> iteration. Instead of praying for a good enough plan in the
> >>>>>>>>>>> first iteration, why not focus on fixing the rules that cause
> >>>>>>>>>>> the issue?
> >>>>>>>>>>>
> >>>>>>>>>>> c) That is not the goal.
> >>>>>>>>>>>
> >>>>>>>>>>> On 2020/04/19 15:14:57, Seliverstov Igor <[email protected]> wrote:
> >>>>>>>>>>>> Haisheng,
> >>>>>>>>>>>>
> >>>>>>>>>>>> From my point of view, splitting the logical and physical
> >>>>>>>>>>>> planning parts isn't a good idea.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I think so because the actual cost may be calculated
> >>>>>>>>>>>> correctly using physical operators only, so we:
> >>>>>>>>>>>> a) won't be able to implement Branch and Bound Space Pruning
> >>>>>>>>>>>> (as far as I understand, at exploration time there are no
> >>>>>>>>>>>> physical operators and no correct costs, only assumptions,
> >>>>>>>>>>>> and I don't think we should rely on them)
> >>>>>>>>>>>> b) won't be able to get a good enough plan (without Branch
> >>>>>>>>>>>> and Bound Space Pruning it's a non-trivial task to get the
> >>>>>>>>>>>> right search direction and the most promising plans in the
> >>>>>>>>>>>> first planning iterations)
> >>>>>>>>>>>> c) won't be able to tune the good enough plan while several
> >>>>>>>>>>>> similar queries are executed
> >>>>>>>>>>>>
> >>>>>>>>>>>> Regards,
> >>>>>>>>>>>> Igor
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>> On 19 Apr 2020, at 17:37, Haisheng Yuan <[email protected]> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Hi Igor,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> You can't have your cake and eat it.
> >>>>>>>>>>>>> But one feasible work item we can definitely do: once we
> >>>>>>>>>>>>> time out, stop exploring, use the first available physical
> >>>>>>>>>>>>> operator in each group, and optimize it.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Because most, if not all, of the long / infinitely running
> >>>>>>>>>>>>> optimizations are caused by project-related transposes, join
> >>>>>>>>>>>>> reordering (join commute + associative), and constant
> >>>>>>>>>>>>> reduction for large expressions (see CALCITE-3460), all of
> >>>>>>>>>>>>> which are logical transformation rules and many of which
> >>>>>>>>>>>>> have corresponding JIRAs. So another alternative is: let's
> >>>>>>>>>>>>> fix these bugs to improve Calcite and make the timeout
> >>>>>>>>>>>>> option less needed.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Another thing worth noting is that the SQL Server
> >>>>>>>>>>>>> optimizer's timeout mechanism is not based on clock time,
> >>>>>>>>>>>>> but on the number of optimization tasks it has done [1].
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> [1]
> >>>>>>>>>>>>> https://techcommunity.microsoft.com/t5/sql-server-support/understanding-optimizer-timeout-and-how-complex-queries-can-be/ba-p/319188
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>> Haisheng
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On 2020/04/19 11:31:27, Seliverstov Igor <[email protected]> wrote:
> >>>>>>>>>>>>>> Haisheng,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Ok, then such a notification isn't needed.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> But in this case we don't have any control over how long
> >>>>>>>>>>>>>> planning takes.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> For some systems it's necessary to get a good enough plan
> >>>>>>>>>>>>>> right now instead of the best one after a while.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> For example, we've been considering a case where a query is
> >>>>>>>>>>>>>> optimized several times in short iterations when it's
> >>>>>>>>>>>>>> impossible to get the best plan in a reasonable period of
> >>>>>>>>>>>>>> time (for example, there is an SLA for response time).
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> This means we need all the needed physical implementations
> >>>>>>>>>>>>>> after each logical transformation is applied.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>> Igor
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Sun, 19 Apr 2020 at 13:55, Haisheng Yuan <[email protected]> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hi Igor,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> There will be no rule queue anymore. Y will be fully
> >>>> explored
> >>>>>>> (logical
> >>>>>>>>>>>>>>> rules are matched and applied) before it can be
> >> implemented
> >>>> and
> >>>>>>> optimized.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>> Haisheng
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On 2020/04/19 10:11:45, Seliverstov Igor <[email protected]> wrote:
> >>>>>>>>>>>>>>>> Hi Haisheng,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Great explanation, thanks.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> One thing I'd like to cover in advance is the trait
> >>>>>>>>>>>>>>>> propagation process (I need it for the Apache Ignite SQL
> >>>>>>>>>>>>>>>> engine implementation).
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> For example:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> There are two nodes: Rel X and its child node Rel Y.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Both nodes are in the Optimized state, and a logical rule
> >>>>>>>>>>>>>>>> for Rel Y is matched in the rule queue.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> After the logical rule is applied, there is a new logical
> >>>>>>>>>>>>>>>> rel Rel Y', and the set containing it moves into the
> >>>>>>>>>>>>>>>> Exploring state (since there is a logical node which has
> >>>>>>>>>>>>>>>> to be implemented).
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> After the whole Exploring -> ... -> Optimized process, we
> >>>>>>>>>>>>>>>> need to force the parent Rel X set to switch its state
> >>>>>>>>>>>>>>>> from Optimized to Optimizing to peek at the newly
> >>>>>>>>>>>>>>>> implemented child and (possibly) produce a new physical
> >>>>>>>>>>>>>>>> rel Rel X' on the basis of the traits previously
> >>>>>>>>>>>>>>>> requested from the Rel X set.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> If a new Rel X' is produced, the parent of Rel X' should
> >>>>>>>>>>>>>>>> move its state Optimized -> Optimizing and repeat the
> >>>>>>>>>>>>>>>> operations described above.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Does that sound right?
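Those set states can be sketched as a tiny state machine. The state names follow the discussion, but the transition logic here is an assumption for illustration, not actual planner code:

```java
// Illustrative sketch of the per-set planning states and the two
// re-opening transitions described in the thread.
class RelSetStates {
    enum State { EXPLORING, OPTIMIZING, OPTIMIZED }

    State state = State.OPTIMIZED;

    // A logical rule produced a new logical rel (Rel Y') in this set:
    // re-enter Exploring so the new rel gets implemented.
    void onNewLogicalRel() {
        state = State.EXPLORING;
    }

    // A child set produced a newly implemented physical rel: a parent
    // set that is already Optimized must re-open as Optimizing to peek
    // at the new child and possibly produce a new physical rel.
    void onNewPhysicalChild() {
        if (state == State.OPTIMIZED) {
            state = State.OPTIMIZING;
        }
    }
}
```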
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>>>> Igor
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Sun, 19 Apr 2020 at 6:52, Haisheng Yuan <[email protected]> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> In the past few months, we have discussed a lot about
> >>>> Cascades
> >>>>>>> style
> >>>>>>>>>>>>>>>>> top-down optimization, including on-demand trait
> >>>>>>> derivation/request,
> >>>>>>>>>>>>>>> rule
> >>>>>>>>>>>>>>>>> apply, branch and bound space pruning. Now we think
> >> it is
> >>>> time
> >>>>>>> to move
> >>>>>>>>>>>>>>>>> towards these targets.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> We will separate it into several small issues, each of
> >>>>>>>>>>>>>>>>> which can be integrated as a standalone, independent
> >>>>>>>>>>>>>>>>> feature while, most importantly, keeping backward
> >>>>>>>>>>>>>>>>> compatibility.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> 1. Top-down trait request
> >>>>>>>>>>>>>>>>> In other words, pass trait requirements from parent
> >>>>>>>>>>>>>>>>> nodes to child nodes. The trait requests happen after
> >>>>>>>>>>>>>>>>> all the logical transformation rules and physical
> >>>>>>>>>>>>>>>>> implementation rules are done, in a top-down manner,
> >>>>>>>>>>>>>>>>> driven from the root set. e.g.:
> >>>>>>>>>>>>>>>>> SELECT a, sum(c) FROM
> >>>>>>>>>>>>>>>>>   (SELECT * FROM R JOIN S USING (a, b)) t
> >>>>>>>>>>>>>>>>> GROUP BY a;
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Suppose we have the following plan tree in the MEMO, and
> >>>>>>>>>>>>>>>>> let's consider only distribution for simplicity; each
> >>>>>>>>>>>>>>>>> group represents a RelSet in the MEMO.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Group 1:     Agg on [a]
> >>>>>>>>>>>>>>>>> Group 2:           +-- MergeJoin on [a, b]
> >>>>>>>>>>>>>>>>> Group 3:                       |--- TableScan R
> >>>>>>>>>>>>>>>>> Group 4:                       +--- TableScan S
> >>>>>>>>>>>>>>>>> | Group No | Operator    | Derived Traits | Required Traits |
> >>>>>>>>>>>>>>>>> | -------- | ----------- | -------------- | --------------- |
> >>>>>>>>>>>>>>>>> | Group 1  | Aggregate   | Hash[a]        | N/A             |
> >>>>>>>>>>>>>>>>> | Group 2  | MergeJoin   | Hash[a,b]      | Hash[a]         |
> >>>>>>>>>>>>>>>>> | Group 3  | TableScan R | None           | Hash[a,b]       |
> >>>>>>>>>>>>>>>>> | Group 4  | TableScan S | None           | Hash[a,b]       |
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> We will add a new interface PhysicalNode (or extend
> >>>>>>>>>>>>>>>>> RelNode) with methods:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> - Pair<RelTraitSet, List<RelTraitSet>> requireTraits(RelTraitSet required);
> >>>>>>>>>>>>>>>>> pair.left is the current operator's new traitset;
> >>>>>>>>>>>>>>>>> pair.right is the list of the children's required
> >>>>>>>>>>>>>>>>> traitsets.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> - RelNode passThrough(RelTraitSet required);
> >>>>>>>>>>>>>>>>> The default implementation will call the requireTraits()
> >>>>>>>>>>>>>>>>> method above and RelNode.copy() to create a new RelNode.
> >>>>>>>>>>>>>>>>> It can be overridden by physical operators to customize
> >>>>>>>>>>>>>>>>> the logic.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> The planner will call the above method on the MergeJoin
> >>>>>>>>>>>>>>>>> operator to pass the required traits (Hash[a]) down to
> >>>>>>>>>>>>>>>>> MergeJoin's child operators.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> We will get a new MergeJoin:
> >>>>>>>>>>>>>>>>> MergeJoin hash[a]
> >>>>>>>>>>>>>>>>>        |---- TableScan R hash[a] (RelSubset)
> >>>>>>>>>>>>>>>>>        +---- TableScan S hash[a] (RelSubset)
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Now the MEMO group looks like:
> >>>>>>>>>>>>>>>>> | Group No | Operator    | Derived Traits     | Required Traits    |
> >>>>>>>>>>>>>>>>> | -------- | ----------- | ------------------ | ------------------ |
> >>>>>>>>>>>>>>>>> | Group 1  | Aggregate   | Hash[a]            | N/A                |
> >>>>>>>>>>>>>>>>> | Group 2  | MergeJoin   | Hash[a,b], Hash[a] | Hash[a]            |
> >>>>>>>>>>>>>>>>> | Group 3  | TableScan R | None               | Hash[a,b], Hash[a] |
> >>>>>>>>>>>>>>>>> | Group 4  | TableScan S | None               | Hash[a,b], Hash[a] |
> >>>>>>>>>>>>>>>>>
> > A Calcite user may choose to ignore / not implement the interface to
> > keep the original behavior. Each physical operator, according to its
> > own logic, decides whether passThrough() should pass traits down or
> > not by returning:
> > - a non-null RelNode, meaning the traits can be passed down;
> > - null, meaning they can't.
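As a rough illustration of the passThrough() contract described above, here is a minimal sketch using simplified stand-in types (TraitSet and MergeJoinSketch are illustrative, not Calcite's real classes): the operator passes a hash requirement down only when the required keys are covered by its join keys, and returns null otherwise.

```java
import java.util.List;

// Minimal stand-in for Calcite's RelTraitSet, for illustration only:
// we model just a hash distribution on a list of keys.
class TraitSet {
    final List<String> hashKeys;          // e.g. ["a"] for Hash[a]
    TraitSet(List<String> hashKeys) { this.hashKeys = hashKeys; }
}

// Hypothetical sketch of the proposed contract: return a copy of the
// operator carrying the required traits (non-null), or null when the
// traits cannot be passed down.
interface PhysicalNodeSketch {
    PhysicalNodeSketch passThrough(TraitSet required);
}

class MergeJoinSketch implements PhysicalNodeSketch {
    final List<String> joinKeys;
    final TraitSet traits;

    MergeJoinSketch(List<String> joinKeys, TraitSet traits) {
        this.joinKeys = joinKeys;
        this.traits = traits;
    }

    @Override
    public PhysicalNodeSketch passThrough(TraitSet required) {
        // A merge join can only satisfy a hash distribution on keys it
        // joins on; otherwise it refuses to pass the requirement down.
        if (joinKeys.containsAll(required.hashKeys)) {
            return new MergeJoinSketch(joinKeys, required);
        }
        return null;  // cannot pass down
    }
}
```

In this toy model, a MergeJoin on [a, b] accepts a Hash[a] request (yielding the new MergeJoin hash[a] shown above) but rejects Hash[c].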
> >
> > 2. Provide an option to disable AbstractConverter
> > Once the plan can request traits top-down in the framework, many
> > systems won't need AbstractConverter anymore, since it is just an
> > intermediate operator used to generate a physical sort / exchange. For
> > those systems, we can provide an option to disable AbstractConverter
> > and generate the physical enforcer directly, by adding a method to the
> > Convention interface:
> > - RelNode enforce(RelNode node, RelTraitSet traits);
> >
> > The default implementation may just call changeTraitsUsingConverters(),
> > but people may choose to override it if the system has special needs,
> > e.g. several traits must be implemented together, or the position of
> > collation in the RelTraitSet comes before distribution.
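A minimal sketch of what such an enforce() hook might look like, using hypothetical stand-in classes rather than Calcite's real Convention / RelNode API: when the delivered distribution does not satisfy the requirement, a physical exchange is produced directly instead of an AbstractConverter.

```java
import java.util.List;

// Stand-ins for illustration; not Calcite's real classes.
interface Rel {
    List<String> distribution();
}

class Scan implements Rel {
    private final List<String> dist;
    Scan(List<String> dist) { this.dist = dist; }
    public List<String> distribution() { return dist; }
}

// A hypothetical physical enforcer that redistributes its input.
class HashExchange implements Rel {
    final Rel input;
    private final List<String> dist;
    HashExchange(Rel input, List<String> dist) {
        this.input = input;
        this.dist = dist;
    }
    public List<String> distribution() { return dist; }
}

// Sketch of the proposed enforce hook: wrap the node in a physical
// enforcer only when its delivered distribution misses the requirement.
class ConventionSketch {
    static Rel enforce(Rel node, List<String> requiredDistribution) {
        if (node.distribution().equals(requiredDistribution)) {
            return node;  // requirement already satisfied, no enforcer needed
        }
        return new HashExchange(node, requiredDistribution);
    }
}
```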
> >
> > 3. Top-down driven, on-demand rule apply
> > For every RelNode in a RelSet, rules are matched and applied
> > sequentially; newly generated RelNodes are added to the end of the
> > RelNode list in the RelSet, waiting for rule apply. RuleQueue and
> > DeferringRuleCall are not needed anymore. This will make space pruning
> > and rule mutual exclusivity checks possible.
> >
> > There are 3 stages for each RelSet:
> > a) Exploration: logical transformation, yields logical nodes
> > b) Implementation: physical transformation, yields physical nodes
> > c) Optimization: trait request, enforcement
> >
> > The general process looks like:
> > - optimize RelSet X:
> >     implement RelSet X
> >     for each physical relnode in RelSet X:
> >         // pass down trait requests to child RelSets
> >         for each child RelSet Y of current relnode:
> >             optimize RelSet Y
> >
> > - implement RelSet X:
> >     if X has been implemented:
> >         return
> >     explore RelSet X
> >     for each relnode in RelSet X:
> >         apply physical rules
> >
> > - explore RelSet X:
> >     if X has been explored:
> >         return
> >     for each relnode in RelSet X:
> >         // ensure each child RelSet of current relnode is explored
> >         for each child RelSet Y of current relnode:
> >             explore RelSet Y
> >         apply logical rules on current relnode
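The three-stage process above can be sketched as a toy Java model (class and method names are illustrative only, not Calcite code): each RelSet memoizes whether a stage has run, so exploration, implementation, and optimization each happen at most once per set, and children are always explored before logical rules fire on the parent.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the top-down driven, three-stage search described above.
// RelSets form a DAG; each set remembers which stages have completed.
class RelSetSketch {
    final String name;
    final List<RelSetSketch> children = new ArrayList<>();
    boolean explored, implemented, optimized;

    static final List<String> trace = new ArrayList<>();

    RelSetSketch(String name) { this.name = name; }

    void optimize() {
        if (optimized) return;
        optimized = true;
        implement();
        // pass down trait requests to child RelSets
        for (RelSetSketch child : children) child.optimize();
        trace.add("optimize " + name);
    }

    void implement() {
        if (implemented) return;
        implemented = true;
        explore();
        trace.add("implement " + name);   // apply physical rules here
    }

    void explore() {
        if (explored) return;
        explored = true;
        // ensure each child RelSet is explored before the parent's
        // logical rules are applied
        for (RelSetSketch child : children) child.explore();
        trace.add("explore " + name);     // apply logical rules here
    }
}
```

Running optimize() on a join set over a scan set yields the trace [explore scan, explore join, implement join, implement scan, optimize scan, optimize join], showing the guard flags prevent any stage from repeating.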
> >
> > Basically it is a state machine with several states: Initialized,
> > Explored, Exploring, Implemented, Implementing, Optimized, Optimizing,
> > and several transition methods: exploreRelSet, exploreRelNode,
> > implementRelSet, implementRelNode, optimizeRelSet, optimizeRelNode...
> >
> > To achieve this, we need to mark each rule as either a logical rule or
> > a physical rule. To keep backward compatibility, all unmarked rules
> > will be treated as logical rules, except rules that use
> > AbstractConverter as a rule operand; these rules still need to be
> > applied top-down, or in random order.
> >
> > 4. On-demand, bottom-up trait derivation
> > It is called bottom-up, but it is actually driven top-down and happens
> > at the same time as the top-down trait request, in the optimization
> > stage mentioned above. Many Calcite-based big data systems only
> > propagate traits on Project and Filter by writing rules, which is very
> > limited. In fact, we can extend trait propagation/derivation to all
> > operators, without rules, by adding to the PhysicalNode interface (or
> > RelNode) the method:
> > - RelNode derive(RelTraitSet traits, int childId);
> >
> > Given the following plan (only considering distribution for simplicity):
> > Agg [a,b]
> >   +-- MergeJoin [a]
> >          |---- TableScan R
> >          +--- TableScan S
> >
> > Hash[a] won't satisfy Hash[a,b] without special treatment, because
> > there isn't a mechanism to coordinate traits between children.
> >
> > Now we call the derive method on the Agg [a,b] node, derive(Hash[a], 0),
> > and we get the new node:
> > Agg [a]
> >   +-- MergeJoin [a] (RelSubset)
> >
> > We will provide different matching types, so each operator can specify
> > what kind of matching it requires of its children:
> > - MatchType getMatchType(RelTrait trait, int childId);
> >
> > a) Exact: Hash[a,b] exactly matches Hash[a,b], aka satisfies
> > b) Partial: Hash[a] partially matches Hash[a,b]
> > c) Permuted: Sort[a,b,c] permuted-matches Sort[c,b,a]
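The three matching types could be checked roughly as follows, treating distribution or collation keys as plain lists of column names (a simplified sketch for illustration, not Calcite code):

```java
import java.util.HashSet;
import java.util.List;

// Toy checks for the three proposed matching types between a child's
// delivered trait and the parent's requirement.
class MatchSketch {
    // Exact: delivered keys are identical to the required keys.
    static boolean exact(List<String> delivered, List<String> required) {
        return delivered.equals(required);
    }

    // Partial: delivered keys are a non-empty subset of the required
    // keys, e.g. Hash[a] partially matches Hash[a,b].
    static boolean partial(List<String> delivered, List<String> required) {
        return !delivered.isEmpty() && required.containsAll(delivered);
    }

    // Permuted: same keys in any order, e.g. Sort[a,b,c] vs Sort[c,b,a].
    static boolean permuted(List<String> delivered, List<String> required) {
        return delivered.size() == required.size()
            && new HashSet<>(delivered).equals(new HashSet<>(required));
    }
}
```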
> >
> > In addition, an optimization order is provided for each operator to
> > specify:
> > a) left to right
> > b) right to left
> > c) both
> >
> > For example, for the query SELECT * FROM R JOIN S ON R.a=S.a AND
> > R.b=S.b AND R.c=S.c, suppose R is distributed by a,b and S is
> > distributed by c:
> > MergeJoin [a,b,c]
> >   |--- TableScan R [a,b]
> >   +-- TableScan S [c]
> > a) left to right: call derive(Hash[a,b], 0), and we get MergeJoin [a,b]
> > b) right to left: call derive(Hash[c], 1), and we get MergeJoin [c],
> >    most likely a bad plan
> > c) both: we get both of the above plans
> >
> > A system that doesn't have fine-tuned stats and a cost model may not
> > be able to make the right decision based purely on cost. Probably we
> > need to provide the MergeJoin with both children's derived traitset
> > lists:
> > - List<RelNode> derive(List<List<RelTraitSet>>);
> >
> > Of course, all the above methods are optional to implement for those
> > who don't need this feature.
> >
> > 5. Branch and Bound Space Pruning
> > After we implement on-demand, top-down trait enforcement and
> > rule apply, we can pass the cost limit down at the time of passing down
> > required traits, as described in the classical Cascades paper. Right
> > now, Calcite doesn't provide group-level logical properties, including
> > stats info; each operator in the same group has its own logical
> > properties and the stats may vary, so we can only do limited space
> > pruning for trait enforcement, which is still good. But if we agree to
> > add an option to share group-level stats between relnodes in a RelSet,
> > we will be able to do more aggressive space pruning, which will help
> > boost the performance of join-reorder planning.
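The branch-and-bound idea can be illustrated with a toy cost-limit loop (names and numbers are purely illustrative): candidates whose cost reaches the limit inherited from the parent are pruned without further exploration, and the limit tightens as better plans are found.

```java
// Toy sketch of branch-and-bound pruning with a cost limit passed down
// alongside required traits.
class PruningSketch {
    // Returns the best cost found, or Double.POSITIVE_INFINITY if every
    // candidate exceeded the limit and was pruned.
    static double bestCost(double[] candidateCosts, double costLimit) {
        double best = Double.POSITIVE_INFINITY;
        for (double cost : candidateCosts) {
            if (cost >= costLimit) {
                continue;           // prune: cannot beat the current bound
            }
            best = Math.min(best, cost);
            costLimit = best;       // tighten the limit for later candidates
        }
        return best;
    }
}
```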
> >
> > With all that being said, how do we move forward?
> >
> > There are 2 ways:
> > a) Modify the current VolcanoPlanner.
> >    Pros: code reuse, numerous existing test cases and infrastructure,
> >    fast integration
> >    Cons: changing code always brings risk
> >
> > b) Add a new XXXXPlanner
> >    Pros: no risk, no tech debt, no need to worry about backward
> >    compatibility
> >    Cons: separate test cases for the new planner, one more planner to
> >    maintain
> >
> > We'd like to hear the community's thoughts and advice.
> >
> > Thanks.
> > - Haisheng