Re: [DISCUSS] Towards Cascades Optimizer

Jinpeng Wu Mon, 11 May 2020 01:13:18 -0700

Hi, Roman.

Thanks to Julian's tips, I added a CoreCascadeQuidemTest to the code. It
runs iq files that CoreQuidemTest would runs and uses CascadePlanner
instead of VolcanoPlanner to generate physical plans. Currently all tests
have passed. There are some plan changes but they are actually equivalent
plans in different forms. I‘ve also committed those plan changes as '.cas'
files.
Moreover, if you would like to set -Dcalcite.cascade=true during tests, all
test cases will use the new planner for planning. There would be some
failures, typically because there's no common ways across all tests to tell
whether a RelNode is physical or logical.



When I was running the tests, I found that Haisheng's latest changes has a
conflict with my PR. The new planner relies on AbstractConverter to tell
whether a subset canConvert/needConvert to another subset. Haisheng's
change will remove AbstractConverter and break this. Currently, I switch
off top-down trait propagation in new planner to work around. I will merge
these two top-down processing together later on to completely address this
problem.
One thing to recall is that, while I am writing a new planner, Haisheng
modified the VolcanoPlanner directly and used a flag to switch on/off the
new feature. I need to decide how to merge them: to keep a separated
planner and invoke the new trait propagation interfaces or to modified the
volcano planner directly. I am trying the first way now. But of course, if
keeping one planner is preferred, I can also take the second way.


Thanks,
Jinpeng


On Fri, May 1, 2020 at 6:40 AM Haisheng Yuan <hy...@apache.org> wrote:

> Hi all,
>
> As planned in my proposal, I opened the pull request [1] for CALCITE-3896
> to achieve:
> 1. Top-down trait request
> 2. Bottom-up trait derivation
> 3. Trait enforcement without AbstractConverter
>
> The feature can be turned on or off by a flag, either through property
> config file or VolcanoPlanner set method. Since Calcite doesn't turn on
> AbstractConverter until latest master, I just disabled AbstractConverter by
> turning on top-down trait request, now all tests passed.
>
> In our system, 99 tpcds queries' test results show almost no plan diff,
> but the number of relnodes created during optimization is reduced by 10~15%
> average (even without space pruning). I believe for other systems using
> VolcanoPlanner, more than 20% reduction can be expected.
>
> It also has top-down rule apply in mind, later can be evolved to top-down
> rule apply and space pruning, e.g. integrating code from Jingpeng and
> Roman's. But the interface that is exposed to user, as described in the
> proposal, can remain the same.
>
> Haisheng
>
> [1] https://github.com/apache/calcite/pull/1953
>
>
> On 2020/04/30 18:01:26, Julian Hyde <jh...@apache.org> wrote:
> > If your test cases are SQL scripts, it might be fairly straightforward
> to port them to Quidem (.iq) files. Plenty of examples in
> https://github.com/apache/calcite/tree/master/core/src/test/resources/sql
> <https://github.com/apache/calcite/tree/master/core/src/test/resources/sql
> >.
> >
> > Quidem files are basically SQL scripts. Expected output is embedded in
> the script. You can run the script once, and if the output looks right,
> overwrite the input file with the output.
> >
> > Julian
> >
> >
> > > On Apr 30, 2020, at 3:26 AM, Jinpeng Wu <wjpabc...@gmail.com> wrote:
> > >
> > > Sure. I will add more cases to my PR.
> > >
> > > I did not design more cases because our own product has a test
> frameworks,
> > > which contains thousands of actual user queries.
> > > Calcite's code base is quite different. I cannot just migrate cases to
> > > calcite.  So it may take some time.
> > >
> > > On Wed, Apr 29, 2020 at 4:27 AM Roman Kondakov
> <kondako...@mail.ru.invalid>
> > > wrote:
> > >
> > >> Hi Jinpeng,
> > >>
> > >> I went through your PR and it seemed very impressive to me. It is very
> > >> similar to what I did, but you've reused many existing logic from the
> > >> Volcano planner. We should definitely stay in sync in our
> experiments. I
> > >> believe the future Cascades planner will be the kind combination of
> our
> > >> works.
> > >>
> > >> Is there any way to run tests that are close to the real system query
> > >> execution? May be with Enumerable convention, or, better, with
> > >> convention that supports distribution trait? I just want to look
> through
> > >> your planner's optimization steps more thoroughly. I've found some
> tests
> > >> in org.apache.calcite.plan.volcano package, but they use synthetic
> > >> conventions and nodes. May be I missed something.
> > >>
> > >> Thank you for sharing your work!
> > >>
> > >> --
> > >> Kind Regards
> > >> Roman Kondakov
> > >>
> > >>
> > >> On 28.04.2020 15:19, Jinpeng Wu wrote:
> > >>> Hi, Roman. It's great to see your proposal. Actually my team has also
> > >> been
> > >>> working on a cascade planner based on calcite.  And we already have
> some
> > >>> outcome as well.  Maybe we can combine our works.
> > >>>
> > >>> I've pushed my code as https://github.com/apache/calcite/pull/1950 .
> > >>>
> > >>> Our works have many places in common. We both developed a new
> > >>> CascadePlanner and avoid modifying the old VolcanoPlanner directly.
> We
> > >>> both implemented the top-down search strategy according to the
> > >>> Columnbia optimizer
> > >>> generator
> > >>> <
> > >>
> https://15721.courses.cs.cmu.edu/spring2019/papers/22-optimizer1/xu-columbia-thesis1998.pdf
> > >>> 。But
> > >>> we also have some differences.
> > >>>
> > >>> The first difference is that I try to reuse the existing
> VolcanoPlanner
> > >> as
> > >>> much as possible. My CascadePlanner inherits from the existing
> > >>> VolcanoPlanner. Except that it overwrites ruleQueue and findBestPlan
> > >> method
> > >>> to rearrange rule applies, most logic generally inherit from
> > >>> VolcanoPlanner. For example,
> > >>>  - It reuses the RelSet and RelSubset class and the register method
> > >>>  - Rules are fired as soon as a RelNode is registered (In the
> > >>> Columnbia optimizer generator, rules are not fired until exploring).
> The
> > >>> ApplyRule task controls when to invoke the onMatch method of a
> RuleMatch.
> > >>> This design have a benefit that we do not need to worry about
> missing a
> > >>> rule or firing a rule multiple times.
> > >>>  - It leverages AbstractConverter to pass traits requirements down.
> So
> > >>> currently AC is still essential in my code.
> > >>> This makes the new planner highly compatible with the old
> VolcanoPlanner.
> > >>> Features like MV and Hints can apply to it directly.  And I tried to
> > >> change
> > >>> VolcanoPlanner to the new CascadePlanner in tests. Most tests passed.
> > >>> Several cases did fail. I know the reason and how to fix them. But I
> am
> > >>> still thinking about making them as "won't fix" as the ruleset
> violates
> > >>> some basic principles of top-down trait requests.
> > >>>
> > >>> The second difference is that our design have the ability for space
> > >>> pruning. Currently it contains a simply LowerBoundCost metadata to
> > >> compute
> > >>> the lower bound of a RelNdoe. Because logical properties like
> cardinality
> > >>> of a RelSet is not stable across exploring, it is required that a
> group
> > >> to
> > >>> be fully explored (implementation rules and enforcement rules should
> > >> never
> > >>> modify the logical properties) before it can provide a valid lower
> bound
> > >>> cost. Because of that, logical search space pruning is not supported
> now.
> > >>> It can only pruned out implementation rules and enforcement rules.
> > >> Testing
> > >>> with cases in our own product, the new planner saves about 10% rule
> > >>> applies. I am still considering how to support logical space pruning,
> > >>> looking forwards to have more improvements.
> > >>>
> > >>> Hope my code will help.
> > >>>
> > >>> Thanks,
> > >>> Jinpeng
> > >>>
> > >>>
> > >>> On Tue, Apr 28, 2020 at 11:22 AM Xiening Dai <xndai....@gmail.com>
> > >> wrote:
> > >>>
> > >>>> For #1, aside from that we need to be able to build physical nodes
> based
> > >>>> on a convention. For example, if we merge two EnumerableProject, we
> > >> would
> > >>>> want to create an EnumerableProject as a result, instead of
> > >> LogicalProject.
> > >>>> The RelBuilder change I work on would help this case.
> > >>>>
> > >>>> For #2, I don’t think it’s just a bug. If the physical cost cannot
> be
> > >>>> reliable before transformation is finished, we should probably
> delay the
> > >>>> physical cost calculation, or we risk doing it over again. The other
> > >> way is
> > >>>> to complete RelSet transformation before implementing it - which is
> a
> > >>>> common practice in industry, including Orca.
> > >>>>
> > >>>> The multi-convention is a key scenario, and I agree we should
> support.
> > >> My
> > >>>> thinking is more about seperating logical one (Conventions.NONE)
> from
> > >>>> others.
> > >>>>
> > >>>>
> > >>>>> On Apr 27, 2020, at 4:59 PM, Julian Hyde <jh...@apache.org> wrote:
> > >>>>>
> > >>>>> Re 1. By all means have multiple instances of a rule (e.g. one
> instance
> > >>>> that matches LogicalFilter and another that matches FooFilter) and
> > >> enable
> > >>>> different instances during different phases. (We have been slow to
> > >> create
> > >>>> all of these variant instances, in part because of the difficulty of
> > >>>> changing the constructors of many existing rules. My proposed change
> > >>>> https://issues.apache.org/jira/browse/CALCITE-3923 <
> > >>>> https://issues.apache.org/jira/browse/CALCITE-3923> would make it
> > >> easier
> > >>>> to create new instances that are more precisely targeted.)
> > >>>>>
> > >>>>> Re 2. Sure, there are bugs.
> > >>>>>
> > >>>>> Re 3. That’s a good idea. The planner could have a
> "single-convention
> > >>>> mode", which might be faster, but would throw if it encountered a
> rel
> > >> in a
> > >>>> different convention.
> > >>>>>
> > >>>>> I think we’re in agreement - separating rules is a best practice.
> But
> > >> we
> > >>>> shouldn’t force that best practice on everyone. The multi-convention
> > >> case
> > >>>> is still crucial for planning hybrid queries (e.g. joining MySQL to
> > >>>> MongoDB).
> > >>>>>
> > >>>>> Julian
> > >>>>>
> > >>>>>
> > >>>>>> On Apr 27, 2020, at 4:28 PM, Xiening Dai <xndai....@gmail.com>
> wrote:
> > >>>>>>
> > >>>>>> Hi Julian,
> > >>>>>>
> > >>>>>> In my view, separating logic and physical rules have a number of
> > >>>> benefits -
> > >>>>>>
> > >>>>>> 1. With current design, a rule can match both physical and logical
> > >>>> nodes. This behavior could cause duplication of rule firings and
> > >> explosion
> > >>>> of memo and search space. There was a long discussion regarding this
> > >>>> (CALCITE-2223). Although the indefinitely rule matching problem is
> > >> fixed by
> > >>>> a separate change, the duplicate rule firing is not resolved.
> There's a
> > >>>> patch trying to address it (
> https://github.com/apache/calcite/pull/1543
> > >> <
> > >>>> https://github.com/apache/calcite/pull/1543>), but it still fell
> short
> > >>>> due to current design limitation.
> > >>>>>>
> > >>>>>> 2. We have a few meta inconsistency issues today which are due to
> the
> > >>>> reality that we don’t clearly define transformation phase. For
> example,
> > >> we
> > >>>> don’t have a solution for CALCITE-2166 as long as transformation can
> > >> still
> > >>>> apply to a RelSet after it’s been implemented, which means the group
> > >>>> logical properties (such as row count) can still change and
> invalidate
> > >> all
> > >>>> the previous best cost calculation, so best cost can increase
> (oops!).
> > >>>>>>
> > >>>>>> 3. The other benefit is the planner can choose shortcut a number
> of
> > >>>> expensive operations (such as RelSubSet registration, cost
> calculation
> > >> and
> > >>>> propagation, etc) during transformation phase, if it was clearly
> defined
> > >>>> and enforced by framework.
> > >>>>>>
> > >>>>>>
> > >>>>>>> On Apr 27, 2020, at 11:59 AM, Julian Hyde <jh...@apache.org>
> wrote:
> > >>>>>>>
> > >>>>>>> This thread has almost gotten too long to respond to. I confess
> I’ve
> > >>>> not read much of it. I’m going to reply anyway. Sorry.
> > >>>>>>>
> > >>>>>>> I support making Calcite’s optimizer support “Cascades”.
> > >>>>>>>
> > >>>>>>> We should keep the existing VolcanoPlanner working during the
> > >>>> transition, and perhaps longer. (I acknowledge that this will not be
> > >> easy.
> > >>>> But it will not be impossible, because we already have 2 planner
> > >> engines,
> > >>>> and going from 2 to 3 is much harder than going from 1 to 2.)
> > >>>>>>>
> > >>>>>>> I perhaps have a different philosophy than Haisheng + Cascades on
> > >>>> logical vs physical. I do believe that logical & physical rels and
> rules
> > >>>> are more similar than they are different, therefore they should
> have the
> > >>>> same classes, etc. Pragmatically, it makes a lot of sense to create
> rule
> > >>>> sets and planner phases that deal with just logical rels, or just
> > >> physical
> > >>>> rels. But that’s a best practice, not a rule.
> > >>>>>>>
> > >>>>>>> This philosophy manifested in a discussion a couple of weeks ago
> > >> about
> > >>>> whether RelBuilder should be able to create physical rels. I still
> > >> believe
> > >>>> that it should.
> > >>>>>>>
> > >>>>>>> Julian
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>> On Apr 27, 2020, at 11:29 AM, Roman Kondakov
> > >>>> <kondako...@mail.ru.INVALID> wrote:
> > >>>>>>>>
> > >>>>>>>> Hi all,
> > >>>>>>>>
> > >>>>>>>> Stamatis, Haisheng thank you very much for your feedback! I
> really
> > >>>>>>>> appreciate it.
> > >>>>>>>>
> > >>>>>>>>> If in the new planner we end up copy-pasting code then I guess
> it
> > >>>> will be a bad idea.
> > >>>>>>>>
> > >>>>>>>> Yes, there are some code duplication between Volcano planner and
> > >>>>>>>> Cascades planner. I think I'll move it to some common
> superclass:
> > >>>> either
> > >>>>>>>> to existing AbstractRelOptPlanner or a new
> AbstractCostBasedPlanner.
> > >>>>>>>>
> > >>>>>>>>> Like that it will be easier for everybody to test and see if
> the
> > >>>> changes make this better or worse. From a backward compatibility
> > >>>> perspective it seems feasible to keep the new eatures configurable
> for a
> > >>>> certain amount of time.
> > >>>>>>>>
> > >>>>>>>> I was thinking about backward compatibility in this way: what
> if we
> > >>>> can
> > >>>>>>>> switch planner in tests by setting some flag (system property?)
> > >>>>>>>> somewhere and check what happens with new planner if we replace
> > >>>> Volcano
> > >>>>>>>> with it. And if ever it turns out that the new planner passes
> all
> > >>>>>>>> Volcano's tests and at the same time it works more efficiently,
> we
> > >> can
> > >>>>>>>> safely replace Volcano planner with Cascades planner.
> > >>>>>>>>
> > >>>>>>>>> A design doc would definitely help, especially if it has a few
> > >>>> end-to-end (from logical to physical plan) examples showing how the
> > >>>> optimizer works
> > >>>>>>>> at each step before/after the changes. This is actually what is
> > >>>> usually
> > >>>>>>>> missing in research papers that makes them hard to understand.
> > >>>>>>>>> I am thinking some similar to the examples that Haisheng send
> in
> > >> the
> > >>>> first  email but possibly a bit more detailed.
> > >>>>>>>>
> > >>>>>>>> I agree, I'll add some exmples to the design doc very soon.
> > >>>>>>>>
> > >>>>>>>>> I looked very briefly in the PR by Roman but I think I didn't
> see
> > >>>> tests where the final plan contains operators from multiple
> conventions.
> > >>>> Multiple conventions is among the choices that complicate certain
> parts
> > >> of
> > >>>> he existing planner so we should make sure that we take this into
> > >> account.
> > >>>>>>>>
> > >>>>>>>> It's on my radar. I'm going to add these tests.
> > >>>>>>>>
> > >>>>>>>> I would like to ask the commutnity about my next steps on the
> way
> > >> from
> > >>>>>>>> transition of the Cascades planner from the prototype status to
> > >>>> becoming
> > >>>>>>>> a part of the project. I see these steps like this:
> > >>>>>>>>
> > >>>>>>>> 1. Create a jira ticket.
> > >>>>>>>> 2. Update design document with examples.
> > >>>>>>>> 3. Make some research to obtain backward compatibility with
> Volcano
> > >>>>>>>> planner to be able to replace Volcano planner with Cascades
> planner
> > >> in
> > >>>>>>>> test and to understand the current porblems with planner.
> > >>>>>>>> 4. Solve known problems
> > >>>>>>>> - materialized views
> > >>>>>>>> - hints
> > >>>>>>>> - multiple convetions
> > >>>>>>>> - listener hooks
> > >>>>>>>> - problems from p.3.
> > >>>>>>>> 5. new PR, review and merge.
> > >>>>>>>> 6. Replacie Volcano planner wiith Cascades after several
> releases.
> > >>>>>>>>
> > >>>>>>>> What do you think about this roadmap?
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> --
> > >>>>>>>> Kind Regards
> > >>>>>>>> Roman Kondakov
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On 27.04.2020 01:55, Stamatis Zampetakis wrote:
> > >>>>>>>>> Hi all,
> > >>>>>>>>>
> > >>>>>>>>> I am very excited about the ideas discussed so far and
> especially
> > >> by
> > >>>> the
> > >>>>>>>>> enthusiasm of many people that are ready to help for pulling
> this
> > >>>> out.
> > >>>>>>>>> I wouldn't except that we could have a prototype so quickly.
> > >>>>>>>>> Thanks a lot everyone!
> > >>>>>>>>>
> > >>>>>>>>> In the debate between creating new planner or patching the
> existing
> > >>>> one, I
> > >>>>>>>>> don't have a clear preference.
> > >>>>>>>>> I think the answer depends on how many things can we reuse.
> > >>>>>>>>> If in the new planner we end up copy-pasting code then I guess
> it
> > >>>> will be a
> > >>>>>>>>> bad idea.
> > >>>>>>>>> On the other hand, if the new and old planner do not have many
> > >>>> things in
> > >>>>>>>>> common then I guess the answer is obvious.
> > >>>>>>>>>
> > >>>>>>>>> From Haisheng's description, I was thinking that many of the
> > >> proposed
> > >>>>>>>>> changes could go in the existing planner.
> > >>>>>>>>> Like that it will be easier for everybody to test and see if
> the
> > >>>> changes
> > >>>>>>>>> make this better or worse.
> > >>>>>>>>> From a backward compatibility perspective it seems feasible to
> keep
> > >>>> the new
> > >>>>>>>>> features configurable for a certain amount of time.
> > >>>>>>>>>
> > >>>>>>>>> From the wish-list, I think we should focus initially on
> points:
> > >>>>>>>>> 1. Top-down trait request
> > >>>>>>>>> 2. Convert traits without Abstract converters
> > >>>>>>>>> 4. Bottom-up trait derivation
> > >>>>>>>>>
> > >>>>>>>>> I know that 3, and 5, are also important but I have the feeling
> > >> they
> > >>>> can
> > >>>>>>>>> wait a bit longer.
> > >>>>>>>>>
> > >>>>>>>>> A design doc would definitely help, especially if it has a few
> > >>>> end-to-end
> > >>>>>>>>> (from logical to physical plan) examples showing how the
> optimizer
> > >>>> works at
> > >>>>>>>>> each step before/after the changes.
> > >>>>>>>>> This is actually what is usually missing in research papers
> that
> > >>>> makes them
> > >>>>>>>>> hard to understand.
> > >>>>>>>>> I am thinking some similar to the examples that Haisheng send
> in
> > >> the
> > >>>> first
> > >>>>>>>>> email but possibly a bit more detailed.
> > >>>>>>>>>
> > >>>>>>>>> I looked very briefly in the PR by Roman but I think I didn't
> see
> > >>>> tests
> > >>>>>>>>> where the final plan contains operators from multiple
> conventions.
> > >>>>>>>>> Multiple conventions is among the choices that complicate
> certain
> > >>>> parts of
> > >>>>>>>>> the existing planner so we should make sure that we take this
> into
> > >>>> account.
> > >>>>>>>>>
> > >>>>>>>>> Hoping to find some time to think over all this more quietly.
> Very
> > >>>>>>>>> interesting stuff :)
> > >>>>>>>>>
> > >>>>>>>>> Best,
> > >>>>>>>>> Stamatis
> > >>>>>>>>>
> > >>>>>>>>> On Sun, Apr 26, 2020 at 11:14 PM Haisheng Yuan <
> hy...@apache.org>
> > >>>> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Hi Roman,
> > >>>>>>>>>>
> > >>>>>>>>>> Excellent! This is definitely a helpful contribution to the
> > >> Calcite
> > >>>>>>>>>> community.
> > >>>>>>>>>> Thank you for your endeavors.
> > >>>>>>>>>>
> > >>>>>>>>>> Haisheng
> > >>>>>>>>>>
> > >>>>>>>>>> On 2020/04/26 19:25:00, Roman Kondakov
> <kondako...@mail.ru.INVALID
> > >>>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>> Hi everyone!
> > >>>>>>>>>>>
> > >>>>>>>>>>> Haisheng, thank you for bringing this subject up. A new
> > >>>> Cascades-style
> > >>>>>>>>>>> optimizer should be definitely the next step for Apache
> Calcite.
> > >>>> Many
> > >>>>>>>>>>> projects suffer from the lack of this kind of optimizer.
> > >>>>>>>>>>>
> > >>>>>>>>>>> That was the reason why several weeks ago I started working
> on
> > >> the
> > >>>>>>>>>>> prototype of Cascades optimizer for Apache Calcite. I was not
> > >> sure
> > >>>> that
> > >>>>>>>>>>> I would build something useful without much experience in
> this
> > >>>> area. But
> > >>>>>>>>>>> now I'd like to share my work with the community. You can
> find
> > >> the
> > >>>>>>>>>>> Cascades prototype in PR [1]. This prototype is based on the
> > >>>> well-known
> > >>>>>>>>>>> paper [2].
> > >>>>>>>>>>>
> > >>>>>>>>>>> What prototype can do:
> > >>>>>>>>>>> - Top-down trait request
> > >>>>>>>>>>> - Convert traits without Abstract converters
> > >>>>>>>>>>> - Top-down rule apply
> > >>>>>>>>>>> - Bottom-up trait derivation
> > >>>>>>>>>>>
> > >>>>>>>>>>> What is not supported yet:
> > >>>>>>>>>>> - Search space pruning (but I'm going to fix it in the next
> > >>>> commits)
> > >>>>>>>>>>> - Materialized views
> > >>>>>>>>>>> - Planner hooks
> > >>>>>>>>>>> - Hints
> > >>>>>>>>>>> - Backward compatibility with Volcano planner (some research
> > >>>> needed)
> > >>>>>>>>>>>
> > >>>>>>>>>>> I prepared a design doc for this planner [3], you can find
> many
> > >>>> details
> > >>>>>>>>>>> there. I also opened it for comments.
> > >>>>>>>>>>>
> > >>>>>>>>>>> I've written several basic test cases in
> > >>>>>>>>>>> org.apache.calcite.plan.cascades.CascadesPlannerTest
> including
> > >>>> that was
> > >>>>>>>>>>> discussed in this thread previously in the context of trait
> > >>>> requests:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> MergeJoin hash[a]
> > >>>>>>>>>>>>      |---- TableScan R hash[a] (RelSubset)
> > >>>>>>>>>>>>      +---- TableScan S hash[a] (RelSubset)
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> Haisheng, this is very similar to what you propose. Answering
> > >> your
> > >>>>>>>>>> question:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> There are 2 ways:
> > >>>>>>>>>>>> a) Modify on current VolcanoPlanner.
> > >>>>>>>>>>>> Pros: code reuse, existing numerous test cases and
> > >> infrastructure,
> > >>>>>>>>>> fast integration
> > >>>>>>>>>>>> Cons: changing code always brings risk
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> b) Add a new XXXXPlanner
> > >>>>>>>>>>>> Pros: no risk, no tech debt, no need to worry about backward
> > >>>>>>>>>> compatability
> > >>>>>>>>>>>> Cons: separate test cases for new planner, one more planner
> to
> > >>>>>>>>>> maintain
> > >>>>>>>>>>>
> > >>>>>>>>>>> I've chosen the second approach. Because I don't have clear
> > >>>>>>>>>>> understanding how to fix Volcano planner gradually.
> > >>>>>>>>>>>
> > >>>>>>>>>>> This new planner is very far from perfect, but I think it
> can be
> > >> a
> > >>>> good
> > >>>>>>>>>>> starting point for community.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Please, share your thoughts about this planner.
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> [1] PR: https://github.com/apache/calcite/pull/1948
> > >>>>>>>>>>> [2] Paper:
> > >>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>
> > >>
> https://15721.courses.cs.cmu.edu/spring2019/papers/22-optimizer1/xu-columbia-thesis1998.pdf
> > >>>>>>>>>>> [3] Design doc:
> > >>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>
> > >>
> https://docs.google.com/document/d/1qaV3eSKTw4gfLuBR3XB_LVIc51hxIng9k1J4VD6qnRg/edit?usp=sharing
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> --
> > >>>>>>>>>>> Kind Regards
> > >>>>>>>>>>> Roman Kondakov
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> On 22.04.2020 09:52, Danny Chan wrote:
> > >>>>>>>>>>>>> Is there any recommended approach to make that happen
> smoothly
> > >>>> besides
> > >>>>>>>>>>>> coding and testing work? We need to be aware that the new
> > >> planner
> > >>>>>>>>>> might be
> > >>>>>>>>>>>> co-exist with VolcanoPlanner for 5 or more years, or even
> never
> > >>>> replace
> > >>>>>>>>>>>> VolcanoPlanner.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> If that is true, i might say the new planner is probably
> with a
> > >>>> not
> > >>>>>>>>>> that
> > >>>>>>>>>>>> good design, we expect to see in advance for what
> cases/reasons
> > >>>> user
> > >>>>>>>>>> has
> > >>>>>>>>>>>> the reason to keep the old VolcanoPlanner and we *must*
> give a
> > >>>>>>>>>> solution for
> > >>>>>>>>>>>> those problems in the new design.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I was expecting that migrating to a new planner would at
> least
> > >>>> take 1
> > >>>>>>>>>> year
> > >>>>>>>>>>>> for developing, if that is true, modifying directly based on
> > >>>> current
> > >>>>>>>>>>>> planner means for the near future 3~4 versions Calcite,
> there
> > >>>> would
> > >>>>>>>>>> bring
> > >>>>>>>>>>>> in huge plan changes/bugs for each release which i believe
> all
> > >> the
> > >>>>>>>>>> users of
> > >>>>>>>>>>>> Calcite don't want to see. And on one can guarantee that
> > >> modifying
> > >>>>>>>>>> directly
> > >>>>>>>>>>>> can keep good stability and compatibility, only the test
> set do.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> From the experience of Alibaba Blink planner which has
> > >>>> contributed to
> > >>>>>>>>>>>> Apache Flink, yes, the old/new planner would co-exist at
> least
> > >>>> for 2
> > >>>>>>>>>> years.
> > >>>>>>>>>>>> For the reasons that the new and old planner has different
> > >>>> ability in
> > >>>>>>>>>> some
> > >>>>>>>>>>>> corner cases.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> From my point of view, we should at least:
> > >>>>>>>>>>>> - Give a convincing test set for the new planner that makes
> us
> > >>>> believe
> > >>>>>>>>>> the
> > >>>>>>>>>>>> new planner is stable and powerful enough. I mean obviously
> the
> > >>>> current
> > >>>>>>>>>>>> rule tests are far away from enough to support the new
> planner
> > >>>>>>>>>>>> - We should give a more detailed design doc about the new
> > >> planner,
> > >>>>>>>>>>>> especially about the interfaces changes and any change that
> > >> would
> > >>>>>>>>>> bring in
> > >>>>>>>>>>>> the compatibility problem. Then we can make more accurate
> > >>>> decision how
> > >>>>>>>>>> much
> > >>>>>>>>>>>> work the new planner would bring in, until then, we can
> decide
> > >> if
> > >>>>>>>>>> switch to
> > >>>>>>>>>>>> a pure new planner development is a good idea or modify the
> > >>>> existing
> > >>>>>>>>>> one.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Haisheng Yuan <hy...@apache.org> 于2020年4月22日周三 上午9:45写道：
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> Hi Andrii,
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Obviously, from what is written here, I could guess that
> this
> > >>>> would
> > >>>>>>>>>>>>> require me to change my physical planning rules, even if
> only
> > >> by
> > >>>>>>>>>>>>> implementing a marker interface.
> > >>>>>>>>>>>>> You don't need to change your physical rules, it will be
> > >> treated
> > >>>> as
> > >>>>>>>>>> equal
> > >>>>>>>>>>>>> as logical rules and be applied together with the real
> logical
> > >>>> rules,
> > >>>>>>>>>> no
> > >>>>>>>>>>>>> more logical/physical rules difference. This is also how
> > >> current
> > >>>>>>>>>>>>> VolcanoPlanner works.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> I don't want you to think that I somehow resent the
> changes
> > >> you
> > >>>> are
> > >>>>>>>>>>>>> pushing.
> > >>>>>>>>>>>>> Don't get me wrong. I am seriously thinking of revert these
> > >>>> changes,
> > >>>>>>>>>> since
> > >>>>>>>>>>>>> most people like the idea of adding new planner, why don't
> we
> > >>>> make
> > >>>>>>>>>> all the
> > >>>>>>>>>>>>> plan changes in the new planner, instead of forcing people
> > >>>> changing
> > >>>>>>>>>> test
> > >>>>>>>>>>>>> cases for the code changes that they might not need in
> > >>>> VolcanoPlanner
> > >>>>>>>>>>>>> during upgrade.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> I didn't intend to replace VolcanoPlanner, thought just
> change
> > >>>> the
> > >>>>>>>>>> search
> > >>>>>>>>>>>>> strategy and add trait derivation mechanism, because most
> of
> > >> the
> > >>>> code
> > >>>>>>>>>> in
> > >>>>>>>>>>>>> VolcanoPlanner can be reused. But since many agree to add
> new
> > >>>> planner
> > >>>>>>>>>> and
> > >>>>>>>>>>>>> replace VolcanoPlanner as the final goal, I won't be
> against
> > >> most
> > >>>>>>>>>> people's
> > >>>>>>>>>>>>> decision.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Is there any recommended approach to make that happen
> smoothly
> > >>>> besides
> > >>>>>>>>>>>>> coding and testing work? We need to be aware that the new
> > >> planner
> > >>>>>>>>>> might be
> > >>>>>>>>>>>>> co-exist with VolcanoPlanner for 5 or more years, or even
> never
> > >>>>>>>>>> replace
> > >>>>>>>>>>>>> VolcanoPlanner.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> More thoughts are welcome.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Haisheng
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> On 2020/04/21 19:56:25, Андрей Цвелодуб <
> a.tsvelo...@gmail.com
> > >>>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>>>>> Hello Haisheng,
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> To keep backward compatibility, all the un-marked rules
> will
> > >> be
> > >>>>>>>>>> treated
> > >>>>>>>>>>>>>> as logical rules, except rules that uses
> AbstractConverter as
> > >>>> rule
> > >>>>>>>>>>>>> operand,
> > >>>>>>>>>>>>>> these rules still need to applied top-down, or random
> order.
> > >>>>>>>>>>>>>> Obviously, from what is written here, I could guess that
> this
> > >>>> would
> > >>>>>>>>>>>>> require
> > >>>>>>>>>>>>>> me to change my physical planning rules, even if only by
> > >>>>>>>>>> implementing a
> > >>>>>>>>>>>>>> marker interface. I am not saying this is a bad thing, but
> > >> this
> > >>>> is a
> > >>>>>>>>>>>>> thing
> > >>>>>>>>>>>>>> that should be communicated and planned ahead in case the
> > >>>>>>>>>> VolcanoPlanner
> > >>>>>>>>>>>>> is
> > >>>>>>>>>>>>>> modified.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Looks like I have to revert changes in CALCITE-2970 and
> > >>>>>>>>>> CALCITE-3753,
> > >>>>>>>>>>>>>> because they will cause another tons of plan changes.
> > >>>>>>>>>>>>>> I see you are still bitter due to all the discussions on
> this
> > >>>> list
> > >>>>>>>>>>>>> lately,
> > >>>>>>>>>>>>>> I'm sorry. I don't want you to think that I somehow
> resent the
> > >>>>>>>>>> changes
> > >>>>>>>>>>>>> you
> > >>>>>>>>>>>>>> are pushing, au contraire I support them and would be
> happy to
> > >>>> help
> > >>>>>>>>>> if I
> > >>>>>>>>>>>>>> can. I just want the process of these changes to be
> executed
> > >> in
> > >>>> the
> > >>>>>>>>>> best
> > >>>>>>>>>>>>>> possible way.
> > >>>>>>>>>>>>>> As I see there are already several opinions in this thread
> > >> that
> > >>>>>>>>>> basically
> > >>>>>>>>>>>>>> align with what I am saying, so I guess I am not the
> crazy guy
> > >>>>>>>>>> running
> > >>>>>>>>>>>>>> around and yelling "the end is nigh!".
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Thank you for taking these mumbled thoughts into account.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Bestest Regards,
> > >>>>>>>>>>>>>> Andrii Tsvielodub
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> On Tue, 21 Apr 2020 at 21:08, Haisheng Yuan <
> hy...@apache.org
> > >>>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Hi Andrii,
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> I guess changing the planner would lead to changes in
> tons
> > >> of
> > >>>> rules
> > >>>>>>>>>>>>> and
> > >>>>>>>>>>>>>>> even more tests.
> > >>>>>>>>>>>>>>> Obviously you didn't read through my email. You are not
> > >>>> required to
> > >>>>>>>>>> do
> > >>>>>>>>>>>>> any
> > >>>>>>>>>>>>>>> changes to your rule if you don't want to, but if you do,
> > >> just
> > >>>> need
> > >>>>>>>>>> to
> > >>>>>>>>>>>>> mark
> > >>>>>>>>>>>>>>> the rule to tell planner whether it is a physical rule or
> > >> not,
> > >>>>>>>>>> simply
> > >>>>>>>>>>>>> by
> > >>>>>>>>>>>>>>> implementing an empty interface.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> many on this list already experienced problems with
> > >> upgrading
> > >>>> even
> > >>>>>>>>>>>>>>> between the minor versions of Calcite.
> > >>>>>>>>>>>>>>> Sorry to see the problem you have experienced when
> upgrading
> > >>>>>>>>>> Calcite.
> > >>>>>>>>>>>>>>> Looks like I have to revert changes in CALCITE-2970 and
> > >>>>>>>>>> CALCITE-3753,
> > >>>>>>>>>>>>>>> because they will cause another tons of plan changes.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> But I will see if I can add a setting to use the old
> search
> > >>>>>>>>>> strategy,
> > >>>>>>>>>>>>>>> which can be left untouched.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Haisheng
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> On 2020/04/21 06:33:08, Андрей Цвелодуб <
> > >> a.tsvelo...@gmail.com
> > >>>>>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>> Hello everyone,
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> First of all, thanks for this great effort of improving
> the
> > >>>> core
> > >>>>>>>>>>>>> parts of
> > >>>>>>>>>>>>>>>> the framework we all are using,
> > >>>>>>>>>>>>>>>> I believe this is long overdue and hope this will have
> > >>>> benefits
> > >>>>>>>>>> both
> > >>>>>>>>>>>>> for
> > >>>>>>>>>>>>>>>> the maintainers and users of the library.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> I don't have anything to say about the general idea at
> the
> > >>>> moment,
> > >>>>>>>>>>>>>>>> but I want to make a point that maintaining the old
> > >>>> implementation
> > >>>>>>>>>> of
> > >>>>>>>>>>>>>>>> VolcanoPlanner during
> > >>>>>>>>>>>>>>>> the initial stages of implementing the new planner is
> > >>>> absolutely
> > >>>>>>>>>>>>>>> CRITICAL.
> > >>>>>>>>>>>>>>>> As a lot of users of Calcite do various customizations
> to
> > >> the
> > >>>>>>>>>>>>> engine, to
> > >>>>>>>>>>>>>>>> the rules
> > >>>>>>>>>>>>>>>> and all that is there in between, I believe changing the
> > >>>>>>>>>>>>> implementation
> > >>>>>>>>>>>>>>> of
> > >>>>>>>>>>>>>>>> the core component
> > >>>>>>>>>>>>>>>> would have a huge impact on most users of the library. I
> > >>>> think many
> > >>>>>>>>>>>>> on
> > >>>>>>>>>>>>>>> this
> > >>>>>>>>>>>>>>>> list
> > >>>>>>>>>>>>>>>> already experienced problems with upgrading even
> between the
> > >>>> minor
> > >>>>>>>>>>>>>>> versions
> > >>>>>>>>>>>>>>>> of Calcite,
> > >>>>>>>>>>>>>>>> so I guess changing the planner would lead to changes in
> > >> tons
> > >>>> of
> > >>>>>>>>>>>>> rules
> > >>>>>>>>>>>>>>> and
> > >>>>>>>>>>>>>>>> even more tests.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> I don't have anything against replacing VolcanoPlanner
> as a
> > >>>> final
> > >>>>>>>>>>>>> goal of
> > >>>>>>>>>>>>>>>> this effort,
> > >>>>>>>>>>>>>>>> but I don't think that modifying it directly and
> merging it
> > >> to
> > >>>>>>>>>>>>> master is
> > >>>>>>>>>>>>>>> a
> > >>>>>>>>>>>>>>>> viable development approach.
> > >>>>>>>>>>>>>>>> While I understand how burdensome it is to maintain
> several
> > >>>>>>>>>> parallel
> > >>>>>>>>>>>>> core
> > >>>>>>>>>>>>>>>> components at once
> > >>>>>>>>>>>>>>>> (we did this while moving the engine of our product to
> > >>>> Calcite), we
> > >>>>>>>>>>>>>>> should
> > >>>>>>>>>>>>>>>> still respect those who depend
> > >>>>>>>>>>>>>>>> on it and not introduce the risks related to the
> development
> > >>>> of a
> > >>>>>>>>>> new
> > >>>>>>>>>>>>>>>> component into existing processing flows.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> A good model to try to follow would be the way new
> Garbage
> > >>>>>>>>>>>>> Collectors are
> > >>>>>>>>>>>>>>>> introduced in Java.
> > >>>>>>>>>>>>>>>> First, add it as an experimental option, then make it
> > >>>> generally
> > >>>>>>>>>>>>>>> available,
> > >>>>>>>>>>>>>>>> then after everyone agrees
> > >>>>>>>>>>>>>>>> this is the best option - make it the default one.
> > >>>>>>>>>>>>>>>> With this approach, everyone can then move to the new
> > >> planner
> > >>>> at
> > >>>>>>>>>>>>> their
> > >>>>>>>>>>>>>>> own
> > >>>>>>>>>>>>>>>> pace, guaranteeing a smooth transition overall.
> > >>>>>>>>>>>>>>>> Yes, this could take some time, maybe even a year, but
> this
> > >>>> is the
> > >>>>>>>>>>>>> price
> > >>>>>>>>>>>>>>> of
> > >>>>>>>>>>>>>>>> doing major changes in a popular framework.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Again, thank you for initiating this discussion and
> leading
> > >>>> this
> > >>>>>>>>>>>>> effort.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Best Regards,
> > >>>>>>>>>>>>>>>> Andrii Tsvielodub
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> On Tue, 21 Apr 2020 at 07:51, Jinpeng Wu <
> > >> wjpabc...@gmail.com
> > >>>>>
> > >>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Hi, Xiening.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Regarding calculating the logical cost, here are some
> ways
> > >> I
> > >>>>>>>>>>>>> though:
> > >>>>>>>>>>>>>>>>> 1. Logical rel may implement their own computeSelfCost
> > >>>> method.
> > >>>>>>>>>> Some
> > >>>>>>>>>>>>>>>>> rels can provide such information, for example the
> > >>>>>>>>>>>>>>>>> LogicalProject/LogicalFilter contains nearly the same
> > >>>> information
> > >>>>>>>>>>>>> as
> > >>>>>>>>>>>>>>> their
> > >>>>>>>>>>>>>>>>> physical implementations. If we don't have enough
> > >>>> confidence, just
> > >>>>>>>>>>>>>>> return
> > >>>>>>>>>>>>>>>>> zeroCost is also OK, as it only affects pruning.
> > >>>>>>>>>>>>>>>>> 2. Logical rel  tells its parents what its physical
> input
> > >>>> could be
> > >>>>>>>>>>>>>>> after
> > >>>>>>>>>>>>>>>>> implementation. Then the problem come back to
> calculating
> > >>>> lower
> > >>>>>>>>>>>>> bound
> > >>>>>>>>>>>>>>> of a
> > >>>>>>>>>>>>>>>>> physical rel.
> > >>>>>>>>>>>>>>>>> There should always be ways. The only problem is how to
> > >> find
> > >>>> a
> > >>>>>>>>>>>>> pretty
> > >>>>>>>>>>>>>>> one.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Regarding the risk, new planner do have different
> risk. It
> > >>>> is not
> > >>>>>>>>>>>>>>> because
> > >>>>>>>>>>>>>>>>> new planner could stop us doing something wrong but we
> can
> > >>>> decide
> > >>>>>>>>>>>>> when
> > >>>>>>>>>>>>>>> to
> > >>>>>>>>>>>>>>>>> use the new one. Some scenarios:
> > >>>>>>>>>>>>>>>>> 1. If modifying the VolcanoPlanner directly, the only
> way
> > >>>> user
> > >>>>>>>>>>>>> could
> > >>>>>>>>>>>>>>>>> control the risk is not to upgrade calcite version
> until it
> > >>>> is
> > >>>>>>>>>>>>>>> considered
> > >>>>>>>>>>>>>>>>> stable. You know, it is quite different from keeping
> > >> calcite
> > >>>>>>>>>>>>> updated
> > >>>>>>>>>>>>>>> and
> > >>>>>>>>>>>>>>>>> switching to the new planner at a proper time.
> > >>>>>>>>>>>>>>>>> 2. It is very importance for SLA control. For the
> important
> > >>>>>>>>>>>>> business
> > >>>>>>>>>>>>>>> and
> > >>>>>>>>>>>>>>>>> jobs, we may keep using the old and stable planner.
> And use
> > >>>> the
> > >>>>>>>>>>>>> new one
> > >>>>>>>>>>>>>>>>> only for jobs that have fault tolerance. And this helps
> > >>>> testing
> > >>>>>>>>>> new
> > >>>>>>>>>>>>>>> planner
> > >>>>>>>>>>>>>>>>> with actual scenarios.
> > >>>>>>>>>>>>>>>>> 3. It is helpful when upgrading online services. When
> the
> > >> new
> > >>>>>>>>>>>>> planner
> > >>>>>>>>>>>>>>>>> happened to have some bugs, we can switch to the old
> > >> planner
> > >>>>>>>>>>>>> directly
> > >>>>>>>>>>>>>>>>> without rollback the whole service.
> > >>>>>>>>>>>>>>>>> 4. With all these ways to prevent issues becoming
> > >> disasters,
> > >>>> we
> > >>>>>>>>>>>>> are not
> > >>>>>>>>>>>>>>>>> vulnerable to making mistakes. This not only enables
> faster
> > >>>>>>>>>>>>> iterations
> > >>>>>>>>>>>>>>> but
> > >>>>>>>>>>>>>>>>> also let us have enough time to resolve big bugs, like
> > >>>> considering
> > >>>>>>>>>>>>> it
> > >>>>>>>>>>>>>>> in
> > >>>>>>>>>>>>>>>>> detail and applying a time-consuming refactoring for
> it. To
> > >>>> work
> > >>>>>>>>>>>>>>> around a
> > >>>>>>>>>>>>>>>>> critical bug using tricky ways usually introduces more
> > >>>> issues.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>>>>>>> Jinpeng
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> On Tue, Apr 21, 2020 at 2:04 AM Xiening Dai <
> > >>>> xndai....@gmail.com>
> > >>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Hi Jinpeng,
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Regarding this comment - I believe there are ways to
> > >>>> calculate
> > >>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>> logical
> > >>>>>>>>>>>>>>>>>> cost, but I think it’s not that simple as
> "cardinality *
> > >>>>>>>>>>>>>>>>> unit_copy_cost.”,
> > >>>>>>>>>>>>>>>>>> would  you provide more details of other different
> ways?
> > >>>> Just the
> > >>>>>>>>>>>>>>>>> algorithm
> > >>>>>>>>>>>>>>>>>> description or pseudo code would help us understand.
> > >> Thanks.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Regarding the approach of creating new planner, I
> don’t
> > >>>> think a
> > >>>>>>>>>>>>> new
> > >>>>>>>>>>>>>>>>>> planner would lower the risk. We don’t know what we
> don’t
> > >>>> know.
> > >>>>>>>>>>>>> If we
> > >>>>>>>>>>>>>>>>>> introduced an issue while modifying the planner, most
> > >>>> likely we
> > >>>>>>>>>>>>>>> would do
> > >>>>>>>>>>>>>>>>>> the same with new planner class. A new planner doesn’t
> > >>>>>>>>>>>>> necessarily
> > >>>>>>>>>>>>>>>>> prevent
> > >>>>>>>>>>>>>>>>>> the issue from happening, but just delay its
> surfacing,
> > >>>> which is
> > >>>>>>>>>>>>>>> worse
> > >>>>>>>>>>>>>>>>> IMO.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> There’s one obvious benefit with new planner is that
> we
> > >> can
> > >>>>>>>>>>>>> provide
> > >>>>>>>>>>>>>>> some
> > >>>>>>>>>>>>>>>>>> sort of isolation so the change won’t cause test
> baseline
> > >>>>>>>>>>>>> updates,
> > >>>>>>>>>>>>>>> which
> > >>>>>>>>>>>>>>>>>> could be painful at times.  We should see if we could
> use
> > >>>> planner
> > >>>>>>>>>>>>>>> config
> > >>>>>>>>>>>>>>>>> to
> > >>>>>>>>>>>>>>>>>> achieve the same if we decide to just modify the
> current
> > >>>> planner.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> On Apr 20, 2020, at 8:32 AM, Haisheng Yuan <
> > >>>> hy...@apache.org>
> > >>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Igor,
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> That's great.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> On 2020/04/20 11:17:49, Seliverstov Igor <
> > >>>> gvvinbl...@gmail.com
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>>>> Haisheng, Xiening,
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Ok, Now I see how it should work.
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Thanks for your replies.
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>>>>>>>>> Igor
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> 20 апр. 2020 г., в 09:56, Seliverstov Igor <
> > >>>>>>>>>>>>> gvvinbl...@gmail.com
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> написал(а):
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> Haisheng, Xiening,
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> Thanks for clarifying.
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> In this proposal, we are not trying to split
> logical
> > >> and
> > >>>>>>>>>>>>> physical
> > >>>>>>>>>>>>>>>>>> planning entirely. - actually I was in doubt about an
> idea
> > >>>> of
> > >>>>>>>>>>>>> entire
> > >>>>>>>>>>>>>>>>>> splitting logical and physical phases, if you aren't
> going
> > >>>> to, I
> > >>>>>>>>>>>>>>> have no
> > >>>>>>>>>>>>>>>>>> objections.
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> But it returns me to my first question: how we will
> > >>>> propagate
> > >>>>>>>>>>>>>>> traits
> > >>>>>>>>>>>>>>>>>> in bottom-up manner using proposed approach. (See
> > >> [DISCUSS]
> > >>>>>>>>>>>>> Proposal
> > >>>>>>>>>>>>>>> to
> > >>>>>>>>>>>>>>>>> add
> > >>>>>>>>>>>>>>>>>> API to force rules matching specific rels for details)
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> One of inconveniences of current VolcanoPlanner
> > >>>>>>>>>>>>> implementation is
> > >>>>>>>>>>>>>>>>>> amount of tricks that we need to get desired
> behaviour. It
> > >>>> would
> > >>>>>>>>>>>>> be
> > >>>>>>>>>>>>>>> great
> > >>>>>>>>>>>>>>>>>> if some of issues (or all of them) were solved in the
> new
> > >>>>>>>>>>>>> approach.
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>>>>>>>>>> Igor
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> пн, 20 апр. 2020 г., 7:02 Xiening Dai <
> > >>>> xndai....@gmail.com
> > >>>>>>>>>>>>>>> <mailto:
> > >>>>>>>>>>>>>>>>>> xndai....@gmail.com>>:
> > >>>>>>>>>>>>>>>>>>>>> Hi Igor,
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> Your comment - "because actual cost may be
> calculated
> > >>>>>>>>>>>>> correctly
> > >>>>>>>>>>>>>>> using
> > >>>>>>>>>>>>>>>>>> physical operators only. So won't be able to implement
> > >>>> Branch and
> > >>>>>>>>>>>>>>> Bound
> > >>>>>>>>>>>>>>>>>> Space Pruning.“ is actually not true. In Cascade’s
> lower
> > >>>> bound /
> > >>>>>>>>>>>>>>> upper
> > >>>>>>>>>>>>>>>>>> bound pruning algorithm, you can get cost lower bound
> of
> > >>>> input
> > >>>>>>>>>>>>>>> RelNode
> > >>>>>>>>>>>>>>>>>> using cardinality * unit_copy_cost. The
> unit_copy_cost is
> > >> a
> > >>>>>>>>>>>>> constant,
> > >>>>>>>>>>>>>>>>> which
> > >>>>>>>>>>>>>>>>>> stands for the minimal cost for processing a tuple
> from
> > >> the
> > >>>>>>>>>>>>> input. So
> > >>>>>>>>>>>>>>>>> this
> > >>>>>>>>>>>>>>>>>> will be the minimal cost the input RelNode can
> achieve,
> > >> and
> > >>>> if
> > >>>>>>>>>>>>> this
> > >>>>>>>>>>>>>>> is
> > >>>>>>>>>>>>>>>>>> indeed larger than the current best cost, this input
> node
> > >>>> can be
> > >>>>>>>>>>>>>>> pruned.
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> In this proposal, we are not trying to split
> logical
> > >> and
> > >>>>>>>>>>>>> physical
> > >>>>>>>>>>>>>>>>>> planning entirely. But for any given equivalent set,
> we
> > >>>> would
> > >>>>>>>>>>>>> need to
> > >>>>>>>>>>>>>>>>>> finish exploration before implementation. But for the
> > >>>> entire memo
> > >>>>>>>>>>>>>>> tree,
> > >>>>>>>>>>>>>>>>>> each set could be in different planning stage, and an
> > >>>> equivalent
> > >>>>>>>>>>>>> set
> > >>>>>>>>>>>>>>> can
> > >>>>>>>>>>>>>>>>> be
> > >>>>>>>>>>>>>>>>>> pruned even before it’s implemented. A text book
> example
> > >> of
> > >>>>>>>>>>>>>>>>> aforementioned
> > >>>>>>>>>>>>>>>>>> “lower bound / upper bound pruning” would be the join
> > >>>> ordering
> > >>>>>>>>>>>>> case.
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> Regarding #3, I think we can still achieve that
> > >>>> (partially)
> > >>>>>>>>>>>>>>> through
> > >>>>>>>>>>>>>>>>>> this proposal. Remember every time when we start the
> > >>>>>>>>>>>>> optimization, we
> > >>>>>>>>>>>>>>>>> pass
> > >>>>>>>>>>>>>>>>>> down an upper bound limit. Initially this upper bound
> for
> > >>>> root is
> > >>>>>>>>>>>>>>>>> infinite
> > >>>>>>>>>>>>>>>>>> - meaning that no feasible plan is available. Every
> time
> > >>>> when we
> > >>>>>>>>>>>>>>> find a
> > >>>>>>>>>>>>>>>>>> physical plan we update this upper bound, then start
> the
> > >>>> search
> > >>>>>>>>>>>>>>> again. We
> > >>>>>>>>>>>>>>>>>> could stop the search when the cost is less than a
> > >>>> pre-defined
> > >>>>>>>>>>>>>>> threshold
> > >>>>>>>>>>>>>>>>> -
> > >>>>>>>>>>>>>>>>>> which gives you a “good enough” plan with early
> > >> termination.
> > >>>>>>>>>>>>> Still
> > >>>>>>>>>>>>>>> this
> > >>>>>>>>>>>>>>>>>> wouldn't avoid the logical exploration. For that, you
> > >> would
> > >>>>>>>>>>>>> probably
> > >>>>>>>>>>>>>>>>>> archive through rule configurations, and avoid some
> > >>>> expansive
> > >>>>>>>>>>>>>>>>>> transformation to keep the cost down.
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> On Apr 19, 2020, at 7:30 PM, Haisheng Yuan <
> > >>>>>>>>>>>>> hy...@apache.org
> > >>>>>>>>>>>>>>>>> <mailto:
> > >>>>>>>>>>>>>>>>>> hy...@apache.org>> wrote:
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> Igor,
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> a) Given current Calcite's stats derivation
> strategy,
> > >>>> mixing
> > >>>>>>>>>>>>>>> logical
> > >>>>>>>>>>>>>>>>>> and physical planning won't make it better. I hope you
> > >> went
> > >>>>>>>>>>>>> through
> > >>>>>>>>>>>>>>> my
> > >>>>>>>>>>>>>>>>>> email to the end, currently operators inside a memo
> group
> > >>>> don't
> > >>>>>>>>>>>>> share
> > >>>>>>>>>>>>>>>>> stats
> > >>>>>>>>>>>>>>>>>> info, each operator's stats may differ with the other
> one,
> > >>>> and
> > >>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>>> RelSubset's best node and stats may vary during
> > >>>> optimization. So
> > >>>>>>>>>>>>>>> logical
> > >>>>>>>>>>>>>>>>>> and physical rule pruning are not safe at the moment,
> > >>>> otherwise
> > >>>>>>>>>>>>> it
> > >>>>>>>>>>>>>>> almost
> > >>>>>>>>>>>>>>>>>> implies changing downstream project's cost model
> silently.
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> On the other hand, ensuring child group
> exploration
> > >> task
> > >>>>>>>>>>>>> finish
> > >>>>>>>>>>>>>>>>> first
> > >>>>>>>>>>>>>>>>>> will make rule mutual exclusivity check possible,
> like the
> > >>>>>>>>>>>>> result of
> > >>>>>>>>>>>>>>>>>> ReduceExpressionRule won't need trigger the same rule
> > >>>> again, The
> > >>>>>>>>>>>>> join
> > >>>>>>>>>>>>>>>>>> result of JoinCommuteRule won't need trigger
> > >>>> JoinCommuteRule and
> > >>>>>>>>>>>>>>>>>> ReduceExpressionRule again.
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> More importantly, most if not all the the long
> > >> planning
> > >>>>>>>>>>>>> queries
> > >>>>>>>>>>>>>>> in
> > >>>>>>>>>>>>>>>>>> Calcite are not caused by too many alternatives, but
> > >> mutual
> > >>>>>>>>>>>>>>> triggering,
> > >>>>>>>>>>>>>>>>> or
> > >>>>>>>>>>>>>>>>>> cyclic triggering, which can be avoided by
> customizing the
> > >>>> rules
> > >>>>>>>>>>>>>>> instead
> > >>>>>>>>>>>>>>>>> of
> > >>>>>>>>>>>>>>>>>> using the default one. Unless you use dynamic
> programming
> > >>>>>>>>>>>>> (Calcite
> > >>>>>>>>>>>>>>> use
> > >>>>>>>>>>>>>>>>>> heuristic) on join reordering (left-deep, right-deep,
> > >>>> bushy),
> > >>>>>>>>>>>>> space
> > >>>>>>>>>>>>>>>>> pruning
> > >>>>>>>>>>>>>>>>>> won't help make long / infinite running query faster.
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> b) No evidence shows current version of Calcite
> will
> > >>>> return
> > >>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>> most
> > >>>>>>>>>>>>>>>>>> promising plan in first planning iteration. Instead of
> > >>>> praying
> > >>>>>>>>>>>>> for
> > >>>>>>>>>>>>>>>>> getting
> > >>>>>>>>>>>>>>>>>> good enough plan in the first iteration, why not
> focus on
> > >>>> fixing
> > >>>>>>>>>>>>>>> rules
> > >>>>>>>>>>>>>>>>> that
> > >>>>>>>>>>>>>>>>>> causes the issue?
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> c) That is not the goal.
> > >>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>> On 2020/04/19 15:14:57, Seliverstov Igor <
> > >>>>>>>>>>>>> gvvinbl...@gmail.com
> > >>>>>>>>>>>>>>>>>> <mailto:gvvinbl...@gmail.com>> wrote:
> > >>>>>>>>>>>>>>>>>>>>>>> Haisheng,
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> From my point of view splitting logical and
> physical
> > >>>>>>>>>>>>> planning
> > >>>>>>>>>>>>>>> parts
> > >>>>>>>>>>>>>>>>>> isn’t a good idea.
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> I think so because actual cost may be calculated
> > >>>> correctly
> > >>>>>>>>>>>>>>> using
> > >>>>>>>>>>>>>>>>>> physical operators only. So that we:
> > >>>>>>>>>>>>>>>>>>>>>>> a) won't be able to implement Branch and Bound
> Space
> > >>>>>>>>>>>>> Pruning
> > >>>>>>>>>>>>>>> (as
> > >>>>>>>>>>>>>>>>> far
> > >>>>>>>>>>>>>>>>>> as I understand, at exploring time there are no
> physical
> > >>>>>>>>>>>>> operators,
> > >>>>>>>>>>>>>>> no
> > >>>>>>>>>>>>>>>>>> correct costs, but assumptions only, I don’t think we
> > >> should
> > >>>>>>>>>>>>> rely on
> > >>>>>>>>>>>>>>>>> them)
> > >>>>>>>>>>>>>>>>>>>>>>> b) won’t be able to get good enough plan (without
> > >>>> Branch
> > >>>>>>>>>>>>> and
> > >>>>>>>>>>>>>>> Bound
> > >>>>>>>>>>>>>>>>>> Space Pruning it’s non-trivial task to get right
> search
> > >>>>>>>>>>>>> direction and
> > >>>>>>>>>>>>>>>>> most
> > >>>>>>>>>>>>>>>>>> promising plans in first planning iterations)
> > >>>>>>>>>>>>>>>>>>>>>>> c) won’t be able to tune the good enough plan
> during
> > >>>>>>>>>>>>> several
> > >>>>>>>>>>>>>>>>> similar
> > >>>>>>>>>>>>>>>>>> queries are executed
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>>>>>>>>>>>> Igor
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>> 19 апр. 2020 г., в 17:37, Haisheng Yuan <
> > >>>> hy...@apache.org
> > >>>>>>>>>>>>>>>>> <mailto:
> > >>>>>>>>>>>>>>>>>> hy...@apache.org>> написал(а):
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>> Hi Igor,
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>> You can't have your cake and eat it.
> > >>>>>>>>>>>>>>>>>>>>>>>> But one feasible work item definitely we can do
> is
> > >>>> that
> > >>>>>>>>>>>>> once
> > >>>>>>>>>>>>>>>>>> timeout, stop exploring, use the first available
> physical
> > >>>>>>>>>>>>> operator in
> > >>>>>>>>>>>>>>>>> each
> > >>>>>>>>>>>>>>>>>> group and optimize it.
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>> Because most, if not all, of the long / infinite
> > >>>> running
> > >>>>>>>>>>>>>>>>>> optimizations are caused by project related transpose,
> > >> join
> > >>>>>>>>>>>>>>> reordering
> > >>>>>>>>>>>>>>>>>> (join commute + associative), constant reduction for
> large
> > >>>>>>>>>>>>> expression
> > >>>>>>>>>>>>>>>>> (see
> > >>>>>>>>>>>>>>>>>> CALCITE-3460), all of which are logical
> transformations
> > >>>> rules and
> > >>>>>>>>>>>>>>> many of
> > >>>>>>>>>>>>>>>>>> which have corresponding JIRAs. So another
> alternative is,
> > >>>> let's
> > >>>>>>>>>>>>> fix
> > >>>>>>>>>>>>>>>>> these
> > >>>>>>>>>>>>>>>>>> bugs to improve Calcite to make timeout option less
> > >> usable.
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>> Another thing worth noting is that sql server
> > >>>> optimizer
> > >>>>>>>>>>>>>>> timeout
> > >>>>>>>>>>>>>>>>>> mechanism is not based on clock time, but the number
> of
> > >>>>>>>>>>>>> optimization
> > >>>>>>>>>>>>>>>>> tasks
> > >>>>>>>>>>>>>>>>>> it has done [1].
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>> [1]
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>
> > >>
> https://techcommunity.microsoft.com/t5/sql-server-support/understanding-optimizer-timeout-and-how-complex-queries-can-be/ba-p/319188
> > >>>>>>>>>>>>>>>>>> <
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>
> > >>
> https://techcommunity.microsoft.com/t5/sql-server-support/understanding-optimizer-timeout-and-how-complex-queries-can-be/ba-p/319188
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>>>>>>>>>>>>> Haisheng
> > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>> On 2020/04/19 11:31:27, Seliverstov Igor <
> > >>>>>>>>>>>>>>> gvvinbl...@gmail.com
> > >>>>>>>>>>>>>>>>>> <mailto:gvvinbl...@gmail.com>> wrote:
> > >>>>>>>>>>>>>>>>>>>>>>>>> Haisheng,
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> Ok, then such notification isn't needed
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> But in this case we don't have any control
> over how
> > >>>> long
> > >>>>>>>>>>>>>>> planning
> > >>>>>>>>>>>>>>>>>> takes
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> For some systems it's necessary to get good
> enough
> > >>>> plan
> > >>>>>>>>>>>>>>> right now
> > >>>>>>>>>>>>>>>>>> instead
> > >>>>>>>>>>>>>>>>>>>>>>>>> of best one after while
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> For example we've been considering a case when
> a
> > >>>> query is
> > >>>>>>>>>>>>>>>>>> optimised several
> > >>>>>>>>>>>>>>>>>>>>>>>>> times in short iterations in case it's
> impossible
> > >> to
> > >>>> get
> > >>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>> best
> > >>>>>>>>>>>>>>>>>> plan in
> > >>>>>>>>>>>>>>>>>>>>>>>>> reasonable period of time (for example there
> is an
> > >>>> SLA
> > >>>>>>>>>>>>> for
> > >>>>>>>>>>>>>>>>>> response time)
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> This mean we need all needed physical
> > >> implementations
> > >>>>>>>>>>>>> after
> > >>>>>>>>>>>>>>> each
> > >>>>>>>>>>>>>>>>>> logical
> > >>>>>>>>>>>>>>>>>>>>>>>>> transformation is applied.
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>>>>>>>>>>>>>> Igor
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>> вс, 19 апр. 2020 г., 13:55 Haisheng Yuan <
> > >>>>>>>>>>>>> hy...@apache.org
> > >>>>>>>>>>>>>>>>>> <mailto:hy...@apache.org>>:
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>> Hi Igor,
> > >>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>> There will be no rule queue anymore. Y will be
> > >> fully
> > >>>>>>>>>>>>>>> explored
> > >>>>>>>>>>>>>>>>>> (logical
> > >>>>>>>>>>>>>>>>>>>>>>>>>> rules are matched and applied) before it can
> be
> > >>>>>>>>>>>>> implemented
> > >>>>>>>>>>>>>>> and
> > >>>>>>>>>>>>>>>>>> optimized.
> > >>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>>>>>>>>>>>>>>>> Haisheng
> > >>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>> On 2020/04/19 10:11:45, Seliverstov Igor <
> > >>>>>>>>>>>>>>> gvvinbl...@gmail.com
> > >>>>>>>>>>>>>>>>>> <mailto:gvvinbl...@gmail.com>> wrote:
> > >>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Haisheng,
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>> Great explanation, thanks.
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>> One thing I'd like to cover in advance is
> trait
> > >>>>>>>>>>>>> propagation
> > >>>>>>>>>>>>>>>>>> process (I
> > >>>>>>>>>>>>>>>>>>>>>>>>>> need
> > >>>>>>>>>>>>>>>>>>>>>>>>>>> it for Apache Ignite SQL engine
> implementation).
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>> For example:
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>> There are two nodes: Rel X and its child node
> > >> Rel Y
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>> Both nodes are in Optimized state, and there
> is a
> > >>>>>>>>>>>>> Logical
> > >>>>>>>>>>>>>>> rule
> > >>>>>>>>>>>>>>>>>> for Rel Y
> > >>>>>>>>>>>>>>>>>>>>>>>>>> in
> > >>>>>>>>>>>>>>>>>>>>>>>>>>> a rules queue matched,
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>> After logical rule is applied, there is a
> logical
> > >>>> rel
> > >>>>>>>>>>>>> Rel
> > >>>>>>>>>>>>>>> Y'
> > >>>>>>>>>>>>>>>>> and
> > >>>>>>>>>>>>>>>>>>>>>>>>>> containing
> > >>>>>>>>>>>>>>>>>>>>>>>>>>> it set moves into Exploring state (since
> there
> > >> is a
> > >>>>>>>>>>>>> logical
> > >>>>>>>>>>>>>>>>> node
> > >>>>>>>>>>>>>>>>>> which
> > >>>>>>>>>>>>>>>>>>>>>>>>>> has
> > >>>>>>>>>>>>>>>>>>>>>>>>>>> to be implemented)
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>> After whole process Exploring -> ... ->
> Optimized
> > >>>> we
> > >>>>>>>>>>>>> need
> > >>>>>>>>>>>>>>> to
> > >>>>>>>>>>>>>>>>>> force parent
> > >>>>>>>>>>>>>>>>>>>>>>>>>>> Rel X set to switch its state from Optimized
> to
> > >>>>>>>>>>>>> Optimizing
> > >>>>>>>>>>>>>>> to
> > >>>>>>>>>>>>>>>>>> peek the
> > >>>>>>>>>>>>>>>>>>>>>>>>>>> newly implemented child and (possible)
> produce a
> > >>>> new
> > >>>>>>>>>>>>> Rel X'
> > >>>>>>>>>>>>>>>>>> physical rel
> > >>>>>>>>>>>>>>>>>>>>>>>>>> on
> > >>>>>>>>>>>>>>>>>>>>>>>>>>> the basis of previously requested from the
> Rel X
> > >>>> set
> > >>>>>>>>>>>>>>> traits.
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>> If a new Rel X' is produced, a Rel X' parent
> > >> should
> > >>>>>>>>>>>>> move
> > >>>>>>>>>>>>>>> its
> > >>>>>>>>>>>>>>>>>> state
> > >>>>>>>>>>>>>>>>>>>>>>>>>>> Optimized -> Optimizing and repeat described
> > >> above
> > >>>>>>>>>>>>>>> operations.
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>> Does it look like true?
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>>>>>>>>>>>>>>>> Igor
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>> вс, 19 апр. 2020 г., 6:52 Haisheng Yuan <
> > >>>>>>>>>>>>>>>>> h.y...@alibaba-inc.com
> > >>>>>>>>>>>>>>>>>> <mailto:h.y...@alibaba-inc.com>>:
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi,
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> In the past few months, we have discussed a
> lot
> > >>>> about
> > >>>>>>>>>>>>>>> Cascades
> > >>>>>>>>>>>>>>>>>> style
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> top-down optimization, including on-demand
> trait
> > >>>>>>>>>>>>>>>>>> derivation/request,
> > >>>>>>>>>>>>>>>>>>>>>>>>>> rule
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> apply, branch and bound space pruning. Now
> we
> > >>>> think
> > >>>>>>>>>>>>> it is
> > >>>>>>>>>>>>>>> time
> > >>>>>>>>>>>>>>>>>> to move
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> towards these targets.
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> We will separate it into several small
> issues,
> > >> and
> > >>>>>>>>>>>>> each
> > >>>>>>>>>>>>>>> one
> > >>>>>>>>>>>>>>>>> can
> > >>>>>>>>>>>>>>>>>> be
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> integrated as a standalone, independent
> feature,
> > >>>> and
> > >>>>>>>>>>>>> most
> > >>>>>>>>>>>>>>>>>> importantly,
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> meanwhile keep backward compatibility.
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> 1. Top-down trait request
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> In other words, pass traits requirements
> from
> > >>>> parent
> > >>>>>>>>>>>>>>> nodes to
> > >>>>>>>>>>>>>>>>>> child
> > >>>>>>>>>>>>>>>>>>>>>>>>>> nodes.
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> The trait requests happens after all the
> logical
> > >>>>>>>>>>>>>>>>> transformation
> > >>>>>>>>>>>>>>>>>> rules
> > >>>>>>>>>>>>>>>>>>>>>>>>>> and
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> physical implementation rules are done, in a
> > >>>> top-down
> > >>>>>>>>>>>>>>> manner,
> > >>>>>>>>>>>>>>>>>> driven
> > >>>>>>>>>>>>>>>>>>>>>>>>>> from
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> root set. e.g.:
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> SELECT a, sum(c) FROM
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> (SELECT * FROM R JOIN S USING (a, b)) t
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> GROUP BY a;
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Suppose we have the following plan tree in
> the
> > >>>> MEMO,
> > >>>>>>>>>>>>> and
> > >>>>>>>>>>>>>>> let's
> > >>>>>>>>>>>>>>>>>> only
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> consider distribution for simplicity, each
> group
> > >>>>>>>>>>>>>>> represents a
> > >>>>>>>>>>>>>>>>>> RelSet
> > >>>>>>>>>>>>>>>>>>>>>>>>>> in the
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> MEMO.
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Group 1:     Agg on [a]
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Group 2:           +-- MergeJoin on [a, b]
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Group 3:                       |---
> TableScan R
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Group 4:                       +---
> TableScan S
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> | Group No | Operator    | Derived Traits |
> > >>>> Required
> > >>>>>>>>>>>>>>> Traits |
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> | ----------- | ------------- |
> ---------------
> > >> |
> > >>>>>>>>>>>>>>>>>> --------------- |
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> | Group 1  | Aggregate    | Hash[a]
>  |
> > >> N/A
> > >>>>>>>>>>>>>>>>>> |
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> | Group 2  | MergeJoin    | Hash[a,b]      |
> > >>>> Hash[a]
> > >>>>>>>>>>>>>>>>> |
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> | Group 3  | TableScan R | None            |
> > >>>> Hash[a,b]
> > >>>>>>>>>>>>>>> |
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> | Group 4  | TableScan S | None            |
> > >>>> Hash[a,b]
> > >>>>>>>>>>>>>>> |
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> We will add new interface PhysicalNode (or
> > >>>> extending
> > >>>>>>>>>>>>>>> RelNode)
> > >>>>>>>>>>>>>>>>>> with
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> methods:
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> - Pair<RelTraitSet,List<RelTraitSet>>
> > >>>>>>>>>>>>>>>>> requireTraits(RelTraitSet
> > >>>>>>>>>>>>>>>>>>>>>>>>>> required);
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> pair.left is the current operator's new
> > >> traitset,
> > >>>>>>>>>>>>>>> pair.right
> > >>>>>>>>>>>>>>>>> is
> > >>>>>>>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>>>>>>>>>>> list
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> of children's required traitset.
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> - RelNode passThrough(RelTraitSet required);
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Default implementation will call above
> method
> > >>>>>>>>>>>>>>> requireTraits()
> > >>>>>>>>>>>>>>>>>> and
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> RelNode.copy() to create new RelNode.
> Available
> > >>>> to be
> > >>>>>>>>>>>>>>>>> overriden
> > >>>>>>>>>>>>>>>>>> for
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> physical operators to customize the logic.
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> The planner will call above method on
> MergeJoin
> > >>>>>>>>>>>>> operator
> > >>>>>>>>>>>>>>> to
> > >>>>>>>>>>>>>>>>>> pass the
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> required traits (Hash[a]) to Mergejoin's
> child
> > >>>>>>>>>>>>> operators.
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> We will get a new MergeJoin:
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> MergeJoin hash[a]
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>   |---- TableScan R hash[a] (RelSubset)
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>   +---- TableScan S hash[a] (RelSubset)
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Now the MEMO group looks like:
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> | Group No | Operator    | Derived Traits
> > >> |
> > >>>>>>>>>>>>> Required
> > >>>>>>>>>>>>>>>>>> Traits
> > >>>>>>>>>>>>>>>>>>>>>>>>>> |
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> | ---------- | -------- ----- |
> > >>>> -------------------- |
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> --------------------- |
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> | Group1   | Aggregate   | Hash[a]
> > >>>> |
> > >>>>>>>>>>>>> N/A
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> |
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> | Group2   | MergeJoin   | Hash[a,b],
> Hash[a]|
> > >>>> Hash[a]
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> |
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> | Group3   | TableScan R | None
> > >>>>  |
> > >>>>>>>>>>>>>>>>> Hash[a,b],
> > >>>>>>>>>>>>>>>>>>>>>>>>>> Hash[a]
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> |
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> | Group4   | TableScan S | None
> > >>>>  |
> > >>>>>>>>>>>>>>>>> Hash[a,b],
> > >>>>>>>>>>>>>>>>>>>>>>>>>> Hash[a]
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> |
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Calcite user may choose to ignore / not
> > >> implement
> > >>>> the
> > >>>>>>>>>>>>>>>>> interface
> > >>>>>>>>>>>>>>>>>> to keep
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> the original behavior. Each physical
> operator,
> > >>>>>>>>>>>>> according
> > >>>>>>>>>>>>>>> to
> > >>>>>>>>>>>>>>>>> its
> > >>>>>>>>>>>>>>>>>> own
> > >>>>>>>>>>>>>>>>>>>>>>>>>> logic,
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> decides whether passThrough() should pass
> traits
> > >>>> down
> > >>>>>>>>>>>>> or
> > >>>>>>>>>>>>>>> not
> > >>>>>>>>>>>>>>>>> by
> > >>>>>>>>>>>>>>>>>>>>>>>>>> returning:
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> - a non-null RelNode, which means it can
> pass
> > >> down
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> - null object, which means can't pass down
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> 2. Provide option to disable
> AbstractConverter
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Once the plan can request traits in
> top-down way
> > >>>> in
> > >>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>>> framework, many
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> system don't need AbstractConverter anymore,
> > >>>> since it
> > >>>>>>>>>>>>> is
> > >>>>>>>>>>>>>>> just
> > >>>>>>>>>>>>>>>>> a
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> intermediate operator to generate physical
> sort
> > >> /
> > >>>>>>>>>>>>>>> exchange.
> > >>>>>>>>>>>>>>>>> For
> > >>>>>>>>>>>>>>>>>> those,
> > >>>>>>>>>>>>>>>>>>>>>>>>>> we
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> can provide option to disable
> AbstractConverter,
> > >>>>>>>>>>>>> generate
> > >>>>>>>>>>>>>>>>>> physical
> > >>>>>>>>>>>>>>>>>>>>>>>>>> enforcer
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> directly by adding a method to interface
> > >>>> Convention:
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> - RelNode enforce(RelNode node, RelTraitSet
> > >>>> traits);
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> The default implementation may just calling
> > >>>>>>>>>>>>>>>>>>>>>>>>>> changeTraitsUsingConverters(),
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> but people may choose to override it if the
> > >>>> system has
> > >>>>>>>>>>>>>>> special
> > >>>>>>>>>>>>>>>>>> needs,
> > >>>>>>>>>>>>>>>>>>>>>>>>>> like
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> several traits must implement together, or
> the
> > >>>>>>>>>>>>> position of
> > >>>>>>>>>>>>>>>>>> collation in
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> RelTraitSet is before distribution.
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> 3. Top-down driven, on-demand rule apply
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> For every RelNode in a RelSet, rule is
> matched
> > >> and
> > >>>>>>>>>>>>> applied
> > >>>>>>>>>>>>>>>>>>>>>>>>>> sequentially,
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> newly generated RelNodes are added to the
> end of
> > >>>>>>>>>>>>> RelNode
> > >>>>>>>>>>>>>>> list
> > >>>>>>>>>>>>>>>>>> in the
> > >>>>>>>>>>>>>>>>>>>>>>>>>> RelSet
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> waiting for rule apply. RuleQueue and
> > >>>>>>>>>>>>> DeferringRuleCall
> > >>>>>>>>>>>>>>> is not
> > >>>>>>>>>>>>>>>>>> needed
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> anymore. This will make space pruning and
> rule
> > >>>> mutual
> > >>>>>>>>>>>>>>>>>> exclusivity check
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> possible.
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> There are 3 stages for each RelSet:
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> a). Exploration: logical transformation,
> yields
> > >>>>>>>>>>>>> logical
> > >>>>>>>>>>>>>>> nodes
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> b). Implementation: physical transformation,
> > >>>> yields
> > >>>>>>>>>>>>>>> physical
> > >>>>>>>>>>>>>>>>>> nodes
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> c). Optimization: trait request, enforcement
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> The general process looks like:
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> - optimize RelSet X:
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> implement RelSet X
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> for each physical relnode in RelSet X:
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> // pass down trait requests to child RelSets
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> for each child RelSet Y of current relnode:
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>     optimize RelSet Y
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> - implement RelSet X:
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> if X has been implemented:
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> return
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> explore RelSet X
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> for each relnode in RelSet X:
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> apply physical rules
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> - explore RelSet X:
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> if X has been explored
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> return
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> for each relnode in RelSet X:
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> // ensure each child RelSet of current
> relnode
> > >> is
> > >>>>>>>>>>>>>>>>> explored
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> for each child RelSet Y of current relnode:
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>    explore RelSet Y
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> apply logical rules on current relnode
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Basically it is a state machine with several
> > >>>> states:
> > >>>>>>>>>>>>>>>>>> Initialized,
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Explored, Exploring, Implemented,
> Implementing,
> > >>>>>>>>>>>>> Optimized,
> > >>>>>>>>>>>>>>>>>> Optimizing
> > >>>>>>>>>>>>>>>>>>>>>>>>>> and
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> several transition methods: exploreRelSet,
> > >>>>>>>>>>>>> exploreRelNode,
> > >>>>>>>>>>>>>>>>>>>>>>>>>> implementRelSet,
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> implementRelNode, optimizeRelSet,
> > >>>> optimizeRelNode...
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> To achieve this, we need to mark the rules
> > >> either
> > >>>>>>>>>>>>> logical
> > >>>>>>>>>>>>>>> rule
> > >>>>>>>>>>>>>>>>>> or
> > >>>>>>>>>>>>>>>>>>>>>>>>>> physical
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> rule.
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> To keep backward compatibility, all the
> > >> un-marked
> > >>>>>>>>>>>>> rules
> > >>>>>>>>>>>>>>> will
> > >>>>>>>>>>>>>>>>> be
> > >>>>>>>>>>>>>>>>>>>>>>>>>> treated as
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> logical rules, except rules that uses
> > >>>>>>>>>>>>> AbstractConverter as
> > >>>>>>>>>>>>>>>>> rule
> > >>>>>>>>>>>>>>>>>>>>>>>>>> operand,
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> these rules still need to applied top-down,
> or
> > >>>> random
> > >>>>>>>>>>>>>>> order.
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> 4. On-demand, bottom-up trait derivation
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> It is called bottom-up, but actually driven
> by
> > >>>>>>>>>>>>> top-down,
> > >>>>>>>>>>>>>>>>>> happens same
> > >>>>>>>>>>>>>>>>>>>>>>>>>> time
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> as top-down trait request, in optimization
> stage
> > >>>>>>>>>>>>> mentioned
> > >>>>>>>>>>>>>>>>>> above. Many
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Calcite based bigdata system only propagate
> > >>>> traits on
> > >>>>>>>>>>>>>>> Project
> > >>>>>>>>>>>>>>>>>> and
> > >>>>>>>>>>>>>>>>>>>>>>>>>> Filter by
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> writing rules, which is very limited. In
> fact,
> > >> we
> > >>>> can
> > >>>>>>>>>>>>>>> extend
> > >>>>>>>>>>>>>>>>>> trait
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> propagation/derivation to all the operators,
> > >>>> without
> > >>>>>>>>>>>>>>> rules, by
> > >>>>>>>>>>>>>>>>>> adding
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> interface PhysicalNode (or extending
> RelNode)
> > >> with
> > >>>>>>>>>>>>> method:
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> - RelNode derive(RelTraitSet traits, int
> > >> childId);
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Given the following plan (only consider
> > >>>> distribution
> > >>>>>>>>>>>>> for
> > >>>>>>>>>>>>>>>>>> simplicity):
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Agg [a,b]
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> +-- MergeJoin [a]
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>          |---- TableScan R
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>          +--- TableScan S
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Hash[a] won't satisfy Hash[a,b] without
> special
> > >>>>>>>>>>>>> treatment,
> > >>>>>>>>>>>>>>>>>> because
> > >>>>>>>>>>>>>>>>>>>>>>>>>> there
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> isn't a mechanism to coordinate traits
> between
> > >>>>>>>>>>>>> children.
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Now we call derive method on Agg [a,b] node,
> > >>>>>>>>>>>>>>> derive(Hash[a],
> > >>>>>>>>>>>>>>>>>> 0), we get
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> the new node:
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Agg [a]
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> +-- MergeJoin [a] (RelSubset)
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> We will provide different matching type, so
> each
> > >>>>>>>>>>>>> operator
> > >>>>>>>>>>>>>>> can
> > >>>>>>>>>>>>>>>>>> specify
> > >>>>>>>>>>>>>>>>>>>>>>>>>> what
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> kind of matching type it requires its
> children:
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> - MatchType getMatchType(RelTrait trait, int
> > >>>> childId);
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> a) Exact: Hash[a,b] exact match Hash[a,b],
> aka,
> > >>>>>>>>>>>>> satisfy
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> b) Partial: Hash[a] partial match Hash[a,b]
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> c) Permuted: Sort[a,b,c] permuted match
> > >>>> Sort[c,b,a]
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> In addition, optimization order is provided
> for
> > >>>> each
> > >>>>>>>>>>>>>>> opertor
> > >>>>>>>>>>>>>>>>> to
> > >>>>>>>>>>>>>>>>>>>>>>>>>> specify:
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> a) left to right
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> b) right to left
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> c) both
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> For example, for query SELECT * FROM R join
> S on
> > >>>>>>>>>>>>> R.a=S.a
> > >>>>>>>>>>>>>>> and
> > >>>>>>>>>>>>>>>>>> R.b=S.b
> > >>>>>>>>>>>>>>>>>>>>>>>>>> and
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> R.c=S.c:
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Suppose R is distributed by a,b, and S is
> > >>>> distributed
> > >>>>>>>>>>>>> by
> > >>>>>>>>>>>>>>> c.
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> MergeJoin [a,b,c]
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> |--- TableScan R [a,b]
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> +-- TableScan S [c]
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> a) left to right, call derive(Hash[a,b],
> 0), we
> > >>>> get
> > >>>>>>>>>>>>>>> MergeJoin
> > >>>>>>>>>>>>>>>>>> [a,b]
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> b) right to left, call derive(Hash[c], 1),
> we
> > >> get
> > >>>>>>>>>>>>>>> MergeJoin
> > >>>>>>>>>>>>>>>>>> [c], most
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> likely a bad plan
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> c) both, get above 2 plans.
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> For system that doesn't have a fine-tuned
> stats
> > >>>> and
> > >>>>>>>>>>>>> cost
> > >>>>>>>>>>>>>>>>> model,
> > >>>>>>>>>>>>>>>>>> it may
> > >>>>>>>>>>>>>>>>>>>>>>>>>> not
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> be able to make a right decision based
> purely on
> > >>>> cost.
> > >>>>>>>>>>>>>>>>> Probably
> > >>>>>>>>>>>>>>>>>> we
> > >>>>>>>>>>>>>>>>>>>>>>>>>> need to
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> provide the MergeJoin with both children's
> > >> derived
> > >>>>>>>>>>>>>>> traitset
> > >>>>>>>>>>>>>>>>>> list.
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> - List<RelNode>
> derive(List<List<RelTraitSet>>);
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Of course, all above methods are optional to
> > >>>>>>>>>>>>> implement for
> > >>>>>>>>>>>>>>>>>> those who
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> doesn't need this feature.
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> 5. Branch and Bound Space Pruning
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> After we implement on-demand, top-down trait
> > >>>>>>>>>>>>> enforcement
> > >>>>>>>>>>>>>>> and
> > >>>>>>>>>>>>>>>>>>>>>>>>>> rule-apply,
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> we can pass the cost limit at the time of
> > >> passing
> > >>>> down
> > >>>>>>>>>>>>>>>>> required
> > >>>>>>>>>>>>>>>>>>>>>>>>>> traits, as
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> described in the classical Cascades paper.
> Right
> > >>>> now,
> > >>>>>>>>>>>>>>> Calcite
> > >>>>>>>>>>>>>>>>>> doesn't
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> provide group level logical properties,
> > >> including
> > >>>>>>>>>>>>> stats
> > >>>>>>>>>>>>>>> info,
> > >>>>>>>>>>>>>>>>>> each
> > >>>>>>>>>>>>>>>>>>>>>>>>>> operator
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> in the same group has its own logical
> property
> > >>>> and the
> > >>>>>>>>>>>>>>> stats
> > >>>>>>>>>>>>>>>>>> may vary,
> > >>>>>>>>>>>>>>>>>>>>>>>>>> so
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> we can only do limited space pruning for
> trait
> > >>>>>>>>>>>>>>> enforcement,
> > >>>>>>>>>>>>>>>>>> still
> > >>>>>>>>>>>>>>>>>>>>>>>>>> good. But
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> if we agree to add option to share group
> level
> > >>>> stats
> > >>>>>>>>>>>>>>> between
> > >>>>>>>>>>>>>>>>>> relnodes
> > >>>>>>>>>>>>>>>>>>>>>>>>>> in a
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> RelSet, we will be able to do more aggresive
> > >> space
> > >>>>>>>>>>>>>>> pruning,
> > >>>>>>>>>>>>>>>>>> which will
> > >>>>>>>>>>>>>>>>>>>>>>>>>> help
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> boost the performance of join reorder
> planning.
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> With all that being said, how do we move
> > >> forward?
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> There are 2 ways:
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> a) Modify on current VolcanoPlanner.
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Pros: code reuse, existing numerous test
> cases
> > >> and
> > >>>>>>>>>>>>>>>>>> infrastructure,
> > >>>>>>>>>>>>>>>>>>>>>>>>>> fast
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> integration
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Cons: changing code always brings risk
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> b) Add a new XXXXPlanner
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Pros: no risk, no tech debt, no need to
> worry
> > >>>> about
> > >>>>>>>>>>>>>>> backward
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> compatability
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Cons: separate test cases for new planner,
> one
> > >>>> more
> > >>>>>>>>>>>>>>> planner to
> > >>>>>>>>>>>>>>>>>>>>>>>>>> maintain
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> We'd like to hear the community's thoughts
> and
> > >>>>>>>>>>>>> advices.
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks.
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> - Haisheng
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>>
> > >>>
> > >>
> >
> >
>

Re: [DISCUSS] Towards Cascades Optimizer

Reply via email to