Hi Stamatis,

The problem that the presented reproducer is a very simplified version of
what is actually needed. In reality, there are several distribution types -
PARTITIONED, REPLICATED, SINGLETON. To make things worse, PARTITIONED may
or may not have concrete distribution fields. In theory, I can create one
transformation per distribution type, but that would increase plan space
significantly. In my sample "Root <- Project <- Scan" plan, there is no
room for optimization at all, we only need to convert the nodes and
propagate traits. But if I follow the proposed approach, the planner would
create three equivalent paths. For complex queries, this may increase
optimization time significantly.

What I need instead is to gradually convert and optimize nodes *bottom-up*,
instead of top-bottom. That is, create a physical scan first, then create a
physical project on top of it, etc. But at the same time, some rules still
require the top-bottom approach. So essentially I need the optimizer to do
both. Abstract converters help me establish bottom-up preparation but do
this at the cost of considering too many trait pairs, and as a result, also
pollute the search space.

To the contrast, precise command "I transformed the node A to node B,
please re-trigger the rules for A's parents" would allow us to re-trigger
only required rules, without adding more nodes.

Does it make sense?

Regards,
Vladimir.

ср, 30 окт. 2019 г. в 21:02, Stamatis Zampetakis <[email protected]>:

> I admit that I didn't thoroughly read the exchanges so far but forcing the
> rules trigger does not seem the right approach.
>
> Other than that I would like to backtrack a bit to point 4.3 raised by
> Vladimir.
>
> "ProjectPhysicalRule [6] - transforms logical project to physical
> project *ONLY* if there is an underlying physical input with REPLICATED or
> SINGLETON distribution"
>
> The rule could be modified to do the following two transformations:
> 1. Create a physical project and require the input to be REPLICATED.
> 2. Create a physical project and require
> the input to be SINGLETON.
>
> I would assume that afterwards when your scan rule fires it should go to
> the appropriate subset and give you back the desired plan. Is there a
> problem with this approach?
>
> Best,
> Stamatis
>
> On Wed, Oct 30, 2019, 5:52 PM Seliverstov Igor <[email protected]>
> wrote:
>
> > Unfortunately it requires package-private API usage.
> >
> > Best solution there if Calcite provide it for end users.
> >
> > ср, 30 окт. 2019 г., 18:48 Vladimir Ozerov <[email protected]>:
> >
> > > “e pension” == “expand conversion” :)
> > >
> > > ср, 30 окт. 2019 г. в 18:46, Vladimir Ozerov <[email protected]>:
> > >
> > > > Yes, that may work. Even if e pension rule is used, for the most
> cases
> > it
> > > > will not trigger any real conversions, since we are moving from
> > abstract
> > > > convention to physical, and created converters will have the opposite
> > > trait
> > > > direction (from physical to abstract).
> > > >
> > > > But again - ideally we only need to re-trigger the rules for a
> specific
> > > > node, no more than that. So API support like
> > > > “VolcanoPlanner.forceRules(RelNode)” would be very convenient.
> > > >
> > > > What do you think?
> > > >
> > > > ср, 30 окт. 2019 г. в 17:56, Seliverstov Igor <[email protected]
> >:
> > > >
> > > >> I considered manual rules calling too, for now we use abstract
> > > converters
> > > >> +
> > > >> ExpandConversionRule for exchanges producing.
> > > >>
> > > >> You may create such converters manually (checking appropriate
> subset)
> > > this
> > > >> case you may reduce created converters count, also, a converter is a
> > > quite
> > > >> special node, that does almost nothing (without corresponding rule)
> it
> > > may
> > > >> be used just as a rule trigger.
> > > >>
> > > >> Regards,
> > > >> Igor
> > > >>
> > > >> ср, 30 окт. 2019 г., 17:31 Vladimir Ozerov <[email protected]>:
> > > >>
> > > >> > One funny hack which helped me is manual registration of a fake
> > > RelNode
> > > >> > with desired traits through VolcanoPlanner.register() method. But
> > > again,
> > > >> > this leads to trashing. What could really help is a call to
> > > >> > VolcanoPlanner.fireRules() with desired rel. But this doesn't work
> > out
> > > >> of
> > > >> > the box since some internals of the rule queue needs to be
> adjusted.
> > > >> >
> > > >> > What does the community think about adding a method which will
> > re-add
> > > >> rules
> > > >> > applicable to the specific RelNode to the rule queue?
> > > >> >
> > > >> > ср, 30 окт. 2019 г. в 17:00, Vladimir Ozerov <[email protected]
> >:
> > > >> >
> > > >> > > Hi Igor,
> > > >> > >
> > > >> > > Yes, I came to the same conclusion, thank you. This is how it
> > > >> basically
> > > >> > > happens when converters are disabled:
> > > >> > > 1) We start with initial tree: [LogicalProject] <- [LogicalScan]
> > > >> > > 2) Then we convert LogicalScan to PhysicalScan, so it is added
> to
> > > the
> > > >> > > set: [LogicalProject] <- [LogicalScan, PhysicalScan]
> > > >> > > 3) Finally, when it is time to fire a rule for PhysicalScan, we
> > try
> > > to
> > > >> > get
> > > >> > > parents of that scan set with traits of the PhysicalScan. Since
> > > there
> > > >> are
> > > >> > > no such parents (we skipped it intentionally), the rule is not
> > > queued.
> > > >> > >
> > > >> > > But when converters are enabled, a converter rel is created:
> > > >> > [LogicalProject]
> > > >> > > <- [LogicalScan, PhysicalScan, ConverterFromPhysicalToLogical].
> No
> > > >> rules
> > > >> > > are fired for PhysicalScan again, but they are fired for
> converter
> > > >> since
> > > >> > > it has the necessary LOGICAL trait.
> > > >> > >
> > > >> > > It makes sense, that converters essentially allow forcing rule
> > > >> invocation
> > > >> > > on parents, even if the child was created with different traits.
> > But
> > > >> it
> > > >> > > seems that this mechanism may be too heavy for complex queries
> > > >> because it
> > > >> > > literally creates hundreds of new converter rels and triggers
> > rules
> > > >> over
> > > >> > > and over again.
> > > >> > >
> > > >> > > We need some fine-grained alternative. Basically, what would be
> > > enough
> > > >> > for
> > > >> > > me is to let the planner know somehow: "I created that rel, and
> I
> > > want
> > > >> > you
> > > >> > > to execute parent rules not only using its trait but also on
> this
> > > and
> > > >> > those
> > > >> > > traits."
> > > >> > > Is there any API in Calcite which allows doing this without
> > > creating a
> > > >> > new
> > > >> > > rel node?
> > > >> > >
> > > >> > > Regards,
> > > >> > > Vladimir.
> > > >> > >
> > > >> > >
> > > >> > > ср, 30 окт. 2019 г. в 09:25, Seliverstov Igor <
> > [email protected]
> > > >:
> > > >> > >
> > > >> > >> Vladimir,
> > > >> > >>
> > > >> > >> Probably it'll help you:
> > > >> > >>
> > > >> > >> Seems the cause of issue in RelSubset.getParentRels()  The
> check
> > > used
> > > >> > when
> > > >> > >> the planner schedules newly matched rules after successful
> > > >> > transformation
> > > >> > >> (see VolcanoRuleCall.matchRecurse), it prevents the parent rule
> > be
> > > >> > applied
> > > >> > >> once again (here your logical project with an input having ANY
> > > >> > >> distribution
> > > >> > >> doesn't satisfy a transformed input traits).
> > > >> > >>
> > > >> > >> In our case we use another workaround, so there are also much
> > more
> > > >> > >> transformations than we wanted, so the desired rule is
> triggered.
> > > >> > >>
> > > >> > >>
> > > >> > >> вт, 29 окт. 2019 г., 14:46 Vladimir Ozerov <[email protected]
> >:
> > > >> > >>
> > > >> > >> > Hi Vladimir,
> > > >> > >> >
> > > >> > >> > I am sorry. Pushed, it works now.
> > > >> > >> >
> > > >> > >> > вт, 29 окт. 2019 г. в 14:41, Vladimir Sitnikov <
> > > >> > >> > [email protected]
> > > >> > >> > >:
> > > >> > >> >
> > > >> > >> > > > mvn clean test
> > > >> > >> > >
> > > >> > >> > > [ERROR] The goal you specified requires a project to
> execute
> > > but
> > > >> > >> there is
> > > >> > >> > > no POM in this directory
> > > >> > >> > >
> > > >> > >> > > Vladimir, please push missing files
> > > >> > >> > >
> > > >> > >> > > Vladimir
> > > >> > >> > >
> > > >> > >> >
> > > >> > >>
> > > >> > >
> > > >> >
> > > >>
> > > >
> > >
> >
>

Reply via email to