Hi Stamatis, The problem that the presented reproducer is a very simplified version of what is actually needed. In reality, there are several distribution types - PARTITIONED, REPLICATED, SINGLETON. To make things worse, PARTITIONED may or may not have concrete distribution fields. In theory, I can create one transformation per distribution type, but that would increase plan space significantly. In my sample "Root <- Project <- Scan" plan, there is no room for optimization at all, we only need to convert the nodes and propagate traits. But if I follow the proposed approach, the planner would create three equivalent paths. For complex queries, this may increase optimization time significantly.
What I need instead is to gradually convert and optimize nodes *bottom-up*, instead of top-bottom. That is, create a physical scan first, then create a physical project on top of it, etc. But at the same time, some rules still require the top-bottom approach. So essentially I need the optimizer to do both. Abstract converters help me establish bottom-up preparation but do this at the cost of considering too many trait pairs, and as a result, also pollute the search space. To the contrast, precise command "I transformed the node A to node B, please re-trigger the rules for A's parents" would allow us to re-trigger only required rules, without adding more nodes. Does it make sense? Regards, Vladimir. ср, 30 окт. 2019 г. в 21:02, Stamatis Zampetakis <[email protected]>: > I admit that I didn't thoroughly read the exchanges so far but forcing the > rules trigger does not seem the right approach. > > Other than that I would like to backtrack a bit to point 4.3 raised by > Vladimir. > > "ProjectPhysicalRule [6] - transforms logical project to physical > project *ONLY* if there is an underlying physical input with REPLICATED or > SINGLETON distribution" > > The rule could be modified to do the following two transformations: > 1. Create a physical project and require the input to be REPLICATED. > 2. Create a physical project and require > the input to be SINGLETON. > > I would assume that afterwards when your scan rule fires it should go to > the appropriate subset and give you back the desired plan. Is there a > problem with this approach? > > Best, > Stamatis > > On Wed, Oct 30, 2019, 5:52 PM Seliverstov Igor <[email protected]> > wrote: > > > Unfortunately it requires package-private API usage. > > > > Best solution there if Calcite provide it for end users. > > > > ср, 30 окт. 2019 г., 18:48 Vladimir Ozerov <[email protected]>: > > > > > “e pension” == “expand conversion” :) > > > > > > ср, 30 окт. 2019 г. в 18:46, Vladimir Ozerov <[email protected]>: > > > > > > > Yes, that may work. Even if e pension rule is used, for the most > cases > > it > > > > will not trigger any real conversions, since we are moving from > > abstract > > > > convention to physical, and created converters will have the opposite > > > trait > > > > direction (from physical to abstract). > > > > > > > > But again - ideally we only need to re-trigger the rules for a > specific > > > > node, no more than that. So API support like > > > > “VolcanoPlanner.forceRules(RelNode)” would be very convenient. > > > > > > > > What do you think? > > > > > > > > ср, 30 окт. 2019 г. в 17:56, Seliverstov Igor <[email protected] > >: > > > > > > > >> I considered manual rules calling too, for now we use abstract > > > converters > > > >> + > > > >> ExpandConversionRule for exchanges producing. > > > >> > > > >> You may create such converters manually (checking appropriate > subset) > > > this > > > >> case you may reduce created converters count, also, a converter is a > > > quite > > > >> special node, that does almost nothing (without corresponding rule) > it > > > may > > > >> be used just as a rule trigger. > > > >> > > > >> Regards, > > > >> Igor > > > >> > > > >> ср, 30 окт. 2019 г., 17:31 Vladimir Ozerov <[email protected]>: > > > >> > > > >> > One funny hack which helped me is manual registration of a fake > > > RelNode > > > >> > with desired traits through VolcanoPlanner.register() method. But > > > again, > > > >> > this leads to trashing. What could really help is a call to > > > >> > VolcanoPlanner.fireRules() with desired rel. But this doesn't work > > out > > > >> of > > > >> > the box since some internals of the rule queue needs to be > adjusted. > > > >> > > > > >> > What does the community think about adding a method which will > > re-add > > > >> rules > > > >> > applicable to the specific RelNode to the rule queue? > > > >> > > > > >> > ср, 30 окт. 2019 г. в 17:00, Vladimir Ozerov <[email protected] > >: > > > >> > > > > >> > > Hi Igor, > > > >> > > > > > >> > > Yes, I came to the same conclusion, thank you. This is how it > > > >> basically > > > >> > > happens when converters are disabled: > > > >> > > 1) We start with initial tree: [LogicalProject] <- [LogicalScan] > > > >> > > 2) Then we convert LogicalScan to PhysicalScan, so it is added > to > > > the > > > >> > > set: [LogicalProject] <- [LogicalScan, PhysicalScan] > > > >> > > 3) Finally, when it is time to fire a rule for PhysicalScan, we > > try > > > to > > > >> > get > > > >> > > parents of that scan set with traits of the PhysicalScan. Since > > > there > > > >> are > > > >> > > no such parents (we skipped it intentionally), the rule is not > > > queued. > > > >> > > > > > >> > > But when converters are enabled, a converter rel is created: > > > >> > [LogicalProject] > > > >> > > <- [LogicalScan, PhysicalScan, ConverterFromPhysicalToLogical]. > No > > > >> rules > > > >> > > are fired for PhysicalScan again, but they are fired for > converter > > > >> since > > > >> > > it has the necessary LOGICAL trait. > > > >> > > > > > >> > > It makes sense, that converters essentially allow forcing rule > > > >> invocation > > > >> > > on parents, even if the child was created with different traits. > > But > > > >> it > > > >> > > seems that this mechanism may be too heavy for complex queries > > > >> because it > > > >> > > literally creates hundreds of new converter rels and triggers > > rules > > > >> over > > > >> > > and over again. > > > >> > > > > > >> > > We need some fine-grained alternative. Basically, what would be > > > enough > > > >> > for > > > >> > > me is to let the planner know somehow: "I created that rel, and > I > > > want > > > >> > you > > > >> > > to execute parent rules not only using its trait but also on > this > > > and > > > >> > those > > > >> > > traits." > > > >> > > Is there any API in Calcite which allows doing this without > > > creating a > > > >> > new > > > >> > > rel node? > > > >> > > > > > >> > > Regards, > > > >> > > Vladimir. > > > >> > > > > > >> > > > > > >> > > ср, 30 окт. 2019 г. в 09:25, Seliverstov Igor < > > [email protected] > > > >: > > > >> > > > > > >> > >> Vladimir, > > > >> > >> > > > >> > >> Probably it'll help you: > > > >> > >> > > > >> > >> Seems the cause of issue in RelSubset.getParentRels() The > check > > > used > > > >> > when > > > >> > >> the planner schedules newly matched rules after successful > > > >> > transformation > > > >> > >> (see VolcanoRuleCall.matchRecurse), it prevents the parent rule > > be > > > >> > applied > > > >> > >> once again (here your logical project with an input having ANY > > > >> > >> distribution > > > >> > >> doesn't satisfy a transformed input traits). > > > >> > >> > > > >> > >> In our case we use another workaround, so there are also much > > more > > > >> > >> transformations than we wanted, so the desired rule is > triggered. > > > >> > >> > > > >> > >> > > > >> > >> вт, 29 окт. 2019 г., 14:46 Vladimir Ozerov <[email protected] > >: > > > >> > >> > > > >> > >> > Hi Vladimir, > > > >> > >> > > > > >> > >> > I am sorry. Pushed, it works now. > > > >> > >> > > > > >> > >> > вт, 29 окт. 2019 г. в 14:41, Vladimir Sitnikov < > > > >> > >> > [email protected] > > > >> > >> > >: > > > >> > >> > > > > >> > >> > > > mvn clean test > > > >> > >> > > > > > >> > >> > > [ERROR] The goal you specified requires a project to > execute > > > but > > > >> > >> there is > > > >> > >> > > no POM in this directory > > > >> > >> > > > > > >> > >> > > Vladimir, please push missing files > > > >> > >> > > > > > >> > >> > > Vladimir > > > >> > >> > > > > > >> > >> > > > > >> > >> > > > >> > > > > > >> > > > > >> > > > > > > > > > >
