Yes, I have already appreciated the value and flexibility that predefined rules provide, and would not want to lose it :) I didn't look into the query metadata much before, but that looks promising! Thanks!
On Thu, Nov 8, 2018 at 11:05 PM Julian Hyde <[email protected]> wrote: > For validation (i.e. that doesn’t modify the tree), I would use a visitor. > RelVisitor may suffice. > > There are also a few “whole tree” transformations, e.g. column pruning. > Use sparingly. > > You are correct that rules and their operands do not “scale” to match > large sections of the tree. We could in principle extend matching a little > (e.g. better handing of Union with many inputs) but the locality is mostly > a good thing. In a Volcano graph, there are multiple nodes in each > equivalence set, therefore huge numbers of paths through the graph. Deep > matches would quickly become intractable. > > I strongly recommend using traits, and in particular predicates > (RelMdPredicates / RelOptPredicateList). Let’s suppose you want to know > whether a particular input column is always equal to 5. You could write a > rule that looks for a Project several layers down whose expression is the > literal 5. But much better is to look at the predicates. Predicates are > propagated up the tree, which means you don’t need to look at the > structure, and you can reason and act locally. > > Similar arguments apply for sort and distribution (which are also traits). > > If are able to package your logic into a RelOptRule you will be pleased > with the results. It composes beautifully and efficiently with the hundreds > of other rules, and with all the flavors of metadata. > > Julian > > > > On Nov 8, 2018, at 12:50 PM, Андрей Цвелодуб <[email protected]> > wrote: > > > > Hello everyone! > > > > I have a question that I can't find an answer to, so maybe someone could > > help me. > > As a part of Rel Rules, there is always an operand, that matches a part > of > > the tree, and says if the rule should be executed. > > The operand can be complex, so I can say for example - match an Aggregate > > on top of Project on top of Filter. AFAIU, this operand will only match > if > > exactly this three nodes will be somewhere in the tree. > > But here is my question - what if I want a rule that will match a more > > generic structure, like this > >> Aggregate > >> -... > >> --* any number of any nodes in any levels > >> ---... > >> ---- Project > > Is there an official way to do that? > > > > My first approach was to match any Aggregate and then try to inspect the > > underlying tree in matches()/onMatch(), but this turned out to be quite > > unreliable since it involves inspecting RelSubsets (and this shouldn't be > > done, as follows from > > > https://lists.apache.org/thread.html/ee2349272e9d344228595c0940820b2fc525cc6115388c48e99495a6@%3Cdev.calcite.apache.org%3E > ). > > > > > > In case I'm doing it all wrong, I can formulate my question even broader > - > > is there a mechanism to perform validation of the execution tree during > the > > planning process, i.e. skip some plans as unimplementable based on their > > internal structure. As an example imagine I want to say that in > > JdbcConvention, all plans that have a Filter node, over a Project node > that > > has more than three fields, should not be implemented. (Modifying cost > > calculation is also not an option since the plan still has RelSubsets) > > > > I hope this makes sense, and thanks in advance! > > > > Best Regards, > > Andrew Tsvielodub > >
