For validation (i.e. that doesn’t modify the tree), I would use a visitor. 
RelVisitor may suffice.

There are also a few “whole tree” transformations, e.g. column pruning. Use 
sparingly.

You are correct that rules and their operands do not “scale” to match large 
sections of the tree. We could in principle extend matching a little (e.g. 
better handing of Union with many inputs) but the locality is mostly a good 
thing. In a Volcano graph, there are multiple nodes in each equivalence set, 
therefore huge numbers of paths through the graph. Deep matches would quickly 
become intractable.

I strongly recommend using traits, and in particular predicates 
(RelMdPredicates / RelOptPredicateList). Let’s suppose you want to know whether 
a particular input column is always equal to 5. You could write a rule that 
looks for a Project several layers down whose expression is the literal 5. But 
much better is to look at the predicates. Predicates are propagated up the 
tree, which means you don’t need to look at the structure, and you can reason 
and act locally.

Similar arguments apply for sort and distribution (which are also traits).

If are able to package your logic into a RelOptRule you will be pleased with 
the results. It composes beautifully and efficiently with the hundreds of other 
rules, and with all the flavors of metadata.

Julian


> On Nov 8, 2018, at 12:50 PM, Андрей Цвелодуб <[email protected]> wrote:
> 
> Hello everyone!
> 
> I have a question that I can't find an answer to, so maybe someone could
> help me.
> As a part of Rel Rules, there is always an operand, that matches a part of
> the tree, and says if the rule should be executed.
> The operand can be complex, so I can say for example - match an Aggregate
> on top of Project on top of Filter. AFAIU, this operand will only match if
> exactly this three nodes will be somewhere in the tree.
> But here is my question - what if I want a rule that will match a more
> generic structure, like this
>> Aggregate
>> -...
>> --* any number of any nodes in any levels
>> ---...
>> ----  Project
> Is there an official way to do that?
> 
> My first approach was to match any Aggregate and then try to inspect the
> underlying tree in matches()/onMatch(), but this turned out to be quite
> unreliable since it involves inspecting RelSubsets (and this shouldn't be
> done, as follows from
> https://lists.apache.org/thread.html/ee2349272e9d344228595c0940820b2fc525cc6115388c48e99495a6@%3Cdev.calcite.apache.org%3E).
> 
> 
> In case I'm doing it all wrong, I can formulate my question even broader -
> is there a mechanism to perform validation of the execution tree during the
> planning process, i.e. skip some plans as unimplementable based on their
> internal structure. As an example imagine I want to say that in
> JdbcConvention, all plans that have a Filter node, over a Project node that
> has more than three fields, should not be implemented. (Modifying cost
> calculation is also not an option since the plan still has RelSubsets)
> 
> I hope this makes sense, and thanks in advance!
> 
> Best Regards,
> Andrew Tsvielodub

Reply via email to