I created the pull request at https://github.com/apache/incubator-calcite/pull/139
I will take a look at it later today. Thanks, Jesús On 9/28/15, 7:57 AM, "Julian Hyde" <[email protected]> wrote: >I have completed work on this change. Can someone please review and +1? > >https://github.com/julianhyde/incubator-calcite/tree/828-rule-builder > >https://issues.apache.org/jira/browse/CALCITE-828 > >Julian > >> On Sep 25, 2015, at 1:07 PM, Julian Hyde <[email protected]> wrote: >> >> The problem of making rules extensible is not straightforward. The >>solution has to be a mixture of API changes and best practices. In this >>change I am making API changes and fixing up the existing rules to >>match, but I am not revisiting the rules to introduce best practices. >> >> The API change is that every RelOptRule has a relBuilderFactory. We >>strongly recommend that if a rule needs to create a relational >>expression it either matches the type of its inputs (calling the copy >>method) or it uses the factory. That means removing calls in rules to >>methods such as LogicalProject.create or RelOptUtil.project. I¹ve done >>quite a lot of this as part of this change. >> >> The best practice is that a rule¹s constructor should expose >>RelBuilderFactory and Class parameters so that the rule can be re-used. >>Quite a few rules do this, but a lot do not. Quite a few rules are >>re-usable in principle, but have not been tested against other RelNode >>sub-classes, and quite a few rules are just not re-usable. So I propose >>that we enforce best practices when reviewing contributions but don¹t >>spend a lot of effort upgrading the existing rules. >> >> In some cases (e.g. WindowedAggRelSplitter in ProjectToWindowRule) the >>rule uses the default factory but I¹ve changed the code to use a >>RelBuilder. So, there¹s no change in behavior, but if someone were to >>add a RelBuilderFactory to the rule¹s constructor, they would reap the >>benefit. >> >> Julian >> >> >> >>> On Sep 24, 2015, at 6:29 AM, Jesus Camacho Rodriguez >>><[email protected]> wrote: >>> >>> In our case, we had to do some copy-pasting of rules in the past in >>>Hive. >>> Although the reasons have been already discussed in the thread, just to >>> confirm: >>> - Rules that use a default factory to create operators e.g. >>> LogicalProject, instead of being able to provide our own factory. An >>> example is AggregateProjectMergeRule. (Instead of copy/pasting the >>>rule, >>> we could apply another rule to transform LogicalProject into >>>HiveProject, >>> but this is not the right way either). >>> - Rules that match directly on Logical instances of the operators. An >>> example is SortProjectTransposeRule. >>> >>> I see that with the solutions that we are discussing, those two >>>problems >>> would be solved. >>> >>> About including the matching classes in RelBuilder, it can be a good >>>idea >>> if we still make the rules extensible i.e. we can provide our own >>>matching >>> classes if we want to. Assume a project that might subclass the Project >>> operator into two different operators, then would like to match a >>>certain >>> rule only to one of them. Further, I guess using the copy methods >>>provided >>> by the operators when possible, instead of calling directly to the >>> Factory, would make the rules more extensible for these cases. >>> >>> -- >>> Jesús >>> >>> >>> On 9/23/15, 11:08 PM, "Jinfeng Ni" <[email protected]> wrote: >>> >>>> I did a quick check of Drill side rule, and did not find the >>>>copy/paste >>>> code >>>> simply because we have to specify the rule's operands. >>>> >>>> The reason I'm asking this is because many Calcite rules use core >>>> Filter.class, >>>> Project.class etc. Currently, Drill simply uses some of Calcite rules >>>> directly. >>>> However, that means the rule will fire against both LogicalFilter, and >>>> DrillFilter, >>>> which seems to be redundant. I understand we could extend Calcite >>>>rule to >>>> make sure it matches only one of them. But I'm looking for a bit more >>>> direct way >>>> to do that. >>>> >>>> For instance, in [2] we uses Calcite's FilterMergeRule directly, which >>>> would match >>>> both LogicalFilter and DrillFilter. >>>> >>>> Also, I feel the rule patten specification can be regarded as the >>>> input type for the >>>> rule, while RelBuilder is to address the output type from the rule's >>>> firing. That's why >>>> I feel they had better be specified together in some way. >>>> >>>> >>>> [1] >>>> >>>>https://github.com/apache/incubator-calcite/blob/master/core/src/main/j >>>>ava >>>> /org/apache/calcite/rel/rules/FilterMergeRule.java#L46 >>>> >>>> [2] >>>> >>>>https://github.com/apache/drill/blob/master/exec/java-exec/src/main/jav >>>>a/o >>>> rg/apache/drill/exec/planner/logical/DrillRuleSets.java#L121 >>>> >>>> On Wed, Sep 23, 2015 at 2:17 PM, Jacques Nadeau <[email protected]> >>>> wrote: >>>>> I think this would be a good place to have a little bit more design, >>>>> discussion and validation. Julian, it seems like you've done a lot of >>>>> thinking and it would be good to have a better understanding of the >>>>> choices >>>>> you're making. (Your response below seems to be just scratching the >>>>> surface). >>>>> >>>>> If the goal is to minimize copy and paste rules, it would be useful >>>>>to >>>>> test >>>>> the proposal against example rules that have been copy/pasted and see >>>>> what >>>>> other patterns might emerge. At first glance, I'm inclined more with >>>>> Jinfeng's proposal regarding operands than yours. Your decision about >>>>> copy/not behavior should also be discussed further. >>>>> >>>>> Drill (Jinfeng?) and Hive (Jesus?) guys, can you provide an example >>>>>set >>>>> of >>>>> rules that we've needed to copy/paste due to constraints around rule >>>>> configuration to make sure the outcome of these changes is as >>>>>desired? >>>>> >>>>> >>>>> >>>>> On Wed, Sep 23, 2015 at 1:28 PM, Julian Hyde <[email protected]> >>>>>wrote: >>>>> >>>>>> Great question. >>>>>> >>>>>> For that case, I think you should carry on making a separate rule >>>>>> instance with operands that contain classes. RelBuilder is a good >>>>>> single point to put a lot of the configuration, but it is taking it >>>>>> too far to put the target classes in there. I don't want its role as >>>>>> "point for configuration for rules" to overwhelm its main role "an >>>>>> easy way to build relational algebra expressions". >>>>>> >>>>>> And besides, operands are powerful and descriptive, and because the >>>>>> engine understands them it can short-cut which rules are fired. >>>>>> >>>>>> For what it's worth, there's another piece of configuration that I >>>>>> considered putting into RelBuilder but decided not to. That is, >>>>>> whether a rule instance should call RelNode.copy to automatically >>>>>> create RelNodes the same type as its inputs. See I introduced >>>>>> copyFilter and copyProject arguments in FilterProjectTransposeRule: >>>>>> >>>>>> >>>>>> >>>>>> >>>>>>https://github.com/julianhyde/incubator-calcite/blob/8099cf3151a462b0 >>>>>>6a4 >>>>>> >>>>>>cbb1c0cad8f7e90881ca5/core/src/main/java/org/apache/calcite/rel/rules >>>>>>/Fi >>>>>> lterProjectTransposeRule.java#L67 >>>>>> >>>>>> rather than introduce new methods RelBuilder.shouldCopyProject() and >>>>>> .shouldCopyFilter(). >>>>>> >>>>>> And by the way I also considered adding methods to RelBuilderFactory >>>>>> (formerly known as ProtoRelBuilder). Adding methods there would not >>>>>> "pollute" RelBuilder as badly. >>>>>> >>>>>> Julian >>>>>> >>>>>> >>>>>> On Wed, Sep 23, 2015 at 12:18 PM, Jinfeng Ni <[email protected]> >>>>>> wrote: >>>>>>> +1. >>>>>>> >>>>>>> I have one question. RelBuilder seems to be used in onMatch() >>>>>> method, >>>>>>> when a particular type of RelNode has to be created. What about >>>>>>> RelOptRuleOperand in the constructor of rule? Let's say if we want >>>>>>>to >>>>>>> rule to match only DrillProject, or HiveFilter. The current way >>>>>> seems to >>>>>>> be that we extend the rule by providing a different >>>>>> RelOptRuleOperand. >>>>>>> >>>>>>> Can RelBuilder also have getFilterClass(), getProjectClass() etc, >>>>>> which >>>>>>> would be used to specify the rule patten matching? >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Tue, Sep 22, 2015 at 6:41 PM, Jacques Nadeau >>>>>>><[email protected]> >>>>>> wrote: >>>>>>>> Agree on the utility of this. +1. >>>>>>>> >>>>>>>> On Tue, Sep 22, 2015 at 6:22 PM, John Pullokkaran < >>>>>>>> [email protected]> wrote: >>>>>>>> >>>>>>>>> This is useful, though Hive will have to do some refactoring. >>>>>>>>> >>>>>>>>> In past Hive had to copy bunch of rules to work around these >>>>>> issues. >>>>>>>>> Rules & relnodes for org.apache.calcite.rel.logical.* is a good >>>>>> example of >>>>>>>>> this. >>>>>>>>> >>>>>>>>> >>>>>>>>> John >>>>>>>>> >>>>>>>>> On 9/22/15, 5:14 PM, "Julian Hyde" <[email protected]> wrote: >>>>>>>>> >>>>>>>>>> I have started work on >>>>>>>>>> https://issues.apache.org/jira/browse/CALCITE-828 and my >>>>>>>>>> work-in-progress is in >>>>>>>>>> >>>>>> >>>>>> >>>>>>https://github.com/julianhyde/incubator-calcite/commits/828-rule-buil >>>>>>der >>>>>> . >>>>>>>>>> >>>>>>>>>> This will be a big change to teams such as Hive and Drill. Now >>>>>> would >>>>>>>>>> be a good time to chime in. >>>>>>>>>> >>>>>>>>>> The plan is that every rule instance has a ProtoRelBuilder field >>>>>> from >>>>>>>>>> which a RelBuilder can be created to be used in an onMatch >>>>>> method. The >>>>>>>>>> RelBuilder contains factories for every kind of RelNode, so we >>>>>> won't >>>>>>>>>> have to keep plumbing new factories in to existing rules. >>>>>>>>>> >>>>>>>>>> There will be other advantages. RelBuilder is an easier and more >>>>>>>>>> concise way of creating relational expressions. It does some >>>>>> useful >>>>>>>>>> stuff for free, like flattening ANDs and eliminating identity >>>>>>>>>> projections. I'm seeing the volume of code decrease and a few >>>>>> minor >>>>>>>>>> plan improvements. >>>>>>>>>> >>>>>>>>>> I haven't yet found a good way to deal with the pattern where >>>>>>>>>>if, >>>>>> say, >>>>>>>>>> ProjectFactory is null the rule is to use Project.copy to >>>>>>>>>>create a >>>>>>>>>> Project of the same type. >>>>>>>>>> >>>>>>>>>> I'm keeping & deprecating the old rule constructors that take a >>>>>>>>>> variety of XxxFactory arguments. >>>>>>>>>> >>>>>>>>>> As always, we try to keep API compatibility and mark the old API >>>>>>>>>> >>>>>>>>>> @Deprecated // to be removed before 2.0 >>>>>>>>>> >>>>>>>>>> but the deprecated APIs are no longer tested and are likely to >>>>>>>>>>be >>>>>>>>>> flaky. We strongly suggest that you fix calls to deprecated APIs >>>>>> as >>>>>>>>>> soon as you upgrade to a more recent version of Calcite. >>>>>>>>>> >>>>>>>>>> Julian >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>> >>>> >>> >> > >
