I created the pull request at
https://github.com/apache/incubator-calcite/pull/139

I will take a look at it later today.

Thanks,
Jesús


On 9/28/15, 7:57 AM, "Julian Hyde" <[email protected]> wrote:

>I have completed work on this change. Can someone please review and +1?
>
>https://github.com/julianhyde/incubator-calcite/tree/828-rule-builder
>
>https://issues.apache.org/jira/browse/CALCITE-828
>
>Julian
>
>> On Sep 25, 2015, at 1:07 PM, Julian Hyde <[email protected]> wrote:
>> 
>> The problem of making rules extensible is not straightforward. The
>>solution has to be a mixture of API changes and best practices. In this
>>change I am making API changes and fixing up the existing rules to
>>match, but I am not revisiting the rules to introduce best practices.
>> 
>> The API change is that every RelOptRule has a relBuilderFactory. We
>>strongly recommend that if a rule needs to create a relational
>>expression it either matches the type of its inputs (calling the copy
>>method) or it uses the factory. That means removing calls in rules to
>>methods such as LogicalProject.create or RelOptUtil.project. I¹ve done
>>quite a lot of this as part of this change.
>> 
>> The best practice is that a rule¹s constructor should expose
>>RelBuilderFactory and Class parameters so that the rule can be re-used.
>>Quite a few rules do this, but a lot do not. Quite a few rules are
>>re-usable in principle, but have not been tested against other RelNode
>>sub-classes, and quite a few rules are just not re-usable. So I propose
>>that we enforce best practices when reviewing contributions but don¹t
>>spend a lot of effort upgrading the existing rules.
>> 
>> In some cases (e.g. WindowedAggRelSplitter in ProjectToWindowRule) the
>>rule uses the default factory but I¹ve changed the code to use a
>>RelBuilder. So, there¹s no change in behavior, but if someone were to
>>add a RelBuilderFactory to the rule¹s constructor, they would reap the
>>benefit.
>> 
>> Julian
>> 
>> 
>> 
>>> On Sep 24, 2015, at 6:29 AM, Jesus Camacho Rodriguez
>>><[email protected]> wrote:
>>> 
>>> In our case, we had to do some copy-pasting of rules in the past in
>>>Hive.
>>> Although the reasons have been already discussed in the thread, just to
>>> confirm:
>>> - Rules that use a default factory to create operators e.g.
>>> LogicalProject, instead of being able to provide our own factory. An
>>> example is AggregateProjectMergeRule. (Instead of copy/pasting the
>>>rule,
>>> we could apply another rule to transform LogicalProject into
>>>HiveProject,
>>> but this is not the right way either).
>>> - Rules that match directly on Logical instances of the operators. An
>>> example is SortProjectTransposeRule.
>>> 
>>> I see that with the solutions that we are discussing, those two
>>>problems
>>> would be solved.
>>> 
>>> About including the matching classes in RelBuilder, it can be a good
>>>idea
>>> if we still make the rules extensible i.e. we can provide our own
>>>matching
>>> classes if we want to. Assume a project that might subclass the Project
>>> operator into two different operators, then would like to match a
>>>certain
>>> rule only to one of them. Further, I guess using the copy methods
>>>provided
>>> by the operators when possible, instead of calling directly to the
>>> Factory, would make the rules more extensible for these cases.
>>> 
>>> --
>>> Jesús
>>> 
>>> 
>>> On 9/23/15, 11:08 PM, "Jinfeng Ni" <[email protected]> wrote:
>>> 
>>>> I did a quick check of Drill side rule, and did not find the
>>>>copy/paste
>>>> code
>>>> simply because we have to specify the rule's operands.
>>>> 
>>>> The reason I'm asking this is because many Calcite rules use core
>>>> Filter.class,
>>>> Project.class etc.  Currently, Drill simply uses some of Calcite rules
>>>> directly.
>>>> However, that means the rule will fire against both LogicalFilter, and
>>>> DrillFilter,
>>>> which seems to be redundant.  I understand we could extend Calcite
>>>>rule to
>>>> make sure it matches only one of them. But I'm looking for a bit more
>>>> direct way
>>>> to do that.
>>>> 
>>>> For instance, in [2] we uses Calcite's FilterMergeRule directly, which
>>>> would match
>>>> both LogicalFilter and DrillFilter.
>>>> 
>>>> Also, I feel the rule patten specification can be regarded as the
>>>> input type for the
>>>> rule, while RelBuilder is to address the output type from the rule's
>>>> firing. That's why
>>>> I feel they had better be specified together in some way.
>>>> 
>>>> 
>>>> [1] 
>>>> 
>>>>https://github.com/apache/incubator-calcite/blob/master/core/src/main/j
>>>>ava
>>>> /org/apache/calcite/rel/rules/FilterMergeRule.java#L46
>>>> 
>>>> [2] 
>>>> 
>>>>https://github.com/apache/drill/blob/master/exec/java-exec/src/main/jav
>>>>a/o
>>>> rg/apache/drill/exec/planner/logical/DrillRuleSets.java#L121
>>>> 
>>>> On Wed, Sep 23, 2015 at 2:17 PM, Jacques Nadeau <[email protected]>
>>>> wrote:
>>>>> I think this would be a good place to have a little bit more design,
>>>>> discussion and validation. Julian, it seems like you've done a lot of
>>>>> thinking and it would be good to have a better understanding of the
>>>>> choices
>>>>> you're making. (Your response below seems to be just scratching the
>>>>> surface).
>>>>> 
>>>>> If the goal is to minimize copy and paste rules, it would be useful
>>>>>to
>>>>> test
>>>>> the proposal against example rules that have been copy/pasted and see
>>>>> what
>>>>> other patterns might emerge. At first glance, I'm inclined more with
>>>>> Jinfeng's proposal regarding operands than yours. Your decision about
>>>>> copy/not behavior should also be discussed further.
>>>>> 
>>>>> Drill (Jinfeng?) and Hive (Jesus?) guys, can you provide an example
>>>>>set
>>>>> of
>>>>> rules that we've needed to copy/paste due to constraints around rule
>>>>> configuration to make sure the outcome of these changes is as
>>>>>desired?
>>>>> 
>>>>> 
>>>>> 
>>>>> On Wed, Sep 23, 2015 at 1:28 PM, Julian Hyde <[email protected]>
>>>>>wrote:
>>>>> 
>>>>>> Great question.
>>>>>> 
>>>>>> For that case, I think you should carry on making a separate rule
>>>>>> instance with operands that contain classes. RelBuilder is a good
>>>>>> single point to put a lot of the configuration, but it is taking it
>>>>>> too far to put the target classes in there. I don't want its role as
>>>>>> "point for configuration for rules" to overwhelm its main role "an
>>>>>> easy way to build relational algebra expressions".
>>>>>> 
>>>>>> And besides, operands are powerful and descriptive, and because the
>>>>>> engine understands them it can short-cut which rules are fired.
>>>>>> 
>>>>>> For what it's worth, there's another piece of configuration that I
>>>>>> considered putting into RelBuilder but decided not to. That is,
>>>>>> whether a rule instance should call RelNode.copy to automatically
>>>>>> create RelNodes the same type as its inputs. See I introduced
>>>>>> copyFilter and copyProject arguments in FilterProjectTransposeRule:
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>https://github.com/julianhyde/incubator-calcite/blob/8099cf3151a462b0
>>>>>>6a4
>>>>>> 
>>>>>>cbb1c0cad8f7e90881ca5/core/src/main/java/org/apache/calcite/rel/rules
>>>>>>/Fi
>>>>>> lterProjectTransposeRule.java#L67
>>>>>> 
>>>>>> rather than introduce new methods RelBuilder.shouldCopyProject() and
>>>>>> .shouldCopyFilter().
>>>>>> 
>>>>>> And by the way I also considered adding methods to RelBuilderFactory
>>>>>> (formerly known as ProtoRelBuilder). Adding methods there would not
>>>>>> "pollute" RelBuilder as badly.
>>>>>> 
>>>>>> Julian
>>>>>> 
>>>>>> 
>>>>>> On Wed, Sep 23, 2015 at 12:18 PM, Jinfeng Ni <[email protected]>
>>>>>> wrote:
>>>>>>> +1.
>>>>>>> 
>>>>>>> I have one question.  RelBuilder seems to be used in onMatch()
>>>>>> method,
>>>>>>> when a particular type of RelNode has to be created.  What about
>>>>>>> RelOptRuleOperand in the constructor of rule? Let's say if we want
>>>>>>>to
>>>>>>> rule to match only DrillProject, or HiveFilter. The current way
>>>>>> seems to
>>>>>>> be that we extend the rule by providing a different
>>>>>> RelOptRuleOperand.
>>>>>>> 
>>>>>>> Can RelBuilder also have getFilterClass(), getProjectClass() etc,
>>>>>> which
>>>>>>> would be used to specify the rule patten matching?
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Tue, Sep 22, 2015 at 6:41 PM, Jacques Nadeau
>>>>>>><[email protected]>
>>>>>> wrote:
>>>>>>>> Agree on the utility of this. +1.
>>>>>>>> 
>>>>>>>> On Tue, Sep 22, 2015 at 6:22 PM, John Pullokkaran <
>>>>>>>> [email protected]> wrote:
>>>>>>>> 
>>>>>>>>> This is useful, though Hive will have to do some refactoring.
>>>>>>>>> 
>>>>>>>>> In past Hive had to copy bunch of rules to work around these
>>>>>> issues.
>>>>>>>>> Rules & relnodes for org.apache.calcite.rel.logical.* is a good
>>>>>> example of
>>>>>>>>> this.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> John
>>>>>>>>> 
>>>>>>>>> On 9/22/15, 5:14 PM, "Julian Hyde" <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>>> I have started work on
>>>>>>>>>> https://issues.apache.org/jira/browse/CALCITE-828 and my
>>>>>>>>>> work-in-progress is in
>>>>>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>https://github.com/julianhyde/incubator-calcite/commits/828-rule-buil
>>>>>>der
>>>>>> .
>>>>>>>>>> 
>>>>>>>>>> This will be a big change to teams such as Hive and Drill. Now
>>>>>> would
>>>>>>>>>> be a good time to chime in.
>>>>>>>>>> 
>>>>>>>>>> The plan is that every rule instance has a ProtoRelBuilder field
>>>>>> from
>>>>>>>>>> which a RelBuilder can be created to be used in an onMatch
>>>>>> method. The
>>>>>>>>>> RelBuilder contains factories for every kind of RelNode, so we
>>>>>> won't
>>>>>>>>>> have to keep plumbing new factories in to existing rules.
>>>>>>>>>> 
>>>>>>>>>> There will be other advantages. RelBuilder is an easier and more
>>>>>>>>>> concise way of creating relational expressions. It does some
>>>>>> useful
>>>>>>>>>> stuff for free, like flattening ANDs and eliminating identity
>>>>>>>>>> projections. I'm seeing the volume of code decrease and a few
>>>>>> minor
>>>>>>>>>> plan improvements.
>>>>>>>>>> 
>>>>>>>>>> I haven't yet found a good way to deal with the pattern where
>>>>>>>>>>if,
>>>>>> say,
>>>>>>>>>> ProjectFactory is null the rule is to use Project.copy to
>>>>>>>>>>create a
>>>>>>>>>> Project of the same type.
>>>>>>>>>> 
>>>>>>>>>> I'm keeping & deprecating the old rule constructors that take a
>>>>>>>>>> variety of XxxFactory arguments.
>>>>>>>>>> 
>>>>>>>>>> As always, we try to keep API compatibility and mark the old API
>>>>>>>>>> 
>>>>>>>>>> @Deprecated // to be removed before 2.0
>>>>>>>>>> 
>>>>>>>>>> but the deprecated APIs are no longer tested and are likely to
>>>>>>>>>>be
>>>>>>>>>> flaky. We strongly suggest that you fix calls to deprecated APIs
>>>>>> as
>>>>>>>>>> soon as you upgrade to a more recent version of Calcite.
>>>>>>>>>> 
>>>>>>>>>> Julian
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>> 
>>>> 
>>> 
>> 
>
>

Reply via email to