There are two different objects associated with the view: 1) the view itself 
(a TableScan on the materialized view) and 2) the view content (the RelNode 
plan representing the query that defines the view).
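
For concreteness, this is roughly how the pair surfaces in Calcite's 
RelOptMaterialization (its tableRel/queryRel fields); the helper below is only 
a hypothetical illustration, not existing code:

    import org.apache.calcite.plan.RelOptMaterialization;
    import org.apache.calcite.plan.RelOptUtil;
    import org.apache.calcite.rel.RelNode;

    // Hypothetical helper, just to show the two objects side by side.
    final class MaterializationObjects {
      static void dump(RelOptMaterialization m) {
        RelNode viewScan = m.tableRel;   // 1) the view itself: TS on the MV table
        RelNode viewQuery = m.queryRel;  // 2) the view content: plan of the defining query
        System.out.println(RelOptUtil.toString(viewScan));
        System.out.println(RelOptUtil.toString(viewQuery));
      }
    }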

1) I understand that the first object (the TS on the view) should be 
registered, as it might become part of the optimization process and of the 
final query plan. I believe that is already the case, since the TS and any 
other operators created on top of it to unify the expressions are built in the 
context of the user query cluster.
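
To illustrate with a rough, hypothetical sketch (not the actual Hive rule): 
whatever expression a rewriting rule hands to RelOptRuleCall.transformTo() is 
registered by the planner in the same RelSet as the matched expression, so 
object 1) gets registered as a side effect of the substitution:

    import org.apache.calcite.plan.RelOptRule;
    import org.apache.calcite.plan.RelOptRuleCall;
    import org.apache.calcite.rel.RelNode;
    import org.apache.calcite.rel.core.Filter;

    // Hypothetical sketch of a view-substitution rule.
    class ViewSubstitutionSketchRule extends RelOptRule {
      ViewSubstitutionSketchRule() {
        super(operand(Filter.class, any()), "ViewSubstitutionSketch");
      }

      @Override public void onMatch(RelOptRuleCall call) {
        Filter filter = call.rel(0);
        // Hypothetical: build the TS on the view plus the unifying operators,
        // using the cluster of the matched (user query) expression.
        RelNode substitute = buildScanOnViewPlusUnifyingOps(filter);
        if (substitute != null) {
          call.transformTo(substitute);  // registers 1) in the user query's RelSet
        }
      }

      private RelNode buildScanOnViewPlusUnifyingOps(Filter filter) {
        return null;  // placeholder for the actual matching/unification logic
      }
    }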

2) However, is it necessary to register the query associated with the 
materialized view using _registerImpl_? The view query has a slightly 
different nature: it is not part of the user query and it will not be part of 
the final plan. Registering it would add more nodes to the planning phase, 
and, if I am not mistaken, rules would be triggered on those nodes too. That 
would unnecessarily increase optimization time/complexity when there are many 
views, or views with many nodes.

Unless I am missing something, I think we should avoid calling _registerImpl_ 
for 2).
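
As a hedged sketch of that weaker alternative (addMaterialization/
getMaterializations as I understand the VolcanoPlanner API; the wiring around 
them is hypothetical): the view query plan would only be handed to the planner 
as metadata, and a rewriting rule would read it back and do the matching 
itself, without _registerImpl_ ever seeing the view query:

    import java.util.List;
    import org.apache.calcite.plan.RelOptMaterialization;
    import org.apache.calcite.plan.volcano.VolcanoPlanner;
    import org.apache.calcite.rel.RelNode;

    // Hypothetical wiring, only to show where the view query plan would live.
    final class MaterializationLookupSketch {
      static void provideAndConsume(VolcanoPlanner planner,
          RelOptMaterialization materialization) {
        planner.addMaterialization(materialization);
        // ... later, typically inside a rule such as a
        // MaterializedViewFilterScanRule:
        List<RelOptMaterialization> available = planner.getMaterializations();
        for (RelOptMaterialization m : available) {
          RelNode viewQuery = m.queryRel;  // 2) stays unregistered metadata
          RelNode viewScan = m.tableRel;   // 1) only used to build the substitute,
                                           // which lives in the user query cluster
        }
      }
    }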


--
Jesús


On 3/9/17, 9:17 PM, "Julian Hyde" <jh...@apache.org> wrote:

>So, the question is whether materialized views need to be “registered” with 
>the planner before they can be considered. If they are “registered” with a 
>Volcano planner this means that they are included in equivalence classes 
>(RelSets and RelSubsets) and canonized.
>
>A weaker form of registration is to make sure that the types used (in both the 
>row-type of a RelNode and in the various RexNodes contained therein) all come 
>from the same type factory.
>
>Clearly there are advantages to registration (if objects are canonized they 
>use less memory and can be compared using ==) and there are negatives 
>(significant copying is involved). So, the question is whether we can use some 
>kind of compromise: work on un-registered RelNodes at an early stage (while 
>figuring out which materialized views might pertain to a query) and register 
>only when we have narrowed down the set of materialized views.
>
>Maryann,
>
>Since you did https://issues.apache.org/jira/browse/CALCITE-1500, can you 
>comment on this change?
>
>Julian
>
>
>> On Mar 9, 2017, at 1:09 PM, Remus Rusanu <rrus...@hortonworks.com> wrote:
>> 
>> Moving to calcite-dev
>> 
>> From: Remus Rusanu <rrus...@hortonworks.com>
>> Date: Thursday, March 9, 2017 at 1:04 PM
>> To: Ashutosh Chauhan <ashut...@hortonworks.com>, Julian Hyde 
>> <jh...@hortonworks.com>
>> Cc: "sql...@hortonworks.com" <sql...@hortonworks.com>
>> Subject: Why Calcite 1.10 is not hitting the assert
>> 
>> The 1.12 relevant assert stack is this:
>>       at 
>> org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1475)
>>       at 
>> org.apache.calcite.plan.volcano.VolcanoPlanner.registerMaterializations(VolcanoPlanner.java:368)
>>       at 
>> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:592)
>>       at 
>> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1467)
>> 
>> In 1.10 the names are a bit different, but VolcanoPlanner.findBestExp() 
>> calls useApplicableMaterializations(), which exits immediately because 
>> context.unwrap(CalciteConnectionConfig.class) returns null. So no 
>> ‘registration’ occurs (registerImpl is never called with the provided 
>> materialization plan, as per my debugging).
>> 
>> However, when needed, the materialization is found. The stack below finds 
>> it, and uses it, despite it not being ‘registered’:
>>       at 
>> org.apache.calcite.plan.volcano.VolcanoPlanner.getMaterializations(VolcanoPlanner.java:348)
>>       at 
>> org.apache.hadoop.hive.ql.optimizer.calcite.rules.views.HiveMaterializedViewFilterScanRule.apply(HiveMaterializedViewFilterScanRule.java:71)
>>       at 
>> org.apache.hadoop.hive.ql.optimizer.calcite.rules.views.HiveMaterializedViewFilterScanRule.onMatch(HiveMaterializedViewFilterScanRule.java:64)
>>       at 
>> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:213)
>>       at 
>> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:819)
>>       at 
>> org.apache.hadoop.hive.ql.parse.CalcitePlanner$CalcitePlannerAction.apply(CalcitePlanner.java:1455)
>> 
>> The result is the desired one:
>> 
>> hive> create materialized view srcm enable rewrite as select key from src 
>> where key=10;
>> …
>> hive> explain extended select key from src where key=10;
>> OK
>> STAGE DEPENDENCIES:
>>  Stage-0 is a root stage
>> 
>> STAGE PLANS:
>>  Stage: Stage-0
>>    Fetch Operator
>>      limit: -1
>>      Processor Tree:
>>        TableScan
>>          alias: default.srcm
>>          GatherStats: false
>>          Select Operator
>>            expressions: key (type: string)
>>            outputColumnNames: _col0
>>            ListSink
>> 
>> The big changes came with CALCITE-1500
>> 
>
