Re: Cluster mismatch between RelNodes of a query and a materialized view

Remus Rusanu Wed, 08 Mar 2017 09:37:27 -0800

Agree on the RelOptCluster.

I noticed how adding new method to RelNode yields a gargantuan task (from the 
list of compile errors I got…). But I’m not sure a RelShuttle can handle this. 
For my test case the 3 nodes that need to be cloned are HiveScanTable, 
HiveFilter and HiveProject, all declared in Hive and not even extending 
AbstractRelNode but directly base RelNode. A RelShuttle wouldn’t know about 
these types, and wouldn’t be able to create them (short of using reflection and 
adhering to a strict constructor signature, which I think is much too fragile). 
What I did to get the ball rolling I added a default implementation in 
AbstractRelNode (basically assert ‘must be implemented by subclass’), this 
allowed me to test easily and, as a proof of concept, I have it working. But I 
reckon is fragile test wise, and new RelNode types wouldn’t know about the 
requirement to provide a copyTo implementation.


Thanks,
~Remus 

On 3/8/17, 9:05 AM, "Julian Hyde" <[email protected]> wrote:

    The argument should be a RelOptCluster, not a RelOptPlanner. The link from 
RelNode to planner is indirect currently (via cluster) and will be non-existent 
after CALCITE-1536.
    
    I question whether we need a new method. Putting an abstract method on 
RelNode is a huge burden because every RelNode sub-class needs to be fixed when 
people upgrade. Even a non-abstract method imposes a conceptual burden: more 
methods to understand. 
    
    So, my approach would be to sub-class RelShuttle. It’s sufficient that it 
only works for LogicalXxx nodes.
    
    No need to copy RexNode expressions. They are immutable.
    
    Julian
    
    
    > On Mar 8, 2017, at 4:14 AM, Remus Rusanu <[email protected]> wrote:
    > 
    > I created CALCITE-1681 https://issues.apache.org/jira/browse/CALCITE-1681 
and I intent to work on it for finishing HIVE-15708.
    > My current thinking is to create a RelCopier based on RelShuttle and add 
a new abstract RelNode.copyTo(RelOptPlanner) that each concrete Rel type must 
override. The Rex part is already handled by the existing RexCopier.
    > 
    > Thanks,
    > ~Remus
    > 
    > On 3/6/17, 12:30 PM, "Julian Hyde" <[email protected]> wrote:
    > 
    >    Every RelNode belongs to a RelOptCluster, and basically there is one 
RelOptCluster created each time a query is prepared. When working with 
materialized views, the view’s query is represented as a tree of RelNodes, that 
tree is used for optimizing more than one query. When planning a particular 
query, the nodes of that query will have a different RelOptCluster than the 
nodes of the materialized view(s) they are matched against. 
    > 
    >    How do we deal with this? Do we copy the nodes into the query’s 
cluster once we have found a match? If so, how? I couldn’t find a sub-class of 
RelVisitor or RelShuttle that copies trees to a different RelOptCluster.
    > 
    >    By the way, https://issues.apache.org/jira/browse/CALCITE-1536 
<https://issues.apache.org/jira/browse/CALCITE-1536> aims to improve the 
RelNode life-cycle but I don’t think it will solve this problem.
    > 
    >    Julian
    > 
    >

Re: Cluster mismatch between RelNodes of a query and a materialized view

Reply via email to