> On Mar 8, 2017, at 11:27 AM, Ashutosh Chauhan <[email protected]> wrote: > > Fact that > RelFactories exist is evidence enough that LogicalRelNodes are not > sufficient outside of Calcite.
Not true. Hive uses its own RelNodes far more than other projects do. The other projects do a significant part of their planning in the logical domain, and only later move to the physical domain. They use RelFactories so that they can re-use rules when they are in the physical domain. Materialized view substitution is a logical operation and it makes sense to do it in the logical domain. I don’t recall the details of the Hive RelNodes but isn’t it true that in most cases they have the same information as the corresponding Logical RelNode? We can find a solution to this problem for Hive RelNodes, but I still feel that Hive overuses Hive RelNodes and that causes inefficiencies for both Hive and Calcite. Julian > On Mar 8, 2017, at 11:27 AM, Ashutosh Chauhan <[email protected]> wrote: > > Working on LogicalRelNodes doesnt solve problem for Hive (or any other > project which use RelFactories to generate custom rel node). Fact that > RelFactories exist is evidence enough that LogicalRelNodes are not > sufficient outside of Calcite. We have to solve this problem in that > context. > > On Wed, Mar 8, 2017 at 10:50 AM, Remus Rusanu <[email protected]> > wrote: > >> To make sure I understand, you’re saying that the cloning would take a >> tree of various Xxx operators (eg. HiveProject) and generate a tree of >> equivalent LogicalXxx operators (HiveProject -> LogicalProject, >> HiveFilter->LogicalFilter, HiveAggregate -> LogicalAggregate and so on). >> Is my understanding correct? Ignoring Hive issues, wouldn’t that loose >> information? >> For Hive specific case, the HiveXxx operator extend Xxx not LogicalXxx >> (not sure yet if this is an issue or not) but for sure they carry lots of >> extra info that would be lost if the clone would contain LogicalXxx instead >> of HiveXxx. >> >> Thanks, >> ~Remus >> >> On 3/8/17, 9:45 AM, "Julian Hyde" <[email protected]> wrote: >> >> Could you work on LogicalXxx rather than HiveXxx? I know the Hive team >> likes to do everything in terms of HiveXxx RelNodes but I’m not sure it has >> to be that way. >> >> Julian >> >>> On Mar 8, 2017, at 9:37 AM, Remus Rusanu <[email protected]> >> wrote: >>> >>> Agree on the RelOptCluster. >>> >>> I noticed how adding new method to RelNode yields a gargantuan task >> (from the list of compile errors I got…). But I’m not sure a RelShuttle can >> handle this. For my test case the 3 nodes that need to be cloned are >> HiveScanTable, HiveFilter and HiveProject, all declared in Hive and not >> even extending AbstractRelNode but directly base RelNode. A RelShuttle >> wouldn’t know about these types, and wouldn’t be able to create them (short >> of using reflection and adhering to a strict constructor signature, which I >> think is much too fragile). What I did to get the ball rolling I added a >> default implementation in AbstractRelNode (basically assert ‘must be >> implemented by subclass’), this allowed me to test easily and, as a proof >> of concept, I have it working. But I reckon is fragile test wise, and new >> RelNode types wouldn’t know about the requirement to provide a copyTo >> implementation. >>> >>> Thanks, >>> ~Remus >>> >>> On 3/8/17, 9:05 AM, "Julian Hyde" <[email protected]> wrote: >>> >>> The argument should be a RelOptCluster, not a RelOptPlanner. The >> link from RelNode to planner is indirect currently (via cluster) and will >> be non-existent after CALCITE-1536. >>> >>> I question whether we need a new method. Putting an abstract >> method on RelNode is a huge burden because every RelNode sub-class needs to >> be fixed when people upgrade. Even a non-abstract method imposes a >> conceptual burden: more methods to understand. >>> >>> So, my approach would be to sub-class RelShuttle. It’s sufficient >> that it only works for LogicalXxx nodes. >>> >>> No need to copy RexNode expressions. They are immutable. >>> >>> Julian >>> >>> >>>> On Mar 8, 2017, at 4:14 AM, Remus Rusanu <[email protected]> >> wrote: >>>> >>>> I created CALCITE-1681 https://issues.apache.org/ >> jira/browse/CALCITE-1681 and I intent to work on it for finishing >> HIVE-15708. >>>> My current thinking is to create a RelCopier based on RelShuttle >> and add a new abstract RelNode.copyTo(RelOptPlanner) that each concrete Rel >> type must override. The Rex part is already handled by the existing >> RexCopier. >>>> >>>> Thanks, >>>> ~Remus >>>> >>>> On 3/6/17, 12:30 PM, "Julian Hyde" <[email protected]> wrote: >>>> >>>> Every RelNode belongs to a RelOptCluster, and basically there is >> one RelOptCluster created each time a query is prepared. When working with >> materialized views, the view’s query is represented as a tree of RelNodes, >> that tree is used for optimizing more than one query. When planning a >> particular query, the nodes of that query will have a different >> RelOptCluster than the nodes of the materialized view(s) they are matched >> against. >>>> >>>> How do we deal with this? Do we copy the nodes into the query’s >> cluster once we have found a match? If so, how? I couldn’t find a sub-class >> of RelVisitor or RelShuttle that copies trees to a different RelOptCluster. >>>> >>>> By the way, https://issues.apache.org/jira/browse/CALCITE-1536 < >> https://issues.apache.org/jira/browse/CALCITE-1536> aims to improve the >> RelNode life-cycle but I don’t think it will solve this problem. >>>> >>>> Julian >>>> >>>> >>> >>> >>> >>> >> >> >> >> >>
