Dear Calcite Devs,

I’m currently looking into an issue in the SQL extension [1] of Apache Beam and was hoping to find some advice here. Using a bunch of Calcite ConverterRules [2], we convert a RelNode tree into a tree of BeamRelNodes, which is then used to build a Beam DAG. Pretty standard, I suppose.
What I’m scratching my head about is that applying the converter rules changes the semantics of the graph, which I thought it shouldn’t. Or is that a wrong expectation?

Here’s a very simple SQL example to illustrate this (see also BeamUnionRelTest [3]):

    SELECT order_id, site_id, price FROM ORDER_DETAILS
    UNION ALL
    SELECT order_id, site_id, price FROM ORDER_DETAILS

This is where we start, the corresponding RelNode:

    LogicalUnion.NONE(input#0=LogicalProject#8,input#1=LogicalProject#10,all=true)

Once the corresponding conversion rule [4] has been applied to the above by the CheapestPlanReplacer, but before its old inputs are visited, we get the following:

    BeamUnionRel.BEAM_LOGICAL(input#0=RelSubset#25,input#1=RelSubset#25,all=true)

At this point both inputs refer to the same node (#25). However, once the CheapestPlanReplacer visits the inputs, that information is lost, because RelNodes get copied if their inputs change [5]. In the result below, the two inputs refer to different nodes:

    BeamUnionRel.BEAM_LOGICAL(input#0=BeamCalcRel#49,input#1=BeamCalcRel#50,all=true)

This currently prevents caching of intermediate results in the Beam DAG in cases where it might be beneficial.

Would you have any advice on how to better approach this? Of course, I could use a stateful copy operation that returns the same result for repeated copies with the same parameters. But to be honest, this seems wrong.

Thanks a million!
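To make the "stateful copy" idea concrete, here is a minimal sketch of what I have in mind. It deliberately uses toy types: ToyNode and MemoizingReplacer are made-up stand-ins for illustration only, not Calcite or Beam classes, and the "Beam" name prefix merely mimics a conversion. The only point is that a visitor which remembers, by reference identity, what it already produced for an input will hand back the same converted node when it sees that input again.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Toy stand-in for a plan node (hypothetical; NOT a Calcite RelNode). */
final class ToyNode {
    final String name;
    final List<ToyNode> inputs;

    ToyNode(String name, List<ToyNode> inputs) {
        this.name = name;
        this.inputs = List.copyOf(inputs);
    }
}

/** Rewrites a toy plan bottom-up and reuses results for nodes it has already seen. */
final class MemoizingReplacer {
    // Keyed by reference identity: each original node is converted at most once.
    private final Map<ToyNode, ToyNode> converted = new IdentityHashMap<>();

    ToyNode visit(ToyNode node) {
        ToyNode cached = converted.get(node);
        if (cached != null) {
            return cached; // same input seen before: return the same output instance
        }
        List<ToyNode> newInputs = new ArrayList<>();
        for (ToyNode input : node.inputs) {
            newInputs.add(visit(input));
        }
        // Stand-in for the real conversion/copy step.
        ToyNode result = new ToyNode("Beam" + node.name, newInputs);
        converted.put(node, result);
        return result;
    }
}
```

With a union whose two inputs are the same ToyNode instance, visit() returns a node whose two inputs are again one and the same instance, which is the sharing I would like to preserve.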
Kind regards,
Moritz

[1] https://github.com/apache/beam/issues/24314
[2] https://github.com/apache/beam/blob/6ba647333c4c69fb6dfc65929456c7c11570382f/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/planner/BeamRuleSets.java#L136-L152
[3] https://github.com/apache/beam/blob/6ba647333c4c69fb6dfc65929456c7c11570382f/sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamUnionRelTest.java#L52-L71
[4] https://github.com/apache/beam/blob/a96afe2c57c45a869a622086eaa4f81305f06e72/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rule/BeamUnionRule.java
[5] https://github.com/apache/calcite/blob/a326bd2d0e0b4b6b3336f10217b0ecbb79522239/core/src/main/java/org/apache/calcite/plan/volcano/RelSubset.java#L727-L739
