I see now. I think the problem only occurs when you call AbstractRelNode.recomputeDigest().
The first time the digest is computed, the input RelNodes have a digest (and desc) as it has been set in AbstractRelNode’s constructor:

  this.digest = getRelTypeName() + "#" + id;
  this.desc = digest;

Explain writer uses the “desc” field to identify inputs, but maybe it should use id or type + id. Or maybe the “desc” field should be final.

By the way, the comment

  // Substring uses the same underlying array of chars, so saves a bit
  // of memory.

was true until JDK 1.6 but is no longer true.

Can you log a JIRA case please.

Julian

> On Aug 15, 2018, at 2:37 PM, Laurent Goujon <[email protected]> wrote:
>
> Sorry, I should have mentioned the method too: HepPlanner#buildFinalPlan
> (when running RelOptRulesTest#testWindowInParenthesis())
>
> On Wed, Aug 15, 2018 at 2:36 PM Laurent Goujon <[email protected]> wrote:
>
>> It looks to happen when building the final plan: the hep planner goes
>> recursively to each node to recompute the digest. In that relnode tree,
>> there's no more HepRelVertex nodes, and the digest now includes the whole
>> input(s) description.
>>
>> On Wed, Aug 15, 2018 at 2:33 PM Julian Hyde <[email protected]> wrote:
>>
>>> When I run that test I get
>>>
>>> LogicalProject(input=HepRelVertex#10,$0=$9)
>>>
>>> Have you screwed something up?
>>>
>>>> On Aug 15, 2018, at 2:23 PM, Laurent Goujon <[email protected]> wrote:
>>>>
>>>> Just ran RelOptRulesTest with a breakpoint in
>>>> AbstractRelNode#computeDigest() and I'm able to observe those kind of
>>>> digest:
>>>>
>>>> "LogicalProject(input=rel#6:LogicalWindow(input=rel#0:LogicalTableScan(table=[CATALOG,
>>>> SALES, EMP]),window#0=window(partition {0} order by [0] range between
>>>> UNBOUNDED PRECEDING and CURRENT ROW aggs [COUNT()])),$0=$9)"
>>>>
>>>> On Wed, Aug 15, 2018 at 2:09 PM Laurent Goujon <[email protected]> wrote:
>>>>
>>>>> Here's one (partial) example (truncated because it contains potential
>>>>> sensitive info, and didn't obfuscate or try to reproduce locally with non
>>>>> sensitive data):
>>>>>
>>>>> "rel#8643738:LogicalProject.NONE.ANY([]).[](input=rel#8643736:LogicalUnion.NONE.ANY([]).[](input#0=rel#8643702:LogicalUnion.NONE.ANY([]).[](input#0=rel#8643668:LogicalUnion.NONE.ANY([]).[](input#0=rel#8643634:LogicalProject.NONE.ANY([]).[](input=rel#8643632:LogicalAggregate.NONE.ANY([]).[](input=rel#8643630:LogicalAggregate.NONE.ANY([]).[](input=rel#8643628:LogicalProject.NONE.ANY([]).[](input=rel#8643626:LogicalFilter.NONE.ANY([]).[](input=rel#8643624:LogicalProject.NONE.ANY([]).[](input=rel#8643622:LogicalProject.NONE.ANY([]).[](input=rel#8643842:MultiJoin.NONE.ANY([]).[](input#0=rel#8643838:LogicalProject.NONE.ANY([]).[](input=rel#8643615:MultiJoin.NONE.ANY([]).[](input#0=rel#8643603:LogicalProject.NONE.ANY([]).[](input=rel#8643601:SampleCrel.NONE.ANY([]).[](input=rel#8639853:ScanCrel.NONE.ANY([]).[](table="...
>>>>>
>>>>> The Logical* relnodes don't override computeDigest method, so this is
>>>>> basically whatever AbstractRelNode#computeDigest is doing:
>>>>>
>>>>> https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/AbstractRelNode.java#L415
>>>>>
>>>>> Laurent
>>>>>
>>>>> On Wed, Aug 15, 2018 at 1:57 PM Julian Hyde <[email protected]> wrote:
>>>>>
>>>>>> I thought the digest only included the IDs of the inputs, not the digest
>>>>>> of the inputs. Am I mistaken?
>>>>>>
>>>>>> Could you give an example of large description & digest?
>>>>>>
>>>>>>> On Aug 15, 2018, at 1:46 PM, Laurent Goujon <[email protected]> wrote:
>>>>>>>
>>>>>>> Hi folks,
>>>>>>>
>>>>>>> I'm looking for some guidance here before opening JIRAs/pull requests.
>>>>>>>
>>>>>>> I'm examining a memory dump during a planning operation and a significant
>>>>>>> amount of memory are strings used for RelNode digest and description (some
>>>>>>> strings being around 130kb). In that particular case, the relnode tree is
>>>>>>> particularly deep, and since the digest is basically done recursively, the
>>>>>>> deepest/widest the tree, the longer the digest.
>>>>>>>
>>>>>>> The easy solution would be to not go deep when adding inputs to the digest,
>>>>>>> and instead of adding the input description to only add their type, id and
>>>>>>> traits (and also not recurse). Would this break parts of calcite, or cause
>>>>>>> other inconvenience because some use-cases rely on digest/description to be
>>>>>>> basically the whole tree in a textual form?
>>>>>>>
>>>>>>> Laurent
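
For illustration only, here is a minimal sketch of the two digest styles discussed in this thread: one that embeds each input's full digest (and so grows with the depth of the tree), and one that only records the input's type and id. The names DigestSketch, Node, deepDigest, shallowDigest and chain are hypothetical and are not Calcite classes or methods; the real logic lives in AbstractRelNode and may differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class DigestSketch {
    /** Hypothetical stand-in for a RelNode; NOT Calcite's actual class. */
    static final class Node {
        final int id;
        final String typeName;
        final List<Node> inputs;

        Node(int id, String typeName, List<Node> inputs) {
            this.id = id;
            this.typeName = typeName;
            this.inputs = inputs;
        }

        /** Deep digest: embeds each input's whole digest, recursively. */
        String deepDigest() {
            StringBuilder sb = new StringBuilder(typeName).append('(');
            for (int i = 0; i < inputs.size(); i++) {
                Node in = inputs.get(i);
                if (i > 0) {
                    sb.append(',');
                }
                sb.append("input#").append(i).append("=rel#").append(in.id)
                  .append(':').append(in.deepDigest());
            }
            return sb.append(')').toString();
        }

        /** Shallow digest: identifies each input by type + "#" + id only. */
        String shallowDigest() {
            StringBuilder sb = new StringBuilder(typeName).append('(');
            for (int i = 0; i < inputs.size(); i++) {
                Node in = inputs.get(i);
                if (i > 0) {
                    sb.append(',');
                }
                sb.append("input#").append(i).append('=')
                  .append(in.typeName).append('#').append(in.id);
            }
            return sb.append(')').toString();
        }
    }

    /** Builds a linear chain: one Scan leaf with `depth` Projects stacked on top. */
    static Node chain(int depth) {
        Node node = new Node(0, "Scan", new ArrayList<Node>());
        for (int i = 1; i <= depth; i++) {
            node = new Node(i, "Project", Collections.singletonList(node));
        }
        return node;
    }

    public static void main(String[] args) {
        // Print digest lengths for increasingly deep trees.
        for (int depth : new int[] {1, 10, 100}) {
            Node root = chain(depth);
            System.out.println("depth=" + depth
                + " deep=" + root.deepDigest().length()
                + " shallow=" + root.shallowDigest().length());
        }
    }
}
```

In this sketch the deep digest length grows with every level added to the chain, while the shallow digest stays roughly constant, which is the trade-off raised above: a shallow digest saves memory, but anything relying on the digest being the whole tree in textual form would lose that.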
