[
https://issues.apache.org/jira/browse/CALCITE-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292245#comment-17292245
]
Botong Huang commented on CALCITE-4514:
---------------------------------------
getChildSets is now computed on demand and the result is maintained
afterwards. It is essentially a cache, and for most RelSets it will stay null.
That is at least better than computing it from scratch every time it is needed.
I am not sure whether there is a better way.
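For illustration only, the on-demand cache described above could be sketched
roughly as follows. This is not the code in the patch: SketchSet, addRel and
computeCount are hypothetical names, and each member expression is reduced to
the list of sets its inputs belong to. The assumption is that the cached value
is dropped whenever the set's members change.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a set of equivalent expressions; not Calcite's RelSet. */
final class SketchSet {
  final int id;
  // Each member expression is represented only by the sets of its inputs.
  private final List<List<SketchSet>> rels = new ArrayList<>();
  // Stays null until first requested; reset to null whenever members change.
  private Set<SketchSet> childSets;
  // Counts full recomputations, for illustration only.
  int computeCount = 0;

  SketchSet(int id) {
    this.id = id;
  }

  /** Adds a member expression and invalidates the cached child sets. */
  void addRel(List<SketchSet> inputSets) {
    rels.add(new ArrayList<>(inputSets));
    childSets = null;
  }

  /** Returns the child sets, computing them only if no cached value exists. */
  Set<SketchSet> getChildSets() {
    if (childSets == null) {
      computeCount++;
      Set<SketchSet> result = new LinkedHashSet<>();
      for (List<SketchSet> inputs : rels) {
        result.addAll(inputs);
      }
      childSets = result;
    }
    return childSets;
  }
}
```

In this sketch a set that is never asked for its child sets never pays for
the computation, and repeated calls between modifications reuse one result.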
The size of a RelSet is already available from RelSet.rels, so this patch does
not add any extra tracking.
On the performance side, this Jira was originally intended to fix and improve
performance in a special case (when two RelSets are parent sets of each other),
which the added unit test covers. For this special case, the performance
improvement is straightforward. For overall performance, however, I am not sure
there is an easy way for a unit test to quantify the improvement without a
comprehensive benchmark of queries.
> Fine tune the merge order of two RelSets, cache RelSet's childSet computation
> -----------------------------------------------------------------------------
>
> Key: CALCITE-4514
> URL: https://issues.apache.org/jira/browse/CALCITE-4514
> Project: Calcite
> Issue Type: Improvement
> Reporter: Botong Huang
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> When merging two relsets, we have two preferences:
> 1. Merge parent relset into child relset
> 2. Merge newer relset into older relset
> Currently, when the two relsets are parent sets of each other, we randomly
> pick a merge order without checking the second preference above. For
> performance reasons we should check it, to avoid unnecessary churn.
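As a rough sketch of the two preferences in the description, the choice of
which set survives a merge might look like the code below. This is not
Calcite's actual code: MergeOrderSketch, Node and survivor are hypothetical
names, and the sketch assumes that a smaller id means an older set.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of the merge-order choice; not Calcite's actual code. */
final class MergeOrderSketch {
  /** Minimal stand-in for a RelSet: an id (assumed smaller = older) and child sets. */
  static final class Node {
    final int id;
    final Set<Node> childSets = new HashSet<>();

    Node(int id) {
      this.id = id;
    }

    boolean isParentOf(Node other) {
      return childSets.contains(other);
    }
  }

  /** Returns the set that survives the merge; the other set is merged into it. */
  static Node survivor(Node a, Node b) {
    boolean aParentOfB = a.isParentOf(b);
    boolean bParentOfA = b.isParentOf(a);
    if (aParentOfB && !bParentOfA) {
      return b; // preference 1: merge the parent (a) into the child (b)
    }
    if (bParentOfA && !aParentOfB) {
      return a; // preference 1: merge the parent (b) into the child (a)
    }
    // Mutual parents, or unrelated sets: fall back to preference 2 and
    // merge the newer set into the older one.
    return a.id <= b.id ? a : b;
  }
}
```

With mutual parents the result no longer depends on argument order, which is
the case the description says was previously decided randomly.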