[ 
https://issues.apache.org/jira/browse/CALCITE-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292245#comment-17292245
 ] 

Botong Huang commented on CALCITE-4514:
---------------------------------------

getChildSets now computes its result on demand and then maintains it. It is 
essentially a cache, and for most RelSets it will stay null. That is at least 
better than computing the child sets from scratch every time they are needed. 
I am not sure whether there is a better way. 
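
To illustrate the on-demand idea, here is a minimal sketch. It is not the code 
in the patch: SketchSet, inputs, addInput and computeCount are made-up names, 
and a real RelSet derives its child sets from the inputs of the rels it holds. 

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a RelSet; not Calcite's actual class. */
class SketchSet {
  final String name;
  /** Simplified: the sets this set reads from, possibly with repeats. */
  final List<SketchSet> inputs = new ArrayList<>();
  /** Stays null until first requested, so most sets never pay for it. */
  private Set<SketchSet> childSets;
  /** Counts full recomputations; only here to make the sketch observable. */
  int computeCount = 0;

  SketchSet(String name) {
    this.name = name;
  }

  void addInput(SketchSet child) {
    inputs.add(child);
    if (childSets != null) {
      // The cache already exists, so keep it up to date incrementally.
      childSets.add(child);
    }
  }

  Set<SketchSet> getChildSets() {
    if (childSets == null) {
      // First request: compute from scratch once, then reuse.
      computeCount++;
      childSets = new LinkedHashSet<>(inputs);
    }
    return childSets;
  }
}
```

In this sketch a set that is never asked for its children keeps the field 
null, and a set that is asked repeatedly computes the answer only once. 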

The size of a RelSet is already available from RelSet.rels, so this patch does 
not add any extra tracking. 

On the performance side, this Jira was originally intended to fix and improve 
performance in a special case (when two RelSets are parent sets of each other), 
which the added unit test covers. For this special case, the performance 
improvement is straightforward. For overall performance, though, I am not sure 
there is an easy way for a unit test to quantify the improvement without a 
comprehensive benchmark of queries. 


> Fine tune the merge order of two RelSets, cache RelSet's childSet computation
> -----------------------------------------------------------------------------
>
>                 Key: CALCITE-4514
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4514
>             Project: Calcite
>          Issue Type: Improvement
>            Reporter: Botong Huang
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When merging two relsets, we have two preferences: 
> 1. Merge parent relset into child relset
> 2. Merge newer relset into older relset
> Currently, when the two relsets are parent sets of each other, we randomly 
> pick a merge order without checking the second preference above. For 
> performance reasons, we should check it, to avoid unnecessary churn. 
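
The two preferences in the description above can be sketched as a single 
decision function. This is only an illustration under stated assumptions, not 
the planner's actual merge code: MergeOrderSketch, Node and survivor are 
hypothetical names, and a lower id is assumed to mean an older set. 

```java
import java.util.HashSet;
import java.util.Set;

class MergeOrderSketch {
  /** Hypothetical stand-in for a RelSet; a lower id is assumed to mean older. */
  static final class Node {
    final int id;
    final Set<Node> children = new HashSet<>();

    Node(int id) {
      this.id = id;
    }
  }

  /** Returns the set that survives; the other one is merged into it. */
  static Node survivor(Node s1, Node s2) {
    boolean s1ParentOfS2 = s1.children.contains(s2);
    boolean s2ParentOfS1 = s2.children.contains(s1);
    if (s1ParentOfS2 != s2ParentOfS1) {
      // Exactly one is the parent: merge the parent into the child.
      return s1ParentOfS2 ? s2 : s1;
    }
    // Neither or both are parents (the special case in this issue):
    // fall back to merging the newer set into the older one.
    return s1.id <= s2.id ? s1 : s2;
  }
}
```

With two sets that are parents of each other, survivor returns the older one 
regardless of argument order, so the choice is no longer arbitrary. 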



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
