[ 
https://issues.apache.org/jira/browse/CALCITE-2927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16794688#comment-16794688
 ] 

Hongze Zhang edited comment on CALCITE-2927 at 3/18/19 4:24 PM:
----------------------------------------------------------------

Hi [~danny0405], it seems that the inconsistent does exists more or less, so 
I've reopened the issue.

Reason (AFAICS): 

Below is the definition of "weight" (you can find it from doc[1]):
{code:java}
W{n, p} = Cost{n} / (SelfCost{p} + Cost{n0} + ... + Cost{nk})
{code}
And here is the definition of importance of a node n (from doc[1] too):
{code:java}
I{n} = Sum{parents p of n}{I{p} . W{n, p}}
{code}
But algorithmically speaking, the method RuleQueue#computeImportanceOfChild[2] 
emits:
{code:java}
I'{n} = I{p} . W{n, p}
{code}
And the method RuleQueue#computeImportance[3] take the result above and emits:
{code:java}
I{n} = Max{parents p of n}{I'{n}} = Max{parents p of n}{I{p} . W{n, p}}
{code}
Apparently it is not consistent with what's described in Javadoc.

[1] 
[https://github.com/apache/calcite/blob/ffca956be03a99cd11e440d652b09674aaa727e6/core/src/main/java/org/apache/calcite/plan/volcano/RuleQueue.java#L374]
[2] 
[https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/plan/volcano/RuleQueue.java#L568]
[3] 
[https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/plan/volcano/RuleQueue.java#L380]


was (Author: zhztheplayer):
I am not very familiar with the importance computing part of Volcano, but it 
seems that the inconsistent does exists more or less.

Below is the definition of "weight" (you can find it from doc[1]):


{code}
W{n, p} = Cost{n} / (SelfCost{p} + Cost{n0} + ... + Cost{nk})
{code}


And here is the definition of importance of a node n (from doc[1] too):


{code}
I{n} = Sum{parents p of n}{I{p} . W{n, p}}
{code}


But algorithmically speaking, the method RuleQueue#computeImportanceOfChild[2] 
emits:


{code}
I'{n} = I{p} . W{n, p}
{code}


Where the method RuleQueue#computeImportance[3] take the result above and emits:


{code}
I{n} = Max{parents p of n}{I'{n}} = Max{parents p of n}{I{p} . W{n, p}}
{code}



Am I missing something (or compute it wrong)?

[1] 
https://github.com/apache/calcite/blob/ffca956be03a99cd11e440d652b09674aaa727e6/core/src/main/java/org/apache/calcite/plan/volcano/RuleQueue.java#L374
[2] 
https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/plan/volcano/RuleQueue.java#L568
[3] 
https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/plan/volcano/RuleQueue.java#L380

> The javadoc and implement of RuleQueue.computeImportance() is inconsistent
> --------------------------------------------------------------------------
>
>                 Key: CALCITE-2927
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2927
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.18.0
>            Reporter: Meng Wang
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: next
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> In the javadoc of _computeImportance()_, it shows the importance of node is 
> Sum{_computeImportanceOfChild()_}, but in the implementation it shows 
> Max{_computeImportanceOfChild()_}, this is inconsistent. The Javadoc has some 
> errors, it will cause some confusion.
>  
> {code:java}
> // /**
>  * Computes the <dfn>importance</dfn> of a node. Importance is defined as
>  * follows:
>  *
>  * <ul>
>  * <li>the root {@link RelSubset} has an importance of 1</li>
>  * <li>the importance of any other subset is the sum of its importance to
>  * its parents</li>
>  * <li>The importance of children is pro-rated according to the cost of the
>  * children. Consider a node which has a cost of 3, and children with costs
>  * of 2 and 5. The total cost is 10. If the node has an importance of .5,
>  * then the children will have importance of .1 and .25. The retains .15
>  * importance points, to reflect the fact that work needs to be done on the
>  * node's algorithm.</li>
>  * </ul>
>  *
>  * <p>The formula for the importance <i>I</i> of node n is:
>  *
>  * <blockquote>I<sub>n</sub> = Sum<sub>parents p of n</sub>{I<sub>p</sub> .
>  * W <sub>n, p</sub>}</blockquote>
>  *
>  * <p>where W<sub>n, p</sub>, the weight of n within its parent p, is
>  *
>  * <blockquote>W<sub>n, p</sub> = Cost<sub>n</sub> / (SelfCost<sub>p</sub> +
>  * Cost<sub>n0</sub> + ... + Cost<sub>nk</sub>)
>  * </blockquote>
>  */
> double computeImportance(RelSubset subset) {
>   double importance;
>   if (subset == planner.root) {
>     // The root always has importance = 1
>     importance = 1.0;
>   } else {
>     final RelMetadataQuery mq = subset.getCluster().getMetadataQuery();
>     // The importance of a subset is the max of its importance to its
>     // parents
>     importance = 0.0;
>     for (RelSubset parent : subset.getParentSubsets(planner)) {
>       final double childImportance =
>           computeImportanceOfChild(mq, subset, parent);
>       importance = Math.max(importance, childImportance);
>     }
>   }
>   LOGGER.trace("Importance of [{}] is {}", subset, importance);
>   return importance;
> }
> {code}
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to