[ https://issues.apache.org/jira/browse/SPARK-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14635469#comment-14635469 ]
Joseph K. Bradley commented on SPARK-3157:
------------------------------------------
Good point, I'll close this. Thanks!
> Avoid duplicated stats in DecisionTree extractLeftRightNodeAggregates
> ---------------------------------------------------------------------
>
> Key: SPARK-3157
> URL: https://issues.apache.org/jira/browse/SPARK-3157
> Project: Spark
> Issue Type: Improvement
> Components: MLlib
> Reporter: Joseph K. Bradley
> Priority: Minor
>
> Improvement: computation, memory usage
> For ordered features, extractLeftRightNodeAggregates() computes pairs of
> cumulative sums. However, the pair is redundant: one sum accumulates from
> the left end and the other from the right end, so either can be recovered
> from the other and the total. Compute only one sum.
> For unordered features, the left and right aggregates are essentially the
> same data, copied from the original aggregates, but shifted by one index.
> Avoid copying data.
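
For illustration only, the ordered-feature case in the description can be sketched as below. This is a minimal Python sketch, not the MLlib code: the function name left_right_from_one_sum and the flat list of per-bin statistics are assumptions made for the example, and split i is assumed to put bins 0..i on the left and the remaining bins on the right.

```python
from itertools import accumulate


def left_right_from_one_sum(bin_stats):
    """Derive (left, right) aggregates for every split from one cumulative sum.

    bin_stats[i] is a hypothetical per-bin statistic (for example a label
    count). Split i puts bins 0..i on the left and bins i+1.. on the right.
    """
    left = list(accumulate(bin_stats))  # the only cumulative sum computed
    total = left[-1] if left else 0
    # The right aggregate is the total minus the left aggregate, so no
    # second pass from the right end is needed.
    return [(l, total - l) for l in left[:-1]]


stats = [3, 1, 4, 2]
print(left_right_from_one_sum(stats))
```

With n bins this yields n - 1 candidate splits from a single pass, rather than storing two cumulative arrays of the same length.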
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)