GitHub user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2595#discussion_r18260055
  
    --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala ---
    @@ -633,16 +659,13 @@ object DecisionTree extends Serializable with Logging {
       /**
        * Find the best split for a node.
        * @param binAggregates Bin statistics.
    -   * @param nodeIndex Index into aggregates for node to split in this group.
        * @return tuple for best split: (Split, information gain, prediction at node)
        */
       private def binsToBestSplit(
    -      binAggregates: DTStatsAggregator,
    -      nodeIndex: Int,
    +      binAggregates: NodeStatsAggregator,
           splits: Array[Array[Split]],
    -      featuresForNode: Option[Array[Int]]): (Split, InformationGainStats, Predict) = {
    -
    -    val metadata: DecisionTreeMetadata = binAggregates.metadata
    +      featuresForNode: Option[Array[Int]],
    +      metadata: DecisionTreeMetadata): (Split, InformationGainStats, Predict) = {
    --- End diff --
    
    The new metadata parameter is not needed, since binAggregates already stores the metadata (unless you remove metadata from binAggregates; see comment below).
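
To illustrate the point, here is a minimal, self-contained sketch of reading the metadata from the aggregator rather than passing it as a separate argument. The types `Metadata` and `StatsAggregator` and the helper `featuresToScan` are hypothetical stand-ins invented for this example; they are not the actual Spark MLlib classes or the code in this PR.

```scala
// Hypothetical stand-in for the tree metadata (not the real DecisionTreeMetadata).
final case class Metadata(numFeatures: Int, numClasses: Int)

// Hypothetical aggregator that keeps the metadata it was constructed with.
final class StatsAggregator(val metadata: Metadata) {
  // One flat buffer of statistics, sized from the metadata.
  val stats: Array[Double] =
    new Array[Double](metadata.numFeatures * metadata.numClasses)
}

object Sketch {
  // The metadata comes from the aggregator, so callers cannot pass a
  // metadata instance that disagrees with the one the aggregator holds.
  def featuresToScan(
      agg: StatsAggregator,
      featuresForNode: Option[Array[Int]]): Seq[Int] = {
    val metadata = agg.metadata
    featuresForNode match {
      case Some(subset) => subset.toSeq          // feature subset chosen for this node
      case None => 0 until metadata.numFeatures  // otherwise scan all features
    }
  }
}
```

Keeping a single source for the metadata means the signature stays shorter and the two values cannot drift apart. If the metadata were removed from the aggregator instead, the extra parameter would be the natural place for it.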

