Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/8609#discussion_r41429104
  
    --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/Node.scala ---
    @@ -279,6 +279,38 @@ private[tree] class LearningNode(
         }
       }
     
    +  /**
    +   * Get the node corresponding to this data point.
    +   * This function mimics prediction, passing an example from the root node down to a leaf
    +   * or unsplit node; that node is returned.
    +   *
    +   * @param binnedFeatures  Binned feature vector for data point.
    +   * @param splits possible splits for all features, indexed (numFeatures)(numSplits)
    +   */
    +  def predictImpl(binnedFeatures: Array[Int], splits: Array[Array[Split]]): LearningNode = {
    +    if (this.isLeaf || this.split.isEmpty) {
    +      this
    +    } else {
    +      val split = this.split.get
    +      val featureIndex = split.featureIndex
    +      val splitLeft = split.shouldGoLeft(binnedFeatures(featureIndex), splits(featureIndex))
    +      if (this.leftChild.isEmpty) {
    +        // Not yet split. Return next layer of nodes to train
    +        if (splitLeft) {
    +          leftChild.get
    --- End diff --
    
    This won't work.  I'm running the tests to show the failure.
    
    This makes me realize my original plan (to return the Node itself) won't 
work; we'll need to stick with returning the ID.  But I like having the code in 
LearningNode.  Can you please update this method so it returns the ID, not the 
node?
    
    I'll update the JIRA to match.  Thanks!
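    To illustrate the requested change, here is a minimal, self-contained sketch of what the ID-returning version could look like. The `Split`, `LearningNode`, and `leftChildIndex`/`rightChildIndex` definitions below are simplified stand-ins for Spark's private tree classes (the real `shouldGoLeft` takes the candidate splits array, which is omitted here), so treat this as a shape suggestion, not the actual implementation:
    
    ```scala
    // Hypothetical, simplified stand-in for Spark's Split; the real class
    // compares against candidate splits, not a raw threshold.
    case class Split(featureIndex: Int, threshold: Double) {
      def shouldGoLeft(binnedFeature: Int): Boolean = binnedFeature <= threshold
    }
    
    object LearningNode {
      // 1-based binary-heap indexing, mirroring Spark's LearningNode helpers.
      def leftChildIndex(id: Int): Int = id * 2
      def rightChildIndex(id: Int): Int = id * 2 + 1
    }
    
    class LearningNode(
        val id: Int,
        var leftChild: Option[LearningNode],
        var rightChild: Option[LearningNode],
        var split: Option[Split],
        var isLeaf: Boolean) {
    
      /**
       * Get the ID of the node corresponding to this data point.
       * Mimics prediction, passing an example from the root down to a leaf
       * or unsplit node; that node's ID (not the node) is returned.
       */
      def predictImpl(binnedFeatures: Array[Int]): Int = {
        if (this.isLeaf || this.split.isEmpty) {
          this.id
        } else {
          val s = this.split.get
          val splitLeft = s.shouldGoLeft(binnedFeatures(s.featureIndex))
          if (this.leftChild.isEmpty) {
            // Not yet split: return the ID the child *would* have,
            // so the caller knows which node to train next.
            if (splitLeft) LearningNode.leftChildIndex(this.id)
            else LearningNode.rightChildIndex(this.id)
          } else {
            // Recurse into the existing child.
            if (splitLeft) this.leftChild.get.predictImpl(binnedFeatures)
            else this.rightChild.get.predictImpl(binnedFeatures)
          }
        }
      }
    }
    ```
    
    Returning the ID sidesteps the problem above: for an unsplit internal node there is no child `LearningNode` to return yet, but the child's index is always computable from the parent's.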


