[GitHub] spark pull request #13440: [SPARK-15699] [ML] Implement a Chi-Squared test s...

erikerlandson Mon, 17 Sep 2018 15:50:27 -0700

Github user erikerlandson commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13440#discussion_r218252156
  
    --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/impurity/Gini.scala ---
    @@ -71,6 +71,23 @@ object Gini extends Impurity {
       @Since("1.1.0")
       def instance: this.type = this
     
    +  /**
    +   * :: DeveloperApi ::
    +   * p-values for test-statistic measures, unsupported for [[Gini]]
    +   */
    +  @Since("2.2.0")
    +  @DeveloperApi
    +  def calculate(calcL: ImpurityCalculator, calcR: ImpurityCalculator): Double =
    --- End diff --
    
    I suspect that the generalization is closer to my newer signature,
    `val pval = imp.calculate(leftImpurityCalculator, rightImpurityCalculator)`,
    where you have all the context from the left and right nodes. The existing
    gain-based calculation should fit into this framework by doing its current
    weighted average of purity gain.
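
As a rough illustration of how a gain-based measure might fit a two-calculator signature, here is a minimal, self-contained sketch. It is not Spark code: `ClassCounts`, `SplitMeasure`, and `GiniGain` are hypothetical names, and `ClassCounts` is a simplified stand-in for `ImpurityCalculator` that only holds per-class label counts. It assumes the parent's statistics can be recovered by summing the left and right counts.

```scala
// Hypothetical, simplified stand-in for an impurity calculator:
// it only holds per-class label counts for one side of a split.
final case class ClassCounts(counts: Vector[Double]) {
  def total: Double = counts.sum

  // Assumption: parent statistics are the element-wise sum of the children.
  def merge(other: ClassCounts): ClassCounts =
    ClassCounts(counts.zip(other.counts).map { case (a, b) => a + b })
}

// Hypothetical trait mirroring the (left, right) => Double shape of the
// signature discussed above.
trait SplitMeasure {
  def calculate(left: ClassCounts, right: ClassCounts): Double
}

object GiniGain extends SplitMeasure {
  // Gini impurity: 1 - sum of p_i^2, taken to be 0 for an empty node.
  def impurity(c: ClassCounts): Double = {
    val n = c.total
    if (n == 0.0) 0.0
    else 1.0 - c.counts.map(x => (x / n) * (x / n)).sum
  }

  // Gain = parent impurity minus the count-weighted average of the
  // child impurities, computed from the two children alone.
  override def calculate(left: ClassCounts, right: ClassCounts): Double = {
    val parent = left.merge(right)
    val n = parent.total
    if (n == 0.0) 0.0
    else {
      val weighted =
        (left.total / n) * impurity(left) + (right.total / n) * impurity(right)
      impurity(parent) - weighted
    }
  }
}
```

With this shape, a test-statistic measure and a gain-based measure could share one entry point, for example `GiniGain.calculate(ClassCounts(Vector(4.0, 0.0)), ClassCounts(Vector(0.0, 4.0)))` for a split that separates two classes perfectly. Whether the real `ImpurityCalculator` exposes enough state for this is something the PR itself would need to confirm.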



