[ https://issues.apache.org/jira/browse/SPARK-24102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16717505#comment-16717505 ]
ASF GitHub Bot commented on SPARK-24102:
----------------------------------------

imatiach-msft commented on a change in pull request #17085: [SPARK-24102][ML][MLLIB] ML Evaluators should use weight column - added weight column for regression evaluator
URL: https://github.com/apache/spark/pull/17085#discussion_r240690346

 ##########
 File path: mllib/src/main/scala/org/apache/spark/mllib/stat/MultivariateOnlineSummarizer.scala
 ##########
 @@ -52,7 +52,7 @@ class MultivariateOnlineSummarizer extends MultivariateStatisticalSummary with S
    private var totalCnt: Long = 0
    private var totalWeightSum: Double = 0.0
    private var weightSquareSum: Double = 0.0
 -  private var weightSum: Array[Double] = _
 +  private var currWeightSum: Array[Double] = _

 Review comment:
   Never mind, it looks like the build failed because the private variable conflicts with the public function that was defined:

   /**
    * Sum of weights.
    */
   override def weightSum: Double = totalWeightSum

   I think this may be the best name for the public function, so I would prefer to keep it. The private variable now follows the naming convention of the other private array variables, so I think this makes sense.

> RegressionEvaluator should use sample weight data
> -------------------------------------------------
>
>                 Key: SPARK-24102
>                 URL: https://issues.apache.org/jira/browse/SPARK-24102
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>    Affects Versions: 2.0.2
>            Reporter: Ilya Matiach
>            Priority: Major
>              Labels: starter
>
> The LogisticRegression and LinearRegression models support training with a
> weight column, but the corresponding evaluators do not support computing
> metrics using those weights.
> This breaks model selection using CrossValidator.
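
To make the issue description concrete, here is a minimal sketch of what "computing metrics using those weights" could mean for RMSE. It is plain Scala, not Spark's RegressionEvaluator code: the object `WeightedRmseSketch`, the function `weightedRmse`, and the (prediction, label, weight) row layout are all hypothetical names chosen for illustration, and weights are assumed to be non-negative.

```scala
// Illustrative only: a hypothetical helper, not Spark's implementation.
object WeightedRmseSketch {

  /** Each row is (prediction, label, weight); weights are assumed non-negative. */
  def weightedRmse(rows: Seq[(Double, Double, Double)]): Double = {
    val weightSum = rows.map(_._3).sum
    require(weightSum > 0.0, "total weight must be positive")

    // Each squared error counts in proportion to its sample weight.
    val weightedSquaredError = rows.map { case (prediction, label, weight) =>
      val err = prediction - label
      weight * err * err
    }.sum

    math.sqrt(weightedSquaredError / weightSum)
  }

  def main(args: Array[String]): Unit = {
    val rows = Seq((1.0, 1.0, 1.0), (2.0, 4.0, 3.0))
    // Same predictions and labels, with every weight forced to 1.0.
    val unweighted = rows.map { case (p, l, _) => (p, l, 1.0) }
    println(s"weighted RMSE:   ${weightedRmse(rows)}")
    println(s"unweighted RMSE: ${weightedRmse(unweighted)}")
  }
}
```

The two printed values differ for the same predictions, which is why an evaluator that ignores the weight column can rank models differently from the weighted objective they were trained on.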
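
The naming conflict discussed in the review comment comes from Scala placing fields and methods in the same namespace, so a class cannot declare both a `weightSum` field and a `weightSum` method. The sketch below shows the shape of the fix under simplified assumptions: `Summary` and `TinySummarizer` are hypothetical stand-ins rather than the real MultivariateOnlineSummarizer, and the per-dimension bookkeeping is only a guess at how such an array might be used.

```scala
// Hypothetical stand-in for the public summary interface.
trait Summary {
  def weightSum: Double
}

// Hypothetical, heavily simplified summarizer used only to show the naming issue.
class TinySummarizer(dim: Int) extends Summary {
  private var totalWeightSum: Double = 0.0

  // Calling this field `weightSum` would not compile, because it would clash
  // with the public `weightSum` method below (fields and methods share a namespace).
  private val currWeightSum: Array[Double] = Array.fill(dim)(0.0)

  /** Adds one sample; per-dimension weight is tracked for non-zero values only (an assumption). */
  def add(values: Array[Double], weight: Double): this.type = {
    require(values.length == dim, "dimension mismatch")
    require(weight >= 0.0, "weight must be non-negative")
    var i = 0
    while (i < dim) {
      if (values(i) != 0.0) currWeightSum(i) += weight
      i += 1
    }
    totalWeightSum += weight
    this
  }

  /** Sum of weights. */
  override def weightSum: Double = totalWeightSum

  /** Defensive copy of the per-dimension weight sums. */
  def nonZeroWeightSums: Array[Double] = currWeightSum.clone()
}
```

Keeping the public name `weightSum` and renaming only the private array leaves the public API unchanged, which matches the preference stated in the review comment.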