[GitHub] spark pull request: [SPARK-11439][ML] Optimization of creating spa...

holdenk Thu, 03 Dec 2015 12:06:02 -0800

Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9756#discussion_r46603847
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/evaluation/RegressionEvaluatorSuite.scala ---
    @@ -65,15 +65,15 @@ class RegressionEvaluatorSuite
     
         // default = rmse
         val evaluator = new RegressionEvaluator()
    -    assert(evaluator.evaluate(predictions) ~== 0.1019382 absTol 0.001)
    +    assert(evaluator.evaluate(predictions) ~== 0.1013829 absTol 0.001)
    --- End diff --
    
    @srowen I think the difference is this: the original tolerances were based 
on the idea that this should match the R implementation's values closely, but 
since we've changed the data (it still has the same global distribution), the 
exact value is a little different. So it's a question of whether we want the 
tolerance to be based on R predicting on the same data set, or on what the 
result should look like for a model trained on data with this distribution. 
Does that sound correct, or am I out in left field?
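
To make the tolerance question concrete, here is a minimal sketch of an absolute-tolerance check applied to the two expected values from the diff. `AbsTolSketch` and `withinAbsTol` are hypothetical names used only for illustration, and the strict `<` comparison is an assumption about how an `absTol` check behaves, not a copy of Spark's test utilities.

```scala
// Hypothetical helper for illustration; not Spark's TestingUtils.
object AbsTolSketch {
  // Assumed semantics: two values match when their absolute
  // difference is strictly smaller than the tolerance.
  def withinAbsTol(actual: Double, expected: Double, tol: Double): Boolean =
    math.abs(actual - expected) < tol

  def main(args: Array[String]): Unit = {
    val oldExpected = 0.1019382 // expected RMSE before the diff
    val newExpected = 0.1013829 // expected RMSE after the diff
    val tol = 0.001             // absTol used in the test

    // How far apart the two expected values are.
    val gap = math.abs(oldExpected - newExpected)
    println(s"gap between expected values: $gap")
    println(s"within absTol $tol: ${withinAbsTol(oldExpected, newExpected, tol)}")
  }
}
```

The two expected values differ by roughly 0.00056, which is smaller than the 0.001 tolerance, so under these assumed semantics the tolerance is wide enough to cover both. A tighter tolerance such as 0.0001 would separate them.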




