imatiach-msft commented on a change in pull request #21632: 
[SPARK-19591][ML][MLlib] Add sample weights to decision trees
URL: https://github.com/apache/spark/pull/21632#discussion_r250051805
 
 

 ##########
 File path: mllib/src/test/scala/org/apache/spark/ml/util/MLTestingUtils.scala
 ##########
 @@ -268,4 +269,20 @@ object MLTestingUtils extends SparkFunSuite {
     assert(newDatasetF.schema(featuresColName).dataType.equals(new ArrayType(FloatType, false)))
     (newDataset, newDatasetD, newDatasetF)
   }
+
+  def modelPredictionEquals[M <: PredictionModel[_, M]](
 
 Review comment:
   For the regression case, it seems I can slightly increase the tolerance and
get 99% of the predictions within tolerance, but there still seems to be a
prediction that differs. The difference comes from the models themselves being
slightly different because of propagation of error (e.g. the splits in the
trees are slightly different, and over the course of training the trees
diverge). For the classification case, the predictions differ more: we are
comparing the 0/1 labels, so the tolerance isn't used there. Again, the
difference in the models seems to be due to propagation of error.
   I've updated the regressor tests; unfortunately, for classification I don't
think I can do much else.
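
   To illustrate the kind of check described above, here is a minimal
sketch in plain Scala. It is not the `modelPredictionEquals` helper from this
PR; the object name `ToleranceCheck`, both method names, and the parameters
`absTol` and `minFraction` are hypothetical and chosen only for this example.
It computes the fraction of prediction pairs that agree within an absolute
tolerance and passes when that fraction reaches a threshold.

```scala
// Hypothetical helper for illustration only; not part of MLTestingUtils.
object ToleranceCheck {

  /** Fraction of prediction pairs whose absolute difference is at most absTol. */
  def fractionWithinTolerance(
      a: Seq[Double],
      b: Seq[Double],
      absTol: Double): Double = {
    require(a.length == b.length, "prediction sequences must have the same length")
    require(a.nonEmpty, "prediction sequences must be non-empty")
    val hits = a.zip(b).count { case (x, y) => math.abs(x - y) <= absTol }
    hits.toDouble / a.length
  }

  /** True when at least minFraction of the pairs agree within absTol. */
  def predictionsRoughlyEqual(
      a: Seq[Double],
      b: Seq[Double],
      absTol: Double,
      minFraction: Double): Boolean =
    fractionWithinTolerance(a, b, absTol) >= minFraction
}
```

   With 0/1 class labels a tolerance has no real effect, since two labels
either match or differ by 1; passing `absTol = 0.0` reduces the check to an
exact-match fraction, so only `minFraction` can be relaxed in that case.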

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
