[GitHub] spark pull request #20121: [SPARK-22882][ML][TESTS] ML test for structured s...

jkbradley Mon, 05 Mar 2018 10:52:16 -0800

Github user jkbradley commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20121#discussion_r172288890
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/LogisticRegressionSuite.scala ---
    @@ -2567,10 +2504,13 @@ class LogisticRegressionSuite
         val model1 = lr.fit(smallBinaryDataset)
         val lr2 = new LogisticRegression().setInitialModel(model1).setMaxIter(5).setFamily("binomial")
         val model2 = lr2.fit(smallBinaryDataset)
    -    val predictions1 = model1.transform(smallBinaryDataset).select("prediction").collect()
    -    val predictions2 = model2.transform(smallBinaryDataset).select("prediction").collect()
    -    predictions1.zip(predictions2).foreach { case (Row(p1: Double), Row(p2: Double)) =>
    -      assert(p1 === p2)
    +    val binaryExpected = model1.transform(smallBinaryDataset).select("prediction").collect()
    +      .map(_.getDouble(0))
    +    for (model <- Seq(model1, model2)) {
    --- End diff --
    
    My thought is that checking binaryExpected (computed from model1) against model2 alone would already cover the two things we care about:
    * batch vs. streaming prediction
    * the initial model
    
    I'll merge this as is, though, since it's not a big deal (just slightly longer test time).
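
The coverage argument above can be illustrated with a minimal sketch in plain Scala, without Spark. Everything in it is a hypothetical stand-in rather than the suite's actual helpers: `ReviewSketch`, `batchPredict`, and `streamingPredict` are invented names, a "model" is reduced to a scoring function, and `model2` is simply assumed to agree with `model1`, as a correctly warm-started model would.

```scala
// Hypothetical stand-ins, not Spark APIs: a "model" is just a scoring function,
// and "batch" / "streaming" are two ways of applying it to the same rows.
object ReviewSketch {
  type Model = Double => Double

  // Batch path: score all rows at once.
  def batchPredict(model: Model, rows: Seq[Double]): Seq[Double] =
    rows.map(model)

  // Simulated streaming path: score the rows in micro-batches of two.
  def streamingPredict(model: Model, rows: Seq[Double]): Seq[Double] =
    rows.grouped(2).flatMap(_.map(model)).toList

  def main(args: Array[String]): Unit = {
    val rows = Seq(-1.5, -0.2, 0.3, 2.0)
    val model1: Model = x => if (x > 0.0) 1.0 else 0.0
    // Assumption: stands in for the model fit with model1 as its initial model.
    val model2: Model = model1

    // Expected predictions come from model1 on the batch path only.
    val binaryExpected = batchPredict(model1, rows)

    // Checking model2 alone exercises both concerns: its streaming output must
    // match a batch result, and that batch result comes from the initial model.
    assert(streamingPredict(model2, rows) == binaryExpected)
    assert(batchPredict(model2, rows) == binaryExpected)
  }
}
```

Looping over both models, as the diff does, additionally compares model1 with its own batch output on the streaming path, which is where the extra test time comes from.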


