[GitHub] spark pull request #16774: [SPARK-19357][ML] Adding parallel model evaluatio...

MLnick Tue, 05 Sep 2017 01:49:54 -0700

Github user MLnick commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16774#discussion_r136928086
  
    --- Diff: mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala ---
    @@ -120,6 +120,33 @@ class CrossValidatorSuite
         }
       }
     
    +  test("cross validation with parallel evaluation") {
    +    val lr = new LogisticRegression
    +    val lrParamMaps = new ParamGridBuilder()
    +      .addGrid(lr.regParam, Array(0.001, 1000.0))
    +      .addGrid(lr.maxIter, Array(0, 3))
    +      .build()
    +    val eval = new BinaryClassificationEvaluator
    +    val cv = new CrossValidator()
    +      .setEstimator(lr)
    +      .setEstimatorParamMaps(lrParamMaps)
    +      .setEvaluator(eval)
    +      .setNumFolds(2)
    +      .setParallelism(1)
    --- End diff --
    
    Yeah, the seed defaults to a hash of the class name. There has been debate
    over this (see [SPARK-16832](https://issues.apache.org/jira/browse/SPARK-16832)).
    Personally, I don't like that behavior either, but for now that's how it works.
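
To make the point above concrete, here is a minimal sketch of what "defaults to a hash of the class name" implies. It uses plain Scala with no Spark dependency. `SeedSketch` and `ValidatorSketch` are hypothetical names invented for illustration, and the default expression is an assumption about how such a default could be computed, not a copy of Spark's source.

```scala
// Illustrative sketch only: SeedSketch and ValidatorSketch are hypothetical
// names, not Spark classes.
trait SeedSketch {
  // Assumed default: hash of the runtime class name, widened to Long.
  private var seedValue: Long = this.getClass.getName.hashCode.toLong

  def getSeed: Long = seedValue

  // Returns this.type so calls can be chained, builder-style.
  def setSeed(value: Long): this.type = {
    seedValue = value
    this
  }
}

class ValidatorSketch extends SeedSketch

object SeedDefaultDemo {
  def main(args: Array[String]): Unit = {
    val a = new ValidatorSketch
    val b = new ValidatorSketch

    // Two fresh instances of the same class share the same default seed,
    // so the default is deterministic rather than random per instance.
    assert(a.getSeed == b.getSeed)

    // An explicit seed overrides the class-name default.
    val c = new ValidatorSketch().setSeed(42L)
    assert(c.getSeed == 42L)

    println(s"default seed = ${a.getSeed}, explicit seed = ${c.getSeed}")
  }
}
```

Under this scheme a test that never sets a seed still gets the same value on every run. That is convenient for reproducibility but can surprise anyone expecting a random default, which is the concern raised in the linked issue.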


