GitHub user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/16774#discussion_r136928086
--- Diff: mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala ---
@@ -120,6 +120,33 @@ class CrossValidatorSuite
}
}
+  test("cross validation with parallel evaluation") {
+    val lr = new LogisticRegression
+    val lrParamMaps = new ParamGridBuilder()
+      .addGrid(lr.regParam, Array(0.001, 1000.0))
+      .addGrid(lr.maxIter, Array(0, 3))
+      .build()
+    val eval = new BinaryClassificationEvaluator
+    val cv = new CrossValidator()
+      .setEstimator(lr)
+      .setEstimatorParamMaps(lrParamMaps)
+      .setEvaluator(eval)
+      .setNumFolds(2)
+      .setParallelism(1)
--- End diff --
Yeah, `seed` defaults to a hash of the class name, so it is fixed unless you
set it explicitly. There has been debate over this (see
[SPARK-16832](https://issues.apache.org/jira/browse/SPARK-16832)).
Personally I don't like that behavior either, but for now that's what it is.
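
To make the consequence concrete, here is a minimal sketch in plain Scala with no Spark dependency. The trait `HasDefaultSeed` and the two `Toy*` classes are hypothetical stand-ins, not Spark's actual classes. They only mimic the idea of defaulting the seed to a hash of the class name, assuming the default is computed along the lines of `getClass.getName.hashCode.toLong`.

```scala
// Hypothetical stand-in for a seed param whose default is derived from
// the concrete class name. Not Spark code; for illustration only.
trait HasDefaultSeed {
  private var seedOpt: Option[Long] = None

  // Explicitly setting a seed overrides the class-name default.
  def setSeed(value: Long): this.type = { seedOpt = Some(value); this }

  // Falls back to the hash of the runtime class name when no seed is set.
  def getSeed: Long = seedOpt.getOrElse(getClass.getName.hashCode.toLong)
}

class ToyCrossValidator extends HasDefaultSeed
class ToyTrainValidationSplit extends HasDefaultSeed

object SeedDemo {
  def main(args: Array[String]): Unit = {
    val a = new ToyCrossValidator
    val b = new ToyCrossValidator

    // Two fresh instances of the same class share the same default seed,
    // so anything driven by that seed repeats from run to run.
    assert(a.getSeed == b.getSeed)

    // The default depends only on the class name.
    println(s"ToyCrossValidator default seed: ${a.getSeed}")
    println(s"ToyTrainValidationSplit default seed: ${(new ToyTrainValidationSplit).getSeed}")

    // An explicit seed takes precedence over the default.
    assert(a.setSeed(42L).getSeed == 42L)
  }
}
```

Under this assumption, a test that never sets a seed still gets the same value on every run, which is why the outcome is deterministic here even though nothing in the test fixes a seed.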