GitHub user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/18281#discussion_r131104871
--- Diff: mllib/src/test/scala/org/apache/spark/ml/classification/OneVsRestSuite.scala ---
@@ -101,6 +101,44 @@ class OneVsRestSuite extends SparkFunSuite with MLlibTestSparkContext with Defau
assert(expectedMetrics.confusionMatrix ~== ovaMetrics.confusionMatrix absTol 400)
}
+ test("one-vs-rest: tuning parallelism does not change output") {
+ val ovaPar1 = new OneVsRest()
+ .setClassifier(new LogisticRegression)
+
+ val ovaModelPar1 = ovaPar1.fit(dataset)
+
+ val transformedDatasetPar1 = ovaModelPar1.transform(dataset)
+
+ val ovaResultsPar1 = transformedDatasetPar1.select("prediction", "label").rdd.map {
+ row => (row.getDouble(0), row.getDouble(1))
+ }
+
+ val ovaPar2 = new OneVsRest()
+ .setClassifier(new LogisticRegression)
+ .setParallelism(2)
+
+ val ovaModelPar2 = ovaPar2.fit(dataset)
+
+ val transformedDatasetPar2 = ovaModelPar2.transform(dataset)
+
+ val ovaResultsPar2 = transformedDatasetPar2.select("prediction", "label").rdd.map {
+ row => (row.getDouble(0), row.getDouble(1))
+ }
+
+ val metricsPar1 = new MulticlassMetrics(ovaResultsPar1)
+ val metricsPar2 = new MulticlassMetrics(ovaResultsPar2)
+ assert(metricsPar1.confusionMatrix == metricsPar2.confusionMatrix)
+
+ ovaModelPar1.models.zip(ovaModelPar2.models).foreach {
+ case (lrModel1: LogisticRegressionModel, lrModel2: LogisticRegressionModel) =>
+ assert(lrModel1.coefficients === lrModel2.coefficients)
--- End diff --
Perhaps we should use the approximate-equality comparison (~== with a
tolerance, as in the confusion-matrix assertion earlier in this suite) for the
vectors and matrices here and above? The test does pass as written, but a
tolerance-based check would guard against future flakiness. It would also be
more consistent, since the Python tests already compare approximately.
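
To illustrate why a tolerance-based check is less brittle than exact equality, here is a minimal, self-contained sketch. It does not use Spark's test utilities: `approxEqual` is a hypothetical helper over plain `Array[Double]` values, and the names `coefficientsPar1`/`coefficientsPar2` and the tolerance `1e-9` are illustrative assumptions, not values taken from the PR. Two coefficient arrays that differ only by floating-point rounding fail an exact comparison but pass the approximate one.

```scala
// Hypothetical helper for illustration only; not Spark's API.
object ApproxEquality {
  // True when both arrays have the same length and every pair of
  // elements differs by at most absTol.
  def approxEqual(a: Array[Double], b: Array[Double], absTol: Double): Boolean =
    a.length == b.length &&
      a.zip(b).forall { case (x, y) => math.abs(x - y) <= absTol }

  def main(args: Array[String]): Unit = {
    // Stand-ins for coefficients produced by two otherwise identical fits;
    // 0.1 + 0.2 differs from 0.3 only by floating-point rounding.
    val coefficientsPar1 = Array(0.1 + 0.2, 1.0)
    val coefficientsPar2 = Array(0.3, 1.0)

    // Exact element-wise equality fails on the rounding difference.
    assert(!coefficientsPar1.sameElements(coefficientsPar2))

    // The tolerance-based comparison accepts it.
    assert(approxEqual(coefficientsPar1, coefficientsPar2, absTol = 1e-9))
  }
}
```

In the suite itself the change would presumably mirror the existing `~==` ... `absTol` assertion on the confusion matrix, with a tolerance chosen to suit the coefficients.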