GitHub user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/19715#discussion_r153786414
--- Diff:
mllib/src/test/scala/org/apache/spark/ml/feature/QuantileDiscretizerSuite.scala
---
@@ -146,4 +146,166 @@ class QuantileDiscretizerSuite
val model = discretizer.fit(df)
assert(model.hasParent)
}
+
--- End diff --
We should add two tests:
1. Test that setting `numBuckets` gives the same result as explicitly
setting `numBucketsArray` with identical values.
2. Test that a single QuantileDiscretizer over multiple columns produces the
same results as two separate QuantileDiscretizers, one per column, over the
same columns (as we did for Bucketizer).
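
A rough sketch of what the two checks might look like, written as a standalone object rather than in the suite's own test style. It assumes the multi-column setters (`setInputCols`, `setOutputCols`, `setNumBucketsArray`) mirror Bucketizer's multi-column API. The object name, helper names, column names, sample data, and the local `SparkSession` are all hypothetical, and `relativeError` is set to 0.0 only so that approximate quantiles cannot differ between the runs being compared.

```scala
import org.apache.spark.ml.feature.QuantileDiscretizer
import org.apache.spark.sql.{DataFrame, Row, SparkSession}

// Hypothetical helper object; not part of the Spark test suite.
object QuantileDiscretizerChecks {
  // Assumption: a local session is enough for this small comparison.
  lazy val spark: SparkSession = SparkSession.builder()
    .master("local[2]")
    .appName("qd-checks")
    .getOrCreate()

  // Illustrative data: an id for stable ordering plus two numeric columns.
  def sampleData(): DataFrame = {
    import spark.implicits._
    (1 to 100)
      .map(i => (i, i.toDouble, ((i * 37) % 100).toDouble))
      .toDF("id", "input1", "input2")
  }

  // A discretizer configured for both columns; bucket counts are set by callers.
  private def multiColumn(): QuantileDiscretizer = new QuantileDiscretizer()
    .setInputCols(Array("input1", "input2"))
    .setOutputCols(Array("result1", "result2"))
    .setRelativeError(0.0)

  // Collect the bucketed columns in a deterministic row order.
  private def results(df: DataFrame): Array[Row] =
    df.orderBy("id").select("result1", "result2").collect()

  // Check 1: numBuckets = 5 versus numBucketsArray = Array(5, 5).
  def numBucketsMatchesArray(df: DataFrame): Boolean = {
    val shared = multiColumn().setNumBuckets(5)
    val explicit = multiColumn().setNumBucketsArray(Array(5, 5))
    results(shared.fit(df).transform(df))
      .sameElements(results(explicit.fit(df).transform(df)))
  }

  // Check 2: one multi-column discretizer versus two single-column ones.
  def multiMatchesSingle(df: DataFrame): Boolean = {
    val multi = multiColumn().setNumBucketsArray(Array(5, 10))
    val qd1 = new QuantileDiscretizer()
      .setInputCol("input1").setOutputCol("result1")
      .setNumBuckets(5).setRelativeError(0.0)
    val qd2 = new QuantileDiscretizer()
      .setInputCol("input2").setOutputCol("result2")
      .setNumBuckets(10).setRelativeError(0.0)
    val step1 = qd1.fit(df).transform(df)
    val step2 = qd2.fit(step1).transform(step1)
    results(multi.fit(df).transform(df)).sameElements(results(step2))
  }
}
```

In the suite itself these would presumably become two `test(...)` cases using the shared test session, with the Boolean results turned into assertions.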
---