GitHub user mgaido91 commented on a diff in the pull request:
https://github.com/apache/spark/pull/19993#discussion_r162955686
--- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/BucketizerSuite.scala ---
@@ -401,15 +390,14 @@ class BucketizerSuite extends SparkFunSuite with MLlibTestSparkContext with Defa
     }
   }
-  test("Both inputCol and inputCols are set") {
-    val bucket = new Bucketizer()
-      .setInputCol("feature1")
-      .setOutputCol("result")
-      .setSplits(Array(-0.5, 0.0, 0.5))
-      .setInputCols(Array("feature1", "feature2"))
-
-    // When both are set, we ignore `inputCols` and just map the column specified by `inputCol`.
-    assert(bucket.isBucketizeMultipleColumns() == false)
+  test("assert exception is thrown if both multi-column and single-column params are set") {
+    val df = Seq((0.5, 0.3), (0.5, -0.4)).toDF("feature1", "feature2")
+    ParamsSuite.testExclusiveParams(new Bucketizer, df, ("inputCol", "feature1"),
+      ("inputCols", Array("feature1", "feature2")))
+    ParamsSuite.testExclusiveParams(new Bucketizer, df, ("outputCol", "result1"),
+      ("outputCols", Array("result1", "result2")))
+    ParamsSuite.testExclusiveParams(new Bucketizer, df, ("splits", Array(-0.5, 0.0, 0.5)),
--- End diff --
@MLnick actually, it will fail for both reasons. We can add more test cases
that check each of the two cases separately, if you think they are needed.