Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/19993#discussion_r162717142
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/Bucketizer.scala ---
@@ -201,9 +184,13 @@ final class Bucketizer @Since("1.4.0") (@Since("1.4.0") override val uid: String)
@Since("1.4.0")
override def transformSchema(schema: StructType): StructType = {
- if (isBucketizeMultipleColumns()) {
+ ParamValidators.checkExclusiveParams(this, "inputCol", "inputCols")
--- End diff ---
The problem with using a general method like this is that it is hard to capture
model-specific requirements. As written, it misses two checks: that exactly one
(not just <= 1) Param from each pair is set, and that either all of the
single-column Params or all of the multi-column Params are set. (The same issue
occurs in https://github.com/apache/spark/pull/20146 .) It will also be hard to
perform these checks while accounting for default values.
I'd argue that it's not worth trying to use generic checking functions here.
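To illustrate the missing checks, here is a minimal sketch (plain Scala, not Spark's actual `ParamValidators` API; the names `validate`, `inputCol`, etc. are hypothetical stand-ins) of a model-specific validation that requires either all single-column Params or all multi-column Params to be set, and never a mix or neither:

```scala
// Hypothetical sketch, not Spark code: a transformer is validly configured
// only when ALL single-column Params are set and NO multi-column Params are,
// or vice versa. A generic "exclusive params" check that only forbids setting
// both of (inputCol, inputCols) would miss the "exactly one mode, fully set"
// requirement discussed above.
object ExclusiveParamCheck {
  def validate(
      inputCol: Option[String],
      outputCol: Option[String],
      inputCols: Option[Seq[String]],
      outputCols: Option[Seq[String]]): Either[String, Unit] = {
    val singleSet = Seq(inputCol.isDefined, outputCol.isDefined)
    val multiSet = Seq(inputCols.isDefined, outputCols.isDefined)
    if (singleSet.forall(identity) && multiSet.forall(s => !s)) {
      Right(()) // single-column mode, fully configured
    } else if (multiSet.forall(identity) && singleSet.forall(s => !s)) {
      Right(()) // multi-column mode, fully configured
    } else {
      Left("Set either all of (inputCol, outputCol) or all of " +
        "(inputCols, outputCols), but not a mix and not neither.")
    }
  }
}
```

Note that even this sketch ignores default values; a Param with a default but no user-set value would need separate handling, which is part of why a generic helper is hard to get right.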
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]