GitHub user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/20442#discussion_r164939903
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/QuantileDiscretizer.scala ---
@@ -167,25 +167,31 @@ final class QuantileDiscretizer @Since("1.6.0") (@Since("1.6.0") override val ui
@Since("2.3.0")
def setOutputCols(value: Array[String]): this.type = set(outputCols, value)
- private[feature] def getInOutCols: (Array[String], Array[String]) = {
- require((isSet(inputCol) && isSet(outputCol) && !isSet(inputCols) && !isSet(outputCols)) ||
- (!isSet(inputCol) && !isSet(outputCol) && isSet(inputCols) && isSet(outputCols)),
- "QuantileDiscretizer only supports setting either inputCol/outputCol or" +
- "inputCols/outputCols."
- )
+ @Since("1.6.0")
+ override def transformSchema(schema: StructType): StructType = {
+ ParamValidators.checkSingleVsMultiColumnParams(this, Seq(outputCol),
--- End diff --
Setting `numBucketsArray` in single-column mode should probably be treated as
an error. `checkSingleVsMultiColumnParams` doesn't cover this case, so I think
we may need to check it here.
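
A rough sketch of the kind of check meant here, written as standalone Scala so
it does not depend on Spark. `DiscretizerParams` and `SingleColumnCheck` are
hypothetical stand-ins for illustration, not classes from the PR; in the real
code the condition would presumably be built from `isSet(...)` calls inside
`transformSchema`.

```scala
// Hypothetical stand-in for the discretizer's params; not Spark's actual classes.
final case class DiscretizerParams(
    inputCol: Option[String] = None,
    outputCol: Option[String] = None,
    inputCols: Option[Array[String]] = None,
    outputCols: Option[Array[String]] = None,
    numBucketsArray: Option[Array[Int]] = None)

object SingleColumnCheck {
  // Throws IllegalArgumentException if numBucketsArray is set while the
  // single-column params (inputCol/outputCol) are in use.
  def validate(p: DiscretizerParams): Unit = {
    val singleColumn = p.inputCol.isDefined || p.outputCol.isDefined
    require(
      !(singleColumn && p.numBucketsArray.isDefined),
      "numBucketsArray must not be set together with inputCol/outputCol.")
  }
}
```

With this shape, a single-column configuration that also sets
`numBucketsArray` fails fast, while the multi-column configuration is
unaffected.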
---