Github user MLnick commented on a diff in the pull request:
https://github.com/apache/spark/pull/19715#discussion_r153774332
--- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/QuantileDiscretizer.scala ---
@@ -50,10 +50,26 @@ private[feature] trait QuantileDiscretizerBase extends Params
/** @group getParam */
def getNumBuckets: Int = getOrDefault(numBuckets)
+ /**
+ * Array of number of buckets (quantiles, or categories) into which data points are grouped.
+ *
+ * See also [[handleInvalid]], which can optionally create an additional bucket for NaN values.
+ *
+ * @group param
+ */
+ val numBucketsArray = new IntArrayParam(this, "numBucketsArray", "Array of number of buckets " +
+ "(quantiles, or categories) into which data points are grouped. This is for multiple " +
+ "columns input. If numBucketsArray is not set but numBuckets is set, it means user wants " +
--- End diff --
"If transforming multiple columns and numBucketsArray is not set, but
numBuckets is set, then numBuckets will be applied across all columns."
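
To make the suggested wording concrete, here is a minimal sketch of the fallback it describes. It is plain Scala with no Spark dependency. `BucketResolution`, `resolveNumBuckets`, and the length check are hypothetical names and behavior chosen for illustration, not code from this PR.

```scala
// Hypothetical helper (not from the PR): picks the bucket count for each input column.
object BucketResolution {
  def resolveNumBuckets(
      inputCols: Seq[String],
      numBucketsArray: Option[Seq[Int]],
      numBuckets: Int): Seq[Int] =
    numBucketsArray match {
      // numBucketsArray is set: use one entry per column.
      // The length check is an assumption made for this sketch.
      case Some(perColumn) =>
        require(perColumn.length == inputCols.length,
          "numBucketsArray must have one entry per input column")
        perColumn
      // numBucketsArray is not set: numBuckets is applied across all columns.
      case None =>
        Seq.fill(inputCols.length)(numBuckets)
    }
}
```

With three input columns, no numBucketsArray, and numBuckets = 4, this sketch yields 4 buckets for every column; passing an explicit per-column array returns that array unchanged.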