huaxingao commented on a change in pull request #26305:
[SPARK-29645][ML][PYSPARK] ML add param RelativeError
URL: https://github.com/apache/spark/pull/26305#discussion_r340801735
##########
File path:
mllib/src/main/scala/org/apache/spark/ml/feature/QuantileDiscretizer.scala
##########
@@ -67,22 +68,6 @@ private[feature] trait QuantileDiscretizerBase extends Params
/** @group getParam */
def getNumBucketsArray: Array[Int] = $(numBucketsArray)
- /**
- * Relative error (see documentation for
- * `org.apache.spark.sql.DataFrameStatFunctions.approxQuantile` for description)
- * Must be in the range [0, 1].
- * Note that in multiple columns case, relative error is applied to all columns.
Review comment:
Nit: The line above seems to have been dropped from the new documentation.
Could it go somewhere else in the doc, perhaps at the end of line 97?
```
Since 2.3.0, `QuantileDiscretizer` can map multiple columns at once by setting the `inputCols`
parameter. If both of the `inputCol` and `inputCols` parameters are set, an Exception will be
thrown. To specify the number of buckets for each column, the `numBucketsArray` parameter can
be set, or if the number of buckets should be the same across columns, `numBuckets` can be
set as a convenience. Note that in multiple columns case, relative error is
applied to all columns.
```
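For context, a minimal sketch of the multi-column case that the note describes. It assumes Spark 2.3.0 or later on the classpath; the object name, column names, and data values are made up for illustration and are not part of this PR. The point is that `setRelativeError` takes a single value, which is then used for every column in `inputCols`, while `numBucketsArray` is set per column.

```scala
import org.apache.spark.ml.feature.QuantileDiscretizer
import org.apache.spark.sql.SparkSession

// Hypothetical example object, not taken from the Spark code base.
object MultiColumnDiscretizerSketch {

  // One estimator for two columns: bucket counts are per column,
  // but the single relativeError value applies to both "a" and "b".
  val discretizer: QuantileDiscretizer = new QuantileDiscretizer()
    .setInputCols(Array("a", "b"))
    .setOutputCols(Array("a_bucket", "b_bucket"))
    .setNumBucketsArray(Array(2, 3))
    .setRelativeError(0.01)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("multi-column-discretizer-sketch")
      .getOrCreate()
    import spark.implicits._

    // Toy data; the values are arbitrary.
    val df = Seq(
      (1.0, 10.0), (2.0, 20.0), (3.0, 30.0), (4.0, 40.0), (5.0, 50.0)
    ).toDF("a", "b")

    // fit() computes approximate quantile splits for each input column
    // and returns a Bucketizer that applies them.
    val bucketizer = discretizer.fit(df)
    bucketizer.transform(df).show()

    spark.stop()
  }
}
```

There is no per-column relative error setter in this sketch because the documented behaviour is one value shared across columns, which is why the note seems worth keeping in the doc.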