HyukjinKwon commented on a change in pull request #28133: [SPARK-31156][SQL] DataFrameStatFunctions API to be consistent with respect to Column type
URL: https://github.com/apache/spark/pull/28133#discussion_r404546852
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameStatFunctions.scala
 ##########
 @@ -97,14 +97,38 @@ final class DataFrameStatFunctions private[sql](df: DataFrame) {
       cols: Array[String],
       probabilities: Array[Double],
       relativeError: Double): Array[Array[Double]] = {
-    StatFunctions.multipleApproxQuantiles(
-      df.select(cols.map(col): _*),
+    approxQuantile(cols.map(df.col), probabilities, relativeError)
+  }
+
+  /**
+   * Calculates the approximate quantiles of numerical columns of a DataFrame.
+   * @see `approxQuantile(col:Str* approxQuantile)` for detailed description.
+   *
+   * @param cols the numerical columns
+   * @param probabilities a list of quantile probabilities
+   *   Each number must belong to [0, 1].
+   *   For example 0 is the minimum, 0.5 is the median, 1 is the maximum.
+   * @param relativeError The relative target precision to achieve (greater than or equal to 0).
+   *   If set to zero, the exact quantiles are computed, which could be very expensive.
+   *   Note that values greater than 1 are accepted but give the same result as 1.
+   * @return the approximate quantiles at the given probabilities of each column
+   *
+   * @note null and NaN values will be ignored in numerical columns before calculation. For
+   *   columns only containing null or NaN values, an empty array is returned.
+   *
+   * @since 3.0.0
 
 Review comment:
   nit: 3.0.0 -> 3.1.0
   New features will land in Spark 3.1.0 because `branch-3.0` for Spark 3.0 has already been cut and is code-frozen.

