GitHub user mengxr commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11325#discussion_r53824820
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameStatFunctions.scala ---
    @@ -37,13 +37,27 @@ import org.apache.spark.util.sketch.{BloomFilter, CountMinSketch}
     final class DataFrameStatFunctions private[sql](df: DataFrame) {
     
       /**
    -   * Calculate the approximate quantile of numerical column of a DataFrame.
    +   * Calculates the approximate quantiles of a numerical column of a DataFrame.
    +   *
    +   * Note on the target error:
    +   *
    +   * The result of this algorithm has the following deterministic bound:
    +   * if the DataFrame has N elements and if we request the quantile `phi` up to error `epsi`,
    +   * then the algorithm will return a sample `x` from the DataFrame so that the *exact* rank
    +   * of `x` is close to (phi * N). More precisely:
    +   *
    +   *   floor((phi - epsi) * N) <= rank(x) <= ceil((phi + epsi) * N)
    +   *
    +   *
        * @param col the name of the column
    -   * @param quantile the quantile number
    -   * @return the approximate quantile
    +   * @param quantiles a list of quantiles to approximate. Each number must belong to [0, 1]. For example 0 is the
    --- End diff --
    
    * Please rename `quantiles` to `probabilities`. For the documentation, we can take R's doc as a reference: http://www.inside-r.org/r-doc/stats/quantile.
    * This line is too long. Please run `dev/lint-scala` to check style.
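
The rank bound quoted in the Scaladoc above can be checked against a small in-memory sequence. The following is a minimal plain-Scala sketch, not Spark's implementation: `RankBound`, `rank`, and `withinBound` are hypothetical names, and the rank convention (number of elements less than or equal to `x`) is an assumption, since the diff does not define it.

```scala
// Illustrative only: checks the documented bound
//   floor((phi - epsi) * N) <= rank(x) <= ceil((phi + epsi) * N)
// on a local collection. All names here are hypothetical.
object RankBound {
  /** Rank of x, assumed here to be the number of elements <= x. */
  def rank(data: Seq[Double], x: Double): Long =
    data.count(_ <= x).toLong

  /** True if the exact rank of x falls inside the documented interval. */
  def withinBound(data: Seq[Double], x: Double, phi: Double, epsi: Double): Boolean = {
    val n = data.size.toDouble
    val lower = math.floor((phi - epsi) * n).toLong
    val upper = math.ceil((phi + epsi) * n).toLong
    val r = rank(data, x)
    lower <= r && r <= upper
  }
}
```

With the values 1.0 to 100.0, `phi = 0.5`, and `epsi = 0.05`, the element 50.0 has rank 50 and satisfies the bound, while 70.0 has rank 70 and does not.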

