GitHub user JulienPeloton commented on a diff in the pull request:

    https://github.com/apache/spark/pull/23025#discussion_r234099956
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ---
    @@ -2813,6 +2819,11 @@ class Dataset[T] private[sql](
        * When no explicit sort order is specified, "ascending nulls first" is assumed.
        * Note, the rows are not sorted in each partition of the resulting Dataset.
        *
    +   * [SPARK-26024] Note that due to performance reasons this method uses sampling to
    +   * estimate the ranges. Hence, the output may not be consistent, since sampling can return
    +   * different values. The sample size can be controlled by setting the value of the parameter
    +   * {{spark.sql.execution.rangeExchange.sampleSizePerPartition}}.
    --- End diff --
    
    Thanks. Done.
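
For anyone reading the note above and wondering why sampling makes the output inconsistent, here is a minimal sketch in plain Python. It is not Spark's implementation: `estimate_range_bounds`, the input data, and the seeds are all hypothetical, and the sampling is a simple uniform draw per partition. It only shows that range boundaries estimated from a small sample can change between runs, while a sample that covers every row gives fixed boundaries.

```python
import random


def estimate_range_bounds(partitions, num_ranges, sample_size_per_partition, seed):
    """Estimate num_ranges - 1 range boundaries from a per-partition sample.

    Hypothetical helper for illustration only; not Spark's algorithm.
    """
    rng = random.Random(seed)
    sample = []
    for part in partitions:
        # Draw at most sample_size_per_partition rows from each partition.
        k = min(sample_size_per_partition, len(part))
        sample.extend(rng.sample(part, k))
    sample.sort()
    # Use evenly spaced quantiles of the sorted sample as the boundaries.
    return [sample[(i * len(sample)) // num_ranges] for i in range(1, num_ranges)]


# Hypothetical input: the integers 0..9999, shuffled and split into 4 partitions.
data = list(range(10_000))
random.Random(0).shuffle(data)
partitions = [data[i::4] for i in range(4)]

# Small samples: the estimated boundaries depend on which rows were drawn.
bounds_a = estimate_range_bounds(partitions, 4, 20, seed=1)
bounds_b = estimate_range_bounds(partitions, 4, 20, seed=2)

# A sample that covers every row: the boundaries no longer depend on the seed.
exact = estimate_range_bounds(partitions, 4, 10_000, seed=3)

print("small sample, seed 1:", bounds_a)
print("small sample, seed 2:", bounds_b)
print("full sample:         ", exact)
```

The two small-sample runs will generally print different boundaries, so the same row can land in a different output partition from one run to the next. Raising the per-partition sample size, which is what `spark.sql.execution.rangeExchange.sampleSizePerPartition` controls in Spark, should narrow that variation at the cost of more sampling work.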


