This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.0 by this push:
new 58124bd [MINOR][SQL][3.0] Improve examples for `percentile_approx()`
58124bd is described below
commit 58124bd4e5ab2cfdfdc0a6b7c553c25678258c20
Author: Max Gekk <[email protected]>
AuthorDate: Wed Sep 23 20:14:12 2020 +0900
[MINOR][SQL][3.0] Improve examples for `percentile_approx()`
### What changes were proposed in this pull request?
In the PR, I propose to replace the current examples for `percentile_approx()`,
which use **only one** input value, with examples that use **multiple values**
in the input column.
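For illustration, the new multi-value example (values and expected output taken from the diff below) can be reproduced in a `spark-sql` session:
```sql
-- percentile_approx over a small integer column; the result holds the approximate
-- 50th, 40th and 10th percentiles of the four input rows.
SELECT percentile_approx(col, array(0.5, 0.4, 0.1), 100)
FROM VALUES (0), (1), (2), (10) AS tab(col);
-- expected result: [1,1,0]
```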
### Why are the changes needed?
The current examples are pretty trivial and don't demonstrate the function's
behaviour on a sequence of values.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
- by running `ExpressionInfoSuite`
- `./dev/scalastyle`
Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(cherry picked from commit b53da23a28fe149cc75d593c5c36f7020a8a2752)
Signed-off-by: Max Gekk <max.gekk@gmail.com>
Closes #29848 from MaxGekk/example-percentile_approx-3.0.
Authored-by: Max Gekk <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
---
.../catalyst/expressions/aggregate/ApproximatePercentile.scala | 8 ++++----
.../src/test/resources/sql-functions/sql-expression-schema.md | 4 ++--
2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
index d06eeee..32f21fc 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
@@ -60,10 +60,10 @@ import org.apache.spark.sql.types._
""",
examples = """
Examples:
- > SELECT _FUNC_(10.0, array(0.5, 0.4, 0.1), 100);
- [10.0,10.0,10.0]
- > SELECT _FUNC_(10.0, 0.5, 100);
- 10.0
+ > SELECT _FUNC_(col, array(0.5, 0.4, 0.1), 100) FROM VALUES (0), (1), (2), (10) AS tab(col);
+ [1,1,0]
+ > SELECT _FUNC_(col, 0.5, 100) FROM VALUES (0), (6), (7), (9), (10) AS tab(col);
+ 7
""",
group = "agg_funcs",
since = "2.1.0")
diff --git a/sql/core/src/test/resources/sql-functions/sql-expression-schema.md b/sql/core/src/test/resources/sql-functions/sql-expression-schema.md
index 070a6f3..b84abe5 100644
--- a/sql/core/src/test/resources/sql-functions/sql-expression-schema.md
+++ b/sql/core/src/test/resources/sql-functions/sql-expression-schema.md
@@ -285,8 +285,8 @@
| org.apache.spark.sql.catalyst.expressions.XxHash64 | xxhash64 | SELECT xxhash64('Spark', array(123), 2) | struct<xxhash64(Spark, array(123), 2):bigint> |
| org.apache.spark.sql.catalyst.expressions.Year | year | SELECT year('2016-07-30') | struct<year(CAST(2016-07-30 AS DATE)):int> |
| org.apache.spark.sql.catalyst.expressions.ZipWith | zip_with | SELECT zip_with(array(1, 2, 3), array('a', 'b', 'c'), (x, y) -> (y, x)) | struct<zip_with(array(1, 2, 3), array(a, b, c), lambdafunction(named_struct(y, namedlambdavariable(), x, namedlambdavariable()), namedlambdavariable(), namedlambdavariable())):array<struct<y:string,x:int>>> |
-| org.apache.spark.sql.catalyst.expressions.aggregate.ApproximatePercentile | approx_percentile | SELECT approx_percentile(10.0, array(0.5, 0.4, 0.1), 100) | struct<approx_percentile(10.0, array(0.5, 0.4, 0.1), 100):array<decimal(3,1)>> |
-| org.apache.spark.sql.catalyst.expressions.aggregate.ApproximatePercentile | percentile_approx | SELECT percentile_approx(10.0, array(0.5, 0.4, 0.1), 100) | struct<percentile_approx(10.0, array(0.5, 0.4, 0.1), 100):array<decimal(3,1)>> |
+| org.apache.spark.sql.catalyst.expressions.aggregate.ApproximatePercentile | approx_percentile | SELECT approx_percentile(col, array(0.5, 0.4, 0.1), 100) FROM VALUES (0), (1), (2), (10) AS tab(col) | struct<approx_percentile(col, array(0.5, 0.4, 0.1), 100):array<int>> |
+| org.apache.spark.sql.catalyst.expressions.aggregate.ApproximatePercentile | percentile_approx | SELECT percentile_approx(col, array(0.5, 0.4, 0.1), 100) FROM VALUES (0), (1), (2), (10) AS tab(col) | struct<percentile_approx(col, array(0.5, 0.4, 0.1), 100):array<int>> |
| org.apache.spark.sql.catalyst.expressions.aggregate.Average | avg | SELECT avg(col) FROM VALUES (1), (2), (3) AS tab(col) | struct<avg(col):double> |
| org.apache.spark.sql.catalyst.expressions.aggregate.Average | mean | SELECT mean(col) FROM VALUES (1), (2), (3) AS tab(col) | struct<mean(col):double> |
| org.apache.spark.sql.catalyst.expressions.aggregate.BitAndAgg | bit_and | SELECT bit_and(col) FROM VALUES (3), (5) AS tab(col) | struct<bit_and(col):int> |
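A side note on the schema rows above: the result type follows the input column type, so the old literal `10.0` gave `array<decimal(3,1)>` while the new integer-valued column gives `array<int>`. Assuming the `typeof()` function available since Spark 3.0, this can be sketched as:
```sql
-- Hedged check: the element type of the result tracks the type of the input column.
SELECT typeof(approx_percentile(col, array(0.5, 0.4, 0.1), 100))
FROM VALUES (0), (1), (2), (10) AS tab(col);
-- expected: array<int>
```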
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]