This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.0 by this push:
new 58124bd [MINOR][SQL][3.0] Improve examples for `percentile_approx()`
58124bd is described below
commit 58124bd4e5ab2cfdfdc0a6b7c553c25678258c20
Author: Max Gekk <[email protected]>
AuthorDate: Wed Sep 23 20:14:12 2020 +0900
[MINOR][SQL][3.0] Improve examples for `percentile_approx()`
### What changes were proposed in this pull request?
In the PR, I propose to replace the current examples for `percentile_approx()`,
which use **only one** input value, with examples that use **multiple values**
in the input column.
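For illustration, the new multi-value example (values and expected output taken from the diff below) can be reproduced in a `spark-sql` session:
```sql
-- percentile_approx over a small integer column; the result holds the approximate
-- 50th, 40th and 10th percentiles of the four input rows.
SELECT percentile_approx(col, array(0.5, 0.4, 0.1), 100)
FROM VALUES (0), (1), (2), (10) AS tab(col);
-- expected result: [1,1,0]
```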
### Why are the changes needed?
The current examples are pretty trivial and don't demonstrate the function's
behaviour on a sequence of values.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
- by running `ExpressionInfoSuite`
- `./dev/scalastyle`
Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
(cherry picked from commit b53da23a28fe149cc75d593c5c36f7020a8a2752)
Signed-off-by: Max Gekk <max.gekk@gmail.com>
Closes #29848 from MaxGekk/example-percentile_approx-3.0.
Authored-by: Max Gekk <[email protected]>
Signed-off-by: HyukjinKwon <[email protected]>
---
.../catalyst/expressions/aggregate/ApproximatePercentile.scala | 8 ++++----
.../src/test/resources/sql-functions/sql-expression-schema.md | 4 ++--
2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
index d06eeee..32f21fc 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/ApproximatePercentile.scala
@@ -60,10 +60,10 @@ import org.apache.spark.sql.types._
""",
examples = """
Examples:
- > SELECT _FUNC_(10.0, array(0.5, 0.4, 0.1), 100);
- [10.0,10.0,10.0]
- > SELECT _FUNC_(10.0, 0.5, 100);
- 10.0
+ > SELECT _FUNC_(col, array(0.5, 0.4, 0.1), 100) FROM VALUES (0), (1), (2), (10) AS tab(col);
+ [1,1,0]
+ > SELECT _FUNC_(col, 0.5, 100) FROM VALUES (0), (6), (7), (9), (10) AS tab(col);
+ 7
""",
group = "agg_funcs",
since = "2.1.0")
diff --git a/sql/core/src/test/resources/sql-functions/sql-expression-schema.md b/sql/core/src/test/resources/sql-functions/sql-expression-schema.md
index 070a6f3..b84abe5 100644
--- a/sql/core/src/test/resources/sql-functions/sql-expression-schema.md
+++ b/sql/core/src/test/resources/sql-functions/sql-expression-schema.md
@@ -285,8 +285,8 @@
| org.apache.spark.sql.catalyst.expressions.XxHash64 | xxhash64 | SELECT xxhash64('Spark', array(123), 2) | struct<xxhash64(Spark, array(123), 2):bigint> |
| org.apache.spark.sql.catalyst.expressions.Year | year | SELECT year('2016-07-30') | struct<year(CAST(2016-07-30 AS DATE)):int> |
| org.apache.spark.sql.catalyst.expressions.ZipWith | zip_with | SELECT zip_with(array(1, 2, 3), array('a', 'b', 'c'), (x, y) -> (y, x)) | struct<zip_with(array(1, 2, 3), array(a, b, c), lambdafunction(named_struct(y, namedlambdavariable(), x, namedlambdavariable()), namedlambdavariable(), namedlambdavariable())):array<struct<y:string,x:int>>> |
-| org.apache.spark.sql.catalyst.expressions.aggregate.ApproximatePercentile | approx_percentile | SELECT approx_percentile(10.0, array(0.5, 0.4, 0.1), 100) | struct<approx_percentile(10.0, array(0.5, 0.4, 0.1), 100):array<decimal(3,1)>> |
-| org.apache.spark.sql.catalyst.expressions.aggregate.ApproximatePercentile | percentile_approx | SELECT percentile_approx(10.0, array(0.5, 0.4, 0.1), 100) | struct<percentile_approx(10.0, array(0.5, 0.4, 0.1), 100):array<decimal(3,1)>> |
+| org.apache.spark.sql.catalyst.expressions.aggregate.ApproximatePercentile | approx_percentile | SELECT approx_percentile(col, array(0.5, 0.4, 0.1), 100) FROM VALUES (0), (1), (2), (10) AS tab(col) | struct<approx_percentile(col, array(0.5, 0.4, 0.1), 100):array<int>> |
+| org.apache.spark.sql.catalyst.expressions.aggregate.ApproximatePercentile | percentile_approx | SELECT percentile_approx(col, array(0.5, 0.4, 0.1), 100) FROM VALUES (0), (1), (2), (10) AS tab(col) | struct<percentile_approx(col, array(0.5, 0.4, 0.1), 100):array<int>> |
| org.apache.spark.sql.catalyst.expressions.aggregate.Average | avg | SELECT avg(col) FROM VALUES (1), (2), (3) AS tab(col) | struct<avg(col):double> |
| org.apache.spark.sql.catalyst.expressions.aggregate.Average | mean | SELECT mean(col) FROM VALUES (1), (2), (3) AS tab(col) | struct<mean(col):double> |
| org.apache.spark.sql.catalyst.expressions.aggregate.BitAndAgg | bit_and | SELECT bit_and(col) FROM VALUES (3), (5) AS tab(col) | struct<bit_and(col):int> |
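A side note on the schema rows above: the result type follows the input column type, so the old literal `10.0` gave `array<decimal(3,1)>` while the new integer-valued column gives `array<int>`. Assuming the `typeof()` function available since Spark 3.0, this can be sketched as:
```sql
-- Hedged check: the element type of the result tracks the type of the input column.
SELECT typeof(approx_percentile(col, array(0.5, 0.4, 0.1), 100))
FROM VALUES (0), (1), (2), (10) AS tab(col);
-- expected: array<int>
```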
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]