beliefer commented on code in PR #41444:
URL: https://github.com/apache/spark/pull/41444#discussion_r1217334172
##########
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala:
##########
@@ -812,6 +812,69 @@ object functions {
    */
   def min_by(e: Column, ord: Column): Column = Column.fn("min_by", e, ord)

+  /**
+   * Aggregate function: returns the exact percentile(s) of numeric column `expr` at the given
+   * percentage(s) with value range in [0.0, 1.0].
+   *
+   * @group agg_funcs
+   * @since 3.5.0
+   */
+  def percentile(e: Column, percentage: Column): Column = Column.fn("percentile", e, percentage)
+
+  /**
+   * Aggregate function: returns the exact percentile(s) of numeric column `expr` at the given
+   * percentage(s) with value range in [0.0, 1.0].
+   *
+   * @group agg_funcs
+   * @since 3.5.0
+   */
+  def percentile(e: Column, percentage: Column, frequency: Column): Column =
+    Column.fn("percentile", e, percentage, frequency)
+
+  /**
+   * Aggregate function: returns a percentile value based on a continuous distribution of numeric
+   * or ANSI interval column at the given percentage(s) with value range in [0.0, 1.0].
+   *
+   * @group agg_funcs
+   * @since 3.5.0
+   */
+  def percentile_cont(e: Column, percentage: Column): Column =
+    Column.fn("percentile_cont", e, percentage)
+
+  /**
+   * Aggregate function: returns a percentile value based on a continuous distribution of numeric
+   * or ANSI interval column at the given percentage(s) with value range in [0.0, 1.0].
+   *
+   * Note: `reverse` specifies whether to calculate the percentile value in reverse order.
+   *
+   * @group agg_funcs
+   * @since 3.5.0
+   */
+  def percentile_cont(e: Column, percentage: Column, reverse: Boolean): Column =
+    Column.fn("percentile_cont", e, percentage, lit(reverse))
+
+  /**
+   * Aggregate function: returns the percentile(s) based on a discrete distribution of numeric
+   * column `expr` at the given percentage(s) with value range in [0.0, 1.0].
+   *
+   * @group agg_funcs
+   * @since 3.5.0
+   */
+  def percentile_disc(e: Column, percentage: Column): Column =
+    Column.fn("percentile_disc", e, percentage)
+
+  /**
+   * Aggregate function: returns the percentile(s) based on a discrete distribution of numeric
+   * column `expr` at the given percentage(s) with value range in [0.0, 1.0].
+   *
+   * Note: `reverse` specifies whether to calculate the percentile value in reverse order.
+   *
+   * @group agg_funcs
+   * @since 3.5.0
+   */
+  def percentile_disc(e: Column, percentage: Column, reverse: Boolean): Column =

Review Comment:
I know that. But there already exist many similar cases,
such as:
https://github.com/apache/spark/blob/b00210bd0320afef282b68c8ef7f8d972e9e19f5/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala#L473
https://github.com/apache/spark/blob/b00210bd0320afef282b68c8ef7f8d972e9e19f5/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala#L489
https://github.com/apache/spark/blob/b00210bd0320afef282b68c8ef7f8d972e9e19f5/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala#L506
https://github.com/apache/spark/blob/b00210bd0320afef282b68c8ef7f8d972e9e19f5/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala#L621
https://github.com/apache/spark/blob/b00210bd0320afef282b68c8ef7f8d972e9e19f5/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala#L633
https://github.com/apache/spark/blob/b00210bd0320afef282b68c8ef7f8d972e9e19f5/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala#L686
https://github.com/apache/spark/blob/b00210bd0320afef282b68c8ef7f8d972e9e19f5/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala#L702
https://github.com/apache/spark/blob/b00210bd0320afef282b68c8ef7f8d972e9e19f5/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala#L1093
https://github.com/apache/spark/blob/b00210bd0320afef282b68c8ef7f8d972e9e19f5/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala#L1164
https://github.com/apache/spark/blob/b00210bd0320afef282b68c8ef7f8d972e9e19f5/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala#L1179
https://github.com/apache/spark/blob/b00210bd0320afef282b68c8ef7f8d972e9e19f5/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala#L2745
https://github.com/apache/spark/blob/b00210bd0320afef282b68c8ef7f8d972e9e19f5/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala#L2759
https://github.com/apache/spark/blob/b00210bd0320afef282b68c8ef7f8d972e9e19f5/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala#L3498
https://github.com/apache/spark/blob/b00210bd0320afef282b68c8ef7f8d972e9e19f5/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala#L4993
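For context, a minimal usage sketch of the proposed overloads follows. It assumes the functions are merged exactly as written in the diff quoted above; the DataFrame, the "score" column, and the local SparkSession setup are illustrative only and are not part of the PR.

// A minimal usage sketch, assuming the percentile, percentile_cont, and percentile_disc
// overloads land as shown in the diff above. The data and column name are made up.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object PercentileSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("percentile-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(1.0, 2.0, 3.0, 4.0, 5.0).toDF("score")

    df.agg(
      // Exact percentile at a single percentage.
      percentile($"score", lit(0.5)).as("p50"),
      // Continuous-distribution variant.
      percentile_cont($"score", lit(0.25)).as("p25_cont"),
      // Discrete-distribution variant; the Boolean `reverse` overload wraps the primitive
      // with lit(...) before passing it to Column.fn, per the diff above.
      percentile_disc($"score", lit(0.75), reverse = true).as("p75_disc_rev")
    ).show()

    spark.stop()
  }
}

The Boolean `reverse` parameter here follows the same primitive-argument pattern as the functions linked above: the primitive is wrapped with lit(...) on the client side rather than requiring callers to pass a Column.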