[spark] branch branch-3.4 updated: [SPARK-42468][CONNECT][FOLLOW-UP] Add .agg variants in Dataset

hvanhovell Wed, 22 Feb 2023 12:55:31 -0800

This is an automated email from the ASF dual-hosted git repository.

hvanhovell pushed a commit to branch branch-3.4
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/branch-3.4 by this push:
     new 6998f973d10 [SPARK-42468][CONNECT][FOLLOW-UP] Add .agg variants in Dataset
6998f973d10 is described below

commit 6998f973d10985a92a5e1b9119d8f20a935cb33f
Author: Rui Wang <rui.w...@databricks.com>
AuthorDate: Wed Feb 22 16:54:59 2023 -0400

    [SPARK-42468][CONNECT][FOLLOW-UP] Add .agg variants in Dataset
    
    ### What changes were proposed in this pull request?
    
    Add `.agg` in Dataset in Scala client.
    
    ### Why are the changes needed?
    
    API coverage.
    
    ### Does this PR introduce _any_ user-facing change?
    
    NO
    
    ### How was this patch tested?
    
    Existing UT
    
    Closes #40125 from amaliujia/rw_add_agg_dataset.
    
    Authored-by: Rui Wang <rui.w...@databricks.com>
    Signed-off-by: Herman van Hovell <her...@databricks.com>
    (cherry picked from commit 1232309e44d8ed65528c2b29ee4087e4173a3e06)
    Signed-off-by: Herman van Hovell <her...@databricks.com>
---
 .../main/scala/org/apache/spark/sql/Dataset.scala  | 55 ++++++++++++++++++++++
 1 file changed, 55 insertions(+)

diff --git a/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Dataset.scala b/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Dataset.scala
index 33125e5fd87..c7ded04a963 100644
--- a/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Dataset.scala
+++ b/connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/Dataset.scala
@@ -1058,6 +1058,61 @@ class Dataset[T] private[sql] (val session: SparkSession, private[sql] val plan:
     new RelationalGroupedDataset(toDF(), cols.map(_.expr))
   }
 
+  /**
+   * (Scala-specific) Aggregates on the entire Dataset without groups.
+   * {{{
+   *   // ds.agg(...) is a shorthand for ds.groupBy().agg(...)
+   *   ds.agg("age" -> "max", "salary" -> "avg")
+   *   ds.groupBy().agg("age" -> "max", "salary" -> "avg")
+   * }}}
+   *
+   * @group untypedrel
+   * @since 3.4.0
+   */
+  def agg(aggExpr: (String, String), aggExprs: (String, String)*): DataFrame = {
+    groupBy().agg(aggExpr, aggExprs: _*)
+  }
+
+  /**
+   * (Scala-specific) Aggregates on the entire Dataset without groups.
+   * {{{
+   *   // ds.agg(...) is a shorthand for ds.groupBy().agg(...)
+   *   ds.agg(Map("age" -> "max", "salary" -> "avg"))
+   *   ds.groupBy().agg(Map("age" -> "max", "salary" -> "avg"))
+   * }}}
+   *
+   * @group untypedrel
+   * @since 3.4.0
+   */
+  def agg(exprs: Map[String, String]): DataFrame = groupBy().agg(exprs)
+
+  /**
+   * (Java-specific) Aggregates on the entire Dataset without groups.
+   * {{{
+   *   // ds.agg(...) is a shorthand for ds.groupBy().agg(...)
+   *   ds.agg(Map("age" -> "max", "salary" -> "avg"))
+   *   ds.groupBy().agg(Map("age" -> "max", "salary" -> "avg"))
+   * }}}
+   *
+   * @group untypedrel
+   * @since 3.4.0
+   */
+  def agg(exprs: java.util.Map[String, String]): DataFrame = groupBy().agg(exprs)
+
+  /**
+   * Aggregates on the entire Dataset without groups.
+   * {{{
+   *   // ds.agg(...) is a shorthand for ds.groupBy().agg(...)
+   *   ds.agg(max($"age"), avg($"salary"))
+   *   ds.groupBy().agg(max($"age"), avg($"salary"))
+   * }}}
+   *
+   * @group untypedrel
+   * @since 3.4.0
+   */
+  @scala.annotation.varargs
+  def agg(expr: Column, exprs: Column*): DataFrame = groupBy().agg(expr, exprs: _*)
+
   /**
    * Unpivot a DataFrame from wide format to long format, optionally leaving identifier columns
    * set. This is the reverse to `groupBy(...).pivot(...).agg(...)`, except for the aggregation,
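
For readers of this commit, the following is a minimal sketch of how the added `.agg` variants relate to `groupBy().agg(...)`. It is not part of the patch. It assumes the classic `spark-sql` artifact is on the classpath and uses a local `SparkSession`, since the Connect client is meant to mirror that API; the object name `AggVariantsSketch`, the helper `firstRow`, and the sample data are hypothetical and chosen only for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{avg, max}

object AggVariantsSketch {
  // Hypothetical helper: a global aggregation yields one row; return its values.
  def firstRow(df: DataFrame): Seq[Any] = df.head().toSeq

  def main(args: Array[String]): Unit = {
    // Assumption: a local classic session stands in for a Spark Connect session.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("agg-variants-sketch")
      .getOrCreate()
    import spark.implicits._

    // Illustrative sample data, not taken from the commit.
    val ds = Seq(("alice", 30, 100.0), ("bob", 40, 200.0))
      .toDF("name", "age", "salary")

    // The Scala-facing variants named in the diff.
    val viaPairs = ds.agg("age" -> "max", "salary" -> "avg")
    val viaMap = ds.agg(Map("age" -> "max", "salary" -> "avg"))
    val viaColumns = ds.agg(max($"age"), avg($"salary"))

    // The long form that each variant forwards to.
    val viaGroupBy = ds.groupBy().agg(max($"age"), avg($"salary"))

    // All forms should produce the same single result row.
    val expected = firstRow(viaGroupBy)
    assert(firstRow(viaPairs) == expected)
    assert(firstRow(viaMap) == expected)
    assert(firstRow(viaColumns) == expected)

    spark.stop()
  }
}
```

The `java.util.Map` overload is left out of the sketch; per the diff it forwards to `groupBy().agg(exprs)` in the same way as the Scala `Map` variant.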


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
