kevinyu98 commented on a change in pull request #28120: 
[SPARK-31349][SQL][DOCS] Sql ref buildin-aggregate
URL: https://github.com/apache/spark/pull/28120#discussion_r403766350
 
 

 ##########
 File path: docs/sql-ref-functions-builtin-aggregate.md
 ##########
 @@ -19,4 +19,626 @@ license: |
   limitations under the License.
 ---
 
-Aggregate functions
\ No newline at end of file
+Spark SQL provides built-in aggregate functions defined in the Dataset API and SQL interface. Aggregate functions
+operate on a group of rows and return a single value.
+
+Spark SQL aggregate functions are grouped as <code>agg_funcs</code> in Spark SQL. Below is the list of functions.
+
+**Note:** Every function below has another signature that takes a String column name instead of a Column.
+
+* Table of contents
+{:toc}
+<table class="table">
+  <thead>
+    <tr><th style="width:25%">Function</th><th>Parameters</th><th>Description</th></tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td> <b>{avg | mean}</b>(<i>e: Column</i>)</td>
+      <td>Column name</td>
+      <td> Returns the average of values in the input column.</td> 
+    </tr>
+    <tr>
+      <td> <b>{bool_and | every}</b>(<i>e: Column</i>)</td>
+      <td>Column name</td>
+      <td>Returns true if all values are true</td>
+    </tr>
+    <tr>
+      <td> <b>{any | some | bool_or}</b>(<i>e: Column</i>)</td>
+      <td>Column name</td>
+      <td>Returns true if at least one value is true</td>
+    </tr>
+    <tr>
+      <td> <b>approx_count_distinct</b>(<i>e: Column</i>)</td>
+      <td>Column name</td>
+      <td>Returns the estimated cardinality by HyperLogLog++</td>
+    </tr>
+    <tr>
+      <td> <b>corr</b>(<i>e1: Column, e2: Column</i>)</td>
+      <td>Column name</td>
+      <td>Returns the Pearson correlation coefficient between a set of number pairs</td>
+    </tr>
+    <tr>
+      <td> <b>count</b>(<i>*</i>)</td>
+      <td>None</td>
+      <td>Returns the total number of retrieved rows, including rows containing null</td>
+    </tr>
+    <tr>
+      <td> <b>count</b>(<i>e: Column[, e: Column]</i>)</td>
+      <td>Column name</td>
+      <td>Returns the number of rows for which the supplied column(s) are all not null</td>
+    </tr>
+    <tr>
+      <td> <b>count</b>(<b>DISTINCT</b> <i>e: Column[, e: Column]</i>)</td>
+      <td>Column name</td>
+      <td>Returns the number of rows for which the supplied column(s) are unique and not null</td>
+    </tr> 
+    <tr>
+      <td> <b>count_if</b>(<i>Predicate</i>)</td>
+      <td>Expression that will be used for aggregation calculation</td>
+      <td>Returns the number of rows for which the predicate evaluates to `TRUE`</td>
+    </tr> 
+    <tr>
+      <td> <b>covar_pop</b>(<i>e1: Column, e2: Column</i>)</td>
+      <td>Column name</td>
+      <td>Returns the population covariance of a set of number pairs</td>
+    </tr> 
+    <tr>
+      <td> <b>covar_samp</b>(<i>e1: Column, e2: Column</i>)</td>
+      <td>Column name</td>
+      <td>Returns the sample covariance of a set of number pairs</td>
+    </tr>  
+    <tr>
+      <td> <b>{first | first_value}</b>(<i>e: Column[, isIgnoreNull]</i>)</td>
+      <td>Column name[, True/False(default)]</td>
+      <td>Returns the first value of the column for a group of rows. If `isIgnoreNull` is true, returns only non-null values; the default is false. This function is non-deterministic</td>
 
 Review comment:
   done
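
For reference, the null handling described in the `count`, `count_if`, and `first` rows of the quoted table can be sketched in plain Python. This is not Spark code and not part of the pull request: `rows` and every helper name below are hypothetical, chosen only to mirror the descriptions in the diff.

```python
# Plain-Python sketch of the semantics described in the table rows above.
# NOT Spark code: `rows` and all helper names are hypothetical illustrations.
# None stands in for SQL NULL.

rows = [
    {"k": "a", "v": None},
    {"k": "a", "v": 1},
    {"k": "b", "v": 1},
    {"k": None, "v": 2},
]

def count_star(rows):
    # count(*): every row counts, including rows containing null.
    return len(rows)

def count_cols(rows, *cols):
    # count(e1[, e2]): rows where all supplied columns are not null.
    return sum(1 for r in rows if all(r[c] is not None for c in cols))

def count_distinct(rows, *cols):
    # count(DISTINCT e1[, e2]): unique non-null combinations of the columns.
    return len({
        tuple(r[c] for c in cols)
        for r in rows
        if all(r[c] is not None for c in cols)
    })

def count_if(rows, predicate):
    # count_if(predicate): rows where the predicate is exactly True
    # (a null/None result is not counted).
    return sum(1 for r in rows if predicate(r) is True)

def first(rows, col, ignore_nulls=False):
    # first(e[, isIgnoreNull]): first value seen; skip nulls only if asked.
    for r in rows:
        if not ignore_nulls or r[col] is not None:
            return r[col]
    return None

print(count_star(rows), count_cols(rows, "v"), count_cols(rows, "k", "v"))
print(count_distinct(rows, "v"), count_distinct(rows, "k", "v"))
print(count_if(rows, lambda r: r["v"] is not None and r["v"] > 1))
print(first(rows, "v"), first(rows, "v", ignore_nulls=True))
```

In this sketch the list order is fixed, so `first` is repeatable; the table marks the real function as non-deterministic, which this sketch does not model.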
