maropu commented on a change in pull request #28120: [SPARK-31349][SQL][DOCS] Document built-in aggregate functions in SQL Reference
URL: https://github.com/apache/spark/pull/28120#discussion_r404089466
 
 

 ##########
 File path: docs/sql-ref-functions-builtin-aggregate.md
 ##########
 @@ -19,4 +19,628 @@ license: |
   limitations under the License.
 ---
 
-Aggregate functions
\ No newline at end of file
+Spark SQL provides built-in aggregate functions available in both the Dataset API and the SQL interface. Aggregate functions
+operate on a group of rows and return a single value.
+
+Aggregate functions are grouped under <code>agg_funcs</code> in Spark SQL. Below is the list of these functions.
+
+**Note:** Each function below also has a signature that takes a String column name instead of a Column.
+
+* Table of contents
+{:toc}
+<table class="table">
+  <thead>
+    <tr><th 
style="width:25%">Function</th><th>Parameters</th><th>Description</th></tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td> <b>{any | some | bool_or}</b>(<i>c: Column</i>)</td>
+      <td>Column name</td>
+      <td>Returns true if at least one value is true.</td>
+    </tr>
+    <tr>
+      <td> <b>approx_count_distinct</b>(<i>c: Column[, relativeSD: Double]</i>)</td>
+      <td>Column name; relativeSD: the maximum estimation error allowed.</td>
+      <td>Returns the estimated cardinality computed by HyperLogLog++.</td>
+    </tr>   
+    <tr>
+      <td> <b>{avg | mean}</b>(<i>c: Column</i>)</td>
+      <td>Column name</td>
+      <td>Returns the average of values in the input column.</td>
+    </tr>
+    <tr>
+      <td> <b>{bool_and | every}</b>(<i>c: Column</i>)</td>
+      <td>Column name</td>
+      <td>Returns true if all values are true.</td>
+    </tr>
+    <tr>
+      <td> <b>collect_list</b>(<i>c: Column</i>)</td>
+      <td>Column name</td>
+      <td>Collects and returns a list of non-unique elements. The function is non-deterministic because the order of collected results depends on the order of the rows, which may be non-deterministic after a shuffle.</td>
+    </tr>       
+    <tr>
+      <td> <b>collect_set</b>(<i>c: Column</i>)</td>
+      <td>Column name</td>
+      <td>Collects and returns a set of unique elements. The function is non-deterministic because the order of collected results depends on the order of the rows, which may be non-deterministic after a shuffle.</td>
+    </tr>
+    <tr>
+      <td> <b>corr</b>(<i>c1: Column, c2: Column</i>)</td>
+      <td>Column names</td>
+      <td>Returns the Pearson correlation coefficient between a set of number pairs.</td>
+    </tr>
+    <tr>
+      <td> <b>count</b>(<i>*</i>)</td>
+      <td>None</td>
+      <td>Returns the total number of retrieved rows, including rows containing null.</td>
+    </tr>
+    <tr>
+      <td> <b>count</b>(<i>c: Column[, c: Column]</i>)</td>
+      <td>Column name(s)</td>
+      <td>Returns the number of rows for which the supplied column(s) are all non-null.</td>
+    </tr>
+    <tr>
+      <td> <b>count</b>(<b>DISTINCT</b> <i>c: Column[, c: Column]</i>)</td>
+      <td>Column name(s)</td>
+      <td>Returns the number of rows for which the supplied column(s) are unique and non-null.</td>
+    </tr> 
+    <tr>
+      <td> <b>count_if</b>(<i>Predicate</i>)</td>
+      <td>Predicate expression evaluated for each row</td>
+      <td>Returns the number of rows for which the predicate evaluates to <code>TRUE</code>.</td>
 
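As a quick illustration of the aggregates listed in the hunk above, here is a minimal Spark SQL sketch; the `employees` table and its columns are hypothetical and not part of the patch:

```sql
-- Hypothetical table: employees(dept STRING, salary DOUBLE, employee_id BIGINT,
--                                job_title STRING, is_active BOOLEAN)
SELECT
  dept,
  avg(salary)                        AS avg_salary,        -- average per group
  approx_count_distinct(employee_id) AS approx_headcount,  -- HyperLogLog++ estimate
  collect_set(job_title)             AS distinct_titles,   -- set of unique values
  bool_and(is_active)                AS all_active         -- true only if every value is true
FROM employees
GROUP BY dept;
```
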
 Review comment:
   It seems `<code>TRUE</code>` is not used anywhere in the existing docs, so `<code>TRUE</code>` -> \`TRUE\`?
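
To make the wording concrete: `count_if` counts only the rows whose predicate evaluates to `TRUE`; a minimal sketch with hypothetical table and column names:

```sql
-- Rows where the predicate is FALSE or NULL are not counted.
SELECT count_if(salary > 10000) AS high_earners
FROM employees;
```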
