jorisvandenbossche commented on code in PR #34759:
URL: https://github.com/apache/arrow/pull/34759#discussion_r1150583383


##########
python/pyarrow/table.pxi:
##########
@@ -5515,6 +5515,9 @@ list[tuple(str, str, FunctionOptions)]
             column names, for unary, nullary and n-ary aggregation functions
             respectively.
 
+            For the list of function names and respective aggregation
+            function options see: :ref:`py-grouped-aggrs`.

Review Comment:
   ```suggestion
               function options see :ref:`py-grouped-aggrs`.
   ```



##########
python/pyarrow/table.pxi:
##########
@@ -5527,20 +5530,58 @@ list[tuple(str, str, FunctionOptions)]
         ...       pa.array(["a", "a", "b", "b", "c"]),
         ...       pa.array([1, 2, 3, 4, 5]),
         ... ], names=["keys", "values"])
+
+        Sum the column "values" over the grouped column "keys":
+
         >>> t.group_by("keys").aggregate([("values", "sum")])
         pyarrow.Table
         values_sum: int64
         keys: string
         ----
         values_sum: [[3,7,5]]
         keys: [["a","b","c"]]
+
+        Count the rows over the grouped column "keys":
+
         >>> t.group_by("keys").aggregate([([], "count_all")])
         pyarrow.Table
         count_all: int64
         keys: string
         ----
         count_all: [[2,2,1]]
         keys: [["a","b","c"]]
+
+        Do multiple aggregations:
+
+        >>> t.group_by("keys").aggregate([
+        ...    ("values", "sum"),
+        ...    ("keys", "count")
+        ... ])
+        pyarrow.Table
+        values_sum: int64
+        keys_count: int64
+        keys: string
+        ----
+        values_sum: [[3,7,5]]
+        keys_count: [[2,2,1]]
+        keys: [["a","b","c"]]
+
+        Count the number of non-null values for column "values"
+        over the grouped column "keys":
+
+        >>> import pyarrow.compute as pc
+        >>> t.group_by(["keys"]).aggregate([
+        ...    ("values", "count", pc.CountOptions(mode="all"))

Review Comment:
   ```suggestion
           ...    ("values", "count", pc.CountOptions(mode="only_valid"))
   ```
   
   If you want the "number of non-null values" as mentioned above, you need
   `mode="only_valid"` (which is actually the default, but I think it is OK to
   show it explicitly).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
