bkietz commented on a change in pull request #9621:
URL: https://github.com/apache/arrow/pull/9621#discussion_r592436887



##########
File path: cpp/src/arrow/compute/api_aggregate.h
##########
@@ -306,5 +326,34 @@ Result<Datum> TDigest(const Datum& value,
                       const TDigestOptions& options = TDigestOptions::Defaults(),
                       ExecContext* ctx = NULLPTR);
 
+/// \brief Calculate multiple aggregations grouped on multiple keys
+///
+/// \param[in] aggregands datums to which aggregations will be applied
+/// \param[in] keys datums which will be used to group the aggregations
+/// \param[in] options GroupByOptions, encapsulating the names and options of aggregate
+///            functions to be applied and the field names for results in the output.
+/// \return a StructArray with len(aggregands) + len(keys) fields. The first
+///         len(aggregands) fields are the results of the aggregations for the group
+///         specified by keys in the final len(keys) fields.
+///
+/// For example:
+///   GroupByOptions options = {
+///     .aggregates = {
+///       {"sum", nullptr, "sum result"},
+///       {"mean", nullptr, "mean result"},
+///     },
+///     .key_names = {"str key", "date key"},
+///   };
+/// assert(*GroupBy({[2, 5, 8], [1.5, 2.0, 3.0]},
+///                 {["a", "b", "a"], [today, today, today]},
+///                 options).Equals([
+///   {"sum result": 10, "mean result": 2.25, "str key": "a", "date key": today},
+///   {"sum result": 5,  "mean result": 2.0,  "str key": "b", "date key": today},
+/// ]))

Review comment:
       Since the group id lists are temporary (except in the rare case where we need to partition batches for writing), we will compute and discard them on the fly rather than materializing an O(N) set of them.
   
   I'll be removing this compute function; as mentioned above, it's not necessary for group by to live in the function registry.
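
   To illustrate the first point, here is a minimal standard-library sketch of consuming group ids on the fly: each row's id is looked up, fed to the per-group accumulator, and then dropped, so no per-row list of ids is ever stored. This is not the Arrow implementation; `GroupState`, `GroupedResult`, and `GroupSumAndMean` are hypothetical names invented for the illustration, and it assumes a single string key and equal-length inputs.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical per-group accumulator (not an Arrow type).
struct GroupState {
  std::int64_t sum = 0;     // running "sum" aggregate
  double mean_total = 0.0;  // running total for the "mean" aggregate
  std::int64_t count = 0;   // number of rows seen for this group

  double Mean() const { return count == 0 ? 0.0 : mean_total / count; }
};

// Hypothetical result holder: states[i] belongs to keys[i].
struct GroupedResult {
  std::vector<std::string> keys;  // distinct keys in first-seen order
  std::vector<GroupState> states;
};

// Assumes sum_input, mean_input and keys all have the same length.
GroupedResult GroupSumAndMean(const std::vector<std::int64_t>& sum_input,
                              const std::vector<double>& mean_input,
                              const std::vector<std::string>& keys) {
  GroupedResult out;
  std::unordered_map<std::string, std::size_t> key_to_id;

  for (std::size_t row = 0; row < keys.size(); ++row) {
    // Look up (or assign) the group id for this row. The id lives only for
    // this iteration; storage grows with the number of groups, not rows.
    std::pair<std::unordered_map<std::string, std::size_t>::iterator, bool>
        inserted = key_to_id.emplace(keys[row], out.keys.size());
    const std::size_t group_id = inserted.first->second;
    if (inserted.second) {
      // First time this key is seen: create its accumulator.
      out.keys.push_back(keys[row]);
      out.states.emplace_back();
    }

    GroupState& state = out.states[group_id];
    state.sum += sum_input[row];
    state.mean_total += mean_input[row];
    state.count += 1;
  }
  return out;
}
```

   Called with the inputs from the doc comment's example, `GroupSumAndMean({2, 5, 8}, {1.5, 2.0, 3.0}, {"a", "b", "a"})`, the sketch yields sum 10 and mean 2.25 for group "a" and sum 5 and mean 2.0 for group "b".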



