[GitHub] [arrow] ianmcook commented on a change in pull request #10887: ARROW-13311: [C++][Documentation] Document hash aggregate kernels

GitBox Tue, 17 Aug 2021 13:48:34 -0700


ianmcook commented on a change in pull request #10887:
URL: https://github.com/apache/arrow/pull/10887#discussion_r690711344




##########
File path: docs/source/cpp/compute.rst
##########
@@ -230,10 +234,64 @@ Notes:
   Note that the output can have less than *N* elements if the input has
   less than *N* distinct values.
 
+  The mode kernel is not a proper aggregate (it is actually a vector
+  function, see below).
+
 * \(5) Output is Int64, UInt64 or Float64, depending on the input type.
 
 * \(6) Output is Float64 or input type, depending on QuantileOptions.
 
+  The quantile kernel is not a proper aggregate (it is actually a vector
+  function, see below).
+
+* \(6) tdigest/t-digest computes approximate quantiles, and so only needs a
+  fixed amount of memory. See the `reference implementation
+  <https://github.com/tdunning/t-digest>`_ for details.
+
+Hash Aggregations ("group by")

Review comment:
       👍 I like "grouped aggregations"
   
   I think it's also worth briefly explaining what "hash" means here, so that users understand why these function names all begin with `hash_` (as noted in my other comment below). I can imagine a confused user thinking this is a list of cryptographic hash functions.
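
   To illustrate the sense of "hash" meant here, a minimal sketch follows. It assumes that "hash" refers to grouping rows through a hash table keyed on the group-by values, which is the usual meaning of a hash aggregation. `GroupedSum` is a hypothetical helper written only for this comment; it is not an Arrow function and does not reflect how the Arrow kernels are implemented.

   ```cpp
   #include <cstddef>
   #include <cstdint>
   #include <string>
   #include <unordered_map>
   #include <vector>

   // Hypothetical helper, not part of the Arrow API: sums `values` per
   // distinct entry in `keys`, using a hash table to map each key to its
   // running total.
   std::unordered_map<std::string, std::int64_t> GroupedSum(
       const std::vector<std::string>& keys,
       const std::vector<std::int64_t>& values) {
     std::unordered_map<std::string, std::int64_t> sums;
     for (std::size_t i = 0; i < keys.size() && i < values.size(); ++i) {
       // The key is hashed to locate its group; operator[] creates the
       // group with a zero total the first time a key is seen.
       sums[keys[i]] += values[i];
     }
     return sums;
   }
   ```

   Nothing cryptographic is involved: the hash table is only the lookup structure that assigns each row to its group, and the result holds one aggregated value per distinct key.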




