nealrichardson commented on a change in pull request #9435:
URL: https://github.com/apache/arrow/pull/9435#discussion_r572322195



##########
File path: cpp/src/arrow/compute/api_aggregate.h
##########
@@ -105,12 +105,15 @@ struct ARROW_EXPORT VarianceOptions : public FunctionOptions {
 /// By default, returns the median value.
 struct ARROW_EXPORT QuantileOptions : public FunctionOptions {
   /// Interpolation method to use when quantile lies between two data points
+  /// TDIGEST is useful to approximate quantiles from large volume inputs.
+  /// It has constant memory footprint, but lower accuracy.
   enum Interpolation {
     LINEAR = 0,
     LOWER,
     HIGHER,
     NEAREST,
     MIDPOINT,
+    TDIGEST,

Review comment:
   FWIW, that tdigest R function is not part of base R; it comes from a contributed package.
   
   R's quantile function also supports multiple methods, 9 in fact, via the `type` parameter: https://stat.ethz.ch/R-manual/R-devel/library/stats/html/quantile.html
   
   None of these are approximate in the same way as this, but it's a further argument that the method could be a function parameter rather than a separate function. TBH I don't know how much it matters; as a compute API consumer I can make either work. It's marginally easier if they're separate functions rather than having to mess with `FunctionOptions` to select them, or maybe there are downsides to that way?
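
   To make the two shapes concrete, here is a rough sketch of what a consumer would see in each case. Every name in it (`Method`, `QuantileOpts`, `ExactQuantile`, `ApproxQuantile`, `Quantile`) is hypothetical and not Arrow's API, and the approximate path is a crude strided subsample standing in for t-digest, not an implementation of it.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; none of these are Arrow's API.
enum class Method { LINEAR, LOWER, HIGHER, APPROX };

struct QuantileOpts {
  double q = 0.5;                  // defaults to the median
  Method method = Method::LINEAR;  // interpolation or approximation choice
};

// Exact quantile computed on a sorted copy of the input.
double ExactQuantile(std::vector<double> v, double q,
                     Method m = Method::LINEAR) {
  assert(!v.empty() && q >= 0.0 && q <= 1.0);
  std::sort(v.begin(), v.end());
  const double pos = q * static_cast<double>(v.size() - 1);
  const std::size_t lo = static_cast<std::size_t>(std::floor(pos));
  const std::size_t hi = static_cast<std::size_t>(std::ceil(pos));
  if (m == Method::LOWER) return v[lo];
  if (m == Method::HIGHER) return v[hi];
  // LINEAR: interpolate between the two neighbouring data points.
  return v[lo] + (pos - static_cast<double>(lo)) * (v[hi] - v[lo]);
}

// Shape B: a separate entry point for the approximate result.
// Placeholder only: takes every `stride`-th element and computes an exact
// quantile on that subsample. This is NOT t-digest.
double ApproxQuantile(const std::vector<double>& v, double q,
                      std::size_t stride = 4) {
  assert(!v.empty() && stride > 0);
  std::vector<double> sample;
  for (std::size_t i = 0; i < v.size(); i += stride) sample.push_back(v[i]);
  return ExactQuantile(std::move(sample), q);
}

// Shape A: a single entry point; the options struct selects the method.
double Quantile(const std::vector<double>& v, const QuantileOpts& opts = {}) {
  if (opts.method == Method::APPROX) return ApproxQuantile(v, opts.q);
  return ExactQuantile(v, opts.q, opts.method);
}
```

   With shape A the caller builds a `QuantileOpts`, sets `method`, and calls `Quantile(v, opts)`; with shape B the caller just calls `ApproxQuantile(v, q)`. Shape A keeps one function name, but every consumer has to construct the options object to opt in to the approximate path.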





