gitmodimo opened a new pull request, #47553:
URL: https://github.com/apache/arrow/pull/47553

   ### Rationale for this change
   The t-digest algorithm supports merging multiple sets of centroids, which 
allows an efficient map-reduce implementation. This PR lets the user perform the 
reduce step outside of the aggregator, which in turn enables incremental 
t-digest calculation.
   It also adds an option to select a different scaler function.
   
   ### What changes are included in this PR?
   This change introduces 3 new aggregate functions:
   
   `tdigest_map` - functionally the same as `tdigest`, but instead of quantiles 
it outputs a centroids_vector: a fixed_size_list of length delta (with nullable 
values) in which each element is a struct{double mean; double weight}
   `tdigest_reduce` - takes a vector of centroids_vectors and outputs the 
merged centroids_vector
   `tdigest_quantile` - takes a centroids_vector and calculates quantiles the 
same way `tdigest` does
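
The division of labor between the three functions can be illustrated with a small pure-Python sketch. It is not the Arrow API and not the implementation in this PR: the function names only mirror the ones above, centroids are plain `(mean, weight)` tuples instead of a fixed_size_list of structs padded with nulls, and the compression step uses simple uniform bucketing as a stand-in for the scaler functions. Every helper here (`compress`, the `delta` default) is an assumption made for illustration.

```python
from typing import Iterable, List, Tuple

Centroid = Tuple[float, float]  # (mean, weight); weights are assumed positive


def compress(centroids: Iterable[Centroid], delta: int) -> List[Centroid]:
    """Merge centroids into at most `delta` centroids.

    Simplification: uniform bucketing by cumulative weight, not a real
    t-digest scale function.
    """
    ordered = sorted(centroids)
    total = sum(w for _, w in ordered)
    if total == 0:
        return []
    merged: List[List[float]] = []  # each entry is [weighted_sum, weight]
    seen = 0.0
    last_bucket = -1
    for mean, weight in ordered:
        # Bucket index of the centroid's midpoint in cumulative weight.
        bucket = min(delta - 1, int((seen + weight / 2) / total * delta))
        if bucket != last_bucket:
            merged.append([0.0, 0.0])
            last_bucket = bucket
        merged[-1][0] += mean * weight
        merged[-1][1] += weight
        seen += weight
    return [(s / w, w) for s, w in merged]


def tdigest_map(values: Iterable[float], delta: int = 100) -> List[Centroid]:
    """Map step: summarize raw values as a centroid list."""
    return compress([(float(v), 1.0) for v in values], delta)


def tdigest_reduce(digests: Iterable[List[Centroid]], delta: int = 100) -> List[Centroid]:
    """Reduce step: merge several centroid lists into one."""
    return compress([c for d in digests for c in d], delta)


def tdigest_quantile(centroids: List[Centroid], q: float) -> float:
    """Estimate quantile `q` by interpolating between centroid means."""
    if not centroids:
        raise ValueError("empty digest")
    total = sum(w for _, w in centroids)
    target = q * total
    # Place each centroid at the midpoint of its cumulative weight range.
    positions = []
    seen = 0.0
    for _, w in centroids:
        positions.append(seen + w / 2)
        seen += w
    if target <= positions[0]:
        return centroids[0][0]
    for i in range(1, len(centroids)):
        if target <= positions[i]:
            frac = (target - positions[i - 1]) / (positions[i] - positions[i - 1])
            return centroids[i - 1][0] + frac * (centroids[i][0] - centroids[i - 1][0])
    return centroids[-1][0]


# Two chunks are summarized independently, then merged and queried.
chunk_a = tdigest_map(range(0, 500))
chunk_b = tdigest_map(range(500, 1000))
merged = tdigest_reduce([chunk_a, chunk_b])
median = tdigest_quantile(merged, 0.5)
```

Because the merged result is itself a centroid list, it can be stored and passed to the reduce step again when a new chunk arrives, which is the incremental use case described in the rationale.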
   
   ### Are these changes tested?
   Yes
   ### Are there any user-facing changes?
   No
   


