[GitHub] [arrow-datafusion] realno commented on pull request #1539: approx_quantile() aggregation function

GitBox Sun, 16 Jan 2022 10:45:08 -0800


realno commented on pull request #1539:
URL: https://github.com/apache/arrow-datafusion/pull/1539#issuecomment-1013930130



   > The T-Digest algorithm returns an interpolated result, so I think 
`percentile_cont` makes most sense - however T-digest is an approximation of 
the quantile. Should we include `approx` in the name to signify this? If so, 
would `approx_percentile_cont` be the desired name, keeping in line with the 
existing `approx_distinct` aggregate?
   
   This sounds reasonable. I suggest also checking whether anyone in the 
community is more familiar with the Postgres implementation. I remember reading 
somewhere that its quantile is more efficient than the median function, so I 
assume it uses something like KLL. If that's the case, it may not be a big deal 
whether we use `approx` in the name or not. 
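
   To make the naming question concrete, here is a minimal sketch of the difference between an interpolated (`percentile_cont`-style) and a discrete (`percentile_disc`-style) result. It is an exact computation over a sorted slice, not T-Digest or KLL, and the function names and signatures are hypothetical illustrations, not DataFusion or Postgres APIs.

```rust
/// Interpolated quantile over an already-sorted slice (hypothetical helper).
/// Returns None for empty input or q outside [0, 1].
fn percentile_cont(sorted: &[f64], q: f64) -> Option<f64> {
    if sorted.is_empty() || !(0.0..=1.0).contains(&q) {
        return None;
    }
    // Fractional rank in [0, n - 1]; interpolate linearly between neighbours.
    let rank = q * (sorted.len() - 1) as f64;
    let lo = rank.floor() as usize;
    let hi = rank.ceil() as usize;
    let frac = rank - lo as f64;
    Some(sorted[lo] + frac * (sorted[hi] - sorted[lo]))
}

/// Discrete quantile over an already-sorted slice (hypothetical helper).
/// Always returns a value that is actually present in the input.
fn percentile_disc(sorted: &[f64], q: f64) -> Option<f64> {
    if sorted.is_empty() || !(0.0..=1.0).contains(&q) {
        return None;
    }
    // First element whose cumulative position reaches q.
    let idx = ((q * sorted.len() as f64).ceil() as usize).saturating_sub(1);
    Some(sorted[idx])
}
```

   For the sorted input `[1.0, 2.0, 3.0, 4.0]` and `q = 0.5`, the interpolated version returns `2.5` while the discrete one returns `2.0`. An approximate sketch that interpolates between centroids behaves like the former, which is why the `_cont` suffix seems the closer fit.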
 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
