realno commented on pull request #1539: URL: https://github.com/apache/arrow-datafusion/pull/1539#issuecomment-1013930130
> The T-Digest algorithm returns an interpolated result, so I think `percentile_cont` makes most sense - however T-digest is an approximation of the quantile. Should we include `approx` in the name to signify this? If so, would `approx_percentile_cont` be the desired name, keeping in line with the existing `approx_distinct` aggregate?

This sounds reasonable. I also suggest checking whether anyone in the community is more familiar with the Postgres implementation. I remember reading somewhere that its quantile function is more efficient than its median function, so I assume it uses something like KLL. If that is the case, it may not matter much whether we include `approx` in the name.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
