cyb70289 commented on pull request #9435:
URL: https://github.com/apache/arrow/pull/9435#issuecomment-783839807


   > Also, it seems there's a performance problem here where TDigest is much 
slower than Quantile:
   
   Thanks for the careful review. As you found, the bottleneck of TDigest is 
sorting. I once talked with the author about the possibility of improving 
TDigest performance by removing the sort: 
https://github.com/tdunning/t-digest/issues/153. There are some ideas to 
mitigate the penalty of sorting. I also plan to evaluate faster sorting 
algorithms such as radix sort for floating-point numbers (folly has such an 
implementation).
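
   For reference, the usual trick behind radix-sorting floats is to map each 
value to an unsigned key whose integer ordering matches the float ordering, 
then run an ordinary LSD radix sort on the keys. A minimal sketch of that 
idea (my own illustration, not the folly implementation, and ignoring NaNs):

```cpp
#include <array>
#include <cstdint>
#include <cstring>
#include <vector>

// Map an IEEE-754 float to a uint32_t whose unsigned order matches the float
// order (NaNs excluded): negative values have all bits flipped, non-negative
// values only have the sign bit flipped.
static uint32_t RadixKey(float f) {
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));
  return (bits & 0x80000000u) ? ~bits : (bits | 0x80000000u);
}

// Inverse of RadixKey, to recover the sorted floats.
static float KeyToFloat(uint32_t key) {
  uint32_t bits = (key & 0x80000000u) ? (key & 0x7fffffffu) : ~key;
  float f;
  std::memcpy(&f, &bits, sizeof(f));
  return f;
}

// Minimal LSD radix sort: four stable counting-sort passes of 8 bits each.
static void RadixSortFloats(std::vector<float>& values) {
  std::vector<uint32_t> keys(values.size()), tmp(values.size());
  for (size_t i = 0; i < values.size(); ++i) keys[i] = RadixKey(values[i]);
  for (int shift = 0; shift < 32; shift += 8) {
    std::array<size_t, 257> counts{};
    for (uint32_t k : keys) ++counts[((k >> shift) & 0xff) + 1];
    for (int b = 0; b < 256; ++b) counts[b + 1] += counts[b];  // prefix sums
    for (uint32_t k : keys) tmp[counts[(k >> shift) & 0xff]++] = k;
    keys.swap(tmp);
  }
  for (size_t i = 0; i < values.size(); ++i) values[i] = KeyToFloat(keys[i]);
}
```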
   
   The `Quantile` kernel is expensive when calculating quantiles (it calls 
`std::nth_element`), so centils() performs the worst, simply because it 
calculates more quantiles.
   The `TDigest` kernel is cheap when calculating quantiles, so its 
performance is almost constant regardless of the number of quantiles to 
calculate. But TDigest is expensive when adding data points 
(`TDigestMerger::Add`): when the buffer is full, it sorts the input buffer 
(via `std::sort`), which is much slower than `std::nth_element`.
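
   A simplified sketch of the two cost profiles (the names and the buffer 
capacity below are illustrative, not the actual kernel code):

```cpp
#include <algorithm>
#include <vector>

// Quantile-style cost: one std::nth_element (average O(n)) on a mutable copy
// of the input, repeated for each requested quantile.
double QuantileNthElement(std::vector<double> values, double q) {
  size_t k = static_cast<size_t>(q * (values.size() - 1));
  std::nth_element(values.begin(), values.begin() + k, values.end());
  return values[k];  // "lower" interpolation, just to keep the sketch short
}

// TDigest-style cost: points accumulate in a fixed-size buffer; whenever the
// buffer fills up it is fully sorted (O(b log b)) before being merged into
// the centroids, and that sort dominates the running time.
struct BufferedDigestSketch {
  std::vector<double> buffer;
  size_t capacity = 500;  // arbitrary buffer size for illustration

  void Add(double v) {
    buffer.push_back(v);
    if (buffer.size() >= capacity) Flush();
  }

  void Flush() {
    std::sort(buffer.begin(), buffer.end());  // the hot spot mentioned above
    // ... merge the sorted buffer into the centroids (omitted) ...
    buffer.clear();
  }
};
```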
   
   I think `Quantile` and `TDigest` serve different purposes. You would use 
`TDigest` when you have a large volume of input (or streaming data) and 
copying all of it would lead to a huge memory footprint. Another benefit of 
`TDigest` is that chunks can be processed in parallel and then merged into 
one t-digest, though the current aggregate kernel framework doesn't support 
that.
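
   Just to make the merge pattern concrete, here is a sketch of parallel 
per-chunk digests merged into one result. `Digest` is a stand-in type, not 
Arrow's t-digest implementation (a real t-digest keeps a bounded set of 
centroids instead of all values):

```cpp
#include <future>
#include <vector>

// Stand-in digest exposing the Add/Merge interface a t-digest provides.
struct Digest {
  std::vector<double> values;  // a real digest stores centroids, not values
  void Add(double v) { values.push_back(v); }
  void Merge(const Digest& other) {
    values.insert(values.end(), other.values.begin(), other.values.end());
  }
};

// Build one digest per chunk in parallel, then merge them into a single
// digest that summarizes the whole input.
Digest DigestChunksInParallel(const std::vector<std::vector<double>>& chunks) {
  std::vector<std::future<Digest>> futures;
  for (const auto& chunk : chunks) {
    futures.push_back(std::async(std::launch::async, [&chunk] {
      Digest d;
      for (double v : chunk) d.Add(v);
      return d;
    }));
  }
  Digest result;
  for (auto& f : futures) result.Merge(f.get());
  return result;
}
```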

