Yibo Cai created ARROW-11727:
--------------------------------

             Summary: [C++][FlightRPC] Use TDigest to estimate latency 
quantiles in benchmark
                 Key: ARROW-11727
                 URL: https://issues.apache.org/jira/browse/ARROW-11727
             Project: Apache Arrow
          Issue Type: Improvement
          Components: FlightRPC
            Reporter: Yibo Cai
            Assignee: Yibo Cai


The Flight benchmark uses a Boost accumulator to estimate latency quantiles 
(0.5, 0.95, 0.99). Internally, Boost uses the P-Square algorithm [1]. P-Square 
is inaccurate for extreme quantiles such as 0.99, where TDigest performs well.

Test results show that the actual 0.99 latency is much lower than what the 
current code reports. We should switch to TDigest.

- Run flight-benchmark with default parameters.
- Calculate the 0.99 quantile of the latencies.
- Compare the exact value (computed by storing all data points) with the 
estimates from TDigest and from Boost.
{noformat}
Exact   TDigest   Boost-P2
86      93        2130
175     235       1526
151     165       1926
147     153       302
251     313       561
{noformat}
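
To make the comparison above concrete, here is a minimal sketch of how the "exact" column could be computed from stored samples and how far each estimate is from it. The helper names ExactQuantile and RelativeError are hypothetical and are not part of the Flight benchmark. The nearest-rank quantile definition is also an assumption, since the issue does not say which definition was used.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: exact q-quantile using the nearest-rank definition
// (an assumption; other definitions interpolate between neighbours).
// The samples are taken by value because std::nth_element reorders them.
double ExactQuantile(std::vector<double> samples, double q) {
  if (samples.empty()) return std::nan("");
  // Nearest rank: the smallest sample with at least q * N samples <= it.
  const std::size_t rank =
      static_cast<std::size_t>(std::ceil(q * static_cast<double>(samples.size())));
  const std::size_t index = rank == 0 ? 0 : rank - 1;
  std::nth_element(samples.begin(), samples.begin() + index, samples.end());
  return samples[index];
}

// Hypothetical helper: relative error of an estimate against the exact value.
double RelativeError(double exact, double estimate) {
  return std::abs(estimate - exact) / exact;
}
```

Applied to the first row of the table, RelativeError(86, 93) is roughly 0.08 for TDigest, while RelativeError(86, 2130) is roughly 23.8 for Boost P-Square.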

[1] https://www.cse.wustl.edu/~jain/papers/ftp/psqr.pdf



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
