You might also want to have a look at Ted Dunning's t-digest: https://github.com/tdunning/t-digest
There is a paper with some theory here: https://github.com/tdunning/t-digest/blob/master/docs/theory/t-digest-paper/histo.pdf?raw=true
t-digests built on separate partitions of the data can also be merged, which makes them suitable for parallel implementations. -- Olivier
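
To make the merge property mentioned above concrete, here is a toy Python sketch of a mergeable quantile summary. It is not the t-digest algorithm and does not use the t-digest library: the real thing sizes its centroids with a scale function so that the tails stay accurate, whereas this sketch uses a uniform weight cap. The names `compress`, `build`, `merge` and `quantile` are made up for illustration.

```python
import random

def compress(centroids, max_centroids=100):
    """Greedily merge adjacent (mean, weight) pairs under a uniform weight cap.

    Toy rule only: real t-digests use a scale function instead of this cap.
    """
    centroids = sorted(centroids)
    total = sum(w for _, w in centroids)
    cap = total / max_centroids
    out = []
    for mean, w in centroids:
        if out and out[-1][1] + w <= cap:
            m0, w0 = out[-1]
            # Weighted average keeps the combined centroid's mean exact.
            out[-1] = ((m0 * w0 + mean * w) / (w0 + w), w0 + w)
        else:
            out.append((mean, w))
    return out

def build(values, max_centroids=100):
    """Summarise one partition; each value starts as a weight-1 centroid."""
    return compress([(v, 1.0) for v in values], max_centroids)

def merge(a, b, max_centroids=100):
    """Combine two summaries without revisiting the raw data."""
    return compress(a + b, max_centroids)

def quantile(centroids, q):
    """Return the mean of the centroid that contains rank q * total weight."""
    target = q * sum(w for _, w in centroids)
    seen = 0.0
    for mean, w in centroids:
        seen += w
        if seen >= target:
            return mean
    return centroids[-1][0]

# Usage: summarise four partitions independently, then merge the summaries.
rng = random.Random(0)
data = [rng.random() for _ in range(10_000)]
parts = [build(data[i::4]) for i in range(4)]
merged = parts[0]
for p in parts[1:]:
    merged = merge(merged, p)
median_estimate = quantile(merged, 0.5)
```

Each partition can be summarised on a different worker, and only the small centroid lists need to be shipped and combined, which is the property that makes this kind of structure attractive for parallel use.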