On Sat, Aug 10, 2013 at 8:59 AM, Ajo Fod <ajo....@gmail.com> wrote:

> If the data doesn't fit, you probably need a StorelessQuantile estimator
> like QuantileBin1D from the colt libraries. Then pick a resolution and do
> the single pass search.
>

Peripheral to the actual topic, but the Colt libraries are out of date in
almost every respect.  When we added unit tests, even the most basic
functions turned up dozens of serious bugs.  With respect to more advanced
estimation such as quantiles, nothing in Colt comes close to stream-lib.
Even the Mahout on-line estimators are generally superior to Colt's.

QuantileBin1D, in particular, lacks the machinery of QDigests (not
surprising since they were published in 2004, long after Colt went dormant).
Check out

https://github.com/clearspring/stream-lib/blob/master/src/main/java/com/clearspring/analytics/stream/quantile/QDigest.java

and the original paper

http://www.cs.virginia.edu/~son/cs851/papers/ucsb.sensys04.pdf
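
For anyone who wants to see the basic idea in code, here is a minimal sketch
of the "pick a resolution and do the single pass" approach from the quoted
message.  It is only an illustration under stated assumptions: the class name
FixedBinQuantile and its offer/quantile methods are made up for this example
and are not the API of Colt's QuantileBin1D or of stream-lib's QDigest.  It
also assumes the value range is known up front.

```java
/** Hypothetical illustration: fixed-resolution, single-pass quantile estimate. */
final class FixedBinQuantile {
    private final double min;     // lower edge of the first bin (assumed known)
    private final double width;   // width of each bin, i.e. the resolution
    private final long[] counts;  // one counter per bin; memory is O(bins)
    private long total;           // number of values seen so far

    FixedBinQuantile(double min, double max, int bins) {
        if (!(max > min) || bins <= 0) {
            throw new IllegalArgumentException("need max > min and bins > 0");
        }
        this.min = min;
        this.width = (max - min) / bins;
        this.counts = new long[bins];
    }

    /** Record one value; out-of-range values are clamped to the end bins. */
    void offer(double x) {
        if (Double.isNaN(x)) {
            return; // ignore NaN rather than count it in an arbitrary bin
        }
        int i = (int) Math.floor((x - min) / width);
        i = Math.max(0, Math.min(counts.length - 1, i));
        counts[i]++;
        total++;
    }

    /** Return the midpoint of the bin that holds the q-th quantile. */
    double quantile(double q) {
        if (total == 0) {
            throw new IllegalStateException("no data");
        }
        if (q < 0.0 || q > 1.0) {
            throw new IllegalArgumentException("q must be in [0, 1]");
        }
        // Rank of the element we are looking for, at least 1.
        long target = Math.max(1L, (long) Math.ceil(q * total));
        long seen = 0;
        for (int i = 0; i < counts.length; i++) {
            seen += counts[i];
            if (seen >= target) {
                return min + (i + 0.5) * width;
            }
        }
        return min + (counts.length - 0.5) * width; // not reached in practice
    }
}
```

For in-range data the answer is off by at most half a bin width, but the
resolution is fixed in advance no matter where the data actually falls.  As I
understand the paper, the point of a Q-Digest is that it merges sparsely
populated ranges into their parents, so the resolution follows the data
instead.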
