As soon as it is for sure done.  I have one more significant improvement to
make so that it works on sequential values.  I will hand the code to Suneel, who
will be packaging it for Mahout.  You can definitely have it at the same time.

I would love a review from you guys when I am ready.  The theory doc is nearly
at that point.  Would you like me to start there?  Also, can I get some info from
you about how Q-digests work in practice?

On Nov 13, 2013, at 20:46, Matt Abrams <[email protected]> wrote:

> Ted -
> 
> Any chance we can add your quantile estimator to stream-lib?
> 
> Matt
> 
> On Wed, Nov 13, 2013 at 5:38 AM, Ted Dunning <[email protected]> wrote:
>> I also have a new quantile estimator that dominates all other
>> implementations that I know of on speed and accuracy (10us per point added,
>> 8K data size to get a few ppm accuracy for high or low quantiles and about
>> 0.05% accuracy on middle quantiles like the median).
>> 
>> 
>> 
>> 
>> On Wed, Nov 13, 2013 at 8:53 AM, Dmitriy Ryaboy <[email protected]> wrote:
>> 
>>> Summingbird uses algebird. I think Stripe might also have a library, Avi
>>> Bryant was toying with this for a while.
>>> 
>>> Algebird has some nice features like not doing approximation at all for
>>> small sets (just use the real values), etc. We also recently did a bunch of
>>> work to make sure we can serialize all approximate structures so they can
>>> be correctly reused by different computations, sent across the wire, etc.
>>> 
>>> I don't recall doing speed comparisons and the like; it would be
>>> interesting to see them if you guys are choosing what library to use.
>>> 
>>> On Nov 13, 2013, at 12:33 AM, Ted Dunning <[email protected]> wrote:
>>> 
>>>> stream-lib is used quite widely and is generally high quality.
>>>> 
>>>> The other competitive library is Brick House from Klout.
>>>> http://engineering.klout.com/2013/01/introducing-brickhouse-major-open-source-release-from-klout/
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Tue, Nov 12, 2013 at 7:28 PM, Timothy Chen <[email protected]> wrote:
>>>> 
>>>>> Just saw this library today and thought it's something we can potentially
>>>>> leverage:
>>>>> 
>>>>> https://github.com/addthis/stream-lib
>>>>> 
>>>>> It has a number of algorithms for approximation over streams and has code for
>>>>> cardinality estimation (HyperLogLog) and others.
>>>>> 
>>>>> Looks like Twitter's SummingBird uses this library too.
>>>>> 
>>>>> Tim
>>> 
