Summingbird uses Algebird. I think Stripe might also have a library; Avi Bryant 
was toying with this for a while. 

Algebird has some nice features, like skipping approximation entirely for small 
sets (it just uses the real values). We also recently did a bunch of work to 
make sure all the approximate structures can be serialized, so they can be 
correctly reused by different computations, sent across the wire, etc. 
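
To make the small-set point concrete, here is a rough Python sketch of the idea. 
It is not Algebird's code or API: the class name, the threshold k, the JSON wire 
format, and the K-minimum-values estimator are all my own illustrative choices 
(Algebird's HyperLogLog works differently). It stays exact while the set is 
small, switches to an estimate past k distinct items, and can be merged and 
serialized in either mode.

```python
import hashlib
import heapq
import json


class HybridDistinctCounter:
    """Hypothetical sketch: exact distinct count up to k items, then a
    K-minimum-values (KMV) estimate. Not any library's actual implementation."""

    def __init__(self, k=512):
        if k < 2:
            raise ValueError("k must be at least 2")
        self.k = k
        self.hashes = set()        # every hash (exact) or the k smallest (approximate)
        self.approximate = False   # flips to True once more than k hashes were seen

    @staticmethod
    def _hash(item):
        # 64-bit hash of the item's string form (so 1 and "1" are the same item).
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def _shrink(self):
        # Keep only the k smallest hashes; from here on we only estimate.
        self.hashes = set(heapq.nsmallest(self.k, self.hashes))
        self.approximate = True

    def add(self, item):
        h = self._hash(item)
        if self.approximate and h >= max(self.hashes):
            return  # not among the k smallest hashes, so it cannot matter
        self.hashes.add(h)
        if len(self.hashes) > self.k:
            self._shrink()

    def estimate(self):
        if not self.approximate:
            # Small set: exact count (barring 64-bit hash collisions).
            return len(self.hashes)
        # KMV estimator: (k - 1) / (k-th smallest hash scaled into [0, 1)).
        return (self.k - 1) * 2**64 / max(self.hashes)

    def merge(self, other):
        # Merging is only valid for counters built with the same k.
        if self.k != other.k:
            raise ValueError("cannot merge counters with different k")
        out = HybridDistinctCounter(self.k)
        out.hashes = self.hashes | other.hashes
        out.approximate = self.approximate or other.approximate
        if out.approximate or len(out.hashes) > out.k:
            out._shrink()
        return out

    def to_json(self):
        # Serialize the full state so another computation can pick it up.
        return json.dumps({"k": self.k,
                           "approximate": self.approximate,
                           "hashes": sorted(self.hashes)})

    @classmethod
    def from_json(cls, payload):
        state = json.loads(payload)
        out = cls(state["k"])
        out.hashes = set(state["hashes"])
        out.approximate = state["approximate"]
        return out
```

Because the serialized form carries the mode flag along with the hashes, a 
counter that is still exact stays exact after a round trip, and two counters 
built on different machines can be merged as long as they share the same k.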

I don't recall doing speed comparisons and the like, but it would be interesting 
to see them if you guys are choosing which library to use. 

On Nov 13, 2013, at 12:33 AM, Ted Dunning <[email protected]> wrote:

> stream-lib is used quite widely and is generally high quality.
> 
> The other competitive library is Brick House from Klout.
> 
> http://engineering.klout.com/2013/01/introducing-brickhouse-major-open-source-release-from-klout/
> 
> On Tue, Nov 12, 2013 at 7:28 PM, Timothy Chen <[email protected]> wrote:
> 
>> Just saw this library today and thought it's something we can potentially
>> leverage:
>> 
>> https://github.com/addthis/stream-lib
>> 
>> It has a number of algorithms for approximate stream processing, including
>> code for cardinality estimation (HyperLogLog) and others.
>> 
>> Looks like Twitter's SummingBird uses this library too.
>> 
>> Tim
>> 
