Thanks, Uri. I came across that and took a quick look; it seems interesting.

On a related note, it would be quite cool to have a port of Algebird
(or at least count-min, top-k and HLL, and perhaps a Bloom filter) to Python,
with monoid-style structures for use in PySpark...
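
To make "monoid-style" concrete, here is a rough sketch of what a mergeable count-min sketch might look like in plain Python. It is only an illustration under my own assumptions: the class name, the `sketch_partition` helper, the default depth/width and the md5-based hashing are all hypothetical, and none of this is Algebird's API or the code in dsq. The point is just that the structure has an identity element (an empty sketch) and an associative `+`, so per-partition sketches can be merged in any grouping.

```python
import hashlib
from functools import reduce


class CountMinSketch:
    """Toy count-min sketch with a monoid-style merge (hypothetical names)."""

    def __init__(self, depth=4, width=1000, table=None):
        self.depth = depth
        self.width = width
        # An all-zero table acts as the identity element for `+`.
        if table is None:
            table = [[0] * width for _ in range(depth)]
        self.table = table

    def _index(self, item, row):
        # Salt the hash with the row number to get one hash function per row.
        digest = hashlib.md5(f"{row}:{item}".encode("utf-8")).hexdigest()
        return int(digest, 16) % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count
        return self

    def estimate(self, item):
        # The minimum over rows never undercounts the true frequency.
        return min(self.table[row][self._index(item, row)]
                   for row in range(self.depth))

    def __add__(self, other):
        # Merging is element-wise addition, which is associative.
        if (self.depth, self.width) != (other.depth, other.width):
            raise ValueError("sketch dimensions must match")
        merged = [[a + b for a, b in zip(r1, r2)]
                  for r1, r2 in zip(self.table, other.table)]
        return CountMinSketch(self.depth, self.width, merged)


def sketch_partition(items, depth=4, width=1000):
    """Build one sketch from an iterable of items (e.g. one partition)."""
    cms = CountMinSketch(depth, width)
    for item in items:
        cms.add(item)
    return cms


# Simulate three partitions, one of them empty, and merge their sketches.
partitions = [["a", "b", "a"], ["a", "c"], []]
total = reduce(lambda x, y: x + y,
               (sketch_partition(p) for p in partitions),
               CountMinSketch())
```

In PySpark, the same pieces could presumably be wired up with something like `rdd.mapPartitions(lambda it: [sketch_partition(it)]).reduce(lambda x, y: x + y)`, though I have not tried that against a real cluster.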

On Sat, Feb 1, 2014 at 2:34 AM, Uri Laserson <[email protected]>
wrote:

> Hi everyone,
> I implemented a version of distributed streaming quantiles for PySpark.  It
> uses a count-min sketch approach.  You can find the code here:
> https://github.com/laserson/dsq
> Thought it might be of interest...
> Uri
> -- 
> Uri Laserson, PhD
> Data Scientist, Cloudera
> Twitter/GitHub: @laserson
> +1 617 910 0447
> [email protected]
