Hi,

We use it in SPM (performance monitoring).  We've contributed a patch
to lower the memory footprint for QDigest.  The lib works well! :)

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Sat, Nov 16, 2013 at 12:11 AM, Eugene Kirpichov <[email protected]> wrote:
> Hi,
>
> Actually no, I have not used it in production, I wrote it for fun :)
> But I know that it's used in TempoDB. You might want to contact them and
> ask -
> http://blog.tempo-db.com/post/42318820124/estimating-percentiles-on-streams-of-data
>
> On Thu Nov 14 2013 at 6:17:15 AM, Matt Abrams <[email protected]> wrote:
>
>> Great!  I have not used Q digest in production yet but I believe
>> Eugene, the author of stream-lib's Q digest implementation, has.
>> Eugene, can you comment on how it performs in practice?
>>
>> Matt
>>
>>
>> On Wed, Nov 13, 2013 at 4:45 PM, Ted Dunning <[email protected]>
>> wrote:
>> >
>> >
>> > As soon as it is for sure done.  I have one more significant improvement
>> > to make so that it works on sequential values.  I will hand the code to
>> > Suneel, who will be packaging it for Mahout. You can def have it at the
>> > same time.
>> >
>> > I would love a review from you guys when I am ready. The theory doc is
>> > nearly to that point.  Would you like me to start there?  Also, can I get
>> > some info from you about how Q-digests work in practice?
>> >
>> > Sent from my iPhone
>> >
>> > On Nov 13, 2013, at 20:46, Matt Abrams <[email protected]> wrote:
>> >
>> >> Ted -
>> >>
>> >> Any chance we can add your quantile estimator to stream-lib?
>> >>
>> >> Matt
>> >>
>> >> On Wed, Nov 13, 2013 at 5:38 AM, Ted Dunning <[email protected]> wrote:
>> >>> I also have a new quantile estimator that dominates all other
>> >>> implementations that I know of on speed and accuracy (10us per point
>> >>> added, 8K data size to get a few ppm accuracy for high or low quantiles
>> >>> and about 0.05% accuracy on middle quantiles like the median).
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> On Wed, Nov 13, 2013 at 8:53 AM, Dmitriy Ryaboy <[email protected]> wrote:
>> >>>
>> >>>> Summingbird uses Algebird. I think Stripe might also have a library;
>> >>>> Avi Bryant was toying with this for a while.
>> >>>>
>> >>>> Algebird has some nice features like not doing approximation at all
>> >>>> for small sets (just use the real values), etc. We also recently did a
>> >>>> bunch of work to make sure we can serialize all approximate structures
>> >>>> so they can be correctly reused by different computations, sent across
>> >>>> the wire, etc.
>> >>>>
>> >>>> I don't recall doing speed comparisons and the like, it would be
>> >>>> interesting to see them if you guys are choosing what library to use.
>> >>>>
>> >>>> On Nov 13, 2013, at 12:33 AM, Ted Dunning <[email protected]> wrote:
>> >>>>
>> >>>>> stream-lib is used quite widely and is generally high quality.
>> >>>>>
>> >>>>> The other competitive library is Brick House from Klout.
>> >>>>> http://engineering.klout.com/2013/01/introducing-brickhouse-major-open-source-release-from-klout/
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> On Tue, Nov 12, 2013 at 7:28 PM, Timothy Chen <[email protected]> wrote:
>> >>>>>
>> >>>>>> Just saw this library today and thought it's something we can
>> >>>>>> potentially leverage:
>> >>>>>>
>> >>>>>> https://github.com/addthis/stream-lib
>> >>>>>>
>> >>>>>> It has a number of algorithms for approximation over streams and has
>> >>>>>> code for cardinality estimation (HyperLogLog) and others.
>> >>>>>>
>> >>>>>> Looks like Twitter's SummingBird uses this library too.
>> >>>>>>
>> >>>>>> Tim
>> >>>>
>>
