Michael Miklavcic created METRON-627:
----------------------------------------

             Summary: Add HyperLogLogPlus implementation to Stellar
                 Key: METRON-627
                 URL: https://issues.apache.org/jira/browse/METRON-627
             Project: Metron
          Issue Type: Improvement
            Reporter: Michael Miklavcic


Calculating set cardinality can be a useful tool for a security analyst. For 
instance, an unusually large number of distinct source IP addresses hitting 
your network may indicate that you are under attack. There have been many 
advancements in distinct value (DV) estimation over the years. We have seen 
implementations evolve from K-Minimum-Values (KMV), to LogLog, to HyperLogLog, 
and now to Google's much-improved HyperLogLogPlus algorithm. The key 
improvements in this latest iteration of the algorithm are:
 * uses a 64-bit hash instead of a 32-bit one
 * handles sparse sets with a compact sparse representation
 * is more accurate at small cardinalities
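
To make the idea of distinct value estimation concrete, here is a minimal 
sketch of plain HyperLogLog with a 64-bit hash. It is not HyperLogLogPlus (it 
has no sparse representation and no empirical bias correction), and it is not 
the implementation proposed for Metron. The class name SimpleHyperLogLog, the 
choice of SHA-1 as the hash, and the default precision p=12 are all 
assumptions made for illustration.

```python
import hashlib
import math


class SimpleHyperLogLog:
    """Toy HyperLogLog with a 64-bit hash (hypothetical, illustration only)."""

    def __init__(self, p=12):
        self.p = p                      # number of bits used as register index
        self.m = 1 << p                 # number of registers
        self.registers = [0] * self.m

    @staticmethod
    def _hash64(value):
        # Assumption: first 8 bytes of SHA-1 serve as the 64-bit hash.
        digest = hashlib.sha1(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, value):
        h = self._hash64(value)
        idx = h >> (64 - self.p)                  # top p bits select a register
        rest = h & ((1 << (64 - self.p)) - 1)     # remaining 64 - p bits
        # Rank: 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def cardinality(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)          # constant for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros > 0:
            # Small-range correction: fall back to linear counting.
            return m * math.log(m / zeros)
        return raw


# Feed 50,000 distinct source IPs, each seen twice.
hll = SimpleHyperLogLog()
for i in range(50000):
    ip = f"10.{(i >> 16) & 255}.{(i >> 8) & 255}.{i & 255}"
    hll.add(ip)
    hll.add(ip)   # a repeated value never changes the registers

estimate = hll.cardinality()
print(round(estimate))
```

With 4096 registers the expected relative error is roughly 1.04 / sqrt(4096), 
or about 1.6%, while the sketch keeps only one small counter per register 
regardless of how many addresses it has seen.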

This Jira tracks the effort to add a HyperLogLogPlus implementation to Metron.

References:
https://research.neustar.biz/2013/01/24/hyperloglog-googles-take-on-engineering-hll/
http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/40671.pdf


