[ 
https://issues.apache.org/jira/browse/METRON-627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15755355#comment-15755355
 ] 

ASF GitHub Bot commented on METRON-627:
---------------------------------------

Github user mmiklavc commented on the issue:

    https://github.com/apache/incubator-metron/pull/397
  
    @ottobackwards +1 on some working examples. We currently plop the whole lot 
of Stellar documentation in the metron-common README, and I wonder if we should 
start splitting some of it out into separate md files. This set of functions 
really works as a small suite, and the same goes for the Bloom filter functions. 
A deeper dive on each might prove useful. I like @cestella's recommendation of 
using the profiler for an end-to-end test. That really illustrates the value 
from a user/customer perspective. 
    
    I think we want to consider how we choose to use the Metron wiki vs. the 
README files. It feels like we could keep the READMEs lean and mean, dealing 
specifically with relevant options and links to external papers and 
documentation. Then on the wiki we can go into greater technical detail and 
maybe even start assembling a cookbook of examples. Thoughts?


> Add HyperLogLogPlus implementation to Stellar
> ---------------------------------------------
>
>                 Key: METRON-627
>                 URL: https://issues.apache.org/jira/browse/METRON-627
>             Project: Metron
>          Issue Type: Improvement
>            Reporter: Michael Miklavcic
>
> Calculating set cardinality can be a useful tool for a security analyst. For 
> instance, a large volume of non-unique source IP addresses hitting your 
> network may be an indication that you are currently under attack. There have 
> been many advancements in distinct value (DV) estimation over the years. We 
> have seen implementations evolve from K-Minimum-Values (KMV), to LogLog, to 
> HyperLogLog, and now to Google's much-improved HyperLogLogPlus algorithm. The 
> key improvements in this latest manifestation of the algorithm are:
> * moves to a 64-bit hash
> * handles sparse sets
> * is more accurate with small cardinality
> This Jira tracks the effort to add a HyperLogLogPlus implementation to Metron.
> References:
> https://research.neustar.biz/2013/01/24/hyperloglog-googles-take-on-engineering-hll/
> http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/40671.pdf
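
To make the idea in the description concrete, here is a minimal sketch of cardinality estimation with a plain HyperLogLog over a 64-bit hash. It is not HyperLogLogPlus and not the Metron or Stellar implementation: it has no sparse representation and no HLL++ bias correction, only the classic linear-counting fallback for small sets. The class name SimpleHLL, the precision p=12, and the use of truncated SHA-1 as the hash are all assumptions made for illustration.

```python
import hashlib
import math


class SimpleHLL:
    """Plain HyperLogLog over a 64-bit hash (hypothetical name, illustration only)."""

    def __init__(self, p=12):
        self.p = p                       # number of bits used to pick a register
        self.m = 1 << p                  # number of registers
        self.registers = [0] * self.m
        # Standard bias constant, valid for m >= 128
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item):
        # 64-bit hash from the first 8 bytes of SHA-1 (an assumed, illustrative choice)
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                    # top p bits select the register
        rest = h & ((1 << (64 - self.p)) - 1)       # remaining 64 - p bits
        # Rank: 1-based position of the leftmost 1-bit in the remaining bits
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def cardinality(self):
        # Harmonic-mean estimate over all registers
        raw = self.alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros > 0:
            # Linear counting is more accurate while many registers are still empty
            return round(self.m * math.log(self.m / zeros))
        return round(raw)


if __name__ == "__main__":
    hll = SimpleHLL()
    for i in range(50000):
        hll.add("10.0.%d.%d" % (i // 256, i % 256))   # synthetic source IPs
    print("estimated distinct source IPs:", hll.cardinality())
```

With p=12 the sketch keeps 4096 registers regardless of how many values it sees, and the expected relative error is roughly 1.04/sqrt(4096), or about 1.6%. Adding the same value again never changes the registers, which is why duplicates do not inflate the estimate.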



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
