[ 
https://issues.apache.org/jira/browse/METRON-627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15770848#comment-15770848
 ] 

ASF GitHub Bot commented on METRON-627:
---------------------------------------

Github user nickwallen commented on the issue:

    https://github.com/apache/incubator-metron/pull/397
  
    I was thinking about how I could use your HLLP functionality with the 
Profiler.  I think I could use it to track the in-degree and out-degree of 
a host over time.  This might be an interesting motivating example.
    
    For example, calculating the in-degree would look like the following.
    ```
    {
      "profile": "in-degree",
      "onlyif": "source.type == 'yaf'",
      "foreach": "ip_dst_addr",
      "init": {
        "in": "HLLP_INIT(5, 6)"
      },
      "update": {
        "in": "HLLP_ADD(in, ip_src_addr)"
      },
      "result": "HLLP_CARDINALITY(in)"
    }
    ```
    
    Calculating the out-degree would look like this.
    ```
    {
      "profile": "out-degree",
      "onlyif": "source.type == 'yaf'",
      "foreach": "ip_src_addr",
      "init": {
        "out": "HLLP_INIT(5, 6)"
      },
      "update": {
        "out": "HLLP_ADD(out, ip_dst_addr)"
      },
      "result": "HLLP_CARDINALITY(out)"
    }
    ```
    
    
    Of course, with your `HLLP_MERGE` it might be more interesting to store the 
HLLP object itself and call `HLLP_CARDINALITY` after the fact, but I just want 
to make sure I'm using the API correctly.
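
As a rough illustration of why storing the sketch and merging later may be more useful than storing only the per-window cardinality, here is a minimal Python sketch. It uses exact Python sets as a stand-in for HLLP objects (an assumption made purely for illustration; it is not the Stellar API and not an HLLP implementation), and the helper names `update_profiles` and `merged_in_degree` are hypothetical. The point it shows is that per-window counts cannot simply be summed, because the same source may appear in several windows, while merged sets give the distinct count over the whole period.

```python
from collections import defaultdict


def update_profiles(profiles, window, messages):
    """Add each message's source address to the set kept for
    (destination host, window). The set plays the role that an
    HLLP object would play in the profile's 'update' step."""
    for msg in messages:
        profiles[(msg["ip_dst_addr"], window)].add(msg["ip_src_addr"])


def merged_in_degree(profiles, host, windows):
    """Union the stored per-window sets for a host and count the result.
    This mirrors merging stored sketches and then taking the cardinality."""
    merged = set()
    for w in windows:
        merged |= profiles.get((host, w), set())
    return len(merged)


# Hypothetical sample traffic: two windows, one destination host.
profiles = defaultdict(set)
update_profiles(profiles, 0, [
    {"ip_src_addr": "10.0.0.1", "ip_dst_addr": "10.0.0.9"},
    {"ip_src_addr": "10.0.0.2", "ip_dst_addr": "10.0.0.9"},
])
update_profiles(profiles, 1, [
    {"ip_src_addr": "10.0.0.2", "ip_dst_addr": "10.0.0.9"},
    {"ip_src_addr": "10.0.0.3", "ip_dst_addr": "10.0.0.9"},
])

# In-degree per window, as a profile that stores only the cardinality would report.
per_window = [len(profiles[("10.0.0.9", w)]) for w in (0, 1)]
print(per_window)  # [2, 2]

# In-degree across both windows, computed from the merged sets.
print(merged_in_degree(profiles, "10.0.0.9", [0, 1]))  # 3
```

Summing the per-window values would give 4, while the merged result is 3 because 10.0.0.2 shows up in both windows. A real HLLP sketch would return an estimate rather than an exact count.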



> Add HyperLogLogPlus implementation to Stellar
> ---------------------------------------------
>
>                 Key: METRON-627
>                 URL: https://issues.apache.org/jira/browse/METRON-627
>             Project: Metron
>          Issue Type: Improvement
>            Reporter: Michael Miklavcic
>            Assignee: Michael Miklavcic
>
> Calculating set cardinality can be a useful tool for a security analyst. For 
> instance, a large volume of non-unique src ip addresses hitting your network 
> may be an indication that you are currently under attack. There have been 
> many advancements in distinct value (DV) estimation over the years. We have 
> seen implementations evolve from K-Minimum-Values (KMV), to LogLog, to 
> HyperLogLog, and now to Google's much-improved HyperLogLogPlus algorithm. The 
> key improvements in this latest manifestation of the algorithm are:
> * moves to a 64-bit hash
> * handles sparse sets
> * is more accurate with small cardinality
> This Jira tracks the effort to add a HyperLogLogPlus implementation to Metron.
> References:
> https://research.neustar.biz/2013/01/24/hyperloglog-googles-take-on-engineering-hll/
> http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/40671.pdf



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
