[ 
https://issues.apache.org/jira/browse/METRON-627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15754931#comment-15754931
 ] 

ASF GitHub Bot commented on METRON-627:
---------------------------------------

Github user mmiklavc commented on a diff in the pull request:

    https://github.com/apache/incubator-metron/pull/397#discussion_r92846025
  
    --- Diff: metron-platform/metron-common/README.md ---
    @@ -229,21 +249,33 @@ Using parens such as: "foo" : "\<ok\>" requires 
escaping; "foo": "\'\<ok\>\'"
         * input - List
       * Returns: Last element of the list
     
    -### `FILL_LEFT`
    -  * Description: Fills or pads a given string with a given character, to a 
given length on the left
    +### `HLLP_CARDINALITY`
    +  * Description: Returns HyperLogLogPlus-estimated cardinality for this set
       * Input:
    -    * input - string
    -    * fill - the fill character
    -    * len - the required length
    -  * Returns: the filled string
    +    * hyperLogLogPlus - the hllp set
    +  * Returns: Long value representing the cardinality for this set
     
    -### `FILL_RIGHT`
    -  * Description: Fills or pads a given string with a given character, to a 
given length on the right
    +### `HLLP_INIT`
    +  * Description: Initializes the set
       * Input:
    -    * input - string
    -    * fill - the fill character string
    -    * len - the required length
    -  * Returns: Last element of the list
    +    * p (required) - the precision value for the normal set
    +    * sp - the precision value for the sparse set. If sp is not specified 
the sparse set will be disabled.
    +  * Returns: A new HyperLogLogPlus set
    +
    +### `HLLP_MERGE`
    +  * Description: Merge hllp sets together
    +  * Input:
    +    * hllp1 - first hllp set
    +    * hllp2 - second hllp set
    +    * hllpn - additional sets to merge
    +  * Returns: A new merged HyperLogLogPlus estimator set
    +
    +### `HLLP_OFFER`
    +  * Description: Add value to the set
    +  * Input:
    +    * hyperLogLogPlus - the hllp set
    +    * o - Object to add to the set
    --- End diff --
    
    Heh, I just made the same comment above. Incidentally, our BLOOM_ADD 
currently works like a vararg. I may open a separate PR to change that as well.


> Add HyperLogLogPlus implementation to Stellar
> ---------------------------------------------
>
>                 Key: METRON-627
>                 URL: https://issues.apache.org/jira/browse/METRON-627
>             Project: Metron
>          Issue Type: Improvement
>            Reporter: Michael Miklavcic
>
> Calculating set cardinality can be a useful tool for a security analyst. For 
> instance, a large volume of non-unique src ip addresses hitting your network 
> may be an indication that you are currently under attack. There have been 
> many advancements in distinct value (DV) estimation over the years. We have 
> seen implementations evolve from K-Minimum-Values (KMV), to LogLog, to 
> HyperLogLog, and now to Google's much-improved HyperLogLogPlus algorithm. The 
> key improvements in this latest manifestation of the algorithm are:
> * moves to a 64-bit hash
> * handles sparse sets
> * is more accurate with small cardinality
> This Jira tracks the effort to add a HyperLogLogPlus implementation to Metron.
> References:
> https://research.neustar.biz/2013/01/24/hyperloglog-googles-take-on-engineering-hll/
> http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/40671.pdf
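
For readers unfamiliar with the estimator family described above, the following is a minimal sketch of plain HyperLogLog in Java, not HyperLogLogPlus: it has no sparse representation and no empirical bias correction, and it is not the code from this pull request. The class name HllSketch, its method names, and the hash function (64-bit FNV-1a plus a mixing step) are assumptions made for illustration only. The offer/merge/cardinality methods loosely mirror the roles of HLLP_OFFER, HLLP_MERGE and HLLP_CARDINALITY in the diff, and the constructor argument plays the role of the precision p.

```java
import java.nio.charset.StandardCharsets;

// Illustrative plain HyperLogLog (hypothetical class, not Metron's implementation).
final class HllSketch {
    private final int p;             // precision: number of bits used as register index
    private final int m;             // number of registers, m = 2^p
    private final byte[] registers;  // each register keeps the largest rank seen

    public HllSketch(int p) {
        if (p < 7 || p > 16) {
            throw new IllegalArgumentException("p must be in [7, 16] for this sketch");
        }
        this.p = p;
        this.m = 1 << p;
        this.registers = new byte[m];
    }

    // Assumed hash: 64-bit FNV-1a followed by a splitmix64-style finalizer.
    static long hash64(String s) {
        long h = 0xcbf29ce484222325L;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;
        }
        h ^= h >>> 30; h *= 0xbf58476d1ce4e5b9L;
        h ^= h >>> 27; h *= 0x94d049bb133111ebL;
        h ^= h >>> 31;
        return h;
    }

    // Add a value: the top p bits pick a register, the rest determine the rank.
    public void offer(String value) {
        long h = hash64(value);
        int idx = (int) (h >>> (64 - p));
        long rest = h << p;  // remaining 64 - p bits, left-aligned
        int rank = Math.min(Long.numberOfLeadingZeros(rest) + 1, 64 - p + 1);
        if (rank > registers[idx]) {
            registers[idx] = (byte) rank;
        }
    }

    // Merge two sketches of equal precision by taking the register-wise maximum.
    public HllSketch merge(HllSketch other) {
        if (other.p != p) {
            throw new IllegalArgumentException("precision mismatch");
        }
        HllSketch out = new HllSketch(p);
        for (int i = 0; i < m; i++) {
            out.registers[i] = (byte) Math.max(registers[i], other.registers[i]);
        }
        return out;
    }

    // Estimate the number of distinct values offered so far.
    public long cardinality() {
        double sum = 0.0;
        int zeros = 0;
        for (byte r : registers) {
            sum += Math.pow(2, -r);
            if (r == 0) {
                zeros++;
            }
        }
        double alpha = 0.7213 / (1.0 + 1.079 / m);  // constant for m >= 128
        double estimate = alpha * m * m / sum;
        // Small-range correction: fall back to linear counting.
        if (estimate <= 2.5 * m && zeros > 0) {
            estimate = m * Math.log((double) m / zeros);
        }
        return Math.round(estimate);
    }

    public static void main(String[] args) {
        HllSketch left = new HllSketch(12);
        HllSketch right = new HllSketch(12);
        for (int i = 0; i < 5000; i++) {
            left.offer("src-" + i);
            right.offer("src-" + (i + 5000));
        }
        System.out.println("estimated distinct values: " + left.merge(right).cardinality());
    }
}
```

Because merging is a register-wise maximum, two sketches built on separate streams combine into the same state as one sketch that saw both streams, which is what makes this kind of estimator convenient for per-window or per-host aggregation.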



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
