[ https://issues.apache.org/jira/browse/METRON-627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15768313#comment-15768313 ]
ASF GitHub Bot commented on METRON-627:
---------------------------------------

GitHub user mmiklavc commented on a diff in the pull request:

    https://github.com/apache/incubator-metron/pull/397#discussion_r93529216

    --- Diff: metron-platform/metron-common/src/main/java/org/apache/metron/common/utils/HyperLogLogPlus.java ---
    @@ -0,0 +1,102 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements. See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership. The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License. You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.metron.common.utils;
    +
    +import com.clearspring.analytics.stream.cardinality.CardinalityMergeException;
    +import com.clearspring.analytics.stream.cardinality.ICardinality;
    +import com.google.common.collect.Lists;
    +
    +import java.io.Serializable;
    +import java.util.List;
    +
    --- End diff --

    Heads up, I searched arXiv and it doesn't appear that this paper is there.

> Add HyperLogLogPlus implementation to Stellar
> ---------------------------------------------
>
>                 Key: METRON-627
>                 URL: https://issues.apache.org/jira/browse/METRON-627
>             Project: Metron
>          Issue Type: Improvement
>            Reporter: Michael Miklavcic
>            Assignee: Michael Miklavcic
>
> Calculating set cardinality can be a useful tool for a security analyst. For
> instance, a large volume of non-unique source IP addresses hitting your network
> may be an indication that you are currently under attack. There have been
> many advancements in distinct value (DV) estimation over the years. We have
> seen implementations evolve from K-Minimum-Values (KMV), to LogLog, to
> HyperLogLog, and now to Google's much-improved HyperLogLogPlus algorithm. The
> key improvements in this latest manifestation of the algorithm are:
> * moves to a 64-bit hash
> * handles sparse sets
> * is more accurate at small cardinalities
>
> This Jira tracks the effort to add a HyperLogLogPlus implementation to Metron.
>
> References:
> https://research.neustar.biz/2013/01/24/hyperloglog-googles-take-on-engineering-hll/
> http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/40671.pdf

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
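
For readers who have not seen the technique the issue describes, here is a minimal sketch of distinct-value estimation in plain Java. It is not the code from the pull request and it is not HyperLogLogPlus: it is a simplified classic HyperLogLog that borrows only the 64-bit hash idea, with no sparse representation and no empirical bias correction. The class name SimpleHyperLogLog, the method names, and the hash (64-bit FNV-1a followed by the MurmurHash3 64-bit finalizer) are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only: classic HyperLogLog with a 64-bit hash.
// It omits the sparse mode and bias correction that HyperLogLogPlus adds.
final class SimpleHyperLogLog {
    private final int p;            // precision: number of bits used as register index
    private final int m;            // number of registers, 2^p
    private final byte[] registers; // each register keeps the largest rank seen

    SimpleHyperLogLog(int p) {
        if (p < 4 || p > 16) {
            throw new IllegalArgumentException("p must be in [4, 16]");
        }
        this.p = p;
        this.m = 1 << p;
        this.registers = new byte[m];
    }

    // Assumed hash: 64-bit FNV-1a over the UTF-8 bytes, then the
    // MurmurHash3 64-bit finalizer to spread the bits.
    static long hash64(String s) {
        long h = 0xcbf29ce484222325L;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;
        }
        h ^= h >>> 33;
        h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33;
        h *= 0xc4ceb9fe1a85ec53L;
        h ^= h >>> 33;
        return h;
    }

    // Record one observation, e.g. a source IP address.
    void offer(String value) {
        long h = hash64(value);
        int index = (int) (h >>> (64 - p)); // top p bits select a register
        long rest = h << p;                 // remaining 64 - p bits, left-aligned
        // Rank = position of the first 1 bit in the remaining bits (1-based).
        int rank = Math.min(Long.numberOfLeadingZeros(rest), 64 - p) + 1;
        if (rank > registers[index]) {
            registers[index] = (byte) rank;
        }
    }

    // Estimate the number of distinct values offered so far.
    long cardinality() {
        double sum = 0.0;
        int zeros = 0;
        for (byte r : registers) {
            sum += 1.0 / (1L << r);
            if (r == 0) {
                zeros++;
            }
        }
        double estimate = alpha(m) * m * (double) m / sum;
        // Small-range correction: fall back to linear counting.
        if (estimate <= 2.5 * m && zeros > 0) {
            estimate = m * Math.log((double) m / zeros);
        }
        return Math.round(estimate);
    }

    // Merge another sketch of the same precision by taking register maxima.
    void merge(SimpleHyperLogLog other) {
        if (other.p != p) {
            throw new IllegalArgumentException("precision mismatch");
        }
        for (int i = 0; i < m; i++) {
            if (other.registers[i] > registers[i]) {
                registers[i] = other.registers[i];
            }
        }
    }

    private static double alpha(int m) {
        switch (m) {
            case 16: return 0.673;
            case 32: return 0.697;
            case 64: return 0.709;
            default: return 0.7213 / (1.0 + 1.079 / m);
        }
    }
}
```

With p = 14 the sketch uses 16384 one-byte registers, and the usual HyperLogLog standard error of roughly 1.04 / sqrt(m) works out to a little under 1%. Offering the same value repeatedly never changes a register, so duplicates do not inflate the estimate, and merge lets per-window sketches be combined into an estimate for the union.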