[ https://issues.apache.org/jira/browse/HADOOP-2654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12565441#action_12565441 ]

Bryan Duxbury commented on HADOOP-2654:
---------------------------------------

We've been discussing making Bloom filters mandatory on MapFiles. The plan would 
also be to use standard (non-counting) Bloom filters, since we'd only create them 
when flushing or compacting and would never have to delete from them. In light of 
that, should we bother with this issue, or just resolve it as Won't Fix?

> CountingBloomFilter can overflow its storage
> --------------------------------------------
>
>                 Key: HADOOP-2654
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2654
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/hbase
>            Reporter: Stu Hood
>         Attachments: counting-overflow-fourbit.patch, 
> counting-overflow-fourbit.patch, counting-overflow.patch
>
>
> The org.onelab.filter.CountingBloomFilter implementation does not check the 
> value of a bucket before incrementing/decrementing it. The buckets in a 
> Counting Bloom filter must not be allowed to overflow, and if they reach 
> their maximum value, they must not be allowed to decrement. This is the only 
> way to preserve the assumptions of the filter (without larger buckets). See: 
> http://en.wikipedia.org/wiki/Bloom_filter#Counting_filters
> Currently, if enough values hash to a bucket, the CountingBloomFilter may 
> begin reporting false negatives when it wraps back around to 0.
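
For illustration, here is a minimal sketch of the saturating update the description 
calls for, assuming 4-bit buckets packed two per byte; the class and method names 
are hypothetical and not the actual org.onelab.filter API:

    // Hypothetical sketch: saturating 4-bit buckets for a counting Bloom filter.
    public class FourBitCounterSketch {
      private static final int MAX = 15;   // largest value a 4-bit bucket can hold
      private final byte[] buckets;        // two 4-bit buckets packed per byte

      public FourBitCounterSketch(int vectorSize) {
        buckets = new byte[(vectorSize + 1) / 2];
      }

      private int get(int i) {
        int b = buckets[i / 2] & 0xFF;
        return (i % 2 == 0) ? (b >>> 4) : (b & 0x0F);
      }

      private void set(int i, int v) {
        int b = buckets[i / 2] & 0xFF;
        buckets[i / 2] = (byte) ((i % 2 == 0) ? ((v << 4) | (b & 0x0F))
                                              : ((b & 0xF0) | v));
      }

      // Increment, but saturate at MAX instead of wrapping back around to 0.
      public void increment(int i) {
        int v = get(i);
        if (v < MAX) {
          set(i, v + 1);
        }
      }

      // Decrement, but never touch an empty or saturated bucket: once a bucket
      // has hit MAX its true count is unknown, and decrementing it could later
      // produce the false negatives described above.
      public void decrement(int i) {
        int v = get(i);
        if (v > 0 && v < MAX) {
          set(i, v - 1);
        }
      }
    }

The point of the saturation checks is that a bucket which reaches its maximum value 
simply stops changing, trading a slightly higher false-positive rate for the 
guarantee that the filter never reports a false negative.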

