[ 
https://issues.apache.org/jira/browse/HBASE-1200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicolas Spiegelberg updated HBASE-1200:
---------------------------------------

        Summary: Add bloomfilters  (was: Add bloomfilters; use 
dynamicbloomfilter instead of base bloomfilter)
    Description: Add bloomfiltering to hfile.  Can be enabled on a family-level 
basis.  Ability to configure a row vs row+col level bloom.  We size the 
bloomfilter with the number of entries we are about to flush, which usually 
seems to make the filter bigger than needed, so our implementation needs to 
take that into account.  (was: Add bloomfiltering to hfile.  Should it be optional 
or on always?  Currently, we bloom filter rows only, not the column + ts 
component, which seems good place to start but we size the bloomfilter with the 
number of entries we are about to flush which seems like usually we'd be making 
a filter too big.  How to figure how many rows in the flush?   We should use 
the DynamicBloomFilter as Andrezj does up in hadoop BloomFilterMapFile.  Start 
small and let it resize as entries are added.)

Updated the title and description text.  Note that I removed the 
DynamicBloomFilter requirement.  I will send out a document to complement the 
code fix, covering the implementation reasoning and possible future alternatives.

> Add bloomfilters
> ----------------
>
>                 Key: HBASE-1200
>                 URL: https://issues.apache.org/jira/browse/HBASE-1200
>             Project: Hadoop HBase
>          Issue Type: Task
>            Reporter: stack
>            Assignee: Nicolas Spiegelberg
>             Fix For: 0.21.0
>
>         Attachments: ryan_bloomfilter.patch
>
>
> Add bloomfiltering to hfile.  Can be enabled on a family-level basis.  
> Ability to configure a row vs row+col level bloom.  We size the bloomfilter 
> with the number of entries we are about to flush, which usually seems to 
> make the filter bigger than needed, so our implementation needs to take 
> that into account.
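
To illustrate the sizing concern in the description, here is a minimal sketch 
using the textbook Bloom filter formulas.  It is not taken from the attached 
patch or from HBase; the class name, method names, and the numbers in main 
are hypothetical.  It shows that a filter sized for every flushed entry, but 
only populated with the distinct row keys, ends up far below its target 
false-positive rate, which means the space was over-allocated.

```java
/** Textbook Bloom filter sizing math; an illustration, not the HBASE-1200 code. */
final class BloomSizing {
    private BloomSizing() {}

    /** Bits needed for n keys at target false-positive rate p: m = -n ln(p) / (ln 2)^2. */
    static long optimalBits(long n, double p) {
        if (n <= 0 || p <= 0.0 || p >= 1.0) {
            throw new IllegalArgumentException("need n > 0 and 0 < p < 1");
        }
        double ln2 = Math.log(2.0);
        return (long) Math.ceil(-n * Math.log(p) / (ln2 * ln2));
    }

    /** Number of hash functions that minimizes the error rate: k = (m / n) ln 2, at least 1. */
    static int optimalHashes(long m, long n) {
        return (int) Math.max(1L, Math.round((double) m / n * Math.log(2.0)));
    }

    /** Approximate false-positive rate after inserting n keys: (1 - e^(-kn/m))^k. */
    static double falsePositiveRate(long m, int k, long n) {
        return Math.pow(1.0 - Math.exp(-(double) k * n / m), k);
    }

    public static void main(String[] args) {
        long flushedEntries = 1000; // hypothetical: every entry in the flush
        long distinctRows = 100;    // hypothetical: row keys actually added to a row-level bloom
        double target = 0.01;       // hypothetical target error rate

        long bits = optimalBits(flushedEntries, target);
        int hashes = optimalHashes(bits, flushedEntries);

        System.out.printf("bits=%d hashes=%d%n", bits, hashes);
        System.out.printf("rate if all entries were keys: %.6f%n",
                falsePositiveRate(bits, hashes, flushedEntries));
        System.out.printf("rate with distinct rows only:  %.3e%n",
                falsePositiveRate(bits, hashes, distinctRows));
    }
}
```

Sizing by the expected number of distinct keys instead, or shrinking the 
filter once the real key count is known, are two general ways to avoid the 
waste; which approach the implementation takes is not stated in this message.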

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
