[ 
https://issues.apache.org/jira/browse/HBASE-1200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-1200:
-------------------------

    Attachment: Bloom_Filters_in_HBase.pdf

Doc as PDF.

Here are some of Nicolas's answers to a few questions on the doc:

{code}
15:41 < St^Ack> So, what you do your hashing w/?
15:42 < nspiegelberg> I do murmur hash with combinatorial generation
15:43 < nspiegelberg> it's cache miss, but only need to compute the murmur 
twice, no matter the hashKey count
15:44  * St^Ack excellent
15:44 < St^Ack> So, its in the LRU cache.. whats that mean?
15:45 < nspiegelberg> every call to bloom.contain calls getMetaBlock(BF_DATA), 
which is LRU cache
15:45 < nspiegelberg> so CFs that aren't used don't have their blooms cached
15:46 < St^Ack> excellent
{code}
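
For readers who have not seen the "combinatorial generation" trick Nicolas mentions, here is a minimal sketch of the idea, not the code from the patch. Only two murmur hashes are computed per key, and every further probe position is derived as h1 + i*h2. Everything named here is an assumption for illustration: the class BloomHashSketch, the use of MurmurHash3 32-bit (the chat only says "murmur"), seeding the second hash with the first, and the modulo reduction.

```java
/** Sketch only: Bloom filter probes derived from two murmur hashes (double hashing). */
final class BloomHashSketch {
    private final java.util.BitSet bits;
    private final int numBits;
    private final int numHashes;

    BloomHashSketch(int numBits, int numHashes) {
        this.bits = new java.util.BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    /** MurmurHash3, 32-bit variant. Assumption: the patch may use a different murmur version. */
    static int murmur3_32(byte[] data, int seed) {
        final int c1 = 0xcc9e2d51, c2 = 0x1b873593;
        int h = seed;
        int len = data.length;
        int i = 0;
        // Body: mix each little-endian 4-byte block into the state.
        while (i + 4 <= len) {
            int k = (data[i] & 0xff) | ((data[i + 1] & 0xff) << 8)
                  | ((data[i + 2] & 0xff) << 16) | ((data[i + 3] & 0xff) << 24);
            k *= c1; k = Integer.rotateLeft(k, 15); k *= c2;
            h ^= k; h = Integer.rotateLeft(h, 13); h = h * 5 + 0xe6546b64;
            i += 4;
        }
        // Tail: the remaining 1 to 3 bytes (cases fall through on purpose).
        int tail = 0;
        switch (len & 3) {
            case 3: tail ^= (data[i + 2] & 0xff) << 16;
            case 2: tail ^= (data[i + 1] & 0xff) << 8;
            case 1: tail ^= (data[i] & 0xff);
                    tail *= c1; tail = Integer.rotateLeft(tail, 15); tail *= c2;
                    h ^= tail;
        }
        // Finalization: avalanche the bits.
        h ^= len;
        h ^= h >>> 16; h *= 0x85ebca6b;
        h ^= h >>> 13; h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    /** Two murmur calls, then numHashes probe positions via h1 + i*h2. */
    static int[] indexes(byte[] key, int numHashes, int numBits) {
        int h1 = murmur3_32(key, 0);
        int h2 = murmur3_32(key, h1); // assumption: second hash is seeded with the first
        int[] out = new int[numHashes];
        for (int i = 0; i < numHashes; i++) {
            out[i] = Math.floorMod(h1 + i * h2, numBits); // always in [0, numBits)
        }
        return out;
    }

    void add(byte[] key) {
        for (int idx : indexes(key, numHashes, numBits)) bits.set(idx);
    }

    boolean mightContain(byte[] key) {
        for (int idx : indexes(key, numHashes, numBits)) {
            if (!bits.get(idx)) return false; // definitely absent
        }
        return true; // present, or a false positive
    }

    public static void main(String[] args) {
        BloomHashSketch bloom = new BloomHashSketch(1 << 16, 7);
        byte[] row = "row-0001".getBytes(java.nio.charset.StandardCharsets.UTF_8);
        bloom.add(row);
        System.out.println(bloom.mightContain(row)); // true: an added key is never reported absent
    }
}
```

The cost per lookup is two hash computations plus numHashes bit probes, which matches the "only need to compute the murmur twice" remark. The probes land at scattered positions in the bit array, which is presumably the cache-miss cost he refers to.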

> Add bloomfilters
> ----------------
>
>                 Key: HBASE-1200
>                 URL: https://issues.apache.org/jira/browse/HBASE-1200
>             Project: Hadoop HBase
>          Issue Type: Task
>    Affects Versions: 0.20.5
>            Reporter: stack
>            Assignee: Nicolas Spiegelberg
>             Fix For: 0.20.5
>
>         Attachments: Bloom Filters in HBase.docx, Bloom_Filters_in_HBase.pdf, 
> HBASE-1200-0.20.5.patch, ryan_bloomfilter.patch
>
>
> Add bloom filtering to HFile.  It can be enabled on a per-family basis,
> with the ability to configure a row-level vs. row+col-level bloom.  We
> size the bloom filter by the number of entries we are about to flush,
> which will usually make the filter too big, so our implementation needs
> to take that into account.
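
To make the sizing concern concrete, here is a rough sketch using the textbook Bloom filter formulas, m = -n ln(p) / (ln 2)^2 bits and k = (m/n) ln 2 hash functions. It is not taken from the patch: the class name BloomSizingSketch, the 1% error rate, and the entry counts are hypothetical. It only shows that the bit count grows linearly with the key estimate, so sizing by flushed entries rather than distinct keys inflates the filter by the same ratio.

```java
/** Sketch only: textbook Bloom filter sizing; names and numbers are illustrative. */
final class BloomSizingSketch {
    /** Bits needed for expectedKeys at the target false-positive rate. */
    static long optimalBits(long expectedKeys, double falsePositiveRate) {
        double ln2 = Math.log(2.0);
        return (long) Math.ceil(-expectedKeys * Math.log(falsePositiveRate) / (ln2 * ln2));
    }

    /** Number of hash functions that minimizes false positives for this bits/keys ratio. */
    static int optimalHashCount(long bits, long expectedKeys) {
        return Math.max(1, (int) Math.round((double) bits / expectedKeys * Math.log(2.0)));
    }

    public static void main(String[] args) {
        double errorRate = 0.01;        // hypothetical target
        long flushedEntries = 1_000_000; // hypothetical: every KeyValue about to be flushed
        long distinctRows = 250_000;     // hypothetical: rows a row-level bloom actually stores

        long bitsByEntries = optimalBits(flushedEntries, errorRate);
        long bitsByRows = optimalBits(distinctRows, errorRate);

        System.out.println("sized by entries: " + bitsByEntries / 8 + " bytes");
        System.out.println("sized by rows:    " + bitsByRows / 8 + " bytes");
        System.out.println("hash count:       " + optimalHashCount(bitsByRows, distinctRows));
    }
}
```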

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
