[ https://issues.apache.org/jira/browse/HBASE-1200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HBASE-1200:
-------------------------
Attachment: Bloom_Filters_in_HBase.pdf
Doc as PDF.
Here are some of Nicolas's answers to a few questions on the doc:
{code}
15:41 < St^Ack> So, what you do your hashing w/?
15:42 < nspiegelberg> I do murmur hash with combinatorial generation
15:43 < nspiegelberg> it's cache miss, but only need to compute the murmur
twice, no matter the hashKey count
15:44 * St^Ack excellent
15:44 < St^Ack> So, its in the LRU cache.. whats that mean?
15:45 < nspiegelberg> every call to bloom.contain calls getMetaBlock(BF_DATA),
which is LRU cache
15:45 < nspiegelberg> so CFs that aren't used don't have their blooms cached
15:46 < St^Ack> excellent
{code}
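For anyone reading the log without the doc open, here is a minimal Java sketch of what "murmur hash with combinatorial generation" can look like: the key is hashed twice, and every further bit position is derived as h1 + i * h2, so the cost stays at two hash computations no matter how many hash functions the filter uses. This is not the code from the attached patch. The class name DoubleHashBloom, the Murmur2-style mix (written for illustration and not guaranteed to match HBase's hash bit for bit), and the choice to seed the second hash with the first are all assumptions.

```java
import java.util.BitSet;

/** Illustrative bloom filter using two base hashes to derive all k positions. */
final class DoubleHashBloom {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    DoubleHashBloom(int numBits, int numHashes) {
        if (numBits <= 0 || numHashes <= 0) {
            throw new IllegalArgumentException("numBits and numHashes must be positive");
        }
        this.numBits = numBits;
        this.numHashes = numHashes;
        this.bits = new BitSet(numBits);
    }

    /** Sets the k bit positions derived from the key. */
    void add(byte[] key) {
        int h1 = murmur2(key, 0);
        int h2 = murmur2(key, h1); // assumption: second hash is seeded with the first
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(h1, h2, i));
        }
    }

    /** Returns false if the key was definitely never added, true if it may have been. */
    boolean mightContain(byte[] key) {
        int h1 = murmur2(key, 0);
        int h2 = murmur2(key, h1);
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(h1, h2, i))) {
                return false;
            }
        }
        return true;
    }

    /** Combines the two base hashes into the i-th bit position. */
    private int index(int h1, int h2, int i) {
        long combined = (long) h1 + (long) i * h2;
        return (int) Math.floorMod(combined, (long) numBits);
    }

    /** Murmur2-style 32-bit hash, for illustration only. */
    static int murmur2(byte[] data, int seed) {
        final int m = 0x5bd1e995;
        final int r = 24;
        int len = data.length;
        int h = seed ^ len;
        int i = 0;
        // Mix four bytes at a time.
        while (len - i >= 4) {
            int k = (data[i] & 0xff)
                  | ((data[i + 1] & 0xff) << 8)
                  | ((data[i + 2] & 0xff) << 16)
                  | ((data[i + 3] & 0xff) << 24);
            k *= m;
            k ^= k >>> r;
            k *= m;
            h *= m;
            h ^= k;
            i += 4;
        }
        // Mix the remaining 0 to 3 bytes (cases fall through on purpose).
        switch (len - i) {
            case 3: h ^= (data[i + 2] & 0xff) << 16;
            case 2: h ^= (data[i + 1] & 0xff) << 8;
            case 1: h ^= (data[i] & 0xff);
                    h *= m;
            default: break;
        }
        h ^= h >>> 13;
        h *= m;
        h ^= h >>> 15;
        return h;
    }
}
```

Usage would be along the lines of new DoubleHashBloom(1024, 4), then add(rowKey) at flush time and mightContain(rowKey) on the read path. A false result means the key is definitely absent; a true result still requires checking the file.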
> Add bloomfilters
> ----------------
>
> Key: HBASE-1200
> URL: https://issues.apache.org/jira/browse/HBASE-1200
> Project: Hadoop HBase
> Issue Type: Task
> Affects Versions: 0.20.5
> Reporter: stack
> Assignee: Nicolas Spiegelberg
> Fix For: 0.20.5
>
> Attachments: Bloom Filters in HBase.docx, Bloom_Filters_in_HBase.pdf,
> HBASE-1200-0.20.5.patch, ryan_bloomfilter.patch
>
>
> Add bloom filtering to HFile. It can be enabled on a per-family basis, with
> the ability to configure a row-level vs. row+col-level bloom. We size the
> bloom filter from the number of entries we are about to flush, which seems
> likely to make the filter too big in most cases, so our implementation
> needs to take that into account.
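To make the sizing concern concrete, here is a small Java sketch using the standard bloom filter formulas: for n keys and a target false-positive rate p, the bit count is m = -n ln(p) / (ln 2)^2 and the hash count is k = (m / n) ln 2. The class and method names (BloomSizing, optimalBits, optimalHashes) and the 1% error rate in the usage note are assumptions for illustration, not values taken from the patch.

```java
/** Illustrative sizing helpers based on the textbook bloom filter formulas. */
final class BloomSizing {
    private BloomSizing() {}

    /** Bits needed to hold expectedKeys at the given false-positive rate. */
    static long optimalBits(long expectedKeys, double errorRate) {
        if (expectedKeys <= 0 || errorRate <= 0.0 || errorRate >= 1.0) {
            throw new IllegalArgumentException("need expectedKeys > 0 and 0 < errorRate < 1");
        }
        double ln2 = Math.log(2.0);
        return (long) Math.ceil(-expectedKeys * Math.log(errorRate) / (ln2 * ln2));
    }

    /** Number of hash functions that minimizes false positives for this size. */
    static int optimalHashes(long numBits, long expectedKeys) {
        long k = Math.round((double) numBits / expectedKeys * Math.log(2.0));
        return (int) Math.max(1L, k);
    }

    /** Fraction of the allocated bits that the actual key count did not need. */
    static double wastedFraction(long estimatedKeys, long actualKeys, double errorRate) {
        long allocated = optimalBits(estimatedKeys, errorRate);
        long needed = optimalBits(actualKeys, errorRate);
        return Math.max(0.0, 1.0 - (double) needed / allocated);
    }
}
```

For example, BloomSizing.optimalBits(1000, 0.01) comes to roughly 9.6 bits per key with 7 hash functions. If the filter is sized for 1000 flushed entries but only 250 distinct keys are actually inserted, wastedFraction(1000, 250, 0.01) reports about three quarters of the bits as unnecessary, which is the kind of overshoot the description says the implementation has to handle.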