[ 
https://issues.apache.org/jira/browse/CASSANDRA-4303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13288012#comment-13288012
 ] 

Peter Schuller commented on CASSANDRA-4303:
-------------------------------------------

I'm highly skeptical of not locking BF in memory.

Whenever you page out to disk, you're instantly killed from a disk I/O 
perspective, since by definition there is no locality whatsoever in the 
bloom filter access pattern. Caching won't be efficient either: with a 
sparsely accessed BF you're pulling in a 4k page or more to read a single 
bit of information.
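
To put rough numbers on the caching point, here is a minimal sketch. The 4 KiB 
page size comes from the comment above; the function names, the uniform-hashing 
model, and the probe count and filter size in the usage note are illustrative 
assumptions, not anything taken from Cassandra.

```python
PAGE_BYTES = 4096  # "a 4k page", as in the comment; larger pages only make it worse


def useful_fraction_per_probe(page_bytes: int = PAGE_BYTES) -> float:
    """Fraction of a faulted-in page that a single-bit probe actually uses."""
    return 1.0 / (page_bytes * 8)


def expected_distinct_pages(num_probes: int, filter_bytes: int,
                            page_bytes: int = PAGE_BYTES) -> float:
    """Expected number of distinct pages touched by one lookup.

    Assumes each probe lands on a uniformly random page of the filter,
    which is what a well-behaved hash function gives you.
    """
    pages = max(1, filter_bytes // page_bytes)
    return pages * (1.0 - (1.0 - 1.0 / pages) ** num_probes)
```

Under these assumptions each probe uses 1 bit out of 32768 in the page it pulls 
in, and a lookup with a hypothetical 5 probes against a 1 GiB filter touches 
very nearly 5 distinct pages, so consecutive lookups share almost nothing.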

Put it this way, if even 1% of your bloom filter is not in memory, your 
performance will be *abysmal* in relation to any CPU bound workload, if you're 
on platters.
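
A back-of-the-envelope model of the 1% case, as a sketch only. The latency 
figures (about 0.1 microseconds for a probe served from memory, about 10 
milliseconds for a probe that faults to a spinning disk) and the probe count 
are assumed round numbers for illustration, not measurements, and the function 
name is hypothetical.

```python
def expected_lookup_latency_us(miss_fraction: float, probes: int,
                               mem_us: float, disk_us: float) -> float:
    """Expected latency of one bloom filter lookup, in microseconds.

    Each probe independently hits a resident page with probability
    (1 - miss_fraction) and otherwise pays a full disk access.
    """
    per_probe = (1.0 - miss_fraction) * mem_us + miss_fraction * disk_us
    return probes * per_probe


# Assumed figures: 5 probes, 0.1 us in memory, 10 ms (10_000 us) on platters.
resident = expected_lookup_latency_us(0.0, 5, 0.1, 10_000.0)
one_percent_out = expected_lookup_latency_us(0.01, 5, 0.1, 10_000.0)
slowdown = one_percent_out / resident
```

With these assumed numbers the disk term dominates completely: 1% of probes 
faulting costs on the order of a thousand times more per lookup than a fully 
resident filter, which is the sense in which a tiny non-resident fraction is 
enough to ruin performance.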

I don't think CPU efficiency is the interest here, nor the overhead of page 
faults. The problem is rather that you will be absolutely killed as soon as 
even a tiny fraction is no longer in memory.

SSDs may change the abysmal bit, since they are so fast that a multi-SSD 
machine will easily be CPU bound with Cassandra. But in that case, simply 
reading from the sstables isn't obviously slower than looking up the bloom 
filter. I'd expect it to be faster in many cases (fewer I/Os) if you're 
relying on the page cache while any significant part of the bloom filter is 
not in memory.
                
> Compressed bloomfilters
> -----------------------
>
>                 Key: CASSANDRA-4303
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4303
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Brandon Williams
>             Fix For: 1.2
>
>
> Very commonly, people encountering an OOM need to increase their bloom filter 
> false positive ratio to reduce memory pressure, since BFs tend to be the 
> largest shareholder.  It would make sense if we could alleviate the memory 
> pressure from BFs with compression while maintaining the FP ratio (at the 
> cost of a bit of cpu) that some users have come to expect.  One possible 
> implementation is at http://code.google.com/p/javaewah/

        

Reply via email to