[ 
https://issues.apache.org/jira/browse/CASSANDRA-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605093#comment-13605093
 ] 

Sylvain Lebresne commented on CASSANDRA-4885:
---------------------------------------------

Let's say that I don't feel extremely strongly either way. On the one hand, I 
agree that they are almost surely of limited use in practice, but at the same 
time, keeping them as a disabled-by-default option for one or two versions 
wouldn't cost us much and seems safer to me. If no one complains of a 
performance drop, then cool, we drop them in a few releases. But if some people 
do experience a performance degradation in some of their workloads, then at 
least we have the option to check whether it is indeed due to column BFs. And 
if it is, we might learn they are more useful than we thought.

bq. a rare simplification of our increasingly baroque storage engine code

I hear you. But at the same time column BFs don't add much complexity. Besides, 
all I'm suggesting is a more incremental/prudent way to remove them.

bq. in the rare case of reading single (CQL) rows from a large partition

I note that it can also help for static rows as this might save you from 
checking the data file at all in some cases. Not to pretend that this would be 
very common either.  
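
To make the point about skipping the data file concrete, here is a minimal 
sketch of the idea, not Cassandra's actual code. The names BloomFilter, Row, 
column_filter and read_column are all hypothetical, and the in-memory dict 
merely stands in for the row's data on disk. It shows a by-name read consulting 
a per-row column filter first and touching the data only when the filter cannot 
rule the column out.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may give false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key):
        # Derive several bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


class Row:
    """Hypothetical stand-in for one row's data plus its column filter."""

    def __init__(self, columns):
        self._columns = dict(columns)  # assumption: pretend this lives on disk
        self.column_filter = BloomFilter()
        for name in self._columns:
            self.column_filter.add(name)
        self.data_reads = 0  # counts how often the "data file" is touched

    def read_column(self, name):
        # Filter says "definitely absent": answer without reading the data.
        if not self.column_filter.might_contain(name):
            return None
        # Filter says "maybe present": we have to read the data to know.
        self.data_reads += 1
        return self._columns.get(name)


row = Row({"name": "alice", "email": "alice@example.org"})
present = row.read_column("name")
absent = row.read_column("no_such_column")
```

A slice query cannot use such a filter, since it asks for a range of columns 
rather than a specific name, which is the limitation the issue description 
raises for large rows.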
                
> Remove or rework per-row bloom filters
> --------------------------------------
>
>                 Key: CASSANDRA-4885
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4885
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jason Brown
>             Fix For: 2.0
>
>         Attachments: 0001-CASSANRDA-4885-Remove-per-row-bloom-filter.patch, 
> 0002-CASSANRDA-4885-update-test.patch, 4885-v1.patch
>
>
> Per-row bloom filters may be a misfeature.
> On small rows we don't create them.
> On large rows we essentially only do slice queries that can't take advantage 
> of it.
> And on very large rows if we ever did deserialize it, the performance hit of 
> doing so would outweigh the benefit of skipping the actual read.

