[jira] [Commented] (CASSANDRA-5182) Deletable rows are sometimes not removed during compaction

Sylvain Lebresne (JIRA) Fri, 01 Mar 2013 08:37:16 -0800

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-5182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13590683#comment-13590683
 ]


Sylvain Lebresne commented on CASSANDRA-5182:
---------------------------------------------

bq.  if we say "only disable BF where you're not doing deletes," it has a 
legitimate if narrow use case

I guess I agree on the principle that we should say "only disable BF where 
you're not doing deletes". That being said, if we do use getPosition, we extend 
the possible use cases, since it becomes "only disable BF where you're not doing 
deletes or your index fits entirely in RAM" (because getPosition will not 
destroy performance in the "not doing deletes" case, since we don't even call 
shouldPurge() unless we know there are tombstones).
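
To make the argument above concrete, here is a minimal sketch of the guard 
being described. It is a toy model, not Cassandra's code: PurgeSketch, SSTable, 
indexContains and canDropTombstones are hypothetical names, and indexContains 
only stands in for the exact, index-backed lookup that getPosition would 
provide.

```java
import java.util.List;
import java.util.Set;

// Hypothetical toy model; none of these types are Cassandra's real classes.
final class PurgeSketch {
    /** Stand-in for an overlapping SSTable that is not part of the compaction. */
    static final class SSTable {
        final Set<String> keys;  // keys actually stored in this SSTable
        int indexLookups = 0;    // counts simulated index reads

        SSTable(Set<String> keys) { this.keys = keys; }

        // Models an exact, index-backed lookup (the role getPosition plays above).
        boolean indexContains(String key) {
            indexLookups++;
            return keys.contains(key);
        }
    }

    /** Tombstones may be purged only if no overlapping SSTable still holds the key. */
    static boolean shouldPurge(String key, List<SSTable> overlapping) {
        for (SSTable sstable : overlapping) {
            if (sstable.indexContains(key)) {
                return false;
            }
        }
        return true;
    }

    /** The purge check, and its index cost, is skipped for rows without tombstones. */
    static boolean canDropTombstones(String key, boolean hasTombstones,
                                     List<SSTable> overlapping) {
        return hasTombstones && shouldPurge(key, overlapping);
    }
}
```

In this model a row without tombstones never triggers an index lookup, so the 
extra cost only shows up for workloads that actually delete.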

bq. and IMO the bite from getPosition is worse, since it will destroy 
compaction performance

I'm not totally sure I agree that it's worse. As said above, if people have no 
tombstones, it won't destroy compaction performance. So I guess the question is: 
for people that 1) do not follow the recommendation (because we should definitely 
say when disabling BF is ok or not) and that 2) do have deletes, is it better 
for them to be bitten by a) bad compaction performance or b) their tombstones 
never being purged?

I don't doubt that which of a) or b) is worse is a matter of perspective. That 
being said, my own personal preference goes to avoiding b) because:
* to me b) is a break of correctness, which somewhat trumps performance 
considerations. It's purely subjective though.
* accumulating tombstones forever is a pretty nasty time bomb. Compaction being 
slow because it hits disk more than it should, on the other hand, seems easier 
to me to detect (and thus fix by following the recommendation of not disabling 
BF when you shouldn't).

So, I still have a preference for using Yuki's last patch (and making it clear 
that you shall "only disable BF where you're not doing deletes or your index 
fits entirely in RAM"). If only because that's a bit better than "only disable 
BF where you're not doing deletes". But if you still prefer keeping the status 
quo, I won't oppose it; do feel free to close that issue (we should still write 
the recommendation on when to disable BF somewhere in any case).
                
> Deletable rows are sometimes not removed during compaction
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-5182
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5182
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.0.7
>            Reporter: Binh Van Nguyen
>            Assignee: Yuki Morishita
>             Fix For: 1.2.3
>
>         Attachments: 5182-1.1.txt, 5182-1.2.txt, test_ttl.tar.gz
>
>
> Our use case is write heavy and read seldom.  To optimize the space used, 
> we've set the bloom_filter_fp_ratio=1.0.  That, along with the fact that each 
> row is only written to one time and that there are more than 20 SSTables, 
> keeps the rows from ever being compacted. Here is the code:
> https://github.com/apache/cassandra/blob/cassandra-1.1/src/java/org/apache/cassandra/db/compaction/CompactionController.java#L162
> We hit this corner case, and because of this C* keeps consuming more and more 
> space on disk when it should not.
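
The corner case in the issue description can be illustrated with a small 
sketch. It assumes that a filter configured with a false-positive ratio of 1.0 
reports every key as possibly present, and that the purge check consults only 
the filters of overlapping SSTables. DisabledFilterSketch and ALWAYS_MAYBE are 
hypothetical names used for illustration; this is not the linked 
CompactionController code.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical toy model of a filter-based purge check.
final class DisabledFilterSketch {
    // Assumption: with a false-positive ratio of 1.0 the filter carries no
    // information and answers "maybe present" for every key.
    static final Predicate<String> ALWAYS_MAYBE = key -> true;

    /** Purge only if every overlapping SSTable's filter rules the key out. */
    static boolean shouldPurge(String key, List<Predicate<String>> overlappingFilters) {
        for (Predicate<String> filter : overlappingFilters) {
            if (filter.test(key)) {
                return false; // the key may live elsewhere, so keep the row
            }
        }
        return true;
    }
}
```

Under these assumptions, a single overlapping SSTable with ALWAYS_MAYBE as its 
filter is enough to block the purge for every key, which matches the reported 
symptom of disk usage that keeps growing.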

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
