OOM while performing major compaction

2014-02-27 Thread Nish garg
I am having an OOM during major compaction on one of the column families, where there are a lot of SSTables (33,000) to be compacted. Is there any other way for them to be compacted? Any help would be really appreciated. Here are the details: /opt/cassandra/current/bin/nodetool -h us1emscsm-01 compact

Re: OOM while performing major compaction

2014-02-27 Thread Edward Capriolo
One big downside of major compaction is that (depending on your Cassandra version) the bloom filter's size is pre-calculated. Thus Cassandra needs enough heap for your existing 33k+ SSTables and the new large compacted one. In the past this happened to us when the compaction thread got hung up,
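To see why the heap pressure roughly doubles during the compaction, here is a back-of-envelope sketch using the classic Bloom filter sizing formula. The per-SSTable key count (100k) and false-positive rate (1%) are assumptions for illustration, not figures from the thread, and this is not how Cassandra literally allocates its filters:

```python
import math

def bloom_filter_bytes(num_keys, fp_rate=0.01):
    """Classic Bloom filter size: m = -n * ln(p) / (ln 2)^2 bits."""
    bits = -num_keys * math.log(fp_rate) / (math.log(2) ** 2)
    return int(bits / 8)

# 33,000 SSTables at an assumed 100k keys each, plus one merged
# SSTable holding all keys -- all resident at once mid-compaction.
per_sstable = bloom_filter_bytes(100_000)
merged = bloom_filter_bytes(33_000 * 100_000)
total_mb = (33_000 * per_sstable + merged) / 1024**2
```

With these assumptions the merged filter is about as large as all the existing filters combined, which is why the heap must hold both the old and the new at the same time.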

Re: OOM while performing major compaction

2014-02-27 Thread Robert Coli
On Thu, Feb 27, 2014 at 11:09 AM, Nish garg pipeli...@gmail.com wrote: I am having an OOM during major compaction on one of the column families, where there are a lot of SSTables (33,000) to be compacted. Is there any other way for them to be compacted? Any help would be really appreciated. You can

Re: OOM while performing major compaction

2014-02-27 Thread Nish garg
Thanks for replying. We are on Cassandra 1.2.9. We have a time-series-like data structure where we need to keep only the last 6 hours of data. So we expire data using an expireddatetime column on the column family, and then we run an expire script via cron to create tombstones. We don't use TTL yet and
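Since the data only needs to live 6 hours, per-column TTLs (available in Cassandra 1.2 via CQL3) could replace the cron-driven delete pass entirely: expired columns turn into tombstones on their own, with no explicit deletes issued. A hedged sketch against a hypothetical metrics table (the table and column names are illustrative, not from the thread):

```cql
-- 6 hours = 21600 seconds; the column expires automatically,
-- so no separate expire script has to write deletion tombstones.
INSERT INTO metrics (sensor_id, ts, value)
VALUES ('s1', '2014-02-27 12:00:00', 42.0)
USING TTL 21600;
```

Note that TTL-expired columns still become tombstones internally, so this removes the cron job and the explicit deletes, but not the need for compaction to eventually purge the expired data.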

Re: OOM while performing major compaction

2014-02-27 Thread Tupshin Harper
If you can programmatically roll over onto a new column family every 6 hours (or every day, or some other reasonable increment), and then just drop the old column family after all its columns would have expired, you could skip compaction entirely. It was not clear to me from your
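The rollover scheme above amounts to deriving the column family name from the current time bucket. A minimal sketch, assuming 6-hour buckets and a hypothetical `events_` naming convention (writers target the current bucket, readers query the current and previous buckets, and a background job drops anything older):

```python
from datetime import datetime, timezone

BUCKET_HOURS = 6

def bucket_cf_name(now):
    """Column family name for the 6-hour window containing `now`."""
    bucket = now.hour // BUCKET_HOURS  # 0..3 within the day
    return f"events_{now:%Y%m%d}_{bucket}"

# e.g. writes at 13:05 UTC on 2014-02-27 go to the third bucket of the day
name = bucket_cf_name(datetime(2014, 2, 27, 13, 5, tzinfo=timezone.utc))
```

Dropping a whole column family discards its SSTables without generating tombstones, which is exactly what lets this scheme sidestep the compaction problem.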

Re: OOM while performing major compaction

2014-02-27 Thread Nish garg
Hello Tupshin, yes, all the data needs to be kept for just the last 6 hours. Changing to a new CF every 6 hours does solve the compaction issue, but right after each change we will have less than 6 hours of data. We could use CF1 and CF2 and truncate them one at a time every 6 hours in a loop, but we need some
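The two-CF alternation described here can be sketched as a pure function of wall-clock time (the CF names and 6-hour period are taken from the message; everything else is an assumption). At each 6-hour boundary the newly active CF is truncated before writes resume into it, so the other CF still holds the previous full window and reads spanning both CFs always see at least the last 6 hours:

```python
BUCKET_SECONDS = 6 * 3600

def active_cf(epoch_seconds):
    """Which CF receives writes: alternates CF1/CF2 every 6 hours.
    At each boundary, truncate the CF returned here before writing,
    since it holds data from two windows ago."""
    return "CF1" if (int(epoch_seconds) // BUCKET_SECONDS) % 2 == 0 else "CF2"
```

Readers must query both CFs and merge, which is the application-side complexity (the "hack") discussed in the reply below.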

Re: OOM while performing major compaction

2014-02-27 Thread Tupshin Harper
You are right that modifying your code to access two CFs is a hack and not an ideal solution, but I think it should be pretty easy to implement and would help you get out of this jam pretty quickly. Not saying you should go down that path, but if you lack better options, that would probably be