If you can programmatically roll over onto a new column family every 6
hours (or every day, or some other reasonable increment), and then simply
drop the old column family once all of its columns would have expired, you
could skip compaction entirely. It was not clear to me from your
description whether *all* of the data needs to be retained for only 6
hours. If that is true, rolling over to a new cf will be your simplest
option.
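
For example, a rough CQL sketch of what that rotation could look like
(keyspace, table, and column names here are invented for illustration,
not taken from your schema):

    USE metrics;

    -- bucket that receives the next 6 hours of writes
    CREATE TABLE events_2014022800 (
        key text,
        ts timestamp,
        value blob,
        PRIMARY KEY (key, ts)
    );

    -- once everything in the old bucket is past its 6-hour window,
    -- drop it instead of waiting for tombstones to be compacted away
    DROP TABLE events_2014022718;

Your writers would switch to the newest bucket on each rollover, and reads
would cover whichever buckets span the last 6 hours.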

-Tupshin


On Thu, Feb 27, 2014 at 5:31 PM, Nish garg <pipeli...@gmail.com> wrote:

> Thanks for replying.
>
> We are on Cassandra 1.2.9.
>
> We have a time-series-like data structure where we need to keep only the
> last 6 hours of data. So we expire data using an expireddatetime column on
> the column family and then run an expire script via cron to create
> tombstones. We don't use TTL yet and are planning to use it in a future
> release. Hopefully that will fix some of the issues caused by the expire
> script, since it needs to read the data first before creating tombstones.
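>
> The TTL version should just be a write-time option, roughly like this
> (a sketch only; the table and column names are made up):
>
>     INSERT INTO events (key, ts, value)
>     VALUES ('sensor-42', '2014-02-27 17:00:00', 0x00)
>     USING TTL 21600;  -- 6 hours = 21600 seconds; no separate expire pass needed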
>
> So to answer your question, almost 80% of the data in those SSTables is
> tombstones. (There is no easy way to confirm this short of converting all
> 33,000 SSTables to JSON and querying them for tombstones.)
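>
> (To spot-check a single file I could dump it with sstable2json, which
> ships with 1.2, and look for deletion markers in the output; the path
> below is just an example. Doing that across 33,000 files is not
> practical, though.)
>
>     sstable2json /var/lib/cassandra/data/MyKeyspace/MyCF/MyKeyspace-MyCF-ic-12345-Data.db > /tmp/cf-12345.json
>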
> The reason there are 33,000 of them may be that the machine load was too
> high for minor compaction and it was falling behind, or that something
> happened to the minor compaction thread on this node. The other two nodes
> in this cluster are fine. Yes, we are using the size-tiered compaction
> strategy.
>
> I am inclined towards a 'decommission and bootstrap' of this node, as it
> seems that performing a major compaction on it is impossible.
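>
> If we go that route, the rough shape would be something like the following
> (sketch only; exact steps depend on our config and data directories):
>
>     nodetool decommission    # stream this node's ranges to the remaining replicas
>     # stop cassandra, wipe the data / commitlog / saved_caches directories,
>     # then start the node again and let it bootstrap back into the ring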
>
> However, I am still looking for other solutions...
>
>
>
>
> On Thu, Feb 27, 2014 at 4:03 PM, Robert Coli <rc...@eventbrite.com> wrote:
>
>> On Thu, Feb 27, 2014 at 11:09 AM, Nish garg <pipeli...@gmail.com> wrote:
>>
>>> I am having OOM during major compaction on one of the column family
>>> where there are lot of SStables (33000) to be compacted. Is there any other
>>> way for them to be compacted? Any help will be really appreciated.
>>>
>>
>> You can use user-defined compaction to reduce the working set, but only a
>> major compaction is capable of purging 100% of tombstones.
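>>
>> For reference, user-defined compaction is exposed over JMX on the
>> CompactionManager MBean; from memory it looks something like this with
>> jmxterm (double-check the exact signature on 1.2, and the keyspace/file
>> names below are placeholders):
>>
>>     run -b org.apache.cassandra.db:type=CompactionManager forceUserDefinedCompaction MyKS MyKS-MyCF-ic-12345-Data.db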
>>
>> How much garbage is actually in the files? Why do you have 33,000 of
>> them? You mention a major compaction, so you are likely not using LCS with
>> the bad 5MB default... how did you end up with so many SSTables?
>>
>> Have you removed the throttle from compaction, generally?
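>>
>> For example:
>>
>>     nodetool setcompactionthroughput 0    # 0 disables compaction throughput throttling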
>>
>> What version of Cassandra?
>>
>> =Rob
>>
>>
>
