I have run into the same problem (hundreds of thousands of SSTables)
with Cassandra 2.1.2. It seems to appear when an incremental repair runs
while there is a medium-to-high insert load on the cluster. The repair goes
into a bad state and starts creating far more SSTables than it should (even
when there should be nothing to repair).

On 10 February 2015 at 15:46, Eric Stevens <migh...@gmail.com> wrote:

> This kind of recovery is definitely not my strong point, so feedback on
> this approach would certainly be welcome.
>
> As I understand it, if you really want to keep that data, you ought to be
> able to mv it out of the way to get your node online, then move those files
> back in a few thousand at a time, run nodetool refresh OpsCenter rollups60 &&
> nodetool compact OpsCenter rollups60, and rinse and repeat.  This should let
> you incrementally restore the data in that keyspace without putting so many
> sstables in there that it OOMs your cluster again.
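>
> A rough sketch of that loop (the backup path, data directory, and batch
> size below are assumptions, not anything from this thread):
>
>     # run on the affected node, after the sstables have been mv'd to $BACKUP
>     BACKUP=~/rollups60-backup
>     DATA_DIR=/var/lib/cassandra/data/OpsCenter/rollups60-*   # 2.1 dirs carry a cf-id suffix
>     cd "$BACKUP"
>     while ls | grep -q -- '-Data\.db$'; do
>         # move whole sstables back (all components of a generation together)
>         for p in $(ls | grep -- '-Data\.db$' | head -n 5000 | sed 's/-Data\.db$//'); do
>             mv "$p"-* $DATA_DIR/
>         done
>         nodetool refresh OpsCenter rollups60
>         nodetool compact OpsCenter rollups60
>     done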
>
> On Tue, Feb 10, 2015 at 3:38 PM, Chris Lohfink <clohfin...@gmail.com>
> wrote:
>
>> yeah... probably just 2.1.2 things and not compactions.  Still, you probably
>> want to do something about the 1.6 million files.  It may be worth
>> just mv/rm'ing the 60 sec rollup data, unless you're really attached to it.
>>
>> Chris
>>
>> On Tue, Feb 10, 2015 at 4:04 PM, Paul Nickerson <pgn...@gmail.com> wrote:
>>
>>> I was having trouble with snapshots failing while trying to repair that
>>> table (
>>> http://www.mail-archive.com/user@cassandra.apache.org/msg40686.html). I
>>> have a repair running on it now, and it seems to be going well this
>>> time. I am going to wait for that to finish, then try a manual nodetool
>>> compact. If that succeeds, would it be safe to chalk the past lack of
>>> compaction on this table up to 2.1.2 problems?
>>>
>>>
>>>  ~ Paul Nickerson
>>>
>>> On Tue, Feb 10, 2015 at 3:34 PM, Chris Lohfink <clohfin...@gmail.com>
>>> wrote:
>>>
>>>> Your cluster is probably having issues with compactions (with STCS you
>>>> should never have this many sstables).  I would probably punt on
>>>> OpsCenter/rollups60: turn the node off, move all of the sstables off to
>>>> a different directory for backup (or just rm them if you really don't care
>>>> about 1-minute metrics), then turn the server back on.
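>>>>
>>>> A minimal sketch of those steps (the service name and paths are
>>>> assumptions; adjust to your install):
>>>>
>>>>     sudo service cassandra stop
>>>>     mkdir -p ~/rollups60-backup
>>>>     # use find rather than a shell glob: ~1.6M files would overflow the arg list
>>>>     find /var/lib/cassandra/data/OpsCenter/rollups60-*/ -maxdepth 1 -type f \
>>>>          -exec mv -t ~/rollups60-backup/ {} +
>>>>     sudo service cassandra start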
>>>>
>>>> Once you get your cluster running again, go back and investigate why
>>>> compactions stopped. My guess is you hit an exception at some point that
>>>> killed your CompactionExecutor, and things just built up slowly until you
>>>> got to this point.
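>>>>
>>>> A few places to look once it's back up (the log path is the Debian/Ubuntu
>>>> default; adjust if yours differs):
>>>>
>>>>     nodetool compactionstats    # pending compactions and what's actively compacting
>>>>     nodetool tpstats            # blocked/dropped thread pool stages
>>>>     # look for the stack trace that killed the compaction threads
>>>>     grep 'ERROR.*CompactionExecutor' /var/log/cassandra/system.log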
>>>>
>>>> Chris
>>>>
>>>> On Tue, Feb 10, 2015 at 2:15 PM, Paul Nickerson <pgn...@gmail.com>
>>>> wrote:
>>>>
>>>>> Thank you, Rob. I tried a 12 GiB heap size, and it still crashed.
>>>>> There are 1,617,289 files under OpsCenter/rollups60.
>>>>>
>>>>> Once I downgraded Cassandra to 2.1.1 (apt-get install
>>>>> cassandra=2.1.1), I was able to start Cassandra up fine with the
>>>>> default heap size formula.
>>>>>
>>>>> Now my cluster is running multiple versions of Cassandra. I think I
>>>>> will downgrade the rest to 2.1.1.
>>>>>
>>>>>  ~ Paul Nickerson
>>>>>
>>>>> On Tue, Feb 10, 2015 at 2:05 PM, Robert Coli <rc...@eventbrite.com>
>>>>> wrote:
>>>>>
>>>>>> On Tue, Feb 10, 2015 at 11:02 AM, Paul Nickerson <pgn...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> I am getting an out of memory error when I try to start Cassandra on
>>>>>>> one of my nodes. Cassandra will run for a minute, and then exit without
>>>>>>> outputting any error in the log file. It happens while SSTableReader
>>>>>>> is opening a couple hundred thousand SSTables.
>>>>>>>
>>>>>> ...
>>>>>>
>>>>>>> Does anyone know how I might get Cassandra on this node running
>>>>>>> again? I'm not very familiar with correctly tuning Java memory 
>>>>>>> parameters,
>>>>>>> and I'm not sure if that's the right solution in this case anyway.
>>>>>>>
>>>>>>
>>>>>> Try running 2.1.1, and/or increasing heap size beyond 8gb.
>>>>>>
>>>>>> Are there actually that many SSTables on disk?
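>>>>>>
>>>>>> If you go the heap route, the usual knobs are in cassandra-env.sh (the
>>>>>> values below are only an example, not a recommendation):
>>>>>>
>>>>>>     # conf/cassandra-env.sh: override the auto-calculated defaults
>>>>>>     MAX_HEAP_SIZE="12G"
>>>>>>     HEAP_NEWSIZE="800M"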
>>>>>>
>>>>>> =Rob
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
