Re: Nodes stopping

Alain RODRIGUEZ Thu, 11 May 2017 15:09:13 -0700

>
> For some context, I'm trying to get regular repairs going but am having
> issues with it.



You're not the only one; repairs are a real concern for many people.

For what it is worth, my team is actively working on this project initiated
at Spotify: https://github.com/thelastpickle/cassandra-reaper.

C*heers,
-----------------------
Alain Rodriguez - @arodream - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2017-05-11 23:04 GMT+01:00 Alain RODRIGUEZ <arodr...@gmail.com>:

> Hi Daniel,
>
> Could you paste the exact GC options in use?
>
> Also, 30 GB is not much. For what it is worth, I would not give the JVM
> more than 8 GB, and would probably use CMS in those conditions. The thing
> is, if memtables, bloom filters, caches, indexes, etc. are off heap, then
> you probably ran out of native memory. In any case, it is good to leave
> some space for the page cache.
>
> As a reminder, you can try new GC options on a canary node and see how it
> goes.
>
> C*heers,
> -----------------------
> Alain Rodriguez - @arodream - al...@thelastpickle.com
> France
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> 2017-05-11 22:29 GMT+01:00 Daniel Steuernol <dan...@sendwithus.com>:
>
>> Thank you. According to dmesg, it's an out-of-memory crash. I have the
>> heap size set to 15G in the jvm.options for Cassandra, and there is 30G on
>> the machine.
>>
>>
>>
>> On May 11 2017, at 2:22 pm, Cogumelos Maravilha <
>> cogumelosmaravi...@sapo.pt> wrote:
>>
>>> Have a look at dmesg. It has already happened to me with type i
>>> instances at AWS.
>>>
>>> On 11-05-2017 22:17, Daniel Steuernol wrote:
>>>
>>> I had 2 nodes go down today; here are the ERRORs from the system log on
>>> both nodes:
>>> https://gist.github.com/dlsteuer/28c610bc733a2bff22c8d3953ef8c218
>>> For some context, I'm trying to get regular repairs going but am having
>>> issues with it.
>>>
>>>
>>> On May 11 2017, at 2:10 pm, Cogumelos Maravilha
>>> <cogumelosmaravi...@sapo.pt> wrote:
>>>
>>> Can you run grep ERROR system.log?
>>>
>>> On 11-05-2017 21:52, Daniel Steuernol wrote:
>>>
>>> There is nothing in the system log about it being drained or shut down,
>>> and I'm not sure how else it would be pre-empted. No one else on the team
>>> is on the servers and I haven't been shutting them down. There is also no
>>> Java memory dump on the server. It appears that the process just died.
>>>
>>>
>>> On May 11 2017, at 1:36 pm, Varun Gupta <var...@uber.com> wrote:
>>>
>>>
>>> What do you mean by "no obvious error in the logs"? Do you see that the
>>> node was drained or shut down? Are you sure no other process is calling
>>> nodetool drain or shutdown, or pre-empting the Cassandra process?
>>>
>>> On Thu, May 11, 2017 at 1:30 PM, Daniel Steuernol
>>> <dan...@sendwithus.com> wrote:
>>>
>>>
>>> I have a 6-node Cassandra cluster running, and frequently a node will go
>>> down with no obvious error in the logs. This is starting to happen quite
>>> often, almost daily now. Any suggestions on how to track down what is
>>> causing the node to stop?
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org For
>>> additional commands, e-mail: user-h...@cassandra.apache.org
>>>
>>>
>>
>
>
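
To make the memory arithmetic from the thread concrete, here is a minimal
Python sketch (not from the original posters). It only subtracts the JVM heap
from total RAM to show what is left for off-heap structures (memtables, bloom
filters, caches, indexes), the OS, and the page cache. The function name
memory_left_for_native_gb is hypothetical, and the figures are the ones
mentioned in the thread: a 30 GB machine, a 15 GB heap, and the suggested
8 GB heap.

```python
def memory_left_for_native_gb(total_ram_gb: float, heap_gb: float) -> float:
    """Return the RAM (in GB) that remains outside the JVM heap.

    Illustrative helper with a hypothetical name: whatever is not heap has
    to cover off-heap structures, the OS, and the page cache.
    """
    if total_ram_gb < 0 or heap_gb < 0:
        raise ValueError("sizes must be non-negative")
    if heap_gb > total_ram_gb:
        raise ValueError("heap cannot exceed total RAM")
    return total_ram_gb - heap_gb


if __name__ == "__main__":
    total = 30  # GB on the machine, per the thread
    for heap in (15, 8):  # current heap, then the suggested upper bound
        left = memory_left_for_native_gb(total, heap)
        print(f"heap={heap} GB -> {left} GB left for off-heap + page cache")
```

With a 15 GB heap, only half of the machine remains for everything else; an
8 GB heap leaves 22 GB. This does not model actual off-heap usage, which
depends on the workload.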
