> For some context, I'm trying to get regular repairs going but am having
> issues with it.
You're not the only one; repairs are a real concern for many people. For what it is worth, my team is actively working on this project initiated at Spotify: https://github.com/thelastpickle/cassandra-reaper.

C*heers,
-----------------------
Alain Rodriguez - @arodream - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2017-05-11 23:04 GMT+01:00 Alain RODRIGUEZ <arodr...@gmail.com>:

> Hi Daniel,
>
> Could you paste the exact GC options in use?
>
> Also, 30 GB is not much. For what it is worth, I would not use more than
> 8 GB for the JVM, and probably CMS, in those conditions. The thing is, if
> memtables, bloom filters, caches, indexes, etc. are off heap, then you
> probably ran out of native memory. In any case it is good to leave some
> space for the page cache.
>
> As a reminder, you can try new GC options on a canary node and see how it goes.
>
> C*heers,
> -----------------------
> Alain Rodriguez - @arodream - al...@thelastpickle.com
> France
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> 2017-05-11 22:29 GMT+01:00 Daniel Steuernol <dan...@sendwithus.com>:
>
>> Thank you, it's an out-of-memory crash according to dmesg. I have the
>> heap size set to 15G in the jvm.options for cassandra, and there is 30G on
>> the machine.
>>
>> On May 11 2017, at 2:22 pm, Cogumelos Maravilha
>> <cogumelosmaravi...@sapo.pt> wrote:
>>
>>> Have a look at dmesg. It has already happened to me regarding type i
>>> instances at AWS.
>>>
>>> On 11-05-2017 22:17, Daniel Steuernol wrote:
>>>
>>> I had 2 nodes go down today; here are the ERRORs from the system log on
>>> both nodes:
>>> https://gist.github.com/dlsteuer/28c610bc733a2bff22c8d3953ef8c218
>>> For some context, I'm trying to get regular repairs going but am having
>>> issues with it.
>>>
>>> On May 11 2017, at 2:10 pm, Cogumelos Maravilha
>>> <cogumelosmaravi...@sapo.pt> wrote:
>>>
>>> Can you grep ERROR system.log
>>>
>>> On 11-05-2017 21:52, Daniel Steuernol wrote:
>>>
>>> There is nothing in the system log about it being drained or shut down,
>>> and I'm not sure how else it would be pre-empted. No one else on the team
>>> is on the servers and I haven't been shutting them down. There is also no
>>> java memory dump on the server. It appears that the process just died.
>>>
>>> On May 11 2017, at 1:36 pm, Varun Gupta <var...@uber.com> wrote:
>>>
>>> What do you mean by "no obvious error in the logs"? Do you see that the
>>> node was drained or shut down? Are you sure no other process is calling
>>> nodetool drain or shutdown, or pre-empting the cassandra process?
>>>
>>> On Thu, May 11, 2017 at 1:30 PM, Daniel Steuernol
>>> <dan...@sendwithus.com> wrote:
>>>
>>> I have a 6 node cassandra cluster running, and frequently a node will go
>>> down with no obvious error in the logs. This is starting to happen quite
>>> often, almost daily now. Any suggestions on how to track down what is
>>> causing the node to stop?
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>>> For additional commands, e-mail: user-h...@cassandra.apache.org
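
For anyone finding this thread later, here is a rough sketch of the two checks discussed above: looking for an OOM-killer entry in dmesg output, and working out how much RAM is left outside the JVM heap. The function names and the sample dmesg line are made up for illustration (they are not taken from Daniel's nodes), and the exact wording of kernel OOM messages varies between kernel versions, so treat the pattern as an assumption to adjust.

```python
import re

# Assumption: the kernel logs a line containing "Killed process <pid> (<name>)"
# when the OOM killer fires. Wording differs between kernel versions.
OOM_RE = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(dmesg_text):
    """Return (pid, process_name) pairs for OOM-killer lines in dmesg output."""
    kills = []
    for line in dmesg_text.splitlines():
        match = OOM_RE.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills


def non_heap_headroom_gb(total_ram_gb, heap_gb):
    """RAM left for off-heap structures and the page cache, in GB."""
    return total_ram_gb - heap_gb


# Illustrative sample only, not real output from the cluster in this thread.
SAMPLE_DMESG = (
    "[100.000000] eth0: link up\n"
    "[200.000000] Killed process 4242 (java) total-vm:41943040kB\n"
)

print(find_oom_kills(SAMPLE_DMESG))
# Figures from the thread: 30G machine, 15G heap now, 8G heap suggested.
print(non_heap_headroom_gb(30, 15))
print(non_heap_headroom_gb(30, 8))
```

In practice the text would come from running dmesg on the node that died. With the numbers from this thread, dropping the heap from 15G to 8G leaves 22G instead of 15G for off-heap memory and the page cache.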