I neglected to mention that I also adjust Cassandra's OOM score, which tells 
the kernel to kill something other than Cassandra first. (For example, if one 
of your devs runs a script that uses a lot of memory, the kernel kills that 
script instead.)

http://lwn.net/Articles/317814/
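
For illustration, here is a minimal sketch of how such an adjustment can be made, assuming a kernel that exposes /proc/<pid>/oom_score_adj (the log quoted below reports oom_score_adj, so that appears to be the case here). The helper name set_oom_score_adj and the PROC_ROOT override are hypothetical, and the value -500 is only an example, not a recommendation. Lowering the score requires root.

```shell
#!/bin/sh
# Sketch only: lower a process's OOM score adjustment so the kernel's
# OOM killer prefers other victims when memory runs out.
# Assumption: the kernel provides /proc/<pid>/oom_score_adj
# (range -1000..1000; older kernels offer oom_adj, range -17..15, instead).

# Hypothetical override so the helper can be dry-run against a fake tree.
PROC_ROOT="${PROC_ROOT:-/proc}"

# Hypothetical helper: set_oom_score_adj <pid> <adjustment>
set_oom_score_adj() {
    pid="$1"
    adj="$2"   # -1000 exempts the process entirely, 1000 makes it the first victim
    file="$PROC_ROOT/$pid/oom_score_adj"

    # Fail early if the process is gone or we lack permission.
    if [ ! -w "$file" ]; then
        echo "cannot write $file (not root, or no such pid)" >&2
        return 1
    fi

    printf '%s\n' "$adj" > "$file"
}

# Example usage (as root). Assumption: the Cassandra JVM can be found by
# matching its main class name on the command line.
#   set_oom_score_adj "$(pgrep -f CassandraDaemon | head -n 1)" -500
```

The setting applies to the running process only, so it has to be reapplied after each restart, for example from the init script. Note that this only changes which process is chosen as the victim; it does not stop memory from running out.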

> On 19 Feb 2015, at 5:28 am, Michał Łowicki <mlowi...@gmail.com> wrote:
> 
> Hi,
> 
> A couple of times a day, 2 of the 4 nodes in the cluster are killed:
> 
> root@db4:~# dmesg | grep -i oom
> [4811135.792657] [ pid ]   uid  tgid total_vm      rss cpu oom_adj oom_score_adj name
> [6559049.307293] java invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
> 
> The nodes use an 8GB heap (confirmed with *nodetool info*) and don't use the 
> row cache. 
> 
> I noticed that a couple of times a day used RSS grows very quickly, within a 
> couple of minutes, and I see CPU spikes at the same time: 
> https://www.dropbox.com/s/khco2kdp4qdzjit/Screenshot%202015-02-18%2015.10.54.png?dl=0
> 
> It could be related to compaction, but used RSS doesn't shrink after 
> compaction finishes. Output from pmap, taken when the C* process was using 
> 50GB of RAM (out of 64GB), is available at 
> http://paste.ofcode.org/ZjLUA2dYVuKvJHAk9T3Hjb. At the time of the dump, heap 
> usage was far below 8GB (~3GB) but total RSS was ~50GB.
> 
> Any help will be appreciated.
> 
> -- 
> BR,
> Michał Łowicki
