Vincent, only the 2.68GB partition is out of bounds here; all the others (<256MB) shouldn't be much of a problem. It could put pressure on your heap if it is often read and/or compacted. But to answer your question about the 1% harming the cluster: a few big partitions can definitely be a big problem, depending on your access patterns. Which compaction strategy are you using on this table?
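As a quick sketch of the partition-size check being discussed: you can scan `nodetool cfstats` output for the "Compacted partition maximum bytes" line and flag anything oversized. The sample line below is fabricated for illustration; on a live node you would pipe the real command's output instead.

```shell
# Fabricated sample of one `nodetool cfstats` line; on a real node you would
# run something like: nodetool cfstats <keyspace> | grep 'Compacted partition maximum'
sample='Compacted partition maximum bytes: 2874382626'

# Extract the byte count (5th whitespace-separated field) and flag anything
# over 100 MB (104857600 bytes) as worth a closer look.
max_bytes=$(echo "$sample" | awk '{print $5}')
if [ "$max_bytes" -gt 104857600 ]; then
  echo "partition over 100MB: ${max_bytes} bytes"
fi
```

The 100 MB threshold is an arbitrary illustration, not an official limit; pick whatever cutoff matters for your heap size.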
Could you provide/check the following on a node that crashed recently:

- Hardware specifications (how many cores? how much RAM? bare metal or VMs?)
- Java version
- GC pauses throughout a day (grep GCInspector /var/log/cassandra/system.log): check whether you have many pauses that take more than 1 second
- GC logs at the time of a crash (if you don't produce any, you should enable them in cassandra-env.sh)
- Tombstone warnings in the logs and high tombstone read counts in cfstats
- Make sure swap is disabled

Cheers,

On Mon, Nov 21, 2016 at 2:57 PM Vincent Rischmann <m...@vrischmann.me> wrote:

@Vladimir
We tried with 12GB and 16GB; the problem eventually appeared there too. In this particular cluster we have 143 tables across 2 keyspaces.

@Alexander
We have one table with a max partition of 2.68GB, one of 256MB, a bunch varying between ~10MB and 100MB, and the rest with a max lower than 10MB. On the biggest, the 99th percentile is around 60MB, the 98th around 25MB, and the 95th around 5.5MB. On the one with a max of 256MB, the 99th percentile is around 4.6MB and the 98th around 2MB.

Could the 1% here really have that much impact? We do write a lot to the biggest table and read from it quite often too, but I have no way to know whether that big partition is ever read.

On Mon, Nov 21, 2016, at 01:09 PM, Alexander Dejanovski wrote:

Hi Vincent,

one of the usual causes of OOMs is very large partitions. Could you check your nodetool cfstats output in search of large partitions? If you find one (or more), run nodetool cfhistograms on those tables to get a view of the partition size distribution.

Thanks

On Mon, Nov 21, 2016 at 12:01 PM Vladimir Yudovin <vla...@winguzone.com> wrote:

Did you try any value in the range 8-20 GB (e.g. 60-70% of physical memory)? Also, how many tables do you have across all keyspaces? Each table can consume a minimum of 1MB of Java heap.
Best regards,

Vladimir Yudovin,
*Winguzone <https://winguzone.com?from=list> - Hosted Cloud Cassandra. Launch your cluster in minutes.*

---- On Mon, 21 Nov 2016 05:13:12 -0500 *Vincent Rischmann <m...@vrischmann.me>* wrote ----

Hello,

we have an 8-node Cassandra 2.1.15 cluster at work which has been giving us a lot of trouble lately.

The problem is simple: nodes regularly die, either because of an out-of-memory exception or because the Linux OOM killer decides to kill the process. For a couple of weeks now we have run with the heap increased to 20GB, hoping it would solve the out-of-memory errors, but it didn't; instead of out-of-memory exceptions, the OOM killer killed the JVM. We reduced the heap on some nodes to 8GB to see if it would work better, but some nodes crashed again with out-of-memory exceptions.

I suspect some of our tables are badly modelled, which would cause Cassandra to allocate a lot of data, but I don't know how to prove that and/or find which table is bad and which query is responsible. I tried looking at metrics in JMX, and tried profiling with Mission Control, but it didn't really help; it's possible I missed something because I have no idea what to look for exactly.

Does anyone have advice for troubleshooting this?

Thanks.

--
-----------------
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
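The log checks suggested earlier in the thread (grepping GCInspector pauses over 1 second, and confirming whether the OOM killer hit the JVM) can be sketched as below. Both log lines are fabricated samples for illustration; on a real node you would read /var/log/cassandra/system.log and the kernel log (dmesg) instead.

```shell
# Fabricated GCInspector line (assumed format); on a node you would run:
#   grep GCInspector /var/log/cassandra/system.log
gc_line='INFO  [Service Thread] GCInspector.java:258 - ConcurrentMarkSweep GC in 2347ms.'

# Pull out the pause duration and flag pauses longer than 1 second.
pause_ms=$(echo "$gc_line" | grep -o '[0-9]\+ms' | head -n1 | tr -d 'ms')
if [ "$pause_ms" -gt 1000 ]; then
  echo "long GC pause: ${pause_ms} ms"
fi

# Fabricated kernel log line; on a node you would run: dmesg | grep -i 'out of memory'
oom_line='Out of memory: Kill process 12345 (java) score 912 or sacrifice child'
echo "$oom_line" | grep -q 'Kill process .*(java)' && echo "OOM killer terminated the JVM"
```

If the second check matches, the kernel (not the JVM) killed Cassandra, which points at total process memory (heap plus off-heap) exceeding what the box can give, rather than a Java-level OutOfMemoryError.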