Re: Out of memory and/or OOM kill on a cluster

2016-11-22 Thread Vincent Rischmann
Thanks for the detailed answer, Alexander.



We'll look into your suggestions; they're definitely helpful. We have plans
to reduce tombstones and remove the table with the big partitions;
hopefully the cluster will be stable again once we've done that.


Thanks again.






Re: Out of memory and/or OOM kill on a cluster

2016-11-22 Thread Alexander Dejanovski
Hi Vincent,

Here are a few pointers for disabling swap:
- https://docs.datastax.com/en/cassandra/2.0/cassandra/install/installRecommendSettings.html
- http://stackoverflow.com/questions/22988824/why-swap-needs-to-be-turned-off-in-datastax-cassandra
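
In practice that usually boils down to something like the following (a rough
sketch assuming a typical Linux install; the sysctl file name is arbitrary):

    # Turn swap off right away (lasts until the next reboot)
    sudo swapoff -a

    # Comment out swap entries in /etc/fstab so it stays off after a reboot
    sudo sed -i '/\sswap\s/ s/^/#/' /etc/fstab

    # If you keep swap around anyway, at least tell the kernel to avoid it
    echo "vm.swappiness = 1" | sudo tee /etc/sysctl.d/60-cassandra.conf
    sudo sysctl -p /etc/sysctl.d/60-cassandra.conf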

Tombstones are definitely the kind of object that can clutter your heap,
lead to frequent GC pauses, and could be part of why you run into OOMs from
time to time. I cannot say for sure though, as it is actually a bit more
complex than that.
You do not have crazy high GC pauses, although a 5s pause should not happen
on a healthy cluster.

Getting back to big partitions, I've had a case in production where a
multi-GB partition was filling a 26GB G1 heap while being compacted.
Eventually the old gen took all the available space in the heap, leaving
no room for the young gen, but it actually never OOMed. To be honest, I
would have preferred an OOM to the inefficient 50s GC pauses we had,
because such a slow node can (and did) affect the whole cluster.

I think you may have a combination of things happening here and you should
work on improving them all (a couple of the steps are sketched as commands
after this list):
- spot precisely which are your big partitions to understand why you have
them (data modeling issue or misbehaving data source): look for "large
partition" warnings in the Cassandra logs, which will give you the
partition key
- try to reduce the number of tombstones you're reading by changing your
queries or data model, or maybe by setting up an aggressive tombstone
pruning strategy:
http://cassandra.apache.org/doc/latest/operating/compaction.html?highlight=unchecked_tombstone_compaction#common-options
You could benefit from setting unchecked_tombstone_compaction to true and
tuning both tombstone_threshold and tombstone_compaction_interval
- follow the recommended production settings and fully disable swap on your
Cassandra nodes
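
To make the first two items concrete, here is a rough sketch. The keyspace
and table names (foo.install_info) are taken from the log excerpts later in
the thread, and the threshold values are illustrative assumptions only, not
recommendations:

    # Find the partition keys behind the "large partition" warnings
    grep -i "large partition" /var/log/cassandra/system.log

    # Aggressive tombstone pruning on the affected table (example values)
    cqlsh -e "ALTER TABLE foo.install_info
      WITH compaction = {
        'class': 'SizeTieredCompactionStrategy',
        'unchecked_tombstone_compaction': 'true',
        'tombstone_threshold': '0.1',
        'tombstone_compaction_interval': '3600'
      };"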

You might want to scale down from the 20GB heap, as the OOM killer will stop
your process either way, and a smaller heap might allow you to get an
analyzable heap dump. Such a heap dump could tell us whether there are lots
of tombstones on the heap when the JVM dies.
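
If no dump is being produced when the JVM OOMs, it may be worth
double-checking that the usual HotSpot flags are in place (a sketch;
cassandra-env.sh normally handles this already, and the dump path here is
just an example):

    # In cassandra-env.sh, make the JVM dump the heap on OutOfMemoryError
    JVM_OPTS="$JVM_OPTS -XX:+HeapDumpOnOutOfMemoryError"
    JVM_OPTS="$JVM_OPTS -XX:HeapDumpPath=/var/lib/cassandra/java_heapdump.hprof"

The resulting .hprof file can then be opened in a heap analyzer such as
Eclipse MAT.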

I hope that's helpful; there is no easy answer here, and the problem should
be narrowed down by fixing all the potential causes.

Cheers,





Re: Out of memory and/or OOM kill on a cluster

2016-11-21 Thread Vincent Rischmann
Thanks for your answer, Alexander.



We're writing constantly to the table; we estimate it's something like
1.5k to 2k writes per second. Some of these requests update a bunch of
fields, some update fields and append something to a set.
We don't read constantly from it, but when we do it's a lot of reads, up
to 20k reads per second sometimes.
For this particular keyspace everything is using the size-tiered
compaction strategy.


 - Every node is a physical server with an 8-core CPU, 32GB of RAM and
   3TB of SSD.
 - Java version is 1.8.0_101 for all nodes except one which is using
   1.8.0_111 (only for about a week I think, before that it used
   1.8.0_101 too).
 - We're using the G1 GC. I looked at the 19th and on that day we had:
   - 1505 GCs
   - 2 Old Gen GCs which took around 5s each
   - the rest were New Gen GCs, with only one other above 1s. There were
     15 to 20 GCs which took between 0.4 and 0.7s; the rest were between
     roughly 250ms and 400ms.
     Sometimes there are 3 to 5 GCs in a row within about 2 seconds, each
     taking between 250ms and 400ms, but that's fairly rare from what I
     can see.
 - Regarding GC logs, I have them enabled and I've got a bunch of
   gc.log.X files in /var/log/cassandra, but somehow I can't find any
   log files for certain periods. On one node which crashed this morning
   I lost about a week of GC logs, no idea what is happening there...
 - I'll just put a couple of warnings here; there are around 9k just
   for today.


WARN  [SharedPool-Worker-8] 2016-11-21 17:02:00,497
SliceQueryFilter.java:320 - Read 2001 live and 11129 tombstone cells in
foo.install_info for key: foo@IOS:7 (see tombstone_warn_threshold). 2000
columns were requested, slices=[-]
WARN  [SharedPool-Worker-1] 2016-11-21 17:02:02,559
SliceQueryFilter.java:320 - Read 2001 live and 11064 tombstone cells in
foo.install_info for key: foo@IOS:7 (see tombstone_warn_threshold). 2000
columns were requested, slices=[di[42FB29E1-8C99-45BE-8A44-9480C50C6BC4]:!-]
WARN  [SharedPool-Worker-2] 2016-11-21 17:02:05,286
SliceQueryFilter.java:320 - Read 2001 live and 11064 tombstone cells in
foo.install_info for key: foo@IOS:7 (see tombstone_warn_threshold). 2000
columns were requested, slices=[di[42FB29E1-8C99-45BE-8A44-9480C50C6BC4]:!-]
WARN  [SharedPool-Worker-11] 2016-11-21 17:02:08,860
SliceQueryFilter.java:320 - Read 2001 live and 19966 tombstone cells in
foo.install_info for key: foo@IOS:10 (see tombstone_warn_threshold).
2000 columns were requested, slices=[-]


So we're guessing this is bad since it's warning us, but does this have a
significant impact on the heap / GC? I don't really know.


- cfstats tells me this:

Average live cells per slice (last five minutes): 1458.029594846951
Maximum live cells per slice (last five minutes): 2001.0
Average tombstones per slice (last five minutes): 1108.2466913854232
Maximum tombstones per slice (last five minutes): 22602.0



- regarding swap, it's not disabled anywhere; I must say we never really
  thought about it. Does disabling it provide a significant benefit?


Thanks for your help, really appreciated!




Re: Out of memory and/or OOM kill on a cluster

2016-11-21 Thread Alexander Dejanovski
Vincent,

Only the 2.68GB partition is out of bounds here; all the others (<256MB)
shouldn't be much of a problem.
It could put pressure on your heap if it is often read and/or compacted.
But to answer your question about the 1% harming the cluster: a few big
partitions can definitely be a big problem depending on your access
patterns.
Which compaction strategy are you using on this table?

Could you provide/check the following things on a node that crashed
recently (a couple of the checks are sketched as commands below):

   - Hardware specifications (how many cores? how much RAM? Bare metal or
   VMs?)
   - Java version
   - GC pauses throughout a day (grep GCInspector
   /var/log/cassandra/system.log): check if you have many pauses that take
   more than 1 second
   - GC logs at the time of a crash (if you don't produce any, you should
   activate them in cassandra-env.sh)
   - Tombstone warnings in the logs and a high number of tombstones read in
   cfstats
   - Make sure swap is disabled
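
For the GC pause and swap checks, something along these lines should do (a
sketch; the log path assumes a default packaged install):

    # GC pauses reported by Cassandra throughout the day
    grep GCInspector /var/log/cassandra/system.log | tail -n 50

    # Is any swap configured or in use on this node?
    swapon -s
    free -m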


Cheers,



-- 
-
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com


Re: Out of memory and/or OOM kill on a cluster

2016-11-21 Thread Vincent Rischmann
@Vladimir



We tried with 12GB and 16GB; the problem eventually appeared too.

In this particular cluster we have 143 tables across 2 keyspaces.



@Alexander



We have one table with a max partition of 2.68GB, one of 256MB, and a bunch
with sizes varying between roughly 10MB and 100MB. Then there's the rest,
with a max lower than 10MB.

On the biggest, the 99th percentile is around 60MB, the 98th around 25MB,
and the 95th around 5.5MB.
On the one with a max of 256MB, the 99th percentile is around 4.6MB and
the 98th around 2MB.

Could the 1% here really have that much impact? We do write a lot to
the biggest table and read quite often too; however, I have no way to
know if that big partition is ever read.






Re: Out of memory and/or OOM kill on a cluster

2016-11-21 Thread Alexander Dejanovski
Hi Vincent,

One of the usual causes of OOMs is very large partitions.
Could you check your nodetool cfstats output in search of large partitions?
If you find one (or more), run nodetool cfhistograms on those tables to
get a view of the partition size distribution.
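
For reference, something along these lines (a sketch; replace the keyspace
and table names, here taken from the log excerpts earlier in the thread,
with your own):

    # Per-table stats, including "Compacted partition maximum bytes"
    # and tombstone counts
    nodetool cfstats foo

    # Distribution of partition sizes and cell counts for one table
    nodetool cfhistograms foo install_info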

Thanks

--
-
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com


Re: Out of memory and/or OOM kill on a cluster

2016-11-21 Thread Vladimir Yudovin
Did you try any value in the range 8-20GB (e.g. 60-70% of physical memory)?

Also, how many tables do you have across all keyspaces? Each table can
consume a minimum of about 1MB of Java heap.
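
For what it's worth, on Cassandra 2.1 a fixed heap in that range is usually
set in cassandra-env.sh rather than left to the auto-computed default (a
sketch; the values are examples only, not recommendations):

    # In cassandra-env.sh; MAX_HEAP_SIZE and HEAP_NEWSIZE must be set
    # (or left unset) together
    MAX_HEAP_SIZE="16G"
    HEAP_NEWSIZE="800M"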



Best regards, Vladimir Yudovin, 

Winguzone - Hosted Cloud Cassandra
Launch your cluster in minutes.













Out of memory and/or OOM kill on a cluster

2016-11-21 Thread Vincent Rischmann
Hello,



we have an 8-node Cassandra 2.1.15 cluster at work which has been giving us
a lot of trouble lately.


The problem is simple: nodes regularly die, either because of an out of
memory exception or because the Linux OOM killer decides to kill the
process.
A couple of weeks ago we increased the heap to 20GB hoping it would solve
the out of memory errors, but in fact it didn't; instead of getting an out
of memory exception, the OOM killer killed the JVM.
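
For context, the two failure modes can be told apart by checking the kernel
log for OOM-killer entries and Cassandra's own logs for the Java exception
(a sketch; log locations vary by distribution):

    # The kernel OOM killer leaves a trace in the kernel log
    dmesg -T | grep -i -E "out of memory|oom-killer|killed process"

    # A JVM-level OOM shows up in Cassandra's own logs instead
    grep -i "OutOfMemoryError" /var/log/cassandra/system.log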


We reduced the heap on some nodes to 8GB to see if it would work better,
but some nodes crashed again with an out of memory exception.


I suspect some of our tables are badly modelled, which would cause
Cassandra to allocate a lot of data; however, I don't know how to prove
that and/or find which table is bad, and which query is responsible.


I tried looking at metrics in JMX and tried profiling with Java Mission
Control, but it didn't really help; it's possible I missed something
because I have no idea what to look for exactly.


Anyone have some advice for troubleshooting this?



Thanks.