Re: crash with OOM

2016-09-27 Thread Ben Slater
That is a very large heap size for C* - most installations I’ve seen are
running in the 8-12GB heap range. G1GC is apparently better suited to large
heaps, so switching collectors may help. However, you are probably better off
digging a bit deeper into what is using all that heap. Massive IN clause
lists? Massive multi-partition batches? Massive partitions?
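
For reference, switching collectors is a cassandra-env.sh change along these
lines - only a sketch, since the exact flags to drop depend on what your 2.1
cassandra-env.sh currently sets, and the numbers are common starting points
rather than a recommendation tuned to your workload:

MAX_HEAP_SIZE="12G"                            # back down into the usual range
JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"              # G1 needs a reasonably recent JDK 7/8
JVM_OPTS="$JVM_OPTS -XX:MaxGCPauseMillis=500"  # pause-time goal, tune to taste
# ...and the CMS/ParNew flags (-XX:+UseConcMarkSweepGC, -XX:+UseParNewGC, etc.)
# should come out; also check how your script pairs HEAP_NEWSIZE with
# MAX_HEAP_SIZE, since G1 prefers to size the young generation itself.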

Especially given it hit two nodes simultaneously, I would be looking for a
rogue query as my first point of investigation.
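
If it happens again, a heap dump will tell you definitively what was holding
the memory. A rough sketch, assuming standard JDK tooling - the dump path and
the PID below are placeholders, and 2.1's cassandra-env.sh may already set the
OOM flags, so check before duplicating them:

# in cassandra-env.sh: capture a heap dump automatically on the next OOM
JVM_OPTS="$JVM_OPTS -XX:+HeapDumpOnOutOfMemoryError"
JVM_OPTS="$JVM_OPTS -XX:HeapDumpPath=/var/lib/cassandra/heapdump"

# or, on a node whose heap is climbing, grab a live-object histogram
jmap -histo:live <cassandra_pid> | head -n 30

Whatever dominates the top of the histogram (or the dump, opened in something
like Eclipse MAT) usually points straight back at the offending query pattern.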

Cheers
Ben

On Tue, 27 Sep 2016 at 17:49 xutom  wrote:

>
> Hi, all
> I have a C* cluster with 12 nodes. My Cassandra version is 2.1.14. Just now
> two nodes crashed, and the client fails to export data with read consistency
> QUORUM. The following are the logs from the failed nodes:
>
> ERROR [SharedPool-Worker-159] 2016-09-26 20:51:14,124 Message.java:538 - Unexpected exception during request; channel = [id: 0xce43a388, /13.13.13.80:55536 :> /13.13.13.149:9042]
> java.lang.AssertionError: null
>     at org.apache.cassandra.transport.ServerConnection.applyStateTransition(ServerConnection.java:100) ~[apache-cassandra-2.1.14.jar:2.1.14]
>     at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:442) [apache-cassandra-2.1.14.jar:2.1.14]
>     at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335) [apache-cassandra-2.1.14.jar:2.1.14]
>     at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324) [netty-all-4.0.23.Final.jar:4.0.23.Final]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_65]
>     at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) [apache-cassandra-2.1.14.jar:2.1.14]
>     at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-2.1.14.jar:2.1.14]
>     at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
> ERROR [SharedPool-Worker-116] 2016-09-26 20:51:14,125 JVMStabilityInspector.java:117 - JVM state determined to be unstable. Exiting forcefully due to:
> java.lang.OutOfMemoryError: Java heap space
> ERROR [SharedPool-Worker-121] 2016-09-26 20:51:14,125 JVMStabilityInspector.java:117 - JVM state determined to be unstable. Exiting forcefully due to:
> java.lang.OutOfMemoryError: Java heap space
> ERROR [SharedPool-Worker-157] 2016-09-26 20:51:14,124 Message.java:538 - Unexpected exception during request; channel = [id: 0xce43a388, /13.13.13.80:55536 :> /13.13.13.149:9042]
>
> My server has 256G of memory in total, so I set MAX_HEAP_SIZE to 60G. The
> config in cassandra-env.sh:
> MAX_HEAP_SIZE="60G"
> HEAP_NEWSIZE="20G"
> How can I solve this OOM?
>
-- 

Ben Slater
Chief Product Officer
Instaclustr: Cassandra + Spark - Managed | Consulting | Support
+61 437 929 798


crash with OOM

2016-09-27 Thread xutom

Hi, all
I have a C* cluster with 12 nodes. My Cassandra version is 2.1.14. Just now
two nodes crashed, and the client fails to export data with read consistency
QUORUM. The following are the logs from the failed nodes:

ERROR [SharedPool-Worker-159] 2016-09-26 20:51:14,124 Message.java:538 - Unexpected exception during request; channel = [id: 0xce43a388, /13.13.13.80:55536 :> /13.13.13.149:9042]
java.lang.AssertionError: null
    at org.apache.cassandra.transport.ServerConnection.applyStateTransition(ServerConnection.java:100) ~[apache-cassandra-2.1.14.jar:2.1.14]
    at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:442) [apache-cassandra-2.1.14.jar:2.1.14]
    at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:335) [apache-cassandra-2.1.14.jar:2.1.14]
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333) [netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.channel.AbstractChannelHandlerContext.access$700(AbstractChannelHandlerContext.java:32) [netty-all-4.0.23.Final.jar:4.0.23.Final]
    at io.netty.channel.AbstractChannelHandlerContext$8.run(AbstractChannelHandlerContext.java:324) [netty-all-4.0.23.Final.jar:4.0.23.Final]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_65]
    at org.apache.cassandra.concurrent.AbstractTracingAwareExecutorService$FutureTask.run(AbstractTracingAwareExecutorService.java:164) [apache-cassandra-2.1.14.jar:2.1.14]
    at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [apache-cassandra-2.1.14.jar:2.1.14]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]
ERROR [SharedPool-Worker-116] 2016-09-26 20:51:14,125 JVMStabilityInspector.java:117 - JVM state determined to be unstable. Exiting forcefully due to:
java.lang.OutOfMemoryError: Java heap space
ERROR [SharedPool-Worker-121] 2016-09-26 20:51:14,125 JVMStabilityInspector.java:117 - JVM state determined to be unstable. Exiting forcefully due to:
java.lang.OutOfMemoryError: Java heap space
ERROR [SharedPool-Worker-157] 2016-09-26 20:51:14,124 Message.java:538 - Unexpected exception during request; channel = [id: 0xce43a388, /13.13.13.80:55536 :> /13.13.13.149:9042]

My server has 256G of memory in total, so I set MAX_HEAP_SIZE to 60G. The
config in cassandra-env.sh:
MAX_HEAP_SIZE="60G"
HEAP_NEWSIZE="20G"
How can I solve this OOM?