Andrey Lataev created CASSANDRA-13931:
-----------------------------------------

             Summary: Cassandra JVM stop itself randomly
                 Key: CASSANDRA-13931
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13931
             Project: Cassandra
          Issue Type: Bug
          Components: Core
         Environment: RHEL 7.3
JDK HotSpot 1.8.0_121-b13
cassandra-3.11 cluster with 43 nodes in 9 datacenters
8vCPU, 32 GB RAM
            Reporter: Andrey Lataev
         Attachments: cassandra-env.sh, cassandra.yaml

Before I set -XX:MaxDirectMemorySize, I was getting out-of-memory kills at the OS level, for example:

# grep "Out of" /var/log/messages-20170918
Sep 16 06:54:07 p00skimnosql04 kernel: Out of memory: Kill process 26619 (java) score 287 or sacrifice child
Sep 16 06:54:07 p00skimnosql04 kernel: Out of memory: Kill process 26640 (java) score 289 or sacrifice child

With the limit -XX:MaxDirectMemorySize=5G set, I periodically get:
HeapUtils.java:136 - Dumping heap to /egov/dumps/cassandra-1506868110-pid11155.hprof

It seems the JVM kills itself when an off-heap memory leak occurs.
Typical errors in system.log before the JVM starts dumping the heap:

ERROR [MessagingService-Incoming-/172.20.4.143] 2017-10-01 19:00:36,336 CassandraDaemon.java:228 - Exception in thread Thread[MessagingService-Incoming-/172.20.4.143,5,main]
ERROR [Native-Transport-Requests-139] 2017-10-01 19:04:02,675 Message.java:625 - Unexpected exception during request; channel = [id: 0x3c0c1c26, L:/172.20.4.142:9042 - R:/172.20.4.139:44874]

Full stack traces:

ERROR [Native-Transport-Requests-139] 2017-10-01 19:04:02,675 Message.java:625 - Unexpected exception during request; channel = [id: 0x3c0c1c26, L:/172.20.4.142:9042 - R:/172.20.4.139:44874]
java.lang.AssertionError: null
        at org.apache.cassandra.transport.ServerConnection.applyStateTransition(ServerConnection.java:97) ~[apache-cassandra-3.11.0.jar:3.11.0]
        at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:521) [apache-cassandra-3.11.0.jar:3.11.0]
        at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410) [apache-cassandra-3.11.0.jar:3.11.0]
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.44.Final.jar:4.0.44.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) [netty-all-4.0.44.Final.jar:4.0.44.Final]
        at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35) [netty-all-4.0.44.Final.jar:4.0.44.Final]
        at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:348) [netty-all-4.0.44.Final.jar:4.0.44.Final]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_121]
        at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) [apache-cassandra-3.11.0.jar:3.11.0]
        at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) [apache-cassandra-3.11.0.jar:3.11.0]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]



INFO  [MutationStage-127] 2017-10-01 19:08:24,255 HeapUtils.java:136 - Dumping heap to /egov/dumps/cassandra-1506868110-pid11155.hprof ...
Heap dump file created



ERROR [MessagingService-Incoming-/172.20.4.143] 2017-10-01 19:08:33,493 CassandraDaemon.java:228 - Exception in thread Thread[MessagingService-Incoming-/172.20.4.143,5,main]
java.io.IOError: java.io.EOFException: Stream ended prematurely
        at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:227) ~[apache-cassandra-3.11.0.jar:3.11.0]
        at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:215) ~[apache-cassandra-3.11.0.jar:3.11.0]
        at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.11.0.jar:3.11.0]
        at org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:839) ~[apache-cassandra-3.11.0.jar:3.11.0]
        at org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:800) ~[apache-cassandra-3.11.0.jar:3.11.0]
        at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:415) ~[apache-cassandra-3.11.0.jar:3.11.0]
        at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:434) ~[apache-cassandra-3.11.0.jar:3.11.0]
        at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:371) ~[apache-cassandra-3.11.0.jar:3.11.0]
        at org.apache.cassandra.net.MessageIn.read(MessageIn.java:123) ~[apache-cassandra-3.11.0.jar:3.11.0]
        at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:192) ~[apache-cassandra-3.11.0.jar:3.11.0]
        at org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:180) ~[apache-cassandra-3.11.0.jar:3.11.0]
        at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:94) ~[apache-cassandra-3.11.0.jar:3.11.0]
Caused by: java.io.EOFException: Stream ended prematurely
        at net.jpountz.lz4.LZ4BlockInputStream.readFully(LZ4BlockInputStream.java:218) ~[lz4-1.3.0.jar:na]
        at net.jpountz.lz4.LZ4BlockInputStream.refill(LZ4BlockInputStream.java:150) ~[lz4-1.3.0.jar:na]
        at net.jpountz.lz4.LZ4BlockInputStream.read(LZ4BlockInputStream.java:117) ~[lz4-1.3.0.jar:na]
        at java.io.DataInputStream.readFully(DataInputStream.java:195) ~[na:1.8.0_121]
        at java.io.DataInputStream.readFully(DataInputStream.java:169) ~[na:1.8.0_121]
        at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402) ~[apache-cassandra-3.11.0.jar:3.11.0]
        at org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:437) ~[apache-cassandra-3.11.0.jar:3.11.0]
        at org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:245) ~[apache-cassandra-3.11.0.jar:3.11.0]
        at org.apache.cassandra.db.rows.UnfilteredSerializer.readSimpleColumn(UnfilteredSerializer.java:639) ~[apache-cassandra-3.11.0.jar:3.11.0]
        at org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$deserializeRowBody$1(UnfilteredSerializer.java:604) ~[apache-cassandra-3.11.0.jar:3.11.0]
        at org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1242) ~[apache-cassandra-3.11.0.jar:3.11.0]
        at org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1197) ~[apache-cassandra-3.11.0.jar:3.11.0]
        at org.apache.cassandra.db.Columns.apply(Columns.java:377) ~[apache-cassandra-3.11.0.jar:3.11.0]
        at org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:600) ~[apache-cassandra-3.11.0.jar:3.11.0]
        at org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeOne(UnfilteredSerializer.java:475) ~[apache-cassandra-3.11.0.jar:3.11.0]
        at org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:431) ~[apache-cassandra-3.11.0.jar:3.11.0]
        at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:222) ~[apache-cassandra-3.11.0.jar:3.11.0]
        ... 11 common frames omitted


I also tried setting -XX:+ExplicitGCInvokesConcurrent on some other nodes, but without success.
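
One way to check whether NIO direct buffers are what is growing is to read the JVM's buffer-pool MXBeans. The following is only a minimal standalone sketch, not something taken from the node: the class name DirectMemoryProbe and the 16 MB test allocation are made up for illustration. It uses the standard java.lang.management.BufferPoolMXBean API and reports only the JVM's NIO buffer pools (such as "direct" and "mapped"); native memory allocated by other means is not visible here.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: reports how many bytes each JVM buffer pool is using.
final class DirectMemoryProbe {

    /** Returns pool name -> bytes used, as reported by the platform MXBeans. */
    static Map<String, Long> snapshot() {
        Map<String, Long> used = new LinkedHashMap<>();
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            used.put(pool.getName(), pool.getMemoryUsed());
        }
        return used;
    }

    public static void main(String[] args) {
        System.out.println("before: " + snapshot());
        // Illustrative allocation only, so the "direct" pool visibly changes.
        ByteBuffer buf = ByteBuffer.allocateDirect(16 * 1024 * 1024);
        System.out.println("after:  " + snapshot());
        // Reference buf after the snapshot so it stays reachable until here.
        System.out.println("allocated capacity: " + buf.capacity());
    }
}
```

On a running node the same figures should be readable remotely over JMX from the java.nio:type=BufferPool,name=direct MBean. If the "direct" pool stays well below the 5G limit while the process RSS keeps growing, the leak is probably in memory that this pool does not track.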


