[jira] [Commented] (CASSANDRA-13931) Cassandra JVM stop itself randomly
[ https://issues.apache.org/jira/browse/CASSANDRA-13931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16635473#comment-16635473 ]

Mhanna Abu Tareef commented on CASSANDRA-13931:
---

[~Ljus] Any updates on this? Have you managed to overcome this issue? I think I have the same symptoms.

> Cassandra JVM stop itself randomly
> --
>
>                 Key: CASSANDRA-13931
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13931
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: RHEL 7.3
> JDK HotSpot 1.8.0_121-b13
> cassandra-3.11 cluster with 43 nodes in 9 datacenters
> 8vCPU, 32 GB RAM
>            Reporter: Andrey Lataev
>            Priority: Major
>         Attachments: cassandra-env.sh, cassandra.yaml, system.log.2017-10-01.zip
>
>
> Before I set -XX:MaxDirectMemorySize, I received OOM kills at the OS level, like:
> # grep "Out of" /var/log/messages-20170918
> Sep 16 06:54:07 p00skimnosql04 kernel: Out of memory: Kill process 26619 (java) score 287 or sacrifice child
> Sep 16 06:54:07 p00skimnosql04 kernel: Out of memory: Kill process 26640 (java) score 289 or sacrifice child
> With the -XX:MaxDirectMemorySize=5G limit set, I periodically begin to receive:
> HeapUtils.java:136 - Dumping heap to /egov/dumps/cassandra-1506868110-pid11155.hprof
> It seems the JVM kills itself when off-heap memory leaks occur.
> Typical errors in system.log before the JVM begins dumping:
> ERROR [MessagingService-Incoming-/172.20.4.143] 2017-10-01 19:00:36,336 CassandraDaemon.java:228 - Exception in thread Thread[MessagingService-Incoming-/172.20.4.143,5,main]
> ERROR [Native-Transport-Requests-139] 2017-10-01 19:04:02,675 Message.java:625 - Unexpected exception during request; channel = [id: 0x3c0c1c26, L:/172.20.4.142:9042 - R:/172.20.4.139:44874]
> Full stack traces:
> ERROR [Native-Transport-Requests-139] 2017-10-01 19:04:02,675 Message.java:625 - Unexpected exception during request; channel = [id: 0x3c0c1c26, L:/172.20.4.142:9042 - R:/172.20.4.139:44874]
> java.lang.AssertionError: null
>     at org.apache.cassandra.transport.ServerConnection.applyStateTransition(ServerConnection.java:97) ~[apache-cassandra-3.11.0.jar:3.11.0]
>     at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:521) [apache-cassandra-3.11.0.jar:3.11.0]
>     at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410) [apache-cassandra-3.11.0.jar:3.11.0]
>     at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.44.Final.jar:4.0.44.Final]
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) [netty-all-4.0.44.Final.jar:4.0.44.Final]
>     at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35) [netty-all-4.0.44.Final.jar:4.0.44.Final]
>     at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:348) [netty-all-4.0.44.Final.jar:4.0.44.Final]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_121]
>     at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) [apache-cassandra-3.11.0.jar:3.11.0]
>     at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) [apache-cassandra-3.11.0.jar:3.11.0]
>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
> INFO [MutationStage-127] 2017-10-01 19:08:24,255 HeapUtils.java:136 - Dumping heap to /egov/dumps/cassandra-1506868110-pid11155.hprof ...
> Heap dump file created
> ERROR [MessagingService-Incoming-/172.20.4.143] 2017-10-01 19:08:33,493 CassandraDaemon.java:228 - Exception in thread Thread[MessagingService-Incoming-/172.20.4.143,5,main]
> java.io.IOError: java.io.EOFException: Stream ended prematurely
>     at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:227) ~[apache-cassandra-3.11.0.jar:3.11.0]
>     at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:215) ~[apache-cassandra-3.11.0.jar:3.11.0]
>     at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.11.0.jar:3.11.0]
>     at org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:839) ~[apache-cassandra-3.11.0.jar:3.11.0]
>     at org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:800)
[jira] [Commented] (CASSANDRA-13931) Cassandra JVM stop itself randomly
[ https://issues.apache.org/jira/browse/CASSANDRA-13931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198812#comment-16198812 ]

Andrey Lataev commented on CASSANDRA-13931:
---

I downgraded Cassandra to 3.10, upgraded the JDK to 1.8.0_144, and set
{code:java}
MAX_HEAP_SIZE="9G"
{code}
while leaving this unchanged:
{code:java}
JVM_OPTS="$JVM_OPTS -XX:MaxDirectMemorySize=24G"
{code}
But I still periodically have a similar problem with off-heap memory:
{code:java}
# egrep "Dumping|YamlConfigurationLoader.java|ERR" /var/log/cassandra/system.log | egrep "2017-10-10 15"
ERROR [NonPeriodicTasks:1] 2017-10-10 15:59:31,155 Ref.java:233 - Error when closing class org.apache.cassandra.io.sstable.format.SSTableReader$GlobalTidy@954667024:/egov/data/cassandra/datafiles1/p00smevaudit/messagelog20171010-a50f6b00a1f511e78dc897891b876cc2/mc-4357-big
ERROR [NonPeriodicTasks:1] 2017-10-10 15:59:32,103 Ref.java:233 - Error when closing class org.apache.cassandra.io.sstable.format.SSTableReader$GlobalTidy@1640091777:/egov/data/cassandra/datafiles1/p00smevaudit/messagelog20171010-a50f6b00a1f511e78dc897891b876cc2/mc-4355-big
# egrep "Dumping|YamlConfigurationLoader.java|ERR" /var/log/cassandra/system.log | egrep "2017-10-10 16"
ERROR [MessagingService-Incoming-/172.20.4.125] 2017-10-10 16:00:17,421 CassandraDaemon.java:229 - Exception in thread Thread[MessagingService-Incoming-/172.20.4.125,5,main]
INFO [MutationStage-128] 2017-10-10 16:00:17,690 HeapUtils.java:136 - Dumping heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO [MutationStage-196] 2017-10-10 16:00:17,721 HeapUtils.java:136 - Dumping heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO [MutationStage-18] 2017-10-10 16:00:17,754 HeapUtils.java:136 - Dumping heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO [MutationStage-184] 2017-10-10 16:00:17,757 HeapUtils.java:136 - Dumping heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO [MutationStage-235] 2017-10-10 16:00:17,768 HeapUtils.java:136 - Dumping heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO [MutationStage-197] 2017-10-10 16:00:17,769 HeapUtils.java:136 - Dumping heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO [MutationStage-28] 2017-10-10 16:00:17,780 HeapUtils.java:136 - Dumping heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO [MutationStage-2] 2017-10-10 16:00:17,846 HeapUtils.java:136 - Dumping heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO [MutationStage-152] 2017-10-10 16:00:17,873 HeapUtils.java:136 - Dumping heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO [MutationStage-241] 2017-10-10 16:00:17,876 HeapUtils.java:136 - Dumping heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO [MutationStage-223] 2017-10-10 16:00:21,540 HeapUtils.java:136 - Dumping heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO [MutationStage-16] 2017-10-10 16:00:21,540 HeapUtils.java:136 - Dumping heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO [MutationStage-189] 2017-10-10 16:00:21,540 HeapUtils.java:136 - Dumping heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
ERROR [MessagingService-Incoming-/172.20.4.139] 2017-10-10 16:00:21,540 CassandraDaemon.java:229 - Exception in thread Thread[MessagingService-Incoming-/172.20.4.139,5,main]
ERROR [MessagingService-Incoming-/172.20.4.145] 2017-10-10 16:00:21,540 CassandraDaemon.java:229 - Exception in thread Thread[MessagingService-Incoming-/172.20.4.145,5,main]
INFO [MutationStage-224] 2017-10-10 16:00:21,543 HeapUtils.java:136 - Dumping heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO [MutationStage-222] 2017-10-10 16:00:21,545 HeapUtils.java:136 - Dumping heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO [MutationStage-101] 2017-10-10 16:00:21,574 HeapUtils.java:136 - Dumping heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO [MutationStage-40] 2017-10-10 16:00:25,095 HeapUtils.java:136 - Dumping heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
ERROR [MessagingService-Incoming-/172.20.4.145] 2017-10-10 16:00:25,170 CassandraDaemon.java:229 - Exception in thread Thread[MessagingService-Incoming-/172.20.4.145,5,main]
ERROR [MessagingService-Incoming-/172.20.4.109] 2017-10-10 16:00:25,212 CassandraDaemon.java:229 - Exception in thread Thread[MessagingService-Incoming-/172.20.4.109,5,main]
ERROR [MessagingService-Incoming-/172.20.4.163] 2017-10-10 16:00:25,213 CassandraDaemon.java:229 - Exception in thread Thread[MessagingService-Incoming-/172.20.4.163,5,main]
ERROR [MessagingService-Incoming-/172.20.4.162] 2017-10-10 16:00:25,216 CassandraDaemon.java:229 - Exception in thread Thread[MessagingService-Incoming-/172.20.4.162,5,main]
ERROR [MutationStage-128] 2017-10-10 16:00:32,694 JVMStabilityInspector.java:142 - JVM state determined
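The egrep filtering in the comment above can also be done with a small script that counts heap-dump notices and groups ERROR lines by source file. This is only a rough sketch: the line layout is inferred from the excerpt in this thread, and `summarize` and `LINE_RE` are hypothetical names, not part of Cassandra or its tooling.

```python
import re
from collections import Counter

# Layout assumed from the excerpt:
# LEVEL [thread] yyyy-MM-dd HH:mm:ss,SSS File.java:NN - message
LINE_RE = re.compile(
    r"^(?P<level>[A-Z]+)\s+\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\s+"
    r"(?P<src>\S+)\s+-\s+(?P<msg>.*)$"
)


def summarize(lines):
    """Return (heap-dump notice count, ERROR count per source file)."""
    dumps = 0
    errors = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # stack-trace frames and wrapped lines are skipped
        if m["level"] == "INFO" and m["msg"].startswith("Dumping heap"):
            dumps += 1
        elif m["level"] == "ERROR":
            # "CassandraDaemon.java:229" -> "CassandraDaemon.java"
            errors[m["src"].split(":")[0]] += 1
    return dumps, errors


# Three lines copied from the excerpt above.
sample = [
    "ERROR [MessagingService-Incoming-/172.20.4.125] 2017-10-10 16:00:17,421 "
    "CassandraDaemon.java:229 - Exception in thread "
    "Thread[MessagingService-Incoming-/172.20.4.125,5,main]",
    "INFO [MutationStage-128] 2017-10-10 16:00:17,690 HeapUtils.java:136 - "
    "Dumping heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...",
    "INFO [MutationStage-196] 2017-10-10 16:00:17,721 HeapUtils.java:136 - "
    "Dumping heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...",
]

dumps, errors = summarize(sample)
print(dumps, dict(errors))
```

Run against the whole system.log, a summary like this would show at a glance how many MutationStage threads requested a dump in the same second, which is the pattern visible in the excerpt.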
[jira] [Commented] (CASSANDRA-13931) Cassandra JVM stop itself randomly
[ https://issues.apache.org/jira/browse/CASSANDRA-13931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191781#comment-16191781 ]

Andrey Lataev commented on CASSANDRA-13931:
---

For now I will try setting:
{code:java}
concurrent_reads: 32
concurrent_writes: 64
{code}
and
{code:java}
MAX_HEAP_SIZE="16G"
JVM_OPTS="$JVM_OPTS -XX:MaxDirectMemorySize=24G"
{code}
[jira] [Commented] (CASSANDRA-13931) Cassandra JVM stop itself randomly
[ https://issues.apache.org/jira/browse/CASSANDRA-13931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191450#comment-16191450 ]

Andrey Lataev commented on CASSANDRA-13931:
---

As you can see in the attached cassandra-env.sh file, the line
{code:java}
JVM_OPTS="$JVM_OPTS -Djdk.nio.maxCachedBufferSize=262144"
{code}
is already present.

I will try to add RAM and increase the heap size to 16 GB.

Eclipse Memory Analyzer shows the following top 3 problem suspects for the heap dump:

*Problem Suspect 1*
{code:java}
The thread org.apache.cassandra.net.OutboundTcpConnection @ 0x6cd263100 MessagingService-Outgoing-p00skimnosql10.00.egov.local/172.20.4.148-Large keeps local variables with total size 306 114 312 (13,97%) bytes.
The memory is accumulated in one instance of "org.apache.cassandra.net.OutboundTcpConnection" loaded by "sun.misc.Launcher$AppClassLoader @ 0x6c000".
{code}

*Problem Suspect 2*
{code:java}
529 instances of "io.netty.util.concurrent.FastThreadLocalThread", loaded by "sun.misc.Launcher$AppClassLoader @ 0x6c000" occupy 776 362 840 (35,43%) bytes.
Biggest instances:
•io.netty.util.concurrent.FastThreadLocalThread @ 0x6d719e1e0 epollEventLoopGroup-2-7 - 156 689 680 (7,15%) bytes.
•io.netty.util.concurrent.FastThreadLocalThread @ 0x6d719e7e0 epollEventLoopGroup-2-3 - 125 567 112 (5,73%) bytes.
•io.netty.util.concurrent.FastThreadLocalThread @ 0x6d719da60 epollEventLoopGroup-2-12 - 119 599 160 (5,46%) bytes.
•io.netty.util.concurrent.FastThreadLocalThread @ 0x6ceab17b0 epollEventLoopGroup-2-1 - 118 469 632 (5,41%) bytes.
•io.netty.util.concurrent.FastThreadLocalThread @ 0x6d7059b00 ReadStage-151 - 66 494 040 (3,03%) bytes.
{code}

*Problem Suspect 3*
{code:java}
126 instances of "byte[]", loaded by "<system class loader>" occupy 268 549 640 (12,26%) bytes.
These instances are referenced from one instance of "java.util.HashMap$Node[]", loaded by "<system class loader>"
Keywords
byte[]
java.util.HashMap$Node[]
{code}
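The byte counts and percentages in the Memory Analyzer report above imply a total heap size for the dump, which can be cross-checked with a little arithmetic. The sketch below only reuses numbers quoted in the report; the dictionary keys are shortened labels, and adding the three shares together assumes the suspects do not overlap, which the report does not confirm.

```python
# (retained bytes, share of the heap in percent) as quoted for each suspect
suspects = {
    "OutboundTcpConnection": (306_114_312, 13.97),
    "FastThreadLocalThread": (776_362_840, 35.43),
    "byte[]": (268_549_640, 12.26),
}


def implied_heap_bytes(retained_bytes, percent):
    """Total heap size implied by one 'N bytes (P%)' pair."""
    return retained_bytes / (percent / 100.0)


estimates = {
    name: implied_heap_bytes(size, pct) for name, (size, pct) in suspects.items()
}
# Upper bound on the combined share, assuming no overlap between suspects.
combined_share = sum(pct for _, pct in suspects.values())

for name, est in estimates.items():
    print(f"{name}: about {est / 1e9:.2f} GB total heap implied")
print(f"combined share: {combined_share:.2f}%")
```

All three pairs point to roughly the same total, about 2.2 GB of objects in this dump, so the reported figures are at least internally consistent.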
[jira] [Commented] (CASSANDRA-13931) Cassandra JVM stop itself randomly
[ https://issues.apache.org/jira/browse/CASSANDRA-13931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16190324#comment-16190324 ]

Chris Lohfink commented on CASSANDRA-13931:
---

With {{-Xms6G, -Xmx6G, -Xmn2048M}} you're going to have issues running C* with default settings. I would strongly recommend a minimum of 8 GB. The JVM defaults MaxDirectMemorySize to the same value as the heap size (6G), although this usually doesn't fill up unless you hit a Netty or JDK leak.

If you're running on a limited system where you're getting hit by the OOM killer, you might want to consider an even smaller heap (e.g. 4 GB), but then you will need to limit other settings, since such a heap is not going to be able to handle the defaults. For example, {{concurrent_reads}} and {{concurrent_writes}} should perhaps be 1/2 or 1/4 of the 64/128 you have at the moment. Also look to decrease just about anything that takes up resources off-heap.
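The sizing advice in the comment above can be written out as a small calculation. This is only an illustration: it assumes "64/128" means concurrent_reads: 64 and concurrent_writes: 128, and `scaled` and `effective_direct_gb` are hypothetical helper names, not Cassandra or JVM APIs.

```python
def scaled(settings, divisor):
    """Divide each concurrency setting by divisor, never going below 1."""
    return {key: max(1, value // divisor) for key, value in settings.items()}


def effective_direct_gb(heap_gb, max_direct_gb=None):
    """Direct-memory limit as described in the comment: same as the heap
    size unless -XX:MaxDirectMemorySize is set explicitly."""
    return heap_gb if max_direct_gb is None else max_direct_gb


# Assumed mapping of the "64/128" mentioned in the comment.
current = {"concurrent_reads": 64, "concurrent_writes": 128}

half = scaled(current, 2)
quarter = scaled(current, 4)
print(half)
print(quarter)
print(effective_direct_gb(6))      # no explicit limit, 6G heap
print(effective_direct_gb(6, 5))   # explicit -XX:MaxDirectMemorySize=5G
```

Under that assumption, halving gives the 32/64 pair that the reporter says he will try in a later comment on this issue.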
[jira] [Commented] (CASSANDRA-13931) Cassandra JVM stop itself randomly
[ https://issues.apache.org/jira/browse/CASSANDRA-13931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16190306#comment-16190306 ]

Chris Lohfink commented on CASSANDRA-13931:
---

There is a JDK memory leak in direct memory; setting {{-Djdk.nio.maxCachedBufferSize=262144}} may help if you're running JDK >= 1.8u102.

You're giving the JVM more heap + off-heap than your system has, so either the OS out-of-memory killer kills java or the JVM fails a malloc and shuts down.
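The point about heap plus off-heap exceeding system memory can be checked with simple arithmetic. The sketch below assumes the 32 GB RAM from the issue's environment and uses heap and direct-memory pairs mentioned in different comments of this thread; `overcommit_gb` is a hypothetical helper. It deliberately ignores metaspace, thread stacks, memory-mapped files and the OS page cache, so real headroom is smaller than shown.

```python
def overcommit_gb(ram_gb, heap_gb, direct_gb):
    """Positive result: heap + direct-memory limits exceed physical RAM."""
    return heap_gb + direct_gb - ram_gb


RAM_GB = 32  # from the issue's environment: 8vCPU, 32 GB RAM

# (heap GB, MaxDirectMemorySize GB) pairs quoted in the thread
configs = {
    "6G heap / 5G direct": (6, 5),
    "9G heap / 24G direct": (9, 24),
    "16G heap / 24G direct": (16, 24),
}

results = {
    label: overcommit_gb(RAM_GB, heap, direct)
    for label, (heap, direct) in configs.items()
}
for label, extra in results.items():
    status = "over RAM by" if extra > 0 else "under RAM by"
    print(f"{label}: {status} {abs(extra)} GB")
```

A positive value means the two limits alone could outgrow physical memory, which is the situation the comment describes.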
[jira] [Commented] (CASSANDRA-13931) Cassandra JVM stop itself randomly
[ https://issues.apache.org/jira/browse/CASSANDRA-13931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16190166#comment-16190166 ]

Andrey Lataev commented on CASSANDRA-13931:
---

Also, I can attach a JVM heap dump if it helps.