[jira] [Commented] (CASSANDRA-13937) Cassandra node's startup time increased after increase count of big tables

2017-10-10 Thread Andrey Lataev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198850#comment-16198850
 ] 

Andrey Lataev commented on CASSANDRA-13937:
---


I tried replacing LCS with STCS, but without any significant result.

>  Cassandra node's startup time increased after increase count of big tables
> ---
>
> Key: CASSANDRA-13937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13937
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: RHEL 7.3
> JDK HotSpot 1.8.0_121-b13
> cassandra-3.11 cluster with 43 nodes in 9 datacenters
> 8vCPU, 100 GB RAM
>Reporter: Andrey Lataev
> Attachments: cassandra.zip, debug.zip
>
>
> At startup, Cassandra spends a long time reading some big column families.
> For example, in debug.log:
> {code:java}
> grep SSTableReader.java:506 /var/log/cassandra/debug.log
> <...> 
> DEBUG [SSTableBatchOpen:3] 2017-10-04 22:40:05,297 SSTableReader.java:506 - 
> Opening 
> /egov/data/cassandra/datafiles1/p00smevauditbody/messagelogbody20171003-b341cc709c7511e7b1cfed1e90eb03dc/mc-45242-big
>  (19.280MiB)
> DEBUG [SSTableBatchOpen:5] 2017-10-04 22:42:14,188 SSTableReader.java:506 - 
> Opening 
> /egov/data/cassandra/datafiles1/p00smevauditbody/messagelogbody20171004-f82225509d3e11e7b1cfed1e90eb03dc/mc-49002-big
>  (10.607MiB)
> <...>
> DEBUG [SSTableBatchOpen:4] 2017-10-04 22:42:19,792 SSTableReader.java:506 - 
> Opening 
> /egov/data/cassandra/datafiles1/p00smevauditbody/messagelogbody20171004-f82225509d3e11e7b1cfed1e90eb03dc/mc-47907-big
>  (128.172MiB)
> DEBUG [SSTableBatchOpen:1] 2017-10-04 22:44:23,560 SSTableReader.java:506 - 
> Opening 
> /egov/data/cassandra/datafiles1/pk4smevauditbody/messagelogbody20170324-f918bfa0107b11e7adfc2d0b45a372ac/mc-4-big
>  (96.310MiB)
> <..>
> {code}
> Opening each big table in the p00smevauditbody keyspace 
> (SSTableReader.java:506) takes about 2 minutes.
> I was planning to keep similar tables for a full month,
> so it seems Cassandra will need more than 1 hour to start up.
> Is it possible to speed up SSTableBatchOpen?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-13931) Cassandra JVM stop itself randomly

2017-10-10 Thread Andrey Lataev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198812#comment-16198812
 ] 

Andrey Lataev edited comment on CASSANDRA-13931 at 10/10/17 3:17 PM:
-

I downgraded Cassandra to 3.10,
upgraded the JDK to 1.8.0_144,
and set
{code:java}
MAX_HEAP_SIZE="9G"
{code}
while leaving this unchanged:
{code:java}
JVM_OPTS="$JVM_OPTS -XX:MaxDirectMemorySize=24G"
{code}
But I still periodically see a similar off-heap problem:

{code:java}
# egrep "Dumping|YamlConfigurationLoader.java|ERR" /var/log/cassandra/system.log | egrep "2017-10-10 15"
ERROR [NonPeriodicTasks:1] 2017-10-10 15:59:31,155 Ref.java:233 - Error when 
closing class 
org.apache.cassandra.io.sstable.format.SSTableReader$GlobalTidy@954667024:/egov/data/cassandra/datafiles1/p00smevaudit/messagelog20171010-a50f6b00a1f511e78dc897891b876cc2/mc-4357-big
ERROR [NonPeriodicTasks:1] 2017-10-10 15:59:32,103 Ref.java:233 - Error when 
closing class 
org.apache.cassandra.io.sstable.format.SSTableReader$GlobalTidy@1640091777:/egov/data/cassandra/datafiles1/p00smevaudit/messagelog20171010-a50f6b00a1f511e78dc897891b876cc2/mc-4355-big


# egrep "Dumping|YamlConfigurationLoader.java|ERR" /var/log/cassandra/system.log | egrep "2017-10-10 16"
ERROR [MessagingService-Incoming-/172.20.4.125] 2017-10-10 16:00:17,421 
CassandraDaemon.java:229 - Exception in thread 
Thread[MessagingService-Incoming-/172.20.4.125,5,main]
INFO  [MutationStage-128] 2017-10-10 16:00:17,690 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-196] 2017-10-10 16:00:17,721 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-18] 2017-10-10 16:00:17,754 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-184] 2017-10-10 16:00:17,757 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-235] 2017-10-10 16:00:17,768 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-197] 2017-10-10 16:00:17,769 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-28] 2017-10-10 16:00:17,780 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-2] 2017-10-10 16:00:17,846 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-152] 2017-10-10 16:00:17,873 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-241] 2017-10-10 16:00:17,876 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-223] 2017-10-10 16:00:21,540 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-16] 2017-10-10 16:00:21,540 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-189] 2017-10-10 16:00:21,540 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
ERROR [MessagingService-Incoming-/172.20.4.139] 2017-10-10 16:00:21,540 
CassandraDaemon.java:229 - Exception in thread 
Thread[MessagingService-Incoming-/172.20.4.139,5,main]
ERROR [MessagingService-Incoming-/172.20.4.145] 2017-10-10 16:00:21,540 
CassandraDaemon.java:229 - Exception in thread 
Thread[MessagingService-Incoming-/172.20.4.145,5,main]
INFO  [MutationStage-224] 2017-10-10 16:00:21,543 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-222] 2017-10-10 16:00:21,545 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-101] 2017-10-10 16:00:21,574 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-40] 2017-10-10 16:00:25,095 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
ERROR [MessagingService-Incoming-/172.20.4.145] 2017-10-10 16:00:25,170 
CassandraDaemon.java:229 - Exception in thread 
Thread[MessagingService-Incoming-/172.20.4.145,5,main]
ERROR [MessagingService-Incoming-/172.20.4.109] 2017-10-10 16:00:25,212 
CassandraDaemon.java:229 - Exception in thread 
Thread[MessagingService-Incoming-/172.20.4.109,5,main]
ERROR [MessagingService-Incoming-/172.20.4.163] 2017-10-10 16:00:25,213 
CassandraDaemon.java:229 - Exception in thread 
Thread[MessagingService-Incoming-/172.20.4.163,5,main]
ERROR [MessagingService-Incoming-/172.20.4.162] 2017-10-10 16:00:25,216 
CassandraDaemon.java:229 - Exception in thread 
Thread[MessagingService-Incoming-/172.20.4.162,5,main]
ERROR [MutationStage-128] 2017-10-10 16:00:32,694 

[jira] [Commented] (CASSANDRA-13931) Cassandra JVM stop itself randomly

2017-10-10 Thread Andrey Lataev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16198812#comment-16198812
 ] 

Andrey Lataev commented on CASSANDRA-13931:
---

I downgraded Cassandra to 3.10,
upgraded the JDK to 1.8.0_144,
and set
{code:java}
MAX_HEAP_SIZE="9G"
{code}
while leaving this unchanged:
{code:java}
JVM_OPTS="$JVM_OPTS -XX:MaxDirectMemorySize=24G"
{code}
But I still periodically see a similar off-heap problem:

{code:java}
# egrep "Dumping|YamlConfigurationLoader.java|ERR" /var/log/cassandra/system.log | egrep "2017-10-10 15"
ERROR [NonPeriodicTasks:1] 2017-10-10 15:59:31,155 Ref.java:233 - Error when 
closing class 
org.apache.cassandra.io.sstable.format.SSTableReader$GlobalTidy@954667024:/egov/data/cassandra/datafiles1/p00smevaudit/messagelog20171010-a50f6b00a1f511e78dc897891b876cc2/mc-4357-big
ERROR [NonPeriodicTasks:1] 2017-10-10 15:59:32,103 Ref.java:233 - Error when 
closing class 
org.apache.cassandra.io.sstable.format.SSTableReader$GlobalTidy@1640091777:/egov/data/cassandra/datafiles1/p00smevaudit/messagelog20171010-a50f6b00a1f511e78dc897891b876cc2/mc-4355-big

# egrep "Dumping|YamlConfigurationLoader.java|ERR" /var/log/cassandra/system.log | egrep "2017-10-10 16"
ERROR [MessagingService-Incoming-/172.20.4.125] 2017-10-10 16:00:17,421 
CassandraDaemon.java:229 - Exception in thread 
Thread[MessagingService-Incoming-/172.20.4.125,5,main]
INFO  [MutationStage-128] 2017-10-10 16:00:17,690 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-196] 2017-10-10 16:00:17,721 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-18] 2017-10-10 16:00:17,754 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-184] 2017-10-10 16:00:17,757 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-235] 2017-10-10 16:00:17,768 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-197] 2017-10-10 16:00:17,769 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-28] 2017-10-10 16:00:17,780 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-2] 2017-10-10 16:00:17,846 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-152] 2017-10-10 16:00:17,873 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-241] 2017-10-10 16:00:17,876 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-223] 2017-10-10 16:00:21,540 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-16] 2017-10-10 16:00:21,540 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-189] 2017-10-10 16:00:21,540 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
ERROR [MessagingService-Incoming-/172.20.4.139] 2017-10-10 16:00:21,540 
CassandraDaemon.java:229 - Exception in thread 
Thread[MessagingService-Incoming-/172.20.4.139,5,main]
ERROR [MessagingService-Incoming-/172.20.4.145] 2017-10-10 16:00:21,540 
CassandraDaemon.java:229 - Exception in thread 
Thread[MessagingService-Incoming-/172.20.4.145,5,main]
INFO  [MutationStage-224] 2017-10-10 16:00:21,543 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-222] 2017-10-10 16:00:21,545 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-101] 2017-10-10 16:00:21,574 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
INFO  [MutationStage-40] 2017-10-10 16:00:25,095 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1507584313-pid17345.hprof ...
ERROR [MessagingService-Incoming-/172.20.4.145] 2017-10-10 16:00:25,170 
CassandraDaemon.java:229 - Exception in thread 
Thread[MessagingService-Incoming-/172.20.4.145,5,main]
ERROR [MessagingService-Incoming-/172.20.4.109] 2017-10-10 16:00:25,212 
CassandraDaemon.java:229 - Exception in thread 
Thread[MessagingService-Incoming-/172.20.4.109,5,main]
ERROR [MessagingService-Incoming-/172.20.4.163] 2017-10-10 16:00:25,213 
CassandraDaemon.java:229 - Exception in thread 
Thread[MessagingService-Incoming-/172.20.4.163,5,main]
ERROR [MessagingService-Incoming-/172.20.4.162] 2017-10-10 16:00:25,216 
CassandraDaemon.java:229 - Exception in thread 
Thread[MessagingService-Incoming-/172.20.4.162,5,main]
ERROR [MutationStage-128] 2017-10-10 16:00:32,694 
JVMStabilityInspector.java:142 - JVM state determined 

[jira] [Updated] (CASSANDRA-13931) Cassandra JVM stop itself randomly

2017-10-10 Thread Andrey Lataev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Lataev updated CASSANDRA-13931:
--
Since Version: 3.10  (was: 3.11.0)

> Cassandra JVM stop itself randomly
> --
>
> Key: CASSANDRA-13931
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13931
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: RHEL 7.3
> JDK HotSpot 1.8.0_121-b13
> cassandra-3.11 cluster with 43 nodes in 9 datacenters
> 8vCPU, 32 GB RAM
>Reporter: Andrey Lataev
> Attachments: cassandra-env.sh, cassandra.yaml, 
> system.log.2017-10-01.zip
>
>
> Before I set -XX:MaxDirectMemorySize, I was getting OOM kills at the OS level, like:
> # grep "Out of" /var/log/messages-20170918
> Sep 16 06:54:07 p00skimnosql04 kernel: Out of memory: Kill process 26619 
> (java) score 287 or sacrifice child
> Sep 16 06:54:07 p00skimnosql04 kernel: Out of memory: Kill process 26640 
> (java) score 289 or sacrifice child
> With the -XX:MaxDirectMemorySize=5G limit set, I periodically get:
> HeapUtils.java:136 - Dumping heap to 
> /egov/dumps/cassandra-1506868110-pid11155.hprof
> It seems the JVM kills itself when off-heap memory leaks occur.
> Typical errors in system.log before the JVM begins dumping:
> ERROR [MessagingService-Incoming-/172.20.4.143] 2017-10-01 19:00:36,336 
> CassandraDaemon.java:228 - Exception in thread 
> Thread[MessagingService-Incoming-/172.20.4.143,5,main]
> ERROR [Native-Transport-Requests-139] 2017-10-01 19:04:02,675 
> Message.java:625 - Unexpected exception during request; channel = [id: 
> 0x3c0c1c26, L:/172.20.4.142:9042 - R:/172.20.4.139:44874]
> Full stack traces:
> ERROR [Native-Transport-Requests-139] 2017-10-01 19:04:02,675 
> Message.java:625 - Unexpected exception during request; channel = [id: 
> 0x3c0c1c26, L:/172.20.4.142:9042 -
> R:/172.20.4.139:44874]
> java.lang.AssertionError: null
> at 
> org.apache.cassandra.transport.ServerConnection.applyStateTransition(ServerConnection.java:97)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:521)
>  [apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410)
>  [apache-cassandra-3.11.0.jar:3.11.0]
> at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>  [netty-all-4.0.44.Final.jar:4.0.44.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
>  [netty-all-4.0.44.Final.jar:4.0.44.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35)
>  [netty-all-4.0.44.Final.jar:4.0.44.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:348)
>  [netty-all-4.0.44.Final.jar:4.0.44.Final]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_121]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
>  [apache-cassandra-3.11.0.jar:3.11.0]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
> [apache-cassandra-3.11.0.jar:3.11.0]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
> INFO  [MutationStage-127] 2017-10-01 19:08:24,255 HeapUtils.java:136 - 
> Dumping heap to /egov/dumps/cassandra-1506868110-pid11155.hprof ...
> Heap dump file created
> ERROR [MessagingService-Incoming-/172.20.4.143] 2017-10-01 19:08:33,493 
> CassandraDaemon.java:228 - Exception in thread 
> Thread[MessagingService-Incoming-/172.20.4.143,5,main]
> java.io.IOError: java.io.EOFException: Stream ended prematurely
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:227)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:215)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:839)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:800)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:415)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
>

[jira] [Issue Comment Deleted] (CASSANDRA-13937) Cassandra node's startup time increased after increase count of big tables

2017-10-04 Thread Andrey Lataev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Lataev updated CASSANDRA-13937:
--
Comment: was deleted

(was:  LZ4 compression used:
{code:java}
sys@cqlsh> DESC TABLE p00smevauditbody.messagelogbody20171004;

CREATE TABLE p00smevauditbody.messagelogbody20171004 (
    d timestamp,
    mid text,
    mt text,
    rec_msg text,
    rec_sid text,
    sd_msg text,
    PRIMARY KEY (d, mid, mt)
) WITH CLUSTERING ORDER BY (mid ASC, mt ASC)
    AND bloom_filter_fp_chance = 0.1
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy', 'sstable_size_in_mb': '160'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
{code}

)




[jira] [Commented] (CASSANDRA-13937) Cassandra node's startup time increased after increase count of big tables

2017-10-04 Thread Andrey Lataev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16192062#comment-16192062
 ] 

Andrey Lataev commented on CASSANDRA-13937:
---

 LZ4 compression used:
{code:java}
sys@cqlsh> DESC TABLE p00smevauditbody.messagelogbody20171004;

CREATE TABLE p00smevauditbody.messagelogbody20171004 (
    d timestamp,
    mid text,
    mt text,
    rec_msg text,
    rec_sid text,
    sd_msg text,
    PRIMARY KEY (d, mid, mt)
) WITH CLUSTERING ORDER BY (mid ASC, mt ASC)
    AND bloom_filter_fp_chance = 0.1
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy', 'sstable_size_in_mb': '160'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
{code}






[jira] [Commented] (CASSANDRA-13937) Cassandra node's startup time increased after increase count of big tables

2017-10-04 Thread Andrey Lataev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16192063#comment-16192063
 ] 

Andrey Lataev commented on CASSANDRA-13937:
---

 LZ4 compression used:
{code:java}
sys@cqlsh> DESC TABLE p00smevauditbody.messagelogbody20171004;

CREATE TABLE p00smevauditbody.messagelogbody20171004 (
    d timestamp,
    mid text,
    mt text,
    rec_msg text,
    rec_sid text,
    sd_msg text,
    PRIMARY KEY (d, mid, mt)
) WITH CLUSTERING ORDER BY (mid ASC, mt ASC)
    AND bloom_filter_fp_chance = 0.1
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy', 'sstable_size_in_mb': '160'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
{code}






[jira] [Created] (CASSANDRA-13937) Cassandra node's startup time increased after increase count of big tables

2017-10-04 Thread Andrey Lataev (JIRA)
Andrey Lataev created CASSANDRA-13937:
-

 Summary:  Cassandra node's startup time increased after increase 
count of big tables
 Key: CASSANDRA-13937
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13937
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: RHEL 7.3
JDK HotSpot 1.8.0_121-b13
cassandra-3.11 cluster with 43 nodes in 9 datacenters
8vCPU, 100 GB RAM
Reporter: Andrey Lataev
 Attachments: cassandra.zip, debug.zip

At startup, Cassandra spends a long time reading some big column families.
For example, in debug.log:
{code:java}
grep SSTableReader.java:506 /var/log/cassandra/debug.log
<...> 
DEBUG [SSTableBatchOpen:3] 2017-10-04 22:40:05,297 SSTableReader.java:506 - 
Opening 
/egov/data/cassandra/datafiles1/p00smevauditbody/messagelogbody20171003-b341cc709c7511e7b1cfed1e90eb03dc/mc-45242-big
 (19.280MiB)
DEBUG [SSTableBatchOpen:5] 2017-10-04 22:42:14,188 SSTableReader.java:506 - 
Opening 
/egov/data/cassandra/datafiles1/p00smevauditbody/messagelogbody20171004-f82225509d3e11e7b1cfed1e90eb03dc/mc-49002-big
 (10.607MiB)
<...>
DEBUG [SSTableBatchOpen:4] 2017-10-04 22:42:19,792 SSTableReader.java:506 - 
Opening 
/egov/data/cassandra/datafiles1/p00smevauditbody/messagelogbody20171004-f82225509d3e11e7b1cfed1e90eb03dc/mc-47907-big
 (128.172MiB)
DEBUG [SSTableBatchOpen:1] 2017-10-04 22:44:23,560 SSTableReader.java:506 - 
Opening 
/egov/data/cassandra/datafiles1/pk4smevauditbody/messagelogbody20170324-f918bfa0107b11e7adfc2d0b45a372ac/mc-4-big
 (96.310MiB)
<..>

{code}
Opening each big table in the p00smevauditbody keyspace 
(SSTableReader.java:506) takes about 2 minutes.
I was planning to keep similar tables for a full month,
so it seems Cassandra will need more than 1 hour to start up.
Is it possible to speed up SSTableBatchOpen?
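
To put numbers on the delays described above, the gaps between consecutive "Opening" entries can be measured directly from debug.log. The following is a minimal Python sketch, not part of Cassandra: the helper names (parse_opens, gaps_seconds, estimated_startup_minutes) are hypothetical, the sample holds two adjacent entries copied from the excerpt above, and the count of 30 tables is only an assumption standing in for "a full month". A gap between two log lines shows when the next open was logged; it does not prove which step inside the open was slow, and the entries come from different SSTableBatchOpen threads.

```python
import re
from datetime import datetime

# Matches one "Opening" entry; \s+ also spans the line breaks added by e-mail wrapping.
OPEN_RE = re.compile(
    r"\[SSTableBatchOpen:(\d+)\]\s+"
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"SSTableReader\.java:\d+\s+-\s+Opening\s+(\S+)\s+\(([\d.]+)MiB\)"
)


def parse_opens(text):
    """Return (timestamp, thread, path, size_mib) tuples sorted by timestamp."""
    entries = []
    for thread, stamp, path, size in OPEN_RE.findall(text):
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f")
        entries.append((when, int(thread), path, float(size)))
    return sorted(entries)


def gaps_seconds(entries):
    """Seconds elapsed between each logged open and the next one."""
    return [(b[0] - a[0]).total_seconds() for a, b in zip(entries, entries[1:])]


def estimated_startup_minutes(seconds_per_table, table_count):
    """Naive serial extrapolation: every table costs the same, nothing overlaps."""
    return seconds_per_table * table_count / 60.0


# Two adjacent entries copied from the report above (still wrapped as in the e-mail).
SAMPLE = """
DEBUG [SSTableBatchOpen:3] 2017-10-04 22:40:05,297 SSTableReader.java:506 -
Opening
/egov/data/cassandra/datafiles1/p00smevauditbody/messagelogbody20171003-b341cc709c7511e7b1cfed1e90eb03dc/mc-45242-big
 (19.280MiB)
DEBUG [SSTableBatchOpen:5] 2017-10-04 22:42:14,188 SSTableReader.java:506 -
Opening
/egov/data/cassandra/datafiles1/p00smevauditbody/messagelogbody20171004-f82225509d3e11e7b1cfed1e90eb03dc/mc-49002-big
 (10.607MiB)
"""

if __name__ == "__main__":
    opens = parse_opens(SAMPLE)
    for gap in gaps_seconds(opens):
        print(f"{gap:.3f} s until the next logged open")
    # 30 tables is a hypothetical count for "a full month" of daily tables.
    print(estimated_startup_minutes(120, 30), "minutes")
```

Running the same functions over the whole debug.log (for example parse_opens(open(path).read())) would show whether the long gaps follow the SSTable size or the table, which the short excerpt cannot settle on its own.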






[jira] [Commented] (CASSANDRA-13931) Cassandra JVM stop itself randomly

2017-10-04 Thread Andrey Lataev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191781#comment-16191781
 ] 

Andrey Lataev commented on CASSANDRA-13931:
---

This time I will try setting:
{code:java}
concurrent_reads: 32
concurrent_writes: 64
{code}

and 

{code:java}
MAX_HEAP_SIZE="16G"
JVM_OPTS="$JVM_OPTS -XX:MaxDirectMemorySize=24G"
{code}
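
As a rough sanity check on these values, the two ceilings can be added up and compared with the machine's RAM. The sketch below is only arithmetic under stated assumptions: the helper names (parse_size_gib, memory_budget) are hypothetical, the 32 GB figure is taken from the Environment field of this issue (a related issue for the same cluster lists 100 GB, so the real value may differ), and JVM sizes such as "16G" are treated as GiB. Both settings are upper bounds, not measured usage, and -XX:MaxDirectMemorySize only caps NIO direct buffers, so other native allocations and memory-mapped files are not counted here.

```python
def parse_size_gib(value):
    """Convert a JVM-style size such as '16G' or '512M' to GiB."""
    units = {"K": 1 / (1024 * 1024), "M": 1 / 1024, "G": 1.0, "T": 1024.0}
    value = value.strip().upper()
    return float(value[:-1]) * units[value[-1]]


def memory_budget(max_heap, max_direct, ram_gib):
    """Compare the heap plus direct-memory ceilings with the available RAM."""
    heap = parse_size_gib(max_heap)
    direct = parse_size_gib(max_direct)
    ceiling = heap + direct
    return {
        "heap_gib": heap,
        "direct_gib": direct,
        "ceiling_gib": ceiling,
        # Negative headroom means the two ceilings alone exceed physical RAM.
        "headroom_gib": ram_gib - ceiling,
    }


if __name__ == "__main__":
    # ram_gib=32 is an assumption based on the issue's Environment field.
    print(memory_budget("16G", "24G", ram_gib=32))
```

If the node really has 32 GB, a 16G heap plus a 24G direct-memory ceiling allows up to 40 GiB, so the kernel OOM killer could still fire before the JVM limit is reached; with 100 GB the same settings leave ample headroom.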


> Cassandra JVM stop itself randomly
> --
>
> Key: CASSANDRA-13931
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13931
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
> Environment: RHEL 7.3
> JDK HotSpot 1.8.0_121-b13
> cassandra-3.11 cluster with 43 nodes in 9 datacenters
> 8vCPU, 32 GB RAM
>Reporter: Andrey Lataev
> Attachments: cassandra-env.sh, cassandra.yaml, 
> system.log.2017-10-01.zip
>
>
> Before I set -XX:MaxDirectMemorySize, I was getting OOM kills at the OS level, like:
> # grep "Out of" /var/log/messages-20170918
> Sep 16 06:54:07 p00skimnosql04 kernel: Out of memory: Kill process 26619 
> (java) score 287 or sacrifice child
> Sep 16 06:54:07 p00skimnosql04 kernel: Out of memory: Kill process 26640 
> (java) score 289 or sacrifice child
> With the -XX:MaxDirectMemorySize=5G limit set, I periodically get:
> HeapUtils.java:136 - Dumping heap to 
> /egov/dumps/cassandra-1506868110-pid11155.hprof
> It seems the JVM kills itself when off-heap memory leaks occur.
> Typical errors in system.log before the JVM begins dumping:
> ERROR [MessagingService-Incoming-/172.20.4.143] 2017-10-01 19:00:36,336 
> CassandraDaemon.java:228 - Exception in thread 
> Thread[MessagingService-Incoming-/172.20.4.143,5,main]
> ERROR [Native-Transport-Requests-139] 2017-10-01 19:04:02,675 
> Message.java:625 - Unexpected exception during request; channel = [id: 
> 0x3c0c1c26, L:/172.20.4.142:9042 - R:/172.20.4.139:44874]
> Full stack traces:
> ERROR [Native-Transport-Requests-139] 2017-10-01 19:04:02,675 
> Message.java:625 - Unexpected exception during request; channel = [id: 
> 0x3c0c1c26, L:/172.20.4.142:9042 -
> R:/172.20.4.139:44874]
> java.lang.AssertionError: null
> at 
> org.apache.cassandra.transport.ServerConnection.applyStateTransition(ServerConnection.java:97)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:521)
>  [apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410)
>  [apache-cassandra-3.11.0.jar:3.11.0]
> at 
> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>  [netty-all-4.0.44.Final.jar:4.0.44.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
>  [netty-all-4.0.44.Final.jar:4.0.44.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35)
>  [netty-all-4.0.44.Final.jar:4.0.44.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:348)
>  [netty-all-4.0.44.Final.jar:4.0.44.Final]
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> [na:1.8.0_121]
> at 
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
>  [apache-cassandra-3.11.0.jar:3.11.0]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
> [apache-cassandra-3.11.0.jar:3.11.0]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
> INFO  [MutationStage-127] 2017-10-01 19:08:24,255 HeapUtils.java:136 - 
> Dumping heap to /egov/dumps/cassandra-1506868110-pid11155.hprof ...
> Heap dump file created
> ERROR [MessagingService-Incoming-/172.20.4.143] 2017-10-01 19:08:33,493 
> CassandraDaemon.java:228 - Exception in thread 
> Thread[MessagingService-Incoming-/172.20.4.143,5,main]
> java.io.IOError: java.io.EOFException: Stream ended prematurely
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:227)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:215)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:839)
>  ~[apache-cassandra-3.11.0.jar:3.11.0]
> at 
> 

[jira] [Comment Edited] (CASSANDRA-13931) Cassandra JVM stop itself randomly

2017-10-04 Thread Andrey Lataev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191450#comment-16191450
 ] 

Andrey Lataev edited comment on CASSANDRA-13931 at 10/4/17 3:41 PM:


As you can see in the attached cassandra-env.sh file, the row
{code:java}
JVM_OPTS="$JVM_OPTS -Djdk.nio.maxCachedBufferSize=262144"
{code}

is present.
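
Whether a running JVM actually picked up this option can be checked from its startup arguments. The following is only a minimal sketch and not part of Cassandra: the class name CachedBufferSizeCheck and the helper find are hypothetical names chosen for illustration, and the sketch inspects the JVM it runs in, so checking a Cassandra node this way would require running it inside that process or reading the same data over JMX.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.Optional;

final class CachedBufferSizeCheck {
    static final String PROPERTY = "jdk.nio.maxCachedBufferSize";

    /**
     * Returns the raw value of -Djdk.nio.maxCachedBufferSize=... from a list
     * of JVM arguments, or an empty Optional if the option is absent.
     */
    static Optional<String> find(List<String> jvmArgs) {
        String prefix = "-D" + PROPERTY + "=";
        Optional<String> result = Optional.empty();
        for (String arg : jvmArgs) {
            if (arg.startsWith(prefix)) {
                // Keep the last match (assumption: a repeated -D option overrides earlier ones).
                result = Optional.of(arg.substring(prefix.length()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Arguments the current JVM was started with (excluding the main class and its arguments).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println(find(jvmArgs)
                .map(v -> PROPERTY + " = " + v + " bytes")
                .orElse(PROPERTY + " is not set"));
    }
}
```

Started with the option from cassandra-env.sh above, the sketch would report the value 262144; without it, it reports that the property is not set.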
I will try to add RAM and increase the heap size to 16 GB.
Eclipse Memory Analyzer shows the following top 3 problem suspects for the heap dump:

*Problem Suspect 1*
{code:java}
The thread org.apache.cassandra.net.OutboundTcpConnection @ 0x6cd263100 
MessagingService-Outgoing-p00skimnosql10.00.egov.local/172.20.4.148-Large keeps 
local variables with total size 306 114 312 (13,97%) bytes.

The memory is accumulated in one instance of 
"org.apache.cassandra.net.OutboundTcpConnection" loaded by 
"sun.misc.Launcher$AppClassLoader @ 0x6c000".
{code}

*Problem Suspect 2*

{code:java}
529 instances of "io.netty.util.concurrent.FastThreadLocalThread", loaded by 
"sun.misc.Launcher$AppClassLoader @ 0x6c000" occupy 776 362 840 (35,43%) 
bytes. 

Biggest instances:

•io.netty.util.concurrent.FastThreadLocalThread @ 0x6d719e1e0 
epollEventLoopGroup-2-7 - 156 689 680 (7,15%) bytes. 
•io.netty.util.concurrent.FastThreadLocalThread @ 0x6d719e7e0 
epollEventLoopGroup-2-3 - 125 567 112 (5,73%) bytes. 
•io.netty.util.concurrent.FastThreadLocalThread @ 0x6d719da60 
epollEventLoopGroup-2-12 - 119 599 160 (5,46%) bytes. 
•io.netty.util.concurrent.FastThreadLocalThread @ 0x6ceab17b0 
epollEventLoopGroup-2-1 - 118 469 632 (5,41%) bytes. 
•io.netty.util.concurrent.FastThreadLocalThread @ 0x6d7059b00 ReadStage-151 - 
66 494 040 (3,03%) bytes. 
{code}

*Problem Suspect 3*

{code:java}
126 instances of "byte[]", loaded by "<system class loader>" occupy 268 549 640 
(12,26%) bytes. These instances are referenced from one instance of 
"java.util.HashMap$Node[]", loaded by "<system class loader>"

Keywords
byte[]
java.util.HashMap$Node[]
{code}
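
The byte counts and percentages in the three suspects can be cross-checked with simple arithmetic. The sketch below assumes that each percentage is relative to the total size of the analyzed dump; the class name SuspectMath and its methods are hypothetical and serve only as an illustration.

```java
final class SuspectMath {
    /** Total heap size implied by one suspect: retained bytes divided by its share of the heap. */
    static double impliedHeapBytes(long retainedBytes, double percentOfHeap) {
        return retainedBytes / (percentOfHeap / 100.0);
    }

    /** Sum of the retained sizes of all suspects. */
    static long totalRetained(long[] retainedBytes) {
        long sum = 0;
        for (long b : retainedBytes) {
            sum += b;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Values copied from the report above (the report uses a decimal comma: 13,97% is 13.97).
        long[] bytes = {306_114_312L, 776_362_840L, 268_549_640L};
        double[] percent = {13.97, 35.43, 12.26};
        for (int i = 0; i < bytes.length; i++) {
            System.out.printf("Suspect %d implies a heap of about %.2f GiB%n",
                    i + 1, impliedHeapBytes(bytes[i], percent[i]) / (1L << 30));
        }
        System.out.printf("All three suspects retain %d bytes in total%n", totalRetained(bytes));
    }
}
```

Under that assumption all three suspects point to the same dump size of roughly 2 GiB, and together they account for 13.97 + 35.43 + 12.26 = 61.66% of it.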





[jira] [Comment Edited] (CASSANDRA-13931) Cassandra JVM stop itself randomly

2017-10-04 Thread Andrey Lataev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191450#comment-16191450
 ] 

Andrey Lataev edited comment on CASSANDRA-13931 at 10/4/17 3:40 PM:


As you can see in the attached cassandra-env.sh file, the row
{code:java}
JVM_OPTS="$JVM_OPTS -Djdk.nio.maxCachedBufferSize=262144"
{code}

is present.
I will try to add RAM and increase the heap size to 16 GB.
Eclipse Memory Analyzer shows the following top 3 problem suspects for the heap dump:

*Problem Suspect 1*
{code:java}
The thread org.apache.cassandra.net.OutboundTcpConnection @ 0x6cd263100 
MessagingService-Outgoing-p00skimnosql10.00.egov.local/172.20.4.148-Large keeps 
local variables with total size 306 114 312 (13,97%) bytes.

The memory is accumulated in one instance of 
"org.apache.cassandra.net.OutboundTcpConnection" loaded by 
"sun.misc.Launcher$AppClassLoader @ 0x6c000".
{code}

*Problem Suspect 2*

{code:java}
529 instances of "io.netty.util.concurrent.FastThreadLocalThread", loaded by 
"sun.misc.Launcher$AppClassLoader @ 0x6c000" occupy 776 362 840 (35,43%) 
bytes. 

Biggest instances:

•io.netty.util.concurrent.FastThreadLocalThread @ 0x6d719e1e0 
epollEventLoopGroup-2-7 - 156 689 680 (7,15%) bytes. 
•io.netty.util.concurrent.FastThreadLocalThread @ 0x6d719e7e0 
epollEventLoopGroup-2-3 - 125 567 112 (5,73%) bytes. 
•io.netty.util.concurrent.FastThreadLocalThread @ 0x6d719da60 
epollEventLoopGroup-2-12 - 119 599 160 (5,46%) bytes. 
•io.netty.util.concurrent.FastThreadLocalThread @ 0x6ceab17b0 
epollEventLoopGroup-2-1 - 118 469 632 (5,41%) bytes. 
•io.netty.util.concurrent.FastThreadLocalThread @ 0x6d7059b00 ReadStage-151 - 
66 494 040 (3,03%) bytes. 
{code}

*Problem Suspect 3*

{code:java}
126 instances of "byte[]", loaded by "<system class loader>" occupy 268 549 640 
(12,26%) bytes. These instances are referenced from one instance of 
"java.util.HashMap$Node[]", loaded by "<system class loader>"

Keywords
byte[]
java.util.HashMap$Node[]
{code}





[jira] [Commented] (CASSANDRA-13931) Cassandra JVM stop itself randomly

2017-10-04 Thread Andrey Lataev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16191450#comment-16191450
 ] 

Andrey Lataev commented on CASSANDRA-13931:
---

As you can see in the attached cassandra-env.sh file, the row
{code:java}
JVM_OPTS="$JVM_OPTS -Djdk.nio.maxCachedBufferSize=262144"
{code}

is present.
I will try to add RAM and increase the heap size to 16 GB.
Eclipse Memory Analyzer shows the following top 3 problem suspects for the heap dump:

*Problem Suspect 1*
{code:java}
The thread org.apache.cassandra.net.OutboundTcpConnection @ 0x6cd263100 
MessagingService-Outgoing-p00skimnosql10.00.egov.local/172.20.4.148-Large keeps 
local variables with total size 306 114 312 (13,97%) bytes.

The memory is accumulated in one instance of 
"org.apache.cassandra.net.OutboundTcpConnection" loaded by 
"sun.misc.Launcher$AppClassLoader @ 0x6c000".
{code}

*Problem Suspect 2*

{code:java}
529 instances of "io.netty.util.concurrent.FastThreadLocalThread", loaded by 
"sun.misc.Launcher$AppClassLoader @ 0x6c000" occupy 776 362 840 (35,43%) 
bytes. 

Biggest instances:

•io.netty.util.concurrent.FastThreadLocalThread @ 0x6d719e1e0 
epollEventLoopGroup-2-7 - 156 689 680 (7,15%) bytes. 
•io.netty.util.concurrent.FastThreadLocalThread @ 0x6d719e7e0 
epollEventLoopGroup-2-3 - 125 567 112 (5,73%) bytes. 
•io.netty.util.concurrent.FastThreadLocalThread @ 0x6d719da60 
epollEventLoopGroup-2-12 - 119 599 160 (5,46%) bytes. 
•io.netty.util.concurrent.FastThreadLocalThread @ 0x6ceab17b0 
epollEventLoopGroup-2-1 - 118 469 632 (5,41%) bytes. 
•io.netty.util.concurrent.FastThreadLocalThread @ 0x6d7059b00 ReadStage-151 - 
66 494 040 (3,03%) bytes. 
{code}

*Problem Suspect 3*

{code:java}
126 instances of "byte[]", loaded by "<system class loader>" occupy 268 549 640 
(12,26%) bytes. These instances are referenced from one instance of 
"java.util.HashMap$Node[]", loaded by "<system class loader>"

Keywords
byte[]
java.util.HashMap$Node[]
{code}




[jira] [Commented] (CASSANDRA-13931) Cassandra JVM stop itself randomly

2017-10-03 Thread Andrey Lataev (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16190166#comment-16190166
 ] 

Andrey Lataev commented on CASSANDRA-13931:
---

Also, I can attach a JVM heap dump if it helps.


[jira] [Updated] (CASSANDRA-13931) Cassandra JVM stop itself randomly

2017-10-03 Thread Andrey Lataev (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Lataev updated CASSANDRA-13931:
--
Attachment: system.log.2017-10-01.zip


[jira] [Created] (CASSANDRA-13931) Cassandra JVM stop itself randomly

2017-10-03 Thread Andrey Lataev (JIRA)
Andrey Lataev created CASSANDRA-13931:
-

 Summary: Cassandra JVM stop itself randomly
 Key: CASSANDRA-13931
 URL: https://issues.apache.org/jira/browse/CASSANDRA-13931
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: RHEL 7.3
JDK HotSpot 1.8.0_121-b13
cassandra-3.11 cluster with 43 nodes in 9 datacenters
8vCPU, 32 GB RAM
Reporter: Andrey Lataev
 Attachments: cassandra-env.sh, cassandra.yaml

Before I set -XX:MaxDirectMemorySize, I received OOM kills at the OS level, like:

# # grep "Out of" /var/log/messages-20170918
Sep 16 06:54:07 p00skimnosql04 kernel: Out of memory: Kill process 26619 (java) 
score 287 or sacrifice child
Sep 16 06:54:07 p00skimnosql04 kernel: Out of memory: Kill process 26640 (java) 
score 289 or sacrifice child
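
Kernel messages like the ones above can be reduced to the killed PID, the command name, and the OOM score. This is a minimal sketch that assumes the exact message format shown here (the wording differs between kernel versions) and expects each log entry on a single, unwrapped line; OomKillLine is a hypothetical helper class, not an existing tool.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class OomKillLine {
    // Matches only the message format quoted above; other kernel versions word it differently.
    private static final Pattern KILL = Pattern.compile(
            "Out of memory: Kill process (\\d+) \\(([^)]+)\\) score (\\d+)");

    final int pid;
    final String command;
    final int score;

    private OomKillLine(int pid, String command, int score) {
        this.pid = pid;
        this.command = command;
        this.score = score;
    }

    /** Parses one unwrapped syslog line; returns null if it is not an OOM-kill message. */
    static OomKillLine parse(String line) {
        Matcher m = KILL.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new OomKillLine(
                Integer.parseInt(m.group(1)), m.group(2), Integer.parseInt(m.group(3)));
    }

    public static void main(String[] args) {
        String line = "Sep 16 06:54:07 p00skimnosql04 kernel: Out of memory: "
                + "Kill process 26619 (java) score 287 or sacrifice child";
        OomKillLine kill = parse(line);
        System.out.println("pid=" + kill.pid + " command=" + kill.command + " score=" + kill.score);
    }
}
```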

With the -XX:MaxDirectMemorySize=5G limit set, I periodically receive:
HeapUtils.java:136 - Dumping heap to 
/egov/dumps/cassandra-1506868110-pid11155.hprof
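
How much direct buffer memory a JVM currently holds can be read from the standard buffer pool MXBeans. The sketch below is an illustration only: DirectMemoryProbe is a hypothetical class, and the "direct" pool counts only buffers allocated through ByteBuffer.allocateDirect, so native memory obtained in other ways would not show up in it.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

final class DirectMemoryProbe {
    /** Bytes currently used by the JVM's "direct" buffer pool, or -1 if no such pool is reported. */
    static long directBytesUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        long before = directBytesUsed();
        // Allocate 1 MiB outside the Java heap; this counts against -XX:MaxDirectMemorySize.
        ByteBuffer buffer = ByteBuffer.allocateDirect(1 << 20);
        long after = directBytesUsed();
        System.out.println("direct pool: " + before + " -> " + after
                + " bytes (buffer capacity " + buffer.capacity() + ")");
    }
}
```

Sampling this value periodically on a node, for example over JMX, would show whether direct buffer usage grows toward the configured 5G limit before a heap dump is triggered.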

It seems the JVM kills itself when off-heap memory leaks occur.
Typical errors in system.log before the JVM begins dumping:

ERROR [MessagingService-Incoming-/172.20.4.143] 2017-10-01 19:00:36,336 
CassandraDaemon.java:228 - Exception in thread 
Thread[MessagingService-Incoming-/172.20.4.143,5,main]
ERROR [Native-Transport-Requests-139] 2017-10-01 19:04:02,675 Message.java:625 
- Unexpected exception during request; channel = [id: 0x3c0c1c26, 
L:/172.20.4.142:9042 - R:/172.20.4.139:44874]

Full stack traces:

ERROR [Native-Transport-Requests-139] 2017-10-01 19:04:02,675 Message.java:625 
- Unexpected exception during request; channel = [id: 0x3c0c1c26, 
L:/172.20.4.142:9042 -
R:/172.20.4.139:44874]
java.lang.AssertionError: null
at 
org.apache.cassandra.transport.ServerConnection.applyStateTransition(ServerConnection.java:97)
 ~[apache-cassandra-3.11.0.jar:3.11.0]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:521)
 [apache-cassandra-3.11.0.jar:3.11.0]
at 
org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410)
 [apache-cassandra-3.11.0.jar:3.11.0]
at 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
 [netty-all-4.0.44.Final.jar:4.0.44.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
 [netty-all-4.0.44.Final.jar:4.0.44.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35)
 [netty-all-4.0.44.Final.jar:4.0.44.Final]
at 
io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:348)
 [netty-all-4.0.44.Final.jar:4.0.44.Final]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[na:1.8.0_121]
at 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
 [apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) 
[apache-cassandra-3.11.0.jar:3.11.0]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]



INFO  [MutationStage-127] 2017-10-01 19:08:24,255 HeapUtils.java:136 - Dumping 
heap to /egov/dumps/cassandra-1506868110-pid11155.hprof ...
Heap dump file created



ERROR [MessagingService-Incoming-/172.20.4.143] 2017-10-01 19:08:33,493 
CassandraDaemon.java:228 - Exception in thread 
Thread[MessagingService-Incoming-/172.20.4.143,5,main]
java.io.IOError: java.io.EOFException: Stream ended prematurely
at 
org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:227)
 ~[apache-cassandra-3.11.0.jar:3.11.0]
at 
org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer$1.computeNext(UnfilteredRowIteratorSerializer.java:215)
 ~[apache-cassandra-3.11.0.jar:3.11.0]
at 
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
~[apache-cassandra-3.11.0.jar:3.11.0]
at 
org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:839)
 ~[apache-cassandra-3.11.0.jar:3.11.0]
at 
org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:800)
 ~[apache-cassandra-3.11.0.jar:3.11.0]
at 
org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:415)
 ~[apache-cassandra-3.11.0.jar:3.11.0]
at 
org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:434)
 ~[apache-cassandra-3.11.0.jar:3.11.0]
at 
org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:371)
 ~[apache-cassandra-3.11.0.jar:3.11.0]
at org.apache.cassandra.net.MessageIn.read(MessageIn.java:123) 
~[apache-cassandra-3.11.0.jar:3.11.0]