Rocco Varela created CASSANDRA-9021:
---------------------------------------
Summary: AssertionError and Leak detected during sstable compaction
Key: CASSANDRA-9021
URL: https://issues.apache.org/jira/browse/CASSANDRA-9021
Project: Cassandra
Issue Type: Bug
Environment: Cluster setup:
- 20-node stress cluster, GCE n1-standard-2
- 10-node receiver cluster ingesting data, GCE n1-standard-8
Versions:
- DSE 4.7.0
- Cassandra 2.1.3.304
DSE Configuration:
- Xms7540M
- Xmx7540M
- Xmn800M
- Ddse.system_cpu_cores=8 -Ddse.system_memory_in_mb=30161
- Dcassandra.config.loader=com.datastax.bdp.config.DseConfigurationLoader
- ea -javaagent:/usr/local/lib/dse/resources/cassandra/lib/jamm-0.3.0.jar
- XX:+CMSClassUnloadingEnabled -XX:+UseThreadPriorities
- XX:ThreadPriorityPolicy=42 -Xms7540M -Xmx7540M -Xmn800M
- XX:+HeapDumpOnOutOfMemoryError -Xss256k
- XX:StringTableSize=1000003 -XX:+UseParNewGC
- XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
- XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1
- XX:CMSInitiatingOccupancyFraction=75
- XX:+UseCMSInitiatingOccupancyOnly -XX:+UseTLAB
Reporter: Rocco Varela
Assignee: Benedict
Attachments: system.log
After ~3 hours of data ingestion we see assertion errors and 'LEAK DETECTED'
errors during what looks like sstable compaction.
system.log snippets (full log attached):
{code}
...
INFO [CompactionExecutor:12] 2015-03-23 02:45:51,770 CompactionTask.java:267
- Compacted 4 sstables to [/mnt/cass_data_disks/data1/requests_ks/timeline-
9500fe40d0f611e495675d5ea01541b5/requests_ks-timeline-ka-185,]. 65,916,594
bytes to 66,159,512 (~100% of original) in 26,554ms = 2.376087MB/s. 983 total
partitions merged to 805. Partition merge counts were {1:627, 2:178, }
INFO [CompactionExecutor:11] 2015-03-23 02:45:51,837 CompactionTask.java:267
- Compacted 4 sstables to [/mnt/cass_data_disks/data1/system/
compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-119,].
426 bytes to 42 (~9% of original) in 82ms = 0.000488MB/s. 5 total
partitions merged to 1. Partition merge counts were {1:1, 2:2, }
ERROR [NonPeriodicTasks:1] 2015-03-23 02:45:52,251 CassandraDaemon.java:167 -
Exception in thread Thread[NonPeriodicTasks:1,5,main]
java.lang.AssertionError: null
at
org.apache.cassandra.io.compress.CompressionMetadata$Chunk.<init>(CompressionMetadata.java:438)
~[cassandra-all-2.1.3.304.jar:2.1.3.304]
at
org.apache.cassandra.io.compress.CompressionMetadata.chunkFor(CompressionMetadata.java:228)
~[cassandra-all-2.1.3.304.jar:2.1.3.304]
at
org.apache.cassandra.io.util.CompressedPoolingSegmentedFile.dropPageCache(CompressedPoolingSegmentedFile.java:80)
~[cassandra-all-2.1.3.304.jar:2.1.3.304]
at org.apache.cassandra.io.sstable.SSTableReader$6.run(SSTableReader.java:923)
~[cassandra-all-2.1.3.304.jar:2.1.3.304]
at
org.apache.cassandra.io.sstable.SSTableReader$InstanceTidier$1.run(SSTableReader.java:2036)
~[cassandra-all-2.1.3.304.jar:2.1.3.304]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
~[na:1.7.0_45]
at java.util.concurrent.FutureTask.run(FutureTask.java:262) ~[na:1.7.0_45]
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
~[na:1.7.0_45]
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
~[na:1.7.0_45]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
~[na:1.7.0_45]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
[na:1.7.0_45]
at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
...
INFO [MemtableFlushWriter:50] 2015-03-23 02:47:29,465 Memtable.java:378 -
Completed flushing /mnt/cass_data_disks/data1/requests_ks/timeline-
9500fe40d0f611e495675d5ea01541b5/requests_ks-timeline-ka-188-Data.db
(16311981 bytes) for commitlog position ReplayPosition(segmentId=1427071574495,
position=4523631)
ERROR [Reference-Reaper:1] 2015-03-23 02:47:33,987 Ref.java:181 - LEAK
DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@2f59b10)
to class
org.apache.cassandra.io.sstable.SSTableReader$DescriptorTypeTidy@1251424500:/mnt/cass_data_disks/data1/requests_ks/timeline-9500fe40d0f611e495675d5ea01541b5/
requests_ks-timeline-ka-149 was not released before the reference was
garbage collected
INFO [Service Thread] 2015-03-23 02:47:40,158 GCInspector.java:142 -
ConcurrentMarkSweep GC in 12247ms. CMS Old Gen: 5318987136 -> 457655168; CMS
Perm Gen: 44731264 -> 44699160; Par Eden Space: 8597912 -> 418006664; Par
Survivor Space: 71865728 -> 59679584
...
{code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)