Tom van der Woerdt created CASSANDRA-13469:
----------------------------------------------

             Summary: Compaction stops; new sstables aren't recognized
                 Key: CASSANDRA-13469
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13469
             Project: Cassandra
          Issue Type: Bug
            Reporter: Tom van der Woerdt


cfstats shows me this, so I know I'm in trouble :

{code}
SSTable count: 3419
SSTables in each level: [2, 20/10, 154/100, 1034/1000, 0, 0, 0, 0, 0]
{code}

Disk space on these nodes (about five of them) doubled overnight, due to the high write volume not getting compacted. `nodetool compactionstats' shows me zero pending compactions.

After a restart things are "better" :

{noformat}
SSTable count: 3432
SSTables in each level: [1280/4, 21/10, 262/100, 1851/1000, 15, 0, 0, 0, 0]
{noformat}

compactionstats also shows some compactions again.

jstack for one compaction thread (there are 24 of them, all looking the exact same) :

{noformat}
"CompactionExecutor:73017" #203194 daemon prio=1 os_prio=4 tid=0x00007edd1e597c00 nid=0xf7e4 waiting on condition [0x00007efd16782000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000002c1859520> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
        at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:79)
        at org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$3/335359181.run(Unknown Source)
        at java.lang.Thread.run(Thread.java:745)
{noformat}

Sadly I'm pretty clueless about what
happened here. :(

Oh, and tpstats :

{noformat}
Pool Name                   Active   Pending   Completed   Blocked  All time blocked
MutationStage                    0         0  1098466484         0                 0
ViewMutationStage                0         0           0         0                 0
ReadStage                        2         0   759723051         0                 0
RequestResponseStage             0         0  1773795694         0                 0
ReadRepairStage                  0         0    37372223         0                 0
CounterMutationStage             0         0           0         0                 0
MiscStage                        0         0           0         0                 0
CompactionExecutor               0         0      336794         0                 0
MemtableReclaimMemory            0         0       75324         0                 0
PendingRangeCalculator           0         0          68         0                 0
GossipStage                      0         0     1363346         0                 0
SecondaryIndexManagement         0         0           0         0                 0
HintsDispatcher                  1         1          38         0                 0
MigrationStage                   0         0      274693         0                 0
MemtablePostFlush                0         0      248358         0                 0
ValidationExecutor               0         0        7882         0                 0
Sampler                          0         0           0         0                 0
MemtableFlushWriter              0         0       75324         0                 0
InternalResponseStage            0         0      259871         0                 0
AntiEntropyStage                 0         0       20469         0                 0
CacheCleanupExecutor             0         0           0         0                 0
Native-Transport-Requests        0         0  1489435995         0            188985

Message type           Dropped
READ                         2
RANGE_SLICE                  0
_TRACE                       0
HINT                         0
MUTATION                    20
COUNTER_MUTATION             0
BATCH_STORE                  0
BATCH_REMOVE                 0
REQUEST_RESPONSE             0
PAGED_RANGE                  0
READ_REPAIR                  4
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
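
For anyone comparing the two cfstats snippets above, here is a minimal Python sketch that pulls the per-level counts out of an "SSTables in each level" line and lists the levels that are over their target. It assumes that an entry written as `count/limit` means the level holds `count` sstables against a target of `limit`, and that a bare number means the level is within its target. The helper names `parse_levels` and `over_limit` are made up for this illustration and are not part of nodetool or Cassandra.

```python
import re


def parse_levels(line):
    """Parse an 'SSTables in each level: [...]' line into (count, limit) pairs.

    limit is None when the entry has no '/limit' suffix.
    """
    match = re.search(r"\[(.*?)\]", line)
    if match is None:
        raise ValueError("no level list found in line")
    levels = []
    for entry in match.group(1).split(","):
        # "20/10" -> count "20", limit "10"; "15" -> count "15", limit ""
        count, _, limit = entry.strip().partition("/")
        levels.append((int(count), int(limit) if limit else None))
    return levels


def over_limit(levels):
    """Return (level, count, limit) for every level above its printed limit."""
    return [
        (level, count, limit)
        for level, (count, limit) in enumerate(levels)
        if limit is not None and count > limit
    ]


before = "SSTables in each level: [2, 20/10, 154/100, 1034/1000, 0, 0, 0, 0, 0]"
after = "SSTables in each level: [1280/4, 21/10, 262/100, 1851/1000, 15, 0, 0, 0, 0]"

for label, line in (("before restart", before), ("after restart", after)):
    for level, count, limit in over_limit(parse_levels(line)):
        print(f"{label}: L{level} has {count} sstables, target {limit}")
```

Run against the two lines quoted in this report, it flags L1 to L3 before the restart and L0 to L3 afterwards, which matches the observation that L0 only shows its backlog (1280/4) once the node has been restarted.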