Rafael Harutyunyan created CASSANDRA-10871:
----------------------------------------------
Summary: MemtableFlushWriter blocks and no flushing happens
Key: CASSANDRA-10871
URL: https://issues.apache.org/jira/browse/CASSANDRA-10871
Project: Cassandra
Issue Type: Bug
Components: Compaction, Local Write-Read Paths
Environment: Linux cassandra1 2.6.32-573.3.1.el6.x86_64 #1 SMP Thu Aug
13 22:55:16 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Rafael Harutyunyan
Priority: Critical
Fix For: 2.1.11
Attachments: full_thread_dump.txt
After some time the MemtableFlushWriter threads block. As a result, the
FlushWriterQueue fills up first, and then the MutationStage queue fills up.
After this, one of two things happens: either Cassandra drops the queued
mutations and everything returns to normal, or it shuts down with insufficient
heap space.
Here is the thread dump:
{noformat}
"MemtableFlushWriter:3" - Thread t@2610
java.lang.Thread.State: BLOCKED
    at org.apache.cassandra.db.compaction.WrappingCompactionStrategy.handleNotification(WrappingCompactionStrategy.java:250)
    - waiting to lock <f9dab27> (a org.apache.cassandra.db.compaction.WrappingCompactionStrategy) owned by "CompactionExecutor:51" t@2638
    at org.apache.cassandra.db.DataTracker.notifyAdded(DataTracker.java:518)
    at org.apache.cassandra.db.DataTracker.replaceFlushed(DataTracker.java:178)
    at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.replaceFlushed(AbstractCompactionStrategy.java:234)
    at org.apache.cassandra.db.ColumnFamilyStore.replaceFlushed(ColumnFamilyStore.java:1502)
    at org.apache.cassandra.db.Memtable$FlushRunnable.runMayThrow(Memtable.java:336)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
    at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1115)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Locked ownable synchronizers:
    - locked <7ef8cd1b> (a java.util.concurrent.ThreadPoolExecutor$Worker)

"MemtableFlushWriter:4" - Thread t@2616
java.lang.Thread.State: BLOCKED
    at org.apache.cassandra.db.compaction.WrappingCompactionStrategy.handleNotification(WrappingCompactionStrategy.java:250)
    - waiting to lock <f9dab27> (a org.apache.cassandra.db.compaction.WrappingCompactionStrategy) owned by "CompactionExecutor:51" t@2638
    at org.apache.cassandra.db.DataTracker.notifyAdded(DataTracker.java:518)
    at org.apache.cassandra.db.DataTracker.replaceFlushed(DataTracker.java:178)
    at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.replaceFlushed(AbstractCompactionStrategy.java:234)
    at org.apache.cassandra.db.ColumnFamilyStore.replaceFlushed(ColumnFamilyStore.java:1502)
    at org.apache.cassandra.db.Memtable$FlushRunnable.runMayThrow(Memtable.java:336)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
    at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1115)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Locked ownable synchronizers:
    - locked <2f842d9b> (a java.util.concurrent.ThreadPoolExecutor$Worker)
{noformat}
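For anyone going through the attached full_thread_dump.txt, a rough Python sketch like the following could summarize which thread owns the monitor that the BLOCKED threads are waiting on. It assumes the dump uses the same layout as the excerpt above (header lines of the form "name" - Thread t@N and "waiting to lock <id> ... owned by" entries); the function name blocked_by_owner and the SAMPLE text are illustrative only, not part of Cassandra or any JDK tool.

```python
import re
from collections import defaultdict

# Thread header as it appears in this report, e.g.
#   "MemtableFlushWriter:3" - Thread t@2610
THREAD_HEADER = re.compile(r'^"([^"]+)" - Thread t@\d+', re.MULTILINE)
# Lock id and owner name; DOTALL and \s+ tolerate entries wrapped across lines.
WAITING = re.compile(
    r'waiting to lock <([0-9a-f]+)>.*?owned by\s+"([^"]+)"', re.DOTALL
)


def blocked_by_owner(dump):
    """Map each lock-owning thread to the names of threads blocked on it."""
    owners = defaultdict(list)
    headers = list(THREAD_HEADER.finditer(dump))
    for i, header in enumerate(headers):
        # A thread's section runs until the next header (or end of dump).
        end = headers[i + 1].start() if i + 1 < len(headers) else len(dump)
        body = dump[header.end():end]
        if "Thread.State: BLOCKED" not in body:
            continue
        match = WAITING.search(body)
        if match:
            owners[match.group(2)].append(header.group(1))
    return dict(owners)


# Shortened sample based on the excerpt above (assumed layout).
SAMPLE = '''"MemtableFlushWriter:3" - Thread t@2610
java.lang.Thread.State: BLOCKED
- waiting to lock <f9dab27> (a
org.apache.cassandra.db.compaction.WrappingCompactionStrategy) owned by
"CompactionExecutor:51" t@2638
"MemtableFlushWriter:4" - Thread t@2616
java.lang.Thread.State: BLOCKED
- waiting to lock <f9dab27> (a
org.apache.cassandra.db.compaction.WrappingCompactionStrategy) owned by
"CompactionExecutor:51" t@2638
'''

if __name__ == "__main__":
    for owner, waiters in blocked_by_owner(SAMPLE).items():
        print(owner, "blocks", waiters)
```

Running the same function over the full dump would show whether every blocked flush writer is waiting on the same compaction thread, which is the thread whose stack is worth inspecting next.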
And here is the tpstats output:
{noformat}
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
CounterMutationStage              0         0              0         0                 0
ReadStage                         0         0             28         0                 0
RequestResponseStage              0         0        2020253         0                 0
MutationStage                    32     63221       27858588         0                 0
ReadRepairStage                   0         0              0         0                 0
GossipStage                       0         0          16430         0                 0
CacheCleanupExecutor              0         0              0         0                 0
AntiEntropyStage                  0         0           3008         0                 0
MigrationStage                    0         0              0         0                 0
Sampler                           0         0              0         0                 0
ValidationExecutor                0         0           1500         0                 0
CommitLogArchiver                 0         0              0         0                 0
MiscStage                         0         0              0         0                 0
MemtableFlushWriter               2       220           3531         0                 0
MemtableReclaimMemory             0         0           4277         0                 0
PendingRangeCalculator            0         0             22         0                 0
MemtablePostFlush                 1       306           5186         0                 0
CompactionExecutor               36       142           5326         0                 0
InternalResponseStage             0         0              0         0                 0
HintedHandoff                     0         0             13         0                 0

Message type           Dropped
RANGE_SLICE                  0
READ_REPAIR                  0
PAGED_RANGE                  0
BINARY                       0
READ                         0
MUTATION                220352
_TRACE                       0
REQUEST_RESPONSE             0
COUNTER_MUTATION             0
{noformat}
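To make the backlog in the output above easier to spot, here is a small Python sketch that pulls out the pools with pending tasks and sorts them by queue depth. It assumes the five-column pool layout shown above (Active, Pending, Completed, Blocked, All time blocked); the name backed_up_pools and the min_pending parameter are hypothetical helpers, not nodetool options.

```python
import re

# A pool row is a name followed by five integer columns:
# Active, Pending, Completed, Blocked, All time blocked.
# \s+ also matches newlines, so a wrapped last column is still captured.
POOL_ROW = re.compile(
    r'^([A-Za-z]+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$',
    re.MULTILINE,
)


def backed_up_pools(tpstats, min_pending=1):
    """Return (pool, active, pending) tuples, largest backlog first."""
    rows = []
    for name, active, pending, *_ in POOL_ROW.findall(tpstats):
        if int(pending) >= min_pending:
            rows.append((name, int(active), int(pending)))
    return sorted(rows, key=lambda row: row[2], reverse=True)


# A few rows copied from the output above; one row is left wrapped
# on purpose to show that the pattern tolerates it.
SAMPLE = """\
Pool Name Active Pending Completed Blocked All time blocked
ReadStage 0 0 28 0 0
MutationStage 32 63221 27858588 0 0
MemtableFlushWriter 2 220 3531 0
0
MemtablePostFlush 1 306 5186 0 0
CompactionExecutor 36 142 5326 0 0
Message type Dropped
MUTATION 220352
"""

if __name__ == "__main__":
    for name, active, pending in backed_up_pools(SAMPLE):
        print(f"{name}: active={active} pending={pending}")
```

On the numbers in this report, MutationStage has by far the largest backlog, followed by MemtablePostFlush, MemtableFlushWriter and CompactionExecutor, which matches the cascade described above.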
cfstats reports more than 12,000 SSTables.