Rafael Harutyunyan created CASSANDRA-10871:
----------------------------------------------

             Summary: MemtableFlushWriter blocks and no flushing happens
                 Key: CASSANDRA-10871
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10871
             Project: Cassandra
          Issue Type: Bug
          Components: Compaction, Local Write-Read Paths
         Environment: Linux cassandra1 2.6.32-573.3.1.el6.x86_64 #1 SMP Thu Aug 13 22:55:16 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
            Reporter: Rafael Harutyunyan
            Priority: Critical
             Fix For: 2.1.11
         Attachments: full_thread_dump.txt

After some time the MemtableFlushWriter threads block. The flush writer queue 
fills up first, and then the MutationStage queue fills up as well. After this, 
one of two things happens: either Cassandra drops the queued mutations and 
everything returns to normal, or it runs out of heap space and shuts down.
Here is the thread dump for the blocked flush writers. Both are waiting to lock 
the same WrappingCompactionStrategy instance, which is owned by 
"CompactionExecutor:51".
{noformat}

"MemtableFlushWriter:3" - Thread t@2610
   java.lang.Thread.State: BLOCKED
        at org.apache.cassandra.db.compaction.WrappingCompactionStrategy.handleNotification(WrappingCompactionStrategy.java:250)
        - waiting to lock <f9dab27> (a org.apache.cassandra.db.compaction.WrappingCompactionStrategy) owned by "CompactionExecutor:51" t@2638
        at org.apache.cassandra.db.DataTracker.notifyAdded(DataTracker.java:518)
        at org.apache.cassandra.db.DataTracker.replaceFlushed(DataTracker.java:178)
        at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.replaceFlushed(AbstractCompactionStrategy.java:234)
        at org.apache.cassandra.db.ColumnFamilyStore.replaceFlushed(ColumnFamilyStore.java:1502)
        at org.apache.cassandra.db.Memtable$FlushRunnable.runMayThrow(Memtable.java:336)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
        at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
        at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1115)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
        - locked <7ef8cd1b> (a java.util.concurrent.ThreadPoolExecutor$Worker)

"MemtableFlushWriter:4" - Thread t@2616
   java.lang.Thread.State: BLOCKED
        at org.apache.cassandra.db.compaction.WrappingCompactionStrategy.handleNotification(WrappingCompactionStrategy.java:250)
        - waiting to lock <f9dab27> (a org.apache.cassandra.db.compaction.WrappingCompactionStrategy) owned by "CompactionExecutor:51" t@2638
        at org.apache.cassandra.db.DataTracker.notifyAdded(DataTracker.java:518)
        at org.apache.cassandra.db.DataTracker.replaceFlushed(DataTracker.java:178)
        at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.replaceFlushed(AbstractCompactionStrategy.java:234)
        at org.apache.cassandra.db.ColumnFamilyStore.replaceFlushed(ColumnFamilyStore.java:1502)
        at org.apache.cassandra.db.Memtable$FlushRunnable.runMayThrow(Memtable.java:336)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
        at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
        at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1115)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
        - locked <2f842d9b> (a java.util.concurrent.ThreadPoolExecutor$Worker)
{noformat}
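
The pattern in the dump above (threads in state BLOCKED, each waiting on a 
monitor owned by another named thread) can also be found from inside a JVM 
using the standard java.lang.management API. The following is only a minimal 
sketch, not Cassandra code: the class name BlockedThreadReport and the thread 
names "demo-compaction" and "demo-flush-writer" are hypothetical, chosen to 
mimic the roles seen in the dump.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical helper, not part of Cassandra.
final class BlockedThreadReport {

    /** Returns one line per thread that is BLOCKED on a monitor, naming the lock owner. */
    static List<String> blockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> report = new ArrayList<>();
        // Lock name and owner of the contended monitor are reported even
        // without requesting the full locked-monitor lists.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info.getThreadState() == Thread.State.BLOCKED) {
                report.add(info.getThreadName() + " waiting on " + info.getLockName()
                        + " owned by " + info.getLockOwnerName());
            }
        }
        return report;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Reproduces the contention in miniature: one thread holds a monitor, another blocks on it. */
    static List<String> demo() {
        final Object lock = new Object();
        final CountDownLatch held = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);

        Thread owner = new Thread(() -> {
            synchronized (lock) {
                held.countDown();      // signal that the monitor is now held
                awaitQuietly(release); // keep holding it until the report is taken
            }
        }, "demo-compaction");

        Thread waiter = new Thread(() -> {
            awaitQuietly(held);
            synchronized (lock) { }    // blocks until the owner releases the monitor
        }, "demo-flush-writer");

        owner.start();
        waiter.start();
        // Spin until the waiter is actually parked on the monitor.
        while (waiter.getState() != Thread.State.BLOCKED) {
            Thread.yield();
        }
        List<String> report = blockedThreads();
        release.countDown();           // let both threads finish
        return report;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Running main prints one line for the blocked demo thread together with the 
name of the thread that owns the lock, which is the same information as the 
"waiting to lock ... owned by ..." lines in the dump.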

And here is the tpstats output:
{noformat}
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
CounterMutationStage              0         0              0         0                 0
ReadStage                         0         0             28         0                 0
RequestResponseStage              0         0        2020253         0                 0
MutationStage                    32     63221       27858588         0                 0
ReadRepairStage                   0         0              0         0                 0
GossipStage                       0         0          16430         0                 0
CacheCleanupExecutor              0         0              0         0                 0
AntiEntropyStage                  0         0           3008         0                 0
MigrationStage                    0         0              0         0                 0
Sampler                           0         0              0         0                 0
ValidationExecutor                0         0           1500         0                 0
CommitLogArchiver                 0         0              0         0                 0
MiscStage                         0         0              0         0                 0
MemtableFlushWriter               2       220           3531         0                 0
MemtableReclaimMemory             0         0           4277         0                 0
PendingRangeCalculator            0         0             22         0                 0
MemtablePostFlush                 1       306           5186         0                 0
CompactionExecutor               36       142           5326         0                 0
InternalResponseStage             0         0              0         0                 0
HintedHandoff                     0         0             13         0                 0

Message type           Dropped
RANGE_SLICE                  0
READ_REPAIR                  0
PAGED_RANGE                  0
BINARY                       0
READ                         0
MUTATION                220352
_TRACE                       0
REQUEST_RESPONSE             0
COUNTER_MUTATION             0
{noformat}
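
To pick out the backed-up pools from output like the above (here 
MutationStage, MemtableFlushWriter, MemtablePostFlush and CompactionExecutor), 
the Pending column can be filtered mechanically. This is a rough sketch under 
the assumption that each pool row sits on a single line with the pool name 
followed by five numeric columns; the class name TpstatsBacklog, the method 
backloggedPools and the threshold are hypothetical and not part of nodetool.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Cassandra or nodetool.
final class TpstatsBacklog {

    /**
     * Returns "pool=pending" entries for every pool whose Pending count
     * exceeds the given threshold, in the order the pools appear.
     */
    static List<String> backloggedPools(String tpstats, long threshold) {
        List<String> out = new ArrayList<>();
        for (String line : tpstats.split("\\R")) {
            String[] fields = line.trim().split("\\s+");
            // Assumed row shape: name, Active, Pending, Completed, Blocked, All time blocked.
            if (fields.length != 6) {
                continue; // header, blank line, or dropped-message row
            }
            try {
                long pending = Long.parseLong(fields[2]);
                if (pending > threshold) {
                    out.add(fields[0] + "=" + pending);
                }
            } catch (NumberFormatException e) {
                // Six tokens but not a pool row; ignore it.
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String sample =
                "ReadStage 0 0 28 0 0\n"
              + "MutationStage 32 63221 27858588 0 0\n"
              + "MemtableFlushWriter 2 220 3531 0 0\n";
        backloggedPools(sample, 100).forEach(System.out::println);
    }
}
```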

cfstats reports more than 12,000 SSTables.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
