[ 
https://issues.apache.org/jira/browse/CASSANDRA-10871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rafael Harutyunyan updated CASSANDRA-10871:
-------------------------------------------
    Environment: 
Linux cassandra1 2.6.32-573.3.1.el6.x86_64 #1 SMP Thu Aug 13 22:55:16 UTC 2015 
x86_64 x86_64 x86_64 GNU/Linux; Java(TM) SE Runtime Environment (build 
1.7.0_67-b01)


  was:Linux cassandra1 2.6.32-573.3.1.el6.x86_64 #1 SMP Thu Aug 13 22:55:16 UTC 
2015 x86_64 x86_64 x86_64 GNU/Linux


> MemtableFlushWriter blocks and no flushing happens
> --------------------------------------------------
>
>                 Key: CASSANDRA-10871
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10871
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Compaction, Local Write-Read Paths
>         Environment: Linux cassandra1 2.6.32-573.3.1.el6.x86_64 #1 SMP Thu 
> Aug 13 22:55:16 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux; Java(TM) SE Runtime 
> Environment (build 1.7.0_67-b01)
>            Reporter: Rafael Harutyunyan
>            Priority: Critical
>             Fix For: 2.1.11
>
>         Attachments: full_thread_dump.txt
>
>
> After some time the MemtableFlushWriter threads block, which first fills 
> the MemtableFlushWriter queue completely and then the MutationStage queue. 
> After this, one of two things happens: either Cassandra drops the queued 
> mutations and everything returns to normal, or it shuts down with 
> insufficient heap space.
> Here is the thread dump.
> {noformat}
> "MemtableFlushWriter:3" - Thread t@2610
>    java.lang.Thread.State: BLOCKED
>       at org.apache.cassandra.db.compaction.WrappingCompactionStrategy.handleNotification(WrappingCompactionStrategy.java:250)
>       - waiting to lock <f9dab27> (a org.apache.cassandra.db.compaction.WrappingCompactionStrategy) owned by "CompactionExecutor:51" t@2638
>       at org.apache.cassandra.db.DataTracker.notifyAdded(DataTracker.java:518)
>       at org.apache.cassandra.db.DataTracker.replaceFlushed(DataTracker.java:178)
>       at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.replaceFlushed(AbstractCompactionStrategy.java:234)
>       at org.apache.cassandra.db.ColumnFamilyStore.replaceFlushed(ColumnFamilyStore.java:1502)
>       at org.apache.cassandra.db.Memtable$FlushRunnable.runMayThrow(Memtable.java:336)
>       at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>       at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
>       at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1115)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
>    Locked ownable synchronizers:
>       - locked <7ef8cd1b> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> "MemtableFlushWriter:4" - Thread t@2616
>    java.lang.Thread.State: BLOCKED
>       at org.apache.cassandra.db.compaction.WrappingCompactionStrategy.handleNotification(WrappingCompactionStrategy.java:250)
>       - waiting to lock <f9dab27> (a org.apache.cassandra.db.compaction.WrappingCompactionStrategy) owned by "CompactionExecutor:51" t@2638
>       at org.apache.cassandra.db.DataTracker.notifyAdded(DataTracker.java:518)
>       at org.apache.cassandra.db.DataTracker.replaceFlushed(DataTracker.java:178)
>       at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.replaceFlushed(AbstractCompactionStrategy.java:234)
>       at org.apache.cassandra.db.ColumnFamilyStore.replaceFlushed(ColumnFamilyStore.java:1502)
>       at org.apache.cassandra.db.Memtable$FlushRunnable.runMayThrow(Memtable.java:336)
>       at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>       at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
>       at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1115)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745)
>    Locked ownable synchronizers:
>       - locked <2f842d9b> (a java.util.concurrent.ThreadPoolExecutor$Worker)
> {noformat}
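
Both flush writers in the dump above wait on the same monitor, which is held by "CompactionExecutor:51". For a full thread dump, a small script can group BLOCKED threads by the thread that owns the lock they are waiting for. The following is only a rough sketch in Python: the function name blocked_by_owner and the regular expressions are illustrative assumptions, not part of Cassandra or any JDK tool, and the sample is an abbreviated excerpt of the dump in this report, including the mail-wrapped lines and quote markers.

```python
import re
from collections import defaultdict

# Abbreviated excerpt of the dump in this report, wrapped and quoted as mailed.
SAMPLE_DUMP = '''\
> "MemtableFlushWriter:3" - Thread t@2610
>    java.lang.Thread.State: BLOCKED
>       at 
> org.apache.cassandra.db.compaction.WrappingCompactionStrategy.handleNotification(WrappingCompactionStrategy.java:250)
>       - waiting to lock <f9dab27> (a 
> org.apache.cassandra.db.compaction.WrappingCompactionStrategy) owned by 
> "CompactionExecutor:51" t@2638
> "MemtableFlushWriter:4" - Thread t@2616
>    java.lang.Thread.State: BLOCKED
>       - waiting to lock <f9dab27> (a 
> org.apache.cassandra.db.compaction.WrappingCompactionStrategy) owned by 
> "CompactionExecutor:51" t@2638
'''

# Hypothetical patterns, written for the dump layout shown in this report.
THREAD_HEADER = re.compile(r'^"(?P<name>[^"]+)" - Thread t@\d+')
WAITING = re.compile(r'waiting to lock <[0-9a-f]+>.*?owned by "(?P<owner>[^"]+)"')


def blocked_by_owner(dump):
    """Map each lock-owning thread to the threads waiting on its lock."""
    sections = []  # (thread name, body lines) in dump order
    for raw in dump.splitlines():
        line = re.sub(r'^>\s?', '', raw)  # drop the mail quote marker, if any
        header = THREAD_HEADER.match(line)
        if header:
            sections.append((header.group("name"), []))
        elif sections:
            sections[-1][1].append(line)

    waiters = defaultdict(list)
    for name, body in sections:
        # Collapse whitespace so frames wrapped across lines match again.
        text = " ".join(" ".join(body).split())
        match = WAITING.search(text)
        if match:
            waiters[match.group("owner")].append(name)
    return dict(waiters)


if __name__ == "__main__":
    for owner, blocked in blocked_by_owner(SAMPLE_DUMP).items():
        print(owner, "blocks", ", ".join(blocked))
```

Run against the attached full_thread_dump.txt, the same grouping would show whether other threads are also queued behind the compaction thread; what that thread is itself doing has to be read from its own stack.
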
> and here are the tpstats
> {noformat}
> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
> CounterMutationStage              0         0              0         0                 0
> ReadStage                         0         0             28         0                 0
> RequestResponseStage              0         0        2020253         0                 0
> MutationStage                    32     63221       27858588         0                 0
> ReadRepairStage                   0         0              0         0                 0
> GossipStage                       0         0          16430         0                 0
> CacheCleanupExecutor              0         0              0         0                 0
> AntiEntropyStage                  0         0           3008         0                 0
> MigrationStage                    0         0              0         0                 0
> Sampler                           0         0              0         0                 0
> ValidationExecutor                0         0           1500         0                 0
> CommitLogArchiver                 0         0              0         0                 0
> MiscStage                         0         0              0         0                 0
> MemtableFlushWriter               2       220           3531         0                 0
> MemtableReclaimMemory             0         0           4277         0                 0
> PendingRangeCalculator            0         0             22         0                 0
> MemtablePostFlush                 1       306           5186         0                 0
> CompactionExecutor               36       142           5326         0                 0
> InternalResponseStage             0         0              0         0                 0
> HintedHandoff                     0         0             13         0                 0
> Message type           Dropped
> RANGE_SLICE                  0
> READ_REPAIR                  0
> PAGED_RANGE                  0
> BINARY                       0
> READ                         0
> MUTATION                220352
> _TRACE                       0
> REQUEST_RESPONSE             0
> COUNTER_MUTATION             0
> {noformat}
> cfstats reports more than 12k sstables.
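
The tpstats output in this report shows the backlog building up behind the blocked flush writers: pending tasks accumulate in MutationStage, MemtablePostFlush, MemtableFlushWriter and CompactionExecutor. A rough Python sketch of how such output could be scanned for backlogged pools is given below. The helper names parse_tpstats and backlogged, and the threshold of 100 pending tasks, are assumptions made for illustration; the sample rows are copied from this report, wrapped and quoted as mailed.

```python
import re

# A few rows copied from the tpstats output in this report.
SAMPLE_TPSTATS = '''\
> Pool Name                    Active   Pending      Completed   Blocked  All 
> time blocked
> ReadStage                         0         0             28         0        
>          0
> MutationStage                    32     63221       27858588         0        
>          0
> MemtableFlushWriter               2       220           3531         0        
>          0
> MemtablePostFlush                 1       306           5186         0        
>          0
> CompactionExecutor               36       142           5326         0        
>          0
> Message type           Dropped
> MUTATION                220352
'''


def parse_tpstats(text):
    """Return {pool: (active, pending, completed, blocked, all_time_blocked)}."""
    tokens = []
    for raw in text.splitlines():
        # Drop the mail quote marker, then tokenise so wrapped rows rejoin.
        tokens.extend(re.sub(r'^>\s?', '', raw).split())

    pools = {}
    i = 0
    while i < len(tokens):
        values = tokens[i + 1:i + 6]
        # A pool row is a name followed by exactly five integer columns.
        is_row = (not tokens[i].isdigit()
                  and len(values) == 5
                  and all(v.isdigit() for v in values))
        if is_row:
            pools[tokens[i]] = tuple(int(v) for v in values)
            i += 6
        else:
            i += 1
    return pools


def backlogged(pools, threshold=100):
    """Pool names whose pending count exceeds threshold, largest first."""
    names = [name for name, stats in pools.items() if stats[1] > threshold]
    return sorted(names, key=lambda name: pools[name][1], reverse=True)


if __name__ == "__main__":
    for name in backlogged(parse_tpstats(SAMPLE_TPSTATS)):
        print(name)
```

The dropped-message section is skipped on purpose, because its rows carry a single counter rather than five columns; the threshold is arbitrary and would need tuning for a real cluster.
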



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
