[ 
https://issues.apache.org/jira/browse/CASSANDRA-11447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-11447:
----------------------------------------
    Reproduced In: 2.2.5
    Fix Version/s: 2.2.x

> Flush writer deadlock in Cassandra 2.2.5
> ----------------------------------------
>
>                 Key: CASSANDRA-11447
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11447
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Mark Manley
>             Fix For: 2.2.x
>
>         Attachments: cassandra.jstack.out
>
>
> When writing heavily to one of my Cassandra tables, I got a deadlock similar 
> to CASSANDRA-9882:
> {code}
> "MemtableFlushWriter:4589" #34721 daemon prio=5 os_prio=0 
> tid=0x0000000005fc11d0 nid=0x7664 waiting for monitor entry 
> [0x00007fb83f0e5000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at 
> org.apache.cassandra.db.compaction.WrappingCompactionStrategy.handleNotification(WrappingCompactionStrategy.java:266)
>         - waiting to lock <0x0000000400956258> (a 
> org.apache.cassandra.db.compaction.WrappingCompactionStrategy)
>         at 
> org.apache.cassandra.db.lifecycle.Tracker.notifyAdded(Tracker.java:400)
>         at 
> org.apache.cassandra.db.lifecycle.Tracker.replaceFlushed(Tracker.java:332)
>         at 
> org.apache.cassandra.db.compaction.AbstractCompactionStrategy.replaceFlushed(AbstractCompactionStrategy.java:235)
>         at 
> org.apache.cassandra.db.ColumnFamilyStore.replaceFlushed(ColumnFamilyStore.java:1580)
>         at 
> org.apache.cassandra.db.Memtable$FlushRunnable.runMayThrow(Memtable.java:362)
>         at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>         at 
> com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
>         at 
> org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1139)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
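> For illustration only (not actual Cassandra code): the trace shows the flush path 
> blocked on the WrappingCompactionStrategy monitor in handleNotification while 
> inside Tracker.replaceFlushed/notifyAdded. A minimal, hypothetical Java reduction 
> of that kind of lock-order inversion, using stand-in lock objects with made-up 
> names, looks like the sketch below; the real interaction in Cassandra may differ.
> {code}
> // Hypothetical reduction of the suspected lock-order inversion.
> // These classes and locks are stand-ins, NOT the Cassandra implementation.
> public class LockInversionSketch {
>     private static final Object strategyLock = new Object(); // ~ compaction strategy monitor
>     private static final Object trackerLock  = new Object(); // ~ tracker/view state
>
>     public static void main(String[] args) {
>         // "Flush" path: holds the tracker lock, then notifies the strategy.
>         Thread flushWriter = new Thread(() -> {
>             synchronized (trackerLock) {
>                 sleepQuietly(100);               // widen the race window
>                 synchronized (strategyLock) {    // blocks here, like MemtableFlushWriter above
>                     System.out.println("flush: notified strategy");
>                 }
>             }
>         }, "MemtableFlushWriter-sketch");
>
>         // "Compaction" path: holds the strategy lock, then needs the tracker.
>         Thread compaction = new Thread(() -> {
>             synchronized (strategyLock) {
>                 sleepQuietly(100);
>                 synchronized (trackerLock) {     // blocks here -> both threads stay BLOCKED
>                     System.out.println("compaction: updated tracker");
>                 }
>             }
>         }, "CompactionExecutor-sketch");
>
>         flushWriter.start();
>         compaction.start();
>         // By design this program never finishes: each thread waits on the
>         // monitor the other one holds, mirroring the BLOCKED state in the jstack.
>     }
>
>     private static void sleepQuietly(long ms) {
>         try { Thread.sleep(ms); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
>     }
> }
> {code}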
> The compaction strategies in this keyspace are mixed: one table uses LCS and 
> the rest use DTCS.  None of the tables here, apart from the LCS one, has a 
> large SSTable count:
> {code}
>               Table: active_counters
>               SSTable count: 2
> --
>               Table: aggregation_job_entries
>               SSTable count: 2
> --
>               Table: dsp_metrics_log
>               SSTable count: 207
> --
>               Table: dsp_metrics_ts_5min
>               SSTable count: 3
> --
>               Table: dsp_metrics_ts_day
>               SSTable count: 2
> --
>               Table: dsp_metrics_ts_hour
>               SSTable count: 2
> {code}
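> (For reference, a per-table listing like the one above can be pulled from 
> something like nodetool cfstats <keyspace> and filtered down to the "Table:" 
> and "SSTable count" lines; it is included only to show the per-table counts.)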
> Yet the symptoms are similar. 
> The "dsp_metrics_ts_5min" table had had a major compaction shortly before all 
> this to get rid of the 400+ SStable files before this system went into use, 
> but they should have been eliminated.
> Have other people seen this?  I am attaching a strack trace.
> Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
