[jira] [Commented] (CASSANDRA-11447) Flush writer deadlock in Cassandra 2.2.5

2016-03-29 Thread Mark Manley (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-11447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215864#comment-15215864 ]

Mark Manley commented on CASSANDRA-11447:
-----------------------------------------

Fair enough.  I am still unclear, though, on how all these tasks came to be cancelled.

> Flush writer deadlock in Cassandra 2.2.5
> ----------------------------------------
>
> Key: CASSANDRA-11447
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11447
> Project: Cassandra
> Issue Type: Bug
> Reporter: Mark Manley
> Attachments: cassandra.jstack.out
>
>
> When writing heavily to one of my Cassandra tables, I got a deadlock similar 
> to CASSANDRA-9882:
> {code}
> "MemtableFlushWriter:4589" #34721 daemon prio=5 os_prio=0 tid=0x05fc11d0 nid=0x7664 waiting for monitor entry [0x7fb83f0e5000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at org.apache.cassandra.db.compaction.WrappingCompactionStrategy.handleNotification(WrappingCompactionStrategy.java:266)
>         - waiting to lock <0x000400956258> (a org.apache.cassandra.db.compaction.WrappingCompactionStrategy)
>         at org.apache.cassandra.db.lifecycle.Tracker.notifyAdded(Tracker.java:400)
>         at org.apache.cassandra.db.lifecycle.Tracker.replaceFlushed(Tracker.java:332)
>         at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.replaceFlushed(AbstractCompactionStrategy.java:235)
>         at org.apache.cassandra.db.ColumnFamilyStore.replaceFlushed(ColumnFamilyStore.java:1580)
>         at org.apache.cassandra.db.Memtable$FlushRunnable.runMayThrow(Memtable.java:362)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>         at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
>         at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1139)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> The compaction strategies in this keyspace are mixed: one table uses LCS and the rest use DTCS. None of the tables here, save for the LCS one, seems to have a large SSTable count:
> {code}
>   Table: active_counters
>   SSTable count: 2
> --
>   Table: aggregation_job_entries
>   SSTable count: 2
> --
>   Table: dsp_metrics_log
>   SSTable count: 207
> --
>   Table: dsp_metrics_ts_5min
>   SSTable count: 3
> --
>   Table: dsp_metrics_ts_day
>   SSTable count: 2
> --
>   Table: dsp_metrics_ts_hour
>   SSTable count: 2
> {code}
> Yet the symptoms are similar. 
> The "dsp_metrics_ts_5min" table had undergone a major compaction shortly before all of this, to clear out its 400+ SSTable files before the system went into use, so those should already have been eliminated.
> Have other people seen this? I am attaching a stack trace.
> Thanks!
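
To make the trace above easier to read: MemtableFlushWriter:4589 is BLOCKED trying to take the monitor of the WrappingCompactionStrategy instance inside handleNotification, which means some other thread is holding that monitor (and, for a true deadlock, is itself stuck on something the flush path owns). The following is only a minimal, self-contained illustration of that monitor-contention shape, with hypothetical class and thread names rather than real Cassandra code:

{code}
import java.util.concurrent.TimeUnit;

// Illustrative only -- hypothetical names, not Cassandra code. One thread
// holds the strategy monitor for a long time, so the "flush" thread blocks
// on the synchronized notification method, matching the BLOCKED frame above.
public class FlushMonitorSketch
{
    static class StrategyLike
    {
        // Stands in for the strategy's handleNotification, which the trace
        // shows waiting on the strategy instance's monitor.
        synchronized void handleNotification(String notification)
        {
            System.out.println("handled: " + notification);
        }

        // Stands in for any long-running work that holds the same monitor.
        synchronized void holdMonitor() throws InterruptedException
        {
            TimeUnit.SECONDS.sleep(5);
        }
    }

    public static void main(String[] args) throws InterruptedException
    {
        StrategyLike strategy = new StrategyLike();

        Thread holder = new Thread(() -> {
            try
            {
                strategy.holdMonitor();
            }
            catch (InterruptedException ignored)
            {
            }
        }, "CompactionExecutor-like");

        Thread flusher = new Thread(() -> strategy.handleNotification("sstable added"),
                                    "MemtableFlushWriter-like");

        holder.start();
        TimeUnit.MILLISECONDS.sleep(200); // let the holder take the monitor first
        flusher.start();

        TimeUnit.MILLISECONDS.sleep(200);
        // Prints BLOCKED: the flusher is waiting for the strategy monitor,
        // just like MemtableFlushWriter:4589 in the attached jstack output.
        System.out.println("flusher state: " + flusher.getState());

        holder.join();
        flusher.join();
    }
}
{code}

In the real hang the holder would itself be blocked rather than merely slow, but the resulting jstack picture for the flush thread is the same.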



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11447) Flush writer deadlock in Cassandra 2.2.5

2016-03-29 Thread Mark Manley (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-11447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215841#comment-15215841 ]

Mark Manley commented on CASSANDRA-11447:
-----------------------------------------

Of course:

INFO  [CompactionExecutor:224] 2016-03-28 16:37:55,107 CompactionManager.java:1464 - Compaction interrupted: Compaction@bff6b700-45f1-11e5-9621-a322a3bdb126(counter_service, dsp_metrics_ts_5min, 98651672/2163791990)bytes
INFO  [CompactionExecutor:226] 2016-03-28 16:37:55,150 CompactionManager.java:1464 - Compaction interrupted: Compaction@bff6b700-45f1-11e5-9621-a322a3bdb126(counter_service, dsp_metrics_ts_5min, 9698634605/15761279246)bytes
INFO  [CompactionExecutor:225] 2016-03-28 16:38:55,206 CompactionManager.java:1464 - Compaction interrupted: Compaction@bff6b700-45f1-11e5-9621-a322a3bdb126(counter_service, dsp_metrics_ts_5min, 0/49808303118)bytes
INFO  [CompactionExecutor:229] 2016-03-28 18:05:31,170 CompactionManager.java:1464 - Compaction interrupted: Compaction@c31471c0-45f1-11e5-9621-a322a3bdb126(counter_service, dsp_metrics_log, 9302818642/12244413096)bytes

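In case the completed-versus-total byte ratios matter when correlating these interruptions with the hung flush, here is a tiny throwaway parser (a hypothetical helper, not part of Cassandra) that pulls the keyspace, table, and byte counts out of "Compaction interrupted" lines like the ones above:

{code}
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Cassandra: extracts keyspace, table and
// completed/total bytes from "Compaction interrupted" log lines.
public class CompactionInterruptedParser
{
    // Matches the tail of lines like:
    //   ... - Compaction interrupted: Compaction@<uuid>(<keyspace>, <table>, <done>/<total>)bytes
    private static final Pattern LINE = Pattern.compile(
            "Compaction interrupted:\\s+Compaction@\\S+\\((\\S+), (\\S+), (\\d+)/(\\d+)\\)bytes");

    public static void main(String[] args)
    {
        // First entry from the log excerpt above, used as sample input.
        String sample = "INFO  [CompactionExecutor:224] 2016-03-28 16:37:55,107 "
                      + "CompactionManager.java:1464 - Compaction interrupted: "
                      + "Compaction@bff6b700-45f1-11e5-9621-a322a3bdb126(counter_service, "
                      + "dsp_metrics_ts_5min, 98651672/2163791990)bytes";

        Matcher m = LINE.matcher(sample);
        if (m.find())
        {
            long done = Long.parseLong(m.group(3));
            long total = Long.parseLong(m.group(4));
            System.out.printf("%s.%s interrupted at %.1f%% (%d/%d bytes)%n",
                              m.group(1), m.group(2), 100.0 * done / total, done, total);
        }
    }
}
{code}
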
> Flush writer deadlock in Cassandra 2.2.5
> ----------------------------------------
>
> Key: CASSANDRA-11447
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11447
> Project: Cassandra
> Issue Type: Bug
> Reporter: Mark Manley
> Fix For: 2.2.x
>
> Attachments: cassandra.jstack.out
>
>
> When writing heavily to one of my Cassandra tables, I got a deadlock similar 
> to CASSANDRA-9882:
> {code}
> "MemtableFlushWriter:4589" #34721 daemon prio=5 os_prio=0 tid=0x05fc11d0 nid=0x7664 waiting for monitor entry [0x7fb83f0e5000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at org.apache.cassandra.db.compaction.WrappingCompactionStrategy.handleNotification(WrappingCompactionStrategy.java:266)
>         - waiting to lock <0x000400956258> (a org.apache.cassandra.db.compaction.WrappingCompactionStrategy)
>         at org.apache.cassandra.db.lifecycle.Tracker.notifyAdded(Tracker.java:400)
>         at org.apache.cassandra.db.lifecycle.Tracker.replaceFlushed(Tracker.java:332)
>         at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.replaceFlushed(AbstractCompactionStrategy.java:235)
>         at org.apache.cassandra.db.ColumnFamilyStore.replaceFlushed(ColumnFamilyStore.java:1580)
>         at org.apache.cassandra.db.Memtable$FlushRunnable.runMayThrow(Memtable.java:362)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>         at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
>         at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1139)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> The compaction strategies in this keyspace are mixed: one table uses LCS and the rest use DTCS. None of the tables here, save for the LCS one, seems to have a large SSTable count:
> {code}
>   Table: active_counters
>   SSTable count: 2
> --
>   Table: aggregation_job_entries
>   SSTable count: 2
> --
>   Table: dsp_metrics_log
>   SSTable count: 207
> --
>   Table: dsp_metrics_ts_5min
>   SSTable count: 3
> --
>   Table: dsp_metrics_ts_day
>   SSTable count: 2
> --
>   Table: dsp_metrics_ts_hour
>   SSTable count: 2
> {code}
> Yet the symptoms are similar. 
> The "dsp_metrics_ts_5min" table had undergone a major compaction shortly before all of this, to clear out its 400+ SSTable files before the system went into use, so those should already have been eliminated.
> Have other people seen this? I am attaching a stack trace.
> Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11447) Flush writer deadlock in Cassandra 2.2.5

2016-03-28 Thread Marcus Eriksson (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-11447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215450#comment-15215450 ]

Marcus Eriksson commented on CASSANDRA-11447:
---------------------------------------------

I suspect this is CASSANDRA-11373.

Could you grep the logs for {{Compaction interrupted}}?

> Flush writer deadlock in Cassandra 2.2.5
> ----------------------------------------
>
> Key: CASSANDRA-11447
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11447
> Project: Cassandra
> Issue Type: Bug
> Reporter: Mark Manley
> Fix For: 2.2.x
>
> Attachments: cassandra.jstack.out
>
>
> When writing heavily to one of my Cassandra tables, I got a deadlock similar 
> to CASSANDRA-9882:
> {code}
> "MemtableFlushWriter:4589" #34721 daemon prio=5 os_prio=0 tid=0x05fc11d0 nid=0x7664 waiting for monitor entry [0x7fb83f0e5000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at org.apache.cassandra.db.compaction.WrappingCompactionStrategy.handleNotification(WrappingCompactionStrategy.java:266)
>         - waiting to lock <0x000400956258> (a org.apache.cassandra.db.compaction.WrappingCompactionStrategy)
>         at org.apache.cassandra.db.lifecycle.Tracker.notifyAdded(Tracker.java:400)
>         at org.apache.cassandra.db.lifecycle.Tracker.replaceFlushed(Tracker.java:332)
>         at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.replaceFlushed(AbstractCompactionStrategy.java:235)
>         at org.apache.cassandra.db.ColumnFamilyStore.replaceFlushed(ColumnFamilyStore.java:1580)
>         at org.apache.cassandra.db.Memtable$FlushRunnable.runMayThrow(Memtable.java:362)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>         at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
>         at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1139)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> {code}
> The compaction strategies in this keyspace are mixed: one table uses LCS and the rest use DTCS. None of the tables here, save for the LCS one, seems to have a large SSTable count:
> {code}
>   Table: active_counters
>   SSTable count: 2
> --
>   Table: aggregation_job_entries
>   SSTable count: 2
> --
>   Table: dsp_metrics_log
>   SSTable count: 207
> --
>   Table: dsp_metrics_ts_5min
>   SSTable count: 3
> --
>   Table: dsp_metrics_ts_day
>   SSTable count: 2
> --
>   Table: dsp_metrics_ts_hour
>   SSTable count: 2
> {code}
> Yet the symptoms are similar. 
> The "dsp_metrics_ts_5min" table had undergone a major compaction shortly before all of this, to clear out its 400+ SSTable files before the system went into use, so those should already have been eliminated.
> Have other people seen this? I am attaching a stack trace.
> Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)