[jira] [Commented] (CASSANDRA-11447) Flush writer deadlock in Cassandra 2.2.5

2016-03-29 Thread Mark Manley (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215864#comment-15215864
 ] 

Mark Manley commented on CASSANDRA-11447:
-

Fair enough.  I am unclear on how all these tasks were cancelled, though.
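
For anyone retracing this, a rough way to see which compactions were interrupted and what the compaction executor still has in flight, using only stock tooling (the log path below is the package default and may differ on your install):

{code}
# Count interruption events per keyspace/table from the system log
# (log path assumed; adjust for your install).
grep 'Compaction interrupted' /var/log/cassandra/system.log \
  | sed -n 's/.*Compaction interrupted: [^(]*(\([^,]*, [^,]*\),.*/\1/p' \
  | sort | uniq -c

# What the compaction executor is currently running or has queued.
nodetool compactionstats

# What has actually completed recently.
nodetool compactionhistory
{code}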

> Flush writer deadlock in Cassandra 2.2.5
> 
>
> Key: CASSANDRA-11447
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11447
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Mark Manley
> Attachments: cassandra.jstack.out
>
>
> When writing heavily to one of my Cassandra tables, I got a deadlock similar 
> to CASSANDRA-9882:
> {code}
> "MemtableFlushWriter:4589" #34721 daemon prio=5 os_prio=0 
> tid=0x05fc11d0 nid=0x7664 waiting for monitor entry 
> [0x7fb83f0e5000]
>java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.cassandra.db.compaction.WrappingCompactionStrategy.handleNotification(WrappingCompactionStrategy.java:266)
> - waiting to lock <0x000400956258> (a 
> org.apache.cassandra.db.compaction.WrappingCompactionStrategy)
> at 
> org.apache.cassandra.db.lifecycle.Tracker.notifyAdded(Tracker.java:400)
> at 
> org.apache.cassandra.db.lifecycle.Tracker.replaceFlushed(Tracker.java:332)
> at 
> org.apache.cassandra.db.compaction.AbstractCompactionStrategy.replaceFlushed(AbstractCompactionStrategy.java:235)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.replaceFlushed(ColumnFamilyStore.java:1580)
> at 
> org.apache.cassandra.db.Memtable$FlushRunnable.runMayThrow(Memtable.java:362)
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> at 
> com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
> at 
> org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1139)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> The compaction strategies in this keyspace are mixed, with one table using LCS 
> and the rest using DTCS.  None of the tables here, save for the LCS one, seem 
> to have large SSTable counts:
> {code}
>   Table: active_counters
>   SSTable count: 2
> --
>   Table: aggregation_job_entries
>   SSTable count: 2
> --
>   Table: dsp_metrics_log
>   SSTable count: 207
> --
>   Table: dsp_metrics_ts_5min
>   SSTable count: 3
> --
>   Table: dsp_metrics_ts_day
>   SSTable count: 2
> --
>   Table: dsp_metrics_ts_hour
>   SSTable count: 2
> {code}
> Yet the symptoms are similar. 
> The "dsp_metrics_ts_5min" table had had a major compaction shortly before all 
> this to get rid of the 400+ SSTable files that had piled up before this system 
> went into use, but those files should have been eliminated by that compaction.
> Have other people seen this?  I am attaching a stack trace.
> Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11447) Flush writer deadlock in Cassandra 2.2.5

2016-03-29 Thread Mark Manley (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215841#comment-15215841
 ] 

Mark Manley commented on CASSANDRA-11447:
-

Of course:

INFO  [CompactionExecutor:224] 2016-03-28 16:37:55,107 
CompactionManager.java:1464 - Compaction interrupted: 
Compaction@bff6b700-45f1-11e5-9621-a322a3bdb126(counter_service, 
dsp_metrics_ts_5min, 98651672/2163791990)bytes
INFO  [CompactionExecutor:226] 2016-03-28 16:37:55,150 
CompactionManager.java:1464 - Compaction interrupted: 
Compaction@bff6b700-45f1-11e5-9621-a322a3bdb126(counter_service, 
dsp_metrics_ts_5min, 9698634605/15761279246)bytes
INFO  [CompactionExecutor:225] 2016-03-28 16:38:55,206 
CompactionManager.java:1464 - Compaction interrupted: 
Compaction@bff6b700-45f1-11e5-9621-a322a3bdb126(counter_service, 
dsp_metrics_ts_5min, 0/49808303118)bytes
INFO  [CompactionExecutor:229] 2016-03-28 18:05:31,170 
CompactionManager.java:1464 - Compaction interrupted: 
Compaction@c31471c0-45f1-11e5-9621-a322a3bdb126(counter_service, 
dsp_metrics_log, 9302818642/12244413096)bytes

> Flush writer deadlock in Cassandra 2.2.5
> 
>
> Key: CASSANDRA-11447
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11447
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Mark Manley
> Fix For: 2.2.x
>
> Attachments: cassandra.jstack.out
>
>
> When writing heavily to one of my Cassandra tables, I got a deadlock similar 
> to CASSANDRA-9882:
> {code}
> "MemtableFlushWriter:4589" #34721 daemon prio=5 os_prio=0 
> tid=0x05fc11d0 nid=0x7664 waiting for monitor entry 
> [0x7fb83f0e5000]
>java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.cassandra.db.compaction.WrappingCompactionStrategy.handleNotification(WrappingCompactionStrategy.java:266)
> - waiting to lock <0x000400956258> (a 
> org.apache.cassandra.db.compaction.WrappingCompactionStrategy)
> at 
> org.apache.cassandra.db.lifecycle.Tracker.notifyAdded(Tracker.java:400)
> at 
> org.apache.cassandra.db.lifecycle.Tracker.replaceFlushed(Tracker.java:332)
> at 
> org.apache.cassandra.db.compaction.AbstractCompactionStrategy.replaceFlushed(AbstractCompactionStrategy.java:235)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.replaceFlushed(ColumnFamilyStore.java:1580)
> at 
> org.apache.cassandra.db.Memtable$FlushRunnable.runMayThrow(Memtable.java:362)
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> at 
> com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
> at 
> org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1139)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> The compaction strategies in this keyspace are mixed, with one table using LCS 
> and the rest using DTCS.  None of the tables here, save for the LCS one, seem 
> to have large SSTable counts:
> {code}
>   Table: active_counters
>   SSTable count: 2
> --
>   Table: aggregation_job_entries
>   SSTable count: 2
> --
>   Table: dsp_metrics_log
>   SSTable count: 207
> --
>   Table: dsp_metrics_ts_5min
>   SSTable count: 3
> --
>   Table: dsp_metrics_ts_day
>   SSTable count: 2
> --
>   Table: dsp_metrics_ts_hour
>   SSTable count: 2
> {code}
> Yet the symptoms are similar. 
> The "dsp_metrics_ts_5min" table had had a major compaction shortly before all 
> this to get rid of the 400+ SSTable files that had piled up before this system 
> went into use, but those files should have been eliminated by that compaction.
> Have other people seen this?  I am attaching a stack trace.
> Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-11447) Flush writer deadlock in Cassandra 2.2.5

2016-03-29 Thread Mark Manley (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215841#comment-15215841
 ] 

Mark Manley edited comment on CASSANDRA-11447 at 3/29/16 11:01 AM:
---

Of course:

{code}
INFO  [CompactionExecutor:224] 2016-03-28 16:37:55,107 
CompactionManager.java:1464 - Compaction interrupted: 
Compaction@bff6b700-45f1-11e5-9621-a322a3bdb126(counter_service, 
dsp_metrics_ts_5min, 98651672/2163791990)bytes
INFO  [CompactionExecutor:226] 2016-03-28 16:37:55,150 
CompactionManager.java:1464 - Compaction interrupted: 
Compaction@bff6b700-45f1-11e5-9621-a322a3bdb126(counter_service, 
dsp_metrics_ts_5min, 9698634605/15761279246)bytes
INFO  [CompactionExecutor:225] 2016-03-28 16:38:55,206 
CompactionManager.java:1464 - Compaction interrupted: 
Compaction@bff6b700-45f1-11e5-9621-a322a3bdb126(counter_service, 
dsp_metrics_ts_5min, 0/49808303118)bytes
INFO  [CompactionExecutor:229] 2016-03-28 18:05:31,170 
CompactionManager.java:1464 - Compaction interrupted: 
Compaction@c31471c0-45f1-11e5-9621-a322a3bdb126(counter_service, 
dsp_metrics_log, 9302818642/12244413096)bytes
{code}


was (Author: mwmanley):
Of course:

INFO  [CompactionExecutor:224] 2016-03-28 16:37:55,107 
CompactionManager.java:1464 - Compaction interrupted: 
Compaction@bff6b700-45f1-11e5-9621-a322a3bdb126(counter_service, 
dsp_metrics_ts_5min, 98651672/2163791990)bytes
INFO  [CompactionExecutor:226] 2016-03-28 16:37:55,150 
CompactionManager.java:1464 - Compaction interrupted: 
Compaction@bff6b700-45f1-11e5-9621-a322a3bdb126(counter_service, 
dsp_metrics_ts_5min, 9698634605/15761279246)bytes
INFO  [CompactionExecutor:225] 2016-03-28 16:38:55,206 
CompactionManager.java:1464 - Compaction interrupted: 
Compaction@bff6b700-45f1-11e5-9621-a322a3bdb126(counter_service, 
dsp_metrics_ts_5min, 0/49808303118)bytes
INFO  [CompactionExecutor:229] 2016-03-28 18:05:31,170 
CompactionManager.java:1464 - Compaction interrupted: 
Compaction@c31471c0-45f1-11e5-9621-a322a3bdb126(counter_service, 
dsp_metrics_log, 9302818642/12244413096)bytes

> Flush writer deadlock in Cassandra 2.2.5
> 
>
> Key: CASSANDRA-11447
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11447
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Mark Manley
> Fix For: 2.2.x
>
> Attachments: cassandra.jstack.out
>
>
> When writing heavily to one of my Cassandra tables, I got a deadlock similar 
> to CASSANDRA-9882:
> {code}
> "MemtableFlushWriter:4589" #34721 daemon prio=5 os_prio=0 
> tid=0x05fc11d0 nid=0x7664 waiting for monitor entry 
> [0x7fb83f0e5000]
>java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.cassandra.db.compaction.WrappingCompactionStrategy.handleNotification(WrappingCompactionStrategy.java:266)
> - waiting to lock <0x000400956258> (a 
> org.apache.cassandra.db.compaction.WrappingCompactionStrategy)
> at 
> org.apache.cassandra.db.lifecycle.Tracker.notifyAdded(Tracker.java:400)
> at 
> org.apache.cassandra.db.lifecycle.Tracker.replaceFlushed(Tracker.java:332)
> at 
> org.apache.cassandra.db.compaction.AbstractCompactionStrategy.replaceFlushed(AbstractCompactionStrategy.java:235)
> at 
> org.apache.cassandra.db.ColumnFamilyStore.replaceFlushed(ColumnFamilyStore.java:1580)
> at 
> org.apache.cassandra.db.Memtable$FlushRunnable.runMayThrow(Memtable.java:362)
> at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
> at 
> com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
> at 
> org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1139)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> The compaction strategies in this keyspace are mixed, with one table using LCS 
> and the rest using DTCS.  None of the tables here, save for the LCS one, seem 
> to have large SSTable counts:
> {code}
>   Table: active_counters
>   SSTable count: 2
> --
>   Table: aggregation_job_entries
>   SSTable count: 2
> --
>   Table: dsp_metrics_log
>   SSTable count: 207
> --
>   Table: dsp_metrics_ts_5min
>   SSTable count: 3
> --
>   Table: dsp_metrics_ts_day
>   SSTable count: 2
> --
>   Table: dsp_metrics_ts_hour
>   SSTable count: 2
> {code}
> Yet the symptoms are similar. 
> The "dsp_metrics_ts_5min" table had had a major compaction shortly before all 
> this to get rid of the 400+ SSTable 

[jira] [Updated] (CASSANDRA-11447) Flush writer deadlock in Cassandra 2.2.5

2016-03-28 Thread Mark Manley (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-11447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Manley updated CASSANDRA-11447:

Description: 
When writing heavily to one of my Cassandra tables, I got a deadlock similar to 
CASSANDRA-9882:

{code}
"MemtableFlushWriter:4589" #34721 daemon prio=5 os_prio=0 
tid=0x05fc11d0 nid=0x7664 waiting for monitor entry [0x7fb83f0e5000]
   java.lang.Thread.State: BLOCKED (on object monitor)
at 
org.apache.cassandra.db.compaction.WrappingCompactionStrategy.handleNotification(WrappingCompactionStrategy.java:266)
- waiting to lock <0x000400956258> (a 
org.apache.cassandra.db.compaction.WrappingCompactionStrategy)
at 
org.apache.cassandra.db.lifecycle.Tracker.notifyAdded(Tracker.java:400)
at 
org.apache.cassandra.db.lifecycle.Tracker.replaceFlushed(Tracker.java:332)
at 
org.apache.cassandra.db.compaction.AbstractCompactionStrategy.replaceFlushed(AbstractCompactionStrategy.java:235)
at 
org.apache.cassandra.db.ColumnFamilyStore.replaceFlushed(ColumnFamilyStore.java:1580)
at 
org.apache.cassandra.db.Memtable$FlushRunnable.runMayThrow(Memtable.java:362)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at 
com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
at 
org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1139)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{code}

The compaction strategies in this keyspace are mixed, with one table using LCS 
and the rest using DTCS.  None of the tables here, save for the LCS one, seem to 
have large SSTable counts:

{code}
Table: active_counters
SSTable count: 2
--

Table: aggregation_job_entries
SSTable count: 2
--

Table: dsp_metrics_log
SSTable count: 207
--

Table: dsp_metrics_ts_5min
SSTable count: 3
--

Table: dsp_metrics_ts_day
SSTable count: 2
--

Table: dsp_metrics_ts_hour
SSTable count: 2
{code}

Yet the symptoms are similar. 
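
For reference, a per-table summary like the one above can be pulled straight from nodetool; the keyspace name here is taken from the compaction log entries elsewhere in this ticket, and the grep is only a rough filter:

{code}
# Keyspace name assumed from the "Compaction interrupted" log lines in this ticket;
# keep only the table name and SSTable count lines of the cfstats output.
nodetool cfstats counter_service | grep -E 'Table:|SSTable count:'
{code}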

The "dsp_metrics_ts_5min" table had had a major compaction shortly before all 
this to get rid of the 400+ SSTable files that had piled up before this system 
went into use, but those files should have been eliminated by that compaction.

Have other people seen this?  I am attaching a stack trace.

Thanks!

  was:
When writing heavily to one of my Cassandra tables, I got a deadlock similar to 
CASSANDRA-9882:

{code}
"MemtableFlushWriter:4589" #34721 daemon prio=5 os_prio=0 
tid=0x05fc11d0 nid=0x7664 waiting for monitor entry [0x7fb83f0e5000]
   java.lang.Thread.State: BLOCKED (on object monitor)
at 
org.apache.cassandra.db.compaction.WrappingCompactionStrategy.handleNotification(WrappingCompactionStrategy.java:266)
- waiting to lock <0x000400956258> (a 
org.apache.cassandra.db.compaction.WrappingCompactionStrategy)
at 
org.apache.cassandra.db.lifecycle.Tracker.notifyAdded(Tracker.java:400)
at 
org.apache.cassandra.db.lifecycle.Tracker.replaceFlushed(Tracker.java:332)
at 
org.apache.cassandra.db.compaction.AbstractCompactionStrategy.replaceFlushed(AbstractCompactionStrategy.java:235)
at 
org.apache.cassandra.db.ColumnFamilyStore.replaceFlushed(ColumnFamilyStore.java:1580)
at 
org.apache.cassandra.db.Memtable$FlushRunnable.runMayThrow(Memtable.java:362)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at 
com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
at 
org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1139)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{code}

The compaction strategies in this keyspace are mixed, with one table using LCS 
and the rest using DTCS.  None of the tables here, save for the LCS one, seem to 
have large SSTable counts:

{code}
Table: active_counters
SSTable count: 2
--

Table: aggregation_job_entries
SSTable count: 2
--

Table: dsp_metrics_log
SSTable count: 207
--

Table: dsp_metrics_ts_5min
SSTable count: 3
--

Table: dsp_metrics_ts_day
SSTable count: 2
--

Table: dsp_metrics_ts_hour
SSTable 

[jira] [Created] (CASSANDRA-11447) Flush writer deadlock in Cassandra 2.2.5

2016-03-28 Thread Mark Manley (JIRA)
Mark Manley created CASSANDRA-11447:
---

 Summary: Flush writer deadlock in Cassandra 2.2.5
 Key: CASSANDRA-11447
 URL: https://issues.apache.org/jira/browse/CASSANDRA-11447
 Project: Cassandra
  Issue Type: Bug
Reporter: Mark Manley
 Attachments: cassandra.jstack.out

When writing heavily to one of my Cassandra tables, I got a deadlock similar to 
CASSANDRA-9882:

{code}
"MemtableFlushWriter:4589" #34721 daemon prio=5 os_prio=0 
tid=0x05fc11d0 nid=0x7664 waiting for monitor entry [0x7fb83f0e5000]
   java.lang.Thread.State: BLOCKED (on object monitor)
at 
org.apache.cassandra.db.compaction.WrappingCompactionStrategy.handleNotification(WrappingCompactionStrategy.java:266)
- waiting to lock <0x000400956258> (a 
org.apache.cassandra.db.compaction.WrappingCompactionStrategy)
at 
org.apache.cassandra.db.lifecycle.Tracker.notifyAdded(Tracker.java:400)
at 
org.apache.cassandra.db.lifecycle.Tracker.replaceFlushed(Tracker.java:332)
at 
org.apache.cassandra.db.compaction.AbstractCompactionStrategy.replaceFlushed(AbstractCompactionStrategy.java:235)
at 
org.apache.cassandra.db.ColumnFamilyStore.replaceFlushed(ColumnFamilyStore.java:1580)
at 
org.apache.cassandra.db.Memtable$FlushRunnable.runMayThrow(Memtable.java:362)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
at 
com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
at 
org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1139)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{code}

The compaction strategies in this keyspace are mixed, with one table using LCS 
and the rest using DTCS.  None of the tables here, save for the LCS one, seem to 
have large SSTable counts:

{code}
Table: active_counters
SSTable count: 2
--

Table: aggregation_job_entries
SSTable count: 2
--

Table: dsp_metrics_log
SSTable count: 207
--

Table: dsp_metrics_ts_5min
SSTable count: 3
--

Table: dsp_metrics_ts_day
SSTable count: 2
--

Table: dsp_metrics_ts_hour
SSTable count: 2
{code}

Yet the symptoms are similar. 

The "dsp_metrics_ts_5min" table had had a major compaction shortly before all 
this to get rid of the 400+ SSTable files that had piled up before this system 
went into use, but those files should have been eliminated by that compaction.
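
The major compaction mentioned above would presumably have been an explicit nodetool call along these lines (keyspace and table names are assumptions taken from the compaction log entries elsewhere in this ticket):

{code}
# Force a major compaction of the table that previously had 400+ SSTables.
# Keyspace/table names assumed from the "Compaction interrupted" log entries.
nodetool compact counter_service dsp_metrics_ts_5min
{code}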

Have other people seen this?  I am attaching a stack trace.

Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9758) nodetool compactionhistory NPE

2015-09-13 Thread Mark Manley (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14742695#comment-14742695
 ] 

Mark Manley commented on CASSANDRA-9758:


This seems broken in 2.2.1, FYI:

{code}
$ nodetool compactionhistory
Compaction History:
error: null
-- StackTrace --
java.lang.NullPointerException
at com.google.common.base.Joiner$MapJoiner.join(Joiner.java:330)
at org.apache.cassandra.utils.FBUtilities.toString(FBUtilities.java:477)
at 
org.apache.cassandra.db.compaction.CompactionHistoryTabularData.from(CompactionHistoryTabularData.java:78)
at 
org.apache.cassandra.db.SystemKeyspace.getCompactionHistory(SystemKeyspace.java:425)
at 
org.apache.cassandra.db.compaction.CompactionManager.getCompactionHistory(CompactionManager.java:1492)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275)
at 
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
at 
com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
at 
com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
at 
com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:83)
at 
com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:206)
at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:647)
at 
com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678)
at 
javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1443)
at 
javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
at 
javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1307)
at 
javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1399)
at 
javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:637)
at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:323)
at sun.rmi.transport.Transport$1.run(Transport.java:200)
at sun.rmi.transport.Transport$1.run(Transport.java:197)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
at 
sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$241(TCPTransport.java:683)
at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$$Lambda$1/1383269057.run(Unknown
 Source)
at java.security.AccessController.doPrivileged(Native Method)
at 
sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
{code}

> nodetool compactionhistory NPE
> --
>
> Key: CASSANDRA-9758
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9758
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Pierre N.
>Priority: Minor
> Fix For: 3.x
>
> Attachments: 0001-fix-npe-inline.patch, 9758.txt
>
>
> nodetool compactionhistory may trigger NPE : 
> {code}
> admin@localhost:~$ nodetool compactionhistory
> Compaction History: 
> error: null
> -- StackTrace --
> java.lang.NullPointerException
>   at com.google.common.base.Joiner$MapJoiner.join(Joiner.java:330)
>   at org.apache.cassandra.utils.FBUtilities.toString(FBUtilities.java:515)
>   at 
> org.apache.cassandra.db.compaction.CompactionHistoryTabularData.from(CompactionHistoryTabularData.java:78)
>   

[jira] [Commented] (CASSANDRA-9973) java.lang.IllegalStateException: Unable to compute when histogram overflowed

2015-08-23 Thread Mark Manley (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14708448#comment-14708448
 ] 

Mark Manley commented on CASSANDRA-9973:


Do we have an ETA for the 2.2.1 release?  My ring was again crippled this 
morning when several of my nodes spewed out hundreds of these errors a minute, 
which corresponds to the time they stopped answering requests reliably.  If 
there is a workaround for 2.2.0, I am all ears.
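
As a rough way to confirm the per-minute error rate on an affected node, the ERROR header lines can be bucketed by minute (log path is the package default and may differ):

{code}
# Grab the ERROR header line that precedes each "histogram overflowed"
# exception and bucket its timestamp by minute; log path assumed.
grep -B1 'Unable to compute when histogram overflowed' /var/log/cassandra/system.log \
  | awk '/^ERROR/ {print $3, substr($4, 1, 5)}' | sort | uniq -c
{code}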

Thanks!

 java.lang.IllegalStateException: Unable to compute when histogram overflowed
 

 Key: CASSANDRA-9973
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9973
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Mark Manley
Assignee: T Jake Luciani
 Fix For: 2.2.1

 Attachments: 9973.txt


 I recently, and probably mistakenly, upgraded one of my production C* 
 clusters to 2.2.0.  I am seeing these errors in the logs, followed by an 
 intense period of garbage collection until the node, then the ring, becomes 
 crippled:
 {code}
 ERROR [OptionalTasks:1] 2015-08-04 03:24:56,057 CassandraDaemon.java:182 - 
 Exception in thread Thread[OptionalTasks:1,5,main]
 java.lang.IllegalStateException: Unable to compute when histogram overflowed
 at 
 org.apache.cassandra.utils.EstimatedHistogram.percentile(EstimatedHistogram.java:179)
  ~[apache-cassandra-2.2.0.jar:2.2.0]
 at 
 org.apache.cassandra.metrics.EstimatedHistogramReservoir$HistogramSnapshot.getValue(EstimatedHistogramReservoir.java:84)
  ~[apache-cassandra-2.2.0.jar:2.2.0]
 at 
 org.apache.cassandra.db.ColumnFamilyStore$3.run(ColumnFamilyStore.java:405) 
 ~[apache-cassandra-2.2.0.jar:2.2.0]
 at 
 org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:118)
  ~[apache-cassandra-2.2.0.jar:2.2.0]
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
 [na:1.8.0_45]
 at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) 
 [na:1.8.0_45]
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
  [na:1.8.0_45]
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
  [na:1.8.0_45]
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  [na:1.8.0_45]
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  [na:1.8.0_45]
 at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
 {code}
 I am not sure if the GC instability is this or something else, but I thought 
 this histogram overflow issue was fixed in 2.1.3?  Anyway, reporting it now as a 
 possible regression.  Please let me know what I can provide in terms of 
 information to help with this.  Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9973) java.lang.IllegalStateException: Unable to compute when histogram overflowed

2015-08-10 Thread Mark Manley (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14680286#comment-14680286
 ] 

Mark Manley commented on CASSANDRA-9973:


read_request_timeout_in_ms: 1

 java.lang.IllegalStateException: Unable to compute when histogram overflowed
 

 Key: CASSANDRA-9973
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9973
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Mark Manley
Assignee: T Jake Luciani
 Fix For: 2.2.x

 Attachments: 9973.txt


 I recently, and probably mistakenly, upgraded one of my production C* 
 clusters to 2.2.0.  I am seeing these errors in the logs, followed by an 
 intense period of garbage collection until the node, then the ring, becomes 
 crippled:
 {code}
 ERROR [OptionalTasks:1] 2015-08-04 03:24:56,057 CassandraDaemon.java:182 - 
 Exception in thread Thread[OptionalTasks:1,5,main]
 java.lang.IllegalStateException: Unable to compute when histogram overflowed
 at 
 org.apache.cassandra.utils.EstimatedHistogram.percentile(EstimatedHistogram.java:179)
  ~[apache-cassandra-2.2.0.jar:2.2.0]
 at 
 org.apache.cassandra.metrics.EstimatedHistogramReservoir$HistogramSnapshot.getValue(EstimatedHistogramReservoir.java:84)
  ~[apache-cassandra-2.2.0.jar:2.2.0]
 at 
 org.apache.cassandra.db.ColumnFamilyStore$3.run(ColumnFamilyStore.java:405) 
 ~[apache-cassandra-2.2.0.jar:2.2.0]
 at 
 org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:118)
  ~[apache-cassandra-2.2.0.jar:2.2.0]
 at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
 [na:1.8.0_45]
 at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) 
 [na:1.8.0_45]
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
  [na:1.8.0_45]
 at 
 java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
  [na:1.8.0_45]
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  [na:1.8.0_45]
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  [na:1.8.0_45]
 at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
 {code}
 I am not sure if the GC instability is this or something else, but I thought 
 this histogram overflow issue was fixed in 2.1.3?  Anyway, reporting it now as a 
 possible regression.  Please let me know what I can provide in terms of 
 information to help with this.  Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9117) LEAK DETECTED during repair, startup

2015-08-04 Thread Mark Manley (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654118#comment-14654118
 ] 

Mark Manley commented on CASSANDRA-9117:


I'm still seeing this in 2.2.0:

{code}
ERROR [MessagingService-Outgoing-/10.20.44.122] 2015-07-30 18:55:00,589 
OutboundTcpConnection.java:316 - error writing to /10.20.44.122
ERROR [MessagingService-Outgoing-/10.20.44.74] 2015-07-31 10:52:39,346 
OutboundTcpConnection.java:316 - error writing to /10.20.44.74
ERROR [STREAM-OUT-/10.20.44.108] 2015-07-31 20:22:17,052 StreamSession.java:518 
- [Stream #6f73e430-37c1-11e5-9fb4-a322a3bdb126] Streaming error occurred
ERROR [STREAM-IN-/10.20.44.108] 2015-07-31 20:22:18,513 StreamSession.java:518 
- [Stream #6f73e430-37c1-11e5-9fb4-a322a3bdb126] Streaming error occurred
ERROR [Reference-Reaper:1] 2015-07-31 20:22:23,444 Ref.java:187 - LEAK 
DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@35ceb976) to class 
org.apache.cassandra.io.util.SafeMemory$MemoryTidy@990466495:Memory@[7f426f54e880..7f426f54e884)
 was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2015-07-31 20:22:23,445 Ref.java:187 - LEAK 
DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@3f36d206) to class 
org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@2050251652:[[OffHeapBitSet]]
 was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2015-07-31 20:22:23,445 Ref.java:187 - LEAK 
DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@3af158bd) to class 
org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1258578677:Memory@[7f56a130f400..7f56a130fa40)
 was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2015-07-31 20:22:23,445 Ref.java:187 - LEAK 
DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@422b3f71) to class 
org.apache.cassandra.io.util.SafeMemory$MemoryTidy@438019275:Memory@[7f56a0158150..7f56a01581a0)
 was not released before the reference was garbage collected
ERROR [MessagingService-Outgoing-/10.20.44.108] 2015-08-02 00:21:30,685 
OutboundTcpConnection.java:316 - error writing to /10.20.44.108
{code}

 LEAK DETECTED during repair, startup
 

 Key: CASSANDRA-9117
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9117
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Tyler Hobbs
Assignee: Marcus Eriksson
 Fix For: 2.2.0 beta 1

 Attachments: 
 0001-dont-initialize-writer-before-checking-if-iter-is-em.patch, node1.log, 
 node2.log.gz


 When running the 
 {{incremental_repair_test.TestIncRepair.multiple_repair_test}} dtest, the 
 following error logs show up:
 {noformat}
 ERROR [Reference-Reaper:1] 2015-04-03 15:48:25,491 Ref.java:181 - LEAK 
 DETECTED: a reference 
 (org.apache.cassandra.utils.concurrent.Ref$State@83f047e) to class 
 org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1631580268:Memory@[7f354800bdc0..7f354800bde8)
  was not released before the reference was garbage collected
 ERROR [Reference-Reaper:1] 2015-04-03 15:48:25,493 Ref.java:181 - LEAK 
 DETECTED: a reference 
 (org.apache.cassandra.utils.concurrent.Ref$State@50bc8f67) to class 
 org.apache.cassandra.io.util.SafeMemory$MemoryTidy@191552666:Memory@[7f354800ba90..7f354800bdb0)
  was not released before the reference was garbage collected
 ERROR [Reference-Reaper:1] 2015-04-03 15:48:25,493 Ref.java:181 - LEAK 
 DETECTED: a reference 
 (org.apache.cassandra.utils.concurrent.Ref$State@7fd10877) to class 
 org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1954741807:Memory@[7f3548101190..7f3548101194)
  was not released before the reference was garbage collected
 ERROR [Reference-Reaper:1] 2015-04-03 15:48:25,494 Ref.java:181 - LEAK 
 DETECTED: a reference 
 (org.apache.cassandra.utils.concurrent.Ref$State@578550ac) to class 
 org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@1903393047:[[OffHeapBitSet]]
  was not released before the reference was garbage collected
 {noformat}
 The test is being run against trunk (commit {{1dff098e}}).  I've attached a 
 DEBUG-level log from the test run.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-9973) java.lang.IllegalStateException: Unable to compute when histogram overflowed

2015-08-03 Thread Mark Manley (JIRA)
Mark Manley created CASSANDRA-9973:
--

 Summary: java.lang.IllegalStateException: Unable to compute when 
histogram overflowed
 Key: CASSANDRA-9973
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9973
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Mark Manley
 Fix For: 2.2.x


I recently, and probably mistakenly, upgraded one of my production C* clusters 
to 2.2.0.  I am seeing these errors in the logs, followed by an intense period 
of garbage collection until the node, then the ring, becomes crippled:

{code}
ERROR [OptionalTasks:1] 2015-08-04 03:24:56,057 CassandraDaemon.java:182 - 
Exception in thread Thread[OptionalTasks:1,5,main]
java.lang.IllegalStateException: Unable to compute when histogram overflowed
at 
org.apache.cassandra.utils.EstimatedHistogram.percentile(EstimatedHistogram.java:179)
 ~[apache-cassandra-2.2.0.jar:2.2.0]
at 
org.apache.cassandra.metrics.EstimatedHistogramReservoir$HistogramSnapshot.getValue(EstimatedHistogramReservoir.java:84)
 ~[apache-cassandra-2.2.0.jar:2.2.0]
at 
org.apache.cassandra.db.ColumnFamilyStore$3.run(ColumnFamilyStore.java:405) 
~[apache-cassandra-2.2.0.jar:2.2.0]
at 
org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:118)
 ~[apache-cassandra-2.2.0.jar:2.2.0]
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
[na:1.8.0_45]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) 
[na:1.8.0_45]
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
 [na:1.8.0_45]
at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
 [na:1.8.0_45]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[na:1.8.0_45]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_45]
at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
{code}

I am not sure if the GC instability is this or something else, but I thought 
this histogram overflow issue was fixed in 2.1.3?  Anyway, reporting it now as a 
possible regression.  Please let me know what I can provide in terms of 
information to help with this.  Thanks!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-9616) cfstats on 2.1.6 throws fatal exception during compaction cycles

2015-06-18 Thread Mark Manley (JIRA)
Mark Manley created CASSANDRA-9616:
--

 Summary: cfstats on 2.1.6 throws fatal exception during compaction 
cycles
 Key: CASSANDRA-9616
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9616
 Project: Cassandra
  Issue Type: Bug
Reporter: Mark Manley


When running cfstats against any cf that is in the middle of a compaction cycle, 
I get the following exception when it reads the tmplink files:

{code}
error: 
/var/lib/cassandra/data/metric/metric_300-3fc67c00f75911e495a13d7c060fcade/metric-metric_300-tmplink-ka-9300-Data.db
-- StackTrace --
java.lang.AssertionError: 
/var/lib/cassandra/data/metric/metric_300-3fc67c00f75911e495a13d7c060fcade/metric-metric_300-tmplink-ka-9300-Data.db
{code}

This seems to have started when I rolled out 2.1.6.  I don't see a current bug 
in my cursory search, so here you go!
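
For reproduction, the failing call is just cfstats against the affected table while a compaction is writing tmplink files; the keyspace and table names below are read off the data-file path in the error and are only illustrative:

{code}
# Keyspace "metric", table "metric_300", taken from the path in the error above.
# Run while a compaction of that table is in progress to hit the assertion.
nodetool cfstats metric.metric_300
{code}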



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9616) cfstats on 2.1.6 throws fatal exception during compaction cycles

2015-06-18 Thread Mark Manley (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592327#comment-14592327
 ] 

Mark Manley commented on CASSANDRA-9616:


It looks like the same issue via a different call path.  I'll close this as a 
duplicate and link it appropriately.

Thanks!

 cfstats on 2.1.6 throws fatal exception during compaction cycles
 

 Key: CASSANDRA-9616
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9616
 Project: Cassandra
  Issue Type: Bug
Reporter: Mark Manley

 When running cfstats against any cf that is in the middle of a compaction cycle, 
 I get the following exception when it reads the tmplink files:
 {code}
 error: 
 /var/lib/cassandra/data/metric/metric_300-3fc67c00f75911e495a13d7c060fcade/metric-metric_300-tmplink-ka-9300-Data.db
 -- StackTrace --
 java.lang.AssertionError: 
 /var/lib/cassandra/data/metric/metric_300-3fc67c00f75911e495a13d7c060fcade/metric-metric_300-tmplink-ka-9300-Data.db
 {code}
 This seems to have started when I rolled out 2.1.6.  I don't see a current 
 bug in my cursory search, so here you go!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (CASSANDRA-9616) cfstats on 2.1.6 throws fatal exception during compaction cycles

2015-06-18 Thread Mark Manley (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Manley resolved CASSANDRA-9616.

Resolution: Duplicate

 cfstats on 2.1.6 throws fatal exception during compaction cycles
 

 Key: CASSANDRA-9616
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9616
 Project: Cassandra
  Issue Type: Bug
Reporter: Mark Manley

 When running cfstats against any cf that is in the middle of a compaction cycle, 
 I get the following exception when it reads the tmplink files:
 {code}
 error: 
 /var/lib/cassandra/data/metric/metric_300-3fc67c00f75911e495a13d7c060fcade/metric-metric_300-tmplink-ka-9300-Data.db
 -- StackTrace --
 java.lang.AssertionError: 
 /var/lib/cassandra/data/metric/metric_300-3fc67c00f75911e495a13d7c060fcade/metric-metric_300-tmplink-ka-9300-Data.db
 {code}
 This seems to have started when I rolled out 2.1.6.  I don't see a current 
 bug in my cursory search, so here you go!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)