[jira] [Commented] (CASSANDRA-11447) Flush writer deadlock in Cassandra 2.2.5
[ https://issues.apache.org/jira/browse/CASSANDRA-11447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215864#comment-15215864 ]

Mark Manley commented on CASSANDRA-11447:
-----------------------------------------

Fair enough. I am unclear as to how all these tasks were cancelled though.

> Flush writer deadlock in Cassandra 2.2.5
> -----------------------------------------
>
>                 Key: CASSANDRA-11447
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11447
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Mark Manley
>         Attachments: cassandra.jstack.out
[jira] [Commented] (CASSANDRA-11447) Flush writer deadlock in Cassandra 2.2.5
[ https://issues.apache.org/jira/browse/CASSANDRA-11447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215841#comment-15215841 ]

Mark Manley commented on CASSANDRA-11447:
-----------------------------------------

Of course:

INFO [CompactionExecutor:224] 2016-03-28 16:37:55,107 CompactionManager.java:1464 - Compaction interrupted: Compaction@bff6b700-45f1-11e5-9621-a322a3bdb126(counter_service, dsp_metrics_ts_5min, 98651672/2163791990)bytes
INFO [CompactionExecutor:226] 2016-03-28 16:37:55,150 CompactionManager.java:1464 - Compaction interrupted: Compaction@bff6b700-45f1-11e5-9621-a322a3bdb126(counter_service, dsp_metrics_ts_5min, 9698634605/15761279246)bytes
INFO [CompactionExecutor:225] 2016-03-28 16:38:55,206 CompactionManager.java:1464 - Compaction interrupted: Compaction@bff6b700-45f1-11e5-9621-a322a3bdb126(counter_service, dsp_metrics_ts_5min, 0/49808303118)bytes
INFO [CompactionExecutor:229] 2016-03-28 18:05:31,170 CompactionManager.java:1464 - Compaction interrupted: Compaction@c31471c0-45f1-11e5-9621-a322a3bdb126(counter_service, dsp_metrics_log, 9302818642/12244413096)bytes

> Flush writer deadlock in Cassandra 2.2.5
> -----------------------------------------
>
>                 Key: CASSANDRA-11447
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11447
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Mark Manley
>             Fix For: 2.2.x
>
>         Attachments: cassandra.jstack.out
[jira] [Comment Edited] (CASSANDRA-11447) Flush writer deadlock in Cassandra 2.2.5
[ https://issues.apache.org/jira/browse/CASSANDRA-11447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15215841#comment-15215841 ]

Mark Manley edited comment on CASSANDRA-11447 at 3/29/16 11:01 AM:
-------------------------------------------------------------------

Of course:

{code}
INFO [CompactionExecutor:224] 2016-03-28 16:37:55,107 CompactionManager.java:1464 - Compaction interrupted: Compaction@bff6b700-45f1-11e5-9621-a322a3bdb126(counter_service, dsp_metrics_ts_5min, 98651672/2163791990)bytes
INFO [CompactionExecutor:226] 2016-03-28 16:37:55,150 CompactionManager.java:1464 - Compaction interrupted: Compaction@bff6b700-45f1-11e5-9621-a322a3bdb126(counter_service, dsp_metrics_ts_5min, 9698634605/15761279246)bytes
INFO [CompactionExecutor:225] 2016-03-28 16:38:55,206 CompactionManager.java:1464 - Compaction interrupted: Compaction@bff6b700-45f1-11e5-9621-a322a3bdb126(counter_service, dsp_metrics_ts_5min, 0/49808303118)bytes
INFO [CompactionExecutor:229] 2016-03-28 18:05:31,170 CompactionManager.java:1464 - Compaction interrupted: Compaction@c31471c0-45f1-11e5-9621-a322a3bdb126(counter_service, dsp_metrics_log, 9302818642/12244413096)bytes
{code}

> Flush writer deadlock in Cassandra 2.2.5
> -----------------------------------------
>
>                 Key: CASSANDRA-11447
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11447
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Mark Manley
>             Fix For: 2.2.x
>
>         Attachments: cassandra.jstack.out
[jira] [Updated] (CASSANDRA-11447) Flush writer deadlock in Cassandra 2.2.5
[ https://issues.apache.org/jira/browse/CASSANDRA-11447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Manley updated CASSANDRA-11447:
------------------------------------

    Description:

When writing heavily to one of my Cassandra tables, I got a deadlock similar to CASSANDRA-9882:

{code}
"MemtableFlushWriter:4589" #34721 daemon prio=5 os_prio=0 tid=0x05fc11d0 nid=0x7664 waiting for monitor entry [0x7fb83f0e5000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.cassandra.db.compaction.WrappingCompactionStrategy.handleNotification(WrappingCompactionStrategy.java:266)
    - waiting to lock <0x000400956258> (a org.apache.cassandra.db.compaction.WrappingCompactionStrategy)
    at org.apache.cassandra.db.lifecycle.Tracker.notifyAdded(Tracker.java:400)
    at org.apache.cassandra.db.lifecycle.Tracker.replaceFlushed(Tracker.java:332)
    at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.replaceFlushed(AbstractCompactionStrategy.java:235)
    at org.apache.cassandra.db.ColumnFamilyStore.replaceFlushed(ColumnFamilyStore.java:1580)
    at org.apache.cassandra.db.Memtable$FlushRunnable.runMayThrow(Memtable.java:362)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
    at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1139)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{code}

The compaction strategies in this keyspace are mixed, with one table using LCS and the rest using DTCS. None of the tables here, save for the LCS one, seem to have large SSTable counts:

{code}
Table: active_counters
SSTable count: 2
--
Table: aggregation_job_entries
SSTable count: 2
--
Table: dsp_metrics_log
SSTable count: 207
--
Table: dsp_metrics_ts_5min
SSTable count: 3
--
Table: dsp_metrics_ts_day
SSTable count: 2
--
Table: dsp_metrics_ts_hour
SSTable count: 2
{code}

Yet the symptoms are similar.

The "dsp_metrics_ts_5min" table had had a major compaction shortly before all this to get rid of the 400+ SSTable files before this system went into use, but they should have been eliminated.

Have other people seen this? I am attaching a stack trace.

Thanks!
[jira] [Created] (CASSANDRA-11447) Flush writer deadlock in Cassandra 2.2.5
Mark Manley created CASSANDRA-11447:
---------------------------------------

             Summary: Flush writer deadlock in Cassandra 2.2.5
                 Key: CASSANDRA-11447
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11447
             Project: Cassandra
          Issue Type: Bug
            Reporter: Mark Manley
         Attachments: cassandra.jstack.out

When writing heavily to one of my Cassandra tables, I got a deadlock similar to CASSANDRA-9882:

{code}
"MemtableFlushWriter:4589" #34721 daemon prio=5 os_prio=0 tid=0x05fc11d0 nid=0x7664 waiting for monitor entry [0x7fb83f0e5000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.cassandra.db.compaction.WrappingCompactionStrategy.handleNotification(WrappingCompactionStrategy.java:266)
    - waiting to lock <0x000400956258> (a org.apache.cassandra.db.compaction.WrappingCompactionStrategy)
    at org.apache.cassandra.db.lifecycle.Tracker.notifyAdded(Tracker.java:400)
    at org.apache.cassandra.db.lifecycle.Tracker.replaceFlushed(Tracker.java:332)
    at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.replaceFlushed(AbstractCompactionStrategy.java:235)
    at org.apache.cassandra.db.ColumnFamilyStore.replaceFlushed(ColumnFamilyStore.java:1580)
    at org.apache.cassandra.db.Memtable$FlushRunnable.runMayThrow(Memtable.java:362)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
    at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1139)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{code}

The compaction strategies in this keyspace are mixed, with one table using LCS and the rest using DTCS. None of the tables here, save for the LCS one, seem to have large SSTable counts:

{code}
Table: active_counters
SSTable count: 2
--
Table: aggregation_job_entries
SSTable count: 2
--
Table: dsp_metrics_log
SSTable count: 207
--
Table: dsp_metrics_ts_5min
SSTable count: 3
--
Table: dsp_metrics_ts_day
SSTable count: 2
--
Table: dsp_metrics_ts_hour
SSTable count: 2
{code}

Yet the symptoms are similar.

The "dsp_metrics_ts_5min" table had had a major compaction shortly before all this to get rid of the 400+ SSTable files before this system went into use, but they should have been eliminated.

Have other people seen this? I am attaching a stack trace.

Thanks!
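The stack above shows a flush thread blocked on the WrappingCompactionStrategy monitor while it is inside the Tracker's replaceFlushed/notifyAdded path. As background, here is a minimal, hypothetical sketch of that kind of lock-ordering (ABBA) inversion; the lock objects and thread names are stand-ins for illustration only, not Cassandra's actual code:

{code}
// Two stand-in monitors: one for the compaction strategy, one for the
// sstable tracker. Each thread takes them in the opposite order, so both
// end up permanently BLOCKED -- the state the jstack dump reports.
public class FlushDeadlockSketch {
    private static final Object strategyLock = new Object(); // stands in for WrappingCompactionStrategy
    private static final Object trackerLock  = new Object(); // stands in for the lifecycle Tracker

    public static void main(String[] args) {
        Thread flusher = new Thread(() -> {
            synchronized (trackerLock) {          // flush path: replaceFlushed -> notifyAdded
                pause();
                synchronized (strategyLock) {     // blocks: handleNotification needs the strategy monitor
                    System.out.println("flush notified the strategy");
                }
            }
        }, "MemtableFlushWriter");

        Thread compactor = new Thread(() -> {
            synchronized (strategyLock) {         // compaction path: strategy work in progress
                pause();
                synchronized (trackerLock) {      // blocks: the flusher already holds the tracker
                    System.out.println("compaction updated the tracker");
                }
            }
        }, "CompactionExecutor");

        flusher.start();
        compactor.start();                        // classic ABBA deadlock: neither thread proceeds
    }

    private static void pause() {
        try { Thread.sleep(100); } catch (InterruptedException ignored) {}
    }
}
{code}

Run as-is, both threads block forever, which matches the permanently BLOCKED MemtableFlushWriter thread in the attached jstack output.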
[jira] [Commented] (CASSANDRA-9758) nodetool compactionhistory NPE
[ https://issues.apache.org/jira/browse/CASSANDRA-9758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14742695#comment-14742695 ]

Mark Manley commented on CASSANDRA-9758:
----------------------------------------

This seems broken in 2.2.1, FYI:

{code}
$ nodetool compactionhistory
Compaction History:
error: null
-- StackTrace --
java.lang.NullPointerException
    at com.google.common.base.Joiner$MapJoiner.join(Joiner.java:330)
    at org.apache.cassandra.utils.FBUtilities.toString(FBUtilities.java:477)
    at org.apache.cassandra.db.compaction.CompactionHistoryTabularData.from(CompactionHistoryTabularData.java:78)
    at org.apache.cassandra.db.SystemKeyspace.getCompactionHistory(SystemKeyspace.java:425)
    at org.apache.cassandra.db.compaction.CompactionManager.getCompactionHistory(CompactionManager.java:1492)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
    at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
    at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
    at com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:83)
    at com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:206)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:647)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678)
    at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1443)
    at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
    at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1307)
    at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1399)
    at javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:637)
    at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:323)
    at sun.rmi.transport.Transport$1.run(Transport.java:200)
    at sun.rmi.transport.Transport$1.run(Transport.java:197)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
    at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$241(TCPTransport.java:683)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$$Lambda$1/1383269057.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
{code}

> nodetool compactionhistory NPE
> ------------------------------
>
>                 Key: CASSANDRA-9758
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9758
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Pierre N.
>            Priority: Minor
>             Fix For: 3.x
>
>         Attachments: 0001-fix-npe-inline.patch, 9758.txt
>
> nodetool compactionhistory may trigger NPE:
> {code}
> admin@localhost:~$ nodetool compactionhistory
> Compaction History:
> error: null
> -- StackTrace --
> java.lang.NullPointerException
>     at com.google.common.base.Joiner$MapJoiner.join(Joiner.java:330)
>     at org.apache.cassandra.utils.FBUtilities.toString(FBUtilities.java:515)
>     at org.apache.cassandra.db.compaction.CompactionHistoryTabularData.from(CompactionHistoryTabularData.java:78)
> {code}
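On the trace above: the top frame, Joiner$MapJoiner.join, is Guava code, and Guava's Joiner rejects null keys and values by default. The sketch below reproduces only that Joiner behaviour; the map contents are invented, and where the null actually enters the compaction-history path is an assumption, not something the trace proves:

{code}
import com.google.common.base.Joiner;
import java.util.HashMap;
import java.util.Map;

public class JoinerNpeSketch {
    public static void main(String[] args) {
        // Hypothetical stand-in for a compaction-history column such as a rows-merged map.
        Map<Integer, Long> rowsMerged = new HashMap<>();
        rowsMerged.put(1, null);                                   // one null value is enough

        Joiner.MapJoiner joiner = Joiner.on(", ").withKeyValueSeparator(":");
        // joiner.useForNull("null").join(rowsMerged) would succeed;
        // the plain join below throws NullPointerException, as in the trace.
        System.out.println(joiner.join(rowsMerged));
    }
}
{code}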
[jira] [Commented] (CASSANDRA-9973) java.lang.IllegalStateException: Unable to compute when histogram overflowed
[ https://issues.apache.org/jira/browse/CASSANDRA-9973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14708448#comment-14708448 ]

Mark Manley commented on CASSANDRA-9973:
----------------------------------------

Do we have an ETA for the release date of 2.2.1? My ring was again crippled this morning when several of my nodes spewed out hundreds of these errors a minute. It corresponds to the time that they stopped answering requests reliably. If there is a workaround for 2.2.0, I am all ears. Thanks!

> java.lang.IllegalStateException: Unable to compute when histogram overflowed
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-9973
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9973
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Mark Manley
>            Assignee: T Jake Luciani
>             Fix For: 2.2.1
>
>         Attachments: 9973.txt
[jira] [Commented] (CASSANDRA-9973) java.lang.IllegalStateException: Unable to compute when histogram overflowed
[ https://issues.apache.org/jira/browse/CASSANDRA-9973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14680286#comment-14680286 ]

Mark Manley commented on CASSANDRA-9973:
----------------------------------------

read_request_timeout_in_ms: 1

> java.lang.IllegalStateException: Unable to compute when histogram overflowed
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-9973
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9973
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Mark Manley
>            Assignee: T Jake Luciani
>             Fix For: 2.2.x
>
>         Attachments: 9973.txt
[jira] [Commented] (CASSANDRA-9117) LEAK DETECTED during repair, startup
[ https://issues.apache.org/jira/browse/CASSANDRA-9117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14654118#comment-14654118 ]

Mark Manley commented on CASSANDRA-9117:
----------------------------------------

I'm still seeing this in 2.2.0:

{code}
ERROR [MessagingService-Outgoing-/10.20.44.122] 2015-07-30 18:55:00,589 OutboundTcpConnection.java:316 - error writing to /10.20.44.122
ERROR [MessagingService-Outgoing-/10.20.44.74] 2015-07-31 10:52:39,346 OutboundTcpConnection.java:316 - error writing to /10.20.44.74
ERROR [STREAM-OUT-/10.20.44.108] 2015-07-31 20:22:17,052 StreamSession.java:518 - [Stream #6f73e430-37c1-11e5-9fb4-a322a3bdb126] Streaming error occurred
ERROR [STREAM-IN-/10.20.44.108] 2015-07-31 20:22:18,513 StreamSession.java:518 - [Stream #6f73e430-37c1-11e5-9fb4-a322a3bdb126] Streaming error occurred
ERROR [Reference-Reaper:1] 2015-07-31 20:22:23,444 Ref.java:187 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@35ceb976) to class org.apache.cassandra.io.util.SafeMemory$MemoryTidy@990466495:Memory@[7f426f54e880..7f426f54e884) was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2015-07-31 20:22:23,445 Ref.java:187 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@3f36d206) to class org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@2050251652:[[OffHeapBitSet]] was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2015-07-31 20:22:23,445 Ref.java:187 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@3af158bd) to class org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1258578677:Memory@[7f56a130f400..7f56a130fa40) was not released before the reference was garbage collected
ERROR [Reference-Reaper:1] 2015-07-31 20:22:23,445 Ref.java:187 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@422b3f71) to class org.apache.cassandra.io.util.SafeMemory$MemoryTidy@438019275:Memory@[7f56a0158150..7f56a01581a0) was not released before the reference was garbage collected
ERROR [MessagingService-Outgoing-/10.20.44.108] 2015-08-02 00:21:30,685 OutboundTcpConnection.java:316 - error writing to /10.20.44.108
{code}

> LEAK DETECTED during repair, startup
> ------------------------------------
>
>                 Key: CASSANDRA-9117
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9117
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Tyler Hobbs
>            Assignee: Marcus Eriksson
>             Fix For: 2.2.0 beta 1
>
>         Attachments: 0001-dont-initialize-writer-before-checking-if-iter-is-em.patch, node1.log, node2.log.gz
>
> When running the {{incremental_repair_test.TestIncRepair.multiple_repair_test}} dtest, the following error logs show up:
> {noformat}
> ERROR [Reference-Reaper:1] 2015-04-03 15:48:25,491 Ref.java:181 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@83f047e) to class org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1631580268:Memory@[7f354800bdc0..7f354800bde8) was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2015-04-03 15:48:25,493 Ref.java:181 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@50bc8f67) to class org.apache.cassandra.io.util.SafeMemory$MemoryTidy@191552666:Memory@[7f354800ba90..7f354800bdb0) was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2015-04-03 15:48:25,493 Ref.java:181 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@7fd10877) to class org.apache.cassandra.io.util.SafeMemory$MemoryTidy@1954741807:Memory@[7f3548101190..7f3548101194) was not released before the reference was garbage collected
> ERROR [Reference-Reaper:1] 2015-04-03 15:48:25,494 Ref.java:181 - LEAK DETECTED: a reference (org.apache.cassandra.utils.concurrent.Ref$State@578550ac) to class org.apache.cassandra.utils.concurrent.WrappedSharedCloseable$1@1903393047:[[OffHeapBitSet]] was not released before the reference was garbage collected
> {noformat}
> The test is being run against trunk (commit {{1dff098e}}). I've attached a DEBUG-level log from the test run.
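For context on what the Reference-Reaper thread is reporting: these messages come from a phantom-reference leak detector. Below is a stripped-down sketch of that general pattern, with hypothetical names rather than Cassandra's actual Ref/Ref$State implementation; it flags a resource whose referent was garbage collected while its state was never marked released. Note that System.gc() is only a hint, so the sketch may not trigger on every run:

{code}
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.concurrent.atomic.AtomicBoolean;

public class LeakDetectorSketch {
    static final ReferenceQueue<Object> QUEUE = new ReferenceQueue<>();

    // Tracks whether the owner ever called release() before the referent died.
    static class State extends PhantomReference<Object> {
        final AtomicBoolean released = new AtomicBoolean(false);
        State(Object referent) { super(referent, QUEUE); }
        void release() { released.set(true); }
    }

    public static void main(String[] args) throws InterruptedException {
        Object resource = new Object();
        State state = new State(resource);
        // state.release() is deliberately never called: this simulates the leak.
        resource = null;
        System.gc();

        // One iteration of what a reaper thread would loop over.
        Reference<?> collected = QUEUE.remove(5000);
        if (collected instanceof State && !((State) collected).released.get())
            System.err.println("LEAK DETECTED: a reference was not released "
                    + "before the referent was garbage collected");
    }
}
{code}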
[jira] [Created] (CASSANDRA-9973) java.lang.IllegalStateException: Unable to compute when histogram overflowed
Mark Manley created CASSANDRA-9973:
--------------------------------------

             Summary: java.lang.IllegalStateException: Unable to compute when histogram overflowed
                 Key: CASSANDRA-9973
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9973
             Project: Cassandra
          Issue Type: Bug
          Components: Core
            Reporter: Mark Manley
             Fix For: 2.2.x

I recently, and probably mistakenly, upgraded one of my production C* clusters to 2.2.0. I am seeing these errors in the logs, followed by an intense period of garbage collection until the node, and then the ring, becomes crippled:

{code}
ERROR [OptionalTasks:1] 2015-08-04 03:24:56,057 CassandraDaemon.java:182 - Exception in thread Thread[OptionalTasks:1,5,main]
java.lang.IllegalStateException: Unable to compute when histogram overflowed
    at org.apache.cassandra.utils.EstimatedHistogram.percentile(EstimatedHistogram.java:179) ~[apache-cassandra-2.2.0.jar:2.2.0]
    at org.apache.cassandra.metrics.EstimatedHistogramReservoir$HistogramSnapshot.getValue(EstimatedHistogramReservoir.java:84) ~[apache-cassandra-2.2.0.jar:2.2.0]
    at org.apache.cassandra.db.ColumnFamilyStore$3.run(ColumnFamilyStore.java:405) ~[apache-cassandra-2.2.0.jar:2.2.0]
    at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:118) ~[apache-cassandra-2.2.0.jar:2.2.0]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_45]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_45]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_45]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_45]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_45]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_45]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
{code}

I am not sure if the GC instability is this or something else, but I thought this histogram overflow issue was fixed in 2.1.3? Anyway, I am reporting it now as a possible regression. Please let me know what I can provide in terms of information to help with this. Thanks!
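For readers wondering how a percentile read can fail outright: an estimated histogram keeps a fixed set of buckets plus a final overflow bucket, and once any sample lands in the overflow bucket there is no longer a meaningful upper bound, so the percentile computation refuses to answer. A simplified sketch of that mechanism follows; the bucket offsets are invented for illustration and are not Cassandra's actual sizing:

{code}
public class HistogramOverflowSketch {
    private final long[] offsets;   // upper bound of each regular bucket
    private final long[] buckets;   // buckets[buckets.length - 1] is the overflow bucket

    HistogramOverflowSketch(long[] offsets) {
        this.offsets = offsets;
        this.buckets = new long[offsets.length + 1];
    }

    void add(long value) {
        for (int i = 0; i < offsets.length; i++) {
            if (value <= offsets[i]) { buckets[i]++; return; }
        }
        buckets[buckets.length - 1]++;              // value exceeds the largest bucket
    }

    long percentile(double p) {
        if (buckets[buckets.length - 1] > 0)
            throw new IllegalStateException("Unable to compute when histogram overflowed");
        long total = 0;
        for (long b : buckets) total += b;
        long target = (long) Math.ceil(p * total);
        long seen = 0;
        for (int i = 0; i < offsets.length; i++) {
            seen += buckets[i];
            if (seen >= target) return offsets[i];
        }
        return 0;
    }

    public static void main(String[] args) {
        HistogramOverflowSketch h = new HistogramOverflowSketch(new long[]{ 1, 10, 100 });
        h.add(5);
        h.add(1000);                                // lands in the overflow bucket
        try {
            System.out.println(h.percentile(0.99));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());     // same message as the log above
        }
    }
}
{code}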
[jira] [Created] (CASSANDRA-9616) cfstats on 2.1.6 throws fatal exception during compaction cycles
Mark Manley created CASSANDRA-9616:
--------------------------------------

             Summary: cfstats on 2.1.6 throws fatal exception during compaction cycles
                 Key: CASSANDRA-9616
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9616
             Project: Cassandra
          Issue Type: Bug
            Reporter: Mark Manley

When running cfstats against any cf that is doing a compaction cycle, I get the following exception for its reading of tmplink files:

{code}
error: /var/lib/cassandra/data/metric/metric_300-3fc67c00f75911e495a13d7c060fcade/metric-metric_300-tmplink-ka-9300-Data.db
-- StackTrace --
java.lang.AssertionError: /var/lib/cassandra/data/metric/metric_300-3fc67c00f75911e495a13d7c060fcade/metric-metric_300-tmplink-ka-9300-Data.db
{code}

This seems to have started when I rolled out 2.1.6. I don't see a current bug in my cursory search, so here you go!
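One plausible shape for this failure is a time-of-check/time-of-use race: the stats walk lists the data directory, and a compaction finishing in between deletes its transient -tmplink- hard links before the per-file read happens. The sketch below is a hypothetical illustration of that race only, not Cassandra's actual cfstats code; the path is copied from the report as an example:

{code}
import java.io.File;

public class TmplinkRaceSketch {
    public static void main(String[] args) {
        File dataDir = new File("/var/lib/cassandra/data/metric/metric_300-3fc67c00f75911e495a13d7c060fcade");

        File[] sstableFiles = dataDir.listFiles();  // time of check: snapshot of the directory
        if (sstableFiles == null)
            return;

        for (File f : sstableFiles) {
            // ...a compaction completing here removes its -tmplink- files...
            if (!f.exists())                        // time of use: the file may be gone
                throw new AssertionError(f.getAbsolutePath()); // mirrors the cfstats error
            // read per-sstable stats from f here
        }
    }
}
{code}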
[jira] [Commented] (CASSANDRA-9616) cfstats on 2.1.6 throws fatal exception during compaction cycles
[ https://issues.apache.org/jira/browse/CASSANDRA-9616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592327#comment-14592327 ]

Mark Manley commented on CASSANDRA-9616:
----------------------------------------

It looks like the same issue with a different call. I'll close this as a dup and will link this appropriately. Thanks!

> cfstats on 2.1.6 throws fatal exception during compaction cycles
> -----------------------------------------------------------------
>
>                 Key: CASSANDRA-9616
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9616
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Mark Manley
[jira] [Resolved] (CASSANDRA-9616) cfstats on 2.1.6 throws fatal exception during compaction cycles
[ https://issues.apache.org/jira/browse/CASSANDRA-9616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Manley resolved CASSANDRA-9616.
------------------------------------
    Resolution: Duplicate

> cfstats on 2.1.6 throws fatal exception during compaction cycles
> -----------------------------------------------------------------
>
>                 Key: CASSANDRA-9616
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9616
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Mark Manley