[jira] [Commented] (CASSANDRA-6879) ConcurrentModificationException while doing range slice query.
[ https://issues.apache.org/jira/browse/CASSANDRA-6879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941552#comment-13941552 ]

Shao-Chuan Wang commented on CASSANDRA-6879:

[~mishail] From reading the code, it is unclear to me why we need to wait for {code}resolver.repairResults{code}. If the wait is critical, then this 'fix' hides the alarm that the wait on resolver.repairResults is not being satisfied. Looking at the history, the last change to {code} FBUtilities.waitOnFutures(resolver.repairResults, DatabaseDescriptor.getWriteRpcTimeout()) {code} was the revert of CASSANDRA-1337, which is being redone in 2.1. I don't have much context on the prior changes yet. Does anyone know more about the context here?

ConcurrentModificationException while doing range slice query.
--
Key: CASSANDRA-6879
URL: https://issues.apache.org/jira/browse/CASSANDRA-6879
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: 2.0.4
Reporter: Shao-Chuan Wang
Assignee: Mikhail Stepura
Fix For: 2.0.7
Attachments: cassandra-2.0-6879.patch

Paging read requests (from either Thrift or the native protocol) sporadically fail because of a race condition between read repair and the requesting thread, which waits on the read repair results list. The READ_REPAIR task is queued in ReadCallback.maybeResolveForRepair(), and there seems to be no guarantee that its resolve() method (which internally creates a RangeSliceResponseResolver.Reducer and calls repairResults.addAll inside that Reducer) is invoked before the requesting thread starts waiting on resolver.repairResults. So there is a small window in which the list is only partially populated when the requesting thread starts waiting on repairResults. I believe that most of the time the requesting thread either waits for the entire set of repair results or does not wait for repair results at all.
The original intent here seems to be waiting for repair results always (if the repair is triggered by repair chance). {code} ERROR [Native-Transport-Requests:70827] 2014-03-18 05:00:12,774 ErrorMessage.java (line 222) Unexpected exception during request java.util.ConcurrentModificationException at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859) at java.util.ArrayList$Itr.next(ArrayList.java:831) at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:423) at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:1583) at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:188) at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:163) at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:58) at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:188) at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:358) at org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:131) at org.apache.cassandra.transport.Message$Dispatcher.messageReceived(Message.java:304) at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) at org.jboss.netty.handler.execution.ChannelUpstreamEventRunnable.doRun(ChannelUpstreamEventRunnable.java:43) at org.jboss.netty.handler.execution.ChannelEventRunnable.run(ChannelEventRunnable.java:67) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) {code} {code} ERROR [Thrift:1] 2014-03-18 07:18:02,434 
CustomTThreadPoolServer.java (line 212) Error occurred during processing of message. java.util.ConcurrentModificationException at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859) at java.util.ArrayList$Itr.next(ArrayList.java:831) at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:423) at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:1583) at org.apache.cassandra.service.pager.RangeSliceQueryPager.queryNextPage(RangeSliceQueryPager.java:85) at
[jira] [Created] (CASSANDRA-6879) ConcurrentModificationException while doing range slice query.
Shao-Chuan Wang created CASSANDRA-6879:
--
Summary: ConcurrentModificationException while doing range slice query.
Key: CASSANDRA-6879
URL: https://issues.apache.org/jira/browse/CASSANDRA-6879
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: 2.0.4
Reporter: Shao-Chuan Wang

Paging read requests (from either Thrift or the native protocol) sporadically fail because of a race condition between read repair and the requesting thread, which waits on the read repair results list. The READ_REPAIR task is queued in ReadCallback.maybeResolveForRepair(), and there seems to be no guarantee that its resolve() method (which internally creates a RangeSliceResponseResolver.Reducer and calls repairResults.addAll inside that Reducer) is invoked before the requesting thread starts waiting on resolver.repairResults. So there is a small window in which the list is only partially populated when the requesting thread starts waiting on repairResults. I believe that most of the time the requesting thread either waits for the entire set of repair results or does not wait for repair results at all. The original intent here seems to be to always wait for repair results (if the repair is triggered by repair chance).
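The window described above can be imitated in a few lines. The following is only a sketch and not Cassandra code: the class and method names are hypothetical, the "resolver" append is simulated on the same thread so that the outcome is deterministic, and CopyOnWriteArrayList is shown merely as one possible way to give the waiting side a stable snapshot, not as the change made in the attached patch.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Future;

// Hypothetical stand-in for the waiter/resolver interaction; not Cassandra code.
final class RepairResultsRaceSketch {

    /**
     * Iterates over the list the way a "wait on all futures" loop would,
     * while new futures are appended during the iteration.
     * Returns true if the iteration fails with ConcurrentModificationException.
     */
    public static boolean failsWhenListGrowsDuringIteration(List<Future<String>> repairResults) {
        repairResults.add(CompletableFuture.completedFuture("repair-1"));
        repairResults.add(CompletableFuture.completedFuture("repair-2"));
        try {
            for (Future<String> ignored : repairResults) {
                // Simulates the resolver adding more repair results
                // after the waiting side has already started iterating.
                repairResults.add(CompletableFuture.completedFuture("late-repair"));
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // ArrayList iterators are fail-fast: the structural change is detected.
        System.out.println(failsWhenListGrowsDuringIteration(new ArrayList<>()));            // true
        // CopyOnWriteArrayList iterators work on a snapshot taken at creation time.
        System.out.println(failsWhenListGrowsDuringIteration(new CopyOnWriteArrayList<>())); // false
    }
}
```

Note that a snapshot only removes the exception; it does not by itself guarantee that the waiter sees every repair result, which is the ordering question raised in the description.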
{code} ERROR [Native-Transport-Requests:70827] 2014-03-18 05:00:12,774 ErrorMessage.java (line 222) Unexpected exception during request java.util.ConcurrentModificationException at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859) at java.util.ArrayList$Itr.next(ArrayList.java:831) at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:423) at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:1583) at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:188) at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:163) at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:58) at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:188) at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:358) at org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:131) at org.apache.cassandra.transport.Message$Dispatcher.messageReceived(Message.java:304) at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) at org.jboss.netty.handler.execution.ChannelUpstreamEventRunnable.doRun(ChannelUpstreamEventRunnable.java:43) at org.jboss.netty.handler.execution.ChannelEventRunnable.run(ChannelEventRunnable.java:67) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) {code} {code} ERROR [Thrift:1] 2014-03-18 07:18:02,434 CustomTThreadPoolServer.java (line 212) Error occurred during processing of message. 
java.util.ConcurrentModificationException at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859) at java.util.ArrayList$Itr.next(ArrayList.java:831) at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:423) at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:1583) at org.apache.cassandra.service.pager.RangeSliceQueryPager.queryNextPage(RangeSliceQueryPager.java:85) at org.apache.cassandra.service.pager.AbstractQueryPager.fetchPage(AbstractQueryPager.java:71) at org.apache.cassandra.service.pager.RangeSliceQueryPager.fetchPage(RangeSliceQueryPager.java:36) at org.apache.cassandra.cql3.statements.SelectStatement.pageCountQuery(SelectStatement.java:202) at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:169) at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:58) at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:188) at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:222) at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:212) at org.apache.cassandra.thrift.CassandraServer.execute_cql3_query(CassandraServer.java:1958) at org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4486) at
[jira] [Updated] (CASSANDRA-6879) ConcurrentModificationException while doing range slice query.
[ https://issues.apache.org/jira/browse/CASSANDRA-6879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shao-Chuan Wang updated CASSANDRA-6879:
---
Description:
Paging read requests (from either Thrift or the native protocol) sporadically fail because of a race condition between read repair and the requesting thread, which waits on the read repair results list. The READ_REPAIR task is queued in ReadCallback.maybeResolveForRepair(), and there seems to be no guarantee that its resolve() method (which internally creates a RangeSliceResponseResolver.Reducer and calls repairResults.addAll inside that Reducer) is invoked before the requesting thread starts waiting on resolver.repairResults. So there is a small window in which the list is only partially populated when the requesting thread starts waiting on repairResults. I believe that most of the time the requesting thread either waits for the entire set of repair results or does not wait for repair results at all. The original intent here seems to be to always wait for repair results (if the repair is triggered by repair chance).
{code} ERROR [Native-Transport-Requests:70827] 2014-03-18 05:00:12,774 ErrorMessage.java (line 222) Unexpected exception during request java.util.ConcurrentModificationException at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859) at java.util.ArrayList$Itr.next(ArrayList.java:831) at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:423) at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:1583) at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:188) at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:163) at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:58) at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:188) at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:358) at org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:131) at org.apache.cassandra.transport.Message$Dispatcher.messageReceived(Message.java:304) at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) at org.jboss.netty.handler.execution.ChannelUpstreamEventRunnable.doRun(ChannelUpstreamEventRunnable.java:43) at org.jboss.netty.handler.execution.ChannelEventRunnable.run(ChannelEventRunnable.java:67) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) {code} {code} ERROR [Thrift:1] 2014-03-18 07:18:02,434 CustomTThreadPoolServer.java (line 212) Error occurred during processing of message. 
java.util.ConcurrentModificationException at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859) at java.util.ArrayList$Itr.next(ArrayList.java:831) at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:423) at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:1583) at org.apache.cassandra.service.pager.RangeSliceQueryPager.queryNextPage(RangeSliceQueryPager.java:85) at org.apache.cassandra.service.pager.AbstractQueryPager.fetchPage(AbstractQueryPager.java:71) at org.apache.cassandra.service.pager.RangeSliceQueryPager.fetchPage(RangeSliceQueryPager.java:36) at org.apache.cassandra.cql3.statements.SelectStatement.pageCountQuery(SelectStatement.java:202) at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:169) at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:58) at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:188) at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:222) at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:212) at org.apache.cassandra.thrift.CassandraServer.execute_cql3_query(CassandraServer.java:1958) at org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4486) at org.apache.cassandra.thrift.Cassandra$Processor$execute_cql3_query.getResult(Cassandra.java:4470) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at
[jira] [Commented] (CASSANDRA-6879) ConcurrentModificationException while doing range slice query.
[ https://issues.apache.org/jira/browse/CASSANDRA-6879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13939686#comment-13939686 ]

Shao-Chuan Wang commented on CASSANDRA-6879:

[~mishail] Yes. We added some patches on top of that. They add extra logging, select live endpoints in filterForQuery based on a new DatabaseDescriptor setting, and throttle mutations when reads come in for remote nodes; the throttling can be enabled or disabled through DatabaseDescriptor, and it is disabled on the nodes that show these exceptions. The patches do not seem to me to be the cause of the bug; however, they could make the issue more likely to happen.

ConcurrentModificationException while doing range slice query.
--
Key: CASSANDRA-6879
URL: https://issues.apache.org/jira/browse/CASSANDRA-6879
Project: Cassandra
Issue Type: Bug
Components: Core
Environment: 2.0.4
Reporter: Shao-Chuan Wang

Paging read requests (from either Thrift or the native protocol) sporadically fail because of a race condition between read repair and the requesting thread, which waits on the read repair results list. The READ_REPAIR task is queued in ReadCallback.maybeResolveForRepair(), and there seems to be no guarantee that its resolve() method (which internally creates a RangeSliceResponseResolver.Reducer and calls repairResults.addAll inside that Reducer) is invoked before the requesting thread starts waiting on resolver.repairResults. So there is a small window in which the list is only partially populated when the requesting thread starts waiting on repairResults. I believe that most of the time the requesting thread either waits for the entire set of repair results or does not wait for repair results at all. The original intent here seems to be to always wait for repair results (if the repair is triggered by repair chance).
{code} ERROR [Native-Transport-Requests:70827] 2014-03-18 05:00:12,774 ErrorMessage.java (line 222) Unexpected exception during request java.util.ConcurrentModificationException at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859) at java.util.ArrayList$Itr.next(ArrayList.java:831) at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:423) at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:1583) at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:188) at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:163) at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:58) at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:188) at org.apache.cassandra.cql3.QueryProcessor.processPrepared(QueryProcessor.java:358) at org.apache.cassandra.transport.messages.ExecuteMessage.execute(ExecuteMessage.java:131) at org.apache.cassandra.transport.Message$Dispatcher.messageReceived(Message.java:304) at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) at org.jboss.netty.handler.execution.ChannelUpstreamEventRunnable.doRun(ChannelUpstreamEventRunnable.java:43) at org.jboss.netty.handler.execution.ChannelEventRunnable.run(ChannelEventRunnable.java:67) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) {code} {code} ERROR [Thrift:1] 2014-03-18 07:18:02,434 CustomTThreadPoolServer.java (line 212) Error occurred during processing of message. 
java.util.ConcurrentModificationException at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859) at java.util.ArrayList$Itr.next(ArrayList.java:831) at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:423) at org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:1583) at org.apache.cassandra.service.pager.RangeSliceQueryPager.queryNextPage(RangeSliceQueryPager.java:85) at org.apache.cassandra.service.pager.AbstractQueryPager.fetchPage(AbstractQueryPager.java:71) at org.apache.cassandra.service.pager.RangeSliceQueryPager.fetchPage(RangeSliceQueryPager.java:36) at
[jira] [Created] (CASSANDRA-6576) StreamMessage.deserialize takes high cpu and does not seem to make progress
Shao-Chuan Wang created CASSANDRA-6576:
--
Summary: StreamMessage.deserialize takes high cpu and does not seem to make progress
Key: CASSANDRA-6576
URL: https://issues.apache.org/jira/browse/CASSANDRA-6576
Project: Cassandra
Issue Type: Bug
Environment: AWS EC2 machines, java version 1.7.0_45 Java(TM) SE Runtime Environment (build 1.7.0_45-b18) Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)
Reporter: Shao-Chuan Wang

One of my machines seems to be stuck while streaming in data. htop output on node 10.97.135.32:
{code}
  PID USER   PRI NI  VIRT   RES  SHR S CPU% MEM%   TIME+ Command
32495 ubuntu  20  0 31.7G 13.9G 487M S 334. 46.7    146h /usr/lib/jvm/java-7-oracle/bin/java -Dcassandra.ring_delay_ms=18 -ea -
19396 ubuntu  20  0 31.7G 13.9G 487M S 74.0 46.7 9h20:34 /usr/lib/jvm/java-7-oracle/bin/java -Dcassandra.ring_delay_ms=18 -ea -
 4199 ubuntu  20  0 31.7G 13.9G 487M R 72.0 46.7 9h20:06 /usr/lib/jvm/java-7-oracle/bin/java -Dcassandra.ring_delay_ms=18 -ea -
 2737 ubuntu  20  0 31.7G 13.9G 487M R 69.0 46.7 2h27:54 /usr/lib/jvm/java-7-oracle/bin/java -Dcassandra.ring_delay_ms=18 -ea -
31304 ubuntu  20  0 31.7G 13.9G 487M S 63.0 46.7 3h00:42 /usr/lib/jvm/java-7-oracle/bin/java -Dcassandra.ring_delay_ms=18 -ea -
{code}
jstack output for the busy threads above:
{code}
STREAM-IN-/10.122.50.31 daemon prio=10 tid=0x7f4d480e5000 nid=0xab1 runnable [0x7f4d94396000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.NativeThread.current(Native Method)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:325)
    - locked 0x00058290b038 (a java.lang.Object)
    - locked 0x000582908de0 (a java.lang.Object)
    at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:50)
    at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:287)
    at java.lang.Thread.run(Thread.java:744)
--
STREAM-IN-/10.154.136.39 daemon prio=10 tid=0x7f4a599e6000 nid=0x7a48 runnable [0x7f4d0170]
   java.lang.Thread.State: RUNNABLE
    at
sun.nio.ch.NativeThread.current(Native Method) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:325) - locked 0x0005830519a0 (a java.lang.Object) - locked 0x00058303f000 (a java.lang.Object) at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:50) at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:287) at java.lang.Thread.run(Thread.java:744) -- STREAM-IN-/10.44.183.111 daemon prio=10 tid=0x7f4d480e0800 nid=0x4bc4 runnable [0x7f4d1167b000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.NativeThread.current(Native Method) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:325) - locked 0x000583b796f8 (a java.lang.Object) - locked 0x000583b76598 (a java.lang.Object) at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:50) at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:287) at java.lang.Thread.run(Thread.java:744) -- STREAM-IN-/10.178.13.230 daemon prio=10 tid=0x7f4a59a12000 nid=0x1067 runnable [0x7f4d23ca2000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.NativeThread.current(Native Method) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:325) - locked 0x000582908000 (a java.lang.Object) - locked 0x000582906150 (a java.lang.Object) at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:50) at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:287) at java.lang.Thread.run(Thread.java:744) {code} After several hours, these above stacks look almost exactly the same, except some thread scheduling. 
{code} STREAM-IN-/10.122.50.31 daemon prio=10 tid=0x7f4d480e5000 nid=0xab1 runnable [0x7f4d94396000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.FileDispatcherImpl.read0(Native Method) at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) at sun.nio.ch.IOUtil.read(IOUtil.java:197) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379) - locked 0x000582908de0 (a java.lang.Object) at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:50) at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:287) at
[jira] [Updated] (CASSANDRA-6576) StreamMessage.deserialize takes high cpu and does not seem to make progress
[ https://issues.apache.org/jira/browse/CASSANDRA-6576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shao-Chuan Wang updated CASSANDRA-6576:
---
Environment: AWS EC2 machines, java version 1.7.0_45 Java(TM) SE Runtime Environment (build 1.7.0_45-b18) Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode) Each node has 32 vnodes.
was: AWS EC2 machines, java version 1.7.0_45 Java(TM) SE Runtime Environment (build 1.7.0_45-b18) Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)

StreamMessage.deserialize takes high cpu and does not seem to make progress
---
Key: CASSANDRA-6576
URL: https://issues.apache.org/jira/browse/CASSANDRA-6576
Project: Cassandra
Issue Type: Bug
Environment: AWS EC2 machines, java version 1.7.0_45 Java(TM) SE Runtime Environment (build 1.7.0_45-b18) Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode) Each node has 32 vnodes.
Reporter: Shao-Chuan Wang
Assignee: Yuki Morishita
Labels: streaming

One of my machines seems to be stuck while streaming in data. htop output on node 10.97.135.32:
{code}
PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command
32495 ubuntu 20 0 31.7G 13.9G 487M S 334.
46.7 146h /usr/lib/jvm/java-7-oracle/bin/java -Dcassandra.ring_delay_ms=18 -ea - 19396 ubuntu 20 0 31.7G 13.9G 487M S 74.0 46.7 9h20:34 /usr/lib/jvm/java-7-oracle/bin/java -Dcassandra.ring_delay_ms=18 -ea - 4199 ubuntu 20 0 31.7G 13.9G 487M R 72.0 46.7 9h20:06 /usr/lib/jvm/java-7-oracle/bin/java -Dcassandra.ring_delay_ms=18 -ea - 2737 ubuntu 20 0 31.7G 13.9G 487M R 69.0 46.7 2h27:54 /usr/lib/jvm/java-7-oracle/bin/java -Dcassandra.ring_delay_ms=18 -ea - 31304 ubuntu 20 0 31.7G 13.9G 487M S 63.0 46.7 3h00:42 /usr/lib/jvm/java-7-oracle/bin/java -Dcassandra.ring_delay_ms=18 -ea - {code} Jstack for the above busy threads: {code} STREAM-IN-/10.122.50.31 daemon prio=10 tid=0x7f4d480e5000 nid=0xab1 runnable [0x7f4d94396000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.NativeThread.current(Native Method) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:325) - locked 0x00058290b038 (a java.lang.Object) - locked 0x000582908de0 (a java.lang.Object) at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:50) at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:287) at java.lang.Thread.run(Thread.java:744) -- STREAM-IN-/10.154.136.39 daemon prio=10 tid=0x7f4a599e6000 nid=0x7a48 runnable [0x7f4d0170] java.lang.Thread.State: RUNNABLE at sun.nio.ch.NativeThread.current(Native Method) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:325) - locked 0x0005830519a0 (a java.lang.Object) - locked 0x00058303f000 (a java.lang.Object) at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:50) at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:287) at java.lang.Thread.run(Thread.java:744) -- STREAM-IN-/10.44.183.111 daemon prio=10 tid=0x7f4d480e0800 nid=0x4bc4 runnable [0x7f4d1167b000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.NativeThread.current(Native Method) at 
sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:325) - locked 0x000583b796f8 (a java.lang.Object) - locked 0x000583b76598 (a java.lang.Object) at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:50) at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:287) at java.lang.Thread.run(Thread.java:744) -- STREAM-IN-/10.178.13.230 daemon prio=10 tid=0x7f4a59a12000 nid=0x1067 runnable [0x7f4d23ca2000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.NativeThread.current(Native Method) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:325) - locked 0x000582908000 (a java.lang.Object) - locked 0x000582906150 (a java.lang.Object) at org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:50) at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:287) at java.lang.Thread.run(Thread.java:744) {code} After several hours, these above stacks look almost exactly the same, except some thread scheduling. {code} STREAM-IN-/10.122.50.31 daemon prio=10
[jira] [Created] (CASSANDRA-6577) ConcurrentModificationException during nodetool netstats
Shao-Chuan Wang created CASSANDRA-6577: -- Summary: ConcurrentModificationException during nodetool netstats Key: CASSANDRA-6577 URL: https://issues.apache.org/jira/browse/CASSANDRA-6577 Project: Cassandra Issue Type: Bug Environment: AWS EC2 machines, java version 1.7.0_45 Java(TM) SE Runtime Environment (build 1.7.0_45-b18) Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode) Each node has 32 vnodes. Reporter: Shao-Chuan Wang The node is leaving and I wanted to check its netstats, but it raises ConcurrentModificationException. {code} [ubuntu@ip-10-4-202-48 :~]# /mnt/cassandra_latest/bin/nodetool netstats Mode: LEAVING Exception in thread main java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:926) at java.util.HashMap$ValueIterator.next(HashMap.java:954) at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48) at com.google.common.collect.Iterators.addAll(Iterators.java:357) at com.google.common.collect.Lists.newArrayList(Lists.java:146) at com.google.common.collect.Lists.newArrayList(Lists.java:128) at org.apache.cassandra.streaming.management.SessionInfoCompositeData.toArrayOfCompositeData(SessionInfoCompositeData.java:161) at org.apache.cassandra.streaming.management.SessionInfoCompositeData.toCompositeData(SessionInfoCompositeData.java:98) at org.apache.cassandra.streaming.management.StreamStateCompositeData$1.apply(StreamStateCompositeData.java:82) at org.apache.cassandra.streaming.management.StreamStateCompositeData$1.apply(StreamStateCompositeData.java:79) at com.google.common.collect.Iterators$8.transform(Iterators.java:794) at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48) at com.google.common.collect.Iterators.addAll(Iterators.java:357) at com.google.common.collect.Lists.newArrayList(Lists.java:146) at com.google.common.collect.Lists.newArrayList(Lists.java:128) at 
org.apache.cassandra.streaming.management.StreamStateCompositeData.toCompositeData(StreamStateCompositeData.java:78) at org.apache.cassandra.streaming.StreamManager$1.apply(StreamManager.java:87) at org.apache.cassandra.streaming.StreamManager$1.apply(StreamManager.java:84) at com.google.common.collect.Iterators$8.transform(Iterators.java:794) at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48) at com.google.common.collect.Iterators.addAll(Iterators.java:357) at com.google.common.collect.Sets.newHashSet(Sets.java:238) at com.google.common.collect.Sets.newHashSet(Sets.java:218) at org.apache.cassandra.streaming.StreamManager.getCurrentStreams(StreamManager.java:83) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75) at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46) at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237) at com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:83) at com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:206) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:647) at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678) at 
javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1464) at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97) at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328) at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420) at javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:657) at sun.reflect.GeneratedMethodAccessor54.invoke(Unknown Source) at
[jira] [Commented] (CASSANDRA-6577) ConcurrentModificationException during nodetool netstats
[ https://issues.apache.org/jira/browse/CASSANDRA-6577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868951#comment-13868951 ] Shao-Chuan Wang commented on CASSANDRA-6577: Running nodetool removenode force failed. {code} ubuntu@ip-10-71-141-158:~$ /mnt/cassandra_latest/bin/nodetool removenode force RemovalStatus: Removing token (-8408964321996035122). Waiting for replication confirmation from [/10.97.135.32]. Exception in thread main java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:926) at java.util.HashMap$KeyIterator.next(HashMap.java:960) at org.apache.cassandra.service.StorageService.forceRemoveCompletion(StorageService.java:3041) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:75) at sun.reflect.GeneratedMethodAccessor11.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:279) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46) at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237) at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138) at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819) at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801) at 
javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1487) at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:97) at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1328) at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1420) at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:848) at sun.reflect.GeneratedMethodAccessor37.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322) at sun.rmi.transport.Transport$1.run(Transport.java:177) at sun.rmi.transport.Transport$1.run(Transport.java:174) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Transport.java:173) at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:556) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:811) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:670) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) {code} ConcurrentModificationException during nodetool netstats Key: CASSANDRA-6577 URL: https://issues.apache.org/jira/browse/CASSANDRA-6577 Project: Cassandra Issue Type: Bug Environment: AWS EC2 machines, java version 1.7.0_45 Java(TM) SE Runtime Environment (build 1.7.0_45-b18) Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode) Each node has 32 vnodes. Reporter: Shao-Chuan Wang Labels: decommission, nodetool The node is leaving and I wanted to check its netstats, but it raises ConcurrentModificationException. 
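The traces on this issue fail inside a java.util.HashMap iterator, which is how a fail-fast iterator reacts when the map is structurally modified while it is being walked. Below is a minimal Java sketch of that failure mode. It is not Cassandra code: the names IterationSketch, seed and failsFast are hypothetical, and a single thread is used only to make the failure deterministic. Whether a concurrent map is the right fix for the affected Cassandra code is an assumption, not something this ticket confirms.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class IterationSketch {
    // Fills the given map with three entries so the iterator has more than one step.
    static Map<String, String> seed(Map<String, String> map) {
        map.put("a", "1");
        map.put("b", "2");
        map.put("c", "3");
        return map;
    }

    // Adds a new key while iterating over the key set.
    // Returns true if the iterator noticed the change and threw.
    static boolean failsFast(Map<String, String> map) {
        try {
            for (String key : map.keySet()) {
                map.put("added-during-iteration", key); // structural change mid-iteration
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("HashMap: " + failsFast(seed(new HashMap<String, String>())));
        System.out.println("ConcurrentHashMap: " + failsFast(seed(new ConcurrentHashMap<String, String>())));
    }
}
```

With a plain HashMap the loop throws on the step after the insert; ConcurrentHashMap iterators are weakly consistent and do not throw. Copying a HashMap before iterating only helps if the copy is taken under the same lock as the writers, since the copy itself iterates the map.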
{code} [ubuntu@ip-10-4-202-48 :~]# /mnt/cassandra_latest/bin/nodetool netstats Mode: LEAVING Exception in thread main java.util.ConcurrentModificationException at java.util.HashMap$HashIterator.nextEntry(HashMap.java:926) at java.util.HashMap$ValueIterator.next(HashMap.java:954) at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48) at com.google.common.collect.Iterators.addAll(Iterators.java:357) at com.google.common.collect.Lists.newArrayList(Lists.java:146) at
[jira] [Commented] (CASSANDRA-6542) nodetool removenode hangs
[ https://issues.apache.org/jira/browse/CASSANDRA-6542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868463#comment-13868463 ] Shao-Chuan Wang commented on CASSANDRA-6542: We have seen similar behavior here. We are using 2.0.1. We have no choice but to call *removenode force* to remove the node from the ring. nodetool removenode hangs - Key: CASSANDRA-6542 URL: https://issues.apache.org/jira/browse/CASSANDRA-6542 Project: Cassandra Issue Type: Bug Components: Core Environment: Ubuntu 12, 1.2.11 DSE Reporter: Eric Lubow Running *nodetool removenode $host-id* doesn't actually remove the node from the ring. I've let it run anywhere from 5 minutes to 3 days, and there are no messages in the log about it hanging or failing; the command just sits there running. So the regular response has been to run *nodetool removenode $host-id*, give it about 10-15 minutes, and then run *nodetool removenode force*. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (CASSANDRA-6469) FSError in org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer
[ https://issues.apache.org/jira/browse/CASSANDRA-6469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868481#comment-13868481 ] Shao-Chuan Wang commented on CASSANDRA-6469: ERROR [ReadStage:54] 2014-01-10 02:32:34,224 CassandraDaemon.java (line 185) Exception in thread Thread[ReadStage:54,5,main] FSReadError in /mnt/cassandra/data/pagedb/tablename/maskedname-jb-57-Data.db at org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:95) at org.apache.cassandra.io.util.RandomAccessReader.seek(RandomAccessReader.java:280) at org.apache.cassandra.io.util.PoolingSegmentedFile.getSegment(PoolingSegmentedFile.java:41) at org.apache.cassandra.io.sstable.SSTableReader.getFileDataInput(SSTableReader.java:1163) at org.apache.cassandra.db.columniterator.SimpleSliceReader.init(SimpleSliceReader.java:57) at org.apache.cassandra.db.columniterator.SSTableSliceIterator.createReader(SSTableSliceIterator.java:65) at org.apache.cassandra.db.columniterator.SSTableSliceIterator.init(SSTableSliceIterator.java:42) at org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:171) at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:62) at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:249) at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53) at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1468) at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1294) at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:332) at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:65) at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:43) at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56) at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Caused by: java.nio.channels.ClosedChannelException at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:99) at sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:250) at org.apache.cassandra.io.compress.CompressedRandomAccessReader.decompressChunk(CompressedRandomAccessReader.java:101) at org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:87) ... 19 more FSError in org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer - Key: CASSANDRA-6469 URL: https://issues.apache.org/jira/browse/CASSANDRA-6469 Project: Cassandra Issue Type: Bug Environment: Linux Reporter: Shao-Chuan Wang Fix For: 2.0.1 We are seeing FSError in the code path of SSTableReader. Note that the file can be read, which suggests that this is not file corruption. The FileChannel is already closed when the reader tries to call position(). It is not easily reproducible. We'll paste more here if we hit this again. Thank you. 
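The "Caused by" frames show position() being called on a FileChannel that has already been closed. Below is a small Java sketch of just that behaviour, outside Cassandra. The class name ClosedChannelSketch and the temp file are hypothetical, and closing the channel by hand stands in for whatever code path closed it in the report, which this ticket does not identify.

```java
import java.io.IOException;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ClosedChannelSketch {
    // Returns true if position() on a closed channel throws ClosedChannelException
    // while the underlying file is still present and readable.
    static boolean positionFailsAfterClose() {
        Path tmp = null;
        try {
            tmp = Files.createTempFile("closed-channel-sketch", ".db");
            FileChannel channel = FileChannel.open(tmp, StandardOpenOption.READ);
            channel.close(); // stand-in for another code path releasing the reader
            try {
                channel.position(0L);
                return false;
            } catch (ClosedChannelException e) {
                // The file itself is intact; only the channel is unusable.
                return Files.isReadable(tmp);
            }
        } catch (IOException e) {
            throw new IllegalStateException("temp file setup failed", e);
        } finally {
            if (tmp != null) {
                try {
                    Files.deleteIfExists(tmp);
                } catch (IOException ignored) {
                    // best-effort cleanup of the temp file
                }
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("position() failed after close: " + positionFailsAfterClose());
    }
}
```

This matches the observation in the report that the data file can still be read: a ClosedChannelException says the handle was closed, not that the file is damaged.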
ERROR [ReadStage:4332] 2013-12-09 06:13:01,857 CassandraDaemon.java (line 185) Exception in thread Thread[ReadStage:4332,5,main] FSReadError in /mnt/cassandra/data/system/schema_columnfamilies/system-schema_columnfamilies-jb-342-Data.db at org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:95) at org.apache.cassandra.io.util.RandomAccessReader.seek(RandomAccessReader.java:280) at org.apache.cassandra.io.util.PoolingSegmentedFile.getSegment(PoolingSegmentedFile.java:41) at org.apache.cassandra.io.sstable.SSTableReader.getFileDataInput(SSTableReader.java:1163) at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.getNextBlock(IndexedSliceReader.java:362) at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.fetchMoreData(IndexedSliceReader.java:332) at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:145) at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:45) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) at
[jira] [Commented] (CASSANDRA-6564) Gossiper failed with ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/CASSANDRA-6564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868489#comment-13868489 ] Shao-Chuan Wang commented on CASSANDRA-6564: The cluster was upgraded from 1.2.8, but that was a long time ago. Note that I removed the data and started from scratch, and after several retries, the nodes finally joined the cluster. Hopefully this is useful. Here is the gossip info: /10.215.114.239 LOAD:6.42133986848E11 RACK:us-east-1c HOST_ID:d13374d8-e4ae-466d-ad5a-44229a2fa190 RPC_ADDRESS:0.0.0.0 SEVERITY:0.0 SCHEMA:4909e1b6-7f1d-3882-bee2-d345d2d3df17 DC:pagedb-frontend STATUS:NORMAL,-1305722878370288637 RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.122.218.80 LOAD:2.70118657429E11 RACK:us-east-1a RPC_ADDRESS:0.0.0.0 HOST_ID:7746a875-1e53-4ebe-aef6-ad59fadd6ea7 SEVERITY:0.0 SCHEMA:c85aa256-c8a3-3122-bc16-a4cd5c9032d4 STATUS:NORMAL,-1796000685025988745 DC:pagedb-frontend RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.178.13.230 LOAD:2.17601876297E11 RACK:us-east-1d RPC_ADDRESS:0.0.0.0 HOST_ID:160d83bf-4cc0-4872-aec6-7908446eccbf SEVERITY:0.0 SCHEMA:c85aa256-c8a3-3122-bc16-a4cd5c9032d4 STATUS:NORMAL,-1646882803236172485 DC:pagedb-backend RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.71.141.179 LOAD:2.20270818508E11 RACK:us-east-1c HOST_ID:baa81888-6417-4e7c-8d7b-2bb79c38ca40 SEVERITY:0.0 RPC_ADDRESS:0.0.0.0 SCHEMA:c85aa256-c8a3-3122-bc16-a4cd5c9032d4 DC:pagedb-frontend STATUS:NORMAL,-1277850891915823266 RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.95.128.214 LOAD:1.46145062779E11 RACK:us-east-1e HOST_ID:3e5e6e75-56cb-4664-abf4-de8c2801bd5d SEVERITY:0.0 RPC_ADDRESS:0.0.0.0 SCHEMA:ab0684d8-b50f-3d99-bc2d-7aa021db8131 DC:pagedb-backend STATUS:NORMAL,-1896996972976095280 RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.251.114.31 LOAD:2.95507989207E11 RACK:us-east-1b HOST_ID:a71e7037-c4f8-4ff9-864d-e83b6d3037d7 SEVERITY:0.0 RPC_ADDRESS:0.0.0.0 SCHEMA:70359e7a-d66e-3e72-a83d-c65b2b0db916 DC:pagedb-frontend STATUS:NORMAL,-1866613386904756042 
RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.228.26.240 LOAD:1.45711328732E11 RACK:us-east-1a RPC_ADDRESS:0.0.0.0 HOST_ID:8df816c9-7875-4106-aeda-b372b9f1fdc9 SEVERITY:1.7857143878936768 SCHEMA:c85aa256-c8a3-3122-bc16-a4cd5c9032d4 STATUS:NORMAL,-1359255731430490566 DC:pagedb-frontend RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.185.67.195 LOAD:4.5836673634E10 RACK:us-east-1d RPC_ADDRESS:0.0.0.0 HOST_ID:37fe7cc7-3481-48b3-96ff-c92df17a4132 SEVERITY:0.0 SCHEMA:c85aa256-c8a3-3122-bc16-a4cd5c9032d4 STATUS:NORMAL,-1295072457639851651 DC:pagedb-frontend RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.220.195.198 LOAD:7.7243382968E10 RACK:us-east-1a RPC_ADDRESS:0.0.0.0 HOST_ID:0bd7f3ab-d347-4c59-9f1a-1b7104e34a6b SEVERITY:8.875740051269531 SCHEMA:c85aa256-c8a3-3122-bc16-a4cd5c9032d4 STATUS:NORMAL,-1514821029440891529 DC:pagedb-frontend RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.62.39.130 LOAD:1.67888679362E11 RACK:us-east-1d RPC_ADDRESS:0.0.0.0 HOST_ID:13b12ee3-36a4-462c-9013-ccc1b83a70ff SEVERITY:0.0 SCHEMA:c85aa256-c8a3-3122-bc16-a4cd5c9032d4 STATUS:NORMAL,-1089815038805941976 DC:pagedb-frontend RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.137.11.197 LOAD:1.63784332779E11 RACK:us-east-1c RPC_ADDRESS:0.0.0.0 HOST_ID:76d339f4-ac4a-4f95-bada-1ec17f67f4e3 SEVERITY:0.0 SCHEMA:c85aa256-c8a3-3122-bc16-a4cd5c9032d4 STATUS:NORMAL,-1708596056219631840 DC:pagedb-backend RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.87.145.85 LOAD:5.78900708105E11 RACK:us-east-1a RPC_ADDRESS:0.0.0.0 HOST_ID:bdad1b28-3fb7-4788-8f34-01caace03ca9 SEVERITY:3.8759689331054688 SCHEMA:c85aa256-c8a3-3122-bc16-a4cd5c9032d4 STATUS:NORMAL,-2223601356751123985 DC:pagedb-frontend RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.97.135.36 LOAD:9.184614E10 RACK:us-east-1c HOST_ID:40fb13d6-0803-45cc-9b4f-45b3ad7194fa SEVERITY:0.0 RPC_ADDRESS:0.0.0.0 SCHEMA:c85aa256-c8a3-3122-bc16-a4cd5c9032d4 DC:pagedb-frontend STATUS:NORMAL,-1146411778648768253 RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.95.128.6 LOAD:2.86435047103E11 RACK:us-east-1e 
HOST_ID:80b583a5-8f7d-4fee-91db-b948c090d055 SEVERITY:0.0 RPC_ADDRESS:0.0.0.0 SCHEMA:c85aa256-c8a3-3122-bc16-a4cd5c9032d4 DC:pagedb-backend STATUS:NORMAL,-1025262993714352739 RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.154.136.39 LOAD:1.6809207E11 RACK:us-east-1d RPC_ADDRESS:0.0.0.0 HOST_ID:0e391fea-e4e9-4a46-b9af-87948459876c SEVERITY:0.0 SCHEMA:c85aa256-c8a3-3122-bc16-a4cd5c9032d4 STATUS:NORMAL,-2690692580263318876 DC:pagedb-backend RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.185.9.84 LOAD:1.65537034566E11 RACK:us-east-1d RPC_ADDRESS:0.0.0.0 HOST_ID:080241a9-eadd-48b4-8f94-efbe35bfd6e1 SEVERITY:0.0 SCHEMA:c85aa256-c8a3-3122-bc16-a4cd5c9032d4
[jira] [Commented] (CASSANDRA-6565) New node refuses to join the ring.
[ https://issues.apache.org/jira/browse/CASSANDRA-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868493#comment-13868493 ] Shao-Chuan Wang commented on CASSANDRA-6565: Does this imply that if some nodes are down, we cannot add new nodes until we either bring them back or remove them from the ring? New node refuses to join the ring. -- Key: CASSANDRA-6565 URL: https://issues.apache.org/jira/browse/CASSANDRA-6565 Project: Cassandra Issue Type: Bug Reporter: Shao-Chuan Wang We have 30 nodes in one DC, 25 nodes in another. We are running 2.0.1. Two nodes are joining the ring, but one of them failed: WARN [STREAM-IN-/10.4.197.53] 2014-01-09 19:41:40,418 StreamResultFuture.java (line 209) [Stream #e515d6e0-795d-11e3-b74a-b72892248056] Stream failed ERROR [main] 2014-01-09 19:41:40,418 CassandraDaemon.java (line 459) Exception encountered during startup java.lang.RuntimeException: Error during boostrap: Stream failed at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:86) at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:901) at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:670) at org.apache.cassandra.service.StorageService.initServer(StorageService.java:529) at org.apache.cassandra.service.StorageService.initServer(StorageService.java:428) at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:343) at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:442) at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:485) Caused by: org.apache.cassandra.streaming.StreamException: Stream failed at org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:210) at org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:185) at org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:321) at 
org.apache.cassandra.streaming.StreamSession.complete(StreamSession.java:501) at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:376) at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:293) at java.lang.Thread.run(Thread.java:744) ERROR [StorageServiceShutdownHook] 2014-01-09 19:41:40,428 CassandraDaemon.java (line 185) Exception in thread Thread[StorageServiceShutdownHook,5,main] java.lang.NullPointerException at org.apache.cassandra.service.StorageService.stopRPCServer(StorageService.java:312) at org.apache.cassandra.service.StorageService.shutdownClientServers(StorageService.java:361) at org.apache.cassandra.service.StorageService.access$000(StorageService.java:96) at org.apache.cassandra.service.StorageService$1.runMayThrow(StorageService.java:494) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) at java.lang.Thread.run(Thread.java:744)
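The second trace in this message is a NullPointerException thrown from the shutdown hook, in StorageService.stopRPCServer, right after the failed bootstrap. One plausible reading, which is an assumption and is not confirmed anywhere in this ticket, is that the hook tried to stop a server object that startup never got far enough to create. The Java sketch below illustrates that general pattern only; RpcServer, stopUnguarded and stopGuarded are hypothetical names and not Cassandra's API.

```java
public class ShutdownSketch {
    // Hypothetical stand-in for a client-facing server created late in startup.
    interface RpcServer {
        void stop();
    }

    // Mirrors a hook that assumes startup finished: throws if the server was never created.
    static void stopUnguarded(RpcServer server) {
        server.stop();
    }

    // Tolerates an aborted startup: returns false when there was nothing to stop.
    static boolean stopGuarded(RpcServer server) {
        if (server == null) {
            return false;
        }
        server.stop();
        return true;
    }

    public static void main(String[] args) {
        RpcServer neverStarted = null; // startup failed before the server existed
        System.out.println("guarded stop did work: " + stopGuarded(neverStarted));
        try {
            stopUnguarded(neverStarted);
        } catch (NullPointerException e) {
            System.out.println("unguarded stop threw NullPointerException");
        }
    }
}
```

If that reading is right, the NullPointerException is a side effect of the aborted startup and the stream failure above it is the error to investigate.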
[jira] [Comment Edited] (CASSANDRA-6564) Gossiper failed with ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/CASSANDRA-6564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868489#comment-13868489 ] Shao-Chuan Wang edited comment on CASSANDRA-6564 at 1/10/14 11:59 PM: -- The cluster was upgraded from 1.2.8, but that was a long time ago. Note that I removed the data folder and started joining from scratch, and after several retries, the nodes finally joined the cluster. Hopefully the following information is useful. Here is the gossip info: /10.215.114.239 LOAD:6.42133986848E11 RACK:us-east-1c HOST_ID:d13374d8-e4ae-466d-ad5a-44229a2fa190 RPC_ADDRESS:0.0.0.0 SEVERITY:0.0 SCHEMA:4909e1b6-7f1d-3882-bee2-d345d2d3df17 DC:pagedb-frontend STATUS:NORMAL,-1305722878370288637 RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.122.218.80 LOAD:2.70118657429E11 RACK:us-east-1a RPC_ADDRESS:0.0.0.0 HOST_ID:7746a875-1e53-4ebe-aef6-ad59fadd6ea7 SEVERITY:0.0 SCHEMA:c85aa256-c8a3-3122-bc16-a4cd5c9032d4 STATUS:NORMAL,-1796000685025988745 DC:pagedb-frontend RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.178.13.230 LOAD:2.17601876297E11 RACK:us-east-1d RPC_ADDRESS:0.0.0.0 HOST_ID:160d83bf-4cc0-4872-aec6-7908446eccbf SEVERITY:0.0 SCHEMA:c85aa256-c8a3-3122-bc16-a4cd5c9032d4 STATUS:NORMAL,-1646882803236172485 DC:pagedb-backend RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.71.141.179 LOAD:2.20270818508E11 RACK:us-east-1c HOST_ID:baa81888-6417-4e7c-8d7b-2bb79c38ca40 SEVERITY:0.0 RPC_ADDRESS:0.0.0.0 SCHEMA:c85aa256-c8a3-3122-bc16-a4cd5c9032d4 DC:pagedb-frontend STATUS:NORMAL,-1277850891915823266 RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.95.128.214 LOAD:1.46145062779E11 RACK:us-east-1e HOST_ID:3e5e6e75-56cb-4664-abf4-de8c2801bd5d SEVERITY:0.0 RPC_ADDRESS:0.0.0.0 SCHEMA:ab0684d8-b50f-3d99-bc2d-7aa021db8131 DC:pagedb-backend STATUS:NORMAL,-1896996972976095280 RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.251.114.31 LOAD:2.95507989207E11 RACK:us-east-1b HOST_ID:a71e7037-c4f8-4ff9-864d-e83b6d3037d7 SEVERITY:0.0 RPC_ADDRESS:0.0.0.0 
SCHEMA:70359e7a-d66e-3e72-a83d-c65b2b0db916 DC:pagedb-frontend STATUS:NORMAL,-1866613386904756042 RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.228.26.240 LOAD:1.45711328732E11 RACK:us-east-1a RPC_ADDRESS:0.0.0.0 HOST_ID:8df816c9-7875-4106-aeda-b372b9f1fdc9 SEVERITY:1.7857143878936768 SCHEMA:c85aa256-c8a3-3122-bc16-a4cd5c9032d4 STATUS:NORMAL,-1359255731430490566 DC:pagedb-frontend RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.185.67.195 LOAD:4.5836673634E10 RACK:us-east-1d RPC_ADDRESS:0.0.0.0 HOST_ID:37fe7cc7-3481-48b3-96ff-c92df17a4132 SEVERITY:0.0 SCHEMA:c85aa256-c8a3-3122-bc16-a4cd5c9032d4 STATUS:NORMAL,-1295072457639851651 DC:pagedb-frontend RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.220.195.198 LOAD:7.7243382968E10 RACK:us-east-1a RPC_ADDRESS:0.0.0.0 HOST_ID:0bd7f3ab-d347-4c59-9f1a-1b7104e34a6b SEVERITY:8.875740051269531 SCHEMA:c85aa256-c8a3-3122-bc16-a4cd5c9032d4 STATUS:NORMAL,-1514821029440891529 DC:pagedb-frontend RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.62.39.130 LOAD:1.67888679362E11 RACK:us-east-1d RPC_ADDRESS:0.0.0.0 HOST_ID:13b12ee3-36a4-462c-9013-ccc1b83a70ff SEVERITY:0.0 SCHEMA:c85aa256-c8a3-3122-bc16-a4cd5c9032d4 STATUS:NORMAL,-1089815038805941976 DC:pagedb-frontend RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.137.11.197 LOAD:1.63784332779E11 RACK:us-east-1c RPC_ADDRESS:0.0.0.0 HOST_ID:76d339f4-ac4a-4f95-bada-1ec17f67f4e3 SEVERITY:0.0 SCHEMA:c85aa256-c8a3-3122-bc16-a4cd5c9032d4 STATUS:NORMAL,-1708596056219631840 DC:pagedb-backend RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.87.145.85 LOAD:5.78900708105E11 RACK:us-east-1a RPC_ADDRESS:0.0.0.0 HOST_ID:bdad1b28-3fb7-4788-8f34-01caace03ca9 SEVERITY:3.8759689331054688 SCHEMA:c85aa256-c8a3-3122-bc16-a4cd5c9032d4 STATUS:NORMAL,-2223601356751123985 DC:pagedb-frontend RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.97.135.36 LOAD:9.184614E10 RACK:us-east-1c HOST_ID:40fb13d6-0803-45cc-9b4f-45b3ad7194fa SEVERITY:0.0 RPC_ADDRESS:0.0.0.0 SCHEMA:c85aa256-c8a3-3122-bc16-a4cd5c9032d4 DC:pagedb-frontend STATUS:NORMAL,-1146411778648768253 
RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.95.128.6 LOAD:2.86435047103E11 RACK:us-east-1e HOST_ID:80b583a5-8f7d-4fee-91db-b948c090d055 SEVERITY:0.0 RPC_ADDRESS:0.0.0.0 SCHEMA:c85aa256-c8a3-3122-bc16-a4cd5c9032d4 DC:pagedb-backend STATUS:NORMAL,-1025262993714352739 RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.154.136.39 LOAD:1.6809207E11 RACK:us-east-1d RPC_ADDRESS:0.0.0.0 HOST_ID:0e391fea-e4e9-4a46-b9af-87948459876c SEVERITY:0.0 SCHEMA:c85aa256-c8a3-3122-bc16-a4cd5c9032d4 STATUS:NORMAL,-2690692580263318876 DC:pagedb-backend RELEASE_VERSION:2.0.1 NET_VERSION:7 /10.185.9.84 LOAD:1.65537034566E11 RACK:us-east-1d RPC_ADDRESS:0.0.0.0
[jira] [Updated] (CASSANDRA-6469) FSError in org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer
[ https://issues.apache.org/jira/browse/CASSANDRA-6469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shao-Chuan Wang updated CASSANDRA-6469: --- Fix Version/s: (was: 2.0.1) FSError in org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer - Key: CASSANDRA-6469 URL: https://issues.apache.org/jira/browse/CASSANDRA-6469 Project: Cassandra Issue Type: Bug Environment: Linux Reporter: Shao-Chuan Wang We are seeing FSError in the code path of SSTableReader. Note that the file can be read, which suggests that this is not file corruption. The FileChannel is already closed when the reader tries to call position(). It is not easily reproducible. We'll paste more here if we hit this again. Thank you. ERROR [ReadStage:4332] 2013-12-09 06:13:01,857 CassandraDaemon.java (line 185) Exception in thread Thread[ReadStage:4332,5,main] FSReadError in /mnt/cassandra/data/system/schema_columnfamilies/system-schema_columnfamilies-jb-342-Data.db at org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:95) at org.apache.cassandra.io.util.RandomAccessReader.seek(RandomAccessReader.java:280) at org.apache.cassandra.io.util.PoolingSegmentedFile.getSegment(PoolingSegmentedFile.java:41) at org.apache.cassandra.io.sstable.SSTableReader.getFileDataInput(SSTableReader.java:1163) at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.getNextBlock(IndexedSliceReader.java:362) at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.fetchMoreData(IndexedSliceReader.java:332) at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:145) at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:45) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) at 
org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:82) at org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:157) at org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:140) at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:144) at org.apache.cassandra.utils.MergeIterator$ManyToOne.init(MergeIterator.java:87) at org.apache.cassandra.utils.MergeIterator.get(MergeIterator.java:46) at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:120) at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80) at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72) at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:294) at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53) at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1468) at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1294) at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:332) at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:65) at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1365) at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1897) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Caused by: java.nio.channels.ClosedChannelException at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:99) at sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:250) at org.apache.cassandra.io.compress.CompressedRandomAccessReader.decompressChunk(CompressedRandomAccessReader.java:101) at 
org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:87)
[jira] [Resolved] (CASSANDRA-6469) FSError in org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer
[ https://issues.apache.org/jira/browse/CASSANDRA-6469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shao-Chuan Wang resolved CASSANDRA-6469. Resolution: Not A Problem FSError in org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer - Key: CASSANDRA-6469 URL: https://issues.apache.org/jira/browse/CASSANDRA-6469 Project: Cassandra Issue Type: Bug Environment: Linux Reporter: Shao-Chuan Wang
[jira] [Commented] (CASSANDRA-6565) New node refuses to join the ring.
[ https://issues.apache.org/jira/browse/CASSANDRA-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868693#comment-13868693 ] Shao-Chuan Wang commented on CASSANDRA-6565: Note that we have replication factor 3 in both data centers, so ideally we have 6 replicas. Also, nodetool status shows some of the nodes as down when they are not. Could it be that the replica owners are too busy, or the network is too slow, so that the brand new node saw only one available replica when it was trying to join? Note that after many retries of adding this node, it finally joined the ring. But joining new nodes has become a very painful process for us. New node refuses to join the ring. -- Key: CASSANDRA-6565 URL: https://issues.apache.org/jira/browse/CASSANDRA-6565 Project: Cassandra Issue Type: Bug Reporter: Shao-Chuan Wang We have 30 nodes in one DC, 25 nodes in another. We are running 2.0.1. Two nodes are joining the ring, but one of them failed: WARN [STREAM-IN-/10.4.197.53] 2014-01-09 19:41:40,418 StreamResultFuture.java (line 209) [Stream #e515d6e0-795d-11e3-b74a-b72892248056] Stream failed ERROR [main] 2014-01-09 19:41:40,418 CassandraDaemon.java (line 459) Exception encountered during startup java.lang.RuntimeException: Error during boostrap: Stream failed at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:86) at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:901) at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:670) at org.apache.cassandra.service.StorageService.initServer(StorageService.java:529) at org.apache.cassandra.service.StorageService.initServer(StorageService.java:428) at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:343) at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:442) at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:485) Caused by: 
org.apache.cassandra.streaming.StreamException: Stream failed at org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:210) at org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:185) at org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:321) at org.apache.cassandra.streaming.StreamSession.complete(StreamSession.java:501) at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:376) at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:293) at java.lang.Thread.run(Thread.java:744) ERROR [StorageServiceShutdownHook] 2014-01-09 19:41:40,428 CassandraDaemon.java (line 185) Exception in thread Thread[StorageServiceShutdownHook,5,main] java.lang.NullPointerException at org.apache.cassandra.service.StorageService.stopRPCServer(StorageService.java:312) at org.apache.cassandra.service.StorageService.shutdownClientServers(StorageService.java:361) at org.apache.cassandra.service.StorageService.access$000(StorageService.java:96) at org.apache.cassandra.service.StorageService$1.runMayThrow(StorageService.java:494) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) at java.lang.Thread.run(Thread.java:744)
[jira] [Comment Edited] (CASSANDRA-6565) New node refuses to join the ring.
[ https://issues.apache.org/jira/browse/CASSANDRA-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868693#comment-13868693 ] Shao-Chuan Wang edited comment on CASSANDRA-6565 at 1/11/14 7:11 AM: - Noted that we have replication factor 3 in both data centers. So ideally, we have 6 replicas. Also, we saw from nodetool status, some of nodes appear to be down but they are not. Could it because the replicas owners are too busy or network is too slow such that it only saw one available replica when this brand new node was trying to join. Noted that by retrying a lot of times of adding this node node, it finally joined the ring. But the joining new nodes have become a very painful process for us. was (Author: shaochuan): Noted that we have replication factor 3 in both data center. So ideally, we have 6 replicas. Also, we saw from nodetool status, some of nodes appear to be down but they are not. Could it because the replicas owners are too busy or network is too slow such that it only saw one available replica when this brand new node was trying to join. Noted that by retrying a lot of times of adding this node node, it finally joined the ring. But the joining new nodes have become a very painful process for us. New node refuses to join the ring. -- Key: CASSANDRA-6565 URL: https://issues.apache.org/jira/browse/CASSANDRA-6565 Project: Cassandra Issue Type: Bug Reporter: Shao-Chuan Wang We have 30 nodes in one DC, 25 nodes in another. We are running 2.0.1. 
[jira] [Comment Edited] (CASSANDRA-6565) New node refuses to join the ring.
[ https://issues.apache.org/jira/browse/CASSANDRA-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868693#comment-13868693 ] Shao-Chuan Wang edited comment on CASSANDRA-6565 at 1/11/14 7:11 AM: - Noted that we have replication factor 3 in both data centers. So ideally, we have 6 replicas. Also, we saw from *nodetool status*, some of nodes appear to be down but they are not. Could it because the replicas owners are too busy or network is too slow such that it only saw one available replica when this brand new node was trying to join. Noted that by retrying a lot of times of adding this new node, it finally joined the ring. But the joining new nodes have become a very painful process for us. was (Author: shaochuan): Noted that we have replication factor 3 in both data centers. So ideally, we have 6 replicas. Also, we saw from *nodetool status*, some of nodes appear to be down but they are not. Could it because the replicas owners are too busy or network is too slow such that it only saw one available replica when this brand new node was trying to join. Noted that by retrying a lot of times of adding this node node, it finally joined the ring. But the joining new nodes have become a very painful process for us. New node refuses to join the ring. -- Key: CASSANDRA-6565 URL: https://issues.apache.org/jira/browse/CASSANDRA-6565 Project: Cassandra Issue Type: Bug Reporter: Shao-Chuan Wang We have 30 nodes in one DC, 25 nodes in another. We are running 2.0.1. 
[jira] [Comment Edited] (CASSANDRA-6565) New node refuses to join the ring.
[ https://issues.apache.org/jira/browse/CASSANDRA-6565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13868693#comment-13868693 ] Shao-Chuan Wang edited comment on CASSANDRA-6565 at 1/11/14 7:11 AM: - Noted that we have replication factor 3 in both data centers. So ideally, we have 6 replicas. Also, we saw from *nodetool status*, some of nodes appear to be down but they are not. Could it because the replicas owners are too busy or network is too slow such that it only saw one available replica when this brand new node was trying to join. Noted that by retrying a lot of times of adding this node node, it finally joined the ring. But the joining new nodes have become a very painful process for us. was (Author: shaochuan): Noted that we have replication factor 3 in both data centers. So ideally, we have 6 replicas. Also, we saw from nodetool status, some of nodes appear to be down but they are not. Could it because the replicas owners are too busy or network is too slow such that it only saw one available replica when this brand new node was trying to join. Noted that by retrying a lot of times of adding this node node, it finally joined the ring. But the joining new nodes have become a very painful process for us. New node refuses to join the ring. -- Key: CASSANDRA-6565 URL: https://issues.apache.org/jira/browse/CASSANDRA-6565 Project: Cassandra Issue Type: Bug Reporter: Shao-Chuan Wang We have 30 nodes in one DC, 25 nodes in another. We are running 2.0.1. 
[jira] [Created] (CASSANDRA-6565) New node refuses to join the ring.
Shao-Chuan Wang created CASSANDRA-6565:
--

Summary: New node refuses to join the ring.
Key: CASSANDRA-6565
URL: https://issues.apache.org/jira/browse/CASSANDRA-6565
Project: Cassandra
Issue Type: Bug
Reporter: Shao-Chuan Wang

We have 30 nodes in one DC, 25 nodes in another. We are running 2.0.1. Two nodes are joining the ring, but one of them failed:

WARN [STREAM-IN-/10.4.197.53] 2014-01-09 19:41:40,418 StreamResultFuture.java (line 209) [Stream #e515d6e0-795d-11e3-b74a-b72892248056] Stream failed
ERROR [main] 2014-01-09 19:41:40,418 CassandraDaemon.java (line 459) Exception encountered during startup
java.lang.RuntimeException: Error during boostrap: Stream failed
    at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:86)
    at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:901)
    at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:670)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:529)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:428)
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:343)
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:442)
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:485)
Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
    at org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:210)
    at org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:185)
    at org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:321)
    at org.apache.cassandra.streaming.StreamSession.complete(StreamSession.java:501)
    at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:376)
    at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:293)
    at java.lang.Thread.run(Thread.java:744)
ERROR [StorageServiceShutdownHook] 2014-01-09 19:41:40,428 CassandraDaemon.java (line 185) Exception in thread Thread[StorageServiceShutdownHook,5,main]
java.lang.NullPointerException
    at org.apache.cassandra.service.StorageService.stopRPCServer(StorageService.java:312)
    at org.apache.cassandra.service.StorageService.shutdownClientServers(StorageService.java:361)
    at org.apache.cassandra.service.StorageService.access$000(StorageService.java:96)
    at org.apache.cassandra.service.StorageService$1.runMayThrow(StorageService.java:494)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at java.lang.Thread.run(Thread.java:744)

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
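The second trace in the report above is a NullPointerException raised from the shutdown hook right after startup had already failed. One plausible reading, not confirmed anywhere in this thread, is that the hook tried to stop a client-facing server that was never created because bootstrap aborted first. The sketch below illustrates a null-safe stop under that assumption. The class SafeShutdownSketch and the interface ClientServer are hypothetical names made up for illustration; they are not Cassandra classes.

```java
import java.util.concurrent.atomic.AtomicReference;

public class SafeShutdownSketch {
    /** Hypothetical stand-in for a client-facing server such as an RPC server. */
    interface ClientServer {
        void stop();
    }

    // Remains null if startup aborts before the server is ever created.
    private final AtomicReference<ClientServer> rpcServer = new AtomicReference<>();

    void startRpcServer(ClientServer server) {
        rpcServer.set(server);
    }

    /** Stops the server if one was started; returns false when there was nothing to stop. */
    boolean stopRpcServer() {
        ClientServer server = rpcServer.getAndSet(null);
        if (server == null) {
            return false; // startup never got this far, so skip instead of dereferencing null
        }
        server.stop();
        return true;
    }

    public static void main(String[] args) {
        SafeShutdownSketch node = new SafeShutdownSketch();

        // Simulated failed startup: the server was never created.
        System.out.println("stopped after failed startup: " + node.stopRpcServer());

        // Simulated successful startup followed by shutdown.
        node.startRpcServer(() -> System.out.println("rpc server stopped"));
        System.out.println("stopped after normal startup: " + node.stopRpcServer());
    }
}
```

With a guard like this, a failed bootstrap would still be reported through the first exception, and the shutdown hook would not add a second, unrelated trace to the log.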
[jira] [Created] (CASSANDRA-6564) Gossiper failed with ArrayIndexOutOfBoundsException
Shao-Chuan Wang created CASSANDRA-6564:
--

Summary: Gossiper failed with ArrayIndexOutOfBoundsException
Key: CASSANDRA-6564
URL: https://issues.apache.org/jira/browse/CASSANDRA-6564
Project: Cassandra
Issue Type: Bug
Reporter: Shao-Chuan Wang
Priority: Critical

We have 30 nodes in one DC, 25 nodes in another. We are running 2.0.1. Two nodes are joining the ring, but one of them failed with ArrayIndexOutOfBoundsException:

java.lang.ArrayIndexOutOfBoundsException: 2
    at org.apache.cassandra.service.StorageService.extractExpireTime(StorageService.java:1594)
    at org.apache.cassandra.service.StorageService.handleStateRemoving(StorageService.java:1550)
    at org.apache.cassandra.service.StorageService.onChange(StorageService.java:1174)
    at org.apache.cassandra.service.StorageService.onJoin(StorageService.java:1887)
    at org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:844)
    at org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:922)
    at org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
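"ArrayIndexOutOfBoundsException: 2" means that index 2 was read from an array holding two or fewer elements. Since the top frame is extractExpireTime, called from handleStateRemoving, a reasonable guess is that a gossiped state value carried fewer fields than the reader expected. The sketch below shows a bounds-checked read under that assumption. The comma delimiter, the field layout, the sample values, and the fallback behaviour are all assumptions made for illustration; this is not the actual StorageService code.

```java
public class ExpireTimeSketch {
    // Assumed layout: "<state>,<id>,<expireTimeMillis>"; the third field may be absent.
    static final String DELIMITER = ",";
    static final int EXPIRE_TIME_INDEX = 2;

    /**
     * Returns the expire time from the split state value, or fallbackMillis
     * when the field is missing or is not a valid number.
     */
    static long extractExpireTime(String[] pieces, long fallbackMillis) {
        if (pieces.length <= EXPIRE_TIME_INDEX) {
            return fallbackMillis; // avoids ArrayIndexOutOfBoundsException: 2
        }
        try {
            return Long.parseLong(pieces[EXPIRE_TIME_INDEX]);
        } catch (NumberFormatException e) {
            return fallbackMillis;
        }
    }

    public static void main(String[] args) {
        long fallback = 0L;
        String[] complete = "removing,some-id,1389296500000".split(DELIMITER);
        String[] shortValue = "removing,some-id".split(DELIMITER);

        System.out.println(extractExpireTime(complete, fallback));   // 1389296500000
        System.out.println(extractExpireTime(shortValue, fallback)); // 0
    }
}
```

Whether a fallback is the right policy, as opposed to rejecting the malformed state, is a design decision this thread does not settle.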
[jira] [Commented] (CASSANDRA-6564) Gossiper failed with ArrayIndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/CASSANDRA-6564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13867489#comment-13867489 ] Shao-Chuan Wang commented on CASSANDRA-6564: This is a new node that I added to join the ring. Gossiper failed with ArrayIndexOutOfBoundsException --- Key: CASSANDRA-6564 URL: https://issues.apache.org/jira/browse/CASSANDRA-6564 Project: Cassandra Issue Type: Bug Reporter: Shao-Chuan Wang Assignee: Tyler Hobbs Priority: Minor We have 30 nodes in one DC, 25 nodes in another. We are running 2.0.1. Two nodes are joining the ring, but one of them failed with ArrayIndexOutOfBoundsException: java.lang.ArrayIndexOutOfBoundsException: 2 at org.apache.cassandra.service.StorageService.extractExpireTime(StorageService.java:1594) at org.apache.cassandra.service.StorageService.handleStateRemoving(StorageService.java:1550) at org.apache.cassandra.service.StorageService.onChange(StorageService.java:1174) at org.apache.cassandra.service.StorageService.onJoin(StorageService.java:1887) at org.apache.cassandra.gms.Gossiper.handleMajorStateChange(Gossiper.java:844) at org.apache.cassandra.gms.Gossiper.applyStateLocally(Gossiper.java:922) at org.apache.cassandra.gms.GossipDigestAck2VerbHandler.doVerb(GossipDigestAck2VerbHandler.java:49) at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (CASSANDRA-6494) Cassandra refuses to restart due to a corrupted commit log.
[ https://issues.apache.org/jira/browse/CASSANDRA-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13859833#comment-13859833 ] Shao-Chuan Wang commented on CASSANDRA-6494: We did drop the column and recreated the same column family with different schema. Cassandra refuses to restart due to a corrupted commit log. --- Key: CASSANDRA-6494 URL: https://issues.apache.org/jira/browse/CASSANDRA-6494 Project: Cassandra Issue Type: Bug Reporter: Shao-Chuan Wang This is running on our production server. Please advise how to address this issue. Thank you! INFO 02:46:58,879 Finished reading /mnt/cassandra/commitlog/CommitLog-3-1386069222785.log ERROR 02:46:58,879 Exception encountered during startup java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: 706167655f74616773 is not defined as a collection at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:411) at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:400) at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:273) at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:96) at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:146) at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:126) at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:299) at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:442) at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:485) Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: 706167655f74616773 is not defined as a collection at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:188) at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:407) ... 
8 more Caused by: java.lang.RuntimeException: 706167655f74616773 is not defined as a collection at org.apache.cassandra.db.marshal.ColumnToCollectionType.compareCollectionMembers(ColumnToCollectionType.java:72) at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:85) at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:35) at edu.stanford.ppl.concurrent.SnapTreeMap$1.compareTo(SnapTreeMap.java:538) at edu.stanford.ppl.concurrent.SnapTreeMap.attemptUpdate(SnapTreeMap.java:1108) at edu.stanford.ppl.concurrent.SnapTreeMap.attemptUpdate(SnapTreeMap.java:1192) at edu.stanford.ppl.concurrent.SnapTreeMap.updateUnderRoot(SnapTreeMap.java:1059) at edu.stanford.ppl.concurrent.SnapTreeMap.update(SnapTreeMap.java:1023) at edu.stanford.ppl.concurrent.SnapTreeMap.putIfAbsent(SnapTreeMap.java:985) at org.apache.cassandra.db.AtomicSortedColumns$Holder.addColumn(AtomicSortedColumns.java:323) at org.apache.cassandra.db.AtomicSortedColumns.addAllWithSizeDelta(AtomicSortedColumns.java:195) at org.apache.cassandra.db.Memtable.resolve(Memtable.java:196) at org.apache.cassandra.db.Memtable.put(Memtable.java:160) at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:842) at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:373) at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:338) at org.apache.cassandra.db.commitlog.CommitLogReplayer$1.runMayThrow(CommitLogReplayer.java:265) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:724) java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
706167655f74616773 is not defined as a collection at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:411) at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:400) at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:273) at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:96) at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:146) at
[jira] [Comment Edited] (CASSANDRA-6494) Cassandra refuses to restart due to a corrupted commit log.
[ https://issues.apache.org/jira/browse/CASSANDRA-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13859833#comment-13859833 ] Shao-Chuan Wang edited comment on CASSANDRA-6494 at 1/1/14 5:27 AM: We did drop the column family and recreated the same column family with different schema. was (Author: shaochuan): We did drop the column and recreated the same column family with different schema. Cassandra refuses to restart due to a corrupted commit log. --- Key: CASSANDRA-6494 URL: https://issues.apache.org/jira/browse/CASSANDRA-6494 Project: Cassandra Issue Type: Bug Reporter: Shao-Chuan Wang This is running on our production server. Please advise how to address this issue. Thank you! INFO 02:46:58,879 Finished reading /mnt/cassandra/commitlog/CommitLog-3-1386069222785.log ERROR 02:46:58,879 Exception encountered during startup java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: 706167655f74616773 is not defined as a collection at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:411) at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:400) at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:273) at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:96) at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:146) at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:126) at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:299) at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:442) at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:485) Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: 706167655f74616773 is not defined as a collection at java.util.concurrent.FutureTask.report(FutureTask.java:122) at 
java.util.concurrent.FutureTask.get(FutureTask.java:188) at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:407) ... 8 more Caused by: java.lang.RuntimeException: 706167655f74616773 is not defined as a collection at org.apache.cassandra.db.marshal.ColumnToCollectionType.compareCollectionMembers(ColumnToCollectionType.java:72) at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:85) at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:35) at edu.stanford.ppl.concurrent.SnapTreeMap$1.compareTo(SnapTreeMap.java:538) at edu.stanford.ppl.concurrent.SnapTreeMap.attemptUpdate(SnapTreeMap.java:1108) at edu.stanford.ppl.concurrent.SnapTreeMap.attemptUpdate(SnapTreeMap.java:1192) at edu.stanford.ppl.concurrent.SnapTreeMap.updateUnderRoot(SnapTreeMap.java:1059) at edu.stanford.ppl.concurrent.SnapTreeMap.update(SnapTreeMap.java:1023) at edu.stanford.ppl.concurrent.SnapTreeMap.putIfAbsent(SnapTreeMap.java:985) at org.apache.cassandra.db.AtomicSortedColumns$Holder.addColumn(AtomicSortedColumns.java:323) at org.apache.cassandra.db.AtomicSortedColumns.addAllWithSizeDelta(AtomicSortedColumns.java:195) at org.apache.cassandra.db.Memtable.resolve(Memtable.java:196) at org.apache.cassandra.db.Memtable.put(Memtable.java:160) at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:842) at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:373) at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:338) at org.apache.cassandra.db.commitlog.CommitLogReplayer$1.runMayThrow(CommitLogReplayer.java:265) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at 
java.lang.Thread.run(Thread.java:724) java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: 706167655f74616773 is not defined as a collection at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:411) at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:400) at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:273) at
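The identifier in "706167655f74616773 is not defined as a collection" looks like a hex dump of raw bytes. Decoding it as UTF-8 gives a readable name, which may help match the error to the column family that was dropped and recreated with a different schema. The small helper below does the decoding; HexColumnName is a hypothetical class written for this note, and the thread does not confirm which table the decoded name belongs to.

```java
import java.nio.charset.StandardCharsets;

public class HexColumnName {
    /** Decodes a hex string (two hex digits per byte) into UTF-8 text. */
    static String decode(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must have an even length");
        }
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            // Each pair of hex digits encodes one byte.
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(decode("706167655f74616773")); // prints page_tags
    }
}
```

The decoded value, page_tags, reads like a column name, which would be consistent with commit log entries written under the old schema being replayed against the new one.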
[jira] [Created] (CASSANDRA-6494) Cassandra refuses to restart due to a corrupted commit log.
Shao-Chuan Wang created CASSANDRA-6494: -- Summary: Cassandra refuses to restart due to a corrupted commit log. Key: CASSANDRA-6494 URL: https://issues.apache.org/jira/browse/CASSANDRA-6494 Project: Cassandra Issue Type: Bug Reporter: Shao-Chuan Wang This is running on our production server. Please advise how to address this issue. Thank you! INFO 02:46:58,879 Finished reading /mnt/cassandra/commitlog/CommitLog-3-1386069222785.log ERROR 02:46:58,879 Exception encountered during startup java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: 706167655f74616773 is not defined as a collection at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:411) at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:400) at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:273) at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:96) at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:146) at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:126) at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:299) at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:442) at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:485) Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: 706167655f74616773 is not defined as a collection at java.util.concurrent.FutureTask.report(FutureTask.java:122) at java.util.concurrent.FutureTask.get(FutureTask.java:188) at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:407) ... 
8 more Caused by: java.lang.RuntimeException: 706167655f74616773 is not defined as a collection at org.apache.cassandra.db.marshal.ColumnToCollectionType.compareCollectionMembers(ColumnToCollectionType.java:72) at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:85) at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:35) at edu.stanford.ppl.concurrent.SnapTreeMap$1.compareTo(SnapTreeMap.java:538) at edu.stanford.ppl.concurrent.SnapTreeMap.attemptUpdate(SnapTreeMap.java:1108) at edu.stanford.ppl.concurrent.SnapTreeMap.attemptUpdate(SnapTreeMap.java:1192) at edu.stanford.ppl.concurrent.SnapTreeMap.updateUnderRoot(SnapTreeMap.java:1059) at edu.stanford.ppl.concurrent.SnapTreeMap.update(SnapTreeMap.java:1023) at edu.stanford.ppl.concurrent.SnapTreeMap.putIfAbsent(SnapTreeMap.java:985) at org.apache.cassandra.db.AtomicSortedColumns$Holder.addColumn(AtomicSortedColumns.java:323) at org.apache.cassandra.db.AtomicSortedColumns.addAllWithSizeDelta(AtomicSortedColumns.java:195) at org.apache.cassandra.db.Memtable.resolve(Memtable.java:196) at org.apache.cassandra.db.Memtable.put(Memtable.java:160) at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:842) at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:373) at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:338) at org.apache.cassandra.db.commitlog.CommitLogReplayer$1.runMayThrow(CommitLogReplayer.java:265) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:724) java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
706167655f74616773 is not defined as a collection at org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:411) at org.apache.cassandra.utils.FBUtilities.waitOnFutures(FBUtilities.java:400) at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:273) at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:96) at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:146) at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:126) at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:299) at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:442) at
[jira] [Created] (CASSANDRA-6469) FSError in org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer
Shao-Chuan Wang created CASSANDRA-6469:
--

Summary: FSError in org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer
Key: CASSANDRA-6469
URL: https://issues.apache.org/jira/browse/CASSANDRA-6469
Project: Cassandra
Issue Type: Bug
Environment: Linux
Reporter: Shao-Chuan Wang
Fix For: 2.0.1

We are seeing an FSError in the SSTableReader code path. Note that the file can be read, which suggests this is not file corruption. The FileChannel is already closed when position() is called on it. The problem is not easily reproducible. We'll paste more details here if we hit it again. Thank you.

ERROR [ReadStage:4332] 2013-12-09 06:13:01,857 CassandraDaemon.java (line 185) Exception in thread Thread[ReadStage:4332,5,main]
FSReadError in /mnt/cassandra/data/system/schema_columnfamilies/system-schema_columnfamilies-jb-342-Data.db
    at org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:95)
    at org.apache.cassandra.io.util.RandomAccessReader.seek(RandomAccessReader.java:280)
    at org.apache.cassandra.io.util.PoolingSegmentedFile.getSegment(PoolingSegmentedFile.java:41)
    at org.apache.cassandra.io.sstable.SSTableReader.getFileDataInput(SSTableReader.java:1163)
    at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.getNextBlock(IndexedSliceReader.java:362)
    at org.apache.cassandra.db.columniterator.IndexedSliceReader$IndexedBlockFetcher.fetchMoreData(IndexedSliceReader.java:332)
    at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:145)
    at org.apache.cassandra.db.columniterator.IndexedSliceReader.computeNext(IndexedSliceReader.java:45)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
    at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:82)
    at org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:157)
    at org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:140)
    at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:144)
    at org.apache.cassandra.utils.MergeIterator$ManyToOne.init(MergeIterator.java:87)
    at org.apache.cassandra.utils.MergeIterator.get(MergeIterator.java:46)
    at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:120)
    at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80)
    at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72)
    at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:294)
    at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:53)
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1468)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1294)
    at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:332)
    at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:65)
    at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1365)
    at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1897)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.nio.channels.ClosedChannelException
    at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:99)
    at sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:250)
    at org.apache.cassandra.io.compress.CompressedRandomAccessReader.decompressChunk(CompressedRandomAccessReader.java:101)
    at org.apache.cassandra.io.compress.CompressedRandomAccessReader.reBuffer(CompressedRandomAccessReader.java:87)

-- This message was sent
by Atlassian JIRA (v6.1.4#6159)
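For reference, the "Caused by" frames in the trace above match plain JDK behavior: FileChannel.position() throws ClosedChannelException once the channel has been closed. The standalone sketch below reproduces only that behavior. It is not Cassandra code; the class name ClosedChannelDemo, the method name, and the temp-file setup are made up for illustration, and it says nothing about what closed the channel in the reported case.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ClosedChannelDemo {
    /**
     * Opens a channel on a scratch file, closes it, then calls position().
     * Returns true if that call throws ClosedChannelException.
     * (Hypothetical helper, not part of Cassandra.)
     */
    public static boolean positionFailsAfterClose() {
        try {
            Path tmp = Files.createTempFile("closed-channel-demo", ".db");
            try {
                FileChannel channel = FileChannel.open(tmp, StandardOpenOption.READ);
                channel.close();          // stands in for whatever closed the channel
                try {
                    channel.position();   // same JDK call that appears in the trace
                    return false;
                } catch (ClosedChannelException e) {
                    return true;
                }
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            // Unrelated I/O failure (e.g. temp file could not be created).
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("position() after close throws: " + positionFailsAfterClose());
    }
}
```

Because the file on disk is untouched by close(), this is consistent with the observation that the data file remains readable even though the read failed.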