Thanks for the pointer, but I don't think this is the source of our problem, since we use a single data center and Ec2Snitch.
2013/3/14 Jean-Armel Luce <jaluc...@gmail.com>
> Hi Alain,
>
> Maybe it is due to https://issues.apache.org/jira/browse/CASSANDRA-5299
>
> A patch is provided with this ticket.
>
> Regards.
>
> Jean Armel
>
>
> 2013/3/14 Alain RODRIGUEZ <arodr...@gmail.com>
>
>> Hi
>>
>> We just tried to migrate our production cluster from C* 1.1.6 to 1.2.2.
>>
>> This has been a disaster. I just switched one node to 1.2.2, updated its
>> configuration (cassandra.yaml / cassandra-env.sh) and restarted it.
>>
>> It resulted in errors on all 5 remaining 1.1.6 nodes:
>>
>> ERROR [RequestResponseStage:2] 2013-03-14 09:53:25,750 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[RequestResponseStage:2,5,main]
>> java.io.IOError: java.io.EOFException
>>     at org.apache.cassandra.service.AbstractRowResolver.preprocess(AbstractRowResolver.java:71)
>>     at org.apache.cassandra.service.ReadCallback.response(ReadCallback.java:155)
>>     at org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:45)
>>     at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>     at java.lang.Thread.run(Thread.java:662)
>> Caused by: java.io.EOFException
>>     at java.io.DataInputStream.readFully(DataInputStream.java:180)
>>     at org.apache.cassandra.db.ReadResponseSerializer.deserialize(ReadResponse.java:100)
>>     at org.apache.cassandra.db.ReadResponseSerializer.deserialize(ReadResponse.java:81)
>>     at org.apache.cassandra.service.AbstractRowResolver.preprocess(AbstractRowResolver.java:64)
>>     ... 6 more
>>
>> I got this many times, and the entire cluster was unreachable from our
>> 4 clients (phpCassa, Hector, Cassie, Helenus).
>>
>> I decommissioned the 1.2.2 node to get our cluster answering queries
>> again. It worked.
>>
>> Then I tried to replace this node with a new C* 1.1.6 one, using the
>> same token as the decommissioned node. The node joined the ring and
>> switched to normal status before receiving any data.
>>
>> On all the other nodes I got:
>>
>> ERROR [MutationStage:8] 2013-03-14 10:21:01,288 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[MutationStage:8,5,main]
>> java.lang.AssertionError
>>     at org.apache.cassandra.locator.TokenMetadata.getToken(TokenMetadata.java:304)
>>     at org.apache.cassandra.service.StorageProxy$5.runMayThrow(StorageProxy.java:371)
>>     at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
>>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>     at java.lang.Thread.run(Thread.java:662)
>>
>> So I decommissioned this new 1.1.6 node, and we are now running with 5
>> servers, unbalanced along the ring, with no possibility of adding
>> nodes or upgrading the C* version.
>>
>> We are quite desperate over here.
>>
>> If anyone has any idea what could have happened and how to stabilize
>> the cluster, it would be very much appreciated.
>>
>> It's quite an emergency since we can't add nodes and are under heavy load.
>>
>>
>