Well, it seems I have nothing like this when I run a $ grep "Unknown host" /var/log/cassandra/system.log.
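
For reference, this is roughly what I checked on each node (a quick sketch; the cassandra.yaml path below is just the package-default location on our machines, adjust if your install differs), including the rpc_address setting mentioned below since that is what reportedly triggers the bug:

  # look for the startup exception Michal reported
  grep "Unknown host" /var/log/cassandra/system.log

  # double-check whether we actually run with rpc_address: 0.0.0.0
  grep -E "^(rpc_address|listen_address):" /etc/cassandra/cassandra.yaml
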
This issue was reported in 1.2.1 and the fix was committed to trunk. It may have been fixed in 1.2.2, even though I can't see the fix version in the JIRA ticket or in the changelog. Thanks again, even if I am still in trouble.

2013/3/14 Michal Michalski <mich...@opera.com>

> Just to make it clear: this bug will occur on a single-DC configuration too.
>
> In our case it resulted in an exception like this at the very end of node startup:
>
> ERROR [WRITE-/<SOME-IP>] 2013-02-27 12:14:55,433 CassandraDaemon.java (line 133) Exception in thread Thread[WRITE-/<SOME-IP>,5,main]
> java.lang.RuntimeException: Unknown host /0.0.0.0 with no default configured
>
> It will happen if your rpc_address is set to 0.0.0.0.
>
> M.
>
> On 14.03.2013 13:03, Alain RODRIGUEZ wrote:
>
>> Thanks for this pointer, but I don't think this is the source of our problem, since we use 1 data center and Ec2Snitch.
>>
>> 2013/3/14 Jean-Armel Luce <jaluc...@gmail.com>
>>
>>> Hi Alain,
>>>
>>> Maybe it is due to https://issues.apache.org/jira/browse/CASSANDRA-5299
>>>
>>> A patch is provided with this ticket.
>>>
>>> Regards.
>>>
>>> Jean Armel
>>>
>>> 2013/3/14 Alain RODRIGUEZ <arodr...@gmail.com>
>>>
>>>> Hi,
>>>>
>>>> We just tried to migrate our production cluster from C* 1.1.6 to 1.2.2.
>>>>
>>>> This has been a disaster. I just switched one node to 1.2.2, updated its configuration (cassandra.yaml / cassandra-env.sh) and restarted it.
>>>>
>>>> It resulted in errors on all 5 remaining 1.1.6 nodes:
>>>>
>>>> ERROR [RequestResponseStage:2] 2013-03-14 09:53:25,750 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[RequestResponseStage:2,5,main]
>>>> java.io.IOError: java.io.EOFException
>>>>         at org.apache.cassandra.service.AbstractRowResolver.preprocess(AbstractRowResolver.java:71)
>>>>         at org.apache.cassandra.service.ReadCallback.response(ReadCallback.java:155)
>>>>         at org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:45)
>>>>         at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>>         at java.lang.Thread.run(Thread.java:662)
>>>> Caused by: java.io.EOFException
>>>>         at java.io.DataInputStream.readFully(DataInputStream.java:180)
>>>>         at org.apache.cassandra.db.ReadResponseSerializer.deserialize(ReadResponse.java:100)
>>>>         at org.apache.cassandra.db.ReadResponseSerializer.deserialize(ReadResponse.java:81)
>>>>         at org.apache.cassandra.service.AbstractRowResolver.preprocess(AbstractRowResolver.java:64)
>>>>         ... 6 more
>>>>
>>>> I got this many times, and our entire cluster wasn't reachable by any of our 4 clients (phpCassa, Hector, Cassie, Helenus).
>>>>
>>>> I decommissioned the 1.2.2 node to get our cluster answering queries again. It worked.
>>>>
>>>> Then I tried to replace this node with a new C* 1.1.6 one using the same token as the decommissioned node. The node joined the ring and, before getting any data, switched to normal status.
>>>> In all the other nodes I had:
>>>>
>>>> ERROR [MutationStage:8] 2013-03-14 10:21:01,288 AbstractCassandraDaemon.java (line 135) Exception in thread Thread[MutationStage:8,5,main]
>>>> java.lang.AssertionError
>>>>         at org.apache.cassandra.locator.TokenMetadata.getToken(TokenMetadata.java:304)
>>>>         at org.apache.cassandra.service.StorageProxy$5.runMayThrow(StorageProxy.java:371)
>>>>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>>>>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
>>>>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>>>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>>         at java.lang.Thread.run(Thread.java:662)
>>>>
>>>> So I decommissioned this new 1.1.6 node, and we are now running with 5 servers, not balanced along the ring, with no possibility of adding nodes or upgrading the C* version.
>>>>
>>>> We are quite desperate over here.
>>>>
>>>> If someone has any idea of what could have happened and how to stabilize the cluster, it would be very much appreciated.
>>>>
>>>> It's quite an emergency since we can't add nodes and are under heavy load.
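
In case it helps to spot a mistake on my side: this is roughly what "switching one node to 1.2.2" looked like for us, as a sketch only (we use a package install managed through service; the drain / upgradesstables lines are the usual upgrade-guide steps, included for completeness rather than something I claim changes the outcome):

  nodetool -h localhost drain            # flush memtables and stop accepting writes on this node
  sudo service cassandra stop
  # install the 1.2.2 package, then merge our settings into the new
  # cassandra.yaml and cassandra-env.sh shipped with 1.2
  sudo service cassandra start
  nodetool -h localhost upgradesstables  # rewrite sstables into the new on-disk format once the node is back up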