> Operation [158320] retried 10 times - error inserting key 0158320
> ((UnavailableException))
This means that at the point where the Thrift request to write data was handled, the coordinator node (the one your client is connected to) believed that, among the replicas responsible for the key, too many were down to satisfy the consistency level. The most likely causes are that you're in fact not using RF > 2 (e.g., is the RF really > 1 for the keyspace you're inserting into?), or that you're in fact not using consistency level ONE.

> I'm sure my naive setup is flawed in some way, but what I was hoping for was
> when the node went down it would fail to write to the downed node and instead
> write to one of the other nodes in the clusters. So question is why are
> writes failing even after a retry? It might be the stress client doesn't pool
> connections (I took

Writes always go to all responsible replicas that are up, and when enough of them return (according to the consistency level), the insert succeeds. If replicas fail to respond you may get a TimeoutException. UnavailableException means the coordinator didn't even try, because it didn't have enough live replicas to write to.

(Note though: reads are a bit of a different story, and if you want to test behavior when nodes go down I suggest including them. See CASSANDRA-2540 and CASSANDRA-3927.)

--
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)
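
For illustration, the distinction between the two exceptions can be sketched as a toy model in Python. This is not Cassandra's code: the function names, the dictionary of replica states, and the three consistency levels shown are assumptions chosen to mirror the description above (check liveness first, then wait for acknowledgements).

```python
class UnavailableException(Exception):
    """Too few replicas believed to be up; the write is never attempted."""


class TimeoutException(Exception):
    """The write was attempted, but too few replicas acknowledged in time."""


def required_acks(consistency_level, rf):
    # Acknowledgements needed for a toy subset of consistency levels.
    if consistency_level == "ONE":
        return 1
    if consistency_level == "QUORUM":
        return rf // 2 + 1
    if consistency_level == "ALL":
        return rf
    raise ValueError("unsupported consistency level: %s" % consistency_level)


def coordinate_write(replica_up, replica_acks, consistency_level):
    """Hypothetical coordinator logic for a single key.

    replica_up:   dict of replica name -> whether the coordinator believes it is up
    replica_acks: set of replica names that would acknowledge in time
    """
    rf = len(replica_up)  # one entry per replica responsible for the key
    needed = required_acks(consistency_level, rf)

    # Step 1: liveness check, before anything is sent.
    live = [name for name, up in replica_up.items() if up]
    if len(live) < needed:
        raise UnavailableException(
            "%d live replica(s), %d required" % (len(live), needed))

    # Step 2: send to every live replica and count acknowledgements.
    acked = [name for name in live if name in replica_acks]
    if len(acked) < needed:
        raise TimeoutException(
            "%d ack(s), %d required" % (len(acked), needed))
    return acked
```

In this model, RF = 1 with the single replica down gives UnavailableException at ONE, while RF = 3 with one replica down still succeeds at ONE, which matches the reasoning above about checking the keyspace's actual RF.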