Re: Easy way to overload a single node on purpose?

2011-06-17 Thread aaron morton
The short answer to the problem you saw is to monitor the disk space. Also monitor client-side logs for errors. Running out of commit log space does not stop the node from serving reads, so it can still be considered up. One node's view of its own UP-ness is not as important as the other nodes (or
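
A minimal sketch of the disk-space check suggested above, assuming a Python script run periodically on each node. The function names and the 0.9 threshold are illustrative assumptions, not anything from Cassandra itself; the path to check would be whichever partition holds the commit log.

```python
import shutil


def disk_used_fraction(path):
    """Return the used fraction (0.0 to 1.0) of the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return usage.used / usage.total


def low_on_space(path, max_used_fraction=0.9):
    """True when the filesystem holding `path` is at or above the threshold.

    The 0.9 default is an arbitrary illustrative value.
    """
    return disk_used_fraction(path) >= max_used_fraction
```

Calling low_on_space() with the commit log directory and alerting when it returns True would flag the partition before it reaches 100%.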

Re: Easy way to overload a single node on purpose?

2011-06-16 Thread aaron morton
The "DEBUG 14:36:55,546 ... timed out" message is logged when the coordinator times out waiting for the replicas to respond; the timeout setting is rpc_timeout in the yaml file. This results in the client getting a TimedOutException. AFAIK there is no global "everything is good / bad" flag to check.

Re: Easy way to overload a single node on purpose?

2011-06-16 Thread Suan Aik Yeo
Having a ping column can work if every key is replicated to every node. It would tell you the cluster is working, sort of. Once the number of nodes is greater than the RF, it only tells you that a subset of the nodes works. The way our check works is that each node checks itself, so in this context we're
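
To illustrate the point about RF versus node count, here is a rough sketch under a simplified model: nodes sit on a token ring and a key is stored on the first node at or after its token plus the next RF-1 nodes clockwise. This is a toy model for illustration, not Cassandra's actual placement code; the function names, tokens, and node names are all hypothetical.

```python
from bisect import bisect_left


def replicas_for(key_token, ring, rf):
    """Return the nodes holding a key under the simplified ring model.

    `ring` is a list of (token, node_name) pairs sorted by token.
    """
    tokens = [token for token, _ in ring]
    # First node whose token is >= the key's token, wrapping around the ring.
    start = bisect_left(tokens, key_token) % len(ring)
    count = min(rf, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


def uncovered_nodes(key_token, ring, rf):
    """Nodes that a read of this single ping key can never exercise."""
    covered = set(replicas_for(key_token, ring, rf))
    return sorted(node for _, node in ring if node not in covered)
```

With three nodes and RF=3, uncovered_nodes() returns an empty list for any key; with five nodes and RF=3, two nodes are always left out of a single ping key's replica set.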

Easy way to overload a single node on purpose?

2011-06-15 Thread Suan Aik Yeo
Here's a weird one... what's the best way to get a Cassandra node into a half-crashed state? We have a 3-node cluster running 0.7.5. A few days ago this happened organically to node1: the partition the commitlog was on was 100% full and there was a "No space left on device" error, and after a