Can you turn the logging up to DEBUG level and look for a message from
CassandraServer that says "... timed out"?
Also check the thread pool stats with nodetool tpstats to see if the node is
keeping up.
Aaron
On 7 Apr 2011, at 13:43, Sheng Chen wrote:
TimedOutException means that fewer than CL (consistency level) nodes responded
to the coordinator before the rpc_timeout expired.
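
To make that rule concrete, here is a minimal sketch of the check described above. It is not Cassandra's actual code: the function name, its parameters, and the timeout value used in the example are assumptions made purely for illustration.

```python
def coordinator_outcome(response_times_ms, required_acks, rpc_timeout_ms):
    """Hypothetical model of the coordinator's decision.

    response_times_ms: one entry per replica; None means it never answered.
    required_acks:     number of replies the consistency level (CL) requires.
    rpc_timeout_ms:    how long the coordinator waits for those replies.
    """
    # Count only the replicas that answered within the timeout window.
    in_time = sum(
        1 for t in response_times_ms
        if t is not None and t <= rpc_timeout_ms
    )
    # Fewer than CL replies in time is what the client sees as a timeout.
    return "ok" if in_time >= required_acks else "TimedOutException"
```

On a single node with CL ONE, `coordinator_outcome([None], 1, 10000)` models the case where the one replica does not answer in time, so the request times out even though nothing has failed outright.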
So it was overloaded, which makes sense when you say it only happens with
secondary indexes. Consider things like:
- reducing the throughput
- reducing the number of clients
-
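
As one way to try the "reducing the throughput" suggestion, a client could pace its own requests. The sketch below is a generic client-side throttle, not part of any Cassandra client library; the `Throttle` class and its parameters are hypothetical names chosen for this example.

```python
import time


class Throttle:
    """Hypothetical helper that spaces calls at most max_per_second apart."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / max_per_second  # minimum gap between requests
        self.clock = clock                    # injectable for testing
        self.sleep = sleep                    # injectable for testing
        self.next_allowed = clock()           # earliest time the next call may run

    def wait(self):
        """Block until the next request is allowed; return seconds slept."""
        now = self.clock()
        delay = self.next_allowed - now
        if delay > 0:
            self.sleep(delay)
            now = self.next_allowed
        # Schedule the slot for the following request.
        self.next_allowed = now + self.interval
        return max(delay, 0.0)
```

Calling `throttle.wait()` before each write (for example with `Throttle(max_per_second=500)`) caps the request rate from that client, which makes it easier to see whether the timeouts track load at all.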
Thank you Aaron.
It does not seem to be an overload problem.
I have 16 cores and 48 GB of RAM on the single node, and I reduced the number
of concurrent threads to 1.
Still, it suddenly dies with a timeout, while CPU, RAM, and disk load are
below 10% and write latency has been about 0.5 ms for the past 10