> I've got a batch process running every so often that issues a bunch of
> counter increments. I have noticed that when this process runs without being
> throttled it will raise the CPU to 80-90% utilization on the nodes handling
> the requests. This in turn causes timeouts and general lag on queries
> running on the cluster.

This much is entirely expected. If you are not bottlenecked anywhere
else and are saturating the cluster, you will be bound by the
cluster's capacity (CPU, in this case), and that will affect the
latency of other traffic, no matter how fast or slow Cassandra is.

You do say "nodes handling the requests". Two things to always keep in
mind are to (1) spread the requests evenly across all members of the
cluster, and (2) if you are doing a lot of work per row key, spread it
around and be concurrent so that you're not hitting a single row at a
time, since each row is the responsibility of a single set of RF
nodes (you want to put load on the entire cluster evenly if you want
to maximize throughput).
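
As a rough illustration of both points, here is a minimal Python sketch.
It is not tied to any particular Cassandra client: the node names and
send_increment() are hypothetical placeholders for whatever client call
you actually use, and the shuffle is just one assumed way of keeping
consecutive requests from landing on the same row.

```python
import itertools
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical node names; substitute the real cluster members.
NODES = ["node1", "node2", "node3", "node4", "node5"]


def send_increment(node, row_key, delta):
    """Placeholder for a real client call; returns the node it was sent to."""
    return node


def run_batch(increments, nodes=NODES, workers=10, send=send_increment):
    """Submit (row_key, delta) pairs concurrently, rotating over all nodes.

    Returns a Counter of how many requests each node received.
    """
    # Shuffle a copy so consecutive requests are unlikely to hit the same row.
    shuffled = random.sample(increments, len(increments))
    # Pair each increment with the next node in round-robin order.
    assignments = zip(itertools.cycle(nodes), shuffled)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(send, node, row_key, delta)
            for node, (row_key, delta) in assignments
        ]
        return Counter(f.result() for f in futures)
```

With 100 increments and 5 nodes, each node ends up coordinating 20
requests, rather than one or two nodes taking all of them.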

> Is there anything that can be done to increase the throughput, I've been
> looking on the wiki and the mailing list and didn't find any optimization
> suggestions (apart from spreading the load on more nodes).
>
> Cluster is 5 node, BOP, RF=3, AMD opteron 4174 CPU (6 x 2.3 Ghz cores),
> Gigabit ethernet, RAID-0 SATA2 disks

For starters, what *is* the throughput? How many counter mutations are
you submitting per second?
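
If you don't have that number handy, something along these lines would
give a rough figure from the client side. This is only a sketch:
measure_rate() and the submit callable are hypothetical names, with
submit standing in for whatever issues one counter increment in your
batch process.

```python
import time


def measure_rate(submit, mutations, clock=time.monotonic):
    """Call submit() once per mutation and return mutations per second."""
    start = clock()
    count = 0
    for mutation in mutations:
        submit(mutation)
        count += 1
    elapsed = clock() - start
    # Guard against a zero interval on very small batches.
    return count / elapsed if elapsed > 0 else float("inf")
```

The clock parameter exists only so the function can be exercised with a
fake timer; in normal use the default is fine.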

-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)
