Important Variables for Scaling

2011-06-16 Thread Schuilenga, Jan Taeke
Which variables (for instance throughput, CPU, I/O, connections) are
decisive when deciding to add a node to a Cassandra setup that is under
strain? We are trying to prove scalability, but at what point should a
node be added to get the best scalability result?


Cassandra scaling problem in virtualized environment

2011-06-14 Thread Schuilenga, Jan Taeke
Hi All, 

We are having issues testing Cassandra in a virtualized environment
(VMware ESX).
Our challenge is to combine a high number of concurrent users with a
very low maximum response time.
We immediately ran into a scalability problem: our performance
(transactions per second) unexpectedly degrades after adding nodes, even
though, as far as we can tell, we are not overcommitting host CPU
resources much.
We are therefore looking for best practices, or for anybody with
experience running Cassandra in a similar environment, to help us.
So far we have only found the following thread, which has not helped:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Node-added-no-performance-boost-are-the-tokens-correct-td6228872.html

Our current test setup, which uses the Java version of the Cassandra
load tool, consists of:
Hardware: VMware ESX cluster with IBM 3850 hosts (4x dual-core CPU) on
Compellent FC SAN for storage
Cassandra: 3 to 6 nodes, each a CentOS guest with 2 vCPUs (RF=2)

Jan-Taeke Schuilenga