Question about insert performance in multiple node cluster

Flachbart, Dirk (HP Software - TransactionVision) Mon, 28 Feb 2011 09:25:55 -0800

Hi,

We are trying to use Cassandra for high-performance insertion of simple 
key/value records. I have set up Cassandra on two machines in my local 
network (Windows 2008 server), using pretty much the default configuration. I 
wrote a test driver in Java (using Thrift) that uses multiple threads to 
insert rows with a single 1 KB data column each (keys are unique strings of 
integer values). On each machine I am able to achieve around 9,000 
inserts/sec when running the test driver against the local Cassandra server.
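
For reference, the overall shape of such a multi-threaded insert driver can be 
sketched as below. This is only a minimal sketch, not the actual test driver: 
the class name InsertThroughput, the run() helper and the BiConsumer callback 
are hypothetical, and an in-memory map stands in for the Thrift insert call so 
that the example runs without a Cassandra server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.BiConsumer;

public class InsertThroughput {

    /**
     * Starts `threads` workers; each inserts `perThread` rows with a 1 KB value.
     * Keys are unique strings of integer values. Returns the total insert count.
     * `insert` is a hypothetical stand-in for the real client call.
     */
    public static long run(int threads, int perThread,
                           BiConsumer<String, byte[]> insert) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int offset = t * perThread; // disjoint key range per thread
                Callable<Long> worker = () -> {
                    byte[] value = new byte[1024]; // 1 KB payload
                    long done = 0;
                    for (int i = 0; i < perThread; i++) {
                        insert.accept(Integer.toString(offset + i), value);
                        done++;
                    }
                    return done;
                };
                results.add(pool.submit(worker));
            }
            long total = 0;
            for (Future<Long> f : results) {
                total += f.get(); // waits for the worker to finish
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumption: an in-memory map replaces the Cassandra insert.
        Map<String, byte[]> store = new ConcurrentHashMap<>();
        long start = System.nanoTime();
        long total = run(8, 10_000, store::put);
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("%d inserts in %.2f s (%.0f inserts/sec)%n",
                total, seconds, total / seconds);
    }
}
```

In a real run the callback would wrap the Thrift client's insert, with one 
connection per worker thread, since a Thrift client connection is generally 
not safe to share between threads.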


Then I set up a cluster with both machines and ran the same test again (the 
test driver is still local to one of the Cassandra nodes). Surprisingly, I did 
not see any improvement in insert performance: I got the same 9,000 
inserts/sec as when running with a single node. I know that I shouldn't expect 
linear scaling to 18,000 inserts/sec, but shouldn't I see at least some 
significant improvement? The CPU isn't fully loaded on either machine, and 
network utilization is low too (1000 Mbit/s network). Later on I also 
tested adding a third node, but that didn't improve anything either.

I suspect I'm doing something wrong with setting up the cluster. The only 
changes I made on the second machine were:


- AutoBootstrap=true
- Setting 'Seed' to the IP of the other node


Did I miss anything? Or am I simply wrong in expecting the throughput to scale 
when using multiple nodes?



Thanks,
Dirk
