On Fri, Jul 16, 2010 at 6:06 PM, Oren Benjamin <o...@clearspring.com> wrote:
> The first goal was to reproduce the test described on spyced here: 
> http://spyced.blogspot.com/2010/01/cassandra-05.html
>
> Using Cassandra 0.6.3, a 4GB/160GB cloud server 
> (http://www.rackspacecloud.com/cloud_hosting_products/servers/pricing) with 
> default storage-conf.xml and cassandra.in.sh, here's what I got:
>
> Reads: 4,800/s
> Writes: 9,000/s
>
> Pretty close to the result posted on the blog, with a slightly lower write 
> performance (perhaps due to the availability of only a single disk for both 
> commitlog and data).

You're getting as close as you are because you're comparing 0.6
numbers with 0.5.  For 0.6 on the test machine used in the blog post
(quad core, 2 disks, 4GB) we were getting 7k reads and 14k writes.

In our tests we saw a 5-15% performance penalty from adding a
virtualization layer.  Things like only having a single disk are going
to stack on top of that.
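
As a rough worked example, here is the 5-15% range applied to the 0.6 numbers above. This is only arithmetic on figures already in this mail; the function and variable names are made up for illustration.

```python
# Apply the 5-15% virtualization penalty to the bare-metal 0.6 numbers
# quoted above (7k reads/s, 14k writes/s). Integer math avoids float noise.

BARE_METAL_OPS = {"reads": 7000, "writes": 14000}
PENALTY_PERCENT = (5, 15)  # best case, worst case


def apply_penalty(ops_per_sec, penalty_percent):
    """Return throughput after losing penalty_percent of ops_per_sec."""
    return ops_per_sec * (100 - penalty_percent) // 100


expected = {
    op: (apply_penalty(rate, PENALTY_PERCENT[1]),   # worst case
         apply_penalty(rate, PENALTY_PERCENT[0]))   # best case
    for op, rate in BARE_METAL_OPS.items()
}

for op, (low, high) in expected.items():
    print(f"{op}: {low}-{high} ops/s expected from virtualization alone")
```

That gives 5,950-6,650 reads/s and 11,900-13,300 writes/s. The 4,800 and 9,000 you measured are below both ranges, which is consistent with the single disk costing something extra, though other differences between the two machines could contribute as well.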

> The above was single node testing.  I'd expect to be able to add nodes and 
> scale throughput.  Unfortunately, I seem to be running into a cap of 21,000 
> reads/s regardless of the number of nodes in the cluster.

This is what I would expect if a single machine is handling all the
Thrift requests.  Are you spreading the client connections to all the
machines?
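
A minimal sketch of what I mean by spreading connections, assuming a load generator with a fixed number of client threads. The node addresses and the assign_hosts helper are hypothetical, and the actual Thrift connection setup is left out on purpose:

```python
from collections import Counter
from itertools import cycle

# Hypothetical node addresses; substitute the real cluster members.
NODES = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]


def assign_hosts(num_clients, nodes):
    """Give each client one host, cycling through the nodes round-robin."""
    ring = cycle(nodes)
    return [next(ring) for _ in range(num_clients)]


# Each client thread would open its connection to assignments[i]
# instead of every thread pointing at the same node.
assignments = assign_hosts(10, NODES)
per_node = Counter(assignments)

for node in NODES:
    print(node, per_node[node])
```

With every client pointed at one node, that node coordinates every request and becomes the ceiling no matter how many machines sit behind it.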

> The disk performance of the cloud servers have been extremely spotty... Is 
> this normal for the cloud?

Yes.

>  And if so, what's the solution re Cassandra?

The larger the instance you're using, the closer you are to having the
entire machine, meaning fewer other users are competing with you for
disk i/o.

Of course when you're renting the entire machine's worth, it can be
more cost-effective to just use dedicated hardware.

> However, Cassandra routes to the nearest node topologically and not to the 
> best performing one, so "bad" nodes will always result in high latency reads.

Cassandra routes reads around nodes with temporarily poor performance
in 0.7, btw.
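
To illustrate the general idea only (this is not the 0.7 code, and every name below is made up), latency-aware routing amounts to tracking recent response times per replica and preferring the ones that have been fastest lately:

```python
from collections import defaultdict, deque


class LatencyTracker:
    """Toy model: rank replicas by mean latency over a sliding window."""

    def __init__(self, window=100):
        # Keep only the most recent `window` samples per node.
        self.samples = defaultdict(lambda: deque(maxlen=window))

    def record(self, node, latency_ms):
        self.samples[node].append(latency_ms)

    def mean(self, node):
        s = self.samples.get(node)
        # Nodes with no samples score 0.0, so they get tried first.
        return sum(s) / len(s) if s else 0.0

    def rank(self, replicas):
        """Return replicas ordered from lowest to highest mean latency."""
        return sorted(replicas, key=self.mean)


tracker = LatencyTracker()
for node, ms in [("a", 5), ("a", 5), ("b", 50), ("b", 60), ("c", 4)]:
    tracker.record(node, ms)

ranked = tracker.rank(["a", "b", "c"])
print(ranked)
```

Because the window slides, a node that was slow for a while works its way back up the ranking once its recent samples improve, which is the "temporarily" part.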

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
