> I ran this test previously on the cloud, with similar results:
>
> nodes   reads/sec
> 1       24,000
> 2       21,000
> 3       21,000
> 4       21,000
> 5       21,000
> 6       21,000
>
> In fact, I ran it twice out of disbelief (on different nodes the second time)
> to essentially identical results.

Something other than Cassandra just *has* to be fishy here, unless there
is some kind of bug causing communication with nodes that should not be
involved. It really sounds like there is a hidden bottleneck somewhere.

You already mention that you've run multiple test clients, so the client
should not be the bottleneck. What about bandwidth? I could imagine
bandwidth adding up a bit at those request rates. Is it possible all the
nodes are communicating with each other via some bottleneck (like a
100 Mbit link)?

What does the load "look like" when you observe the nodes while they are
bottlenecked? How much bandwidth is each machine pushing (ifstat, nload,
etc.)? Is Cassandra obviously CPU bound, or does it look idle?

Presumably Cassandra is not perfectly concurrent, and you may not
necessarily saturate 8 cores under this load. But as you add more and
more nodes and still only reach 21k/sec, you should get past the point
where each node is not even saturating a single core...

*Something* else is probably going on.

--
/ Peter Schuller
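
As a rough check on the bandwidth question above, here is a minimal sketch of the arithmetic. The function names are made up for illustration, and the per-read payload size is an assumption (the thread does not state row sizes). It also ignores protocol overhead and any inter-node traffic, so treat the numbers as an order-of-magnitude guide only.

```python
# Back-of-envelope check: could a shared link cap reads at ~21k/sec?
# All names are illustrative; payload sizes are assumptions, not measurements.


def link_bytes_per_sec(link_mbit: float) -> float:
    """Raw link capacity in bytes/sec (1 Mbit = 10**6 bits), no overhead."""
    return link_mbit * 1_000_000 / 8


def saturating_payload_bytes(reads_per_sec: float, link_mbit: float) -> float:
    """Bytes per read at which the given read rate would fill the link."""
    return link_bytes_per_sec(link_mbit) / reads_per_sec


def link_utilization(reads_per_sec: float, bytes_per_read: float,
                     link_mbit: float) -> float:
    """Fraction of the link consumed by reads of an assumed size."""
    return reads_per_sec * bytes_per_read / link_bytes_per_sec(link_mbit)


if __name__ == "__main__":
    reads = 21_000  # the plateau reported in the quoted test
    for mbit in (100, 1000):
        size = saturating_payload_bytes(reads, mbit)
        print(f"{mbit} Mbit/s link is full at about {size:.0f} bytes/read")
```

Under these assumptions, 21k reads/sec would fill a 100 Mbit link at roughly 600 bytes per read, while a gigabit link would need close to 6 KB per read. Comparing that against the actual row size and the per-machine figures from ifstat or nload should show quickly whether the network is a plausible culprit.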