On Thu, Jan 28, 2010 at 9:23 PM, Suhail Doshi <suh...@mixpanel.com> wrote:
> We've started to use Cassandra in production and just have one node right
> now. Here's one of our ColumnFamilys:
>
> 16G Jan 28 22:28 SomeIndex-5467-Index.db
> 196M Jan 28 22:32 SomeIndex-5487-Index.db
>
> The first bottle neck you encounter is reads--writes are extremely
> fast even with one node.
>
> My question is, is the size of the *-Index.db files the amount of RAM
> you need available for Cassandra to do reads fast?
No. It depends on how much of your data is "hot." IIRC you are running
trunk -- look at the key cache hit rate with various KeyCacheFractions
and see how large it has to be to get an 80% hit rate or so.

> Next, if you provision more nodes will Cassandra distribute the data
> in memory so I don't need a single 16 GB node?

Yes. See the Ring Management section here:
http://wiki.apache.org/cassandra/Operations

> Can any client connect to any one node request info and it will
> get the info back from a node that has that part of the index in
> memory?

Yes.
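
To make the last two answers a bit more concrete, here is a rough sketch of the idea behind a token ring: each key hashes to a token, each node owns a range of tokens, and whichever node the client talks to forwards the read to the owner. This is only an illustration, not Cassandra's code. The `Ring` class, the node names, the way node tokens are assigned, and the MD5-based `token` function are all assumptions made up for the example, and replication is left out entirely.

```python
import bisect
import hashlib


def token(key: str) -> int:
    # Hypothetical token function: hash the key to a large integer.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy token ring: one token per node, no replication."""

    def __init__(self, nodes):
        # Assumption for the sketch: derive each node's token from its name.
        self._entries = sorted((token(name), name) for name in nodes)
        self._tokens = [t for t, _ in self._entries]

    def owner(self, key: str) -> str:
        # The owner is the first node whose token is >= the key's token,
        # wrapping around to the first node at the end of the ring.
        i = bisect.bisect_left(self._tokens, token(key))
        return self._entries[i % len(self._entries)][1]

    def read(self, coordinator: str, key: str) -> dict:
        # Any node can act as coordinator; it routes to the owning node.
        return {"coordinator": coordinator, "served_by": self.owner(key)}


ring = Ring(["node-a", "node-b", "node-c"])
for coordinator in ("node-a", "node-b", "node-c"):
    print(ring.read(coordinator, "user:42"))
```

Whichever coordinator the client picks, `served_by` comes out the same, because ownership depends only on the key's token and the ring layout. Adding nodes splits the token ranges, so each node holds (and needs to cache) a smaller share of the keys.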