Another thing I'm interested in is how Cassandra distributes data
automatically. In MySQL you have to shard yourself and pick which shard to
request info from--how does that translate to Cassandra?
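
To make the question concrete, here is a rough sketch of the two models as I
understand them. It is a toy illustration of consistent hashing, not
Cassandra's actual code: pick_shard, TokenRing, and the node names are all
made up for the example, and deriving a node's token from its name is an
assumption made only to keep the sketch short.

```python
import bisect
import hashlib


def md5_token(value):
    """Map a string to a large integer token using MD5."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


def pick_shard(key, shards):
    """Manual sharding: the application itself decides which shard to query."""
    return shards[md5_token(key) % len(shards)]


class TokenRing:
    """Toy hash ring: each node owns the token range that ends at its token."""

    def __init__(self, nodes):
        # Assumption: one token per node, derived from the node's name.
        self._ring = sorted((md5_token(name), name) for name in nodes)
        self._tokens = [token for token, _ in self._ring]

    def owner(self, key):
        # Find the first node token >= the key's token, wrapping past the end.
        index = bisect.bisect_left(self._tokens, md5_token(key))
        return self._ring[index % len(self._ring)][1]


shards = ["mysql-0", "mysql-1", "mysql-2", "mysql-3"]
ring = TokenRing(["node-a", "node-b", "node-c", "node-d"])

# With manual sharding the client computes the target; with a ring, the
# cluster could do the same lookup on the client's behalf.
print(pick_shard("user:42", shards))
print(ring.owner("user:42"))
```

If something like the second model is what happens inside the cluster, then
the application would not need its own shard map, which is what I'm hoping
for.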

On Thu, Jan 28, 2010 at 7:23 PM, Suhail Doshi <suh...@mixpanel.com> wrote:

> We've started to use Cassandra in production and just have one node right
> now. Here's one of our ColumnFamilys:
>
> 16G Jan 28 22:28 SomeIndex-5467-Index.db
> 196M Jan 28 22:32 SomeIndex-5487-Index.db
>
> The first bottleneck you encounter is reads--writes are extremely fast even
> with one node.
>
> My question is, is the size of the *-Index.db files the amount of RAM you 
> need available for Cassandra to do reads fast?
>
> What configuration options would you need to tweak besides raising the JVM's
> max memory size? Are there any default settings that are commonly
> missed?
>
> Next, if you provision more nodes, will Cassandra distribute the data in
> memory so I don't need a single 16 GB node? Is there anything I need to build
> into my application logic to make this work correctly? Ideally, if I had a 16
> GB index, I'd want it spread across four 4 GB nodes. Can a client connect to
> any one node, request info, and get it back from a node that has
> that part of the index in memory?
>
> What's the best way to do efficient reads?
>
> Suhail
>
>
