You want to get the hit rate to 0.9 or so, i.e. 90% of index lookups don't have to hit disk. Play with KeysCachedFraction and see what happens. Also use jconsole to see how close you are getting to your 3GB limit (hit the GC button to see how much memory is "really" being used, then add 25% or so for reasonable padding).
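The arithmetic behind that advice can be sketched roughly as below. This is only an illustration, not anything Cassandra or jconsole provides: the function names are made up for the example, and the numbers you feed in are whatever you read off jconsole yourself (heap used after pressing GC, plus the cache's Size and Capacity).

```python
def padded_heap_mb(used_after_gc_mb: float, padding: float = 0.25) -> float:
    """Estimate the heap needed: post-GC usage plus a safety margin (25% by default)."""
    return used_after_gc_mb * (1.0 + padding)


def fits_in_heap(used_after_gc_mb: float, heap_limit_mb: float,
                 padding: float = 0.25) -> bool:
    """True if the padded estimate still fits under the configured heap limit."""
    return padded_heap_mb(used_after_gc_mb, padding) <= heap_limit_mb


def cache_fill(size: int, capacity: int) -> float:
    """Fraction of the key cache currently in use (Size / Capacity from jconsole)."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return size / capacity


def meets_target(hit_rate: float, target: float = 0.9) -> bool:
    """True once the observed hit rate reaches the target (0.9 suggested above)."""
    return hit_rate >= target
```

For example, `meets_target(0.54)` is false for the HitRate quoted below, so there is room to raise KeysCachedFraction, as long as `fits_in_heap(...)` stays true against the 3GB (3072 MB) limit.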

On Sat, Jan 30, 2010 at 5:46 PM, Suhail Doshi <digitalwarf...@gmail.com> wrote:
> According jconsole on the main table I am having issues with:
>
> Capacity: 1164790
> HitRate: .54
> Size: 99753
>
> Right now my KeysCachedFraction is 0.2. The current memory allocated is 3G.
> What's a suggested KeysCachedFraction value?
>
> Suhail
>
> On Sat, Jan 30, 2010 at 5:58 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
>
>> the thing that will help most in 0.5 is to increase your
>> KeysCachedFraction to 0.2 or even more, depending on your workload.
>>
>> On Sat, Jan 30, 2010 at 5:23 AM, Suhail Doshi <digitalwarf...@gmail.com> wrote:
>> > An issue I've been seeing is it's really hard to scale Cassandra with reads.
>> > I've run top, vmstat, iostat. vmstat shows no swapping but iostat shows
>> > heavy saturation of %util and await times over 90ms with max rMB/s of 7-8.
>> >
>> > I have over 7G of memory dedicated across two nodes. I am wondering what the
>> > issue might be and how to solve this? I felt like 7 G would be enough.
>> >
>> > Suhail
>> >
>> > On Thu, Jan 28, 2010 at 7:32 PM, Ray Slakinski <r...@mahalo.com> wrote:
>> >
>> >> Cassandra auto shards, so you just need to point at your cluster and
>> >> cassandra does the rest. You should read up on different partitioners though
>> >> before you go live in production, because its not too easy to switch once
>> >> you make that decision.
>> >>
>> >> http://wiki.apache.org/cassandra/StorageConfiguration#Partitioner
>> >>
>> >> Ray Slakinski
>> >> On 2010-01-28, at 7:29 PM, Suhail Doshi wrote:
>> >>
>> >> > Another piece I am interested in is how cassandra distributes the data
>> >> > automatically. In MySQL you need to shard and you'd pick the shard to
>> >> > request info from--how does that translate in cassandra?
>> >> >
>> >> > On Thu, Jan 28, 2010 at 7:23 PM, Suhail Doshi <suh...@mixpanel.com> wrote:
>> >> >
>> >> >> We've started to use Cassandra in production and just have one node right
>> >> >> now. Here's one of our ColumnFamilys:
>> >> >>
>> >> >> 16G Jan 28 22:28 SomeIndex-5467-Index.db
>> >> >> 196M Jan 28 22:32 SomeIndex-5487-Index.db
>> >> >>
>> >> >> The first bottle neck you encounter is reads--writes are extremely fast
>> >> >> even with one node.
>> >> >>
>> >> >> My question is, is the size of the *-Index.db files the amount of RAM
>> >> >> you need available for Cassandra to do reads fast?
>> >> >>
>> >> >> What are some configuration options you would need to tweak besides the
>> >> >> JVM's max memory size being larger. Is there any default configurations
>> >> >> commonly missed?
>> >> >>
>> >> >> Next, if you provision more nodes will Cassandra distribute the data in
>> >> >> memory so I don't need a single 16 GB node? Is there anything I need to
>> >> >> build in my application logic to make this work correctly. Ideally, if I had
>> >> >> a 16 GB index, I'd want it spread across 4 4GB nodes. Can any client connect
>> >> >> to any one node request info and it will get the info back from a node that
>> >> >> has that part of the index in memory?
>> >> >>
>> >> >> What's the best way to do efficient reads?
>> >> >>
>> >> >> Suhail
>> >
>> > --
>> > http://mixpanel.com
>> > Blog: http://blog.mixpanel.com
>
> --
> http://mixpanel.com
> Blog: http://blog.mixpanel.com