Big old guess of something in the 1000's. Try benchmarking your workload and plugging the numbers in (my 50 ms per request is pretty high)...
- 8 cores * 8 writers per core = 64 concurrent writes; if each write request takes 50 ms, that's 1,280 max per sec
- 1 spindle * 16 readers per spindle = 16 concurrent reads; if each read request takes 50 ms, that's 320 max per sec

(reader and writer sizes from the help in conf/cassandra.yaml; a small script sketching this arithmetic follows the quoted thread below)

This is really just a guess; there are a lot more things going on in the system, and it gets even more complicated once it's turned on. But I know sometimes you just need to show you've thought about it :)

Hope that helps.

Aaron

On 25 Mar 2011, at 02:27, Brian Fitzpatrick wrote:

> Thanks for the tips on the replication factor. Any thoughts on the
> number of nodes in a cluster to support RF=3 with a workload of 400
> ops/sec (4-8K sized rows, 50/50 read/write)? Based on the "sweet
> spot" hardware referenced in the wiki (8-core, 16-32 GB RAM), what kind
> of ops/sec could I reasonably expect from each node? Just looking for
> a range to make some educated guesses.
>
> Thanks,
> Brian
>
> On Wed, Mar 23, 2011 at 9:04 PM, aaron morton <aa...@thelastpickle.com> wrote:
>> It really does depend on what your workload is like, and in the end it will
>> involve a certain amount of fudge factor.
>>
>> http://wiki.apache.org/cassandra/CassandraHardware provides some guidance.
>> http://wiki.apache.org/cassandra/MemtableThresholds can be used to get a
>> rough idea of the memory requirements. Note that secondary indexes are also
>> CFs with the same memory settings as the parent.
>>
>> With RF 3 you can afford to lose one replica for a token range and still be
>> available (assuming Quorum CL). With RF 5 you can lose 2 replicas and still
>> be available for the keys in that range.
>>
>> I've been careful to say "lose X replicas" because the other nodes in the
>> cluster don't count when considering an operation for a key. Two examples
>> with a 9-node cluster at RF 3: if you lose nodes 2 and 3 and they are
>> replicas for node 1, Quorum operations on keys in the range for node 1 will
>> fail (ranges for 2 and 3 will be OK). If you lose nodes 2 and 5, Quorum
>> operations will succeed for all keys.
>>
>> RF 3 is a reasonable starting point for some redundancy, RF 5 is more. After
>> that it's Web Scale (tm).
>>
>> Hope that helps
>> Aaron
>>
>> On 24 Mar 2011, at 04:04, Brian Fitzpatrick wrote:
>>
>> I'm going through the process of speccing out the hardware for a
>> Cassandra cluster. The relevant specs:
>>
>> - Support 460 operations/sec (50/50 read/write workload). Row sizes
>> range from 4 to 8 KB.
>> - Support 29 million objects for the first year
>> - Support 365 GB of storage for the first year, based on Cassandra tests
>> ((data + index + overhead) * replication factor of 3)
>>
>> I'm looking for advice on the node size for this cluster, the recommended
>> RAM per node, and whether RF=3 seems to be a good choice for general
>> availability and resistance to failure.
>>
>> I've looked at the YCSB benchmark paper and through the archives of this
>> mailing list for pointers, but I haven't found any general guidelines on
>> the recommended cluster size to support X operations/sec with Y data size
>> at a replication factor of Z that I could extrapolate from.
>>
>> Any and all recommendations appreciated.
>>
>> Thanks,
>> Brian
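For anyone who wants to plug their own numbers into the arithmetic at the top of Aaron's reply, here is a minimal Python sketch of that back-of-envelope model. It is only an illustration of the reasoning in the thread: the thread counts mirror the concurrent_writes (8 * cores) and concurrent_reads (16 * drives) guidance from conf/cassandra.yaml, the 50 ms per-request time is the deliberately pessimistic figure used above, and the quorum lines restate the RF discussion from the quoted mail; swap in your own benchmark results.

# Back-of-envelope Cassandra node sizing, following the model sketched above.
# All inputs are illustrative assumptions; replace them with measured numbers.

def max_ops_per_sec(concurrent_requests, latency_ms):
    # N requests in flight, each taking latency_ms, gives a throughput ceiling.
    return concurrent_requests * (1000.0 / latency_ms)

cores = 8
spindles = 1
writers = 8 * cores      # concurrent_writes guidance: 8 * number of cores
readers = 16 * spindles  # concurrent_reads guidance: 16 * number of drives

latency_ms = 50.0        # deliberately pessimistic per-request time, as above

print("max writes/sec ~ %d" % max_ops_per_sec(writers, latency_ms))  # ~1,280
print("max reads/sec  ~ %d" % max_ops_per_sec(readers, latency_ms))  # ~320

# Quorum fault tolerance per token range, per the RF discussion in the quoted mail.
for rf in (3, 5):
    quorum = rf // 2 + 1
    print("RF %d: quorum needs %d replicas, so %d can be lost per range"
          % (rf, quorum, rf - quorum))

With a 50/50 read/write mix, the spindle-bound read figure is usually the one that caps per-node throughput under this model, which is why benchmarking your own read latency matters most.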