> > So is it not true that each node contains all the data in the cluster?
No, not in the general case; in fact it is rarely the case. Usually R < N.
In my case I have N=6 and R=2, so each node holds roughly R/N = 1/3 of the data.
You configure R per keyspace under ReplicationFactor (v0.6.*) or
replication_factor (v0.7.*).
http://wiki.apache.org/cassandra/StorageConfiguration

On Thu, Dec 9, 2010 at 12:43 PM, Jonathan Colby <jonathan.co...@gmail.com> wrote:

> Thanks Ran. This helps a little, but unfortunately it's still a bit
> fuzzy for me. So is it not true that each node contains all the data
> in the cluster? I haven't come across any information on how clustered
> data is coordinated in Cassandra. How does my query get directed to
> the right node?
>
> On Thu, Dec 9, 2010 at 11:35 AM, Ran Tavory <ran...@gmail.com> wrote:
> > There are two numbers to look at: N, the number of hosts in the ring
> > (cluster), and R, the number of replicas for each data item. R is
> > configurable per keyspace.
> > Typically for large clusters N >> R. For very small clusters it makes
> > sense for R to be close to N, in which case Cassandra is useful so that
> > the database doesn't have a single point of failure, but not so much
> > because of the size of the data. But for large clusters it rarely makes
> > sense to have N=R; usually N >> R.
> >
> > On Thu, Dec 9, 2010 at 12:28 PM, Jonathan Colby <jonathan.co...@gmail.com>
> > wrote:
> >>
> >> I have a very basic question which I have been unable to find answered
> >> in online documentation on Cassandra.
> >>
> >> It seems like every node in a Cassandra cluster contains all the data
> >> ever stored in the cluster (i.e., all nodes are identical). I don't
> >> understand how you can scale this on commodity servers with merely
> >> internal hard disks. In other words, if I want to store 5 TB of
> >> data, does each node need a hard disk capacity of 5 TB?
> >>
> >> With HBase, memcached and other NoSQL solutions it is clearer how
> >> data is split up in the cluster and replicated for fault tolerance.
> >> Again, please excuse the rather basic question.
> >
> > --
> > /Ran
>

--
/Ran
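
P.S. To put rough numbers on the N and R discussion in this thread, here is a small Python sketch. It is a toy model under stated assumptions, not Cassandra's actual code: node tokens are evenly spaced, row keys are hashed with MD5 onto a 2**127 ring (loosely modelled on a random partitioner), and the R copies of a row go to the node owning the key's token plus the next R - 1 nodes around the ring (a simple rack-unaware placement). All function names are hypothetical and chosen only for illustration. It assumes R <= N.

```python
import hashlib
from bisect import bisect_left

RING_SIZE = 2 ** 127  # assumed token space for this toy model


def per_node_storage_tb(total_tb, n, r):
    """Approximate disk needed per node when load is evenly balanced."""
    return total_tb * r / n


def make_ring(n):
    """Return n evenly spaced node tokens in sorted order."""
    return [i * RING_SIZE // n for i in range(n)]


def token_for(key):
    """Map a row key to a position on the ring via MD5."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % RING_SIZE


def replicas_for(key, tokens, r):
    """Indices of the r nodes holding a key: the first node whose token is
    >= the key's token (wrapping around), then the next r - 1 nodes."""
    first = bisect_left(tokens, token_for(key)) % len(tokens)
    return [(first + i) % len(tokens) for i in range(r)]


def load_fractions(keys, n, r):
    """Fraction of all keys that each node stores a copy of."""
    tokens = make_ring(n)
    counts = [0] * n
    for key in keys:
        for node in replicas_for(key, tokens, r):
            counts[node] += 1
    return [c / len(keys) for c in counts]


if __name__ == "__main__":
    # 5 TB of data, N=6 nodes, R=2 replicas: about 1.67 TB per node.
    print(per_node_storage_tb(5, n=6, r=2))
    keys = ["row-%d" % i for i in range(60000)]
    # Each value should come out close to R/N = 2/6.
    print(load_fractions(keys, n=6, r=2))
```

In this model every node ends up with about R/N of the rows, so 5 TB of data on N=6 and R=2 needs roughly 1.67 TB per node (plus headroom), not 5 TB. Because the replica set is computed from the key alone, whichever node receives a request can work out which nodes hold that row.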