I have a very basic question that I have been unable to find answered
in the online documentation on Cassandra.

It seems like every node in a Cassandra cluster contains all the data
ever stored in the cluster (i.e., all nodes are identical). I don't
understand how you can scale this on commodity servers with only
internal hard disks. In other words, if I want to store 5 TB of
data, does each node need a hard disk capacity of 5 TB?

With HBase, memcached, and other NoSQL solutions it is clearer how
data is split up across the cluster and replicated for fault tolerance.
Again, please excuse the rather basic question.
