Hello Everybody,

I am trying to understand how disks fill up in Cassandra.

If we run Cassandra in a cluster of N (commodity) servers, and each server has the same DataFileDirectories configuration:

 <DataFileDirectories>
     <DataFileDirectory>/var/lib/cassandra/data</DataFileDirectory>
 </DataFileDirectories>

and the same disk size, what should I do to make sure the disks on those servers never fill up?
Does scaling Cassandra only require me to act as follows:

I just watch the usage percentage of the partition containing /var/lib/cassandra/data on each server, and if one of the servers reports a usage greater than a threshold (say 95%), I just have to add one extra node, the (N+1)th, to my cluster? Will the disk usage eventually stabilize, dropping from the average disk usage 'U' it had before the node addition to the lower value 'U * N/(N+1)'?
Is it really that easy? ;)
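
To make the arithmetic concrete, here is a rough Python sketch of what I have in mind. It is only an illustration: the function names are made up by me, and it assumes that data ends up spread perfectly evenly across all nodes once the new one has joined, which is exactly the part I am not sure Cassandra guarantees.

```python
import shutil


def partition_usage(path: str) -> float:
    """Return the used fraction (0.0 to 1.0) of the partition holding `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.used / usage.total


def nodes_over_threshold(usage_by_node: dict, threshold: float = 0.95) -> list:
    """Return the names of the nodes whose usage is at or above the threshold."""
    return [name for name, used in usage_by_node.items() if used >= threshold]


def expected_usage_after_adding_node(current_usage: float, node_count: int) -> float:
    """Ideal average usage after growing from N to N+1 nodes: U * N / (N + 1).

    ASSUMPTION: the same total amount of data is redistributed perfectly
    evenly over all N+1 nodes, and the old nodes actually free the space
    for the data they handed over.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return current_usage * node_count / (node_count + 1)
```

For example, with N = 4 servers at U = 95%, expected_usage_after_adding_node(0.95, 4) gives about 0.76, i.e. roughly 76% per server once the fifth node has taken its share. On each server I would call partition_usage("/var/lib/cassandra/data") to get the number to compare against the threshold.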

Thanks
Alex
