Try `nodetool clearsnapshot`, which will delete any snapshots you have. I have never taken a snapshot with nodetool, yet I recently found several on my disk (and they can take a lot of space), so perhaps they are generated automatically by some operation? No idea. Regardless, nuking those freed up a ton of space for me.
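In case it helps, here's a rough sketch of how you might check for and clear those snapshots. This assumes the default data directory `/var/lib/cassandra/data` (adjust to whatever `data_file_directories` points at in your cassandra.yaml), and `my_keyspace` is just a placeholder. For what it's worth, snapshots can appear without you asking for them: `auto_snapshot` (on by default) takes one on every table drop/truncate, and sequential repair also snapshots, so that may explain the mystery files.

```shell
# List the snapshots the node knows about
# (nodetool listsnapshots is available since Cassandra 2.1):
command -v nodetool >/dev/null && nodetool listsnapshots

# See how much disk the snapshot directories actually occupy
# (path assumes the default data_file_directories):
du -sh /var/lib/cassandra/data/*/*/snapshots 2>/dev/null

# Drop every snapshot on the node:
command -v nodetool >/dev/null && nodetool clearsnapshot

# Or limit it to a single keyspace ("my_keyspace" is a placeholder):
command -v nodetool >/dev/null && nodetool clearsnapshot my_keyspace

true  # keep a clean exit status when nodetool is absent
```

The `du` line is handy either way, since it tells you whether snapshots are actually what's eating the disk before you delete anything.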
- Ian

On Mon, Dec 8, 2014 at 8:12 PM, Nate Yoder <n...@whistle.com> wrote:
> Hi All,
>
> I am new to Cassandra so I apologise in advance if I have missed anything
> obvious but this one currently has me stumped.
>
> I am currently running a 6 node Cassandra 2.1.1 cluster on EC2 using
> C3.2XLarge nodes which overall is working very well for us. However, after
> letting it run for a while I seem to get into a situation where the amount
> of disk space used far exceeds the total amount of data on each node and I
> haven't been able to get the size to go back down except by stopping and
> restarting the node.
>
> For example, in my data I have almost all of my data in one table. On one
> of my nodes right now the total space used (as reported by nodetool
> cfstats) is 57.2 GB and there are no snapshots. However, when I look at the
> size of the data files (using du) the data file for that table is 107GB.
> Because the C3.2XLarge only have 160 GB of SSD you can see why this quickly
> becomes a problem.
>
> Running nodetool compact didn't reduce the size and neither does running
> nodetool repair -pr on the node. I also tried nodetool flush and nodetool
> cleanup (even though I have not added or removed any nodes recently) but it
> didn't change anything either. In order to keep my cluster up I then
> stopped and started that node and the size of the data file dropped to 54GB
> while the total column family size (as reported by nodetool) stayed about
> the same.
>
> Any suggestions as to what I could be doing wrong?
>
> Thanks,
> Nate