Have you tried restarting? It's possible there are open file handles to sstables that have already been compacted away. You can verify by running lsof against the Cassandra process and grepping for DEL or deleted.
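For example, something like this should surface handles to deleted files (a sketch only; pgrep matching on the CassandraDaemon class name is an assumption, adjust for how your process is launched):

    # handles held by the Cassandra process that point at deleted files
    lsof -p $(pgrep -f CassandraDaemon) | grep -i deleted

    # or list files whose link count is 0 (deleted but still open) system-wide
    lsof +L1 | grep -i '\.db'

If Data.db files show up as (deleted), a restart of that node will release them and the space should come back.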
If it's not that, you can run nodetool cleanup on each node to scan all of the sstables on disk and remove anything the node is no longer responsible for (example commands at the bottom of this message). Generally this only helps if you added nodes recently.

On Tuesday, January 12, 2016, Rahul Ramesh <rr.ii...@gmail.com> wrote:

> We have a 2 node Cassandra cluster with a replication factor of 2.
>
> The load factor on the nodes is around 350Gb
>
> Datacenter: Cassandra
> ==========
> Address      Rack   Status  State   Load      Owns     Token
>                                                        -5072018636360415943
> 172.31.7.91  rack1  Up      Normal  328.5 GB  100.00%  -7068746880841807701
> 172.31.7.92  rack1  Up      Normal  351.7 GB  100.00%  -5072018636360415943
>
> However, if I use df -h:
>
> /dev/xvdf   252G  223G  17G  94%  /HDD1
> /dev/xvdg   493G  456G  12G  98%  /HDD2
> /dev/xvdh   197G  167G  21G  90%  /HDD3
>
> HDD1, HDD2 and HDD3 contain only Cassandra data. It amounts to close to 1Tb on one
> of the machines, and close to 650Gb on the other.
>
> I started a repair 2 days ago; after running repair, the disk space
> consumption has actually increased.
> I also checked whether this is because of snapshots. nodetool listsnapshots
> intermittently lists a snapshot, but it goes away after some time.
>
> Can somebody please help me understand:
> 1. Why is so much disk space consumed?
> 2. Why did it increase after repair?
> 3. Is there any way to recover from this state?
>
> Thanks,
> Rahul
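As promised above, a minimal sketch of the cleanup pass, assuming nodetool is on the PATH and JMX is on its default port. Run it on one node at a time, since cleanup rewrites sstables and needs temporary headroom on disk (the keyspace name below is just an example):

    # on each node, one at a time
    nodetool cleanup                 # all keyspaces
    nodetool cleanup my_keyspace     # or restrict it to a single keyspace

Since repair can also leave snapshots behind, it's worth checking nodetool listsnapshots and, if anything is lingering, clearing it with nodetool clearsnapshot before concluding the space is gone for good.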