-tmp- files will sit in the data dir. If there was an error creating them during compaction or flushing to disk, they will sit around until a restart.
Check the logs for errors to see if compaction was failing on something.

Cheers

-----------------
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 17/12/2013, at 12:28 pm, Narendra Sharma <narendra.sha...@gmail.com> wrote:

> No snapshots.
>
> I restarted the node and now the Load in ring is in sync with the disk usage.
> Not sure what caused it to go out of sync. However, the Live SStable count
> doesn't match exactly with the number of data files on disk.
>
> I am going through the Cassandra code to understand what could be the reason
> for the mismatch in the sstable count and also why there is no reference of
> some of the data files in system.log.
>
> On Mon, Dec 16, 2013 at 2:45 PM, Arindam Barua <aba...@247-inc.com> wrote:
>
> Do you have any snapshots on the nodes where you are seeing this issue?
> Snapshots will link to sstables which will cause them not be deleted.
>
> -Arindam
>
> From: Narendra Sharma [mailto:narendra.sha...@gmail.com]
> Sent: Sunday, December 15, 2013 1:15 PM
> To: user@cassandra.apache.org
> Subject: Cassandra 1.1.6 - Disk usage and Load displayed in ring doesn't match
>
> We have 8 node cluster. Replication factor is 3.
>
> For some of the nodes the Disk usage (du -ksh .) in the data directory for CF
> doesn't match the Load reported in nodetool ring command. When we expanded
> the cluster from 4 node to 8 nodes (4 weeks back), everything was okay. Over
> period of last 2-3 weeks the disk usage has gone up. We increased the RF from
> 2 to 3 2 weeks ago.
>
> I am not sure if increasing the RF is causing this issue.
>
> For one of the nodes that I analyzed:
>
> 1. nodetool ring reported load as 575.38 GB
>
> 2. nodetool cfstats for the CF reported:
>    SSTable count: 28
>    Space used (live): 572671381955
>    Space used (total): 572671381955
>
> 3. 'ls -1 *Data* | wc -l' in the data folder for CF returned
>    46
>
> 4. 'du -ksh .' in the data folder for CF returned
>    720G
>
> The above numbers indicate that there are some sstables that are obsolete and
> are still occupying space on disk. What could be wrong? Will restarting the
> node help? The cassandra process is running for last 45 days with no
> downtime. However, because the disk usage is high, we are not able to run
> full compaction.
>
> Also, I can't find reference to each of the sstables on disk in the
> system.log file. For eg I have one data file on disk as (ls -lth):
>
> 86G Nov 20 06:14
>
> I have system.log file with first line:
>
> INFO [main] 2013-11-18 09:41:56,120 AbstractCassandraDaemon.java (line 101)
> Logging initialized
>
> The 86G file must be a result of some compaction. I see no reference of data
> file in system.log file between 11/18 to 11/25. What could be the reason for
> that? The only reference is dated 11/29 when the file was being streamed to
> another node (new node).
>
> How can I identify the obsolete files and remove them? I am thinking about
> following. Let me know if it make sense.
>
> 1. Restart the node and check the state.
> 2. Move the oldest data files to another location (to another mount point)
> 3. Restart the node again
> 4. Run repair on the node so that it can get the missing data from its peers.
>
> I compared the numbers of a healthy node for the same CF:
>
> 1. nodetool ring reported load as 662.95 GB
>
> 2. nodetool cfstats for the CF reported:
>    SSTable count: 16
>    Space used (live): 670524321067
>    Space used (total): 670524321067
>
> 3. 'ls -1 *Data* | wc -l' in the data folder for CF returned
>    16
>
> 4. 'du -ksh .' in the data folder for CF returned
>    625G
>
> -Naren
>
> --
> Narendra Sharma
> Software Engineer
> http://www.aeris.com
> http://narendrasharma.blogspot.com/
>
> --
> Narendra Sharma
> Software Engineer
> http://www.aeris.com
> http://narendrasharma.blogspot.com/
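
For anyone repeating the comparison from this thread, here is a rough shell sketch of the two checks discussed: listing leftover -tmp- files and counting the Data files on disk to compare against the "SSTable count" from nodetool cfstats. The function names are made up for illustration, and the file-name patterns (`*-tmp-*`, `*-Data.db`) are an assumption about how SSTable components are named in the CF data directory, so check them against what `ls` actually shows on your node. `-maxdepth 1` is used so that anything under a snapshots subdirectory is not counted.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Cassandra or nodetool.
# Each takes one argument: the data directory of a single column family.

# Count SSTable Data components in the directory itself,
# skipping files whose names contain -tmp-.
count_data_files() {
    find "$1" -maxdepth 1 -type f -name '*-Data.db' ! -name '*-tmp-*' \
        | wc -l | tr -d ' '
}

# List files whose names contain -tmp- (possible leftovers from a
# failed compaction or flush).
list_tmp_files() {
    find "$1" -maxdepth 1 -type f -name '*-tmp-*'
}
```

Something like `count_data_files /path/to/data/<keyspace>/<cf>` could then be compared by hand with the SSTable count that nodetool cfstats reports for the same CF. The path is a placeholder, and the script only reads the directory; it does not delete or move anything.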