-tmp- files will sit in the data dir. If there was an error creating them 
during compaction or flushing to disk, they will sit around until a restart. 
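
As a rough way to see how much space such files take, here is a minimal Python sketch. It assumes the leftover files can be recognised simply by "-tmp-" appearing in the file name; the function name and the directory you pass in are illustrative, not part of Cassandra.

```python
import os


def find_tmp_sstable_files(data_dir):
    """Return sorted (path, size_in_bytes) pairs for files under data_dir
    whose name contains '-tmp-'.

    Assumption: leftover temporary sstable components carry '-tmp-' in
    their file name. Nothing is deleted; this only reports.
    """
    matches = []
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if "-tmp-" in name:
                path = os.path.join(root, name)
                matches.append((path, os.path.getsize(path)))
    return sorted(matches)
```

Call it with the CF data directory, e.g. find_tmp_sstable_files("."), and sum the sizes to compare against the gap between du and the reported Load.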

Check the logs for errors to see if compaction was failing on something.
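
A minimal Python sketch of that log check, assuming each entry in system.log starts with its level (as in the "INFO [main] ..." line quoted below). The function name and the optional keyword filter are illustrative only.

```python
def error_lines(lines, keyword=None):
    """Return the log lines whose level is ERROR.

    Assumption: the level is the first token on the line. If keyword is
    given, keep only ERROR lines containing it (case-insensitive), for
    example "compaction".
    """
    hits = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.lstrip().startswith("ERROR"):
            continue
        if keyword is None or keyword.lower() in line.lower():
            hits.append(line)
    return hits
```

Typical use would be to open system.log, pass the file object to error_lines(f, "compaction"), and read the surrounding entries for anything that matches.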

Cheers

-----------------
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 17/12/2013, at 12:28 pm, Narendra Sharma <narendra.sha...@gmail.com> wrote:

> No snapshots.
> 
> I restarted the node and now the Load in ring is in sync with the disk usage. 
> I am not sure what caused it to go out of sync. However, the live SSTable count 
> still doesn't exactly match the number of data files on disk.
> 
> I am going through the Cassandra code to understand what could be the reason 
> for the mismatch in the sstable count, and also why there is no reference to 
> some of the data files in system.log.
> 
> 
> 
> 
> On Mon, Dec 16, 2013 at 2:45 PM, Arindam Barua <aba...@247-inc.com> wrote:
>  
> 
> Do you have any snapshots on the nodes where you are seeing this issue?
> 
> Snapshots will link to sstables, which will cause them not to be deleted.
> 
>  
> 
> -Arindam
> 
>  
> 
> From: Narendra Sharma [mailto:narendra.sha...@gmail.com] 
> Sent: Sunday, December 15, 2013 1:15 PM
> To: user@cassandra.apache.org
> Subject: Cassandra 1.1.6 - Disk usage and Load displayed in ring doesn't match
> 
>  
> 
> We have an 8-node cluster. The replication factor is 3. 
> 
>  
> 
> On some of the nodes, the disk usage (du -ksh .) in the data directory for the CF 
> doesn't match the Load reported by the nodetool ring command. When we expanded 
> the cluster from 4 nodes to 8 nodes (4 weeks back), everything was okay. Over 
> the last 2-3 weeks the disk usage has gone up. We increased the RF from 
> 2 to 3 two weeks ago.
> 
>  
> 
> I am not sure if increasing the RF is causing this issue.
> 
>  
> 
> For one of the nodes that I analyzed:
> 
> 1. nodetool ring reported load as 575.38 GB
> 
>  
> 
> 2. nodetool cfstats for the CF reported:
> 
> SSTable count: 28
> 
> Space used (live): 572671381955
> 
> Space used (total): 572671381955
> 
>  
> 
>  
> 
> 3. 'ls -1 *Data* | wc -l' in the data folder for CF returned
> 
> 46
> 
>  
> 
> 4. 'du -ksh .' in the data folder for CF returned
> 
> 720G
> 
>  
> 
> The above numbers indicate that some obsolete sstables are still occupying 
> space on disk. What could be wrong? Will restarting the node help? The 
> cassandra process has been running for the last 45 days with no downtime. 
> However, because the disk usage is high, we are not able to run a full 
> compaction.
> 
>  
> 
> Also, I can't find a reference to every sstable on disk in the 
> system.log file. For example, I have one data file on disk as (ls -lth):
> 
> 86G Nov 20 06:14
> 
>  
> 
> I have system.log file with first line:
> 
> INFO [main] 2013-11-18 09:41:56,120 AbstractCassandraDaemon.java (line 101) 
> Logging initialized
> 
>  
> 
> The 86G file must be the result of some compaction. I see no reference to this 
> data file in system.log between 11/18 and 11/25. What could be the reason for 
> that? The only reference is dated 11/29, when the file was being streamed to 
> another node (a new node).
> 
>  
> 
> How can I identify the obsolete files and remove them? I am thinking about 
> the following. Let me know if it makes sense.
> 
> 1. Restart the node and check the state.
> 
> 2. Move the oldest data files to another location (to another mount point)
> 
> 3. Restart the node again
> 
> 4. Run repair on the node so that it can get the missing data from its peers.
> 
>  
> 
>  
> 
> I compared the numbers of a healthy node for the same CF:
> 
> 1. nodetool ring reported load as 662.95 GB
> 
>  
> 
> 2. nodetool cfstats for the CF reported:
> 
> SSTable count: 16
> 
> Space used (live): 670524321067
> 
> Space used (total): 670524321067
> 
>  
> 
> 3. 'ls -1 *Data* | wc -l' in the data folder for CF returned
> 
> 16
> 
>  
> 
> 4. 'du -ksh .' in the data folder for CF returned
> 
> 625G
> 
>  
> 
>  
> 
> -Naren
> 
>  
> 
> 
> 
>  
> 
> -- 
> Narendra Sharma
> 
> Software Engineer
> 
> http://www.aeris.com
> 
> http://narendrasharma.blogspot.com/
> 
>  
> 
> 
> 
> 
> 
