The nodetool cleanup docs explain this increase in disk space usage:

"Running the nodetool cleanup command causes a temporary increase in disk space 
usage proportional to the size of your largest SSTable. Disk I/O occurs when 
running this command."

http://docs.datastax.com/en/cassandra/2.1/cassandra/tools/toolsCleanup.html

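If it helps, here is a rough sketch of a pre-flight check based on the sentence quoted from the docs: it finds the largest SSTable data file under a data directory and compares it with the free space on that volume. The function names, the safety_factor parameter and the /var/lib/cassandra/data path are my own assumptions for illustration, not anything nodetool provides, so adjust them to your setup.

```python
import os
import shutil


def largest_sstable_bytes(data_dir):
    """Return the size in bytes of the largest *-Data.db file under data_dir (0 if none)."""
    largest = 0
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            # SSTable data components end in "-Data.db"; index, filter
            # and summary components are ignored here.
            if name.endswith("-Data.db"):
                path = os.path.join(root, name)
                try:
                    largest = max(largest, os.path.getsize(path))
                except OSError:
                    # The file may disappear mid-scan (e.g. compaction); skip it.
                    continue
    return largest


def has_cleanup_headroom(data_dir, safety_factor=1.0):
    """True if free space on data_dir's volume covers the largest SSTable times safety_factor."""
    free = shutil.disk_usage(data_dir).free
    return free >= largest_sstable_bytes(data_dir) * safety_factor


if __name__ == "__main__":
    data_dir = "/var/lib/cassandra/data"  # assumed location; change as needed
    if os.path.isdir(data_dir):
        print("largest SSTable (bytes):", largest_sstable_bytes(data_dir))
        print("enough headroom:", has_cleanup_headroom(data_dir, safety_factor=1.5))
```

Running something like this on each node before starting cleanup gives an early warning that the volume may fill up. It is only an estimate: the docs say the increase is proportional to the largest SSTable, not that it equals it, which is why the safety factor is left adjustable.
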
Cheers,
Akhil


> On 19/06/2017, at 7:47 PM, wxn...@zjqunshuo.com wrote:
> 
> Akhil, I agree with you that the node still has unwanted data, but why does 
> it have more data than before the cleanup?
> 
> More background:
> Before the cleanup, the node had 790GB of data. After the cleanup, I assumed 
> it would have less data. But in fact it has 1000GB, which is more than I 
> expected.
> The Cassandra daemon crashed and left files with the "tmp-" prefix in the 
> data directory, which indicates the cleanup task did not complete.
> 
> Cheers,
> -Simon
>  
> From: Akhil Mehra <akhilme...@gmail.com>
> Date: 2017-06-19 15:17
> To: wxn...@zjqunshuo.com
> CC: user <user@cassandra.apache.org>
> Subject: Re: Cleaning up related issue
> When you add a new node to the cluster, data is streamed from the old 
> nodes to the new node. The new node is now responsible for data 
> previously stored on the old nodes.
>  
> The cleanup process removes the data an old node no longer owns after a 
> new node has been added to the cluster.
>  
> In your case, cleanup failed on this node.
>  
> I think this node still has unwanted data that has not been cleaned up.
>  
> Cheers,
> Akhil
>  
>  
>  
>  
> > On 19/06/2017, at 7:00 PM, wxn...@zjqunshuo.com wrote:
> >
> > Thanks for the quick response. It's the existing node where the cleanup 
> > failed. It has a larger volume than other nodes.
> >  
> > From: Akhil Mehra
> > Date: 2017-06-19 14:56
> > To: wxn002
> > CC: user
> > Subject: Re: Cleaning up related issue
> > Is the node with the large volume a new node or an existing node? If it is 
> > an existing node, is this the one where nodetool cleanup failed?
> >
> > Cheers,
> > Akhil
> >
> >> On 19/06/2017, at 6:40 PM, wxn...@zjqunshuo.com wrote:
> >>
> >> Hi,
> >> After adding a new node, I started the cleanup task to remove the old data 
> >> on the other 4 nodes. All went well except on one node. The cleanup took 
> >> hours and the Cassandra daemon crashed on the third node. I checked the 
> >> node and found the crash was because of OOM. The Cassandra data volume had 
> >> zero space left. I removed the temporary files, which I believe were 
> >> created during the cleanup process, and started Cassandra.
> >>
> >> The node joined the cluster successfully, but I noticed one thing. In the 
> >> "nodetool status" output, the node holds much more data than the other 
> >> nodes. Normally the load should be 700GB, but it is actually 1000GB. Why is 
> >> it larger? Please see the output below.
> >>
> >> UN  10.253.44.149   705.98 GB  256  40.4%  9180b7c9-fa0b-4bbe-bf62-64a599c01e58  rack1
> >> UN  10.253.106.218  691.07 GB  256  39.9%  e24d13e2-96cb-4e8c-9d94-22498ad67c85  rack1
> >> UN  10.253.42.113   623.73 GB  256  39.3%  385ad28c-0f3f-415f-9e0a-7fe8bef97e17  rack1
> >> UN  10.253.41.165   779.38 GB  256  40.1%  46f37f06-9c45-492d-bd25-6fef7f926e38  rack1
> >> UN  10.253.106.210  1022.7 GB  256  40.3%  a31b6088-0cb2-40b4-ac22-aec718dbd035  rack1
> >>
> >> Cheers,
> >> -Simon
