RE: Is cleanup is required if cluster topology changes

Durity, Sean R via user Fri, 05 May 2023 05:55:05 -0700

I run cleanup in parallel, not serially, since it is a node-local operation. 
And I only run it in the impacted DC. With only 300 GB on a node, cleanup 
should not take very long. Check your compaction throughput setting.
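
As a rough back-of-the-envelope check on the throughput point, the sketch below estimates how long a node would need if cleanup had to rewrite every byte and were limited only by the compaction throttle. Both of those are simplifying assumptions, the function name `cleanup_hours` is hypothetical, and the throughput values are illustrative rather than anyone's actual settings.

```python
def cleanup_hours(data_gb: float, throughput_mb_per_s: float) -> float:
    """Estimate hours to rewrite data_gb at a given throttle (hypothetical helper).

    Assumes every byte is rewritten and the throttle is the only bottleneck,
    so treat the result as a rough figure, not a prediction.
    """
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive (0 means unthrottled)")
    seconds = data_gb * 1024 / throughput_mb_per_s
    return seconds / 3600


if __name__ == "__main__":
    # 300 GB per node, as in this thread; throttle values are examples only.
    for mb_per_s in (16, 32, 64):
        print(f"{mb_per_s} MB/s -> {cleanup_hours(300, mb_per_s):.1f} h")
```

Under these assumptions, doubling the throttle halves the estimate, which is why the setting is worth checking first.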


I ran cleanup in parallel on 53 nodes with over 3 TB of data each. It took 
about 6-8 hours. (And many nodes were done much earlier than that.) I restrict 
cleanup to one compaction thread, but I double the compaction throughput for 
the duration of the cleanup. The single thread protects against two large 
sstables being rewritten at the same time and running out of disk space.
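
The procedure described above can be sketched as a small Python helper that builds the nodetool command sequence for one node. This is only an illustration, not the script actually used: the function name `cleanup_plan`, its parameters, and the keyspace name are hypothetical, and it assumes the stock `nodetool` subcommands `setcompactionthroughput` and `cleanup` with the `-j` (jobs) option. The current limit would be read first with `nodetool getcompactionthroughput`.

```python
from typing import List, Optional


def cleanup_plan(current_mb_per_s: int,
                 keyspace: Optional[str] = None) -> List[List[str]]:
    """Build the nodetool commands for one node (hypothetical helper).

    current_mb_per_s is the node's existing compaction throughput limit,
    as reported by `nodetool getcompactionthroughput`.
    """
    if current_mb_per_s < 0:
        raise ValueError("throughput must be non-negative")
    # 0 means "unthrottled", so doubling it changes nothing.
    doubled = current_mb_per_s * 2
    cleanup = ["nodetool", "cleanup", "-j", "1"]  # one compaction thread
    if keyspace:
        cleanup.append(keyspace)
    return [
        ["nodetool", "setcompactionthroughput", str(doubled)],
        cleanup,
        # Restore the original limit once cleanup has finished.
        ["nodetool", "setcompactionthroughput", str(current_mb_per_s)],
    ]


if __name__ == "__main__":
    # "my_keyspace" is a placeholder name.
    for cmd in cleanup_plan(16, keyspace="my_keyspace"):
        print(" ".join(cmd))
```

Each command list could be handed to `subprocess.run(cmd, check=True)` on the node itself. Since cleanup is local to a node, the same plan can run on every node of the impacted DC at once.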

Sean Durity
From: manish khandelwal <manishkhandelwa...@gmail.com>
Sent: Friday, May 5, 2023 4:52 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Is cleanup is required if cluster topology changes

You can replace the node directly. Why add a node and then decommission 
another node? Just replace the old node with the new one; your topology remains 
the same, so there is no need to run cleanup.

On Fri, May 5, 2023 at 10:26 AM Jaydeep Chovatia 
<chovatia.jayd...@gmail.com<mailto:chovatia.jayd...@gmail.com>> wrote:

We use STCS, and our experience with cleanup is that it takes a long time to 
run in a 100-node cluster. We would like to replace one node every day for 
various purposes in our fleet.

If we run cleanup after each node replacement, then it might take, say, 15 days 
to complete, and that hinders our node replacement frequency.

Do you see any other options?

Jaydeep

On Thu, May 4, 2023 at 9:47 PM Jeff Jirsa 
<jji...@gmail.com<mailto:jji...@gmail.com>> wrote:
You should 100% trigger cleanup each time, or you’ll almost certainly resurrect 
data sooner or later.
If you’re using leveled compaction, it’s especially cheap. STCS and TWCS are 
worse, but if you’re really scaling that often, I’d consider LCS and 
running cleanup just before or just after each scaling operation.


On May 4, 2023, at 9:25 PM, Jaydeep Chovatia 
<chovatia.jayd...@gmail.com<mailto:chovatia.jayd...@gmail.com>> wrote:

Thanks, Jeff!
But in our environment we replace nodes quite often for various optimization 
purposes, say, almost one node per day (a node addition followed by a node 
decommission, which of course changes the topology), and we have a 100-node 
cluster with 300 GB per node. If we have to run cleanup on 100 nodes 
after every replacement, then it could take forever.
What is the recommendation until this is fixed in Cassandra itself as part 
of compaction (without externally triggering cleanup)?

Jaydeep

On Thu, May 4, 2023 at 8:14 PM Jeff Jirsa 
<jji...@gmail.com<mailto:jji...@gmail.com>> wrote:
Cleanup is fast and cheap, and basically a no-op if you haven’t changed the 
ring.
After Cassandra has transactional cluster metadata to make ring changes 
strongly consistent, Cassandra should do this in every compaction. But until 
then it’s left for operators to run when they’re sure the state of the ring is 
correct.




On May 4, 2023, at 7:41 PM, Jaydeep Chovatia 
<chovatia.jayd...@gmail.com<mailto:chovatia.jayd...@gmail.com>> wrote:

Isn't this considered a kind of bug in Cassandra because as we know cleanup is 
a lengthy and unreliable operation, so relying on the cleanup means higher 
chances of data resurrection?
Do you think we should discard the unowned token-ranges as part of the regular 
compaction itself? What are the pitfalls of doing this as part of compaction 
itself?

Jaydeep

On Thu, May 4, 2023 at 7:25 PM guo Maxwell 
<cclive1...@gmail.com<mailto:cclive1...@gmail.com>> wrote:
Compaction will just merge duplicate data and remove deleted data on this 
node. If you add or remove a node in the cluster, I think cleanup is needed. If 
cleanup failed, I think we should look into the reason.

Runtian Liu <curly...@gmail.com<mailto:curly...@gmail.com>> wrote on Fri, 
May 5, 2023 at 06:37:
Hi all,

Is cleanup the sole method to remove data that does not belong to a specific 
node? In a cluster, where nodes are added or decommissioned from time to time, 
failure to run cleanup may lead to data resurrection issues, as deleted data 
may remain on the node that lost ownership of certain partitions. Or is it true 
that normal compactions can also handle data removal for nodes that no longer 
have ownership of certain data?

Thanks,
Runtian


--
you are the apple of my eye !

