The important point to consider is whether you are deleting old data or
recently written data. How old/recent depends on your write rate to the
cluster, and there's no real formula. Basically you want to avoid deleting
a lot of old data all at once, because the tombstones will end up in new
SSTables while the data to be deleted lives in higher levels (LCS) or
large SSTables (STCS), and those won't get compacted together for a long
time. Neither the tombstones nor the data they shadow can be dropped until
they meet in the same compaction. In that case it makes no difference
whether you do one big purge or break it up: if the purge covers only old
data, all the tombstones will have to stick around for a while until they
make it to the higher levels/bigger SSTables.

If you have to purge large amounts of old data, the easiest way is to do
one or both of the following:

1. Make sure you have at least 50% disk free (for large/major
   compactions).
2. Use garbagecollect compactions (nodetool garbagecollect, 3.10+).
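
As a rough illustration of the two options above, here is a minimal Python
sketch that checks the 50% free-disk rule of thumb before a major compaction
and builds a nodetool garbagecollect command line. The function names, the
data directory path, and the keyspace/table names are hypothetical
placeholders, not anything from your cluster, and nothing here actually runs
nodetool.

```python
import shutil


def free_fraction(total_bytes: int, free_bytes: int) -> float:
    """Return the fraction of the disk that is free (0.0 to 1.0)."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return free_bytes / total_bytes


def has_compaction_headroom(data_dir: str, min_free: float = 0.5) -> bool:
    """Check the 'at least 50% disk free' rule of thumb for data_dir."""
    usage = shutil.disk_usage(data_dir)
    return free_fraction(usage.total, usage.free) >= min_free


def garbagecollect_command(keyspace: str, table: str) -> list:
    """Build (but do not run) a nodetool garbagecollect invocation."""
    return ["nodetool", "garbagecollect", keyspace, table]


if __name__ == "__main__":
    # Hypothetical path and names; substitute your own.
    data_dir = "."
    if has_compaction_headroom(data_dir):
        print("Enough headroom for a large/major compaction")
    else:
        print("Less than 50% free; consider garbagecollect instead:")
        print(" ".join(garbagecollect_command("my_keyspace", "my_table")))
```

The 0.5 threshold just mirrors the rule of thumb above; how much headroom
you really need depends on the size of the SSTables being compacted.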
