Hi

After a cluster expansion, some Cassandra nodes may still hold rows whose tokens they no longer own. This data remains on disk until a cleanup compaction is run.

We would like to know the best way to calculate, at least approximately, how much of the data on each node is effectively dead, so that we can determine how much disk space will be freed once the expansion is complete. One approach we considered is to scan the sstables on each node, identify the rows the node no longer owns, and sum their sizes.
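To make the scanning idea concrete, here is a rough sketch of the calculation we have in mind. It assumes we can already produce, for each node, a list of (token, size in bytes) pairs from its sstables and the full set of token ranges the node owns after the expansion, replica ranges included. The names `owns` and `estimate_unowned_bytes` are hypothetical helpers of ours, not part of any Cassandra tool, and sizes obtained this way would only approximate the space actually freed.

```python
from typing import Iterable, List, Tuple

# A token range is (start, end]: start is exclusive, end is inclusive.
# A range whose start >= end is treated as wrapping around the ring.
TokenRange = Tuple[int, int]


def owns(ranges: List[TokenRange], token: int) -> bool:
    """Return True if the token falls in any of the node's owned ranges."""
    for start, end in ranges:
        if start < end:
            if start < token <= end:
                return True
        elif token > start or token <= end:
            # Wrapping range: covers (start, max] and [min, end].
            return True
    return False


def estimate_unowned_bytes(
    partitions: Iterable[Tuple[int, int]],
    owned_ranges: List[TokenRange],
) -> int:
    """Sum the sizes of partitions whose token the node no longer owns."""
    return sum(size for token, size in partitions if not owns(owned_ranges, token))
```

For example, with owned ranges `[(0, 100), (900, -900)]`, partitions at tokens 0 and 500 would be counted as no longer owned, while those at tokens 50, 100, 950 and -1000 would not.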


