That sounds like the combined effect of anti-compaction and the size
amplification of the default SizeTieredCompactionStrategy. If you keep
repeating those steps, the disk usage will eventually stop growing.
Of course, that's no reason to keep repeating them.
To fix this (if you
Sometimes time bucketing can be used to create manageable partition sizes. How
much data is attached to a day, week, or minute? Could you use a partition and
clustering key like: ((source, time_bucket), timestamp)?
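
A minimal sketch of the bucketing side of that idea, assuming daily buckets
labelled by UTC date. The table layout in the comment, the column names, and
the helper names time_bucket and buckets_between are illustrative assumptions,
not something taken from your schema.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical table this sketch assumes (names are illustrative only):
#   CREATE TABLE events (
#       source text, time_bucket text, timestamp timestamp, payload text,
#       PRIMARY KEY ((source, time_bucket), timestamp)
#   );


def time_bucket(ts: datetime) -> str:
    """Return the daily bucket label (UTC date) for a timezone-aware timestamp."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%d")


def buckets_between(start: datetime, end: datetime) -> list:
    """List every daily bucket touched by the inclusive range [start, end]."""
    day = start.astimezone(timezone.utc).date()
    last = end.astimezone(timezone.utc).date()
    buckets = []
    while day <= last:
        buckets.append(day.isoformat())
        day += timedelta(days=1)
    return buckets
```

On write, the row's time_bucket value would come from time_bucket(timestamp).
On read, each label from buckets_between(start, end) would drive one
single-partition query restricted on source, time_bucket, and a timestamp
range. A coarser or finer bucket (week, hour) only changes the label format
and the step size.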
Then your application logic can iterate through time buckets to pull out the
Hello Team,
Sorry if this is a simple question.
I was working on Cassandra 2.1.14
Node1 -- 4.5 MB data
Node2 -- 5.3 MB data
Node3 -- 4.9 MB data
Node3 had been down for 90 days.
I brought it up and it joined the cluster.
To sync data I ran nodetool repair --full
Repair was