Hi,

I'm wondering if there is a way to tell Kafka to spread out log file
deletion when decreasing the retention time of a topic, and if not,
whether such a feature would make sense.
I'm asking because this afternoon, after decreasing the retention time
from 2 months to 1 month on 4 of my topics, the whole cluster became
overloaded for approximately 15 minutes (every broker with a load of
25+, disk usage at almost 100%), with leader re-elections,
under-replicated partitions, and a bunch of consumers unable to make
progress.
The change removed 5 TiB of data across the 4 topics and I didn't check
beforehand how it would affect disk I/O, so it's on me that this
happened, but seeing how much data was removed I think it would make
sense to delete only a couple of segments at a time in order not to
overload the disks.
Right now I can only be careful and plan the decrease in small steps,
but that's going to be a little tedious.
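
To make "small steps" concrete, here is a minimal sketch of the kind of
schedule I mean. It only computes the intermediate retention.ms values
between the current and the target retention; the function name and the
5-day step size are placeholders picked for illustration, not anything
Kafka provides. Applying each value to the topic, and waiting for the
brokers to finish deleting the expired segments before moving on, would
still have to be done separately.

```python
DAY_MS = 24 * 60 * 60 * 1000  # one day in milliseconds


def retention_steps(current_ms, target_ms, step_ms):
    """Return the retention.ms values to apply one after another.

    Hypothetical helper: the values decrease by step_ms each time and
    the last one is always exactly target_ms.
    """
    if step_ms <= 0:
        raise ValueError("step_ms must be positive")
    if target_ms > current_ms:
        raise ValueError("target_ms must not exceed current_ms")

    steps = []
    value = current_ms - step_ms
    # Add intermediate values while they are still above the target.
    while value > target_ms:
        steps.append(value)
        value -= step_ms
    # Always finish on the target itself.
    steps.append(target_ms)
    return steps


# Example: go from roughly 2 months to 1 month in 5-day steps (assumed size).
schedule = retention_steps(60 * DAY_MS, 30 * DAY_MS, 5 * DAY_MS)
for retention_ms in schedule:
    print(f"retention.ms={retention_ms} ({retention_ms // DAY_MS} days)")
```

With these numbers that is six separate config changes per topic, each
of which expires about 5 days' worth of segments instead of 30 at once.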
How does everyone deal with this?
