Hi Bart,

Before changing anything, I would verify whether or not the affected broker is 
trying to catch up. Have you looked at the broker’s log? Do you see any errors? 
Check your metrics or the partition directories themselves to see if data is 
flowing into the broker.
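
If you have no metrics handy, a rough way to check the partition directories is
to snapshot their sizes twice and compare. The Python below is only a sketch
under my own assumptions: the function names are made up for illustration, it
counts only the `.log` segment files, and a directory that is not growing may
simply belong to an idle topic, so treat the result as a hint rather than proof.

```python
import os


def partition_sizes(log_dir):
    """Return {partition_dir_name: bytes in .log segment files} under log_dir."""
    sizes = {}
    for entry in os.scandir(log_dir):
        if not entry.is_dir():
            continue  # skip checkpoint files and other non-directory entries
        total = 0
        for name in os.listdir(entry.path):
            path = os.path.join(entry.path, name)
            if name.endswith(".log") and os.path.isfile(path):
                total += os.path.getsize(path)
        sizes[entry.name] = total
    return sizes


def growing_partitions(before, after):
    """Names of partition directories that grew between two snapshots."""
    return sorted(
        name for name, size in after.items() if size > before.get(name, 0)
    )
```

Call partition_sizes() on your data directory, wait a minute or so, call it
again, and pass both results to growing_partitions(). An empty list on a broker
that hosts busy topics suggests the replicas are not fetching.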

If you do want to reset the broker so that it starts a fresh resync, stop the 
Kafka broker service/process, run 'rm -rf /path/to/kafka-logs' (check the value 
of the log.dirs or log.dir property in your server.properties file for the 
path), and then start the service again. It should check in with ZooKeeper and 
then start following the topic partition leaders for all the topic partition 
replicas assigned to it.
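
To find the path(s) to wipe, something like the following Python sketch could
read them out of server.properties. The function name is hypothetical and the
parsing is deliberately simplified (plain key=value lines only). It relies on
log.dirs being a comma-separated list that takes precedence over log.dir when
both are set.

```python
def kafka_log_dirs(properties_text):
    """Return the data directories named in the text of a server.properties file."""
    props = {}
    for line in properties_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line.startswith("!"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    # log.dirs wins over log.dir; an empty list means neither property was set.
    raw = props.get("log.dirs") or props.get("log.dir") or ""
    return [d.strip() for d in raw.split(",") if d.strip()]
```

Read the file's contents into a string and pass it in. If the result is empty,
the broker is running with the default directory, so check the broker's
startup log for the effective value before deleting anything.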

-- Peter

>> On Oct 18, 2019, at 12:16 AM, Bart van Deenen <[email protected]> 
>> wrote:
> Hi all
> 
> We had a Kafka broker failure (too many open files, stupid), and now the 
> partitions on that broker will no longer become part of the ISR set. It's 
> been a few days (organizational issues), and we have significant amounts of 
> data on the ISR partitions.
> 
> In order to make the partitions on the broker become part of the ISR set 
> again, should I:
> 
> * increase `replica.lag.time.max.ms` on the broker to the number of ms that 
> the partitions are behind. I can guesstimate the value to about 7 days, or 
> should I measure it somehow?
> * stop the broker and wipe files (which ones?) and then restart it. Should I 
> also do stuff on zookeeper ?
> 
> Is there any _official_ information on how to deal with this situation?
> 
> Thanks for helping!
