Re: Recovery of Kafka cluster takes very long time

Alexey Sverdelov Mon, 10 Aug 2015 09:25:04 -0700

Hi Todd,

It is a good idea, thanks. There is no "num.recovery.threads.per.data.dir"
entry in our server.properties, so we are running the cluster with the
default value of 1. I will set it to 8 and try again.
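
For reference, a minimal sketch of the change I plan to make. It assumes
the full property name is num.recovery.threads.per.data.dir, as listed in
the Kafka broker configuration docs, and that 8 roughly matches the number
of CPU cores on our brokers (both are my assumptions, so adjust as needed):

```properties
# server.properties (broker configuration)
# Threads per data directory used for log recovery at startup
# and for flushing/closing logs at shutdown. The default is 1.
# 8 is an assumed value, chosen to roughly match the CPU core count.
num.recovery.threads.per.data.dir=8
```

The setting is per data directory, so a broker with several log.dirs
entries would use that many threads for each of them. It should be picked
up on the next broker restart.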


Alexey

On Mon, Aug 10, 2015 at 6:13 PM, Todd Palino <tpal...@gmail.com> wrote:

> It looks like you did an unclean shutdown of the cluster, in which case
> each open log segment in each partition needs to be checked upon startup.
> It doesn't really have anything to do with RF=3 specifically, but it does
> mean that each of your brokers has 6000 partitions to check.
>
> What is the setting of num.recovery.threads.per.data.dir in your broker
> configuration? The default is 1, which means that upon startup and
> shutdown, the broker only uses 1 thread per data directory for
> checking/closing log segments. If you increase this, it will parallelize
> both startup and shutdown. This is particularly helpful for recovering
> from an unclean shutdown.
> We generally set it to the number of CPUs in the system, because we want a
> fast recovery.
>
> -Todd
>
>
> On Mon, Aug 10, 2015 at 8:57 AM, Alexey Sverdelov <
> alexey.sverde...@googlemail.com> wrote:
>
> > Hi all,
> >
> > I have a 3-node Kafka cluster. There are ten topics, and every topic has
> > 600 partitions with RF=3.
> >
> > So, after a cluster restart I see log messages like "INFO
> > Recovering unflushed segment 0 in log...", and the complete recovery of
> > all 3 nodes takes 2+ hours.
> >
> > I don't know why it takes so long. Is it because of RF=3?
> >
> > Have a nice day,
> > Alexey
> >
>
