Re: [DISCUSS] KIP-831: Add metric for log recovery progress

Luke Chen Sat, 21 May 2022 21:11:13 -0700

Hi Tom and Raman,

Thanks for your comments.


> 1. There's not a JIRA for this KIP (or the JIRA link needs updating).
> 2. Similarly the link to this discussion thread needs updating.
> Please update the links to JIRA and the discussion thread.

Yes, thanks for the reminder. I've updated the KIP.

> 3. I wonder whether we need to keep these metrics (with value 0) once the
broker enters the running state. Do you see it as valuable? A benefit of
removing the metrics would be a reduction on storage required for metric
stores which are recording these metrics.

Yes, removing the metrics after log recovery completes is a good idea.
Updated the KIP.
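
To make the intended lifecycle concrete (register the gauge while recovery runs, remove it once recovery finishes), here is a minimal Python sketch. It is only an illustration: `MetricRegistry`, `recover_logs` and the gauge naming scheme are hypothetical stand-ins, not Kafka's metrics API. Only the metric name `RemainingLogsToRecover` comes from the KIP.

```python
from typing import Callable, Dict, List


class MetricRegistry:
    """Hypothetical in-memory registry standing in for a real metrics library."""

    def __init__(self) -> None:
        self._gauges: Dict[str, Callable[[], int]] = {}

    def register_gauge(self, name: str, supplier: Callable[[], int]) -> None:
        self._gauges[name] = supplier

    def remove_gauge(self, name: str) -> None:
        self._gauges.pop(name, None)

    def snapshot(self) -> Dict[str, int]:
        # Evaluate every registered gauge at the moment of the call.
        return {name: supplier() for name, supplier in self._gauges.items()}


def recover_logs(registry: MetricRegistry, log_dir: str, logs: List[str]) -> None:
    remaining = {"count": len(logs)}
    # The naming scheme below is an assumption made for this sketch.
    name = f"RemainingLogsToRecover[{log_dir}]"
    registry.register_gauge(name, lambda: remaining["count"])
    try:
        for _ in logs:
            # Real recovery work would happen here.
            remaining["count"] -= 1
    finally:
        # Remove the gauge so metric stores stop recording a constant 0.
        registry.remove_gauge(name)
```

After `recover_logs` returns, `registry.snapshot()` no longer contains the gauge, which is the storage saving Tom described.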

> 4. I think the KIP's public interfaces section could be a bit clearer.
Previous KIPs which added metrics usually used a table, with the MBean
name, metric type and description. See KIP-551 for example (or KIP-748,
KIP-608). Similarly you could use a table in the proposed changes section
rather than describing the tree you'd see in an MBean console.

Good point! Updated the KIP to use a table to list the MBean name, metric
type and descriptions.


Thank you.
Luke

On Fri, May 20, 2022 at 9:13 AM Raman Verma <[email protected]>
wrote:

> Hi Luke,
>
> The change is useful and simple. Thanks.
> Please update the links to JIRA and the discussion thread.
>
> Best Regards,
> Raman Verma
>
> On Thu, May 19, 2022 at 8:57 AM Tom Bentley <[email protected]> wrote:
> >
> > Hi Luke,
> >
> > Thanks for the KIP. I think the idea makes sense and would provide useful
> > observability of log recovery. I have a few comments.
> >
> > 1. There's not a JIRA for this KIP (or the JIRA link needs updating).
> > 2. Similarly the link to this discussion thread needs updating.
> > 3. I wonder whether we need to keep these metrics (with value 0) once the
> > broker enters the running state. Do you see it as valuable? A benefit of
> > removing the metrics would be a reduction on storage required for metric
> > stores which are recording these metrics.
> > 4. I think the KIP's public interfaces section could be a bit clearer.
> > Previous KIPs which added metrics usually used a table, with the MBean
> > name, metric type and description. See KIP-551 for example (or KIP-748,
> > KIP-608). Similarly you could use a table in the proposed changes section
> > rather than describing the tree you'd see in an MBean console.
> >
> > Kind regards,
> >
> > Tom
> >
> > On Wed, 11 May 2022 at 09:08, Luke Chen <[email protected]> wrote:
> >
> > > > And if people start using RemainingLogs and RemainingSegments and
> then
> > > REALLY FEEL like they need RemainingBytes, then we can always add it
> in the
> > > future.
> > >
> > > +1
> > >
> > > Thanks James!
> > > Luke
> > >
> > > On Wed, May 11, 2022 at 3:57 PM James Cheng <[email protected]>
> wrote:
> > >
> > > > Hi Luke,
> > > >
> > > > Thanks for the detailed explanation. I agree that the current
> proposal of
> > > > RemainingLogs and RemainingSegments will greatly improve the
> situation,
> > > and
> > > > that we can go ahead with the KIP as is.
> > > >
> > > > If RemainingBytes were straight-forward to implement, then I’d like
> to
> > > > have it. But we can live without it for now. And if people start
> using
> > > > RemainingLogs and RemainingSegments and then REALLY FEEL like they
> need
> > > > RemainingBytes, then we can always add it in the future.
> > > >
> > > > Thanks Luke, for the detailed explanation, and for responding to my
> > > > feedback!
> > > >
> > > > -James
> > > >
> > > > Sent from my iPhone
> > > >
> > > > > On May 10, 2022, at 6:48 AM, Luke Chen <[email protected]> wrote:
> > > > >
> > > > > Hi James and all,
> > > > >
> > > > > > I checked again, and I can see that when we create a UnifiedLog, we
> > > > > > expect the logs/indexes/snapshots to be in a good state.
> > > > > So, I don't think we should break the current design to expose the
> > > > > `RemainingBytesToRecovery`
> > > > > metric.
> > > > >
> > > > > > If there are no other comments, I'll start a vote within this week.
> > > > >
> > > > > Thank you.
> > > > > Luke
> > > > >
> > > > >> On Fri, May 6, 2022 at 6:00 PM Luke Chen <[email protected]>
> wrote:
> > > > >>
> > > > >> Hi James,
> > > > >>
> > > > >> Thanks for your input.
> > > > >>
> > > > > >> For the `RemainingBytesToRecovery` metric proposal, I think there's
> > > > > >> one thing I didn't make clear.
> > > > > >> Currently, when the log manager starts up, we try to load all logs
> > > > > >> (segments), and during log loading, we try to recover logs if
> > > > > >> necessary.
> > > > > >> And log loading uses a thread pool, as you thought.
> > > > >>
> > > > >> So, here's the problem:
> > > > > >> All segments in each log folder (partition) are loaded by a log
> > > > > >> recovery thread, and only once they are loaded can we know how many
> > > > > >> segments (or how many bytes) need to be recovered.
> > > > > >> That means, if we have 10 partition logs in one broker and 2 log
> > > > > >> recovery threads (num.recovery.threads.per.data.dir=2), then before
> > > > > >> the threads load the segments in each log, we only know how many logs
> > > > > >> (partitions) we have in the broker (i.e. the RemainingLogsToRecover
> > > > > >> metric). We cannot know how many segments/bytes need to be recovered
> > > > > >> until each thread starts to load the segments under one log
> > > > > >> (partition).
> > > > >>
> > > > > >> So, the example in the KIP shows that there are currently 5 logs
> > > > > >> (partitions) still to be recovered under the /tmp/log1 dir, and that
> > > > > >> 2 threads are doing the work: one thread has 10000 segments to
> > > > > >> recover, and the other has 3 segments to recover.
> > > > >>
> > > > > >>   - kafka.log
> > > > > >>      - LogManager
> > > > > >>         - RemainingLogsToRecover
> > > > > >>            - /tmp/log1 => 5       ← there are 5 logs under /tmp/log1 needed to be recovered
> > > > > >>            - /tmp/log2 => 0
> > > > > >>         - RemainingSegmentsToRecover
> > > > > >>            - /tmp/log1            ← 2 threads are doing log recovery for /tmp/log1
> > > > > >>               - 0 => 10000        ← there are 10000 segments needed to be recovered for thread 0
> > > > > >>               - 1 => 3
> > > > > >>            - /tmp/log2
> > > > > >>               - 0 => 0
> > > > > >>               - 1 => 0
> > > > >>
> > > > >>
> > > > > >> So, after a while, the metrics might look like this:
> > > > > >> It says that there are now only 3 logs left to recover in /tmp/log1,
> > > > > >> thread 0 has 9000 segments left, and thread 1 has 5 segments left
> > > > > >> (which should imply that thread has already completed the recovery
> > > > > >> of 2 logs in the period).
> > > > >>
> > > > > >>   - kafka.log
> > > > > >>      - LogManager
> > > > > >>         - RemainingLogsToRecover
> > > > > >>            - /tmp/log1 => 3       ← there are 3 logs under /tmp/log1 needed to be recovered
> > > > > >>            - /tmp/log2 => 0
> > > > > >>         - RemainingSegmentsToRecover
> > > > > >>            - /tmp/log1            ← 2 threads are doing log recovery for /tmp/log1
> > > > > >>               - 0 => 9000         ← there are 9000 segments needed to be recovered for thread 0
> > > > > >>               - 1 => 5
> > > > > >>            - /tmp/log2
> > > > > >>               - 0 => 0
> > > > > >>               - 1 => 0
> > > > >>
> > > > >>
> > > > > >> That said, the `RemainingBytesToRecovery` metric is difficult to
> > > > > >> implement in the way you expected. I think the current proposal with
> > > > > >> `RemainingLogsToRecover` and `RemainingSegmentsToRecover` should
> > > > > >> already provide enough information about log recovery progress.
> > > > >>
> > > > >> I've also updated the KIP example to make it clear.
> > > > >>
> > > > >>
> > > > >> Thank you.
> > > > >> Luke
> > > > >>
> > > > >>
> > > > >>> On Thu, May 5, 2022 at 3:31 AM James Cheng <[email protected]
> >
> > > > wrote:
> > > > >>>
> > > > >>> Hi Luke,
> > > > >>>
> > > > >>> Thanks for adding RemainingSegmentsToRecovery.
> > > > >>>
> > > > > >>> Another thought: different topics can have different segment sizes.
> > > > > >>> I don't know how common it is, but it is possible. Some topics might
> > > > > >>> want small segment sizes to get more granular expiration of data.
> > > > >>>
> > > > >>> The downside of RemainingLogsToRecovery and
> > > RemainingSegmentsToRecovery
> > > > >>> is that the rate that they will decrement depends on the
> > > configuration
> > > > and
> > > > >>> patterns of the topics and partitions and segment sizes. If
> someone
> > > is
> > > > >>> monitoring those metrics, they might see times where the metric
> > > > decrements
> > > > >>> slowly, followed by a burst where it decrements quickly.
> > > > >>>
> > > > >>> What about RemainingBytesToRecovery? This would not depend on the
> > > > >>> configuration of the topic or of the data. It would actually be a
> > > > pretty
> > > > >>> good metric, because I think that this metric would change at a
> > > > constant
> > > > >>> rate (based on the disk I/O speed that the broker allocates to
> > > > recovery).
> > > > >>> Because it changes at a constant rate, you would be able to use
> the
> > > > >>> rate-of-change to predict when it hits zero, which will let you
> know
> > > > when
> > > > >>> the broker is going to start up. Like, I would imagine if we
> graphed
> > > > >>> RemainingBytesToRecovery that we'd see a fairly straight line
> that is
> > > > >>> decrementing at a steady rate towards zero.
> > > > >>>
> > > > >>> What do you think about adding RemainingBytesToRecovery?
> > > > >>>
> > > > >>> Or, what would you think about making the primary metric be
> > > > >>> RemainingBytesToRecovery, and getting rid of the others?
> > > > >>>
> > > > > >>> I don't know if I personally would rather have all 3 metrics, or
> > > > > >>> would just use RemainingBytesToRecovery. I too would like more
> > > > > >>> community input on which of those metrics would be useful to people.
> > > > >>>
> > > > >>> About the JMX metrics, you said that if
> > > > >>> num.recovery.threads.per.data.dir=2, that there might be a
> separate
> > > > >>> RemainingSegmentsToRecovery counter for each thread. Is that
> actually
> > > > how
> > > > >>> the data is structured within the Kafka recovery threads? Does
> each
> > > > thread
> > > > >>> get a fixed set of partitions, or is there just one big pool of
> > > > partitions
> > > > >>> that the threads all work on?
> > > > >>>
> > > > >>> As a more concrete example:
> > > > >>> * If I have 9 small partitions and 1 big partition, and
> > > > >>> num.recovery.threads.per.data.dir=2
> > > > >>> Does each thread get 5 partitions, which means one thread will
> finish
> > > > >>> much sooner than the other?
> > > > >>> OR
> > > > >>> Do both threads just work on the set of 10 partitions, which
> means
> > > > likely
> > > > >>> 1 thread will be busy with the big partition, while the other one
> > > ends
> > > > up
> > > > >>> plowing through the 9 small partitions?
> > > > >>>
> > > > >>> If each thread gets assigned 5 partitions, then it would make
> sense
> > > > that
> > > > >>> each thread has its own counter.
> > > > > >>> If the threads work on a single pool of 10 partitions, then it
> > > > > >>> would probably mean that the counter is on the pool of partitions
> > > > > >>> itself, and not on each thread.
> > > > >>>
> > > > >>> -James
> > > > >>>
> > > > >>>> On May 4, 2022, at 5:55 AM, Luke Chen <[email protected]>
> wrote:
> > > > >>>>
> > > > >>>> Hi devs,
> > > > >>>>
> > > > >>>> If there are no other comments, I'll start a vote tomorrow.
> > > > >>>>
> > > > >>>> Thank you.
> > > > >>>> Luke
> > > > >>>>
> > > > >>>> On Sun, May 1, 2022 at 5:08 PM Luke Chen <[email protected]>
> wrote:
> > > > >>>>
> > > > >>>>> Hi James,
> > > > >>>>>
> > > > >>>>> Sorry for the late reply.
> > > > >>>>>
> > > > > >>>>> Yes, this is a good point: it is useful to know how many segments
> > > > > >>>>> remain to be recovered when there are some large partitions.
> > > > > >>>>> I've updated the KIP to add a `RemainingSegmentsToRecover` metric
> > > > > >>>>> for each log recovery thread, to show the value.
> > > > >>>>> The example in the Proposed section here
> > > > >>>>> <
> > > > >>>
> > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-831%3A+Add+metric+for+log+recovery+progress#KIP831:Addmetricforlogrecoveryprogress-ProposedChanges
> > > > >>>>
> > > > >>>>> shows what it will look like.
> > > > >>>>>
> > > > >>>>> Thanks for the suggestion.
> > > > >>>>>
> > > > >>>>> Thank you.
> > > > >>>>> Luke
> > > > >>>>>
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> On Sat, Apr 23, 2022 at 8:54 AM James Cheng <
> [email protected]>
> > > > >>> wrote:
> > > > >>>>>
> > > > >>>>>> The KIP describes RemainingLogsToRecovery, which seems to be
> the
> > > > >>> number
> > > > >>>>>> of partitions in each log.dir.
> > > > >>>>>>
> > > > >>>>>> We have some partitions which are much much larger than
> others.
> > > > Those
> > > > >>>>>> large partitions have many many more segments than others.
> > > > >>>>>>
> > > > >>>>>> Is there a way the metric can reflect partition size? Could
> it be
> > > > >>>>>> RemainingSegmentsToRecover? Or even RemainingBytesToRecover?
> > > > >>>>>>
> > > > >>>>>> -James
> > > > >>>>>>
> > > > >>>>>> Sent from my iPhone
> > > > >>>>>>
> > > > >>>>>>> On Apr 20, 2022, at 2:01 AM, Luke Chen <[email protected]>
> > > wrote:
> > > > >>>>>>>
> > > > >>>>>>> Hi all,
> > > > >>>>>>>
> > > > >>>>>>> I'd like to propose a KIP to expose a metric for log recovery
> > > > >>> progress.
> > > > >>>>>>> This metric would let the admins have a way to monitor the
> log
> > > > >>> recovery
> > > > >>>>>>> progress.
> > > > >>>>>>> Details can be found here:
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>
> > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-831%3A+Add+metric+for+log+recovery+progress
> > > > >>>>>>>
> > > > >>>>>>> Any feedback is appreciated.
> > > > >>>>>>>
> > > > >>>>>>> Thank you.
> > > > >>>>>>> Luke
> > > > >>>>>>
> > > > >>>>>
> > > > >>>
> > > > >>>
> > > >
> > >
>
>
>
> --
> Best Regards,
> Raman Verma
>
