Re: [DISCUSS] KIP-143: Controller Health Metrics

Ismael Juma Wed, 10 May 2017 18:12:44 -0700

Thanks Jun. Discussed this offline with Onur and Jun and I believe there's
agreement so updated the KIP:


https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?
pageId=69407758&selectedPageVersions=8&selectedPageVersions=7

Ismael

On Wed, May 10, 2017 at 4:46 PM, Jun Rao <[email protected]> wrote:

> Hi, Onur,
>
> We probably don't want to do the 1-to-1 mapping from the event type to the
> controller state since some of the event types are implementation details.
> How about the following mapping?
>
> 0 - idle
> 1 - controller change (Startup, ControllerChange, Reelect)
> 2 - broker change (BrokerChange)
> 3 - topic creation/change (TopicChange, PartitionModifications)
> 4 - topic deletion (TopicDeletion, TopicDeletionStopReplicaResult)
> 5 - partition reassigning (PartitionReassignment,
> PartitionReassignmentIsrChange)
> 6 - auto leader balancing (AutoPreferredReplicaLeaderElection)
> 7 - manual leader balancing (PreferredReplicaLeaderElection)
> 8 - controlled shutdown (ControlledShutdown)
> 9 - isr change (IsrChangeNotification)
>
> For each state, we will add a corresponding timer to track the rate and the
> latency, if it's not there already (e.g., broker change and controlled
> shutdown). If there are future changes to the controller, we can make a
> call whether the new event should be mapped to one of the existing states
> or a new state.
>
> Thanks,
>
> Jun
>
> On Tue, May 9, 2017 at 6:17 PM, Onur Karaman <[email protected]
> >
> wrote:
>
> > @Ismael, Jun
> > After bringing up an earlier point twice now, it still doesn't feel like
> > it's been commented on/addressed, so I'm going to give it one more shot:
> > Assuming that ControllerState should reflect the current event being
> > processed, the KIP is missing states.
> >
> > The controller currently has 14 event types:
> > BrokerChange
> > TopicChange
> > PartitionModifications
> > TopicDeletion
> > PartitionReassignment
> > PartitionReassignmentIsrChange
> > IsrChangeNotification
> > PreferredReplicaLeaderElection
> > AutoPreferredReplicaLeaderElection
> > ControlledShutdown
> > TopicDeletionStopReplicaResult
> > Startup
> > ControllerChange
> > Reelect
> >
> > The KIP only shows 10 event types (and it's not a proper subset of the
> > above set).
> >
> > I think this mismatch would cause the ControllerState to incorrectly be
> in
> > the Idle state when in fact the controller could be doing a lot of work.
> >
> > 1. Should ControllerState exactly consist of the 14 controller event
> types
> > + the 1 Idle state?
> > 2. If so, what's the process for adding/removing/merging event types
> > w.r.t. this metric?
> >
> > On Tue, May 9, 2017 at 4:45 PM, Ismael Juma <[email protected]> wrote:
> >
> >> Becket,
> >>
> >> Are you OK with extending the metrics via a subsequent KIP (assuming
> that
> >> what we're doing here doesn't prevent that)? The KIP freeze is tomorrow
> >> (although I will give it an extra day or two as many in the community
> have
> >> been attending the Kafka Summit this week), so we should avoid
> increasing
> >> the scope unless it's important for future improvements.
> >>
> >> Thanks,
> >> Ismael
> >>
> >> On Wed, May 10, 2017 at 12:09 AM, Jun Rao <[email protected]> wrote:
> >>
> >> > Hi, Becket,
> >> >
> >> > q10. The reason why there is not a timer metric for broker change
> event
> >> is
> >> > that the controller currently always has a LeaderElectionRateAndTimeMs
> >> > timer metric (in ControllerStats).
> >> >
> >> > q11. I agree that that it's useful to know the queue time in the
> >> > controller event queue and suggested that earlier. Onur thinks that
> >> it's a
> >> > bit too early to add that since we are about to change how to queue
> >> events
> >> > from ZK. Similarly, we will probably also make changes to batch
> requests
> >> > from the controller to the broker. So, perhaps we can add more metrics
> >> once
> >> > those changes in the controller have been made. For now, knowing the
> >> > controller state and the controller channel queue size is probably
> good
> >> > enough.
> >> >
> >> > Thanks,
> >> >
> >> > Jun
> >> >
> >> >
> >> >
> >> > On Mon, May 8, 2017 at 10:05 PM, Becket Qin <[email protected]>
> >> wrote:
> >> >
> >> >> @Ismael,
> >> >>
> >> >> About the stage and event type. Yes, I think each event handling
> should
> >> >> have those stages covered. It is similar to what we are doing for the
> >> >> requests on the broker side. We have benefited from such systematic
> >> metric
> >> >> structure a lot so I think it would be worth following the same way
> in
> >> the
> >> >> controller.
> >> >>
> >> >> As an example, for BrokerChangeEvent (or any event), I am thinking we
> >> >> would
> >> >> have the following metrics:
> >> >> 1. EventRate
> >> >> 2. EventQueueTime : The event queue time
> >> >> 3. EventHandlingTime: The event handling time (including zk path
> >> updates)
> >> >> 4. EventControllerChannelQueueTime: The queue time of the
> >> corresponding
> >> >> LeaderAndIsrRequest and UpdateMetadataRequest in the controller
> channel
> >> >> queue.
> >> >> 5. EventControllerChannelSendTime: The time to send the
> corresponding
> >> >> requests, could be the total time for the corresponding
> >> >> LeaderAndIsrRequest
> >> >> and UpdateMetadataRequest. It is typically small, but in some cases
> >> could
> >> >> be slower than we expect.
> >> >> 6. EventAckTime: The time waiting for all the corresponding
> responses.
> >> >> 7. EventHandlingTotalTime: sum of 2-6
> >> >>
> >> >> Note that all the metrics are from the event and cluster wide state
> >> >> transition point of view, but not from a single request type point of
> >> >> view.
> >> >> Among the above metrics, 4,5,6 could potentially be per broker.
> >> >>
> >> >> Thanks,
> >> >>
> >> >> Jiangjie (Becket) Qin
> >> >>
> >> >> On Mon, May 8, 2017 at 7:24 PM, Onur Karaman <
> >> >> [email protected]>
> >> >> wrote:
> >> >>
> >> >> > I had a similar comment to Becket but accidentally posted it on the
> >> vote
> >> >> > thread last Friday. From that thread:
> >> >> > "I noticed that both the ControllerState metric and the
> >> *RateAndTimeMs
> >> >> > metrics
> >> >> > only cover a subset of the controller event types. Was this
> >> >> intentional?"
> >> >> >
> >> >> > I think it makes most sense to just have the ControllerState metric
> >> and
> >> >> > *RateAndTimeMs metrics exactly match up with the event types.
> >> >> >
> >> >> > On Mon, May 8, 2017 at 4:35 PM, Ismael Juma <[email protected]>
> >> wrote:
> >> >> >
> >> >> > > Hi Becket,
> >> >> > >
> >> >> > > Thanks for the feedback. Comments inline.
> >> >> > >
> >> >> > > On Tue, May 9, 2017 at 12:19 AM, Becket Qin <
> [email protected]>
> >> >> > wrote:
> >> >> > >>
> >> >> > >> q10. With event loop based controller design, it seems natrural
> to
> >> >> have
> >> >> > >> the
> >> >> > >> processing time for each controller event type. In that case,
> the
> >> >> > current
> >> >> > >> metrics seem not covering all the event processing time? e.g.
> >> (broker
> >> >> > >> joining and broker failure event handling time).
> >> >> > >>
> >> >> > >
> >> >> > > I'll leave it to Jun to explain why some metrics were not
> included
> >> in
> >> >> the
> >> >> > > proposal.
> >> >> > >
> >> >> > > q11. There are actually a couple of stages for controller state
> >> >> > transition
> >> >> > >> and propagation. More specifically:
> >> >> > >> Stage 1: Event queue time
> >> >> > >> Stage 2: Event process time (including all the zk path updates)
> >> >> > >> Stage 3: State propagation time (including the state propagation
> >> >> queuing
> >> >> > >> time on the controller and the actual request sent to response
> >> >> receive
> >> >> > >> time)
> >> >> > >>
> >> >> > >> I think it worth to have metrics for each of the stage. Stage 3
> >> might
> >> >> > be a
> >> >> > >> little tricky because we may need to potentially manage the per
> >> >> broker
> >> >> > >> propagation time. Arguably the state propagation time is a
> little
> >> >> > >> overlapping with the broker side request handling time, but for
> >> some
> >> >> > state
> >> >> > >> change that involves multiple types of requests, it could still
> be
> >> >> > useful
> >> >> > >> to know what is the time for a state transition to be propagated
> >> to
> >> >> the
> >> >> > >> entire cluster.
> >> >> > >>
> >> >> > >
> >> >> > > Can you please elaborate on how you would like to see this
> >> measured.
> >> >> Do
> >> >> > > you mean that each event would have a separate metric for each of
> >> >> these
> >> >> > > stages? Maybe a worked out example would make it clear.
> >> >> > >
> >> >> > > q11. Regarding (13), controller actually do not update the ISR
> but
> >> >> just
> >> >> > >> read it. The error message seems from the brokers. It usually
> >> >> happens in
> >> >> > >> the following case:
> >> >> > >> 1. Current leader broker has the cached ZNode version V
> >> >> > >> 2. The controller elected a new leader and updated the ZNode,
> now
> >> >> ZNode
> >> >> > >> version becomes V+1,
> >> >> > >> 3. Controller sends the LeaderAndIsrRequest to the replica
> >> brokers to
> >> >> > >> propagate the new leader information as well as the new
> zkVersion.
> >> >> > >> 4. Before the old leader process the LeaderAndIsrRequest from
> >> >> controller
> >> >> > >> in
> >> >> > >> step 3, The old leader tries to update ISR and found the cached
> >> >> > zkVersion
> >> >> > >> V
> >> >> > >> is different from the actual zkVersion V+1.
> >> >> > >>
> >> >> > >> It looks that this is not a controller metric.
> >> >> > >
> >> >> > >
> >> >> > > Yes, it's not a controller metric, that's why it's under the
> >> >> "Partition
> >> >> > > Metrics" section. In the PR, I actually implemented it in
> >> >> ReplicaManager
> >> >> > > alongside IsrExpandsPerSec and IsrShrinksPerSec.
> >> >> > >
> >> >> > > Thanks,
> >> >> > > Ismael
> >> >> > >
> >> >> >
> >> >>
> >> >
> >> >
> >>
> >
> >
>

Re: [DISCUSS] KIP-143: Controller Health Metrics

Reply via email to