Adar Dembo has posted comments on this change.

Change subject: Add tablet state summary metrics and fix KUDU-2044
......................................................................

Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7618/2//COMMIT_MSG
Commit Message:

PS2, Line 11: The numbers are computed by the heartbeater. There's two reasons
: for this:
: 1. the heartbeater was already computing the number of RUNNING
: and BOOTSTRAPPING tablets by holding the appropriate lock and
: iterating over the tablet map
: 2. the alternative to having some thread periodically iterate
: over the tablet map is to increment and decrement the metrics
: when tablets transition states. This is error-prone,
: particularly if new states are added, and mistakes will
: accumulate until the metric is worse than useless.
> I could use a function gauge, but a really fast metric poll could contend w

I'm not that worried about missing some places where calls to TransitionFromFooToBar() should go. What I don't like, though, is the additional plumbing that would require. It'd mean a backpointer to the TsTabletManager (with some hack for the master, where there is no TsTabletManager), or a more abstract "notify tablet state change observers" pattern.

How about this for a middle ground:
1. TsTabletManager maintains a set of counters, one for each state, and a timestamp recording when the state metrics were last calculated. The timestamp could have a dedicated spinlock, or reuse an existing one if appropriate.
2. There are N function gauges, one for each state.
3. When a gauge is invoked, the timestamp is compared with the current time. If the delta exceeds some threshold (statically defined or configurable via gflag), the tablet map is walked and all of the counters are updated. Otherwise, the current counter values are returned as-is.


PS2, Line 23: A metric entity can now be
: marked as hidden, so it will not print to /metrics, and
: tablet metric entities are marked as hidden when the tablet is
: tombstoned, and un-hidden if and when the tablet is revived.
> They take up some memory as part of the TabletReplica that remains in the t

Let's solicit another opinion first. Todd understands the metrics subsystem better than I do, and Mike can comment on the tombstoning/vivification lifecycle.


--
To view, visit http://gerrit.cloudera.org:8080/7618
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c82987ffe4f37e8af95972bde97841e44c521d9
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Reviewer: Will Berkeley <[email protected]>
Gerrit-HasComments: Yes
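
For reference, the middle ground proposed in the Line 11 comment could look roughly like the sketch below. It is only an illustration under stated assumptions: CachedStateCounts, the TabletState enum, and the walk callback are hypothetical stand-ins rather than Kudu's actual classes, std::mutex stands in for the spinlock, and the refresh threshold is a constructor argument instead of a gflag.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical stand-in for the tablet states; not Kudu's actual enum.
enum class TabletState { kBootstrapping = 0, kRunning, kStopped, kNumStates };

constexpr std::size_t kStateCount =
    static_cast<std::size_t>(TabletState::kNumStates);
using StateCounts = std::array<int64_t, kStateCount>;

// Caches per-state tablet counts and recomputes them only when stale.
class CachedStateCounts {
 public:
  using Clock = std::chrono::steady_clock;

  // 'walk' stands in for iterating over the tablet map (under its lock) and
  // tallying tablets by state. 'max_age' is the staleness threshold.
  CachedStateCounts(std::function<StateCounts()> walk, Clock::duration max_age)
      : walk_(std::move(walk)), max_age_(max_age) {
    assert(max_age_ >= Clock::duration::zero());
  }

  // What each per-state function gauge would call. 'now' is a parameter so
  // that callers (and tests) can supply their own notion of the current time.
  int64_t Get(TabletState state, Clock::time_point now = Clock::now()) {
    std::lock_guard<std::mutex> l(lock_);
    if (!valid_ || now - last_refresh_ > max_age_) {
      // Stale or never computed: walk the map once and refresh every counter.
      counts_ = walk_();
      last_refresh_ = now;
      valid_ = true;
    }
    // Fresh enough: return the cached value as-is.
    return counts_[static_cast<std::size_t>(state)];
  }

 private:
  std::function<StateCounts()> walk_;
  const Clock::duration max_age_;
  std::mutex lock_;  // Protects everything below.
  bool valid_ = false;
  Clock::time_point last_refresh_;
  StateCounts counts_{};
};
```

With this shape, N gauges polled back to back trigger at most one walk of the tablet map per threshold interval, so a fast metrics poller cannot contend on the tablet map lock more often than that. The cost is that reported counts may lag by up to the threshold.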
