The information provided by the Micrometer instrumentation should be consistent with the values produced by Hadoop metrics. Gauges and counters are straightforward and should match 1:1. Meters that collect or calculate statistics may differ slightly due to implementation details, such as the way histogram binning is performed. They will still be mathematically correct and the values they report should still be consistent, but they might not be identical.
An issue with metrics is that each collection system seems to have slight variations in the way it wants things collected and reported. Micrometer supports various monitoring systems and offers a way to implement others if a particular system is not currently supported. In Micrometer, each registry handles the conversion for a specific monitoring system. This includes things like name conversions, rate aggregation (client vs. server) and push vs. pull.

Our current metrics were named for a specific metrics system and naming convention. Rather than trying to match our current names exactly, we could follow the Micrometer naming convention and then rely on the Micrometer registry conversion to match the user's chosen collection system. Adopting and following the Micrometer conventions should increase our compatibility with other collection systems and ease user implementations. In places where this might result in a name change, I think we should prioritize consistency and normalizing names with conventions. That would seem to provide the least surprise to end users and increase their flexibility to meet their needs.

We should also look to take advantage of tagging to allow for aggregation and dimensional drill-down. To the extent that this changes a reported metric name, the increased utility and flexibility would benefit end users. While any name change would increase friction for current metric consumers, the degree of friction seems independent of the amount of change - any change might be disruptive. I am not advocating that we change names just to change them. Rather, we should seek to provide uniform names and consistent naming conventions across our codebase as the primary consideration and allow the reported names to fall out from there.

The configuration of each monitoring system will depend on the system chosen by the user.
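To make the name-conversion point concrete, here is a rough sketch in plain Java of the kind of renaming a registry performs on a lowercase, dot-separated internal name. This is only an illustration and not Micrometer's implementation: the class, the method names and the metric name are all hypothetical, and real registries do more than this (for example, adding unit or type suffixes).

```java
// Illustrative only: mimics the kind of renaming a registry might apply.
// Not Micrometer code; class, methods and metric name are hypothetical.
class MeterNameSketch {

    // Dot-separated lowercase name -> snake_case.
    static String toSnakeCase(String name) {
        return name.replace('.', '_');
    }

    // Dot-separated lowercase name -> camelCase.
    static String toCamelCase(String name) {
        String[] words = name.split("\\.");
        StringBuilder sb = new StringBuilder(words[0]);
        for (int i = 1; i < words.length; i++) {
            if (words[i].isEmpty()) {
                continue; // skip empty segments such as in "a..b"
            }
            sb.append(Character.toUpperCase(words[i].charAt(0)))
              .append(words[i].substring(1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical internal name following the lowercase, dot-separated pattern.
        String internal = "tserver.scan.queue.length";
        System.out.println(toSnakeCase(internal));
        System.out.println(toCamelCase(internal));
    }
}
```

The point is that we would pick one internal name and let the target-specific conversion decide what the user's collection system actually sees.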
We should provide a select set of examples (I advocate Prometheus, some flavor of statsd, and logging) that can also guide users when none of those fits their requirements and they elect to use a different Micrometer module / collection system.

I agree that we should supply documentation mapping current names to their Micrometer equivalents. The specific name reported will depend on the conversions performed by the target system, but those conversions should be documented in each module and are not within our scope.

-----Original Message-----
From: Keith Turner <ke...@deenlo.com>
Sent: Tuesday, September 21, 2021 5:07 PM
To: Accumulo Dev List <dev@accumulo.apache.org>
Subject: Re: Metrics Replacement

On Tue, Sep 21, 2021 at 3:45 PM Dave Marion <dmario...@gmail.com> wrote:
>
> There is a WIP pull request against 2.1.0-SNAPSHOT for replacing the
> Hadoop Metrics2 framework with Micrometer[1]. Micrometer suggests
> using a naming pattern[2] for the metrics internally where words are
> all lowercase separated by a period. Micrometer output formats then
> rewrite the metric names to the destination specific format. It's
> possible that we may not be able to produce metrics in the same exact
> way as the Hadoop Metrics2

Is it only the naming pattern that will cause incompatibility, or is it more than that? Like would a timer, gauge, etc. in Micrometer produce different information/metrics than a timer, gauge, etc. in Hadoop metrics? I suspect these would differ and that would also impact compat. Will the way in which Accumulo is configured to report metrics also change? I can't imagine it would be the same, but I have not looked at the PR.

Can you provide an example of a naming incompat where it has to change?

> framework. Metrics are not part of the public API, but we do want to
> try and retain as much backwards compatibility as possible. In the
> event that we cannot get that compatibility it has been suggested that
> we document how things are different. As I have limited knowledge of
> how the metrics are

Is there a reasonable path to achieving compatibility? If not, it seems like documenting what has changed is a good way to go. Could possibly explain it in detail in the 2.1.0 release notes and have a link to that in the user manual.

> being used today, I'm looking for some feedback from the community as
> to how painful it would be if metric names changed in a minor release.
>
> [1] https://micrometer.io/
> [2] https://micrometer.io/docs/concepts#_naming_meters