RE: Metrics Replacement

dev1 Wed, 22 Sep 2021 07:03:54 -0700

The information provided by micrometer instrumentation should be consistent 
with the values produced by Hadoop metrics.  Things like gauges and counters 
are straightforward and should match 1:1.  Things that collect / calculate 
statistics may be slightly different due to implementation details - say the 
way binning for histograms is performed.  They will still be mathematically 
correct and the values they report should still be consistent, but they might 
be "different".

An issue with metrics is that each collection system seems to have slight 
variations in the way they want things collected and reported. Micrometer 
supports various monitoring systems and a way to implement others if a 
particular system is not currently supported.  In micrometer, each registry 
provides for converting / supporting a specific monitoring system.  This 
includes things like name conversions, rate aggregation (client vs. server) and 
push vs. pull. Our current metrics were named for a specific metrics system 
and its naming convention.  Rather than trying to match our current names 
exactly, we could follow the micrometer naming convention and then rely on the 
micrometer registry conversion to match the user's chosen collection system. 
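
To make the name conversion idea concrete, here is a rough sketch in plain 
Java.  It is not Micrometer's code and does not use its API; the class, the 
methods and the metric name "accumulo.tserver.queries" are all made up for 
illustration.  It only shows how one internal dotted name could be rewritten 
into two destination styles.

```java
import java.util.Locale;

// Hypothetical illustration only; not part of Micrometer or Accumulo.
final class NamingSketch {

    // Dotted lowercase name -> snake_case (the style Prometheus-like systems use).
    static String toSnakeCase(String dotted) {
        return dotted.toLowerCase(Locale.ROOT).replace('.', '_');
    }

    // Dotted lowercase name -> camelCase.
    static String toCamelCase(String dotted) {
        String[] words = dotted.toLowerCase(Locale.ROOT).split("\\.");
        StringBuilder sb = new StringBuilder(words[0]);
        for (int i = 1; i < words.length; i++) {
            if (words[i].isEmpty()) {
                continue; // skip empty segments such as "a..b"
            }
            sb.append(Character.toUpperCase(words[i].charAt(0)))
              .append(words[i].substring(1));
        }
        return sb.toString();
    }
}
```

With this sketch, the single internal name "accumulo.tserver.queries" becomes 
"accumulo_tserver_queries" in one system and "accumuloTserverQueries" in 
another, without the instrumented code knowing which one is in use.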

Adopting and following the micrometer conventions should increase our 
compatibility with other collection systems and ease user implementations.  In 
places where this might result in a name change, I think we should prioritize 
consistency and normalizing names with conventions. That would seem to provide 
the least surprise to end users and increase their flexibility to meet their 
needs. We should also look to take advantage of tagging to allow for 
aggregation and dimensional drill down to increase utility to end users. To the 
extent that this changes a reported metric name, the increased utility and 
flexibility provided would benefit end-users.  While any name change would 
increase friction for current metric consumers, the degree of friction seems 
independent of the amount of change - any change might be disruptive.  I am not 
advocating that we should change names just to change them - rather we should 
seek to provide uniform names and consistent naming conventions across our 
codebase as the primary consideration and allow the reported names to fall out 
from there.
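
As a rough sketch of what tagging buys us, the plain Java below stores 
samples under one metric name plus tags and then sums them either across 
everything or for a single tag value.  This is not Micrometer's API; the 
class, the metric name "accumulo.scans" and the tag keys "table" and 
"tserver" are hypothetical and only illustrate aggregation vs. drill down.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of Micrometer or Accumulo.
final class TagSketch {

    // One recorded sample: metric name, its tags, and a value.
    private static final class Sample {
        final String name;
        final Map<String, String> tags;
        final double value;

        Sample(String name, Map<String, String> tags, double value) {
            this.name = name;
            this.tags = tags;
            this.value = value;
        }
    }

    private final List<Sample> samples = new ArrayList<>();

    void record(String name, Map<String, String> tags, double value) {
        samples.add(new Sample(name, tags, value));
    }

    // Sum all samples of `name` whose tags include every entry in `filter`.
    // An empty filter aggregates across all tag values.
    double sum(String name, Map<String, String> filter) {
        double total = 0;
        for (Sample s : samples) {
            if (s.name.equals(name)
                    && s.tags.entrySet().containsAll(filter.entrySet())) {
                total += s.value;
            }
        }
        return total;
    }
}
```

The point is that one name with tags can replace several differently named 
metrics: sum("accumulo.scans", Map.of()) gives the overall total, while 
sum("accumulo.scans", Map.of("table", "t1")) drills down to one table.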

The configuration of each monitoring system will depend on the system chosen by 
the user.  We should provide a select set of examples (I advocate Prometheus, 
some flavor of statsd and logging) to guide users if none of those fits 
their requirements and they elect to use a different micrometer module / 
collection system.

I agree that we should supply documentation mapping current names to their 
micrometer equivalents.  The specific name reported will depend on the 
conversions performed by the target system, but those should be documented in 
each module and are not within our scope.

-----Original Message-----
From: Keith Turner <[email protected]> 
Sent: Tuesday, September 21, 2021 5:07 PM
To: Accumulo Dev List <[email protected]>
Subject: Re: Metrics Replacement

On Tue, Sep 21, 2021 at 3:45 PM Dave Marion <[email protected]> wrote:
>
> There is a WIP pull request against 2.1.0-SNAPSHOT for replacing the 
> Hadoop
> Metrics2 framework with Micrometer[1]. Micrometer suggests using a 
> naming pattern[2] for the metrics internally where words are all 
> lowercase separated by a period. Micrometer output formats then 
> rewrite the metric names to the destination specific format. It's 
> possible that we may not be able to produce metrics in the same exact 
> way as the Hadoop Metrics2

Is it only the naming pattern that will cause incompatibility, or is it more 
than that?  Like would a timer, gauge, etc in micrometer produce different 
information/metrics than a timer, gauge, etc in hadoop metrics?  I suspect these 
would differ and that would also impact compat.  Will the way in which accumulo 
is configured to report metrics also change?  I can't imagine it would be the 
same, but I have not looked at the PR.

Can you provide an example of a naming incompat where it has to change?

> framework. Metrics are not part of the public API, but we do want to 
> try and retain as much backwards compatibility as possible. In the 
> event that we cannot get that compatibility it has been suggested that 
> we document how things are different. As I have limited knowledge of 
> how the metrics are

Is there a reasonable path to achieving compatibility?  If not, it seems like 
documenting what has changed is a good way to go.  Could possibly explain it in 
detail in the 2.1.0 release notes and have a link to that in the user manual.

> being used today, I'm looking for some feedback from the community as 
> to how painful it would be if metric names changed in a minor release.
>
> [1] https://micrometer.io/
> [2] https://micrometer.io/docs/concepts#_naming_meters
