Github user revans2 commented on the issue:
https://github.com/apache/storm/pull/2203
@ptgoetz for me this is all about the metadata for the metric.
The code we have been working on for storing metrics in a DB that we can
then use for scheduling is getting close to being done. We have a data
structure like the following (note: still a work in progress):
```
struct WorkerMetric {
  1: required string metricName;
  2: required i64 timestamp;
  3: required double metricValue;
  4: required string componentId;
  5: required string executorId;
  6: required string streamId;
}
struct WorkerMetricList {
  1: list<WorkerMetric> metrics;
}
struct WorkerMetrics {
  1: required string topologyId;
  2: required i32 port;
  3: required string hostname;
  4: required WorkerMetricList metricList;
}
```
Having metadata separated out for topology, host, port, etc. allows us to
easily ask questions like "how many messages were emitted by all instances of
bolt "B" on streamId "default" for the topology "T"?" I assume people using
Ganglia or Nagios want to do the same thing without having to set up a
separate metric for every possible bolt/spout/hostname/port combination.
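
To make that kind of question concrete, here is a minimal sketch of the
aggregation once the metadata is available as separate fields. The
`TaggedMetric` and `TaggedMetricQuery` classes and the "emitted" metric name
are hypothetical, made up for illustration; they flatten the Thrift structs
above and are not the actual storage code.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical flattened view of one WorkerMetric together with the
// topologyId of its enclosing WorkerMetrics. Illustration only.
final class TaggedMetric {
    final String topologyId;
    final String componentId;
    final String streamId;
    final String metricName;
    final double value;

    TaggedMetric(String topologyId, String componentId, String streamId,
                 String metricName, double value) {
        this.topologyId = topologyId;
        this.componentId = componentId;
        this.streamId = streamId;
        this.metricName = metricName;
        this.value = value;
    }
}

final class TaggedMetricQuery {
    // Sum one metric over every executor, host and port that matches the tags.
    static double sum(List<TaggedMetric> metrics, String topologyId,
                      String componentId, String streamId, String metricName) {
        return metrics.stream()
                .filter(m -> m.topologyId.equals(topologyId))
                .filter(m -> m.componentId.equals(componentId))
                .filter(m -> m.streamId.equals(streamId))
                .filter(m -> m.metricName.equals(metricName))
                .mapToDouble(m -> m.value)
                .sum();
    }

    public static void main(String[] args) {
        List<TaggedMetric> metrics = Arrays.asList(
                new TaggedMetric("T", "B", "default", "emitted", 10.0),
                new TaggedMetric("T", "B", "default", "emitted", 5.0),
                new TaggedMetric("T", "B", "other", "emitted", 7.0),
                new TaggedMetric("T", "A", "default", "emitted", 3.0));
        // Only the first two rows match bolt "B" on stream "default".
        System.out.println(sum(metrics, "T", "B", "default", "emitted")); // prints 15.0
    }
}
```
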
Almost every metrics system I have seen supports these kinds of metadata or
tags, but almost none of the metrics APIs I have seen support this (dropwizard
included). As such we need a way to parse these values out of the metric name
in a reliable way. Making the name more configurable does not help me with
this. In fact, it makes it much more difficult.
The current proposed format is:
```
String.format("storm.worker.%s.%s.%s.%s-%s", stormId, hostName,
    componentId, workerPort, name);
```
It has most of the information we need. It is missing the stream id and
executor id, which we really need for some internal metrics if we are to
replace the metrics on the UI and reduce the load on ZooKeeper.
But the issue for me is: how do I get the metadata back out in a reliable
way? The topology id is simple to parse out because no '.' character is
allowed in it. hostName is going to have '.' in it, so I don't know what to
do there. componentId and streamId are, I think, totally open, so again I am
in trouble no matter what I do. For the name itself, we could impose some
restrictions to make it simpler.
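
A small sketch of the ambiguity, using the proposed format string as-is. The
`MetricNameAmbiguity` class and the sample ids, host name and port are made up
for illustration. Two different (hostName, componentId) pairs produce the same
metric name, so no parser can tell them apart afterwards.

```java
// Illustration only: the class and the sample values are hypothetical.
final class MetricNameAmbiguity {
    // Same format string as the one currently proposed.
    static String format(String stormId, String hostName, String componentId,
                         int workerPort, String name) {
        return String.format("storm.worker.%s.%s.%s.%s-%s", stormId, hostName,
                componentId, workerPort, name);
    }

    public static void main(String[] args) {
        // The host name contains dots and the component id is "bolt" ...
        String a = format("topo-1", "node1.example.com", "bolt", 6700, "emitted");
        // ... versus a shorter host name and a component id that contains a dot.
        String b = format("topo-1", "node1.example", "com.bolt", 6700, "emitted");
        System.out.println(a);           // storm.worker.topo-1.node1.example.com.bolt.6700-emitted
        System.out.println(a.equals(b)); // true
    }
}
```
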
All I want is some API that can take a metric name and reliably return to
me the fields that we put into creating the name.
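
One possible shape for such an API, as a rough sketch under stated assumptions
and not something this PR provides. `MetricNameCodec`, its field order, and
the percent-style escaping are all hypothetical. The idea is to escape the
separator inside each field before joining, so that splitting on '.' is
unambiguous. Whether '%' is acceptable to a given reporter or metrics backend
is not checked here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical codec, not part of Storm or of this PR.
// Assumed field order: topologyId, hostname, port, componentId,
// executorId, streamId, metricName.
final class MetricNameCodec {
    private static final String PREFIX = "storm.worker.";
    private static final int FIELD_COUNT = 7;

    // Escape '%' first so the escapes introduced for '.' are not escaped again.
    static String escape(String field) {
        return field.replace("%", "%25").replace(".", "%2E");
    }

    // Undo the escaping in the opposite order.
    static String unescape(String field) {
        return field.replace("%2E", ".").replace("%25", "%");
    }

    static String encode(List<String> fields) {
        if (fields.size() != FIELD_COUNT) {
            throw new IllegalArgumentException("expected " + FIELD_COUNT + " fields");
        }
        return PREFIX + fields.stream()
                .map(MetricNameCodec::escape)
                .collect(Collectors.joining("."));
    }

    static List<String> decode(String metricName) {
        if (!metricName.startsWith(PREFIX)) {
            throw new IllegalArgumentException("not a worker metric: " + metricName);
        }
        // Limit -1 keeps empty trailing fields instead of dropping them.
        String[] parts = metricName.substring(PREFIX.length()).split("\\.", -1);
        if (parts.length != FIELD_COUNT) {
            throw new IllegalArgumentException("malformed metric name: " + metricName);
        }
        return Arrays.stream(parts)
                .map(MetricNameCodec::unescape)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> fields = Arrays.asList("topo-1", "node1.example.com", "6700",
                "com.bolt", "[2-2]", "default", "emitted");
        String name = encode(fields);
        System.out.println(name);
        System.out.println(decode(name).equals(fields)); // true
    }
}
```

The trade-off is that the encoded name is less readable in tools that only
show the raw string, since dots inside host names and component ids appear as
`%2E`.
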
---