Github user revans2 commented on the issue:

https://github.com/apache/storm/pull/2203

@ptgoetz for me this is all about the metadata for the metric.

The code we have been working on for storing metrics in a DB that we can then use for scheduling is getting close to being done. We have a data structure like the following. (NOTE: still a work in progress)

```
struct WorkerMetric {
  1: required string metricName;
  2: required i64 timestamp;
  3: required double metricValue;
  4: required string componentId;
  5: required string executorId;
  6: required string streamId;
}

struct WorkerMetricList {
  1: list<WorkerMetric> metrics;
}

struct WorkerMetrics {
  1: required string topologyId;
  2: required i32 port;
  3: required string hostname;
  4: required WorkerMetricList metricList;
}
```

Having the metadata separated out for topology, host, port, etc. allows us to easily ask questions like: how many messages were emitted by all instances of bolt "B" on streamId "default" for the topology "T"? I assume people using ganglia or nagios want to do the same thing without having to set up a separate metric for each possible bolt/spout/hostname/port combination. Almost every metrics system I have seen supports this kind of metadata or tags, but almost none of the metrics APIs I have seen support it (dropwizard included). As such, we need a way to parse these values out of the metric name reliably. Making the name more configurable does not help me with this. In fact, it makes it much more difficult.

The current proposed format is:

```
String.format("storm.worker.%s.%s.%s.%s-%s", stormId, hostName, componentId, workerPort, name);
```

It has most of the information we need. It is missing the stream id and executor id, which we really need for some internal metrics if we are to replace the metrics on the UI and reduce the load on zookeeper. But the issue for me is: how do I get the metadata back out in a reliable way?

The topology id is simple to parse out because no '.' character is allowed in it. hostName is going to have '.' in it, so I don't know what to do there. componentId and streamId are, I think, totally open, so again I am in trouble no matter what I do. The name itself we could put some restrictions on to make it simpler.

All I want is some API that can take a metric name and reliably return the fields that we put into creating the name.
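
To illustrate the kind of API being asked for, here is a minimal sketch, not code from the pull request. It assumes each field is escaped before it is placed into the proposed format, so that a '.' inside a hostname or component id can never be mistaken for a separator. The class `MetricNameCodec`, its methods, and the `%2E`/`%25` escaping scheme are all hypothetical; the sketch also assumes the worker port is a plain integer, so the first '-' in the final segment separates the port from the name.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds a metric name and parses it back into its fields.
final class MetricNameCodec {

    // Escape '%' first so unescaping stays unambiguous, then the '.' separator.
    static String escape(String field) {
        return field.replace("%", "%25").replace(".", "%2E");
    }

    // Reverse of escape(): restore '.' first, then '%'.
    static String unescape(String field) {
        return field.replace("%2E", ".").replace("%25", "%");
    }

    // Same layout as the proposed format, but with every string field escaped.
    static String encode(String stormId, String hostName, String componentId,
                         int workerPort, String name) {
        return String.format("storm.worker.%s.%s.%s.%s-%s",
                escape(stormId), escape(hostName), escape(componentId),
                Integer.toString(workerPort), escape(name));
    }

    // Returns [stormId, hostName, componentId, workerPort, name].
    static List<String> decode(String metricName) {
        // After escaping, the only '.' characters left are separators.
        String[] parts = metricName.split("\\.", -1);
        if (parts.length != 6 || !parts[0].equals("storm") || !parts[1].equals("worker")) {
            throw new IllegalArgumentException("Not a worker metric name: " + metricName);
        }
        // The port contains only digits, so the first '-' ends it.
        int dash = parts[5].indexOf('-');
        if (dash < 0) {
            throw new IllegalArgumentException("Missing port/name separator: " + metricName);
        }
        return Arrays.asList(
                unescape(parts[2]),
                unescape(parts[3]),
                unescape(parts[4]),
                parts[5].substring(0, dash),
                unescape(parts[5].substring(dash + 1)));
    }
}
```

With this scheme, `decode(encode(...))` returns the original fields even when the hostname, component id, or metric name contains '.' characters. The cost is that the names shown in ganglia or similar tools carry the escape sequences, which is one reason passing the fields as separate tags may still be preferable where the backend supports it.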
---