Vadim Chekan created SAMZA-926:
----------------------------------

             Summary: Use tags in metrics
                 Key: SAMZA-926
                 URL: https://issues.apache.org/jira/browse/SAMZA-926
             Project: Samza
          Issue Type: Improvement
          Components: metrics
    Affects Versions: 0.10.0
            Reporter: Vadim Chekan
            Assignee: Vadim Chekan
            Priority: Minor


We use Grafana/InfluxDB for metrics, and Samza's current metric model does not 
fit it particularly well.
InfluxDB recently introduced so-called "tags", and the Grafana UI offers great 
value when using them. The idea is to keep the metric name very simple, for 
example cpu.use, and supply the measure with tags, for example 
{datacenter: vegas, environment: staging, machine: vm-003, application: myApp}.
From what I can read, OpenTSDB uses tags too.
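
To make the idea concrete, here is a minimal sketch of how a short metric name 
plus tags could be written in InfluxDB's line protocol. The function names, the 
field key "value", and the sample numbers are assumptions for illustration 
only; nothing here is existing Samza code.

```python
def escape_tag(text: str) -> str:
    """Backslash-escape commas, equals signs and spaces in tag keys/values."""
    for ch in (",", "=", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def escape_measurement(name: str) -> str:
    """Backslash-escape commas and spaces in the measurement name."""
    for ch in (",", " "):
        name = name.replace(ch, "\\" + ch)
    return name


def to_line_protocol(name: str, tags: dict, value: float, timestamp_ns: int) -> str:
    """Render one point as 'name,tag=val,... value=<float> <timestamp>'.

    Tags are sorted by key so the output is deterministic.
    The field key 'value' is an assumption of this sketch.
    """
    tag_part = "".join(
        f",{escape_tag(k)}={escape_tag(v)}" for k, v in sorted(tags.items())
    )
    return f"{escape_measurement(name)}{tag_part} value={float(value)} {timestamp_ns}"


if __name__ == "__main__":
    tags = {
        "datacenter": "vegas",
        "environment": "staging",
        "machine": "vm-003",
        "application": "myApp",
    }
    print(to_line_protocol("cpu.use", tags, 0.42, 1465839830100400200))
```

The metric name stays "cpu.use" for every machine and application; only the 
tag set differs between points.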

Having tags instead of long metric names is much more convenient, and in some 
cases it is the only way to perform certain operations. For example, I want to 
have an alert on the throughput of a Samza job. With tags encoded in the metric 
name this is impossible, because I would have to list all machine names and 
Samza job names in the InfluxDB select statement, and even then there is no 
way to group them properly. With tags it is as simple as SELECT ... GROUP BY 
[[tag_host]],[[tag_samza_job_name]]. You can add new machines to the cluster 
and jobs to YARN, and they will appear in your metrics with zero configuration 
effort.

Currently, I have partially mitigated the issue by stripping the first 
dot-separated part of the metric name and making it a "samza-src" tag, on the 
assumption that it is the container name. But in many metrics the partition 
number is encoded in the metric name too. Its location is not consistent and 
not all metrics have it, so I cannot build an alerting system on top of Samza 
metrics.
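
The workaround described above amounts to roughly the following sketch. The 
function name and the sample metric names are hypothetical; only the 
"samza-src" tag key comes from the description.

```python
def split_source_tag(metric_name: str):
    """Move the first dot-separated part of a metric name into a tag.

    Returns (remaining_name, tags). The first part is assumed to be the
    container name; names without a dot are returned unchanged.
    """
    source, sep, rest = metric_name.partition(".")
    if not sep:
        return metric_name, {}
    return rest, {"samza-src": source}


if __name__ == "__main__":
    # Hypothetical metric names, for illustration only.
    print(split_source_tag("samza-container-0.process-calls"))
    # A partition number embedded later in the name is NOT extracted:
    print(split_source_tag("samza-container-0.input-topic-3-messages-read"))
```

The second call shows the limitation: the partition number stays buried in the 
remaining name, and without a consistent position it cannot be pulled out 
reliably.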

Proposal:
Change Samza's internal metrics to use tags (string key-value pairs) and leave 
the job of constructing the metric name to the output metric plugin.
This would preserve backward compatibility: the JMX reporter would construct 
the metric name the same way it does today, while the InfluxDB plugin would 
leave the name unmodified and attach the list of tags to the measure.
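
A rough, language-neutral sketch of the proposal, written in Python for 
brevity. TaggedMetric, jmx_style_name, influx_style_point, the tag keys and 
the dict layout are all hypothetical; they are not Samza's actual classes or 
naming scheme, and only illustrate how each reporter could decide for itself 
what to do with the tags.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class TaggedMetric:
    """A metric with a short name and string key-value tags (hypothetical)."""
    name: str
    value: float
    tags: Dict[str, str] = field(default_factory=dict)


def jmx_style_name(metric: TaggedMetric, tag_order: List[str]) -> str:
    """Flatten tags into a dotted name, as a name-only reporter might.

    tag_order fixes which tags are used and in what order, so a reporter
    could reproduce a legacy naming layout. Missing tags are skipped.
    """
    parts = [metric.tags[key] for key in tag_order if key in metric.tags]
    return ".".join(parts + [metric.name])


def influx_style_point(metric: TaggedMetric) -> dict:
    """Keep the name untouched and pass the tags through as-is."""
    return {
        "measurement": metric.name,
        "tags": dict(metric.tags),
        "fields": {"value": metric.value},
    }


if __name__ == "__main__":
    metric = TaggedMetric(
        name="messages-read",
        value=10.0,
        tags={"container": "samza-container-0", "partition": "3"},
    )
    print(jmx_style_name(metric, ["container", "partition"]))
    print(influx_style_point(metric))
```

The same metric object feeds both outputs, so adding a tag-aware reporter 
would not need to change how a name-based reporter names its metrics.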



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
