Jungtaek Lim created AMBARI-16946:
-------------------------------------
Summary: Storm Metrics Sink has high chance to discard some
datapoints
Key: AMBARI-16946
URL: https://issues.apache.org/jira/browse/AMBARI-16946
Project: Ambari
Issue Type: Bug
Components: ambari-metrics
Reporter: Jungtaek Lim
TimelineMetricsCache and the Storm metrics sink disagree on what identifies a
data point: TimelineMetricsCache treats "metric name + timestamp" as unique,
but Storm does not guarantee that.
For example, assume that bolt B has tasks T1 and T2, and that B has registered
metric M1. The metrics sink can receive (T1, M1) and (T2, M1) with the same
timestamp TS1 (the timestamp in TaskInfo, not the current time), in which case
TimelineMetricsCache discards whichever one arrives later.
To make each Storm metric point unique, we should build the metric name from
"topology name + component name + task id + metric name", so that "metric name +
timestamp" becomes unique.
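
The collision and the proposed fix can be sketched as follows. This is a minimal illustration, not the actual Ambari or Storm code: the class FirstWinsCache is a simplified stand-in for a cache keyed on "metric name + timestamp", and the "." separator, the method qualifiedName, and the topology name "topo" are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; names here are illustrative, not Ambari's API.
final class QualifiedMetricNameSketch {

    // Joins the four parts proposed in this issue. The "." separator is an assumption.
    static String qualifiedName(String topology, String component, int taskId, String metric) {
        return topology + "." + component + "." + taskId + "." + metric;
    }

    // Simplified stand-in for a cache keyed on "metric name + timestamp":
    // the first value stored for a key is kept, later ones are dropped.
    static final class FirstWinsCache {
        private final Map<String, Double> points = new HashMap<>();

        // Returns true if the point was stored, false if it was discarded.
        boolean put(String metricName, long timestamp, double value) {
            return points.putIfAbsent(metricName + "@" + timestamp, value) == null;
        }

        int size() {
            return points.size();
        }
    }

    public static void main(String[] args) {
        long ts1 = 1465000000L; // same TaskInfo timestamp for both tasks

        // Plain metric name: the point from the second task is discarded.
        FirstWinsCache plain = new FirstWinsCache();
        plain.put("M1", ts1, 1.0);
        boolean secondKept = plain.put("M1", ts1, 2.0);
        System.out.println("plain name, second point kept: " + secondKept);

        // Qualified metric name: both points get distinct keys.
        FirstWinsCache qualified = new FirstWinsCache();
        qualified.put(qualifiedName("topo", "B", 1, "M1"), ts1, 1.0);
        secondKept = qualified.put(qualifiedName("topo", "B", 2, "M1"), ts1, 2.0);
        System.out.println("qualified name, second point kept: " + secondKept);
    }
}
```

With the task id in the name, two tasks of the same bolt can no longer produce the same cache key for the same timestamp.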
There are other issues I would like to address, too.
- Currently, the hostname is set to that of the machine running the metrics
sink. Since TaskInfo carries the hostname of the machine running the task, we
should use that instead.
- The timestamp in TaskInfo is in seconds, but Storm Metrics Sink treats it as
milliseconds, which produces wrong timestamps and breaks cache eviction. It
should be multiplied by 1000.
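
The two fixes above might look roughly like this. It is only a sketch: the helper names toMillis and reportingHost are hypothetical, the host names are made up, and falling back to the sink's hostname when the task's hostname is missing is an assumption, not something this issue specifies.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helpers; not the actual Storm Metrics Sink code.
final class TaskInfoConversionSketch {

    // TaskInfo timestamps are in seconds; the timeline expects milliseconds.
    static long toMillis(long taskInfoTimestampSeconds) {
        return TimeUnit.SECONDS.toMillis(taskInfoTimestampSeconds);
    }

    // Prefer the hostname of the machine running the task.
    // Assumption: fall back to the sink's own hostname if none is available.
    static String reportingHost(String taskHost, String sinkHost) {
        return (taskHost == null || taskHost.isEmpty()) ? sinkHost : taskHost;
    }

    public static void main(String[] args) {
        System.out.println(toMillis(1465000000L));
        System.out.println(reportingHost("worker-1", "sink-host"));
    }
}
```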