> On June 7, 2016, 6:07 a.m., Sid Wagle wrote:
> > Changes look good, only thing to consider is the changes to the metric 
> > name. Cluster Aggregation will not occur at topology level since appId = 
> > topologyName for metrics with the same metric name. Is the metric name to 
> > fine grained? Only thing to consider is whether we need task metrics to be 
> > aggregated across topology? If yes, taskId cannot be part of the metric 
> > name. 
> > 
> > Could you also add Aravindan Vijayan to the reviewers? Thanks.
> 
> Sid Wagle wrote:
>     Rephrase: Cluster Aggregation will *now* occur at topology level
> 
> Jungtaek Lim wrote:
>     > Only thing to consider is whether we need task metrics to be aggregated 
> across topology? If yes, taskId cannot be part of the metric name. 
>     
>     It depends on users, but most cases I don't think user will aggregate 
> metrics across topologies.
>     
>     And like what I was saying, technically there're no way to aggregate 
> metrics on sink side since parallelism of sink can be set to more than 1 (I 
> mean, we could have multiple aggregation points which breaks aggregation.)
>     So we need to push task level metrics into AMS, and task id should be 
> included as metric name for making it unique.
>     
>     Based on that, we need `series aggregation` to aggregate task level 
> metrics by higher level. (I'm trying to address this via AMBARI-17027)
>     
>     The only thing which affects aggregation is appId.
>     
>     - When we set appId to 'component name' (current), same component (Spout, 
> Bolt) name across topologies can be queried together.
>     - When we set appId to 'topology name', same metric name (for Storm's 
> view) from different components in topology can be queried together.
>     - When we set appId to 'Storm', all metrics can be queried together. 
> (metric name should also include topology name as well)
>     
>     I'm not familiar with structure of metrics so I'm not sure how they 
> affects performance while storing / querying. So I'd like to hear opinions on 
> reviewers.
>     
>     For reference, below is how storm-graphite composes prefix of metric 
> name. It uses topology name, component name, worker host, worker port, task 
> id.
>     
> https://github.com/verisign/storm-graphite/blob/master/src/main/java/com/verisign/storm/metrics/GraphiteMetricsConsumer.java#L278
>     
>     When querying metrics from Graphite users can query with wildcards & 
> series function to aggregate metrics into one and Grafana can show that. 
> That's what I want to address to AMS.
> 
> Aravindan Vijayan wrote:
>     Jungtaek, there is also another level of classification called 
> "instancedId" Every appId can have multiple instanceIds. You should find that 
> in the TimelineMetric object. Perhaps, we can use that here.
> 
> Sriharsha Chintalapani wrote:
>     @Jungtaek what Sid was saying is we won't be able to see the aggregated 
> metrics for a single topology if we use taskId. Lets say if I want to see how 
> many tuples are processed in last hour for that topology that aggregation 
> won't be possible using taskId.
> 
> Jungtaek Lim wrote:
>     @Aravindan
>     Yeah, actually I was trying to use instanceId for the first time, but 
> TimelineMetricCache (TimelineMetricHolder) only checks metric name as unique 
> key while putting so that's what I described. Furthermore, I would want to 
> aggregate task level metrics into component level metrics but there's no 
> wildcard support on instanceId. So there seems to much effort to use 
> instanceId for taskId, and having taskId as metric name is easier way to 
> achieve aggregation for now.
>     But I'm open to use instanceId, and support wildcard & aggregation on 
> this. Please let me know we would want to use instanceId. @Sid @Sriharsha
>     
>     @Sriharsha
>     I guess what Sid was saying is opposite to what you're saying. Let's 
> pretend cluster aggregation is occurred.
>     
>     A. current
>     
>     appId is component name and metric name is just metric name so there's no 
> distinction between topologies. (topology name is not placed anywhere now) So 
> cluster aggregation will aggregate metrics across topologies, and we can't 
> query last 1 hours of Topology T1, Bolt B1, Metric M1. There're no way to 
> query within Topology T1 regardless of aggregation.
>     
>     B. after patch
>     
>     appId is topology name and metric name is component name + task id + 
> metric name so metrics are classified as topology name. So cluster 
> aggregation will aggregate metrics for each topology, and we can't query last 
> 1 hours of all topologies, Bolt B1, Metric M1.
>     There're no way to query across topologies regardless of aggregation.
>     
>     So if we want to have full flexible queries, topology name also should be 
> placed to metric name as same as component name and task id.
>     
>     @Sid Please correct me if I'm wrong.
>     
>     Metrics aggregation at query time is not possible for now but will be 
> possible with AMBARI-17027.
>     https://issues.apache.org/jira/browse/AMBARI-17027
>     
>     From what I'm understanding, AMS aggregates metrics by host level and 
> stores to other tables.
>     But there could be another kind of aggregations so having flexible 
> aggregation is great.

@Jungtaek Makes sense about aggreation of topology metrics. We don't need that 
anyway.


- Sriharsha


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/48030/#review136411
-----------------------------------------------------------


On June 9, 2016, 1:58 a.m., Jungtaek Lim wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/48030/
> -----------------------------------------------------------
> 
> (Updated June 9, 2016, 1:58 a.m.)
> 
> 
> Review request for Ambari, Aravindan Vijayan, Sriharsha Chintalapani, and Sid 
> Wagle.
> 
> 
> Bugs: AMBARI-16946
>     https://issues.apache.org/jira/browse/AMBARI-16946
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> There's a mismatch between TimelineMetricsCache and Storm metrics unit, while 
> TimelineMetricsCache considers "metric name + timestamp" to be unique but 
> Storm is not.
> 
> For example, assume that bolt B has task T1, T2 and B has registered metrics 
> M1. It's possible for metrics sink to receive (T1, M1) and (T2, M1) with same 
> timestamp TS1 (in TaskInfo, not current time), and received later will be 
> discarded from TimelineMetricsCache.
> 
> If we want to have unique metric point of Storm, we should use "topology name 
> + component name + task id + metric name" to metric name so that "metric name 
> + timestamp" will be unique.
> 
> There're other issues I would like to address, too.
> 
> - Currently, hostname is written to hostname of the machine which runs 
> metrics sink. Since TaskInfo has hostname of the machine which runs task, 
> we're better to use this.
> - Unit of timestamp of TaskInfo is second, while Storm Metrics Sink uses this 
> as millisecond, resulting in timestamp flaw, and malfunction of cache 
> eviction. It should be multiplied by 1000.
> - 'component name' is not unique across the cluster, so it's not fit for app 
> id. 'topology name' is unique so proper value of app id is topology name.
> 
> Consideration: Hostname for determining 'write shard' is set to hostname of 
> the machine which runs metrics sink. Since I don't know read also be sharded, 
> I'm not sure it's safe to use TaskInfo's hostname to hostname of 
> TimelineMetric. Please review carefully regarding this.
> 
> 
> Diffs
> -----
> 
>   
> ambari-metrics/ambari-metrics-storm-sink/src/main/java/org/apache/hadoop/metrics2/sink/storm/StormTimelineMetricsSink.java
>  02f5598 
>   
> ambari-metrics/ambari-metrics-storm-sink/src/test/java/org/apache/hadoop/metrics2/sink/storm/StormTimelineMetricsSinkTest.java
>  8171a4d 
> 
> Diff: https://reviews.apache.org/r/48030/diff/
> 
> 
> Testing
> -------
> 
> I tested this only ambari-metrics module since changeset is not related on 
> other modules.
> 
> [INFO] 
> ------------------------------------------------------------------------
> [INFO] Reactor Summary:
> [INFO]
> [INFO] ambari-metrics ..................................... SUCCESS [  0.896 
> s]
> [INFO] Ambari Metrics Common .............................. SUCCESS [ 13.386 
> s]
> [INFO] Ambari Metrics Hadoop Sink ......................... SUCCESS [  5.085 
> s]
> [INFO] Ambari Metrics Flume Sink .......................... SUCCESS [  6.827 
> s]
> [INFO] Ambari Metrics Kafka Sink .......................... SUCCESS [  4.190 
> s]
> [INFO] Ambari Metrics Storm Sink .......................... SUCCESS [  1.384 
> s]
> [INFO] Ambari Metrics Collector ........................... SUCCESS [04:06 
> min]
> [INFO] Ambari Metrics Monitor ............................. SUCCESS [  3.556 
> s]
> [INFO] Ambari Metrics Grafana ............................. SUCCESS [01:03 
> min]
> [INFO] Ambari Metrics Assembly ............................ SUCCESS [  3.567 
> s]
> [INFO] 
> ------------------------------------------------------------------------
> [INFO] BUILD SUCCESS
> [INFO] 
> ------------------------------------------------------------------------
> [INFO] Total time: 05:48 min
> [INFO] Finished at: 2016-05-30T16:46:07+09:00
> [INFO] Final Memory: 78M/1038M
> [INFO] 
> ------------------------------------------------------------------------
> 
> In fact, I tried to run `mvn test` from ambari root directory but build is 
> failing from ambari-web.
> 
> ```
> > fsevents@0.2.1 install 
> > /Users/jlim/WorkArea/JavaProjects/ambari/ambari-web/node_modules/chokidar/node_modules/fsevents
> > node-gyp rebuild
> ...
> npm WARN install:fsevents fsevents@0.2.1 install: `node-gyp rebuild`
> npm WARN install:fsevents Exit status 1
> ```
> 
> No luck on `npm install`, too.
> 
> 
> Thanks,
> 
> Jungtaek Lim
> 
>

Reply via email to