Thanks, Bobby, for the prompt reply and the info. The part about the UI metrics coming from worker heartbeats and being stored in ZooKeeper is interesting. I have a question here, and since I am not very familiar with Clojure I have trouble reading the storm-core code. When a worker heartbeats, I assume this triggers a dispatch of some information to ZooKeeper. If the metrics are part of that information, are they drawn from the same stream as the one from which the LoggingMetricsConsumer receives them (the __Metrics stream, I believe)?
You mention that Storm does not guarantee delivery of the other metrics. Is this because of the way a MetricsConsumer is implemented? I believe every MetricsConsumer is tied to a MetricsConsumerBolt, which receives the metric DataPoints (from the __Metrics stream?). Is that where the DataPoints can get lost?

Thanks,
Yash

On Wed, Nov 26, 2014 at 2:12 PM, Bobby Evans <[email protected]> wrote:

> There are actually two separate metrics systems in storm. Something that
> would be nice to rectify in the future. All of the metrics that appear in
> the UI come from worker heartbeats and are stored in zookeeper. The other
> metrics are periodically polled by storm and are sent through the topology
> to Metrics Consumers like the logging metrics consumer. Some of the
> metrics are similar, but because of the difference in the collection
> period, and how/when metrics can be lost you are not likely to get
> exactly the same answer when comparing the two. They should be close, but
> not exactly the same.
>
> For the metrics on the UI, when a worker dies and is relaunched all of the
> metrics associated with it on the UI are lost/overwritten by the new
> instance. For the other metrics storm does not guarantee delivery of them
> so it is possible that they are just lost.
>
> If you look at
>
> https://github.com/apache/storm/blob/master/storm-core/src/clj/backtype/storm/ui/core.clj
>
> you can see how the metrics are calculated. Specifically look for the
> functions that start with aggregate.
>
> - Bobby
>
>
> On Wednesday, November 26, 2014 3:44 PM, Yashwant Ganti <
> [email protected]> wrote:
>
>
> Hello All,
>
> I am working on extracting the metrics using the Metrics API and had a
> question about how the Nimbus UI displays the 'TopologyLevel' metrics for a
> component. By 'TopologyLevel', I mean the values which are summed across
> all tasks for the component. By clicking on the component in the Nimbus UI,
> one also gets to see the metrics-per-task for that component.
>
> I was wondering if the aggregated metrics per component are calculated and
> displayed at the UI layer or if there is any way of extracting them from the
> Metrics API itself.
>
> Also, I'll briefly describe my approach for computing these aggregated
> metrics and would appreciate any feedback, especially if there is anything
> inherent in this approach that would cause the aggregate metrics to be
> calculated incorrectly -
>
> 1. Register a Metrics consumer like the 'LoggingMetricsConsumer' -
>    https://github.com/apache/storm/blob/master/storm-core/src/jvm/backtype/storm/metric/LoggingMetricsConsumer.java.
>    Have multiple tasks running for this consumer
> 2. For every data point sent to this consumer, using the corresponding
>    'taskInfo' object, create wrapper metrics for 'TopologyLevel' and
>    'Component/Task Level'
> 3. Send these wrapper metrics to an aggregation framework that can
>    aggregate based on the Metric Key/ Name. The expectation here is that the
>    'TopologyLevel' metrics that would be aggregated based on the Metric key
>    should be the same as the ones displayed on the Nimbus UI.
>
> Any feedback/pointers are much appreciated.
>
> Thanks,
> Yash
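
For reference, the aggregation step described in points 2 and 3 of the original message could be sketched roughly as below. This is only a minimal illustration, not Storm code: ComponentMetricAggregator and its methods are hypothetical names, and it assumes each data point has already been reduced to a single numeric value tagged with the component id and task id taken from the 'taskInfo' object.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical helper (not part of Storm): sums numeric data points per
 * (component, metric name) across all tasks of that component.
 */
final class ComponentMetricAggregator {
    // componentId -> (metricName -> running sum across all reporting tasks)
    private final Map<String, Map<String, Double>> totals = new HashMap<>();

    /** Record one data point reported by one task of a component. */
    void add(String componentId, int taskId, String metricName, double value) {
        // taskId is deliberately ignored here: the component-level value is
        // the sum over every task that reported this metric.
        totals.computeIfAbsent(componentId, k -> new HashMap<>())
              .merge(metricName, value, Double::sum);
    }

    /** Component-level total, or 0.0 if nothing has been reported yet. */
    double total(String componentId, String metricName) {
        return totals.getOrDefault(componentId, Map.of())
                     .getOrDefault(metricName, 0.0);
    }
}
```

Since the consumer runs as multiple tasks, each instance would only see part of the data, so sums like these would still have to be combined downstream. As Bobby notes, the result should be close to the UI numbers but is not expected to match them exactly.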
