Okay. Then yes, a LoggingMetricsConsumer configured with a parallelism of 1 should work, since that single instance would receive all the metrics. Note, however, that if the topology is rebalanced, the location of this MetricsConsumer can change (a different worker on the same supervisor, or a different supervisor altogether).
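For reference, registering the consumer with a parallelism hint of 1 is done on the topology's Config before submission. A minimal sketch, using the pre-0.10 `backtype.storm` package names that match the CountMetric link later in this thread:

```java
import backtype.storm.Config;
import backtype.storm.metric.LoggingMetricsConsumer;

Config conf = new Config();
// Parallelism hint of 1: a single consumer task receives the datapoints
// from every task in the topology, so it sees all metrics in one place.
conf.registerMetricsConsumer(LoggingMetricsConsumer.class, 1);
```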
For what it's worth, we haven't observed any significant performance hit in our production topology, which runs a single instance of a StatsDMetricsConsumer.

- Yash

On Sat, Feb 14, 2015 at 1:51 PM, Martin Illecker <[email protected]> wrote:
> Hi Yash,
>
> I would prefer a solution within Storm only, with no external service
> involved, because the performance impact should be as small as possible.
>
> Is this possible in Storm?
> (aggregating CountMetrics or end-to-end latencies with a single global
> LoggingMetricsConsumer)
>
> Best regards
> Martin
>
>
> 2015-02-14 22:05 GMT+01:00 Yashwant Ganti <[email protected]>:
>
>> Hi Martin,
>>
>> Do you need the metric information written to logs? If that is not a
>> hard constraint, replacing the LoggingMetricsConsumer with a component
>> that sends the metric data to a metrics aggregation daemon like StatsD
>> can solve your issue. You just need to make sure that every metric
>> corresponding to a task is uniquely identified across the topology, and
>> StatsD will take care of the aggregation for you.
>>
>> Regards,
>> Yash
>>
>> On Sat, Feb 14, 2015 at 4:30 AM, Martin Illecker <[email protected]>
>> wrote:
>>
>>> Hello,
>>>
>>> 1) I would like to measure and aggregate the tuples per second for a
>>> bolt that runs on multiple workers and multiple executors.
>>>
>>> To do so, I used the CountMetric [1] together with a
>>> LoggingMetricsConsumer, as described in [2], but the results were
>>> spread across multiple worker logs, one per executor. How can I
>>> aggregate this data and get the average number of tuples per second
>>> every 10 seconds?
>>>
>>> 2) I would also like to measure the end-to-end latency of the whole
>>> topology. Is there a better way than propagating the emit time from
>>> the spout to the last bolt? And, similar to 1), how can I aggregate
>>> the calculated end-to-end latency across multiple workers and
>>> supervisors?
>>>
>>> What would be the best way to get these aggregated measurements of
>>> tuples per second and end-to-end latency without impacting
>>> performance? I would prefer a single global LoggingMetricsConsumer.
>>>
>>> Thanks!
>>> Best regards
>>> Martin
>>>
>>> [1] https://github.com/nathanmarz/storm/blob/master/storm-core/src/jvm/backtype/storm/metric/api/CountMetric.java
>>> [2] https://www.endgame.com/blog/storm-metrics-how-to.html
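To illustrate question 1: once a single metrics consumer receives every task's CountMetric bucket, aggregating to topology-wide tuples per second is plain arithmetic over the latest per-task counts. The class below is a hypothetical helper, not part of Storm; it only sketches what a custom `IMetricsConsumer.handleDataPoints` implementation could feed into.

```java
import java.util.HashMap;
import java.util.Map;

class ThroughputAggregator {
    // Length in seconds of each metrics bucket (the thread asks for 10 s).
    private final int bucketSeconds;
    // Latest count reported by each task; overwritten every metrics tick.
    private final Map<String, Long> countsByTask = new HashMap<>();

    ThroughputAggregator(int bucketSeconds) {
        this.bucketSeconds = bucketSeconds;
    }

    // Would be called once per task per tick from the metrics consumer.
    void record(String taskId, long count) {
        countsByTask.put(taskId, count);
    }

    // Topology-wide average tuples per second over the last bucket.
    double tuplesPerSecond() {
        long total = 0;
        for (long c : countsByTask.values()) {
            total += c;
        }
        return (double) total / bucketSeconds;
    }
}
```

With a 10-second bucket and two bolt tasks reporting 500 and 700 tuples, this yields 120 tuples/second for the bolt as a whole.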

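For question 2, propagating the spout's emit time as a tuple field and computing `now - emitTime` in the last bolt is essentially a running mean, which is what Storm's `ReducedMetric` with a `MeanReducer` computes per task. A minimal standalone sketch of that arithmetic (a hypothetical helper, with clocks passed in explicitly so the math is visible):

```java
// Tracks end-to-end latency samples in the last bolt of the topology.
// emitTimeMs would arrive as a field in the tuple, set by the spout.
class LatencyTracker {
    private long totalLatencyMs = 0;
    private long samples = 0;

    // Record one tuple's latency: wall-clock now minus the spout's emit time.
    void observe(long emitTimeMs, long nowMs) {
        totalLatencyMs += nowMs - emitTimeMs;
        samples++;
    }

    // Mean end-to-end latency in milliseconds over all observed tuples.
    double meanLatencyMs() {
        return samples == 0 ? 0.0 : (double) totalLatencyMs / samples;
    }
}
```

Note that this relies on the spout and bolt clocks being comparable, which holds within one worker but needs NTP-synchronized hosts when the spout and the last bolt run on different supervisors.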