Hi Yash,

so I would have to build a custom consumer that extends the
LoggingMetricsConsumer [1] to aggregate the metrics?

Do you know how I can calculate the total end-to-end latency of my
topology? (Simply by accumulating the completion time of each bolt?)
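A minimal sketch of the timestamp-propagation approach discussed later in this thread, without any Storm dependencies (the class and method names here are hypothetical, not Storm API): stamp each tuple with the emit time at the spout and subtract it at the last bolt. Simply accumulating per-bolt completion times would miss the queueing and transfer time between bolts, so it understates the true end-to-end latency.

```java
public class LatencySketch {
    // Spout side: attach the emit time as an extra tuple field.
    // A tuple is modeled here as a long[] {payload, emitMillis}.
    static long[] emitWithTimestamp(long payload, long nowMillis) {
        return new long[] { payload, nowMillis };
    }

    // Last-bolt side: end-to-end latency is the receive time minus the
    // propagated emit time. This includes queueing and transfer time,
    // which a sum of per-bolt execute latencies would not.
    static long endToEndLatency(long[] tuple, long nowMillis) {
        return nowMillis - tuple[1];
    }

    public static void main(String[] args) {
        long[] tuple = emitWithTimestamp(42L, 1000L);
        System.out.println(endToEndLatency(tuple, 1250L)); // 250
    }
}
```

One caveat with this approach in a real cluster: the spout and the last bolt may run on different machines, so clock skew between workers directly affects the measured latency.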
Could you please share your StatsDMetricsConsumer?

Thanks!
Best regards
Martin

[1] https://github.com/apache/storm/blob/master/storm-core/src/jvm/backtype/storm/metric/LoggingMetricsConsumer.java

2015-02-15 2:13 GMT+01:00 Yashwant Ganti <[email protected]>:

> Okay. Then yes, a LoggingMetricsConsumer configured with a parallelism of
> 1 should work, since it would receive all the metrics. However, if the
> topology is rebalanced, the location of this MetricsConsumer can change
> (a different worker on the same supervisor, or a different supervisor
> altogether).
>
> For what it's worth, we haven't observed any significant performance hit
> in our production topology, which has a single instance of a
> StatsDMetricsConsumer running.
>
> - Yash
>
> On Sat, Feb 14, 2015 at 1:51 PM, Martin Illecker <[email protected]>
> wrote:
>
>> Hi Yash,
>>
>> I would prefer a solution within Storm only, so that no external
>> service is involved, because the performance impact should be as small
>> as possible.
>>
>> Is this possible in Storm?
>> (aggregating CountMetrics or end-to-end latencies with a single global
>> LoggingMetricsConsumer)
>>
>> Best regards
>> Martin
>>
>>
>> 2015-02-14 22:05 GMT+01:00 Yashwant Ganti <[email protected]>:
>>
>>> Hi Martin,
>>>
>>> Do you need the metric information to be written to logs? If that is
>>> not a hard constraint, replacing the LoggingMetricsConsumer with a
>>> component that sends the metric data to a metric aggregation daemon
>>> like StatsD can solve your issue. All you need to make sure is that
>>> every metric corresponding to a task is uniquely identified across the
>>> topology, and StatsD will take care of the aggregation for you.
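The unique-identification scheme described above can be sketched as follows. The key layout (`topology.component.taskId.metric`) is an assumption for illustration only; it is not a Storm or StatsD requirement, just one common dotted-name convention that StatsD-style aggregators can group on.

```java
public class MetricKeySketch {
    // Build a metric name that is unique across the whole topology, so a
    // downstream aggregator can tell the tasks apart (or roll them up by
    // stripping the taskId segment). The layout is illustrative.
    static String metricKey(String topology, String component,
                            int taskId, String metric) {
        return topology + "." + component + "." + taskId + "." + metric;
    }

    public static void main(String[] args) {
        System.out.println(
            metricKey("my-topology", "count-bolt", 7, "tuple-count"));
        // my-topology.count-bolt.7.tuple-count
    }
}
```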
>>>
>>> Regards,
>>> Yash
>>>
>>> On Sat, Feb 14, 2015 at 4:30 AM, Martin Illecker <[email protected]>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> 1) I would like to measure and aggregate the tuples per second for a
>>>> bolt that runs on multiple workers and multiple executors.
>>>>
>>>> To do so, I used the CountMetric [1] together with a
>>>> LoggingMetricsConsumer, as described in [2]. But the results were
>>>> spread across multiple worker logs and their executors. How can I
>>>> aggregate this data and get the average number of tuples per second
>>>> every 10 seconds?
>>>>
>>>> 2) Furthermore, I would also like to measure the end-to-end delay of
>>>> the whole topology. Is there a better way than propagating the emit
>>>> time from the spout to the last bolt? And, similar to 1), how can I
>>>> aggregate the calculated end-to-end delays across multiple workers
>>>> and supervisors?
>>>>
>>>> What would be the best way to get these aggregated measurements of
>>>> tuples per second and end-to-end delay without impacting performance?
>>>> I would prefer one global LoggingMetricsConsumer.
>>>>
>>>> Thanks!
>>>> Best regards
>>>> Martin
>>>>
>>>> [1]
>>>> https://github.com/nathanmarz/storm/blob/master/storm-core/src/jvm/backtype/storm/metric/api/CountMetric.java
>>>> [2] https://www.endgame.com/blog/storm-metrics-how-to.html
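The aggregation asked for in 1) can be sketched without Storm dependencies: collect the per-task counts that a single metrics consumer would receive during one reporting window, then divide the total by the window length. The class and method names are illustrative, not Storm API; in a real consumer the datapoints would arrive via `handleDataPoints` and the window would match the metrics bucket interval.

```java
import java.util.HashMap;
import java.util.Map;

public class ThroughputSketch {
    // CountMetric-style datapoints, keyed by the task that reported them.
    private final Map<Integer, Long> countsByTask = new HashMap<>();

    // Record one datapoint; repeated reports from the same task are summed.
    void record(int taskId, long count) {
        countsByTask.merge(taskId, count, Long::sum);
    }

    // Aggregate throughput across all tasks for one reporting window.
    double tuplesPerSecond(long windowSeconds) {
        long total = countsByTask.values().stream()
                                 .mapToLong(Long::longValue).sum();
        return (double) total / windowSeconds;
    }

    public static void main(String[] args) {
        ThroughputSketch agg = new ThroughputSketch();
        agg.record(1, 1200L); // datapoint from task 1
        agg.record(2, 1800L); // datapoint from task 2
        System.out.println(agg.tuplesPerSecond(10)); // 300.0
    }
}
```

Note that this only works if one consumer instance really does see all the datapoints, which is exactly why a single-parallelism metrics consumer was suggested earlier in the thread.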
