Re: Storm Metrics - Tuples per second - End-to-End delay

Michael G. Noll Thu, 12 Mar 2015 13:11:37 -0700

Martin,

we recently open-sourced storm-graphite, which sends Storm's built-in metrics
to Graphite (and to InfluxDB, which has a Graphite-compatible API).


https://github.com/verisign/storm-graphite
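
On the aggregation question itself (tuples per second across all executors of
a bolt), here is a minimal sketch of the arithmetic a single global consumer
would have to do. It is plain Java, not Storm API and not code from
storm-graphite; the class name TupleRateAggregator and its methods are
hypothetical, and I am assuming each task reports one count per 10-second
window.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: sums per-task tuple counts for one window and converts them to a rate. */
public class TupleRateAggregator {
    // Counts reported during the current window, keyed by task id.
    private final Map<Integer, Long> countsByTask = new LinkedHashMap<>();
    private final int windowSecs;

    public TupleRateAggregator(int windowSecs) {
        if (windowSecs <= 0) {
            throw new IllegalArgumentException("windowSecs must be positive");
        }
        this.windowSecs = windowSecs;
    }

    /** Record the count one task reported for the current window. */
    public void record(int taskId, long count) {
        countsByTask.merge(taskId, count, Long::sum);
    }

    /** Return total tuples per second across all tasks, then start a new window. */
    public double flushTuplesPerSecond() {
        long total = 0;
        for (long c : countsByTask.values()) {
            total += c;
        }
        countsByTask.clear();
        return (double) total / windowSecs;
    }

    public static void main(String[] args) {
        // Assumed example: three tasks of the same bolt, one 10-second window.
        TupleRateAggregator agg = new TupleRateAggregator(10);
        agg.record(3, 1200);
        agg.record(4, 800);
        agg.record(7, 1000);
        System.out.println(agg.flushTuplesPerSecond()); // prints 300.0
    }
}
```

A metrics consumer running with a parallelism of 1 could call record() for
each data point it receives and flush once per window. The same pattern
(sum and count, then divide) would work for averaging end-to-end latencies.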

Maybe this helps,
Michael



> On 15.02.2015, at 14:47, Martin Illecker <[email protected]> wrote:
> 
> Hi Yash,
> 
> but will I have to build a custom consumer that extends the
> LoggingMetricsConsumer [1] to aggregate the metrics?
> Do you know how I can calculate the total end-to-end latency of my topology?
> (simply accumulating the completion time of each bolt?)
> 
> Could you please share your StatsDMetricsConsumer?
> 
> Thanks!
> Best regards
> Martin
> 
> [1] 
> https://github.com/apache/storm/blob/master/storm-core/src/jvm/backtype/storm/metric/LoggingMetricsConsumer.java
> 
> 2015-02-15 2:13 GMT+01:00 Yashwant Ganti <[email protected]>:
>> Okay. Then yes, a LoggingMetricsConsumer configured with a parallelism of 1 
>> should work, since it would receive all the metrics. Although, if the 
>> Topology is rebalanced, the location of this MetricsConsumer can change 
>> (different worker on the same supervisor or a different supervisor 
>> altogether). 
>> 
>> For what it's worth, we haven't observed any significant performance hit in 
>> our production topology, which has a single instance of a 
>> StatsDMetricsConsumer running. 
>> 
>> - Yash
>> 
>>> On Sat, Feb 14, 2015 at 1:51 PM, Martin Illecker <[email protected]> 
>>> wrote:
>>> Hi Yash,
>>> 
>>> I would prefer a solution within Storm only, with no external service
>>> involved, because the performance impact should be as small as possible.
>>> 
>>> I don't know if it's possible in Storm?
>>> (aggregating CountMetrics or end-to-end latencies by a single global 
>>> LoggingMetricsConsumer)
>>> 
>>> Best regards
>>> Martin
>>> 
>>> 
>>> 2015-02-14 22:05 GMT+01:00 Yashwant Ganti <[email protected]>:
>>>> Hi Martin, 
>>>> 
>>>> Do you need the metric information to be written to logs? If that is not a 
>>>> hard constraint, replacing the 'LoggingMetricsConsumer' with a component 
>>>> that sends the metric data to a metric aggregation daemon like StatsD can 
>>>> solve your issue. All you need to make sure is that every metric 
>>>> corresponding to a task is uniquely identified across the Topology and 
>>>> StatsD will take care of the aggregation for you. 
>>>> 
>>>> Regards,
>>>> Yash
>>>> 
>>>>> On Sat, Feb 14, 2015 at 4:30 AM, Martin Illecker <[email protected]> 
>>>>> wrote:
>>>>> Hello,
>>>>> 
>>>>> 1) I would like to measure and aggregate the tuples per second for a 
>>>>> bolt, which is running on multiple workers and multiple executors.
>>>>> 
>>>>> Therefore I used the CountMetric [1] together with a 
>>>>> LoggingMetricsConsumer according to [2].
>>>>> But the results were spread across multiple worker logs and their
>>>>> executors.
>>>>> How can I aggregate this data and get the average number of tuples per 
>>>>> second every 10 seconds?
>>>>> 
>>>>> 2) Furthermore, I would also like to measure the end-to-end delay of the 
>>>>> whole topology.
>>>>> Is there a better way than propagating the emit timestamp from the spout
>>>>> to the last bolt?
>>>>> And similar to 1), how can I finally aggregate the calculated end-to-end 
>>>>> delay among multiple workers and supervisors?
>>>>> 
>>>>> What would be the best solution to get these aggregated measurements of 
>>>>> tuples per second and end-to-end delay without impacting the performance?
>>>>> I would prefer one global LoggingMetricsConsumer.
>>>>> 
>>>>> Thanks!
>>>>> Best regards
>>>>> Martin
>>>>> 
>>>>> [1] 
>>>>> https://github.com/nathanmarz/storm/blob/master/storm-core/src/jvm/backtype/storm/metric/api/CountMetric.java
>>>>> [2] https://www.endgame.com/blog/storm-metrics-how-to.html
>
