Hi Martin,

For the 'end-to-end latency': have you looked at the 'complete-spout-latency' metric that Storm provides as part of its built-in metrics? Will that not serve your purpose? If I am not wrong, in addition to the individual bolt completion times it also includes serialization and transmission times for a tuple, so it should give you a correct estimate of how much time a tuple takes to flow through the entire system.
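[Editor's note: the following is a rough, pure-Java illustration of what a complete-latency style measurement captures, not Storm's actual implementation. The class name and methods are hypothetical: record the time when the spout emits a tuple with a message id, and subtract it when the whole tuple tree is acked.]

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a complete-latency style measurement: the emit
// time is recorded per message id, and the latency is the elapsed time
// from emit until the entire tuple tree is acked. This is an
// illustration only, not Storm's internal code.
public class CompleteLatencyTracker {
    private final Map<Object, Long> emitTimesMs = new HashMap<>();

    // Called when the spout emits a tuple with a message id.
    public void onEmit(Object msgId, long nowMs) {
        emitTimesMs.put(msgId, nowMs);
    }

    // Called when the tuple tree rooted at msgId is fully acked;
    // returns the end-to-end latency in milliseconds.
    public long onAck(Object msgId, long nowMs) {
        Long emitted = emitTimesMs.remove(msgId);
        return emitted == null ? -1 : nowMs - emitted;
    }

    public static void main(String[] args) {
        CompleteLatencyTracker tracker = new CompleteLatencyTracker();
        tracker.onEmit("msg-1", 1000L);
        long latencyMs = tracker.onAck("msg-1", 1250L);
        System.out.println(latencyMs);
    }
}
```

Note that Storm only tracks this for tuples emitted with a message id and acked through the whole topology.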
The StatsDMetricsConsumer I refer to is similar to the one here:
https://github.com/endgameinc/storm-metrics-statsd/blob/master/src/main/java/com/endgame/storm/metrics/statsd/StatsdMetricConsumer.java

It reads from the built-in 'Metrics' stream, which receives tuples once you register the metrics consumer, sanitizes the metric names, and sends them across to StatsD for aggregation. If the built-in metrics don't fit your purpose, you can register custom metrics and have them sent to this stream as well.

Regards,
Yash

On Sun, Feb 15, 2015 at 5:47 AM, Martin Illecker <[email protected]> wrote:

> Hi Yash,
>
> But I will have to build a custom consumer, which extends the
> LoggingMetricsConsumer [1], to aggregate the metrics?
> Do you know how I can calculate the total end-to-end latency of my
> topology?
> (Simply accumulating the completion time of each bolt?)
>
> Please can you share your StatsDMetricsConsumer?
>
> Thanks!
> Best regards
> Martin
>
> [1]
> https://github.com/apache/storm/blob/master/storm-core/src/jvm/backtype/storm/metric/LoggingMetricsConsumer.java
>
> 2015-02-15 2:13 GMT+01:00 Yashwant Ganti <[email protected]>:
>
>> Okay. Then yes, a LoggingMetricsConsumer configured with a parallelism of
>> 1 should work, since it would receive all the metrics. Although, if the
>> topology is rebalanced, the location of this metrics consumer can change
>> (a different worker on the same supervisor, or a different supervisor
>> altogether).
>>
>> For what it's worth, we haven't observed any significant performance hit
>> in our production topology, which has a single instance of a
>> StatsDMetricsConsumer running.
>>
>> - Yash
>>
>> On Sat, Feb 14, 2015 at 1:51 PM, Martin Illecker <[email protected]>
>> wrote:
>>
>>> Hi Yash,
>>>
>>> I would prefer to have a solution within Storm only, so that there is no
>>> external service involved,
>>> because the impact on performance should be as small as possible.
>>>
>>> I don't know if it's possible in Storm?
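[Editor's note: a hedged sketch of the kind of StatsD key such a consumer builds so that every task's metric is uniquely identified across the topology, as Yash describes. The exact naming scheme and sanitization in the linked StatsdMetricConsumer may differ; this is an illustration, not a copy of its code.]

```java
// Sketch of building a unique, StatsD-safe metric key per task:
// host, port, component id, task id, and metric name joined with dots.
// Hypothetical helper; check the linked StatsdMetricConsumer for the
// scheme it actually uses.
public class StatsdKeyBuilder {
    // Replace characters StatsD/Graphite treat specially ('.' is the
    // path separator; ':' and '|' are protocol delimiters).
    static String sanitize(String s) {
        return s.replaceAll("[^A-Za-z0-9_-]", "_");
    }

    static String key(String host, int port, String component, int taskId, String metric) {
        return String.join(".",
                sanitize(host), Integer.toString(port),
                sanitize(component), Integer.toString(taskId),
                sanitize(metric));
    }

    public static void main(String[] args) {
        // e.g. worker host "node-1.example.com", port 6700, bolt "split", task 3
        System.out.println(key("node-1.example.com", 6700, "split", 3, "tuple count"));
    }
}
```

With keys like this, two tasks of the same bolt report under distinct paths, and StatsD/Graphite can aggregate across them server-side.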
>>> (aggregating CountMetrics or end-to-end latencies with a single global
>>> LoggingMetricsConsumer)
>>>
>>> Best regards
>>> Martin
>>>
>>>
>>> 2015-02-14 22:05 GMT+01:00 Yashwant Ganti <[email protected]>:
>>>
>>>> Hi Martin,
>>>>
>>>> Do you need the metric information to be written to logs? If that is
>>>> not a hard constraint, replacing the 'LoggingMetricsConsumer' with a
>>>> component that sends the metric data to a metric aggregation daemon like
>>>> StatsD can solve your issue. All you need to ensure is that every metric
>>>> corresponding to a task is uniquely identified across the topology, and
>>>> StatsD will take care of the aggregation for you.
>>>>
>>>> Regards,
>>>> Yash
>>>>
>>>> On Sat, Feb 14, 2015 at 4:30 AM, Martin Illecker <[email protected]>
>>>> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> 1) I would like to measure and aggregate the tuples per second for a
>>>>> bolt that is running on multiple workers and multiple executors.
>>>>>
>>>>> To do this, I used the CountMetric [1] together with a
>>>>> LoggingMetricsConsumer, as described in [2].
>>>>> But the results were spread across multiple worker logs and executors.
>>>>> How can I aggregate this data and get the average number of tuples per
>>>>> second every 10 seconds?
>>>>>
>>>>> 2) Furthermore, I would also like to measure the end-to-end delay of
>>>>> the whole topology.
>>>>> Is there a better way than propagating the emit time from the
>>>>> spout to the last bolt?
>>>>> And, similar to 1), how can I aggregate the calculated
>>>>> end-to-end delay across multiple workers and supervisors?
>>>>>
>>>>> What would be the best solution to get these aggregated measurements
>>>>> of tuples per second and end-to-end delay without impacting
>>>>> performance?
>>>>> I would prefer one global LoggingMetricsConsumer.
>>>>>
>>>>> Thanks!
>>>>> Best regards
>>>>> Martin
>>>>>
>>>>> [1]
>>>>> https://github.com/nathanmarz/storm/blob/master/storm-core/src/jvm/backtype/storm/metric/api/CountMetric.java
>>>>> [2] https://www.endgame.com/blog/storm-metrics-how-to.html
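[Editor's note: for question 1) above, once all data points reach a single consumer, the aggregation itself is simple: sum the per-task counts for the reporting window and divide by the window length. A pure-Java sketch with hypothetical names and values; in a real metrics consumer the counts would arrive from the tasks via the metrics stream.]

```java
import java.util.Arrays;
import java.util.List;

// Sketch of aggregating per-task CountMetric values into an average
// tuples-per-second figure for one reporting window. The counts here are
// hypothetical; a real consumer would collect them from the metrics
// stream, one data point per task.
public class ThroughputAggregator {
    static double tuplesPerSecond(List<Long> perTaskCounts, int windowSeconds) {
        long total = perTaskCounts.stream().mapToLong(Long::longValue).sum();
        return (double) total / windowSeconds;
    }

    public static void main(String[] args) {
        // Counts reported by four executors over one 10-second window.
        List<Long> counts = Arrays.asList(1200L, 950L, 1100L, 750L);
        System.out.println(tuplesPerSecond(counts, 10));
    }
}
```

The 10-second window matches the bucket interval typically passed to `registerMetric`, so each batch of data points covers one window.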
