Okay. Then yes, a LoggingMetricsConsumer configured with a parallelism of 1
should work, since that single instance would receive all the metrics. Keep
in mind, though, that if the topology is rebalanced, the location of this
metrics consumer can change (a different worker on the same supervisor, or a
different supervisor altogether).
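
As a rough illustration of what a single consumer instance could do once it
has collected the per-task counts for one reporting window, here is a minimal
sketch in plain Java with no Storm dependency. The class name, method names
and sample values are hypothetical; it only shows the arithmetic of turning
per-task counts into an average tuples-per-second figure.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical helper (not part of Storm): sums the per-task counts that a
 * single metrics consumer received for one reporting window and converts
 * the total into an average number of tuples per second.
 */
class TupleRateAggregator {

    /** Sum of all per-task counts reported for one window. */
    static long totalCount(Map<Integer, Long> countsByTaskId) {
        long total = 0;
        for (long count : countsByTaskId.values()) {
            total += count;
        }
        return total;
    }

    /** Average tuples per second over a window of the given length. */
    static double tuplesPerSecond(Map<Integer, Long> countsByTaskId, int windowSecs) {
        if (windowSecs <= 0) {
            throw new IllegalArgumentException("windowSecs must be positive");
        }
        return (double) totalCount(countsByTaskId) / windowSecs;
    }

    public static void main(String[] args) {
        // Made-up sample values: three tasks of the same bolt, one 10-second window.
        Map<Integer, Long> counts = new LinkedHashMap<>();
        counts.put(4, 1200L);
        counts.put(5, 800L);
        counts.put(6, 1000L);
        // 3000 tuples in 10 seconds
        System.out.println(tuplesPerSecond(counts, 10)); // prints 300.0
    }
}
```

The window length passed in is assumed to match the bucket size the metric
was registered with (10 seconds in Martin's case).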

For what it's worth, we haven't observed any significant performance hit in
our production topology, which has a single instance of a
StatsDMetricsConsumer running.
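
For the StatsD route quoted further down, the one requirement is that each
task's metric gets a name that is unique across the topology. A minimal
sketch of one possible naming scheme follows. The name layout and the class
and method names are assumptions made for illustration, not a description of
any existing StatsDMetricsConsumer; only the `name:value|c` counter line is
the standard StatsD format.

```java
/**
 * Hypothetical helper (not part of Storm or of any StatsD client): builds a
 * metric name that is unique per task and formats it as a StatsD counter line.
 */
class StatsdLineFormatter {

    /** Replaces every character other than letters, digits, '_' and '-' with '_'. */
    static String sanitize(String part) {
        return part.replaceAll("[^A-Za-z0-9_-]", "_");
    }

    /** Assumed layout: topology.component.taskId.metric */
    static String metricName(String topology, String component, int taskId, String metric) {
        return sanitize(topology) + "." + sanitize(component) + "." + taskId + "." + sanitize(metric);
    }

    /** Formats a StatsD counter line, e.g. "some.name:42|c". */
    static String counterLine(String name, long value) {
        return name + ":" + value + "|c";
    }
}
```

Including the task id in the name is what keeps two executors of the same
bolt from overwriting each other's values.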

- Yash

On Sat, Feb 14, 2015 at 1:51 PM, Martin Illecker <[email protected]>
wrote:

> Hi Yash,
>
> I would prefer a solution within Storm only, with no external service
> involved, because the performance impact should be as small as possible.
>
> I don't know if it's possible in Storm?
> (aggregating CountMetrics or end-to-end latencies by a single global
> LoggingMetricsConsumer)
>
> Best regards
> Martin
>
>
> 2015-02-14 22:05 GMT+01:00 Yashwant Ganti <[email protected]>:
>
>> Hi Martin,
>>
>> Do you need the metric information to be written to logs? If that is not
>> a hard constraint, replacing the 'LoggingMetricsConsumer' with a component
>> that sends the metric data to a metric aggregation daemon like StatsD can
>> solve your issue. All you need to make sure is that every metric
>> corresponding to a task is uniquely identified across the Topology and
>> StatsD will take care of the aggregation for you.
>>
>> Regards,
>> Yash
>>
>> On Sat, Feb 14, 2015 at 4:30 AM, Martin Illecker <[email protected]>
>> wrote:
>>
>>> Hello,
>>>
>>> 1) I would like to measure and aggregate the tuples per second for a
>>> bolt, which is running on multiple workers and multiple executors.
>>>
>>> Therefore I used the CountMetric [1] together with a
>>> LoggingMetricsConsumer according to [2].
>>> But the results were spread across multiple worker logs and executors.
>>> How can I aggregate this data and get the average number of tuples per
>>> second every 10 seconds?
>>>
>>> 2) Furthermore, I would also like to measure the end-to-end delay of the
>>> whole topology.
>>> Is there a better way than propagating the emitting time from the spout
>>> to the last bolt?
>>> And similar to 1), how can I finally aggregate the calculated end-to-end
>>> delay among multiple workers and supervisors?
>>>
>>> What would be the best solution to get these aggregated measurements of
>>> tuples per second and end-to-end delay without impacting the performance?
>>> I would prefer one global LoggingMetricsConsumer.
>>>
>>> Thanks!
>>> Best regards
>>> Martin
>>>
>>> [1]
>>> https://github.com/nathanmarz/storm/blob/master/storm-core/src/jvm/backtype/storm/metric/api/CountMetric.java
>>> [2] https://www.endgame.com/blog/storm-metrics-how-to.html
>>>
>>
>>
>
>
