Writing to the database could be slow and would require bulk writes. We are 
using Riemann to forward the data.

See https://github.com/forter/riemann-storm-monitor

We plan to modify the code to use the metrics framework, but currently it wraps 
the bolts.


________________________________
From: Brunner, Bill <[email protected]>
Sent: Tuesday, October 28, 2014 1:28 PM
To: Michael Pershyn; [email protected]
Subject: RE: Metrics question

Thanks Michael.  My use case is to be able to have visibility into the metrics 
within the topology at runtime.  More specifically, the last step in my 
processing writes all the elapsed times, counts, etc to a db.  I know that the 
metrics framework is designed for this, but I was just wondering whether there 
was another way to read/write data to some data structure that is in scope for 
all components of the topology (without creating a static class).  Let me know. 
 In the meantime I will take a closer look at the metrics framework.
Thanks again.

From: Michael Pershyn [mailto:[email protected]]
Sent: Tuesday, October 28, 2014 5:56 AM
To: Brunner, Bill; [email protected]
Subject: Re: Metrics question


Hi Bill,

There are two options known to me; I have tried only the first one.
Apart from using profiling tools, when you just want to measure one function, 
Java has timing functions, so you can measure how long the call takes 
(example<http://stackoverflow.com/questions/180158/how-do-i-time-a-methods-execution-in-java>).
 Which way you take depends on your goals.
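
A minimal sketch of timing a single call, assuming Java 8 and plain 
System.nanoTime(). The class ElapsedTimer and its methods are hypothetical 
names made up for illustration, not part of Storm or Trident, and the output 
goes to stdout here rather than to a worker logger.

```java
import java.util.function.Supplier;

// Hypothetical helper (not part of Storm): times one call with System.nanoTime().
final class ElapsedTimer {
    private ElapsedTimer() {}

    /** Formats a log line in the style quoted in this thread. */
    static String describe(String name, long elapsedMillis) {
        return name + " has taken " + elapsedMillis + " ms to execute";
    }

    /** Runs the work, prints the elapsed wall-clock time, and returns the result. */
    static <T> T timed(String name, Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
            // Assumption: inside a bolt or Trident function this line would go
            // to the worker's logger instead of stdout.
            System.out.println(describe(name, elapsedMillis));
        }
    }

    public static void main(String[] args) {
        int sum = timed("my-function", () -> {
            int s = 0;
            for (int i = 1; i <= 1000; i++) {
                s += i;
            }
            return s;
        });
        System.out.println("result = " + sum);
    }
}
```

Wrapping the call this way keeps the timing out of the tuples: the measured 
function does not need to know it is being timed.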

I think the answer heavily depends on your goals: would you like to capture 
timings for a short period (debugging), a long period (tuning), or for the 
lifetime?

For the case of time:

1.     You can write the measured time into the Storm logs, so you end up 
with a log entry like "my-function has taken 55 ms to execute" in the worker 
log. In the simple case you can just watch this log with shell tools or the 
Storm logviewer.
However, these logs can also be processed by logstash<http://logstash.net/>.
Logstash can send the events into elasticsearch, where, using 
kibana<http://www.elasticsearch.org/overview/kibana/>, you can skim over them, 
watch them, count them, build nice graphics, filter out non-relevant 
information and so on. It is easy to set up.
Logstash can also parse logs and send events to opentsdb (and other metrics 
backends) where you can save and analyze the data. Docker containers can be a 
great help<https://registry.hub.docker.com/u/dockerfile/elasticsearch/>.

2.     For long-term/continuous measurement I would recommend trying out the 
Storm metrics framework<https://storm.apache.org/documentation/Metrics.html> 
(if you use Storm 0.9+). It is also applicable to Trident. I would suggest 
using AssignableMetric or ReducedMetric to calculate the mean for your function 
calls.
Metrics can then be processed by a metrics consumer (and may end up in a log, 
graphite, opentsdb or another database).
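
To illustrate the "reduce to a mean" idea behind option 2, here is a rough 
plain-Java stand-in. It is not Storm's metric API: the class MeanMetricSketch 
and its method names are assumptions chosen for this sketch, and the 
registration with the topology context and the metrics consumer are left out 
because they need Storm on the classpath.

```java
// Plain-Java stand-in for a mean-reducing metric; NOT Storm's metric API.
// All names here are hypothetical and chosen for illustration only.
final class MeanMetricSketch {
    private long count = 0;
    private double total = 0.0;

    /** Records one elapsed-time sample, e.g. one per function call. */
    synchronized void update(double elapsedMillis) {
        total += elapsedMillis;
        count++;
    }

    /** Returns the mean since the last report (null if no samples), then resets. */
    synchronized Double getValueAndReset() {
        Double mean = (count == 0) ? null : Double.valueOf(total / count);
        count = 0;
        total = 0.0;
        return mean;
    }

    public static void main(String[] args) {
        MeanMetricSketch metric = new MeanMetricSketch();
        metric.update(40.0);
        metric.update(60.0);
        // A consumer would poll this once per reporting interval.
        System.out.println("mean elapsed ms = " + metric.getValueAndReset());
    }
}
```

The point of the reset is that each reporting interval gets its own mean, so 
only one number per interval travels to the consumer instead of one per call.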

Best regards,
Michael Pershyn

On 10/27/2014 12:29 PM, Brunner, Bill wrote:
What is the best way to capture start/end times of functions/aggregators in 
Trident?  I am interested in capturing elapsed times for each, but would prefer 
not having to pass the info around in tuples, or write incrementally to a 
database.  Wondering if anyone else has done this and how.  Thanks
________________________________
This message, and any attachments, is for the intended recipient(s) only, may 
contain information that is privileged, confidential and/or proprietary and 
subject to important terms and conditions available at 
http://www.bankofamerica.com/emaildisclaimer. If you are not the intended 
recipient, please delete this message.