[ https://issues.apache.org/jira/browse/GIRAPH-232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jakob Homan updated GIRAPH-232:
-------------------------------
Attachment: GIRAPH-231.patch
Since Giraph jobs come and go pretty quickly, JMX isn't a good choice. Using
Hadoop logging and counters works, but is not particularly fine-grained or easy
to consume in other contexts. The attached patch instead uses Yammer's Metrics
library (http://metrics.codahale.com/). This has several advantages:
* It's quite robust - we don't need to write something half-assed.
* It's well thought out. Pretty much all the types of metrics we need are
included.
* It's actively maintained and has a wide community.
* It can pipe things out to stdout/stderr, which avoids the issues we're seeing
with Hadoop's log4j setup and is appropriate for jobs while they're running.
* It's extremely easy to plug various reporting and collecting systems into:
http://metrics.codahale.com/manual/
* It's modular. We don't have to include the whole system but connecting it to
other bits is as easy as:
https://github.com/jghoman/giraphGraphiteReporter/blob/master/src/main/java/com/homan/GiraphGraphiteReporterConnector.java
With this quick file we had all the metrics from our GiraphJob being pumped
into a Graphite web interface.
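To illustrate the pluggable-reporter idea from the list above, here is a minimal sketch in plain Java. It is not the Metrics library's API and not the code in the patch: SketchRegistry, SketchReporter, StreamReporter and the metric names are all hypothetical, chosen only to show how a registry can stay unaware of where its values end up (stderr here, Graphite or anything else in another reporter).

```java
import java.io.PrintStream;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical stand-in for a metrics registry; not the Metrics library API. */
class SketchRegistry {
  // Sorted map so reports come out in a stable, name-ordered sequence.
  private final Map<String, LongAdder> counters = new ConcurrentSkipListMap<>();

  /** Adds n events to the named counter, creating it on first use. */
  void mark(String name, long n) {
    counters.computeIfAbsent(name, k -> new LongAdder()).add(n);
  }

  /** Current value of the named counter, or 0 if it was never marked. */
  long count(String name) {
    LongAdder adder = counters.get(name);
    return adder == null ? 0L : adder.sum();
  }

  /** Pushes every counter to the given reporter, in name order. */
  void report(SketchReporter reporter) {
    counters.forEach((name, value) -> reporter.report(name, value.sum()));
  }
}

/** A reporting backend: stdout/stderr, Graphite, a file, and so on. */
interface SketchReporter {
  void report(String name, long value);
}

/** Reporter that writes "name=value" lines to a stream such as System.err. */
class StreamReporter implements SketchReporter {
  private final PrintStream out;

  StreamReporter(PrintStream out) {
    this.out = out;
  }

  @Override
  public void report(String name, long value) {
    out.println(name + "=" + value);
  }
}

class ReporterDemo {
  public static void main(String[] args) {
    SketchRegistry registry = new SketchRegistry();
    registry.mark("vertices-computed", 3); // hypothetical metric names
    registry.mark("messages-sent", 10);
    registry.mark("vertices-computed", 2);
    // Writing to stderr sidesteps the job's log4j configuration entirely.
    registry.report(new StreamReporter(System.err));
  }
}
```

Swapping the destination means passing a different SketchReporter to report(); the code that marks values does not change.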
Some drawbacks:
* As currently implemented, it's one big static instance. If this gets
committed, I'll convert it over to a Guice-managed resource.
* Without better DI, there's some overhead even if you turn everything off, but
again, Guice should help with this.
* Meters have to be sprinkled throughout the code. I'd love to be able to
mark values to be measured via annotation, but that may be too much for Java.
* Right now testing is pretty much impossible due to the single instance being
used.
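The static-instance drawbacks above (overhead when disabled, hard to test) can be sketched as follows. This uses plain constructor injection rather than Guice, and every name in it (SuperstepMetrics, CountingMetrics, NoOpMetrics, SketchWorker) is hypothetical and not taken from the patch. The point is only that once the metrics object is passed in, a test can supply its own instance and a disabled job can supply one that does nothing.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical metrics interface that worker code depends on. */
interface SuperstepMetrics {
  void messagesSent(long n);
}

/** Enabled implementation: keeps a running total. */
class CountingMetrics implements SuperstepMetrics {
  private final AtomicLong total = new AtomicLong();

  @Override
  public void messagesSent(long n) {
    total.addAndGet(n);
  }

  long total() {
    return total.get();
  }
}

/** Disabled implementation: every call is an empty method. */
enum NoOpMetrics implements SuperstepMetrics {
  INSTANCE;

  @Override
  public void messagesSent(long n) {
    // Intentionally empty: metrics are turned off.
  }
}

/** Hypothetical worker that receives its metrics through the constructor. */
class SketchWorker {
  private final SuperstepMetrics metrics;

  SketchWorker(SuperstepMetrics metrics) {
    this.metrics = metrics;
  }

  void sendBatch(int size) {
    // The actual sending is elided; only the measurement is shown.
    metrics.messagesSent(size);
  }
}

class InjectionDemo {
  public static void main(String[] args) {
    // A test (or a real job) builds its own instance; no shared static state.
    CountingMetrics metrics = new CountingMetrics();
    SketchWorker worker = new SketchWorker(metrics);
    worker.sendBatch(4);
    worker.sendBatch(6);
    System.out.println("messages sent: " + metrics.total());

    // With metrics disabled, the same worker code runs against the no-op.
    new SketchWorker(NoOpMetrics.INSTANCE).sendBatch(5);
  }
}
```

A DI container such as Guice would take over the job of choosing between the two implementations; the worker code would look the same either way.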
Attached patch passes checkstyle and we've tested it. Some metrics have been
added to the code - mainly where it's been helpful already - but more are
expected to be added shortly.
> Add metrics system into Giraph
> ------------------------------
>
> Key: GIRAPH-232
> URL: https://issues.apache.org/jira/browse/GIRAPH-232
> Project: Giraph
> Issue Type: New Feature
> Reporter: Jakob Homan
> Assignee: Jakob Homan
> Attachments: GIRAPH-231.patch
>
>
> Currently a lot of Giraph's operations are not transparent. As a Hadoop job,
> the Giraph logging is at the mercy of Hadoop's logging system and can
> disappear when one encounters a memory issue.