[ 
https://issues.apache.org/jira/browse/GIRAPH-232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jakob Homan updated GIRAPH-232:
-------------------------------

    Attachment: GIRAPH-231.patch

Since Giraph jobs come and go pretty quickly, JMX isn't a good choice.  Using 
Hadoop logging and counters works, but is not particularly fine-grained or easy 
to consume in other contexts.  The attached patch instead uses Yammer's Metrics 
library (http://metrics.codahale.com/).  This has several advantages:
* It's quite robust - we don't need to write something half-assed.
* It's well thought out.  Pretty much every type of metric we need is 
already included.
* It's actively maintained and has a wide community.
* It can pipe things out to stdout/stderr, which avoids the issues with 
Hadoop's log4j setup we're seeing and is appropriate for jobs when they're 
running
* It's extremely easy to plug various reporting and collecting systems into: 
http://metrics.codahale.com/manual/
* It's modular. We don't have to include the whole system but connecting it to 
other bits is as easy as: 
https://github.com/jghoman/giraphGraphiteReporter/blob/master/src/main/java/com/homan/GiraphGraphiteReporterConnector.java
  With this quick file we had all the metrics from our GiraphJob being pumped 
into a Graphite web interface.

Some drawbacks:
* As currently implemented, it's one big static instance.  If this gets 
committed, I'll convert it to a Guice-managed resource.
* Without better DI, there's some overhead even if you turn everything off, but 
again, Guice should help with this.
* Meters have to be sprinkled throughout the code.  I'd love to be able to 
mark values to be measured via annotation, but that may be too much for Java.
* Right now testing is pretty much impossible due to the single instance being 
used.
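
To make the static-instance drawbacks concrete, here is a rough sketch of the 
direction an injected metrics object could take.  It is not code from the 
patch and it does not use the Metrics library's API: JobMetrics, 
InMemoryJobMetrics, NoOpJobMetrics and MessageRouter are all hypothetical 
names, and plain constructor injection stands in for whatever wiring Guice 
would eventually do.  The point is only that each job (or test) gets its own 
instance, and that a no-op implementation can be swapped in when metrics are 
turned off.

```java
import java.io.PrintStream;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical interface; not part of Giraph or the Metrics library. */
interface JobMetrics {
    void inc(String name, long delta);
    long count(String name);
    void report(PrintStream out);
}

/** One instance per job or per test, so there is no shared static state. */
final class InMemoryJobMetrics implements JobMetrics {
    private final ConcurrentHashMap<String, LongAdder> counters =
        new ConcurrentHashMap<>();

    @Override
    public void inc(String name, long delta) {
        counters.computeIfAbsent(name, k -> new LongAdder()).add(delta);
    }

    @Override
    public long count(String name) {
        LongAdder adder = counters.get(name);
        return adder == null ? 0L : adder.sum();
    }

    @Override
    public void report(PrintStream out) {
        // Copy into a TreeMap so the output order is stable (sorted by name).
        for (Map.Entry<String, LongAdder> e : new TreeMap<>(counters).entrySet()) {
            out.println(e.getKey() + "=" + e.getValue().sum());
        }
    }
}

/** Swapped in when metrics are disabled; every call does nothing. */
final class NoOpJobMetrics implements JobMetrics {
    @Override public void inc(String name, long delta) { }
    @Override public long count(String name) { return 0L; }
    @Override public void report(PrintStream out) { }
}

/** Hypothetical instrumented component; it receives its metrics object. */
final class MessageRouter {
    private final JobMetrics metrics;

    MessageRouter(JobMetrics metrics) {
        this.metrics = metrics;
    }

    void send(int messages) {
        metrics.inc("messages-sent", messages);
    }
}

final class InjectedMetricsDemo {
    public static void main(String[] args) {
        JobMetrics metrics = new InMemoryJobMetrics();
        MessageRouter router = new MessageRouter(metrics);
        router.send(3);
        router.send(4);
        metrics.report(System.out); // prints "messages-sent=7"
    }
}
```

With this shape a unit test can build a fresh InMemoryJobMetrics, hand it to 
the class under test, and assert on count() without touching global state.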

Attached patch passes checkstyle and we've tested it.  Some metrics have been 
added to the code - mainly where it's been helpful already - but more are 
expected to be added shortly.
                
> Add metrics system into Giraph
> ------------------------------
>
>                 Key: GIRAPH-232
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-232
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Jakob Homan
>            Assignee: Jakob Homan
>         Attachments: GIRAPH-231.patch
>
>
> Currently a lot of Giraph's operations are not transparent. As a Hadoop job, 
> the Giraph logging is at the mercy of Hadoop's logging system and can 
> disappear when one encounters a memory issue.  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
