Good idea. The current instrumentation exposes a lot of very low-level
information, some of which does not help in understanding system load
and behavior.
E.g. counters for the various commands: knowing 'coord.action.input.check' is
useful, but 'kill.preconditionfailed' not so much. We could exclude the
unimportant ones and expose only the most impactful metrics.

In addition to JVM metrics, the following would be some of the most useful
metrics. No latency-related information is currently captured, so I am not
sure how much work the first one will involve. The 2nd and 3rd are already
captured, just not in a very graspable form.

  *   Average latency of the different database retrieval and update APIs
  *   A trend of number of jobs handled by Recovery Service
  *   A trend of callable queue occupancy
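
To make the three metrics above concrete, here is a rough sketch of the bookkeeping they would need. It is in Python only for brevity and is not Oozie code; the class names and the operation name "hypothetical.db.get" are made up for illustration.

```python
import time
from collections import defaultdict, deque


class LatencyStats:
    """Running count and total per operation, so an average latency can be reported."""

    def __init__(self):
        self.count = defaultdict(int)
        self.total_seconds = defaultdict(float)

    def record(self, operation, seconds):
        self.count[operation] += 1
        self.total_seconds[operation] += seconds

    def average(self, operation):
        n = self.count[operation]
        return self.total_seconds[operation] / n if n else 0.0


class Trend:
    """Keeps only the most recent samples of a value so it can be read as a trend."""

    def __init__(self, max_samples=60):
        self.samples = deque(maxlen=max_samples)

    def sample(self, value, timestamp=None):
        ts = time.time() if timestamp is None else timestamp
        self.samples.append((ts, value))

    def values(self):
        return [value for _, value in self.samples]


# Hypothetical usage: one LatencyStats for database calls, one Trend each for
# jobs handled by the Recovery Service and for callable queue occupancy.
db_latency = LatencyStats()
db_latency.record("hypothetical.db.get", 0.020)
db_latency.record("hypothetical.db.get", 0.040)

queue_occupancy = Trend(max_samples=3)
for size in (5, 9, 12, 7):
    queue_occupancy.sample(size)
```

The same Trend helper could be sampled once per Recovery Service run to get the second metric; a metrics library would normally provide equivalents of both helpers.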

On 5/5/14, 4:16 PM, "Rohini Palaniswamy" <[email protected]> wrote:

+1


On Mon, May 5, 2014 at 2:37 PM, Alejandro Abdelnur <[email protected]> wrote:

+1,

I've used Codahale Metrics in my last few projects (Llama is one of them)
and I love it.

I'd suggest the following: let's create a new Instrumentation class that
delegates to Metrics; then people can choose to upgrade. In Oozie 5 we can
swap the default instrumentation in oozie-default.xml to Metrics.
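
A minimal sketch of that delegation idea, in Python for brevity rather than Oozie's Java. Everything here is an assumption for illustration: the class names, the stand-in registry, and the "instrumentation.impl" key are hypothetical and are not real Oozie or Metrics APIs.

```python
class SimpleInstrumentation:
    """Stand-in for the existing instrumentation: keeps its own counters."""

    def __init__(self):
        self.counters = {}

    def incr(self, group, name, count=1):
        key = (group, name)
        self.counters[key] = self.counters.get(key, 0) + count


class MetricsRegistry:
    """Stand-in for a metrics library's registry (hypothetical, not a real API)."""

    def __init__(self):
        self.counters = {}

    def counter_inc(self, name, count):
        self.counters[name] = self.counters.get(name, 0) + count


class MetricsInstrumentation(SimpleInstrumentation):
    """Same interface as the old class, but forwards every update to the registry."""

    def __init__(self, registry):
        super().__init__()
        self.registry = registry

    def incr(self, group, name, count=1):
        self.registry.counter_inc(f"{group}.{name}", count)


def create_instrumentation(config, registry):
    # "instrumentation.impl" is a made-up key standing in for whatever
    # property oozie-default.xml would use to select the implementation.
    if config.get("instrumentation.impl") == "metrics":
        return MetricsInstrumentation(registry)
    return SimpleInstrumentation()
```

Callers keep using the same incr() call either way, so switching the default later would only be a configuration change.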

I'd also add the Hadoop JMX JSON servlet to Oozie. If Metrics is not used, the
servlet would just report JVM metrics; if Metrics is used, everything will
come out there.

Thx.


On Mon, May 5, 2014 at 1:23 PM, Robert Kanter <[email protected]> wrote:

> Hi all,
>
> The JIRA at OOZIE-1817
> <https://issues.apache.org/jira/browse/OOZIE-1817> wants to make the
> instrumentation timers biased; that is, to make them look
> only at the last X amount of time, instead of forever.  When trying
> to monitor Oozie, if your Oozie server has been running a long time, the
> timers become less useful.
>
> While looking at some of the instrumentation code, I also saw that in a lot
> of places, we create and use a timer (i.e. a Cron object), but we never add
> it to the Instrumentation, so it's never actually reported to the user, and
> is essentially wasteful and useless.
>
> We haven't really touched our instrumentation code in a while, and I was
> thinking it might be a good idea to switch to using a library like Codahale
> Metrics (also called Yammer Metrics).  They include a bunch of stuff for us
> (e.g. JVM and servlet metrics) and do all of the math for computing
> standard deviation, percentiles, etc.  It also looks like most of the
> metric types can be updated more immediately, instead of every minute.
> Their API looks pretty similar to what we already have, so it shouldn't be
> too hard to switch over.  Their "Getting Started" page is here if you want
> to see more about it: http://metrics.codahale.com/getting-started/
>
> We'd have to keep the current Instrumentation for Oozie 4.x, but we could
> add a /v2/metrics URL for the new one and deprecate the old one.
>
> Thoughts?
>
>
> thanks
> - Robert
>
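
For reference, the "only look at the last X amount of time" behavior described in the quoted message could look roughly like the sketch below. It is a plain sliding time window in Python, which is an assumption made for simplicity; it is not the patch on OOZIE-1817 and not the decaying sampling that "biased" usually refers to. The WindowedTimer name is hypothetical.

```python
from collections import deque


class WindowedTimer:
    """Timer whose statistics cover only the most recent window_seconds."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._samples = deque()  # (timestamp, duration) pairs, oldest first

    def record(self, duration, now):
        self._samples.append((now, duration))
        self._evict(now)

    def _evict(self, now):
        # Drop every sample that is older than the window.
        cutoff = now - self.window_seconds
        while self._samples and self._samples[0][0] < cutoff:
            self._samples.popleft()

    def mean(self, now):
        self._evict(now)
        if not self._samples:
            return 0.0
        return sum(d for _, d in self._samples) / len(self._samples)
```

With a forever-running timer, one slow call from weeks ago keeps skewing the average; here it simply ages out once it is older than the window.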



--
Alejandro

