> On May 15, 2014, 8:55 p.m., Gilad Wolff wrote:
> > Looks great, thanks for doing this. I have a few comments nothing major. 
> > 
> > G.

Thanks.  I've answered most of the issues below and made code changes or 
explained why not.  There's a few things I'll have to look into more and I need 
to do some testing for some of the changes.  I'll try to post an updated patch 
tomorrow.


> On May 15, 2014, 8:55 p.m., Gilad Wolff wrote:
> > core/src/main/java/org/apache/oozie/command/Command.java, line 195
> > <https://reviews.apache.org/r/21278/diff/3/?file=580644#file580644line195>
> >
> >     So does it make sense to normalize command metrics? Currently Command 
> > and XCommand create different metrics. XCommand has metrics for failure 
> > (xexception and exception) and an 'executions' counter. Command does not 
> > have these.

I checked and we currently only have one Command 
(CoordActionMaterializeCommand) that is using Command and not XCommand.  Plus, 
it's only being used during a dryrun operation so it's not frequently used.  I 
think it would be better to just convert CoordActionMaterializeCommand to an 
CoordActionMaterializeXCommand and remove Command completely; I've opened 
OOZIE-1846 for that.


> On May 15, 2014, 8:55 p.m., Gilad Wolff wrote:
> > core/src/main/java/org/apache/oozie/command/XCommand.java, line 332
> > <https://reviews.apache.org/r/21278/diff/3/?file=580645#file580645line332>
> >
> >     I think it makes more sense to only time the executions that succeeded. 
> > I think that in general you don't want the timer to get data from failed 
> > executions that may be very short.

The "call" timer includes other stuff than just the main part of the command, 
including loading state, verifying a precondition, etc.  I think we should keep 
the "call" timer as-is so it includes these other things.  For example, if a 
command fails its precondition a lot, the timer should show that (failing the 
precondition throws an exception).

However, the "execute" timer only includes the main part of the command and 
only gets updated if there's no exceptions.  So I think this timer covers 
essentially what you're looking for here.  


> On May 15, 2014, 8:55 p.m., Gilad Wolff wrote:
> > core/src/main/java/org/apache/oozie/service/InstrumentationService.java, 
> > line 51
> > <https://reviews.apache.org/r/21278/diff/3/?file=580649#file580649line51>
> >
> >     nit: what if you are already initialized? Is that okay to call init 
> > twice? Also, if you remove the isEnabled you'll need to assign the member 
> > at the end to make sure that you only initializes it when init was 
> > successful.

You're not allowed to init() twice without first doing destroy().  I imagine 
that many of the Services would have problems if you tried to do that.  We 
don't have anything enforcing this really (and it's actually caused problems in 
tests before), but for now we just assume you don't.  

That's is a good point about assigning the member at the end.


> On May 15, 2014, 8:55 p.m., Gilad Wolff wrote:
> > core/src/main/java/org/apache/oozie/service/MetricsInstrumentationService.java,
> >  line 34
> > <https://reviews.apache.org/r/21278/diff/3/?file=580650#file580650line34>
> >
> >     super nit: rename constant to include the unit?

I know Hadoop usually does that, but Oozie doesn't.  We just mention it in the 
description in oozie-default/oozie-site and/or the documentation.  I had 
forgotten to add this to oozie-default.xml, so thanks for reminding me though.


> On May 15, 2014, 8:55 p.m., Gilad Wolff wrote:
> > core/src/main/java/org/apache/oozie/service/MetricsInstrumentationService.java,
> >  line 37
> > <https://reviews.apache.org/r/21278/diff/3/?file=580650#file580650line37>
> >
> >     nit: you don't really need this member. Use the base class member 
> > instead?

MetricsInstrumentation has a destroy() method that we have to call to stop the 
Metrics stuff from logging.  Instrumentation doesn't have this, so I need it to 
be a MetricsInstrumentation object and not simply an Instrumentation object 
(and I'd rather not deal with casting).


> On May 15, 2014, 8:55 p.m., Gilad Wolff wrote:
> > core/src/main/java/org/apache/oozie/servlet/BaseAdminServlet.java, line 397
> > <https://reviews.apache.org/r/21278/diff/3/?file=580653#file580653line397>
> >
> >     nit: Why not have the same signature and pass the Instrumentation 
> > object?

We generate the json from the Instrumentation and MetricsInstrumentation 
differently, so this can't be done without a lot of refactoring.  I think it's 
simpler to just have two methods.


> On May 15, 2014, 8:55 p.m., Gilad Wolff wrote:
> > core/src/main/java/org/apache/oozie/servlet/V2AdminServlet.java, line 48
> > <https://reviews.apache.org/r/21278/diff/3/?file=580657#file580657line48>
> >
> >     nit: I am not sure if it is possible but the actual metric 
> > instrumentation instance can change (by calling destroy and init on the 
> > metric instrumentation service). So maybe instead of caching a copy of the 
> > instrumentation object check the object that is passed in the methods 
> > instead?

We actually cache the regular Instrumentation in other places too.  While the 
Services can technically be destroyed and then re-init again, in practice, you 
can't do that without restarting Oozie, in which case V2AdminServlet gets 
recreated and it's not a problem.  I'd imagine that restarting the Services 
without restarting Oozie would lead to other similar problems too.


> On May 15, 2014, 8:55 p.m., Gilad Wolff wrote:
> > core/src/main/java/org/apache/oozie/util/MetricsInstrumentation.java, line 
> > 70
> > <https://reviews.apache.org/r/21278/diff/3/?file=580658#file580658line70>
> >
> >     super nit: this controls not only the logging frequency but also the 
> > actual collection frequency, no?

Nope, just the logging frequency.  It's only used by the XLogReporter.  The 
collection frequency, at least for the samplers/histograms is whatever is 
passed to addSampler(...).


> On May 15, 2014, 8:55 p.m., Gilad Wolff wrote:
> > core/src/main/java/org/apache/oozie/util/MetricsInstrumentation.java, line 
> > 105
> > <https://reviews.apache.org/r/21278/diff/3/?file=580658#file580658line105>
> >
> >     can the sliding time window be made configurable?

I need to investigate this some more.  I'm wondering if this is the right type 
of Reservoir to be using here.

According to 
http://metrics.codahale.com/manual/core/#exponentially-decaying-reservoirs 
SlidingTimeWindowReservoir can be memory intensive and they recommend 
ExponentiallyDecayingReservoir instead (which does roughly the last 5min with 
some fancy algorithm).  That's also what I'm using for the samplers/histograms.

There's also the SlidingWindowReservoir, which stores the last N measurements 
(SlidingTimeWindowReservoir does the last N seconds/minutes).  

Thoughts?


> On May 15, 2014, 8:55 p.m., Gilad Wolff wrote:
> > core/src/main/java/org/apache/oozie/util/MetricsInstrumentation.java, line 
> > 111
> > <https://reviews.apache.org/r/21278/diff/3/?file=580658#file580658line111>
> >
> >     why not use loading caches for these as well?

Can't do that because we have to implement a Gauge object and return the value 
from the Variable object, which we don't have in the CacheLoader's load(String 
key) method.  We have to create the Gauge and then put it into the Map.  


> On May 15, 2014, 8:55 p.m., Gilad Wolff wrote:
> > core/src/main/java/org/apache/oozie/util/MetricsInstrumentation.java, line 
> > 147
> > <https://reviews.apache.org/r/21278/diff/3/?file=580658#file580658line147>
> >
> >     >= 0 ? what if something takes less than 1ms?

The reason I added that was that if you add a cron that was never actually run, 
without the if statement, you'll add an extra 0 value to the timer.  I'll look 
into what to do about this.


> On May 15, 2014, 8:55 p.m., Gilad Wolff wrote:
> > core/src/main/java/org/apache/oozie/util/MetricsInstrumentation.java, line 
> > 173
> > <https://reviews.apache.org/r/21278/diff/3/?file=580658#file580658line173>
> >
> >     this can throw if the gauge's name already exist. Why not use a loading 
> > cache as you do for counters and timers? This is also true for histograms.

I address why not using a loading cache in a previous comment.  In general, 
Gauges and Histograms are registered only once by classes that implement 
Instrumentable so that shouldn't be a problem; the other metrics are added on 
the fly.  However, just in case, I've added a check that if the key already 
exists to remove it before re-adding it.


> On May 15, 2014, 8:55 p.m., Gilad Wolff wrote:
> > docs/src/site/twiki/WebServicesAPI.twiki, line 279
> > <https://reviews.apache.org/r/21278/diff/3/?file=580665#file580665line279>
> >
> >     mention that this is not an exhaustive list? And as I mentioned 
> > off-line, it would be nice if one day oozie can expose an exhaustive list 
> > of all possible built-in metrics.

I added something about this to the docs (it's also true for instrumentation).

I agree that it would be nice.  Unfortunately, that's kind of tricky because 
timers and counters are registered on the fly and many are dynamically 
generated by subclasses.  This would likely require an offline tool that would 
look through the code and try to figure it out.  The other 
metrics/instrumentation types are all registered on Oozie startup though.


- Robert


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/21278/#review42740
-----------------------------------------------------------


On May 13, 2014, 5:40 p.m., Robert Kanter wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/21278/
> -----------------------------------------------------------
> 
> (Updated May 13, 2014, 5:40 p.m.)
> 
> 
> Review request for oozie.
> 
> 
> Bugs: OOZIE-1817
>     https://issues.apache.org/jira/browse/OOZIE-1817
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> See 
> https://issues.apache.org/jira/browse/OOZIE-1817?focusedCommentId=13993838&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13993838
> 
> 
> Diffs
> -----
> 
>   client/src/main/java/org/apache/oozie/client/rest/RestConstants.java 
> 808f9b2 
>   core/pom.xml c935dd7 
>   core/src/main/java/org/apache/oozie/command/Command.java d3f1011 
>   core/src/main/java/org/apache/oozie/command/XCommand.java b712ce0 
>   
> core/src/main/java/org/apache/oozie/command/coord/CoordActionInputCheckXCommand.java
>  e8667c1 
>   
> core/src/main/java/org/apache/oozie/command/coord/CoordActionMaterializeCommand.java
>  6962fb2 
>   
> core/src/main/java/org/apache/oozie/command/coord/CoordMaterializeTransitionXCommand.java
>  57cbb34 
>   core/src/main/java/org/apache/oozie/service/InstrumentationService.java 
> 80437b1 
>   
> core/src/main/java/org/apache/oozie/service/MetricsInstrumentationService.java
>  PRE-CREATION 
>   core/src/main/java/org/apache/oozie/service/Services.java 5feac7b 
>   core/src/main/java/org/apache/oozie/service/XLogService.java 403a089 
>   core/src/main/java/org/apache/oozie/servlet/BaseAdminServlet.java 29d7bd6 
>   core/src/main/java/org/apache/oozie/servlet/JsonRestServlet.java 5c05acd 
>   core/src/main/java/org/apache/oozie/servlet/V0AdminServlet.java 97bd81c 
>   core/src/main/java/org/apache/oozie/servlet/V1AdminServlet.java a47f737 
>   core/src/main/java/org/apache/oozie/servlet/V2AdminServlet.java 002a367 
>   core/src/main/java/org/apache/oozie/util/MetricsInstrumentation.java 
> PRE-CREATION 
>   core/src/main/java/org/apache/oozie/util/XLogReporter.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/service/TestInstrumentationService.java 
> b5206cf 
>   
> core/src/test/java/org/apache/oozie/service/TestMetricsInstrumentationService.java
>  PRE-CREATION 
>   core/src/test/java/org/apache/oozie/util/TestInstrumentation.java 6040b47 
>   core/src/test/java/org/apache/oozie/util/TestMetricsInstrumentation.java 
> PRE-CREATION 
>   docs/src/site/twiki/AG_Install.twiki e343d7e 
>   docs/src/site/twiki/WebServicesAPI.twiki 6730255 
>   pom.xml b5e0e4e 
>   webapp/src/main/webapp/index.jsp 3c7ffe5 
>   webapp/src/main/webapp/oozie-console.js 764d888 
> 
> Diff: https://reviews.apache.org/r/21278/diff/
> 
> 
> Testing
> -------
> 
> Unit tests, also verified in a cluster
> 
> 
> Thanks,
> 
> Robert Kanter
> 
>

Reply via email to