> On May 15, 2014, 8:55 p.m., Gilad Wolff wrote: > > Looks great, thanks for doing this. I have a few comments nothing major. > > > > G.
Thanks. I've answered most of the issues below and made code changes or explained why not. There's a few things I'll have to look into more and I need to do some testing for some of the changes. I'll try to post an updated patch tomorrow. > On May 15, 2014, 8:55 p.m., Gilad Wolff wrote: > > core/src/main/java/org/apache/oozie/command/Command.java, line 195 > > <https://reviews.apache.org/r/21278/diff/3/?file=580644#file580644line195> > > > > So does it make sense to normalize command metrics? Currently Command > > and XCommand create different metrics. XCommand has metrics for failure > > (xexception and exception) and an 'executions' counter. Command does not > > have these. I checked and we currently only have one Command (CoordActionMaterializeCommand) that is using Command and not XCommand. Plus, it's only being used during a dryrun operation so it's not frequently used. I think it would be better to just convert CoordActionMaterializeCommand to an CoordActionMaterializeXCommand and remove Command completely; I've opened OOZIE-1846 for that. > On May 15, 2014, 8:55 p.m., Gilad Wolff wrote: > > core/src/main/java/org/apache/oozie/command/XCommand.java, line 332 > > <https://reviews.apache.org/r/21278/diff/3/?file=580645#file580645line332> > > > > I think it makes more sense to only time the executions that succeeded. > > I think that in general you don't want the timer to get data from failed > > executions that may be very short. The "call" timer includes other stuff than just the main part of the command, including loading state, verifying a precondition, etc. I think we should keep the "call" timer as-is so it includes these other things. For example, if a command fails its precondition a lot, the timer should show that (failing the precondition throws an exception). However, the "execute" timer only includes the main part of the command and only gets updated if there's no exceptions. So I think this timer covers essentially what you're looking for here. > On May 15, 2014, 8:55 p.m., Gilad Wolff wrote: > > core/src/main/java/org/apache/oozie/service/InstrumentationService.java, > > line 51 > > <https://reviews.apache.org/r/21278/diff/3/?file=580649#file580649line51> > > > > nit: what if you are already initialized? Is that okay to call init > > twice? Also, if you remove the isEnabled you'll need to assign the member > > at the end to make sure that you only initializes it when init was > > successful. You're not allowed to init() twice without first doing destroy(). I imagine that many of the Services would have problems if you tried to do that. We don't have anything enforcing this really (and it's actually caused problems in tests before), but for now we just assume you don't. That's is a good point about assigning the member at the end. > On May 15, 2014, 8:55 p.m., Gilad Wolff wrote: > > core/src/main/java/org/apache/oozie/service/MetricsInstrumentationService.java, > > line 34 > > <https://reviews.apache.org/r/21278/diff/3/?file=580650#file580650line34> > > > > super nit: rename constant to include the unit? I know Hadoop usually does that, but Oozie doesn't. We just mention it in the description in oozie-default/oozie-site and/or the documentation. I had forgotten to add this to oozie-default.xml, so thanks for reminding me though. > On May 15, 2014, 8:55 p.m., Gilad Wolff wrote: > > core/src/main/java/org/apache/oozie/service/MetricsInstrumentationService.java, > > line 37 > > <https://reviews.apache.org/r/21278/diff/3/?file=580650#file580650line37> > > > > nit: you don't really need this member. Use the base class member > > instead? MetricsInstrumentation has a destroy() method that we have to call to stop the Metrics stuff from logging. Instrumentation doesn't have this, so I need it to be a MetricsInstrumentation object and not simply an Instrumentation object (and I'd rather not deal with casting). > On May 15, 2014, 8:55 p.m., Gilad Wolff wrote: > > core/src/main/java/org/apache/oozie/servlet/BaseAdminServlet.java, line 397 > > <https://reviews.apache.org/r/21278/diff/3/?file=580653#file580653line397> > > > > nit: Why not have the same signature and pass the Instrumentation > > object? We generate the json from the Instrumentation and MetricsInstrumentation differently, so this can't be done without a lot of refactoring. I think it's simpler to just have two methods. > On May 15, 2014, 8:55 p.m., Gilad Wolff wrote: > > core/src/main/java/org/apache/oozie/servlet/V2AdminServlet.java, line 48 > > <https://reviews.apache.org/r/21278/diff/3/?file=580657#file580657line48> > > > > nit: I am not sure if it is possible but the actual metric > > instrumentation instance can change (by calling destroy and init on the > > metric instrumentation service). So maybe instead of caching a copy of the > > instrumentation object check the object that is passed in the methods > > instead? We actually cache the regular Instrumentation in other places too. While the Services can technically be destroyed and then re-init again, in practice, you can't do that without restarting Oozie, in which case V2AdminServlet gets recreated and it's not a problem. I'd imagine that restarting the Services without restarting Oozie would lead to other similar problems too. > On May 15, 2014, 8:55 p.m., Gilad Wolff wrote: > > core/src/main/java/org/apache/oozie/util/MetricsInstrumentation.java, line > > 70 > > <https://reviews.apache.org/r/21278/diff/3/?file=580658#file580658line70> > > > > super nit: this controls not only the logging frequency but also the > > actual collection frequency, no? Nope, just the logging frequency. It's only used by the XLogReporter. The collection frequency, at least for the samplers/histograms is whatever is passed to addSampler(...). > On May 15, 2014, 8:55 p.m., Gilad Wolff wrote: > > core/src/main/java/org/apache/oozie/util/MetricsInstrumentation.java, line > > 105 > > <https://reviews.apache.org/r/21278/diff/3/?file=580658#file580658line105> > > > > can the sliding time window be made configurable? I need to investigate this some more. I'm wondering if this is the right type of Reservoir to be using here. According to http://metrics.codahale.com/manual/core/#exponentially-decaying-reservoirs SlidingTimeWindowReservoir can be memory intensive and they recommend ExponentiallyDecayingReservoir instead (which does roughly the last 5min with some fancy algorithm). That's also what I'm using for the samplers/histograms. There's also the SlidingWindowReservoir, which stores the last N measurements (SlidingTimeWindowReservoir does the last N seconds/minutes). Thoughts? > On May 15, 2014, 8:55 p.m., Gilad Wolff wrote: > > core/src/main/java/org/apache/oozie/util/MetricsInstrumentation.java, line > > 111 > > <https://reviews.apache.org/r/21278/diff/3/?file=580658#file580658line111> > > > > why not use loading caches for these as well? Can't do that because we have to implement a Gauge object and return the value from the Variable object, which we don't have in the CacheLoader's load(String key) method. We have to create the Gauge and then put it into the Map. > On May 15, 2014, 8:55 p.m., Gilad Wolff wrote: > > core/src/main/java/org/apache/oozie/util/MetricsInstrumentation.java, line > > 147 > > <https://reviews.apache.org/r/21278/diff/3/?file=580658#file580658line147> > > > > >= 0 ? what if something takes less than 1ms? The reason I added that was that if you add a cron that was never actually run, without the if statement, you'll add an extra 0 value to the timer. I'll look into what to do about this. > On May 15, 2014, 8:55 p.m., Gilad Wolff wrote: > > core/src/main/java/org/apache/oozie/util/MetricsInstrumentation.java, line > > 173 > > <https://reviews.apache.org/r/21278/diff/3/?file=580658#file580658line173> > > > > this can throw if the gauge's name already exist. Why not use a loading > > cache as you do for counters and timers? This is also true for histograms. I address why not using a loading cache in a previous comment. In general, Gauges and Histograms are registered only once by classes that implement Instrumentable so that shouldn't be a problem; the other metrics are added on the fly. However, just in case, I've added a check that if the key already exists to remove it before re-adding it. > On May 15, 2014, 8:55 p.m., Gilad Wolff wrote: > > docs/src/site/twiki/WebServicesAPI.twiki, line 279 > > <https://reviews.apache.org/r/21278/diff/3/?file=580665#file580665line279> > > > > mention that this is not an exhaustive list? And as I mentioned > > off-line, it would be nice if one day oozie can expose an exhaustive list > > of all possible built-in metrics. I added something about this to the docs (it's also true for instrumentation). I agree that it would be nice. Unfortunately, that's kind of tricky because timers and counters are registered on the fly and many are dynamically generated by subclasses. This would likely require an offline tool that would look through the code and try to figure it out. The other metrics/instrumentation types are all registered on Oozie startup though. - Robert ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/21278/#review42740 ----------------------------------------------------------- On May 13, 2014, 5:40 p.m., Robert Kanter wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/21278/ > ----------------------------------------------------------- > > (Updated May 13, 2014, 5:40 p.m.) > > > Review request for oozie. > > > Bugs: OOZIE-1817 > https://issues.apache.org/jira/browse/OOZIE-1817 > > > Repository: oozie-git > > > Description > ------- > > See > https://issues.apache.org/jira/browse/OOZIE-1817?focusedCommentId=13993838&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13993838 > > > Diffs > ----- > > client/src/main/java/org/apache/oozie/client/rest/RestConstants.java > 808f9b2 > core/pom.xml c935dd7 > core/src/main/java/org/apache/oozie/command/Command.java d3f1011 > core/src/main/java/org/apache/oozie/command/XCommand.java b712ce0 > > core/src/main/java/org/apache/oozie/command/coord/CoordActionInputCheckXCommand.java > e8667c1 > > core/src/main/java/org/apache/oozie/command/coord/CoordActionMaterializeCommand.java > 6962fb2 > > core/src/main/java/org/apache/oozie/command/coord/CoordMaterializeTransitionXCommand.java > 57cbb34 > core/src/main/java/org/apache/oozie/service/InstrumentationService.java > 80437b1 > > core/src/main/java/org/apache/oozie/service/MetricsInstrumentationService.java > PRE-CREATION > core/src/main/java/org/apache/oozie/service/Services.java 5feac7b > core/src/main/java/org/apache/oozie/service/XLogService.java 403a089 > core/src/main/java/org/apache/oozie/servlet/BaseAdminServlet.java 29d7bd6 > core/src/main/java/org/apache/oozie/servlet/JsonRestServlet.java 5c05acd > core/src/main/java/org/apache/oozie/servlet/V0AdminServlet.java 97bd81c > core/src/main/java/org/apache/oozie/servlet/V1AdminServlet.java a47f737 > core/src/main/java/org/apache/oozie/servlet/V2AdminServlet.java 002a367 > core/src/main/java/org/apache/oozie/util/MetricsInstrumentation.java > PRE-CREATION > core/src/main/java/org/apache/oozie/util/XLogReporter.java PRE-CREATION > core/src/test/java/org/apache/oozie/service/TestInstrumentationService.java > b5206cf > > core/src/test/java/org/apache/oozie/service/TestMetricsInstrumentationService.java > PRE-CREATION > core/src/test/java/org/apache/oozie/util/TestInstrumentation.java 6040b47 > core/src/test/java/org/apache/oozie/util/TestMetricsInstrumentation.java > PRE-CREATION > docs/src/site/twiki/AG_Install.twiki e343d7e > docs/src/site/twiki/WebServicesAPI.twiki 6730255 > pom.xml b5e0e4e > webapp/src/main/webapp/index.jsp 3c7ffe5 > webapp/src/main/webapp/oozie-console.js 764d888 > > Diff: https://reviews.apache.org/r/21278/diff/ > > > Testing > ------- > > Unit tests, also verified in a cluster > > > Thanks, > > Robert Kanter > >
