On Tue, Jan 3, 2017 at 8:56 PM Ben Chambers <[email protected]> wrote:
> The Metrics API in Beam is proposed to support both committed metrics (only
> from the successfully committed bundles) and attempted metrics (from all
> attempts at processing bundles). I think any mechanism based on the workers
> reporting metrics to a monitoring system will only be able to support
> attempted metrics. Reporting committed metrics needs runner support to make
> sure that the metrics are reported (and acknowledged) atomically with the
> completion of the bundle.
>
> The way I have been thinking this would work is that the runner would be
> involved in actually gathering the metrics, and then we could have some
> mechanism that periodically retrieved the metrics from the runner (both
> committed and attempted) and pushed those into the monitoring systems.

This is similar to the ScheduledReporter I mentioned, with the subtlety that
the runner should "gather" the metrics (because of the distributed nature).

> In this model, each runner should test that (1) the mechanism it has of
> gathering metrics works and (2) that the metric reporting plugin runs
> appropriately. It would not be necessary to test that a specific metric
> reporting plugin works with a specific runner, since all the plugins should
> be using the same API to get metrics from the runner.

Agree, we only really need to test the reporting plugin.

> The API that has been built so far supports (1) as well as exposing metrics
> from the runner on the PipelineResult object. I'm currently working on
> building support for that in the Dataflow runner.

So we're in agreement that (2), the "reporting plugin", is needed? Do we have
a ticket? Do you have something in progress?

> On Mon, Jan 2, 2017 at 11:57 PM Amit Sela <[email protected]> wrote:
>
> > I think that in the spirit of a Codahale/Dropwizard metrics-like API, the
> > question is do we want to have something like ScheduledReporter
> > <http://metrics.dropwizard.io/3.1.0/apidocs/com/codahale/metrics/ScheduledReporter.html>
> > as a contract to collect and report the metrics to different monitoring
> > systems (e.g., Graphite, Ganglia, etc.).
> >
> > On Mon, Jan 2, 2017 at 8:07 PM Stas Levin <[email protected]> wrote:
> >
> > > I see.
> > >
> > > Just to make sure I get it right, in (2), by sinks I mean various metrics
> > > backends (e.g., Graphite). So it boils down to having integration tests
> > > as part of Beam (runners?) that, beyond testing the SDK layer (i.e.,
> > > asserting over pipeline.metrics()), actually test the specific metrics
> > > backend (i.e., asserting over inMemoryGraphite.metrics()), right?
> > >
> > > On Mon, Jan 2, 2017 at 7:14 PM Davor Bonaci <[email protected]> wrote:
> > >
> > > > Sounds like we should do both, right?
> > > >
> > > > > 1. Test the metrics API without accounting for the various sink
> > > > > types, i.e. against the SDK.
> > > >
> > > > Metrics API is a runner-independent SDK concept. I'd imagine we'd want
> > > > to have runner-independent tests that interact with the API, outside of
> > > > any specific transform implementation, execute them on all runners, and
> > > > query the results. Goal: make sure Metrics work.
> > > >
> > > > > 2. Have the sink types, or at least some of them, tested as part of
> > > > > integration tests, e.g., have an in-memory Graphite server to test
> > > > > Graphite metrics and so on.
> > > >
> > > > This is valid too -- this is testing *usage* of the Metrics API in the
> > > > given IO. If a source/sink, or a transform in general, is exposing a
> > > > metric, that metric should be tested in its own right as a part of the
> > > > transform implementation.
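
To make the "reporting plugin" idea above concrete, here is a rough sketch in
Java. It assumes the metrics query API roughly as exposed on PipelineResult at
the time (method names such as queryMetrics(), counters(), name(), attempted()
and committed() may differ between Beam versions), and the MetricsBackend
interface and ScheduledMetricsReporter class are hypothetical placeholders, not
existing Beam or Dropwizard types.

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.beam.sdk.PipelineResult;
import org.apache.beam.sdk.metrics.MetricQueryResults;
import org.apache.beam.sdk.metrics.MetricResult;
import org.apache.beam.sdk.metrics.MetricsFilter;

/**
 * Sketch of a runner-agnostic "metrics reporting plugin": it periodically
 * queries the runner (via PipelineResult) for attempted and committed metric
 * values and pushes them to a monitoring backend.
 */
public class ScheduledMetricsReporter {

  /** Hypothetical sink abstraction; a Graphite or Ganglia client would implement it. */
  public interface MetricsBackend {
    void report(String name, String kind, long value);
  }

  private final PipelineResult result;
  private final MetricsBackend backend;
  private final ScheduledExecutorService executor =
      Executors.newSingleThreadScheduledExecutor();

  public ScheduledMetricsReporter(PipelineResult result, MetricsBackend backend) {
    this.result = result;
    this.backend = backend;
  }

  /** Poll the runner every {@code periodSeconds} and push counters to the backend. */
  public void start(long periodSeconds) {
    executor.scheduleAtFixedRate(this::reportOnce, periodSeconds, periodSeconds, TimeUnit.SECONDS);
  }

  private void reportOnce() {
    // The runner is responsible for gathering metrics from its distributed
    // workers; the plugin only queries the aggregated view.
    MetricQueryResults metrics =
        result.metrics().queryMetrics(MetricsFilter.builder().build());
    for (MetricResult<Long> counter : metrics.counters()) {
      // Attempted values should be available from any runner.
      backend.report(counter.name().toString(), "attempted", counter.attempted());
      try {
        // Committed values need runner support for reporting metrics
        // atomically with bundle completion; not all runners provide them.
        backend.report(counter.name().toString(), "committed", counter.committed());
      } catch (UnsupportedOperationException e) {
        // Runner does not support committed metrics; skip.
      }
    }
  }

  public void stop() {
    executor.shutdown();
  }
}

The point of the sketch is the division of labor: the runner gathers and
aggregates metrics from its workers, while the plugin only polls the aggregated
view and pushes it out, which is why the same plugin should work against any
runner.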

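And a rough sketch of the runner-independent, SDK-level test discussed above:
build a small pipeline whose DoFn increments a counter, run it, and assert over
the metrics queried from the PipelineResult. The same caveat about exact method
names applies (committed()/attempted() were renamed in later releases); the
in-memory Graphite integration test would be a separate test that runs a
reporting plugin against a fake backend and asserts on what that backend
received.

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.PipelineResult;
import org.apache.beam.sdk.metrics.Counter;
import org.apache.beam.sdk.metrics.MetricNameFilter;
import org.apache.beam.sdk.metrics.MetricQueryResults;
import org.apache.beam.sdk.metrics.MetricResult;
import org.apache.beam.sdk.metrics.Metrics;
import org.apache.beam.sdk.metrics.MetricsFilter;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.junit.Test;

/** Runner-independent test: assert over metrics queried from the PipelineResult. */
public class MetricsSdkLevelTest {

  /** A DoFn that increments a counter once per element. */
  static class CountingFn extends DoFn<Integer, Integer> {
    private final Counter elements = Metrics.counter(MetricsSdkLevelTest.class, "elements");

    @ProcessElement
    public void processElement(ProcessContext c) {
      elements.inc();
      c.output(c.element());
    }
  }

  @Test
  public void countersAreReportedOnPipelineResult() {
    Pipeline pipeline = Pipeline.create(PipelineOptionsFactory.create());
    pipeline.apply(Create.of(1, 2, 3)).apply(ParDo.of(new CountingFn()));

    PipelineResult result = pipeline.run();
    result.waitUntilFinish();

    // Query only the counter declared by CountingFn.
    MetricQueryResults metrics =
        result.metrics().queryMetrics(
            MetricsFilter.builder()
                .addNameFilter(MetricNameFilter.named(MetricsSdkLevelTest.class, "elements"))
                .build());

    boolean sawCounter = false;
    for (MetricResult<Long> counter : metrics.counters()) {
      sawCounter = true;
      // Attempted counts may exceed committed ones when bundles are retried;
      // the committed value should match the number of elements exactly.
      assertEquals(3L, (long) counter.committed());
    }
    assertTrue("expected the counter to be reported", sawCounter);
  }
}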