Thanks for starting the discussion! I'm going to hold off saying what I think and instead just provide some background and additional questions, because I want to see where the discussion goes.
When I first suggested the API for querying metrics, I was adding it for parity with aggregators. A good first question might be: does the pipeline result even need query methods? Runners could add them as necessary based on the levels of querying they support.

The other goal was to make the accuracy guarantees clear. One implementation path is reporting metrics directly from the workers while attempting work. This can overcount when bundles are retried, and may undercount the actual attempts if a worker loses connectivity before reporting. Another implementation is something like a side output, where the counts are committed as part of each bundle's results and then aggregated. This committed value is more accurate and represents the value that occurred along the success path of the pipeline. I suspect there are other possible implementations, so designing an API that expresses all of them is difficult.

So:

1. Does the pipeline result need to support querying (which is useful for programmatic consumption), or are metrics intended only to get values out of a pipeline and into some metrics store?

2. How should pipeline results indicate the different kinds of metrics? What if a runner supported multiple kinds (e.g., the runner reports both attempted and committed results)? As Luke mentions, it may be useful to look at both to understand how much retries affected the value.

On Thu, Jan 19, 2017, 1:42 AM Aviem Zur <aviem...@gmail.com> wrote:

Hi all,

While working on the implementation of the metrics API in the Spark runner, the question of committed vs. attempted results has come up, sparking (no pun intended) an interesting conversation. (See API: MetricResult <https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/metrics/MetricResult.java> and discussion: PR #1750 <https://github.com/apache/beam/pull/1750>)

The separation of `attempted` and `committed` metric results seems a bit unclear.
Seeing that current implementations of aggregators in the different runners do not guarantee correctness, one could assume that the metrics API implementations will offer the same guarantees. If this is correct, then you could assume that only `attempted()` metric results can be fulfilled. Would it then be better to have a single method such as `get()` in the API, and have the guarantees of each runner explained in the capability matrix / documentation?
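To make the attempted/committed distinction concrete, here is a self-contained sketch in Java. It is illustrative only, not the actual Beam `MetricResult` implementation; all class and method names are hypothetical. It models the two reporting paths described above, with one retried bundle causing the attempted count to exceed the committed one, and shows how a single-method `get()` could be backed by the attempted value on a runner that only supports that level of accuracy.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch, not the Beam API: models a runner's metric
// bookkeeping for the two reporting paths discussed above.
class CounterSketch {
    // Reported eagerly from workers while attempting work; a retry
    // counts the same elements again, so this can overcount.
    final AtomicLong attempted = new AtomicLong();

    // Committed together with a bundle's results, so it reflects only
    // the successful path through the pipeline.
    final AtomicLong committed = new AtomicLong();

    // Simulates processing a bundle of `elements` items where the first
    // attempt fails after reporting but before its results commit.
    void processBundleWithOneRetry(long elements) {
        attempted.addAndGet(elements); // first attempt: reported, then worker fails
        attempted.addAndGet(elements); // retry: the work is attempted (and reported) again
        committed.addAndGet(elements); // retry succeeds: committed exactly once
    }

    // The single-method alternative from the question above: a runner
    // that only tracks attempted values would document (e.g. in the
    // capability matrix) that get() returns the attempted value.
    long get() {
        return attempted.get();
    }
}
```

In this sketch, calling `processBundleWithOneRetry(10)` leaves `attempted` at 20 and `committed` at 10; comparing the two shows how much retries affected the value, which is the case Luke raises above.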