[ https://issues.apache.org/jira/browse/BEAM-1974?focusedWorklogId=297207&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297207 ]
ASF GitHub Bot logged work on BEAM-1974:
----------------------------------------
Author: ASF GitHub Bot
Created on: 19/Aug/19 15:24
Start Date: 19/Aug/19 15:24
Worklog Time Spent: 10m
Work Description: echauchot commented on pull request #9328: [BEAM-1974] Add Metrics user-oriented documentation
URL: https://github.com/apache/beam/pull/9328#discussion_r315270391
##########
File path: website/src/documentation/programming-guide.md
##########
@@ -2907,3 +2907,143 @@ elements, or after a minute.
```py
{% github_sample
/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/snippets_test.py
tag:model_other_composite_triggers
%}```
+
+## 9. Metrics {#metrics}
+In the Beam model, metrics provide some insight into the current state of a user pipeline,
+potentially while the pipeline is running. There could be different reasons for that, for instance:
+* Check the number of errors encountered while running a specific step in the pipeline;
+* Monitor the number of RPCs made to backend service;
+* Retrieve an accurate count of the number of elements that have been processed;
+* ...and so on.
+
+### 9.1 The main concepts of Beam metrics
+* **Named**. Each metric has a name which consists of a namespace and an actual name. The
+ namespace can be used to differentiate between multiple metrics with the same name and also
+ allows querying for all metrics within a specific namespace.
+* **Scoped**. Each metric is reported against a specific step in the pipeline, indicating what
+ code was running when the metric was incremented.
+* **Dynamically Created**. Metrics may be created during runtime without pre-declaring them, in
+ much the same way a logger could be created. This makes it easier to produce metrics in utility
+ code and have them usefully reported.
+* **Degrade Gracefully**. If a runner doesn’t support some part of reporting metrics, the
+ fallback behavior is to drop the metric updates rather than failing the pipeline. If a runner
+ doesn’t support some part of querying metrics, the runner will not return the associated data.
+
+Reported metrics are implicitly scoped to the transform within the pipeline that reported them.
+This allows reporting the same metric name in multiple places and identifying the value each
+transform reported, as well as aggregating the metric across the entire pipeline.
+
+> **Note:** It is runner-dependent whether metrics are accessible during pipeline execution or only
+after jobs have completed.
+
+### 9.2 Types of metrics {#types-of-metrics}
+There are three types of metrics that are supported for the moment: `Counter`, `Distribution` and
+`Gauge`.
+
+**Counter**: A metric that reports a single long value and can be incremented or decremented.
+
+```java
+Counter counter = Metrics.counter( "namespace", "counter1");
+
+@ProcessElement
+public void processElement(ProcessContext context) {
+ // count the elements
+ counter.inc();
+ ...
+}
+```
+
+**Distribution**: A metric that reports information about the distribution of reported values.
+
+```java
+Distribution distribution = Metrics.distribution( "namespace", "distribution1");
+
+@ProcessElement
+public void processElement(ProcessContext context) {
+ Integer element = context.element();
+ // create a distribution (histogram) of the values
+ distribution.update(element);
+ ...
+}
+```
+
+**Gauge**: A metric that reports the latest value out of reported values. Since metrics are
+collected from many workers the value may not be the absolute last, but one of the latest values.
+
+```java
+Gauge gauge = Metrics.gauge( "namespace", "gauge1");
+
+@ProcessElement
+public void processElement(ProcessContext context) {
+ Integer element = context.element();
+ // create a gauge (latest value received) of the values
+ gauge.set(element);
+ ...
+}
+```
+
+### 9.3 Querying metrics {#querying-metrics}
+`PipelineResult` has a method `metrics()` which returns a `MetricResults` object that allows
+accessing metrics. The main method available in `MetricResults` allows querying for all metrics
+matching a given filter.
+
+```java
+public interface PipelineResult {
+ MetricResults metrics();
+}
+
+public abstract class MetricResults {
+ public abstract MetricQueryResults queryMetrics(@Nullable MetricsFilter filter);
+}
+
+public interface MetricQueryResults {
+ Iterable<MetricResult<Long>> getCounters();
+ Iterable<MetricResult<DistributionResult>> getDistributions();
+ Iterable<MetricResult<GaugeResult>> getGauges();
+}
+
+public interface MetricResult<T> {
+ MetricName getName();
+ String getStep();
+ T getCommitted();
+ T getAttempted();
+}
+```
+
+### 9.4 Using metrics in pipeline {#using-metrics}
+Below, there is a simple example of how to use a `Counter` metric in a user pipeline.
+
+```java
+// creating a pipeline with custom metrics DoFn
+pipeline
+ .apply(...)
+ .apply(ParDo.of(new MyMetricsDoFn()));
+
+pipelineResult = pipeline.run().waitUntilFinish(...);
+
+// request the metric, called "counter1", in namespace, called "namespace"
+MetricQueryResults metrics =
+ pipelineResult
+ .metrics()
+ .queryMetrics(
+ MetricsFilter.builder()
+ .addNameFilter(MetricNameFilter.named("namespace", "counter1"))
+ .build());
+
+// print the metric value - there should be only one line because there is only one metric,
Review comment:
remove ","
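
Side note on the "Named" and "Scoped" concepts in section 9.1 of the quoted hunk: the idea that the same metric name can be reported from several steps, read back per step, and also summed across the pipeline can be sketched with a toy model in plain Java. This is only an illustration under assumptions. `ScopedCounterSketch`, `inc`, `perStep` and `total` are hypothetical names invented here, not Beam APIs, and the sketch says nothing about how any runner actually stores or aggregates metrics.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these names belong to the Beam SDK.
final class ScopedCounterSketch {
  // step name -> (namespace, name) -> current counter value
  private final Map<String, Map<List<String>, Long>> valuesByStep = new LinkedHashMap<>();

  // Adds delta to a counter. The counter is created on first use,
  // so nothing has to be declared in advance.
  void inc(String step, String namespace, String name, long delta) {
    valuesByStep
        .computeIfAbsent(step, s -> new LinkedHashMap<>())
        .merge(List.of(namespace, name), delta, Long::sum);
  }

  // Returns the value each step reported for one (namespace, name) pair.
  Map<String, Long> perStep(String namespace, String name) {
    List<String> key = List.of(namespace, name);
    Map<String, Long> result = new LinkedHashMap<>();
    for (Map.Entry<String, Map<List<String>, Long>> entry : valuesByStep.entrySet()) {
      Long value = entry.getValue().get(key);
      if (value != null) {
        result.put(entry.getKey(), value);
      }
    }
    return result;
  }

  // Sums the same metric across every step that reported it.
  long total(String namespace, String name) {
    long sum = 0;
    for (long value : perStep(namespace, name).values()) {
      sum += value;
    }
    return sum;
  }

  public static void main(String[] args) {
    ScopedCounterSketch metrics = new ScopedCounterSketch();
    metrics.inc("ParseStep", "namespace", "counter1", 1);
    metrics.inc("ParseStep", "namespace", "counter1", 1);
    metrics.inc("WriteStep", "namespace", "counter1", 1);

    System.out.println(metrics.perStep("namespace", "counter1")); // {ParseStep=2, WriteStep=1}
    System.out.println(metrics.total("namespace", "counter1"));   // 3
  }
}
```

Keying on the step first and the (namespace, name) pair second keeps the two lookups from the text cheap to express: one value per step, or one sum over all steps.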
----------------------------------------------------------------
Issue Time Tracking
-------------------
Worklog Id: (was: 297207)
Time Spent: 2h 10m (was: 2h)
> Metrics documentation
> ---------------------
>
> Key: BEAM-1974
> URL: https://issues.apache.org/jira/browse/BEAM-1974
> Project: Beam
> Issue Type: Improvement
> Components: website
> Reporter: Aviem Zur
> Assignee: Alexey Romanenko
> Priority: Major
> Time Spent: 2h 10m
> Remaining Estimate: 0h
>
> Document metrics API and uses (make sure to remark that it is still
> experimental).