[ 
https://issues.apache.org/jira/browse/BEAM-1974?focusedWorklogId=296326&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296326
 ]

ASF GitHub Bot logged work on BEAM-1974:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 16/Aug/19 15:06
            Start Date: 16/Aug/19 15:06
    Worklog Time Spent: 10m 
      Work Description: echauchot commented on pull request #9328: [BEAM-1974] 
Add Metrics user-oriented documentation
URL: https://github.com/apache/beam/pull/9328#discussion_r314757953
 
 

 ##########
 File path: website/src/documentation/programming-guide.md
 ##########
 @@ -2907,3 +2907,142 @@ elements, or after a minute.
 ```py
 {% github_sample 
/apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/snippets_test.py
 tag:model_other_composite_triggers
 %}```
+
+## 9. Metrics {#metrics}
+In the Beam model, metrics provide some insight into the current state of a 
user pipeline, 
+potentially while the pipeline is running. There could be different reasons 
for that, for instance:
+*   Check the number of errors encountered while running a specific step in 
the pipeline;
+*   Monitor the number of RPCs made to a backend service;
+*   Retrieve an accurate count of the number of elements that have been 
processed;
+*   ...and so on.
+
+### 9.1 The main concepts of Beam metrics
+*   **Named**. Each metric has a name which consists of a namespace and an 
actual name. The 
+    namespace can be used to differentiate between multiple metrics with the 
same name and also 
+    allows querying for all metrics within a specific namespace. 
+*   **Scoped**. Each metric is reported against a specific step in the 
pipeline, indicating what 
+    code was running when the metric was incremented.
+*   **Dynamically Created**. Metrics may be created during runtime without 
pre-declaring them, in 
+    much the same way a logger could be created. This makes it easier to 
produce metrics in utility 
+    code and have them usefully reported. 
+*   **Degrade Gracefully**. If a runner doesn’t support some part of reporting 
metrics, the 
+    fallback behavior is to drop the metric updates rather than failing the 
pipeline. If a runner 
+    doesn’t support some part of querying metrics, the runner will not return 
the associated data.
+
+Reported metrics are implicitly scoped to the transform within the pipeline 
that reported them. 
+This allows reporting the same metric name in multiple places and identifying 
the value each 
+transform reported, as well as aggregating the metric across the entire 
pipeline.
+
+> **Note:** It is runner-dependent whether metrics are accessible during 
pipeline execution or only 
+after jobs have completed.
+
+### 9.2 Types of metrics {#types-of-metrics}
+There are three types of metrics that are supported for the moment: `Counter`, 
`Distribution` and 
+`Gauge`.
+
+**Counter**: A metric that reports a single long value and can be incremented 
or decremented.
+
+```java
+Counter counter = Metrics.counter("namespace", "counter1");
+
+@ProcessElement
+public void processElement(ProcessContext context) {
+  // count the elements
+  counter.inc();
+  ...
+}
+```
+
+**Distribution**: A metric that reports information about the distribution of 
reported values.
+
+```java
+Distribution distribution = Metrics.distribution("namespace", "distribution1");
+
+@ProcessElement
+public void processElement(ProcessContext context) {
+  Integer element = context.element();
+  // create a distribution (histogram) of the values
+  distribution.update(element);
+  ...
+}
+```
+
+**Gauge**: A metric that reports the latest value out of reported values. 
Since metrics are 
+collected from many workers the value may not be the absolute last, but one of 
the latest values.
+
+```java
+Gauge gauge = Metrics.gauge("namespace", "gauge1");
+
+@ProcessElement
+public void processElement(ProcessContext context) {
+  Integer element = context.element();
+  // create a gauge (latest value received) of the values 
+  gauge.set(element);
+  ...
+}
+```
+
+### 9.3 Querying metrics {#querying-metrics}
+`PipelineResult` has a method `metrics()` which returns a `MetricResults` 
object that allows 
+accessing metrics. The main method available in `MetricResults` allows 
querying for all metrics 
+matching a given filter.
+
+```java
+public interface PipelineResult {
+  MetricResults metrics();
+}
+
+public abstract class MetricResults {
+  public abstract MetricQueryResults queryMetrics(@Nullable MetricsFilter 
filter);
+}
+
+public interface MetricQueryResults {
+  Iterable<MetricResult<Long>> getCounters();
+  Iterable<MetricResult<DistributionResult>> getDistributions();
+  Iterable<MetricResult<GaugeResult>> getGauges();
+}
+
+public interface MetricResult<T> {
+  MetricName getName();
+  String getStep();
+  T getCommitted();
+  T getAttempted();
+}
+```
+
+### 9.4 Using metrics in pipeline {#using-metrics}
+Below, there is a simple example of how to use a `Counter` metric in a user 
pipeline.
+
+```java
+// creating a pipeline with custom metrics DoFn
+pipeline
+    .apply(...)
+    .apply(ParDo.of(new MyMetricsDoFn()));
+
+pipelineResult = pipeline.run().waitUntilFinish(...);
+
+// query metric by namespace and metric name
+MetricQueryResults metrics =
+    pipelineResult
+        .metrics()
+        .queryMetrics(
+            MetricsFilter.builder()
+                .addNameFilter(MetricNameFilter.named("namespace", "counter1"))
+                .build());
+
+// Find and print the queried counter:
+for (MetricResult<Long> counter: metrics.getCounters()) {
 
 Review comment:
   Only one counter is defined, so I think the for loop is misleading. Either 
remove it or, to avoid .getCounters().get(0), add a comment saying that only 
one line should be printed.
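
   A rough sketch of a third option, in case it helps the discussion. Since 
`getCounters()` returns an `Iterable`, `.get(0)` is not directly available 
anyway. A small helper could make the "exactly one result" expectation 
explicit. The `SingleResult` class and `onlyElement` method below are 
hypothetical names, not part of the PR or of the Beam API, and a plain list of 
`Long` values stands in for the queried counters.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class SingleResult {
  // Hypothetical helper (not a Beam API): returns the only element of an
  // Iterable, or throws if it holds zero elements or more than one.
  static <T> T onlyElement(Iterable<T> items) {
    Iterator<T> it = items.iterator();
    if (!it.hasNext()) {
      throw new IllegalStateException("expected exactly one element, found none");
    }
    T first = it.next();
    if (it.hasNext()) {
      throw new IllegalStateException("expected exactly one element, found several");
    }
    return first;
  }

  public static void main(String[] args) {
    // Stand-in for the result of metrics.getCounters(): one counter value.
    List<Long> counters = Arrays.asList(42L);
    System.out.println("counter1: " + onlyElement(counters));
  }
}
```

   With something like this, the documentation example could read the single 
counter without a loop. Keeping the loop and adding a comment is still the 
smaller change.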
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 296326)
    Time Spent: 1.5h  (was: 1h 20m)

> Metrics documentation
> ---------------------
>
>                 Key: BEAM-1974
>                 URL: https://issues.apache.org/jira/browse/BEAM-1974
>             Project: Beam
>          Issue Type: Improvement
>          Components: website
>            Reporter: Aviem Zur
>            Assignee: Alexey Romanenko
>            Priority: Major
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Document metrics API and uses (make sure to remark that it is still 
> experimental).



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
