[jira] [Work logged] (BEAM-1974) Metrics documentation
[ https://issues.apache.org/jira/browse/BEAM-1974?focusedWorklogId=297682=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297682 ]

ASF GitHub Bot logged work on BEAM-1974:

Author: ASF GitHub Bot
Created on: 20/Aug/19 08:16
Start Date: 20/Aug/19 08:16
Worklog Time Spent: 10m
Work Description: aromanenko-dev commented on issue #9328: [BEAM-1974] Add Metrics user-oriented documentation
URL: https://github.com/apache/beam/pull/9328#issuecomment-522907257

@echauchot Thanks for review. I squashed and self-merged

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 297682)
Time Spent: 2h 40m (was: 2.5h)

> Metrics documentation
> ---------------------
>
> Key: BEAM-1974
> URL: https://issues.apache.org/jira/browse/BEAM-1974
> Project: Beam
> Issue Type: Improvement
> Components: website
> Reporter: Aviem Zur
> Assignee: Alexey Romanenko
> Priority: Major
> Time Spent: 2h 40m
> Remaining Estimate: 0h
>
> Document metrics API and uses (make sure to remark that it is still
> experimental).

--
This message was sent by Atlassian Jira
(v8.3.2#803003)
[jira] [Work logged] (BEAM-1974) Metrics documentation
[ https://issues.apache.org/jira/browse/BEAM-1974?focusedWorklogId=297681=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297681 ]

ASF GitHub Bot logged work on BEAM-1974:

Author: ASF GitHub Bot
Created on: 20/Aug/19 08:15
Start Date: 20/Aug/19 08:15
Worklog Time Spent: 10m
Work Description: aromanenko-dev commented on pull request #9328: [BEAM-1974] Add Metrics user-oriented documentation
URL: https://github.com/apache/beam/pull/9328

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 297681)
Time Spent: 2.5h (was: 2h 20m)

> Metrics documentation
> ---------------------
>
> Key: BEAM-1974
> URL: https://issues.apache.org/jira/browse/BEAM-1974
> Project: Beam
> Issue Type: Improvement
> Components: website
> Reporter: Aviem Zur
> Assignee: Alexey Romanenko
> Priority: Major
> Time Spent: 2.5h
> Remaining Estimate: 0h
>
> Document metrics API and uses (make sure to remark that it is still
> experimental).

--
This message was sent by Atlassian Jira
(v8.3.2#803003)
[jira] [Work logged] (BEAM-1974) Metrics documentation
[ https://issues.apache.org/jira/browse/BEAM-1974?focusedWorklogId=297208=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297208 ]

ASF GitHub Bot logged work on BEAM-1974:

Author: ASF GitHub Bot
Created on: 19/Aug/19 15:24
Start Date: 19/Aug/19 15:24
Worklog Time Spent: 10m
Work Description: echauchot commented on pull request #9328: [BEAM-1974] Add Metrics user-oriented documentation
URL: https://github.com/apache/beam/pull/9328#discussion_r315270951

## File path: website/src/documentation/programming-guide.md ##
@@ -2907,3 +2907,143 @@ elements, or after a minute.
 ```py
 {% github_sample /apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/snippets_test.py tag:model_other_composite_triggers %}```
+
+## 9. Metrics {#metrics}
+In the Beam model, metrics provide some insight into the current state of a user pipeline,
+potentially while the pipeline is running. There could be different reasons for that, for instance:
+* Check the number of errors encountered while running a specific step in the pipeline;
+* Monitor the number of RPCs made to backend service;
+* Retrieve an accurate count of the number of elements that have been processed;
+* ...and so on.
+
+### 9.1 The main concepts of Beam metrics
+* **Named**. Each metric has a name which consists of a namespace and an actual name. The
+namespace can be used to differentiate between multiple metrics with the same name and also
+allows querying for all metrics within a specific namespace.
+* **Scoped**. Each metric is reported against a specific step in the pipeline, indicating what
+code was running when the metric was incremented.
+* **Dynamically Created**. Metrics may be created during runtime without pre-declaring them, in
+much the same way a logger could be created. This makes it easier to produce metrics in utility
+code and have them usefully reported.
+* **Degrade Gracefully**. If a runner doesn’t support some part of reporting metrics, the
+fallback behavior is to drop the metric updates rather than failing the pipeline. If a runner
+doesn’t support some part of querying metrics, the runner will not return the associated data.
+
+Reported metrics are implicitly scoped to the transform within the pipeline that reported them.
+This allows reporting the same metric name in multiple places and identifying the value each
+transform reported, as well as aggregating the metric across the entire pipeline.
+
+> **Note:** It is runner-dependent whether metrics are accessible during pipeline execution or only
+after jobs have completed.
+
+### 9.2 Types of metrics {#types-of-metrics}
+There are three types of metrics that are supported for the moment: `Counter`, `Distribution` and
+`Gauge`.
+
+**Counter**: A metric that reports a single long value and can be incremented or decremented.
+
+```java
+Counter counter = Metrics.counter("namespace", "counter1");
+
+@ProcessElement
+public void processElement(ProcessContext context) {
+  // count the elements
+  counter.inc();
+  ...
+}
+```
+
+**Distribution**: A metric that reports information about the distribution of reported values.
+
+```java
+Distribution distribution = Metrics.distribution("namespace", "distribution1");
+
+@ProcessElement
+public void processElement(ProcessContext context) {
+  Integer element = context.element();
+  // create a distribution (histogram) of the values
+  distribution.update(element);
+  ...
+}
+```
+
+**Gauge**: A metric that reports the latest value out of reported values. Since metrics are
+collected from many workers the value may not be the absolute last, but one of the latest values.
+
+```java
+Gauge gauge = Metrics.gauge("namespace", "gauge1");
+
+@ProcessElement
+public void processElement(ProcessContext context) {
+  Integer element = context.element();
+  // create a gauge (latest value received) of the values
+  gauge.set(element);
+  ...
+}
+```
+
+### 9.3 Querying metrics {#querying-metrics}
+`PipelineResult` has a method `metrics()` which returns a `MetricResults` object that allows
+accessing metrics. The main method available in `MetricResults` allows querying for all metrics
+matching a given filter.
+
+```java
+public interface PipelineResult {
+  MetricResults metrics();
+}
+
+public abstract class MetricResults {
+  public abstract MetricQueryResults queryMetrics(@Nullable MetricsFilter filter);
+}
+
+public interface MetricQueryResults {
+  Iterable<MetricResult<Long>> getCounters();
+  Iterable<MetricResult<DistributionResult>> getDistributions();
+  Iterable<MetricResult<GaugeResult>> getGauges();
+}
+
+public interface MetricResult<T> {
+  MetricName getName();
+  String getStep();
+  T getCommitted();
+  T getAttempted();
+}
+```
+
+### 9.4 Using metrics in pipeline {#using-metrics}
+Below, there is a simple example of how to use a `Counter` metric in a user
[jira] [Work logged] (BEAM-1974) Metrics documentation
[ https://issues.apache.org/jira/browse/BEAM-1974?focusedWorklogId=297209=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297209 ]

ASF GitHub Bot logged work on BEAM-1974:

Author: ASF GitHub Bot
Created on: 19/Aug/19 15:24
Start Date: 19/Aug/19 15:24
Worklog Time Spent: 10m
Work Description: echauchot commented on pull request #9328: [BEAM-1974] Add Metrics user-oriented documentation
URL: https://github.com/apache/beam/pull/9328#discussion_r315270650

## File path: website/src/documentation/programming-guide.md ##
@@ -2907,3 +2907,143 @@ elements, or after a minute.
[jira] [Work logged] (BEAM-1974) Metrics documentation
[ https://issues.apache.org/jira/browse/BEAM-1974?focusedWorklogId=297207=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-297207 ]

ASF GitHub Bot logged work on BEAM-1974:

Author: ASF GitHub Bot
Created on: 19/Aug/19 15:24
Start Date: 19/Aug/19 15:24
Worklog Time Spent: 10m
Work Description: echauchot commented on pull request #9328: [BEAM-1974] Add Metrics user-oriented documentation
URL: https://github.com/apache/beam/pull/9328#discussion_r315270391

## File path: website/src/documentation/programming-guide.md ##
@@ -2907,3 +2907,143 @@ elements, or after a minute.
[jira] [Work logged] (BEAM-1974) Metrics documentation
[ https://issues.apache.org/jira/browse/BEAM-1974?focusedWorklogId=296326=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296326 ]

ASF GitHub Bot logged work on BEAM-1974:

Author: ASF GitHub Bot
Created on: 16/Aug/19 15:06
Start Date: 16/Aug/19 15:06
Worklog Time Spent: 10m
Work Description: echauchot commented on pull request #9328: [BEAM-1974] Add Metrics user-oriented documentation
URL: https://github.com/apache/beam/pull/9328#discussion_r314757953

## File path: website/src/documentation/programming-guide.md ##
@@ -2907,3 +2907,142 @@ elements, or after a minute.
[jira] [Work logged] (BEAM-1974) Metrics documentation
[ https://issues.apache.org/jira/browse/BEAM-1974?focusedWorklogId=296319=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296319 ]

ASF GitHub Bot logged work on BEAM-1974:

Author: ASF GitHub Bot
Created on: 16/Aug/19 14:42
Start Date: 16/Aug/19 14:42
Worklog Time Spent: 10m
Work Description: echauchot commented on issue #9328: [BEAM-1974] Add Metrics user-oriented documentation
URL: https://github.com/apache/beam/pull/9328#issuecomment-522033769

@aromanenko-dev taking a look. Thanks for your work and thanks for pinging me :)

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 296319)
Time Spent: 1h 20m (was: 1h 10m)

> Metrics documentation
> ---------------------
>
> Key: BEAM-1974
> URL: https://issues.apache.org/jira/browse/BEAM-1974
> Project: Beam
> Issue Type: Improvement
> Components: website
> Reporter: Aviem Zur
> Assignee: Alexey Romanenko
> Priority: Major
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> Document metrics API and uses (make sure to remark that it is still
> experimental).

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
[jira] [Work logged] (BEAM-1974) Metrics documentation
[ https://issues.apache.org/jira/browse/BEAM-1974?focusedWorklogId=296288=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296288 ]

ASF GitHub Bot logged work on BEAM-1974:

Author: ASF GitHub Bot
Created on: 16/Aug/19 13:28
Start Date: 16/Aug/19 13:28
Worklog Time Spent: 10m
Work Description: aromanenko-dev commented on issue #9328: [BEAM-1974] Add Metrics user-oriented documentation
URL: https://github.com/apache/beam/pull/9328#issuecomment-522008606

@RyanSkraba Thanks for taking a look, I addressed your comments.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 296288)
Time Spent: 1h 10m (was: 1h)

> Metrics documentation
> ---------------------
>
> Key: BEAM-1974
> URL: https://issues.apache.org/jira/browse/BEAM-1974
> Project: Beam
> Issue Type: Improvement
> Components: website
> Reporter: Aviem Zur
> Assignee: Alexey Romanenko
> Priority: Major
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> Document metrics API and uses (make sure to remark that it is still
> experimental).

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
[jira] [Work logged] (BEAM-1974) Metrics documentation
[ https://issues.apache.org/jira/browse/BEAM-1974?focusedWorklogId=296266=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296266 ]

ASF GitHub Bot logged work on BEAM-1974:

Author: ASF GitHub Bot
Created on: 16/Aug/19 12:47
Start Date: 16/Aug/19 12:47
Worklog Time Spent: 10m
Work Description: RyanSkraba commented on pull request #9328: [BEAM-1974] Add Metrics user-oriented documentation
URL: https://github.com/apache/beam/pull/9328#discussion_r314705104

## File path: website/src/documentation/programming-guide.md ##
@@ -2907,3 +2907,137 @@ elements, or after a minute.
 ```py
 {% github_sample /apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/snippets_test.py tag:model_other_composite_triggers %}```
+
+## 9. Metrics {#metrics}
+The Beam model provides a way to use different types of metrics in a user pipeline. There could be
+different reasons for that, for instance:
+* Check the number of errors encountered while running a specific step in the pipeline;
+* Monitor the number of RPCs made to backend service;
+* Retrieve an accurate count of the number of elements that have been processed;
+* ...and so on.
+
+### 9.1 The main concepts of Beam metrics
+* **Named**. Each metric has a name which consists of a namespace and an actual name. The
+namespace can be used to differentiate between multiple metrics with the same name and also
+allows querying for all metrics within a specific namespace.
+* **Scoped**. Each metric is reported against a specific step in the pipeline, indicating what
+code was running when the metric was incremented.
+* **Dynamically Created**. Metrics may be created during runtime without pre-declaring them, in
+much the same way a logger could be created. This makes it easier to produce metrics in utility
+code and have them usefully reported.
+* **Degrade Gracefully**. If a runner doesn’t support some part of reporting metrics, the
+fallback behavior is to drop the metric updates rather than failing the pipeline. If a runner
+doesn’t support some part of querying metrics, the runner will not return the associated data.
+
+Reported metrics are implicitly scoped to the transform within the pipeline that reported them.
+This allows reporting the same metric name in multiple places and identifying the value each
+transform reported, as well as aggregating the metric across
+
+> **Note:** It is runner-dependent whether metrics are accessible during pipeline execution or only
+after jobs have completed.
+
+### 9.2 Types of metrics {#types-of-metrics}
+There are three types of metrics that are supported for the moment: `Counter`, `Distribution` and
+`Gauge`.
+
+**Counter**: A metric that reports a single long value and can be incremented or decremented.
+
+```java
+Counter counter = Metrics.counter("namespace", "counter1");
+
+@ProcessElement
+public void processElement(ProcessContext context) {
+  // count the elements
+  counter.inc();
+  ...
+}
+```
+
+**Distribution**: A metric that reports information about the distribution of reported values.
+
+```java
+Distribution distribution = Metrics.distribution("namespace", "distribution1");
+
+@ProcessElement
+public void processElement(ProcessContext context) {
+  Integer element = context.element();
+  // create a distribution (histogram) of the values
+  distribution.update(element);
+  ...
+}
+```
+
+**Gauge**: A metric that reports the latest value out of reported values. Since metrics are
+collected from many workers the value may not be the absolute last, but one of the latest values.
+
+```java
+Gauge gauge = Metrics.gauge("namespace", "gauge1");
+
+@ProcessElement
+public void processElement(ProcessContext context) {
+  Integer element = context.element();
+  // create a gauge (latest value received) of the values
+  gauge.set(element);
+  ...
+}
+```
+
+### 9.3 Querying metrics {#querying-metrics}
+`PipelineResult` has a method `metrics()` which returns a `MetricResults` object that allows
+accessing metrics. The main method available in `MetricResults` allows querying for all metrics
+matching a given filter.
+
+```java
+public interface PipelineResult {
+  MetricResults metrics();
+}
+
+public abstract class MetricResults {
+  public abstract MetricQueryResults queryMetrics(@Nullable MetricsFilter filter);
+}
+
+public interface MetricQueryResults {
+  Iterable<MetricResult<Long>> getCounters();
+  Iterable<MetricResult<DistributionResult>> getDistributions();
+  Iterable<MetricResult<GaugeResult>> getGauges();
+}
+
+public interface MetricResult<T> {
+  MetricName getName();
+  String getStep();
+  T getCommitted();
+  T getAttempted();
+}
+```
+
+### 9.4 Using metrics in pipeline {#using-metrics}
+Below, there is a simple example of how to use `Counter` metric in user pipeline.
+
+```java
+// creating a pipeline with custom metrics DoFn
[jira] [Work logged] (BEAM-1974) Metrics documentation
[ https://issues.apache.org/jira/browse/BEAM-1974?focusedWorklogId=296264=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296264 ]

ASF GitHub Bot logged work on BEAM-1974:

Author: ASF GitHub Bot
Created on: 16/Aug/19 12:47
Start Date: 16/Aug/19 12:47
Worklog Time Spent: 10m
Work Description: RyanSkraba commented on pull request #9328: [BEAM-1974] Add Metrics user-oriented documentation
URL: https://github.com/apache/beam/pull/9328#discussion_r314703039

## File path: website/src/documentation/programming-guide.md ##
@@ -2907,3 +2907,137 @@ elements, or after a minute.
 ```py
 {% github_sample /apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/snippets_test.py tag:model_other_composite_triggers %}```
+
+## 9. Metrics {#metrics}
+The Beam model provides a way to use different types of metrics in a user pipeline. There could be
+different reasons for that, for instance:
+* Check the number of errors encountered while running a specific step in the pipeline;
+* Monitor the number of RPCs made to backend service;
+* Retrieve an accurate count of the number of elements that have been processed;
+* ...and so on.
+
+### 9.1 The main concepts of Beam metrics
+* **Named**. Each metric has a name which consists of a namespace and an actual name. The
+namespace can be used to differentiate between multiple metrics with the same name and also
+allows querying for all metrics within a specific namespace.
+* **Scoped**. Each metric is reported against a specific step in the pipeline, indicating what
+code was running when the metric was incremented.
+* **Dynamically Created**. Metrics may be created during runtime without pre-declaring them, in
+much the same way a logger could be created. This makes it easier to produce metrics in utility
+code and have them usefully reported.
+* **Degrade Gracefully**. If a runner doesn’t support some part of reporting metrics, the
+fallback behavior is to drop the metric updates rather than failing the pipeline. If a runner
+doesn’t support some part of querying metrics, the runner will not return the associated data.
+
+Reported metrics are implicitly scoped to the transform within the pipeline that reported them.
+This allows reporting the same metric name in multiple places and identifying the value each
+transform reported, as well as aggregating the metric across

Review comment:
```suggestion
transform reported, as well as aggregating the metric across the entire pipeline.
```

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 296264)
Time Spent: 40m (was: 0.5h)

> Metrics documentation
> ---------------------
>
> Key: BEAM-1974
> URL: https://issues.apache.org/jira/browse/BEAM-1974
> Project: Beam
> Issue Type: Improvement
> Components: website
> Reporter: Aviem Zur
> Assignee: Alexey Romanenko
> Priority: Major
> Time Spent: 40m
> Remaining Estimate: 0h
>
> Document metrics API and uses (make sure to remark that it is still
> experimental).

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
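For readers following the review comment above, here is a minimal sketch of what "scoped to the transform" and "aggregating the metric across the entire pipeline" could mean in practice. It is a toy model in plain Java, not Beam code: the class `ScopedCounterSketch`, its methods, and the step names `ParseLines` and `WriteOutput` are all hypothetical and only illustrate the idea that the same namespace and name can be reported from several steps, read back per step, and summed over the pipeline.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Toy model (NOT the Beam API): counters keyed by the step that reported them
 * plus a namespace and a name, mirroring the "Named" and "Scoped" concepts.
 */
class ScopedCounterSketch {
    // step -> ("namespace:name" -> value); every name here is hypothetical.
    private final Map<String, Map<String, Long>> byStep = new LinkedHashMap<>();

    /** Adds n to the counter (namespace, name) as reported by the given step. */
    public void inc(String step, String namespace, String name, long n) {
        byStep.computeIfAbsent(step, s -> new LinkedHashMap<>())
              .merge(namespace + ":" + name, n, Long::sum);
    }

    /** Value reported by one step, or 0 if that step never reported it. */
    public long valueAt(String step, String namespace, String name) {
        return byStep.getOrDefault(step, Collections.<String, Long>emptyMap())
                     .getOrDefault(namespace + ":" + name, 0L);
    }

    /** Sum of the same metric name over every step in the pipeline. */
    public long pipelineTotal(String namespace, String name) {
        long total = 0;
        for (Map<String, Long> counters : byStep.values()) {
            total += counters.getOrDefault(namespace + ":" + name, 0L);
        }
        return total;
    }

    public static void main(String[] args) {
        ScopedCounterSketch metrics = new ScopedCounterSketch();
        metrics.inc("ParseLines", "namespace", "errors", 2);
        metrics.inc("WriteOutput", "namespace", "errors", 1);
        metrics.inc("ParseLines", "namespace", "errors", 3);

        System.out.println(metrics.valueAt("ParseLines", "namespace", "errors"));  // 5
        System.out.println(metrics.valueAt("WriteOutput", "namespace", "errors")); // 1
        System.out.println(metrics.pipelineTotal("namespace", "errors"));          // 6
    }
}
```

In this sketch the step name is passed explicitly, whereas the quoted documentation says the scoping is implicit; the explicit argument only stands in for whatever the runner tracks on the user's behalf.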
[jira] [Work logged] (BEAM-1974) Metrics documentation
[ https://issues.apache.org/jira/browse/BEAM-1974?focusedWorklogId=296265=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296265 ]

ASF GitHub Bot logged work on BEAM-1974:

    Author: ASF GitHub Bot
    Created on: 16/Aug/19 12:47
    Start Date: 16/Aug/19 12:47
    Worklog Time Spent: 10m
    Work Description: RyanSkraba commented on pull request #9328: [BEAM-1974] Add Metrics user-oriented documentation
    URL: https://github.com/apache/beam/pull/9328#discussion_r314703910

 ## File path: website/src/documentation/programming-guide.md ##
 @@ -2907,3 +2907,137 @@ elements, or after a minute.
  ```py
  {% github_sample /apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/snippets_test.py tag:model_other_composite_triggers %}```
 +
 +## 9. Metrics {#metrics}
 +The Beam model provides a way to use different types of metrics in a user pipeline. There could be
 +different reasons for that, for instance:
 +* Check the number of errors encountered while running a specific step in the pipeline;
 +* Monitor the number of RPCs made to backend service;
 +* Retrieve an accurate count of the number of elements that have been processed;
 +* ...and so on.
 +
 +### 9.1 The main concepts of Beam metrics
 +* **Named**. Each metric has a name which consists of a namespace and an actual name. The
 +namespace can be used to differentiate between multiple metrics with the same name and also
 +allows querying for all metrics within a specific namespace.
 +* **Scoped**. Each metric is reported against a specific step in the pipeline, indicating what
 +code was running when the metric was incremented.
 +* **Dynamically Created**. Metrics may be created during runtime without pre-declaring them, in
 +much the same way a logger could be created. This makes it easier to produce metrics in utility
 +code and have them usefully reported.
 +* **Degrade Gracefully**. If a runner doesn’t support some part of reporting metrics, the
 +fallback behavior is to drop the metric updates rather than failing the pipeline. If a runner
 +doesn’t support some part of querying metrics, the runner will not return the associated data.
 +
 +Reported metrics are implicitly scoped to the transform within the pipeline that reported them.
 +This allows reporting the same metric name in multiple places and identifying the value each
 +transform reported, as well as aggregating the metric across
 +
 +> **Note:** It is runner-dependent whether metrics are accessible during pipeline execution or only
 +after jobs have completed.
 +
 +### 9.2 Types of metrics {#types-of-metrics}
 +There are three types of metrics that are supported for the moment: `Counter`, `Distribution` and
 +`Gauge`.
 +
 +**Counter**: A metric that reports a single long value and can be incremented or decremented.
 +
 +```java
 +Counter counter = Metrics.counter( "namespace", "counter1");
 +
 +@ProcessElement
 +public void processElement(ProcessContext context) {
 +  // count the elements
 +  counter.inc();
 +  ...
 +}
 +```
 +
 +**Distribution**: A metric that reports information about the distribution of reported values.
 +
 +```java
 +Distribution distribution = Metrics.distribution( "namespace", "distribution1");
 +
 +@ProcessElement
 +public void processElement(ProcessContext context) {
 +  Integer element = context.element();
 +  // create a distribution (histogram) of the values
 +  distribution.update(element);
 +  ...
 +}
 +```
 +
 +**Gauge**: A metric that reports the latest value out of reported values. Since metrics are
 +collected from many workers the value may not be the absolute last, but one of the latest values.
 +
 +```java
 +Gauge gauge = Metrics.gauge( "namespace", "gauge1");
 +
 +@ProcessElement
 +public void processElement(ProcessContext context) {
 +  Integer element = context.element();
 +  // create a gauge (latest value received) of the values
 +  gauge.set(element);
 +  ...
 +}
 +```
 +
 +### 9.3 Querying metrics {#querying-metrics}
 +`PipelineResult` has a method `metrics()` which returns a `MetricResults` object that allows
 +accessing metrics. The main method available in `MetricResults` allows querying for all metrics
 +matching a given filter.
 +
 +```java
 +public interface PipelineResult {
 +  MetricResults metrics();
 +}
 +
 +public abstract class MetricResults {
 +  public abstract MetricQueryResults queryMetrics(@Nullable MetricsFilter filter);
 +}
 +
 +public interface MetricQueryResults {
 +  Iterable<MetricResult<Long>> getCounters();
 +  Iterable<MetricResult<DistributionResult>> getDistributions();
 +  Iterable<MetricResult<GaugeResult>> getGauges();
 +}
 +
 +public interface MetricResult<T> {
 +  MetricName getName();
 +  String getStep();
 +  T getCommitted();
 +  T getAttempted();
 +}
 +```
 +
 +### 9.4 Using metrics in pipeline {#using-metrics}
 +Below, there is a simple example of how to use `Counter` metric in user pipeline.

Review comment:
 ```suggestion
 Below, there is a simple
[jira] [Work logged] (BEAM-1974) Metrics documentation
[ https://issues.apache.org/jira/browse/BEAM-1974?focusedWorklogId=296263=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-296263 ]

ASF GitHub Bot logged work on BEAM-1974:

    Author: ASF GitHub Bot
    Created on: 16/Aug/19 12:47
    Start Date: 16/Aug/19 12:47
    Worklog Time Spent: 10m
    Work Description: RyanSkraba commented on pull request #9328: [BEAM-1974] Add Metrics user-oriented documentation
    URL: https://github.com/apache/beam/pull/9328#discussion_r314702621

 ## File path: website/src/documentation/programming-guide.md ##
 @@ -2907,3 +2907,137 @@ elements, or after a minute.
  ```py
  {% github_sample /apache/beam/blob/master/sdks/python/apache_beam/examples/snippets/snippets_test.py tag:model_other_composite_triggers %}```
 +
 +## 9. Metrics {#metrics}
 +The Beam model provides a way to use different types of metrics in a user pipeline. There could be

Review comment:
 ```suggestion
 In the Beam model, metrics provide some insight into the current state of a user pipeline, potentially while the pipeline is running. There could be
 ```

 (Just a suggestion -- the first sentence should describe why metrics are interesting. The examples are a good illustration!)

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---
    Worklog Id: (was: 296263)
    Time Spent: 0.5h (was: 20m)

> Metrics documentation
> -
>
> Key: BEAM-1974
> URL: https://issues.apache.org/jira/browse/BEAM-1974
> Project: Beam
> Issue Type: Improvement
> Components: website
> Reporter: Aviem Zur
> Assignee: Alexey Romanenko
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Document metrics API and uses (make sure to remark that it is still
> experimental).

--
This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (BEAM-1974) Metrics documentation
[ https://issues.apache.org/jira/browse/BEAM-1974?focusedWorklogId=293932=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293932 ]

ASF GitHub Bot logged work on BEAM-1974:

    Author: ASF GitHub Bot
    Created on: 13/Aug/19 14:42
    Start Date: 13/Aug/19 14:42
    Worklog Time Spent: 10m
    Work Description: aromanenko-dev commented on issue #9328: [BEAM-1974] Add Metrics user-oriented documentation
    URL: https://github.com/apache/beam/pull/9328#issuecomment-520864493

R: @RyanSkraba @echauchot

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---
    Worklog Id: (was: 293932)
    Time Spent: 20m (was: 10m)

> Metrics documentation
> -
>
> Key: BEAM-1974
> URL: https://issues.apache.org/jira/browse/BEAM-1974
> Project: Beam
> Issue Type: Improvement
> Components: website
> Reporter: Aviem Zur
> Assignee: Alexey Romanenko
> Priority: Major
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Document metrics API and uses (make sure to remark that it is still
> experimental).

--
This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (BEAM-1974) Metrics documentation
[ https://issues.apache.org/jira/browse/BEAM-1974?focusedWorklogId=293930=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-293930 ]

ASF GitHub Bot logged work on BEAM-1974:

    Author: ASF GitHub Bot
    Created on: 13/Aug/19 14:39
    Start Date: 13/Aug/19 14:39
    Worklog Time Spent: 10m
    Work Description: aromanenko-dev commented on pull request #9328: [BEAM-1974] Add Metrics user-oriented documentation
    URL: https://github.com/apache/beam/pull/9328

The principal things and examples of using the Metrics API in a user pipeline.

Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

 - [x] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`).
 - [x] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
 - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).