This is an automated email from the ASF dual-hosted git repository.
yhu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git
The following commit(s) were added to refs/heads/master by this push:
new b855ec58df5 Replace ip (104.154.241.245, 35.193.202.176) with metrics.beam.apache.org (#27945)
b855ec58df5 is described below
commit b855ec58df5fd8257713597d71022c892856ca56
Author: liferoad <[email protected]>
AuthorDate: Mon Aug 14 09:46:43 2023 -0400
Replace ip (104.154.241.245, 35.193.202.176) with metrics.beam.apache.org (#27945)
* replace 104.154.241.245 with 35.193.202.176
* more changes
* use metrics.beam.apache.org
---------
Co-authored-by: xqhu <[email protected]>
---
.test-infra/metrics/src/test/groovy/ProberTests.groovy | 2 +-
sdks/python/apache_beam/testing/analyzers/README.md | 17 ++++++++++-------
.../apache_beam/testing/analyzers/tests_config.yaml | 12 ++++++------
3 files changed, 17 insertions(+), 14 deletions(-)
diff --git a/.test-infra/metrics/src/test/groovy/ProberTests.groovy b/.test-infra/metrics/src/test/groovy/ProberTests.groovy
index 5a44d4410a9..c5de9ca64c8 100644
--- a/.test-infra/metrics/src/test/groovy/ProberTests.groovy
+++ b/.test-infra/metrics/src/test/groovy/ProberTests.groovy
@@ -27,7 +27,7 @@ import static groovy.test.GroovyAssert.shouldFail
*/
class ProberTests {
// TODO: Make this configurable
- def grafanaEndpoint = 'http://35.193.202.176'
+ def grafanaEndpoint = 'http://metrics.beam.apache.org'
@Test
void PingGrafanaHttpApi() {
diff --git a/sdks/python/apache_beam/testing/analyzers/README.md b/sdks/python/apache_beam/testing/analyzers/README.md
index 71351fe3e57..6098c82fd54 100644
--- a/sdks/python/apache_beam/testing/analyzers/README.md
+++ b/sdks/python/apache_beam/testing/analyzers/README.md
@@ -19,7 +19,8 @@
# Performance alerts for Beam Python performance and load tests
-## Alerts
+## Alerts
+
Performance regressions or improvements detected with the [Change Point
Analysis](https://en.wikipedia.org/wiki/Change_detection) using
[edivisive](https://github.com/apache/beam/blob/0a91d139dea4276dc46176c4cdcdfce210fc50c4/.test-infra/jenkins/job_InferenceBenchmarkTests_Python.groovy#L30)
analyzer are automatically filed as Beam GitHub issues with a label
`perf-alert`.
@@ -32,7 +33,8 @@ If a performance alert is created on a test, a GitHub issue will be created and
URL, issue number along with the change point value and timestamp are exported
to BigQuery. This data will be used to analyze the next change point observed
on the same test to
update already created GitHub issue or ignore performance alert by not
creating GitHub issue to avoid duplicate issue creation.
-## Config file structure
+## Config file structure
+
The config file defines the structure to run change point analysis on a given
test. To add a test to the config file,
please follow the below structure.
@@ -73,21 +75,22 @@ Sometimes, the change point found might be way back in time and could be irrelevant
reported only when it was observed in the last 7 runs from the current run,
setting `num_runs_in_change_point_window=7` will achieve it.
-## Register a test for performance alerts
+## Register a test for performance alerts
If a new test needs to be registered for the performance alerting tool, please
add the required test parameters to the
config file.
## Triage performance alert issues
-All the performance/load tests metrics defined at [beam/.test-infra/jenkins](https://github.com/apache/beam/tree/master/.test-infra/jenkins) are imported to [Grafana dashboards](http://104.154.241.245/d/1/getting-started?orgId=1) for visualization. Please
+All the performance/load tests metrics defined at [beam/.test-infra/jenkins](https://github.com/apache/beam/tree/master/.test-infra/jenkins) are imported to [Grafana dashboards](http://metrics.beam.apache.org/d/1/getting-started?orgId=1) for visualization. Please
find the alerted test dashboard to find a spike in the metric values.
For example, for the below configuration,
-* test_target: `apache_beam.testing.benchmarks.inference.pytorch_image_classification_benchmarks`
-* metric_name: `mean_load_model_latency_milli_secs`
-Grafana dashboard can be found at http://104.154.241.245/d/ZpS8Uf44z/python-ml-runinference-benchmarks?orgId=1&viewPanel=7
+- test_target: `apache_beam.testing.benchmarks.inference.pytorch_image_classification_benchmarks`
+- metric_name: `mean_load_model_latency_milli_secs`
+
+Grafana dashboard can be found at http://metrics.beam.apache.org/d/ZpS8Uf44z/python-ml-runinference-benchmarks?orgId=1&viewPanel=7
If the dashboard for a test is not found, you can use the
notebook `analyze_metric_data.ipynb` to generate a plot for the given test,
metric_name.
diff --git a/sdks/python/apache_beam/testing/analyzers/tests_config.yaml b/sdks/python/apache_beam/testing/analyzers/tests_config.yaml
index e7741db93b0..bc74f292c48 100644
--- a/sdks/python/apache_beam/testing/analyzers/tests_config.yaml
+++ b/sdks/python/apache_beam/testing/analyzers/tests_config.yaml
@@ -22,7 +22,7 @@ pytorch_image_classification_benchmarks-resnet152-mean_inference_batch_latency_m
test_description:
Pytorch image classification on 50k images of size 224 x 224 with resnet 152.
Test link - https://github.com/apache/beam/blob/42d0a6e3564d8b9c5d912428a6de18fb22a13ac1/.test-infra/jenkins/job_InferenceBenchmarkTests_Python.groovy#L63
- Test dashboard - http://104.154.241.245/d/ZpS8Uf44z/python-ml-runinference-benchmarks?orgId=1&viewPanel=2
+ Test dashboard - http://metrics.beam.apache.org/d/ZpS8Uf44z/python-ml-runinference-benchmarks?orgId=1&viewPanel=2
test_target:
metrics_dataset: beam_run_inference
metrics_table: torch_inference_imagenet_results_resnet152
@@ -33,7 +33,7 @@ pytorch_image_classification_benchmarks-resnet101-mean_load_model_latency_milli_
test_description:
Pytorch image classification on 50k images of size 224 x 224 with resnet 101.
Test link - https://github.com/apache/beam/blob/42d0a6e3564d8b9c5d912428a6de18fb22a13ac1/.test-infra/jenkins/job_InferenceBenchmarkTests_Python.groovy#L34
- Test dashboard - http://104.154.241.245/d/ZpS8Uf44z/python-ml-runinference-benchmarks?orgId=1&viewPanel=7
+ Test dashboard - http://metrics.beam.apache.org/d/ZpS8Uf44z/python-ml-runinference-benchmarks?orgId=1&viewPanel=7
test_target: apache_beam.testing.benchmarks.inference.pytorch_image_classification_benchmarks
metrics_dataset: beam_run_inference
metrics_table: torch_inference_imagenet_results_resnet101
@@ -44,7 +44,7 @@ pytorch_image_classification_benchmarks-resnet101-mean_inference_batch_latency_m
test_description:
Pytorch image classification on 50k images of size 224 x 224 with resnet 101.
Test link - https://github.com/apache/beam/blob/42d0a6e3564d8b9c5d912428a6de18fb22a13ac1/.test-infra/jenkins/job_InferenceBenchmarkTests_Python.groovy#L34
- Test dashboard - http://104.154.241.245/d/ZpS8Uf44z/python-ml-runinference-benchmarks?orgId=1&viewPanel=2
+ Test dashboard - http://metrics.beam.apache.org/d/ZpS8Uf44z/python-ml-runinference-benchmarks?orgId=1&viewPanel=2
test_target: apache_beam.testing.benchmarks.inference.pytorch_image_classification_benchmarks
metrics_dataset: beam_run_inference
metrics_table: torch_inference_imagenet_results_resnet101
@@ -55,7 +55,7 @@ pytorch_image_classification_benchmarks-resnet152-GPU-mean_inference_batch_laten
test_description:
Pytorch image classification on 50k images of size 224 x 224 with resnet 152 with Tesla T4 GPU.
Test link - https://github.com/apache/beam/blob/42d0a6e3564d8b9c5d912428a6de18fb22a13ac1/.test-infra/jenkins/job_InferenceBenchmarkTests_Python.groovy#L151
- Test dashboard - http://104.154.241.245/d/ZpS8Uf44z/python-ml-runinference-benchmarks?orgId=1&viewPanel=7
+ Test dashboard - http://metrics.beam.apache.org/d/ZpS8Uf44z/python-ml-runinference-benchmarks?orgId=1&viewPanel=7
test_target: apache_beam.testing.benchmarks.inference.pytorch_image_classification_benchmarks
metrics_dataset: beam_run_inference
metrics_table: torch_inference_imagenet_results_resnet101
@@ -66,7 +66,7 @@ pytorch_image_classification_benchmarks-resnet152-GPU-mean_load_model_latency_mi
test_description:
Pytorch image classification on 50k images of size 224 x 224 with resnet 152 with Tesla T4 GPU.
Test link - https://github.com/apache/beam/blob/42d0a6e3564d8b9c5d912428a6de18fb22a13ac1/.test-infra/jenkins/job_InferenceBenchmarkTests_Python.groovy#L151
- Test dashboard - http://104.154.241.245/d/ZpS8Uf44z/python-ml-runinference-benchmarks?orgId=1&viewPanel=7
+ Test dashboard - http://metrics.beam.apache.org/d/ZpS8Uf44z/python-ml-runinference-benchmarks?orgId=1&viewPanel=7
test_target: apache_beam.testing.benchmarks.inference.pytorch_image_classification_benchmarks
metrics_dataset: beam_run_inference
metrics_table: torch_inference_imagenet_results_resnet152_tesla_t4
@@ -77,7 +77,7 @@ pytorch_image_classification_benchmarks-resnet152-GPU-mean_inference_batch_laten
test_description:
Pytorch image classification on 50k images of size 224 x 224 with resnet 152 with Tesla T4 GPU.
Test link - https://github.com/apache/beam/blob/42d0a6e3564d8b9c5d912428a6de18fb22a13ac1/.test-infra/jenkins/job_InferenceBenchmarkTests_Python.groovy#L151).
- Test dashboard - http://104.154.241.245/d/ZpS8Uf44z/python-ml-runinference-benchmarks?from=now-90d&to=now&viewPanel=2
+ Test dashboard - http://metrics.beam.apache.org/d/ZpS8Uf44z/python-ml-runinference-benchmarks?from=now-90d&to=now&viewPanel=2
test_target: apache_beam.testing.benchmarks.inference.pytorch_image_classification_benchmarks
metrics_dataset: beam_run_inference
metrics_table: torch_inference_imagenet_results_resnet152_tesla_t4
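Every dashboard link rewritten by this commit follows the same shape, `http://metrics.beam.apache.org/d/<uid>/<slug>?orgId=...&viewPanel=...`, with only the query parameters varying. A small illustrative helper (hypothetical, not part of the repo) that composes such links against the new hostname:

```python
from urllib.parse import urlencode

# New DNS name introduced by this commit, replacing the raw IPs.
GRAFANA_BASE = "http://metrics.beam.apache.org"

def dashboard_url(uid: str, slug: str, **params) -> str:
    """Compose a Grafana dashboard URL like those in tests_config.yaml."""
    url = f"{GRAFANA_BASE}/d/{uid}/{slug}"
    query = urlencode(params)
    return f"{url}?{query}" if query else url

# e.g. the RunInference benchmarks panel referenced throughout the config
print(dashboard_url("ZpS8Uf44z", "python-ml-runinference-benchmarks",
                    orgId=1, viewPanel=7))
```

Centralizing the hostname this way is the same idea as the commit itself: if the endpoint moves again, only one constant changes.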