Questions:

- Are the metrics available on metrics.beam.apache.org?
- What is the feature delta between using metrics.beam.apache.org (much better UI) and using apache-beam-testing.appspot.com?
- Can we notice regressions faster than release cadence?
- Can we get automated alerts?

Kenn

On Thu, Jul 9, 2020 at 10:21 AM Maximilian Michels <[email protected]> wrote:

> Hi,
>
> We recently saw an increase in latency migrating from Beam 2.18.0 to
> 2.21.0 (Python SDK with Flink Runner). This proved very hard to debug,
> and it looks like each version in between the two versions led to
> increased latency.
>
> This is not the first time we saw issues when migrating; another time we
> had a decline in checkpointing performance and thus added a
> checkpointing test [1] and dashboard [2] (see checkpointing widget).
>
> That makes me wonder if we should monitor performance (throughput /
> latency) for basic use cases as part of the release testing. Currently,
> our release guide [3] mentions running examples but not evaluating the
> performance. I think it would be good practice to check relevant charts
> with performance measurements as part of the release process. The
> release guide should reflect that.
>
> WDYT?
>
> -Max
>
> PS: Of course, this requires tests and metrics to be available. This PR
> adds latency measurements to the load tests [4].
>
>
> [1] https://github.com/apache/beam/pull/11558
> [2] https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056
> [3] https://beam.apache.org/contribute/release-guide/
> [4] https://github.com/apache/beam/pull/12065
