This is an automated email from the ASF dual-hosted git repository. mergebot-role pushed a commit to branch mergebot in repository https://gitbox.apache.org/repos/asf/beam-site.git
commit 6fe36cc22a3b67fa0a4d4e635a463646e607577f Author: Udi Meiri <[email protected]> AuthorDate: Thu Aug 2 18:20:51 2018 -0700 Add precommit policies and triage guide. Also update some paragraphs regarding precommits and postcommits in the testing guide. --- src/_includes/section-menu/contribute.html | 4 + src/contribute/precommit-policies.md | 66 +++++++++++++++ src/contribute/precommit-triage-guide.md | 125 ++++++++++++++++++++++++++++ src/contribute/testing.md | 51 +++++++----- src/images/precommit_durations.png | Bin 0 -> 45673 bytes src/images/precommit_graph_queuing_time.png | Bin 0 -> 25809 bytes 6 files changed, 224 insertions(+), 22 deletions(-) diff --git a/src/_includes/section-menu/contribute.html b/src/_includes/section-menu/contribute.html index 07affbc..7a70f62 100644 --- a/src/_includes/section-menu/contribute.html +++ b/src/_includes/section-menu/contribute.html @@ -25,6 +25,9 @@ <ul class="section-nav-list"> <li><a href="{{ site.baseurl }}/contribute/testing/">Testing guide</a></li> + <ul> + <li><a href="{{ site.baseurl }}/contribute/precommit-triage-guide/">Precommit Slowness Triage Guide</a></li> + </ul> <li><a href="{{ site.baseurl }}/contribute/ptransform-style-guide/">PTransform style guide</a></li> <li><a href="{{ site.baseurl }}/contribute/runner-guide/">Runner authoring guide</a></li> <li><a href="{{ site.baseurl }}/contribute/portability/">Portability Framework</a></li> @@ -36,6 +39,7 @@ <li> <span class="section-nav-list-title">Policies</span> <ul class="section-nav-list"> + <li><a href="{{ site.baseurl }}/contribute/precommit-policies/">Precommit test policies</a></li> <li><a href="{{ site.baseurl }}/contribute/postcommits-policies/">Post-commit tests policies</a></li> </ul> </li> diff --git a/src/contribute/precommit-policies.md b/src/contribute/precommit-policies.md new file mode 100644 index 0000000..7261283 --- /dev/null +++ b/src/contribute/precommit-policies.md @@ -0,0 +1,66 @@ +--- +layout: section +title: "Precommit Test 
Policies" +permalink: /contribute/precommit-policies/ +section_menu: section-menu/contribute.html +--- +<!-- +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + +http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +--> + +# Precommit test policies + +## Definitions + +- Precommit test - Any single test in a precommit test suite. +- Precommit test suite - A collection of precommit tests that have a common +denominator. A test suite runs in a single Jenkins job. Currently, suites are +grouped by SDK languages, e.g., Python, Java, and Go. + +## Policies + +### Pull Requests + +- A PR must pass precommit tests before being committed to the main Beam repo. + - The relevant precommit test suites are automatically launched according to + PR contents. + +### Problems + +#### Breakage + +Breakage is when one or more tests in a precommit test suite fails or +is flaky (occasionally fails). + +- Breakages should be fixed within 8 hours. + +#### Slowness + +Slowness is when the total time to run a precommit suite exceeds 30 minutes\*, +including the time the job spends in the Jenkins queue. + +- Slowness should be fixed within 24 hours. + +\* See the [Precommit Slowness Triage +Guide](/contribute/precommit-triage-guide/) for a precise definition of slowness +and for information on dealing with slowness. + +### Problem Resolution + +For any problem, the options are, one of: + +- Roll back the culprit PR. +- Roll out a fix within 24 hours. +- Disable the slow test or feature temporarily (make sure there's a tracking + issue to re-enable it). 
+ diff --git a/src/contribute/precommit-triage-guide.md b/src/contribute/precommit-triage-guide.md new file mode 100644 index 0000000..4fc67a8 --- /dev/null +++ b/src/contribute/precommit-triage-guide.md @@ -0,0 +1,125 @@ +--- +layout: section +title: "Precommit Slowness Triage Guide" +permalink: /contribute/precommit-triage-guide/ +section_menu: section-menu/contribute.html +--- +<!-- +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + +http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +--> + +# Precommit Slowness Triage Guide + +Beam precommit jobs are suites of tests run automatically on Jenkins build +machines for each pull request (PR) submitted to +[apache/beam](https://github.com/apache/beam). For more information and the +difference between precommits and postcommits, see +[testing](/contribute/testing/). + +## What are fast precommits? + +Precommit tests are required to pass before a pull request (PR) may be merged. +When these tests are slow they slow down Beam's development process. + +The aim is to have 95% of precommit jobs complete within 30 minutes +(failing or passing). +Technically, the 95th percentile of running time should be below 30 minutes over +the past 4 weeks, where running time is the duration of time the job spends in +the Jenkins queue + the actual time it spends running. + +## Determining Slowness + +The current method for determining if precommits are slow is to look at the +[Jupyter +notebook](https://github.com/apache/beam/tree/master/.test-infra/jupyter) +`precommit_job_times.ipynb`. + +Run the notebook. 
It should output a table with running times. The numbers in +the column `totalDurationMinutes_all` and the rows with a `job_name` like `4 +weeks 95th` contain the target numbers for determining slowness. +If any of these numbers are above 30, triaging is required. + +### Example +Here's an example table of running times: + + +In this example, Go precommits are taking approximately 14 minutes, which is +fast. Java and Python precommits are taking 78 and 32 minutes respectively, +which is slow. Both Java and Python precommits require triage. + +## Triage Process + +1. [Search for existing + issues](https://issues.apache.org/jira/issues/?filter=12344461) +1. Create a new issue if needed: [Apache + JIRA](https://issues.apache.org/jira/issues) + - Project: Beam + - Components: testing, anything else relevant + - Label: precommit + - Reference this page in the description. +1. Determine where the slowness is coming from and identify issues. Open + additional issues if needed (such as for multiple issues). +1. Assign the issue as appropriate, e.g., to the test's or PR's author. + +## Resolution + +It is expected that slowness is resolved promptly. See [precommit test +policies](/contribute/precommit-policies/) for details. + +## Possible Causes and Solutions + +This section lists some starting off points for fixing precommit slowness. + +### Jenkins + +Have a look at the graphs in the Jupyter notebook. Does the rise in total +duration match the rise in queuing time? If so, the slowness might be unrelated +to this specific precommit job. + +Example of when total and queuing durations rise and fall together (mostly): + + +Since Jenkins machines are a limited resource, other jobs can +affect precommit queueing times. Try to figure out if other jobs have been +recently slower, increased in frequency, or new jobs have been introduced. + +Another option is to look at adding more Jenkins machines. 
+ +### Slow individual tests + +Sometimes a precommit job is slowed down due to one or more tests. One way of +determining if this is the case is by looking at individual test timings. + +Where to find individual test timings: + +- Look at the `Gradle Build Scan` link on the precommit job's Jenkins page. This + page will contain individual test timings for Java tests only (2018-08). +- Look at the `Test Result` link on the precommit job's Jenkins page. This + should be available for Java and Python tests (2018-08). + +Sometimes tests can be made faster by refactoring. A test that spends a lot of +time waiting (such as an integration test) could be made to run concurrently with +the other tests. + +If a test is determined to be too slow to be part of precommit tests, it could +be removed from precommit and placed in postcommit instead (but it should be in +postcommit already). In addition, ensure that the code covered by the removed +test is covered by a unit test in precommit. + +### Slow integration tests + +Integration test slowdowns may be caused by dependent services. + +## References + +- [Beam Fast Precommits design doc](https://docs.google.com/document/d/1udtvggmS2LTMmdwjEtZCcUQy6aQAiYTI3OrTP8CLfJM/edit?usp=sharing) diff --git a/src/contribute/testing.md b/src/contribute/testing.md index ef0814b..301b931 100644 --- a/src/contribute/testing.md +++ b/src/contribute/testing.md @@ -26,30 +26,37 @@ systems at the bottom. ## Testing Scenarios -With the tools at our disposal, we have a good set of utilities which we can use -to verify Beam correctness. To ensure an ongoing high quality of code, we use -precommit and postcommit testing. +Ideally, all available tests should be run against a pull request (PR) before +it's allowed to be committed to Beam's [Github](https://github.com/apache/beam) +repo. This is not possible, however, due to a combination of time and resource +constraints. 
Running all tests for each PR would take hours or even days using +available resources, which would slow down development considerably. + +Thus tests are split into *precommit* and *postcommit* suites. Precommit is +fast, while postcommit is comprehensive. (Or at least that's the idea.) As their +names imply, precommit tests are run on each PR before it is committed, while +postcommits run periodically against the master branch (i.e. on already +committed PRs). + +Beam uses [Jenkins](https://builds.apache.org/view/A-D/view/Beam/) to run +precommit and postcommit tests. ### Precommit -For precommit testing, Beam uses -[Jenkins](https://builds.apache.org/view/A-D/view/Beam/) and a code coverage tool -called [Coveralls](https://coveralls.io/github/apache/beam), hooked up -to [Github](https://github.com/apache/beam), to ensure that pull -requests meet a certain quality bar. These precommits verify correctness via two -of the below testing tools: unit tests (with coverage monitored by Coveralls) -and E2E tests. We run the full slate of unit tests in precommit, ensuring -correctness at a basic level, and then run the WordCount E2E test in both batch -and streaming (coming soon!) against each supported SDK / runner combination as -a smoke test, to verify that a basic level of functionality exists. We think -that this hits the appropriate tradeoff between a desire for short (ideally -\<30m) precommit times and a desire to verify that pull requests going into Beam -function in the way in which they are intended. - -Precommit tests are kicked off when a user makes a Pull Request against the -`apache/beam` repository and the Jenkins and Coveralls statuses are displayed at -the bottom of the pull request page. Clicking on “Details” will open the status -page in the selected tool; there, test status and output can be viewed. +The precommit test suite verifies correctness via two testing tools: unit tests +and end-to-end (E2E) tests. 
Unit tests ensure correctness at a basic level, +while WordCount E2E tests are run against each supported SDK / runner +combination as a smoke test, to verify that a basic level of functionality +exists. + +This combination of tests hits the appropriate tradeoff between a desire for +short (ideally \<30m) precommit times and a desire to verify that PRs going into +Beam function in the way in which they are intended. + +Precommit jobs are kicked off when a contributor makes a PR against the +`apache/beam` repository. Job statuses are displayed at the bottom of the PR +page. Clicking on “Details” will open the status page in the selected tool; +there, test status and output can be viewed. ### Postcommit @@ -87,7 +94,7 @@ To run all unit tests, execute the following command in the ``sdks/python`` subdirectory ``` -python setup.py test [-s apache_beam.package.module.TestClass.test_method] +$ python setup.py test [-s apache_beam.package.module.TestClass.test_method] ``` We also provide a [tox](https://tox.readthedocs.io/en/latest/) configuration diff --git a/src/images/precommit_durations.png b/src/images/precommit_durations.png new file mode 100644 index 0000000..c659677 Binary files /dev/null and b/src/images/precommit_durations.png differ diff --git a/src/images/precommit_graph_queuing_time.png b/src/images/precommit_graph_queuing_time.png new file mode 100644 index 0000000..5082943 Binary files /dev/null and b/src/images/precommit_graph_queuing_time.png differ
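
Reviewer note on the slowness definition added in precommit-triage-guide.md: the rule (95th percentile of queue time plus run time over the past 4 weeks must stay below 30 minutes) can be illustrated with a small sketch. This is not the code in `precommit_job_times.ipynb`; the function names, the `(finished_at, queue_minutes, run_minutes)` tuple layout, and the nearest-rank percentile method are all assumptions made for illustration, and the notebook may compute the percentile differently.

```python
import math
from datetime import datetime, timedelta

# Values taken from the triage guide text in this commit.
SLOWNESS_THRESHOLD_MINUTES = 30
WINDOW = timedelta(weeks=4)


def percentile_nearest_rank(values, pct):
    """Return the pct-th percentile using the nearest-rank method.

    Assumption: nearest-rank is chosen for simplicity; the real notebook
    may interpolate between samples instead.
    """
    if not values:
        raise ValueError("no job durations in the window")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def precommit_p95(jobs, now):
    """Compute the 95th percentile of total duration and whether it is slow.

    `jobs` is a hypothetical iterable of
    (finished_at, queue_minutes, run_minutes) tuples.
    Total duration is queue time plus run time, as the guide defines it.
    """
    totals = [
        queue + run
        for finished_at, queue, run in jobs
        if timedelta(0) <= now - finished_at <= WINDOW
    ]
    p95 = percentile_nearest_rank(totals, 95)
    return p95, p95 > SLOWNESS_THRESHOLD_MINUTES
```

Called with the job history of one suite, `precommit_p95(jobs, datetime.utcnow())` returns the percentile in minutes and a flag saying whether the suite needs triage under the 30-minute rule.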
