[airflow] 08/44: Adds documentation about the optimized PR workflow (#12006)

potiuk Sat, 14 Nov 2020 08:43:23 -0800

This is an automated email from the ASF dual-hosted git repository.

potiuk pushed a commit to branch v1-10-test
in repository https://gitbox.apache.org/repos/asf/airflow.git


commit 80845847d5971e64a0b19a589a9998cf330abe07
Author: Jarek Potiuk <jarek.pot...@polidea.com>
AuthorDate: Sun Nov 1 00:20:38 2020 +0100

    Adds documentation about the optimized PR workflow (#12006)
    
    We recently had a lot of problems with queues in GitHub
    Actions. This documentation explains the motivation and approach
    we have taken for optimizing our PR workflow.
    
    (cherry picked from commit d85a31f2d88b92b480438d86aa8e3e79e6c3614d)
---
 CI.rst                                   | 102 ++----------
 PULL_REQUEST_WORKFLOW.rst                | 260 +++++++++++++++++++++++++++++++
 images/pr/pr-full-tests-needed.png       | Bin 0 -> 88512 bytes
 images/pr/pr-likely-ok-to-merge.png      | Bin 0 -> 98362 bytes
 images/pr/pr-no-tests-needed-comment.png | Bin 0 -> 80852 bytes
 images/pr/selective_checks.md5           |   1 +
 images/pr/selective_checks.mermaid       |  35 +++++
 images/pr/selective_checks.png           | Bin 0 -> 64501 bytes
 8 files changed, 309 insertions(+), 89 deletions(-)

diff --git a/CI.rst b/CI.rst
index dc1cdf2..f4b5294 100644
--- a/CI.rst
+++ b/CI.rst
@@ -35,7 +35,7 @@ the CI jobs. And we have  a number of variables determine build behaviour.
 
 
 
-Github Actions runs
+GitHub Actions runs
 -------------------
 
 Our builds on CI are highly optimized. They utilise some of the latest features provided by GitHub Actions
@@ -65,7 +65,7 @@ utilise the WRITE access to Apache Airflow repository via an external Pull Reque
 
 Thanks to the WRITE access and fact that the 'workflow_run' by default uses the 'master' version of the
 sources, we can safely run some logic there will checkout the incoming Pull Request, build the container
-image from the sources from the incoming PR and push such image to an Github Docker Registry - so that
+image from the sources from the incoming PR and push such image to a GitHub Docker Registry - so that
 this image can be built only once and used by all the jobs running tests. The image is tagged with unique
 ``RUN_ID`` of the incoming Pull Request and the tests run in the Pull Request can simply pull such image
 rather than build it from the scratch. Pulling such image takes ~ 1 minute, thanks to that we are saving
@@ -92,7 +92,7 @@ connected with the run.
 You can read more about it in `BREEZE.rst <BREEZE.rst>`_ and `TESTING.rst <TESTING.rst>`_
 
 
-Difference between local runs and Github Action workflows
+Difference between local runs and GitHub Action workflows
 ---------------------------------------------------------
 
 Depending whether the scripts are run locally (most often via `Breeze <BREEZE.rst>`_) or whether they
@@ -470,7 +470,13 @@ The main purpose of those jobs is to check if PR builds cleanly, if the test run
 the PR is ready to review and merge. The runs are using cached images from the Private GitHub registry -
 CI, Production Images as well as base Python images that are also cached in the Private GitHub registry.
 Also for those builds we only execute Python tests if important files changed (so for example if it is
-doc-only change, no tests will be executed.
+"no-code" change, no tests will be executed).
+
+The workflow involved in Pull Request review and approval is a bit more complex than the simple workflows
+in most other projects because we've implemented some optimizations related to efficient use
+of the queue slots we share with other Apache Software Foundation projects. More details about it
+can be found in `PULL_REQUEST_WORKFLOW.rst <PULL_REQUEST_WORKFLOW.rst>`_.
+
 
 Direct Push/Merge Run
 ---------------------
@@ -641,7 +647,7 @@ Comments:
 
  (1) CRON jobs builds images from scratch - to test if everything works properly for clean builds
  (2) The tests are run when the Trigger Tests job determine that important files change (this allows
-     for example doc-only changes to build much faster)
+     for example "no-code" changes to build much faster)
  (3) The jobs wait for CI images if ``GITHUB_REGISTRY_WAIT_FOR_IMAGE`` variable is set to "true".
      You can set it to "false" to disable using shared images - this is slower though as the images
      are rebuilt in every job that needs them. You can also set your own fork's secret
@@ -685,7 +691,7 @@ way to sync your fork master to the Apache Airflow's one.
 Delete old artifacts
 --------------------
 
-This workflow is introduced, to delete old artifacts from the Github Actions build. We set it to
+This workflow is introduced, to delete old artifacts from the GitHub Actions build. We set it to
 delete old artifacts that are > 7 days old. It only runs for the 'apache/airflow' repository.
 
 We also have a script that can help to clean-up the old artifacts:
@@ -695,89 +701,7 @@ CodeQL scan
 -----------
 
 The CodeQL security scan uses GitHub security scan framework to scan our code for security violations.
-It is run for javascript and python code.
-
-
-Selective CI Checks
-===================
-
-In order to optimise our CI builds, we've implemented optimisations to only run selected checks for some
-kind of changes. The logic implemented reflects the internal architecture of Airflow 2.0 packages
-and it helps to keep down both the usage of jobs in GitHub Actions as well as CI feedback time to
-contributors in case of simpler changes.
-
-We have the following test types (separated by packages in which they are):
-
-* Core - for the core Airflow functionality (core folder)
-
-We also have several special kinds of tests that are not separated by packages but they are marked with
-pytest markers. They can be found in any of those packages and they can be selected by the appropriate
-pylint custom command line options. See `TESTING.rst <TESTING.rst>`_ for details but those are:
-
-* Integration - tests that require external integration images running in docker-compose
-* Heisentests - tests that are vulnerable to some side effects and are better to be run on their own
-* Quarantined - tests that are flaky and need to be fixed
-* Postgres - tests that require Postgres database. They are only run when backend is Postgres
-* MySQL - tests that require MySQL database. They are only run when backend is MySQL
-
-Even if the types are separated, In case they share the same backend version/python version, they are
-run sequentially in the same job, on the same CI machine. Each of them in a separate ``docker run`` command
-and with additional docker cleaning between the steps to not fall into the trap of exceeding resource
-usage in one big test run, but also not to increase the number of jobs per each Pull Request.
-
-The logic implemented for the changes works as follows:
-
-1) In case of direct push (so when PR gets merged) or scheduled run, we always run all tests and checks.
-   This is in order to make sure that the merge did not miss anything important. The remainder of the logic
-   is executed only in case of Pull Requests.
-
-2) We retrieve which files have changed in the incoming Merge Commit (github.sha is a merge commit
-   automatically prepared by GitHub in case of Pull Request, so we can retrieve the list of changed
-   files from that commit directly).
-
-3) If any of the important, environment files changed (Dockerfile, ci scripts, setup.py, GitHub workflow
-   files), then we again run all tests and checks. Those are cases where the logic of the checks changed
-   or the environment for the checks changed so we want to make sure to check everything.
-
-4) If any of docs changed: we need to have CI image so we enable image building
-
-5) If any of chart files changed, we need to run helm tests so we enable helm unit tests
-
-6) If any of API files changed, we need to run API tests so we enable them
-
-7) If any of the relevant source files that trigger the tests have changed at all. Those are airflow
-   sources, chart, tests and kubernetes_tests. If any of those files changed, we enable tests and we
-   enable image building, because the CI images are needed to run tests.
-
-8) Then we determine which types of the tests should be run. We count all the changed files in the
-   relevant airflow sources (airflow, chart, tests, kubernetes_tests) first and then we count how many
-   files changed in different packages:
-
-   a) if any of the Kubernetes files changed we enable ``Kubernetes`` test type
-   b) Then we subtract count of all the ``specific`` above per-type changed files from the count of
-      all changed files. In case there are any files changed, then we assume that some unknown files
-      changed (likely from the core of airflow) and in this case we enable all test types above and the
-      Core test types - simply because we do not want to risk to miss anything.
-    g) In all cases where tests are enabled we also add Heisentests, Integration and - depending on
-       the backend used = Postgres or MySQL types of tests.
-
-9) Quarantined tests are always run when tests are run - we need to run them often to observe how
-   often they fail so that we can decide to move them out of quarantine. Details about the
-   Quarantined tests are described in `TESTING.rst <TESTING.rst>`_
-
-10) There is a special case of static checks. In case the above logic determines that the CI image
-    needs to be build, we run long and more comprehensive version of static checks - including Pylint,
-    MyPy, Flake8. And those tests are run on all files, no matter how many files changed.
-    In case the image is not built, we run only simpler set of changes - the longer static checks
-    that require CI image are skipped, and we only run the tests on the files that changed in the incoming
-    commit - unlike pylint/flake8/mypy, those static checks are per-file based and they should not miss any
-    important change.
-
-Similarly to selective tests we also run selective security scans. In Pull requests,
-the Python scan will only run when there is a python code change and javascript scan will only run if
-there is a javascript or yarn.lock file change. For master builds, all scans are always executed.
-
-
+It is run for JavaScript and Python code.
 
 Naming conventions for stored images
 ====================================
diff --git a/PULL_REQUEST_WORKFLOW.rst b/PULL_REQUEST_WORKFLOW.rst
new file mode 100644
index 0000000..c9cc6bf
--- /dev/null
+++ b/PULL_REQUEST_WORKFLOW.rst
@@ -0,0 +1,260 @@
+ .. Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+ ..   http://www.apache.org/licenses/LICENSE-2.0
+
+ .. Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+
+.. contents:: :local:
+
+Why non-standard pull request workflow?
+---------------------------------------
+
+This document describes the Pull Request workflow we've implemented in Airflow. The workflow is slightly
+more complex than the regular workflow you might encounter in most projects. After experiencing some
+huge delays in GitHub Actions processing queues in October 2020, we decided to optimize the workflow to
+minimize the use of GitHub Actions build time. We do this by selecting which tests and checks are run in
+the CI system, based on an analysis of which files changed in the incoming PR, and by allowing the
+Committers to control the scope of the tests during the approval/review process.
+
+To give a bit of context, we started off by always running all tests for all incoming PRs. However, as
+our matrix of tests grew, this approach did not scale with the increasing number of PRs, especially as
+we had to compete with other Apache Software Foundation projects for the 180 slots that are available
+to the whole organization. As more Apache Software Foundation projects started to use GitHub Actions,
+we started to experience long queues in which our jobs waited for free slots.
+
+We approached the problem by:
+
+1) Improving the mechanism for cancelling duplicate workflow runs so that it works efficiently in case of
+   queues (duplicate workflow runs are generated when someone pushes a fixup quickly, which leads to both
+   the outdated and the current run running to completion, taking precious slots). This has been implemented
+   by improving the `cancel-workflow-run <https://github.com/potiuk/cancel-workflow-runs/>`_ action we are
+   using. Version 4.1 added cancelling of all duplicates even if there is a long queue of builds.
+
+2) Heavily decreasing the strain on GitHub Actions jobs by introducing selective checks - a mechanism
+   to control which parts of the tests are run in a CI build. This is implemented by the
+   ``scripts/ci/selective_ci_checks.sh`` script in our repository. This script analyses which part of the
+   code has changed and, based on that, sets the outputs that control which tests are executed in
+   the CI build and whether we need to build the CI images necessary to run those steps. This allowed us to
+   heavily decrease the strain, especially for Pull Requests that do not touch code (in which case
+   the builds can complete in < 2 minutes), but also by limiting the number of tests executed in PRs that do
+   not touch the "core" of Airflow, or that only touch some standalone parts of Airflow such as
+   "Providers", "WWW" or "CLI". This solution is not yet perfect as there are likely some edge cases, but
+   it is easy to maintain and we have an escape hatch - all the tests are always executed in master pushes,
+   so contributors can easily spot a "missed" case and fix it - both by fixing the problem and by
+   adding those exceptions to the code. More about it can be found in the
+   `Selective CI checks <#selective-ci-checks>`_ chapter.
+
+3) Even more optimisation came from limiting the scope of tests to only "default" matrix parameters. So far
+   in Airflow we have always run all tests for all matrix combinations. The primary matrix components are:
+
+   * Python versions (currently 3.6, 3.7, 3.8)
+   * Backend types (currently MySQL/Postgres)
+   * Backend versions (currently MySQL 5.7, MySQL 8, Postgres 9.6, Postgres 13)
+
+   We've decided that instead of running all the combinations of parameters for all matrix components, we
+   will only run the default values (Python 3.6, MySQL 5.7, Postgres 9.6) for all PRs which are not yet
+   approved by the committers. This has the nice effect that the full set of tests (though with limited
+   combinations of the matrix) is still run in the CI for every Pull Request that needs tests at all -
+   allowing the contributors to make sure that their PR is "good enough" to be reviewed.
+
+   Even after approval, the automated workflows we've implemented check whether the PR seems to need the
+   "full test matrix" and provide helpful information to both contributors and committers in the form of
+   explanatory comments and automatically set labels showing the status of the PR. Committers still
+   control whether they want to merge such requests as is, ask for a rebase, or re-run the tests
+   as "full tests" by applying the "full tests needed" label and re-running such a request.
+   The "full tests needed" label is also applied automatically after approval when the change touches
+   the "core" of Airflow - a separate check is also added to the PR so that the "merge" button status
+   will indicate to the committer that full tests are still needed. The committer might still decide
+   whether to merge such a PR without the "full matrix". The "escape hatch" we have - i.e. running the full
+   matrix of tests in the "merge push" - will enable committers to catch and fix such problems quickly.
+   More about it can be found in the `Approval workflow and Matrix tests <#approval-workflow-and-matrix-tests>`_
+   chapter.
+
+4) We've also applied for (and received) funds to run self-hosted runners. This is not yet implemented, due to
+   discussions about the security of self-hosted runners for public repositories. Running self-hosted runners
+   for public repositories is currently (as of the end of October 2020)
+   `discouraged by GitHub <https://docs.github.com/en/free-pro-team@latest/actions/hosting-your-own-runners/about-self-hosted-runners#self-hosted-runner-security-with-public-repositories>`_
+   and we are working on solving the problem - also involving the Apache Software Foundation infrastructure
+   team. This document does not describe this part of the approach. Most likely we will soon add a document
+   describing details of the approach taken there.
+
+Selective CI Checks
+-------------------
+
+In order to optimise our CI builds, we've implemented optimisations to only run selected checks for some
+kinds of changes. The logic implemented reflects the internal architecture of Airflow 2.0 packages
+and it helps to keep down both the usage of jobs in GitHub Actions and the CI feedback time for
+contributors in case of simpler changes.
+
+We have the following test types (separated by packages in which they are):
+
+* Always - those are tests that should be always executed (always folder)
+* Core - for the core Airflow functionality (core folder)
+* API - Tests for the Airflow API (api and api_connexion folders)
+* CLI - Tests for the Airflow CLI (cli folder)
+* WWW - Tests for the Airflow webserver (www folder, and www_rbac folder in 1.10)
+* Providers - Tests for all Providers of Airflow (providers folder)
+* Other - all other tests (all other folders that are not part of any of the above)
+
+We also have several special kinds of tests that are not separated by packages but are marked with
+pytest markers. They can be found in any of those packages and they can be selected by the appropriate
+pytest custom command line options. See `TESTING.rst <TESTING.rst>`_ for details, but those are:
+
+* Integration - tests that require external integration images running in docker-compose
+* Heisentests - tests that are vulnerable to some side effects and are better to be run on their own
+* Quarantined - tests that are flaky and need to be fixed
+* Postgres - tests that require Postgres database. They are only run when backend is Postgres
+* MySQL - tests that require MySQL database. They are only run when backend is MySQL
+
+Even if the types are separated, in case they share the same backend version/Python version, they are
+run sequentially in the same job, on the same CI machine. Each of them runs in a separate ``docker run``
+command, with additional docker cleaning between the steps, so as not to fall into the trap of exceeding
+resource usage in one big test run, but also not to increase the number of jobs per Pull Request.
+
+The logic implemented for the changes works as follows:
+
+1) In case of a direct push (so when a PR gets merged) or a scheduled run, we always run all tests and checks.
+   This is in order to make sure that the merge did not miss anything important. The remainder of the logic
+   is executed only in case of Pull Requests.
+
+2) We retrieve which files have changed in the incoming Merge Commit (github.sha is a merge commit
+   automatically prepared by GitHub in case of a Pull Request, so we can retrieve the list of changed
+   files from that commit directly).
+
+3) If any of the important environment files changed (Dockerfile, ci scripts, setup.py, GitHub workflow
+   files), then we again run all tests and checks. Those are cases where the logic of the checks changed
+   or the environment for the checks changed, so we want to make sure to check everything.
+
+4) If any of the docs changed, we need to have the CI image so we enable image building
+
+5) If any of the chart files changed, we need to run helm tests so we enable helm unit tests
+
+6) If any of the API files changed, we need to run API tests so we enable them
+
+7) We check whether any of the relevant source files that trigger the tests have changed at all. Those are
+   airflow sources, chart, tests and kubernetes_tests. If any of those files changed, we enable tests and we
+   enable image building, because the CI images are needed to run tests.
+
+8) Then we determine which types of the tests should be run. We count all the changed files in the
+   relevant airflow sources (airflow, chart, tests, kubernetes_tests) first and then we count how many
+   files changed in different packages:
+
+   a) in any case tests in the ``Always`` folder are run. Those are special tests that should be run any time
+      modifications to any Python code occur. An example test of this type is verifying the proper structure of
+      the project, including proper naming of all files.
+   b) if any of the Airflow API files changed we enable ``API`` test type
+   c) if any of the Airflow CLI files changed we enable ``CLI`` test type
+   d) if any of the Provider files changed we enable ``Providers`` test type
+   e) if any of the WWW files changed we enable ``WWW`` test type
+   f) if any of the Kubernetes files changed we enable ``Kubernetes`` test type
+   g) Then we subtract the count of all the ``specific`` per-type changed files above from the count of
+      all changed files. In case any changed files remain, we assume that some unknown files
+      changed (likely from the core of airflow) and in this case we enable all test types above and the
+      Core test types - simply because we do not want to risk missing anything.
+   h) In all cases where tests are enabled we also add Heisentests, Integration and - depending on
+      the backend used - Postgres or MySQL types of tests.
+
+9) Quarantined tests are always run when tests are run - we need to run them often to observe how
+   often they fail so that we can decide to move them out of quarantine. Details about the
+   Quarantined tests are described in `TESTING.rst <TESTING.rst>`_
+
+10) There is a special case of static checks. In case the above logic determines that the CI image
+    needs to be built, we run the long and more comprehensive version of static checks - including Pylint,
+    MyPy, Flake8. And those checks are run on all files, no matter how many files changed.
+    In case the image is not built, we run only a simpler set of checks - the longer static checks
+    that require the CI image are skipped, and we only run the checks on the files that changed in the incoming
+    commit - unlike pylint/flake8/mypy, those static checks are per-file based and they should not miss any
+    important change.
+
+Similarly to selective tests we also run selective security scans. In Pull Requests,
+the Python scan will only run when there is a Python code change and the JavaScript scan will only run if
+there is a JavaScript or yarn.lock file change. For master builds, all scans are always executed.
+
+The selective check algorithm is shown here:
+
+.. image:: images/pr/selective_checks.png
+    :align: center
+    :alt: Selective check algorithm
+
+Approval Workflow and Matrix tests
+----------------------------------
+
+As explained above, the approval and matrix tests workflow works according to the algorithm below:
+
+1) In case of "no-code" changes - so changes that do not change any of the code or environment of
+   the application - no tests are run (this is done via the selective checks above). Also no CI/PROD images
+   are built, saving extra minutes. Such a build currently takes less than 2 minutes and only a few jobs are
+   run, which is a very small fraction of the "full build" time.
+
+2) When a new PR is created, only a "default set" of matrix tests is run. Only the default
+   values for each of the parameters are used, effectively limiting it to running matrix builds for only
+   one Python version and one version of each of the backends. In this case only one CI and one PROD
+   image is built, saving precious job slots. This build takes around 50% less time than the "full matrix"
+   build.
+
+3) When such a PR gets approved, the system further analyses the files changed in this PR and a further
+   decision is made that should be communicated to both Committer and Reviewer.
+
+3a) In case of "no-code" builds, a message is communicated that the PR is ready to be merged and
+    no tests are needed.
+
+.. image:: images/pr/pr-no-tests-needed-comment.png
+    :align: center
+    :alt: No tests needed for "no-code" builds
+
+3b) In case of "non-core" builds, a message is communicated that such a PR is likely OK to be merged as is with
+    a limited set of tests, but that the committer might decide to re-run the PR after applying the
+    "full tests needed" label, which will trigger a full matrix build for tests for this PR. The committer
+    might make a further decision on what to do with this PR.
+
+.. image:: images/pr/pr-likely-ok-to-merge.png
+    :align: center
+    :alt: Likely ok to merge the PR with only small set of tests
+
+3c) In case of "core" builds (i.e. when the PR touches some "core" part of Airflow), a message is
+    communicated that this PR needs the "full test matrix", the "full tests needed" label is applied
+    automatically, and either the contributor might rebase the request to trigger a full test build or the
+    committer might re-run the build manually to trigger such a full test rebuild. Also an "in-progress" check
+    is added, so that the committer realises that the PR is not yet "green to merge". Pull requests with the
+    "full tests needed" label always trigger the full matrix build when rebased or re-run, so if the
+    PR gets rebased, it will continue triggering the full matrix build.
+
+.. image:: images/pr/pr-full-tests-needed.png
+    :align: center
+    :alt: Full tests are needed for the PR
+
+4) If this or another committer "requests changes" in a previously approved PR with the "full tests needed"
+   label, the bot automatically removes the label, moving it back to "run only default set of parameters"
+   mode. For PRs touching the core of Airflow, once the PR gets approved again, the label will be restored.
+   If it was manually set by the committer, it has to be restored manually.
+
+.. note:: Note that setting the labels and adding comments might be delayed due to limitations of GitHub Actions.
+      In case of queues, processing of Pull Request reviews might take some time, so it is advised not to merge
+      a PR immediately after approval. Luckily, the comments describing the status of the PR trigger notifications
+      for the PRs and they provide a good "notification" for the committer to act on a PR that was recently
+      approved.
+
+The PR approval workflow is possible thanks to two custom GitHub Actions we've developed:
+
+* `Get workflow origin <https://github.com/potiuk/get-workflow-origin/>`_
+* `Label when approved <https://github.com/TobKed/label-when-approved-action>`_
+
+
+Next steps
+----------
+
+We are also planning to propose the approach to other projects from the Apache Software Foundation to
+make it a common approach, so that our effort is not limited to only one project.
+
+See `this discussion <https://lists.apache.org/thread.html/r1708881f52adbdae722afb8fea16b23325b739b254b60890e72375e1%40%3Cbuilds.apache.org%3E>`_ for more details.
diff --git a/images/pr/pr-full-tests-needed.png b/images/pr/pr-full-tests-needed.png
new file mode 100644
index 0000000..c863153
Binary files /dev/null and b/images/pr/pr-full-tests-needed.png differ
diff --git a/images/pr/pr-likely-ok-to-merge.png b/images/pr/pr-likely-ok-to-merge.png
new file mode 100644
index 0000000..9c04dee
Binary files /dev/null and b/images/pr/pr-likely-ok-to-merge.png differ
diff --git a/images/pr/pr-no-tests-needed-comment.png b/images/pr/pr-no-tests-needed-comment.png
new file mode 100644
index 0000000..78a1181
Binary files /dev/null and b/images/pr/pr-no-tests-needed-comment.png differ
diff --git a/images/pr/selective_checks.md5 b/images/pr/selective_checks.md5
new file mode 100644
index 0000000..d4a57c1
--- /dev/null
+++ b/images/pr/selective_checks.md5
@@ -0,0 +1 @@
+2ae1b3fadb26317f4a3531c40b7702f2  images/pr/selective_checks.mermaid
diff --git a/images/pr/selective_checks.mermaid b/images/pr/selective_checks.mermaid
new file mode 100644
index 0000000..e8e8c2a
--- /dev/null
+++ b/images/pr/selective_checks.mermaid
@@ -0,0 +1,35 @@
+%% Licensed to the Apache Software Foundation (ASF) under one
+%% or more contributor license agreements.  See the NOTICE file
+%% distributed with this work for additional information
+%% regarding copyright ownership.  The ASF licenses this file
+%% to you under the Apache License, Version 2.0 (the
+%% "License"); you may not use this file except in compliance
+%% with the License.  You may obtain a copy of the License at
+%%
+%%   http://www.apache.org/licenses/LICENSE-2.0
+%%
+%% Unless required by applicable law or agreed to in writing,
+%% software distributed under the License is distributed on an
+%% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+%% KIND, either express or implied.  See the License for the
+%% specific language governing permissions and limitations
+%% under the License.
+
+graph LR
+A[PR arrives] --> B[Selective check]
+B --> C{direct push merge?}
+C -->|Yes: enable images| D[Run Full tests<br>+Quarantined<br>run full static checks]
+C -->|No| E[Retrieve changed files]
+E --> F{environment files changed?}
+F -->|Yes: enable images| D
+F -->|No| G{docs changed}
+G -->|Yes: enable image building| H{Chart files changed?}
+G -->|No| H
+H -->|Yes: enable helm tests| I{API files changed?}
+H -->|No| I
+I -->|Yes: enable API tests| J{sources changed?}
+I -->|No| J
+J -->|Yes: enable Pytests| K{determine test type}
+J -->|No| L[skip running test<br/>Run subset of static checks]
+K -->|Core files changed: enable images| D
+K -->|No core files changed: enable images| M[Run selected tests +<br/> Heisentest, Integration, Quarantined<br/>Full static checks]
diff --git a/images/pr/selective_checks.png b/images/pr/selective_checks.png
new file mode 100644
index 0000000..afc2384
Binary files /dev/null and b/images/pr/selective_checks.png differ
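
As a rough illustration of steps 7 to 9 of the "Selective CI Checks" section in PULL_REQUEST_WORKFLOW.rst above, the test-type selection could be sketched in Python as follows. This is not the actual ``scripts/ci/selective_ci_checks.sh`` logic: the path prefixes in ``TYPE_PREFIXES`` and ``SOURCE_PREFIXES`` and the function name ``select_test_types`` are assumptions made only for this example.

```python
from typing import Iterable, List

# Hypothetical path prefixes, for illustration only. The real mapping lives in
# scripts/ci/selective_ci_checks.sh and may differ.
TYPE_PREFIXES = {
    "API": ("airflow/api/", "airflow/api_connexion/", "tests/api/", "tests/api_connexion/"),
    "CLI": ("airflow/cli/", "tests/cli/"),
    "Providers": ("airflow/providers/", "tests/providers/"),
    "WWW": ("airflow/www/", "tests/www/"),
    "Kubernetes": ("kubernetes_tests/",),
}
# Sources that trigger tests at all (step 7 of the document).
SOURCE_PREFIXES = ("airflow/", "chart/", "tests/", "kubernetes_tests/")


def select_test_types(changed_files: Iterable[str], backend: str = "Postgres") -> List[str]:
    """Return the test types to run for a Pull Request (sketch of steps 7-9)."""
    relevant = [f for f in changed_files if f.startswith(SOURCE_PREFIXES)]
    if not relevant:
        return []  # no relevant sources changed: tests are not enabled

    selected = ["Always"]  # step 8a: always run when tests are enabled
    matched = set()
    for test_type, prefixes in TYPE_PREFIXES.items():  # steps 8b-8f
        hits = [f for f in relevant if f.startswith(prefixes)]
        if hits:
            selected.append(test_type)
            matched.update(hits)

    # Step 8g: files left after subtracting the per-type matches are treated
    # as "unknown" (likely core) changes, so every type plus Core is enabled.
    if len(relevant) > len(matched):
        selected = ["Always", *TYPE_PREFIXES, "Core"]

    # Steps 8h and 9: special test kinds that accompany every test run.
    selected += ["Heisentests", "Integration", backend, "Quarantined"]
    return selected
```

For example, a PR touching only a file under the assumed ``airflow/cli/`` prefix would select ``Always`` and ``CLI`` plus the special kinds, while a change to a file that matches none of the per-type prefixes falls through to the "enable everything" branch.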

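The label handling described in steps 3a to 3c and 4 of "Approval Workflow and Matrix tests" above can be summarised with a small Python sketch. It is a simplified model under stated assumptions, not the code of the two GitHub Actions linked in the document: the ``ChangeKind`` enum, the function names and the comment strings are hypothetical, and only the "full tests needed" label name comes from the document.

```python
from enum import Enum
from typing import Iterable, Set, Tuple

FULL_TESTS_LABEL = "full tests needed"  # label name as quoted in the document


class ChangeKind(Enum):
    """Hypothetical classification of a PR produced by the selective checks."""
    NO_CODE = "no-code"
    NON_CORE = "non-core"
    CORE = "core"


def on_approval(kind: ChangeKind, labels: Iterable[str]) -> Tuple[str, Set[str]]:
    """Return the (paraphrased) bot comment and the labels after an approval."""
    new_labels = set(labels)
    if kind is ChangeKind.NO_CODE:  # step 3a
        return "ready to be merged, no tests are needed", new_labels
    if kind is ChangeKind.NON_CORE:  # step 3b
        return "likely OK to merge; committer may apply the label and re-run", new_labels
    new_labels.add(FULL_TESTS_LABEL)  # step 3c: label applied automatically
    return "full test matrix needed; rebase or re-run the build", new_labels


def on_changes_requested(labels: Iterable[str]) -> Set[str]:
    """Step 4: the label is removed when changes are requested."""
    return set(labels) - {FULL_TESTS_LABEL}


def uses_full_matrix(labels: Iterable[str]) -> bool:
    """A rebased or re-run PR uses the full matrix only while the label is set."""
    return FULL_TESTS_LABEL in set(labels)
```

In this model a committer applying the label manually to a "non-core" PR has the same effect on ``uses_full_matrix`` as the automatic labelling of a "core" PR, which matches the behaviour the document describes for re-runs and rebases.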