ashb commented on a change in pull request #11656:
URL: https://github.com/apache/airflow/pull/11656#discussion_r507751942



##########
File path: .github/workflows/ci.yml
##########
@@ -202,12 +228,58 @@ jobs:
         env:
           VERBOSE: false
 
+  # Those checks are run if no image needs to be built for checks. This is for 
simple changes that
+  # Do not touch any of the python code or any of the important files that 
might require building
+  # The CI Docker image and they can be run entirely using the pre-commit 
virtual environments on host
+  static-checks-basic-checks-only:
+    timeout-minutes: 30
+    name: "Static checks: basic checks only"
+    runs-on: ubuntu-latest
+    needs: [build-info]
+    env:
+      SKIP: "build,mypy,flake8,pylint,bats-in-container-tests"
+      MOUNT_LOCAL_SOURCES: "true"
+      PYTHON_MAJOR_MINOR_VERSION: 
${{needs.build-info.outputs.defaultPythonVersion}}
+    if: >
+      needs.build-info.outputs.basic-checks-only == 'true' &&
+      (github.repository == 'apache/airflow' || github.event_name != 
'schedule')
+    steps:
+      - name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
+        uses: actions/checkout@v2
+      - name: "Setup python"
+        uses: actions/setup-python@v2
+        with:
+          python-version: ${{needs.build-info.outputs.defaultPythonVersion}}
+      - name: Cache pre-commit env
+        uses: actions/cache@v2
+        env:
+          cache-name: cache-pre-commit-master-basic-checks-v1
+        with:
+          path: ~/.cache/pre-commit
+          key: ${{ env.cache-name }}-${{ github.job }}-${{ 
hashFiles('.pre-commit-config.yaml') }}
+      - name: "Free space"
+        run: ./scripts/ci/tools/ci_free_space_on_ci.sh
+      - name: >
+          Fetch incoming commit ${{ github.sha }} with its parent
+        uses: actions/checkout@v2
+        with:
+          ref: ${{ github.sha }}
+          fetch-depth: 2

Review comment:
       Do we need this here? Haven't we already decided what we've built, and 
the commit in question will already be built. 

##########
File path: .github/workflows/ci.yml
##########
@@ -202,12 +228,58 @@ jobs:
         env:
           VERBOSE: false
 
+  # Those checks are run if no image needs to be built for checks. This is for 
simple changes that
+  # Do not touch any of the python code or any of the important files that 
might require building
+  # The CI Docker image and they can be run entirely using the pre-commit 
virtual environments on host
+  static-checks-basic-checks-only:
+    timeout-minutes: 30
+    name: "Static checks: basic checks only"
+    runs-on: ubuntu-latest
+    needs: [build-info]
+    env:
+      SKIP: "build,mypy,flake8,pylint,bats-in-container-tests"
+      MOUNT_LOCAL_SOURCES: "true"
+      PYTHON_MAJOR_MINOR_VERSION: 
${{needs.build-info.outputs.defaultPythonVersion}}
+    if: >
+      needs.build-info.outputs.basic-checks-only == 'true' &&
+      (github.repository == 'apache/airflow' || github.event_name != 
'schedule')
+    steps:
+      - name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
+        uses: actions/checkout@v2
+      - name: "Setup python"
+        uses: actions/setup-python@v2
+        with:
+          python-version: ${{needs.build-info.outputs.defaultPythonVersion}}
+      - name: Cache pre-commit env
+        uses: actions/cache@v2
+        env:
+          cache-name: cache-pre-commit-master-basic-checks-v1
+        with:
+          path: ~/.cache/pre-commit
+          key: ${{ env.cache-name }}-${{ github.job }}-${{ 
hashFiles('.pre-commit-config.yaml') }}
+      - name: "Free space"
+        run: ./scripts/ci/tools/ci_free_space_on_ci.sh

Review comment:
       Not important, but we could probably remove the free space check here if 
we aren't pulling in big images

##########
File path: scripts/ci/selective_ci_checks.sh
##########
@@ -121,76 +182,93 @@ function show_changed_files() {
 # Output:
 #    Count of changed files matching the patterns
 function count_changed_files() {
+    local count_changed_files
     count_changed_files=$(echo "${CHANGED_FILES}" | grep -c -E 
"$(get_regexp_from_patterns)" || true)
     echo "${count_changed_files}"
 }
 
-function run_all_tests_when_push_or_schedule() {
-    if [[ ${GITHUB_EVENT_NAME} == "push" || ${GITHUB_EVENT_NAME} == "schedule" 
]]; then
-        echo
-        echo "Always run all tests in case of push/schedule events"
-        echo
-        set_outputs_run_all_tests
-        exit
+function check_if_api_tests_should_be_run() {
+    local pattern_array=(
+        "^airflow/api"
+    )
+    show_changed_files
+
+    if [[ $(count_changed_files) == "0" ]]; then
+        needs_api_tests "false"
+    else
+        needs_api_tests "true"
+    fi
+}
+
+function check_if_helm_tests_should_be_run() {
+    local pattern_array=(
+        "^chart"
+    )
+    show_changed_files
+
+    if [[ $(count_changed_files) == "0" ]]; then
+        needs_helm_tests "false"
+    else
+        needs_helm_tests "true"
+    fi
+}
+
+function check_if_docs_should_be_generated() {
+    local pattern_array=(
+        "^docs$"
+        "\.py$"

Review comment:
       ```suggestion
           "^airflow/.*\.py$"
   ```
   
   We don't need to rebuild docs for test-only changes.

##########
File path: CI.rst
##########
@@ -629,6 +691,90 @@ delete old artifacts that are > 7 days old. It only runs 
for the 'apache/airflow
 We also have a script that can help to clean-up the old artifacts:
 `remove_artifacts.sh <dev/remove_artifacts.sh>`_
 
+Selective CI Checks
+===================
+
+In order to optimise our CI builds, we've implemented optimisations to only 
run selected checks for some
+kind of changes. The logic implemented reflects the internal architecture of 
Airflow 2.0 packages
+and it helps to keep down both the usage of jobs in GitHub Actions as well as 
CI feedback time to
+contributors in case of simpler changes.
+
+We have the following test types (separated by packages in which they are):
+
+* Core - for the core Airflow functionality (core folder)

Review comment:
       There isn't a core folder -- do you mean airflow/, if not one of the 
more specific examples below?

##########
File path: CI.rst
##########
@@ -32,6 +32,69 @@ However part of the philosophy we have is that we are not 
tightly coupled with a
 environments we use. Most of our CI jobs are written as bash scripts which are 
executed as steps in
 the CI jobs. And we have  a number of variables determine build behaviour.
 
+
+
+
+Github Actions runs
+-------------------
+
+Our builds on CI I highly optimized. They utilise some of the latest features 
provided by GitHub Actions

Review comment:
       ```suggestion
   Our builds on CI are highly optimized. They utilise some of the latest 
features provided by GitHub Actions
   ```
   
   (I think you meant `I` -> `is`, but `are` is more correct.)

##########
File path: scripts/ci/selective_ci_checks.sh
##########
@@ -121,76 +182,93 @@ function show_changed_files() {
 # Output:
 #    Count of changed files matching the patterns
 function count_changed_files() {
+    local count_changed_files
     count_changed_files=$(echo "${CHANGED_FILES}" | grep -c -E 
"$(get_regexp_from_patterns)" || true)
     echo "${count_changed_files}"
 }
 
-function run_all_tests_when_push_or_schedule() {
-    if [[ ${GITHUB_EVENT_NAME} == "push" || ${GITHUB_EVENT_NAME} == "schedule" 
]]; then
-        echo
-        echo "Always run all tests in case of push/schedule events"
-        echo
-        set_outputs_run_all_tests
-        exit
+function check_if_api_tests_should_be_run() {
+    local pattern_array=(
+        "^airflow/api"
+    )
+    show_changed_files

Review comment:
       Wait, this works? `show_changed_files` calls `get_regexp_from_patterns`, 
and that references `pattern_array` -- and it can pick up the local array 
defined here?
   
   That was not how I thought local worked. I can see how the scoping rules of 
bash made this the case, but I guess I was expecting something else.

##########
File path: scripts/ci/selective_ci_checks.sh
##########
@@ -121,76 +182,93 @@ function show_changed_files() {
 # Output:
 #    Count of changed files matching the patterns
 function count_changed_files() {
+    local count_changed_files
     count_changed_files=$(echo "${CHANGED_FILES}" | grep -c -E 
"$(get_regexp_from_patterns)" || true)
     echo "${count_changed_files}"

Review comment:
       ```suggestion
       echo "${CHANGED_FILES}" | grep -c -E "$(get_regexp_from_patterns)" || 
true
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to