tvalentyn commented on code in PR #27723: URL: https://github.com/apache/beam/pull/27723#discussion_r1278177439
##########
.github/workflows/beam_Python_ValidatesContainer_Dataflow_ARM.yml:
##########
@@ -0,0 +1,116 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+name: Python ValidatesContainer Dataflow ARM
+
+on:
+  issue_comment:
+    types: [created]
+  push:
+    tags: ['v*']
+    branches: ['master', 'release-*']
+    paths: ["sdks/python/**",".github/workflows/beam_Python_ValidatesContainer_Dataflow_ARM.yml"]
+  schedule:
+    - cron: '0 */6 * * *'
+  workflow_dispatch:
+
+#Setting explicit permissions for the action to avoid the default permissions which are `write-all` in case of pull_request_target event
+permissions:
+  actions: write
+  pull-requests: read
+  checks: read
+  contents: read
+  deployments: read
+  id-token: none
+  issues: read
+  discussions: read
+  packages: read
+  pages: read
+  repository-projects: read
+  security-events: read
+  statuses: read
+
+# This allows a subsequently queued workflow run to interrupt previous runs
+concurrency:
+  group: '${{ github.workflow }} @ ${{ github.event.pull_request.head.label || github.head_ref || github.ref }}'
+  cancel-in-progress: true
+
+jobs:
+  beam_Python_ValidatesContainer_Dataflow_ARM:
+    name: beam_Python_ValidatesContainer_Dataflow_ARM
+    strategy:
+      fail-fast: false
+      matrix:
+        python_version: ['3.8','3.9','3.10','3.11']
+    if: |
+      github.event_name == 'push' ||
+      startsWith(github.event.comment.body, 'Run Python ValidatesContainer Dataflow ARM') ||
+      github.event_name == 'schedule'
+    runs-on: [self-hosted, ubuntu-20.04, main]
+    steps:
+      - name: Check out repository code
+        uses: actions/checkout@v3
+        with:
+          ref: ${{ github.event.pull_request.head.sha }}
+      - name: Set comment body with matrix
+        id: set_comment_body
+        run: |
+          echo "comment_body=Run Python ValidatesContainer Dataflow ARM (${{ matrix.python_version }})" >> $GITHUB_OUTPUT
+      - name: Rerun on comment
+        if: github.event.comment.body == steps.set_comment_body.outputs.comment_body
+        uses: ./.github/actions/rerun-job-action
+        with:
+          pull_request_url: ${{ github.event.issue.pull_request.url }}
+          github_repository: ${{ github.repository }}
+          github_token: ${{ secrets.GITHUB_TOKEN }}
+          github_job: "${{ github.job }} (${{ matrix.python_version }})"
+          github_current_run_id: ${{ github.run_id }}
+      - name: Install Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: ${{ matrix.python_version }}
+      - name: Install Java

Review Comment:
   Do we need this?


##########
sdks/python/container/run_validatescontainer_arm.sh:
##########
@@ -0,0 +1,114 @@
+#!/bin/bash

Review Comment:
   Is there any opportunity for code reuse, perhaps by parameterizing the existing validatescontainer script or by moving some parts of the code to GitHub Actions?


##########
sdks/python/container/run_validatescontainer_arm.sh:
##########
@@ -0,0 +1,114 @@
+#!/bin/bash
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# This script will be run by Jenkins as a post commit test. In order to run
+# locally make the following changes:
+#
+# GCS_LOCATION -> Temporary location to use for service tests.
+# PROJECT -> Project name to use for dataflow and docker images.
+# REGION -> Region name to use for Dataflow
+#
+# Execute from the root of the repository:
+#     test Python3.8 container:
+#         ./gradlew :sdks:python:test-suites:dataflow:py38:dataflowArm
+#     or test all supported python versions together:
+#         ./gradlew :sdks:python:test-suites:dataflow:dataflowArm
+
+echo "This script must be executed in the root of beam project. Please set GCS_LOCATION, PROJECT and REGION as desired."
+
+if [[ $# != 2 ]]; then
+  printf "Usage: \n$> ./sdks/python/container/run_dataflow_arm.sh <python_version> <sdk_location>"
+  printf "\n\tpython_version: [required] Python version used for container build and run tests."
+  printf " Sample value: 3.9"
+  exit 1
+fi
+
+set -e
+set -v
+
+# Where to store integration test outputs.
+GCS_LOCATION=${GCS_LOCATION:-gs://temp-storage-for-end-to-end-tests}
+
+# Project for the container and integration test
+PROJECT=${PROJECT:-apache-beam-testing}
+REGION=${REGION:-us-central1}
+IMAGE_PREFIX="$(grep 'docker_image_default_repo_prefix' gradle.properties | cut -d'=' -f2)"
+SDK_VERSION="$(grep 'sdk_version' gradle.properties | cut -d'=' -f2)"
+PY_VERSION=$1
+IMAGE_NAME="${IMAGE_PREFIX}python${PY_VERSION}_sdk"
+CONTAINER_PROJECT="sdks:python:container:py${PY_VERSION//.}" # Note: we substitute away the dot in the version.
+PY_INTERPRETER="python${PY_VERSION}"
+
+XUNIT_FILE="pytest-$IMAGE_NAME.xml"
+
+# Verify in the root of the repository
+test -d sdks/python/container
+
+# Verify docker and gcloud commands exist
+command -v docker
+command -v gcloud
+docker -v
+gcloud -v
+
+CONTAINER=us.gcr.io/$PROJECT/$USER/$IMAGE_NAME
+PREBUILD_SDK_CONTAINER_REGISTRY_PATH=us.gcr.io/$PROJECT/$USER/prebuild_python${PY_VERSION//.}_sdk
+
+function cleanup_container {
+  # Delete the container locally and remotely
+  docker rmi $CONTAINER:$TAG || echo "Built container image was not removed. Possibly, it was not not saved locally."
+  for image in $(docker images --format '{{.Repository}}:{{.Tag}}' | grep $PREBUILD_SDK_CONTAINER_REGISTRY_PATH)
+    do docker rmi $image || echo "Failed to remove prebuilt sdk container image."
+  done
+  gcloud --quiet container images delete $CONTAINER:$TAG || echo "Failed to delete container"
+  for digest in $(gcloud container images list-tags $PREBUILD_SDK_CONTAINER_REGISTRY_PATH/beam_python_prebuilt_sdk --format="get(digest)")
+    do gcloud container images delete $PREBUILD_SDK_CONTAINER_REGISTRY_PATH/beam_python_prebuilt_sdk@$digest --force-delete-tags --quiet || echo "Failed to remove prebuilt sdk container image"
+  done
+
+  echo "Removed the container"
+}
+trap cleanup_container EXIT
+
+echo ">>> Successfully built and push container $CONTAINER"
+
+cd sdks/python
+SDK_LOCATION=$2
+
+# Run ValidatesRunner tests on Google Cloud Dataflow service
+echo ">>> RUNNING DATAFLOW RUNNER VALIDATESCONTAINER ARM TEST"
+pytest -o junit_suite_name=$IMAGE_NAME \
+  -m="it_dataflow_arm" \
+  --show-capture=no \
+  --numprocesses=1 \
+  --timeout=1800 \
+  --junitxml=$XUNIT_FILE \
+  --ignore-glob '.*py3\d?\.py$' \
+  --log-cli-level=INFO \
+  --test-pipeline-options=" \
+    --runner=TestDataflowRunner \
+    --project=$PROJECT \
+    --region=$REGION \
+    --sdk_container_image=$CONTAINER:$TAG \
+    --staging_location=$GCS_LOCATION/staging-dataflow-arm-test \
+    --temp_location=$GCS_LOCATION/temp-dataflow-arm-test \
+    --output=$GCS_LOCATION/output \
+    --sdk_location=$SDK_LOCATION \
+    --num_workers=1 \
+    --docker_registry_push_url=$PREBUILD_SDK_CONTAINER_REGISTRY_PATH \
+    --machine_type=t2a-standard-1"

Review Comment:
   The test already sets this option.


##########
sdks/python/apache_beam/examples/wordcount_it_test.py:
##########
@@ -111,6 +111,19 @@ def test_wordcount_it_with_prebuilt_sdk_container_cloud_build(self):
   @pytest.mark.it_validatescontainer
   def test_wordcount_it_with_use_sibling_sdk_workers(self):
     self._run_wordcount_it(wordcount.run, experiment='use_sibling_sdk_workers')
+
+  @pytest.mark.it_postcommit

Review Comment:
   We need to remove pytest.mark.it_postcommit; otherwise the regular postcommit suite will pick it up.
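
To illustrate the marker concern raised for wordcount_it_test.py, here is a minimal sketch of how marker-based selection behaves. It is a toy model of `pytest -m <marker>`, not pytest itself. The marker names come from the diff; the ARM test name is hypothetical, and the assumption that the new test also carries `it_dataflow_arm` is inferred from the `-m="it_dataflow_arm"` flag in the script.

```python
# Toy model of `pytest -m <marker>` selection; this does not invoke pytest.
# Marker names are taken from the diff. The ARM test name is hypothetical.
TESTS = {
    "test_wordcount_it_with_use_sibling_sdk_workers": {"it_validatescontainer"},
    # Assumed state in the PR: the new test carries both markers.
    "test_wordcount_arm_hypothetical": {"it_postcommit", "it_dataflow_arm"},
}


def select(tests, marker):
    """Return the sorted names of tests that carry the given marker."""
    return sorted(name for name, marks in tests.items() if marker in marks)


# With both markers, the regular postcommit suite also selects the ARM test.
print(select(TESTS, "it_postcommit"))

# After dropping it_postcommit, only the ARM suite selects it.
fixed = dict(TESTS)
fixed["test_wordcount_arm_hypothetical"] = {"it_dataflow_arm"}
print(select(fixed, "it_postcommit"))
print(select(fixed, "it_dataflow_arm"))
```

In this model, a test that keeps only the ARM marker is invisible to a run that selects `it_postcommit`, which is the behavior the comment asks for.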
