tbonelee commented on code in PR #5125:
URL: https://github.com/apache/zeppelin/pull/5125#discussion_r2594769163
##########
.github/workflows/core.yml:
##########
@@ -33,24 +33,93 @@ permissions:
contents: read # to fetch code (actions/checkout)
jobs:
+ # ============================================
+ # Job 1: Prepare Docker image
+ # ============================================
+ prepare-python-r-env:
+ runs-on: ubuntu-24.04
+ permissions:
+ contents: read
+ packages: write
+ outputs:
+ image-name: ${{ steps.image.outputs.name }}
+ pom-hash: ${{ steps.hash.outputs.pom }}
+
+ steps:
+ - name: Checkout
+ uses: actions/checkout@v4
+
+ - name: Calculate pom.xml hash
+ id: hash
+ run: |
+ echo "pom=${{ hashFiles('**/pom.xml') }}" >> $GITHUB_OUTPUT
+
+ - name: Generate image name with hash
+ id: image
+ run: |
+ # Include both environment file AND Dockerfile in hash calculation
+ COMBINED_HASH=$(cat testing/env_python_3.9_with_R.yml
.github/docker/python-r-env.Dockerfile | sha256sum | cut -d' ' -f1 | cut -c1-12)
+ IMAGE_NAME="ghcr.io/${{ github.repository_owner
}}/zeppelin-test-env:py39-r-${COMBINED_HASH}"
+
+ echo "name=${IMAGE_NAME}" >> $GITHUB_OUTPUT
+
+ - name: Check if image exists
+ id: check
+ run: |
+ if docker manifest inspect ${{ steps.image.outputs.name }}
>/dev/null 2>&1; then
+ echo "exists=true" >> $GITHUB_OUTPUT
+ else
+ echo "exists=false" >> $GITHUB_OUTPUT
+ fi
+
+ - name: Set up Docker Buildx
+ if: steps.check.outputs.exists != 'true'
+ uses: docker/setup-buildx-action@v3
+
+ - name: Log in to GHCR
+ if: steps.check.outputs.exists != 'true'
+ uses: docker/login-action@v3
+ with:
+ registry: ghcr.io
+ username: ${{ github.actor }}
+ password: ${{ secrets.GITHUB_TOKEN }}
+
+ - name: Build and push if needed
+ if: steps.check.outputs.exists != 'true'
+ uses: docker/build-push-action@v5
+ with:
+ context: .
+ file: .github/docker/python-r-env.Dockerfile
+ push: true
+ tags: |
+ ${{ steps.image.outputs.name }}
+ ghcr.io/${{ github.repository_owner
}}/zeppelin-test-env:py39-r-latest
+ cache-from: type=gha
+ cache-to: type=gha,mode=max
Review Comment:
Is the workflow defined here the same as the one in `build-docker-image.yml`?
If so, I'd suggest converting `build-docker-image.yml` into a reusable
workflow (using `on: workflow_call`).
That way, you can keep the current separation while allowing other
wofkflows, such as `prepare-python-r-env`, to call it via the `uses` keyword.
For reference, here are the cases I mentioned:
- Reusable workflow being called
-
https://github.com/apache/airflow/blob/5845599cb740b1c6bce3fcb5efc9688a71f4b2a7/.github/workflows/ci-image-build.yml
- Workflow that calls another workflow
-
https://github.com/apache/airflow/blob/5845599cb740b1c6bce3fcb5efc9688a71f4b2a7/.github/workflows/ci-amd-arm.yml#L237
##########
.github/workflows/build-docker-images.yml:
##########
@@ -0,0 +1,78 @@
+name: Build Docker Images
+
+on:
+ push:
+ branches: [master, 'branch-*']
+ paths:
+ - 'testing/env_*.yml'
+ - '.github/docker/**'
+ workflow_dispatch: # Allow manual trigger
+
+jobs:
+ build-python-r-env:
+ runs-on: ubuntu-24.04
+ permissions:
+ contents: read
+ packages: write # Required for GHCR push
+
+ steps:
+ - name: Checkout
+ uses: actions/checkout@v4
+
+ - name: Calculate environment file hash
+ id: hash
+ run: |
+ # Include both environment file AND Dockerfile in hash calculation
+ COMBINED_HASH=$(cat testing/env_python_3.9_with_R.yml
.github/docker/python-r-env.Dockerfile | sha256sum | cut -d' ' -f1 | cut -c1-12)
+ IMAGE_NAME="ghcr.io/${{ github.repository_owner
}}/zeppelin-test-env:py39-r-${COMBINED_HASH}"
+ echo "hash=${COMBINED_HASH}" >> $GITHUB_OUTPUT
+ echo "image-name=${IMAGE_NAME}" >> $GITHUB_OUTPUT
+ echo "📦 Image will be: ${IMAGE_NAME}"
+
+ - name: Check if image already exists
+ id: check
+ run: |
+ IMAGE="${{ steps.hash.outputs.image-name }}"
+
+ if docker manifest inspect ${IMAGE} >/dev/null 2>&1; then
+ echo "✅ Image ${IMAGE} already exists, skipping build"
+ echo "exists=true" >> $GITHUB_OUTPUT
+ else
+ echo "❌ Image ${IMAGE} does not exist, will build"
+ echo "exists=false" >> $GITHUB_OUTPUT
+ fi
+ continue-on-error: true
+
+ - name: Set up Docker Buildx
+ if: steps.check.outputs.exists != 'true'
+ uses: docker/setup-buildx-action@v3
+
+ - name: Log in to GitHub Container Registry
+ if: steps.check.outputs.exists != 'true'
+ uses: docker/login-action@v3
+ with:
+ registry: ghcr.io
+ username: ${{ github.actor }}
+ password: ${{ secrets.GITHUB_TOKEN }}
+
+ - name: Build and push Docker image
+ if: steps.check.outputs.exists != 'true'
+ uses: docker/build-push-action@v5
+ with:
+ context: .
+ file: .github/docker/python-r-env.Dockerfile
+ push: true
+ tags: |
+ ${{ steps.hash.outputs.image-name }}
+ ghcr.io/${{ github.repository_owner
}}/zeppelin-test-env:py39-r-latest
+ cache-from: type=gha
+ cache-to: type=gha,mode=max
+ labels: |
+ org.opencontainers.image.source=${{
github.event.repository.html_url }}
+ org.opencontainers.image.revision=${{ github.sha }}
+ build-args: |
+ ENV_FILE=testing/env_python_3.9_with_R.yml
Review Comment:
https://docs.github.com/en/actions/reference/workflows-and-actions/events-that-trigger-workflows#workflows-in-forked-repositories
PRs from forked repositories are limited to read-only permissions, which
prevents workflows triggered by them from pushing images to the registry.
I suggest separating the build and push steps. (Assuming the image is not
already in the registry,) we should build the image first for local testing,
and then execute the push step only if the workflow has write access to GHCR.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]