tbonelee commented on code in PR #5125:
URL: https://github.com/apache/zeppelin/pull/5125#discussion_r2594769163


##########
.github/workflows/core.yml:
##########
@@ -33,24 +33,93 @@ permissions:
   contents: read # to fetch code (actions/checkout)
 
 jobs:
+  # ============================================
+  # Job 1: Prepare Docker image
+  # ============================================
+  prepare-python-r-env:
+    runs-on: ubuntu-24.04
+    permissions:
+      contents: read
+      packages: write
+    outputs:
+      image-name: ${{ steps.image.outputs.name }}
+      pom-hash: ${{ steps.hash.outputs.pom }}
+
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+
+      - name: Calculate pom.xml hash
+        id: hash
+        run: |
+          echo "pom=${{ hashFiles('**/pom.xml') }}" >> $GITHUB_OUTPUT
+
+      - name: Generate image name with hash
+        id: image
+        run: |
+          # Include both environment file AND Dockerfile in hash calculation
+          COMBINED_HASH=$(cat testing/env_python_3.9_with_R.yml 
.github/docker/python-r-env.Dockerfile | sha256sum | cut -d' ' -f1 | cut -c1-12)
+          IMAGE_NAME="ghcr.io/${{ github.repository_owner 
}}/zeppelin-test-env:py39-r-${COMBINED_HASH}"
+
+          echo "name=${IMAGE_NAME}" >> $GITHUB_OUTPUT
+
+      - name: Check if image exists
+        id: check
+        run: |
+          if docker manifest inspect ${{ steps.image.outputs.name }} 
>/dev/null 2>&1; then
+            echo "exists=true" >> $GITHUB_OUTPUT
+          else
+            echo "exists=false" >> $GITHUB_OUTPUT
+          fi
+
+      - name: Set up Docker Buildx
+        if: steps.check.outputs.exists != 'true'
+        uses: docker/setup-buildx-action@v3
+
+      - name: Log in to GHCR
+        if: steps.check.outputs.exists != 'true'
+        uses: docker/login-action@v3
+        with:
+          registry: ghcr.io
+          username: ${{ github.actor }}
+          password: ${{ secrets.GITHUB_TOKEN }}
+
+      - name: Build and push if needed
+        if: steps.check.outputs.exists != 'true'
+        uses: docker/build-push-action@v5
+        with:
+          context: .
+          file: .github/docker/python-r-env.Dockerfile
+          push: true
+          tags: |
+            ${{ steps.image.outputs.name }}
+            ghcr.io/${{ github.repository_owner 
}}/zeppelin-test-env:py39-r-latest
+          cache-from: type=gha
+          cache-to: type=gha,mode=max

Review Comment:
   Is the workflow defined here the same as the one in `build-docker-image.yml`?
   
   If so, I'd suggest converting `build-docker-image.yml` into a reusable 
workflow (using `on: workflow_call`).
   
   That way, you can keep the current separation while allowing other 
wofkflows, such as `prepare-python-r-env`, to call it via the `uses` keyword.
   
   For reference, here are the cases I mentioned:
   
   - Reusable workflow being called
     - 
https://github.com/apache/airflow/blob/5845599cb740b1c6bce3fcb5efc9688a71f4b2a7/.github/workflows/ci-image-build.yml
   - Workflow that calls another workflow
     - 
https://github.com/apache/airflow/blob/5845599cb740b1c6bce3fcb5efc9688a71f4b2a7/.github/workflows/ci-amd-arm.yml#L237



##########
.github/workflows/build-docker-images.yml:
##########
@@ -0,0 +1,78 @@
+name: Build Docker Images
+
+on:
+  push:
+    branches: [master, 'branch-*']
+    paths:
+      - 'testing/env_*.yml'
+      - '.github/docker/**'
+  workflow_dispatch:  # Allow manual trigger
+
+jobs:
+  build-python-r-env:
+    runs-on: ubuntu-24.04
+    permissions:
+      contents: read
+      packages: write  # Required for GHCR push
+
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+
+      - name: Calculate environment file hash
+        id: hash
+        run: |
+          # Include both environment file AND Dockerfile in hash calculation
+          COMBINED_HASH=$(cat testing/env_python_3.9_with_R.yml 
.github/docker/python-r-env.Dockerfile | sha256sum | cut -d' ' -f1 | cut -c1-12)
+          IMAGE_NAME="ghcr.io/${{ github.repository_owner 
}}/zeppelin-test-env:py39-r-${COMBINED_HASH}"
+          echo "hash=${COMBINED_HASH}" >> $GITHUB_OUTPUT
+          echo "image-name=${IMAGE_NAME}" >> $GITHUB_OUTPUT
+          echo "📦 Image will be: ${IMAGE_NAME}"
+
+      - name: Check if image already exists
+        id: check
+        run: |
+          IMAGE="${{ steps.hash.outputs.image-name }}"
+
+          if docker manifest inspect ${IMAGE} >/dev/null 2>&1; then
+            echo "✅ Image ${IMAGE} already exists, skipping build"
+            echo "exists=true" >> $GITHUB_OUTPUT
+          else
+            echo "❌ Image ${IMAGE} does not exist, will build"
+            echo "exists=false" >> $GITHUB_OUTPUT
+          fi
+        continue-on-error: true
+
+      - name: Set up Docker Buildx
+        if: steps.check.outputs.exists != 'true'
+        uses: docker/setup-buildx-action@v3
+
+      - name: Log in to GitHub Container Registry
+        if: steps.check.outputs.exists != 'true'
+        uses: docker/login-action@v3
+        with:
+          registry: ghcr.io
+          username: ${{ github.actor }}
+          password: ${{ secrets.GITHUB_TOKEN }}
+
+      - name: Build and push Docker image
+        if: steps.check.outputs.exists != 'true'
+        uses: docker/build-push-action@v5
+        with:
+          context: .
+          file: .github/docker/python-r-env.Dockerfile
+          push: true
+          tags: |
+            ${{ steps.hash.outputs.image-name }}
+            ghcr.io/${{ github.repository_owner 
}}/zeppelin-test-env:py39-r-latest
+          cache-from: type=gha
+          cache-to: type=gha,mode=max
+          labels: |
+            org.opencontainers.image.source=${{ 
github.event.repository.html_url }}
+            org.opencontainers.image.revision=${{ github.sha }}
+          build-args: |
+            ENV_FILE=testing/env_python_3.9_with_R.yml

Review Comment:
   
https://docs.github.com/en/actions/reference/workflows-and-actions/events-that-trigger-workflows#workflows-in-forked-repositories
   
   PRs from forked repositories are limited to read-only permissions, which 
prevents workflows triggered by them from pushing images to the registry.
   
   I suggest separating the build and push steps. (Assuming the image is not 
already in the registry,) we should build the image first for local testing, 
and then execute the push step only if the workflow has write access to GHCR.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to