Copilot commented on code in PR #389:
URL: https://github.com/apache/doris-thirdparty/pull/389#discussion_r3097763734
##########
.github/workflows/build.yml:
##########
@@ -251,9 +252,125 @@ jobs:
tar -cf - installed | xz -z -T0 - >"doris-thirdparty-prebuilt-${kernel}-${arch}.tar.xz"
gh release upload --clobber automation "doris-thirdparty-prebuilt-${kernel}-${arch}.tar.xz"
+  update-docker:
+    name: Update Docker Image
+    needs: [prerelease, build]
+    if: |
+      always() && (
+        (needs.prerelease.outputs.should_release == 'true' && needs.build.result == 'success') ||
+        (github.event_name == 'workflow_dispatch' && needs.build.result != 'failure')
Review Comment:
The `update-docker` job runs on `workflow_dispatch` when `needs.build.result
!= 'failure'`, which also includes `cancelled`/`skipped`. That can trigger a
Docker push even when the build was cancelled or never produced a fresh
prebuilt artifact, leading to confusing failures (missing release asset) or
publishing an unintended image. Consider tightening the condition to only allow
`success` (or explicitly `success || skipped` if intentionally reusing an
existing release asset), and/or add an explicit check that the expected release
asset exists before building/pushing.
```suggestion
        (github.event_name == 'workflow_dispatch' && needs.build.result == 'success')
```
##########
.github/workflows/build.yml:
##########
@@ -251,9 +252,125 @@ jobs:
tar -cf - installed | xz -z -T0 - >"doris-thirdparty-prebuilt-${kernel}-${arch}.tar.xz"
gh release upload --clobber automation "doris-thirdparty-prebuilt-${kernel}-${arch}.tar.xz"
+  update-docker:
+    name: Update Docker Image
+    needs: [prerelease, build]
+    if: |
+      always() && (
+        (needs.prerelease.outputs.should_release == 'true' && needs.build.result == 'success') ||
+        (github.event_name == 'workflow_dispatch' && needs.build.result != 'failure')
+      )
+    runs-on: ubuntu-latest
+    env:
+      GH_REPO: ${{ github.repository }}
+      GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+    steps:
+      - name: Set up Docker Buildx
+        uses: docker/setup-buildx-action@v3
+
+      - name: Login to Docker Hub
+        uses: docker/login-action@v3
+        with:
+          username: ${{ secrets.DOCKERHUB_USERNAME }}
+          password: ${{ secrets.DOCKERHUB_TOKEN }}
+
+      - name: Prepare Dockerfiles
+        run: |
+          # Pull the upstream Dockerfile from apache/doris to stay in sync with
+          # any toolchain / dependency changes they make.
+          curl -fsSL https://raw.githubusercontent.com/apache/doris/master/docker/compilation/Dockerfile \
+            -o Dockerfile.upstream
+
+          python3 - << 'PYEOF'
+          import sys, re
+
+          with open('Dockerfile.upstream') as f:
+              content = f.read()
+
+          # Patch 1: fix epel metalink.
+          # The upstream RUN line installs epel-release then immediately runs yum
+          # install/clean/makecache, but epel.repo's metalink may be unreliable.
+          # Insert a sed fix between epel-release install and the next yum install.
+          old_epel = 'yum install epel-release -y && yum install https://packages.endpointdev.com'
+          new_epel = ('yum install epel-release -y \\\n'
+                      ' && sed -i \\\n'
+                      ' -e \'s/^metalink=/#metalink=/\' \\\n'
+                      ' -e \'s|^#baseurl=http://download.fedoraproject.org/pub/epel/7|baseurl=https://mirrors.aliyun.com/epel/7|\' \\\n'
+                      ' /etc/yum.repos.d/epel*.repo \\\n'
+                      ' && yum install https://packages.endpointdev.com')
+          assert old_epel in content, f"Patch 1 failed: target string not found in upstream Dockerfile"
+          patched = content.replace(old_epel, new_epel, 1)
+          assert patched != content, "Patch 1 was a no-op: upstream Dockerfile may have changed"
+
+          # Patch 2: replace "clone & build thirdparty" block with downloading
+          # our prebuilt artifact. The block starts with "# clone lastest source
+          # code" comment and ends with "rm -rf ${DEFAULT_DIR}/doris".
+          prebuilt_block = (
+              '# Download prebuilt thirdparty from GitHub Release (built by doris-thirdparty automation)\n'
+              'ARG GITHUB_REPOSITORY\n'
+              'RUN mkdir -p /var/local/thirdparty \\\n'
+              ' && wget -q "https://github.com/${GITHUB_REPOSITORY}/releases/download/automation/doris-thirdparty-prebuilt-linux-x86_64.tar.xz" \\\n'
+              ' -O /tmp/prebuilt.tar.xz \\\n'
+              ' && tar -xf /tmp/prebuilt.tar.xz -C /var/local/thirdparty \\\n'
+              ' && rm /tmp/prebuilt.tar.xz\n'
+          )
+          patched_2 = re.sub(
+              r'# clone lastest source code.*?rm -rf \$\{DEFAULT_DIR\}/doris\n',
+              prebuilt_block,
+              patched, flags=re.DOTALL
+          )
+          assert patched_2 != patched, "Patch 2 was a no-op: 'clone lastest source code' block not found in upstream Dockerfile"
+          patched = patched_2
Review Comment:
The Python patching logic is very brittle: Patch 2 depends on the exact
comment text `# clone lastest source code` and Patch 1 depends on an exact RUN
substring. If upstream reformats or fixes typos, this workflow will start
failing. Consider making the regex/string matches more resilient (e.g.,
tolerate `latest/lastest`, anchor around more stable commands, or use a small
patch file applied with `patch`), so upstream drift causes fewer spurious
failures.
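For instance (a sketch only, not validated against the real upstream Dockerfile), the Patch 2 anchor could tolerate both the `lastest` typo and a corrected `latest`, allow flexible whitespace, and fail loudly when the match count is not exactly one:

```python
import re

# Hypothetical resilient anchor for the clone-and-build block: accepts either
# "lastest" or "latest" and arbitrary whitespace instead of one exact string.
CLONE_BLOCK = re.compile(
    r'#\s*clone\s+la[st]+est\s+source\s+code'  # matches "latest" or "lastest"
    r'.*?'
    r'rm\s+-rf\s+\$\{DEFAULT_DIR\}/doris\n',
    re.DOTALL,
)

def replace_clone_block(dockerfile: str, replacement: str) -> str:
    patched, count = CLONE_BLOCK.subn(replacement, dockerfile)
    # Surface upstream drift explicitly instead of silently no-op'ing.
    assert count == 1, f"expected exactly 1 clone-and-build block, found {count}"
    return patched
```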
##########
.github/workflows/build.yml:
##########
@@ -251,9 +252,125 @@ jobs:
tar -cf - installed | xz -z -T0 - >"doris-thirdparty-prebuilt-${kernel}-${arch}.tar.xz"
gh release upload --clobber automation "doris-thirdparty-prebuilt-${kernel}-${arch}.tar.xz"
+  update-docker:
+    name: Update Docker Image
+    needs: [prerelease, build]
+    if: |
+      always() && (
+        (needs.prerelease.outputs.should_release == 'true' && needs.build.result == 'success') ||
+        (github.event_name == 'workflow_dispatch' && needs.build.result != 'failure')
+      )
+    runs-on: ubuntu-latest
+    env:
+      GH_REPO: ${{ github.repository }}
+      GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+    steps:
+      - name: Set up Docker Buildx
+        uses: docker/setup-buildx-action@v3
+
+      - name: Login to Docker Hub
+        uses: docker/login-action@v3
+        with:
+          username: ${{ secrets.DOCKERHUB_USERNAME }}
+          password: ${{ secrets.DOCKERHUB_TOKEN }}
+
+      - name: Prepare Dockerfiles
+        run: |
+          # Pull the upstream Dockerfile from apache/doris to stay in sync with
+          # any toolchain / dependency changes they make.
+          curl -fsSL https://raw.githubusercontent.com/apache/doris/master/docker/compilation/Dockerfile \
+            -o Dockerfile.upstream
+
+          python3 - << 'PYEOF'
+          import sys, re
+
+          with open('Dockerfile.upstream') as f:
+              content = f.read()
+
+          # Patch 1: fix epel metalink.
+          # The upstream RUN line installs epel-release then immediately runs yum
+          # install/clean/makecache, but epel.repo's metalink may be unreliable.
+          # Insert a sed fix between epel-release install and the next yum install.
+          old_epel = 'yum install epel-release -y && yum install https://packages.endpointdev.com'
+          new_epel = ('yum install epel-release -y \\\n'
+                      ' && sed -i \\\n'
+                      ' -e \'s/^metalink=/#metalink=/\' \\\n'
+                      ' -e \'s|^#baseurl=http://download.fedoraproject.org/pub/epel/7|baseurl=https://mirrors.aliyun.com/epel/7|\' \\\n'
+                      ' /etc/yum.repos.d/epel*.repo \\\n'
+                      ' && yum install https://packages.endpointdev.com')
+          assert old_epel in content, f"Patch 1 failed: target string not found in upstream Dockerfile"
+          patched = content.replace(old_epel, new_epel, 1)
+          assert patched != content, "Patch 1 was a no-op: upstream Dockerfile may have changed"
+
+          # Patch 2: replace "clone & build thirdparty" block with downloading
+          # our prebuilt artifact. The block starts with "# clone lastest source
+          # code" comment and ends with "rm -rf ${DEFAULT_DIR}/doris".
+          prebuilt_block = (
+              '# Download prebuilt thirdparty from GitHub Release (built by doris-thirdparty automation)\n'
+              'ARG GITHUB_REPOSITORY\n'
+              'RUN mkdir -p /var/local/thirdparty \\\n'
+              ' && wget -q "https://github.com/${GITHUB_REPOSITORY}/releases/download/automation/doris-thirdparty-prebuilt-linux-x86_64.tar.xz" \\\n'
+              ' -O /tmp/prebuilt.tar.xz \\\n'
+              ' && tar -xf /tmp/prebuilt.tar.xz -C /var/local/thirdparty \\\n'
+              ' && rm /tmp/prebuilt.tar.xz\n'
+          )
+          patched_2 = re.sub(
+              r'# clone lastest source code.*?rm -rf \$\{DEFAULT_DIR\}/doris\n',
+              prebuilt_block,
+              patched, flags=re.DOTALL
+          )
+          assert patched_2 != patched, "Patch 2 was a no-op: 'clone lastest source code' block not found in upstream Dockerfile"
+          patched = patched_2
+
+          # Write normal image Dockerfile
+          with open('Dockerfile.patched', 'w') as f:
+              f.write(patched)
+
+          # Patch 3 (no-avx2): add USE_AVX2=0 to the ENV block in builder stage.
+          # The ENV block ends with PATH=... just before "# install ccache".
+          old_avx2 = 'PATH="/var/local/ldb-toolchain/bin/:$PATH"\n # USE_AVX2=0'
+          new_avx2 = 'PATH="/var/local/ldb-toolchain/bin/:$PATH" \\\n USE_AVX2=0'
+          assert old_avx2 in patched, "Patch 3 failed: '# USE_AVX2=0' comment not found; upstream Dockerfile may have changed"
+          noavx2 = patched.replace(old_avx2, new_avx2)
+          assert noavx2 != patched, "Patch 3 was a no-op: no-avx2 Dockerfile is identical to normal Dockerfile"
+          with open('Dockerfile.patched-noavx2', 'w') as f:
+              f.write(noavx2)
+
+          print("=== normal: check ===")
+          for line in open('Dockerfile.patched'):
+              if any(k in line for k in ['COPY doris', 'build-thirdparty', 'ARG GITHUB', 'prebuilt', 'USE_AVX2']):
+                  print(line, end='')
+          print("=== noavx2: check ===")
+          for line in open('Dockerfile.patched-noavx2'):
+              if any(k in line for k in ['COPY doris', 'build-thirdparty', 'ARG GITHUB', 'prebuilt', 'USE_AVX2']):
+                  print(line, end='')
Review Comment:
The script only prints diagnostic lines if `COPY doris` / `build-thirdparty`
are still present, but it doesn’t fail the job. Since this workflow never
checks out a build context, leaving any `COPY` instructions in the patched
Dockerfile will make the Docker build fail later with a less actionable error.
Consider adding assertions that these patterns are *absent* in
`Dockerfile.patched`/`Dockerfile.patched-noavx2` after patching.
```suggestion
          diagnostic_patterns = ['COPY doris', 'build-thirdparty', 'ARG GITHUB', 'prebuilt', 'USE_AVX2']
          forbidden_patterns = ['COPY doris', 'build-thirdparty']
          def check_patched_dockerfile(path, label):
              print(f"=== {label}: check ===")
              with open(path) as f:
                  lines = f.readlines()
              for line in lines:
                  if any(k in line for k in diagnostic_patterns):
                      print(line, end='')
              for pattern in forbidden_patterns:
                  assert not any(pattern in line for line in lines), (
                      f"Patch validation failed for {path}: found forbidden pattern '{pattern}'; "
                      "patched Dockerfile still depends on unavailable build context"
                  )
          check_patched_dockerfile('Dockerfile.patched', 'normal')
          check_patched_dockerfile('Dockerfile.patched-noavx2', 'noavx2')
```
##########
.github/workflows/build.yml:
##########
@@ -251,9 +252,125 @@ jobs:
tar -cf - installed | xz -z -T0 - >"doris-thirdparty-prebuilt-${kernel}-${arch}.tar.xz"
gh release upload --clobber automation "doris-thirdparty-prebuilt-${kernel}-${arch}.tar.xz"
+  update-docker:
+    name: Update Docker Image
+    needs: [prerelease, build]
+    if: |
+      always() && (
+        (needs.prerelease.outputs.should_release == 'true' && needs.build.result == 'success') ||
+        (github.event_name == 'workflow_dispatch' && needs.build.result != 'failure')
+      )
+    runs-on: ubuntu-latest
+    env:
+      GH_REPO: ${{ github.repository }}
+      GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+    steps:
+      - name: Set up Docker Buildx
+        uses: docker/setup-buildx-action@v3
+
+      - name: Login to Docker Hub
+        uses: docker/login-action@v3
+        with:
+          username: ${{ secrets.DOCKERHUB_USERNAME }}
+          password: ${{ secrets.DOCKERHUB_TOKEN }}
+
+      - name: Prepare Dockerfiles
+        run: |
+          # Pull the upstream Dockerfile from apache/doris to stay in sync with
+          # any toolchain / dependency changes they make.
+          curl -fsSL https://raw.githubusercontent.com/apache/doris/master/docker/compilation/Dockerfile \
+            -o Dockerfile.upstream
Review Comment:
`curl` pulls the upstream Dockerfile from the moving `master` branch. That
makes the produced images non-reproducible and introduces a supply-chain risk
(the build content can change without any change in this repo). Consider
pinning to a specific commit SHA (e.g., derived from the Doris version you
already compute in `prerelease`) and/or verifying the fetched content before
building.
```suggestion
        env:
          DORIS_DOCKERFILE_COMMIT: 0123456789abcdef0123456789abcdef01234567
          DORIS_DOCKERFILE_SHA256: REPLACE_WITH_VERIFIED_SHA256
        run: |
          # Pull the upstream Dockerfile from a pinned apache/doris commit so the
          # build input is reproducible and cannot drift with changes on master.
          curl -fsSL "https://raw.githubusercontent.com/apache/doris/${DORIS_DOCKERFILE_COMMIT}/docker/compilation/Dockerfile" \
            -o Dockerfile.upstream
          printf '%s  %s\n' "${DORIS_DOCKERFILE_SHA256}" Dockerfile.upstream | sha256sum -c -
```
##########
.github/workflows/build.yml:
##########
@@ -251,9 +252,125 @@ jobs:
tar -cf - installed | xz -z -T0 - >"doris-thirdparty-prebuilt-${kernel}-${arch}.tar.xz"
gh release upload --clobber automation "doris-thirdparty-prebuilt-${kernel}-${arch}.tar.xz"
+  update-docker:
+    name: Update Docker Image
+    needs: [prerelease, build]
+    if: |
+      always() && (
+        (needs.prerelease.outputs.should_release == 'true' && needs.build.result == 'success') ||
+        (github.event_name == 'workflow_dispatch' && needs.build.result != 'failure')
+      )
+    runs-on: ubuntu-latest
+    env:
+      GH_REPO: ${{ github.repository }}
+      GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+    steps:
+      - name: Set up Docker Buildx
+        uses: docker/setup-buildx-action@v3
+
+      - name: Login to Docker Hub
+        uses: docker/login-action@v3
+        with:
+          username: ${{ secrets.DOCKERHUB_USERNAME }}
+          password: ${{ secrets.DOCKERHUB_TOKEN }}
+
+      - name: Prepare Dockerfiles
+        run: |
+          # Pull the upstream Dockerfile from apache/doris to stay in sync with
+          # any toolchain / dependency changes they make.
+          curl -fsSL https://raw.githubusercontent.com/apache/doris/master/docker/compilation/Dockerfile \
+            -o Dockerfile.upstream
+
+          python3 - << 'PYEOF'
+          import sys, re
+
+          with open('Dockerfile.upstream') as f:
+              content = f.read()
+
+          # Patch 1: fix epel metalink.
+          # The upstream RUN line installs epel-release then immediately runs yum
+          # install/clean/makecache, but epel.repo's metalink may be unreliable.
+          # Insert a sed fix between epel-release install and the next yum install.
+          old_epel = 'yum install epel-release -y && yum install https://packages.endpointdev.com'
+          new_epel = ('yum install epel-release -y \\\n'
+                      ' && sed -i \\\n'
+                      ' -e \'s/^metalink=/#metalink=/\' \\\n'
+                      ' -e \'s|^#baseurl=http://download.fedoraproject.org/pub/epel/7|baseurl=https://mirrors.aliyun.com/epel/7|\' \\\n'
+                      ' /etc/yum.repos.d/epel*.repo \\\n'
+                      ' && yum install https://packages.endpointdev.com')
+          assert old_epel in content, f"Patch 1 failed: target string not found in upstream Dockerfile"
+          patched = content.replace(old_epel, new_epel, 1)
+          assert patched != content, "Patch 1 was a no-op: upstream Dockerfile may have changed"
+
+          # Patch 2: replace "clone & build thirdparty" block with downloading
+          # our prebuilt artifact. The block starts with "# clone lastest source
+          # code" comment and ends with "rm -rf ${DEFAULT_DIR}/doris".
+          prebuilt_block = (
+              '# Download prebuilt thirdparty from GitHub Release (built by doris-thirdparty automation)\n'
+              'ARG GITHUB_REPOSITORY\n'
+              'RUN mkdir -p /var/local/thirdparty \\\n'
+              ' && wget -q "https://github.com/${GITHUB_REPOSITORY}/releases/download/automation/doris-thirdparty-prebuilt-linux-x86_64.tar.xz" \\\n'
+              ' -O /tmp/prebuilt.tar.xz \\\n'
+              ' && tar -xf /tmp/prebuilt.tar.xz -C /var/local/thirdparty \\\n'
+              ' && rm /tmp/prebuilt.tar.xz\n'
+          )
+          patched_2 = re.sub(
+              r'# clone lastest source code.*?rm -rf \$\{DEFAULT_DIR\}/doris\n',
+              prebuilt_block,
+              patched, flags=re.DOTALL
+          )
+          assert patched_2 != patched, "Patch 2 was a no-op: 'clone lastest source code' block not found in upstream Dockerfile"
+          patched = patched_2
+
+          # Write normal image Dockerfile
+          with open('Dockerfile.patched', 'w') as f:
+              f.write(patched)
+
+          # Patch 3 (no-avx2): add USE_AVX2=0 to the ENV block in builder stage.
+          # The ENV block ends with PATH=... just before "# install ccache".
+          old_avx2 = 'PATH="/var/local/ldb-toolchain/bin/:$PATH"\n # USE_AVX2=0'
+          new_avx2 = 'PATH="/var/local/ldb-toolchain/bin/:$PATH" \\\n USE_AVX2=0'
+          assert old_avx2 in patched, "Patch 3 failed: '# USE_AVX2=0' comment not found; upstream Dockerfile may have changed"
+          noavx2 = patched.replace(old_avx2, new_avx2)
+          assert noavx2 != patched, "Patch 3 was a no-op: no-avx2 Dockerfile is identical to normal Dockerfile"
+          with open('Dockerfile.patched-noavx2', 'w') as f:
+              f.write(noavx2)
+
+          print("=== normal: check ===")
+          for line in open('Dockerfile.patched'):
+              if any(k in line for k in ['COPY doris', 'build-thirdparty', 'ARG GITHUB', 'prebuilt', 'USE_AVX2']):
+                  print(line, end='')
+          print("=== noavx2: check ===")
+          for line in open('Dockerfile.patched-noavx2'):
+              if any(k in line for k in ['COPY doris', 'build-thirdparty', 'ARG GITHUB', 'prebuilt', 'USE_AVX2']):
+                  print(line, end='')
+          PYEOF
+
+      - name: Build and Push Docker Image (normal)
+        uses: docker/build-push-action@v5
+        with:
+          context: .
+          file: Dockerfile.patched
+          build-args: |
+            GITHUB_REPOSITORY=${{ github.repository }}
+          push: true
+          tags: ${{ vars.DOCKER_TAGS || 'apache/doris:build-env-ldb-toolchain-latest' }}
+          platforms: linux/amd64
+
+      - name: Build and Push Docker Image (no-avx2)
+        uses: docker/build-push-action@v5
+        with:
+          context: .
+          file: Dockerfile.patched-noavx2
+          build-args: |
+            GITHUB_REPOSITORY=${{ github.repository }}
+          push: true
+          tags: ${{ vars.DOCKER_TAGS_NOAVX2 || 'apache/doris:build-env-ldb-toolchain-no-avx2-latest' }}
+          platforms: linux/amd64
+
  success:
    name: Success
-    needs: [prerelease, build]
+    needs: [prerelease, build, update-docker]
    if: needs.prerelease.outputs.should_release == 'true'
Review Comment:
Adding `update-docker` to the `success` job’s `needs` means the release note
will no longer be updated to `SUCCESS` if the Docker build/push fails (or is
skipped), even when the prebuilt artifacts were successfully built and
uploaded. If that coupling isn’t intended, consider decoupling the
release-status update from the Docker publish (e.g., keep `success` depending
only on prerelease/build, and have a separate status line/section for the image
publish).
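One way the decoupling could look (an untested YAML sketch; the `docker-status` job name and its placeholder step are hypothetical):

```yaml
  success:
    name: Success
    needs: [prerelease, build]   # unchanged: release status reflects prebuilt artifacts only
    if: needs.prerelease.outputs.should_release == 'true'

  docker-status:
    name: Report Docker Image Status
    needs: [update-docker]
    if: always()
    runs-on: ubuntu-latest
    steps:
      - name: Report image publish result
        run: echo "Docker publish result: ${{ needs.update-docker.result }}"  # placeholder step
```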
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]