DanielCarter-stack commented on PR #10500:
URL: https://github.com/apache/seatunnel/pull/10500#issuecomment-3920332374

   <!-- code-pr-reviewer -->
   <!-- cpr:pr_reply_v2_parts {"group": "apache/seatunnel#10500", "part": 1, 
"total": 1} -->
   ### Issue 1: Missing PR Description and Testing Instructions
   
   **Location**: PR Description (GitHub page)
   
   **Issue Description**:
   All key sections in the PR description are empty:
   - "Purpose of this pull request": empty
   - "Does this PR introduce any user-facing change?": empty
   - "How was this patch tested?": empty
   - "Check list": all options unchecked
   
   **Potential Risks**:
   - Reviewers cannot understand the true motivation for the change (why switch 
from `openjdk:8` to `seatunnelhub/openjdk:8u342`?)
   - Reviewers cannot confirm whether testing verification was performed
   - Violates Apache project PR template best practices
   
   **Impact Scope**:
   - Direct impact: PR review efficiency and quality
   - Indirect impact: may introduce unverified changes to production environment
   
   **Severity**: **MAJOR**
   
   **Improvement Suggestions**:
   Provide a complete PR description:
   ```
   ### Purpose of this pull request
   The official `openjdk:8` image on Docker Hub is deprecated and may lead to 
   unpredictable builds due to tag shifting. This PR updates the base image to 
   `seatunnelhub/openjdk:8u342`, which is already used in our E2E tests, to:
   1. Fix the deprecation warning
   2. Use a pinned version (8u342) for build reproducibility
   3. Align with the infrastructure used in our E2E test suite
   
   ### Does this PR introduce any user-facing change?
   Yes. The Docker image base layer changes from `openjdk:8` to 
`seatunnelhub/openjdk:8u342`.
   - Image size and performance should remain the same
   - No configuration or API changes required
   - Users can pull and use the image as before
   
   ### How was this patch tested?
   - Verified Docker build succeeds locally
   - [ ] Added E2E test to validate the Docker image
   - Tested image can run a simple SeaTunnel job
   ```
   
   **Rationale**: Apache projects require PR descriptions to clearly explain 
the reason for changes, impact, and testing methods; this is a basic code 
review requirement.
   
   ---
   
   ### Issue 2: Documentation Not Updated Synchronously
   
   **Location**:
   - `docs/en/getting-started/docker/docker.md:67`
   - `docs/zh/getting-started/docker/docker.md:67`
   
   **Related Context**:
   ```dockerfile
   # Example in current document (line 67):
   FROM openjdk:8
   
   # Should be updated to:
   FROM seatunnelhub/openjdk:8u342 as builder
   ```
   
   **Issue Description**:
   The Dockerfile example shown in the documentation still uses the old 
`openjdk:8` image, which is inconsistent with the code changes. Users following 
the documentation will get incorrect results.
   
   **Potential Risks**:
   - Users may fail when building images following the documentation (if 
`openjdk:8` is completely removed in the future)
   - User documentation is inconsistent with code implementation, reducing 
project credibility
   - Novice users may abandon using SeaTunnel due to documentation errors
   
   **Impact Scope**:
   - Direct impact: all users who refer to the official documentation to build 
Docker images
   - Indirect impact: increased user support costs (more issue reports)
   - Affected area: Docker deployment users
   
   **Severity**: **MAJOR**
   
   **Improvement Suggestions**:
   Update both Chinese and English documentation synchronously:
   
   ```markdown
   ## docs/en/getting-started/docker/docker.md
   
   The Dockerfile is like this:
   ```dockerfile
   -FROM openjdk:8
   +FROM seatunnelhub/openjdk:8u342 as builder
    
    ARG VERSION
    # Build from Source Code And Copy it into image
    COPY ./target/apache-seatunnel-${VERSION}-bin.tar.gz /opt/
   @@ -13,4 +12,6 @@ RUN cd /opt && \
        sed -i 
's/rootLogger.appenderRef.file.ref/#rootLogger.appenderRef.file.ref/' 
seatunnel/config/log4j2.properties && \    
        cp seatunnel/config/hazelcast-master.yaml 
seatunnel/config/hazelcast-worker.yaml
    
   +FROM seatunnelhub/openjdk:8u342
   +COPY --from=builder /opt/seatunnel /opt/seatunnel
    WORKDIR /opt/seatunnel
   ```
   ```
   
   同时在文档中添加说明:
   ```markdown
   > **Note:** We use `seatunnelhub/openjdk:8u342` as the base image instead of 
the 
   > deprecated `openjdk:8`. The `seatunnelhub` organization is maintained by 
the 
   > SeaTunnel community to ensure consistent and reproducible builds.
   ```
   
   ** Reason**: Documentation-code consistency is a basic requirement for open 
source projects, especially for user guide documentation.
   
   ---
   
   # ## Issue 3: Missing Docker Image Verification Tests
   
   ** Location**: CI/CD test suite
   
   ** Issue Description**:
   PR 中未添加任何测试来验证 Docker 镜像的可用性。虽然 E2E 测试使用了 
`seatunnelhub/openjdk:8u342`,但这不代表基于此镜像构建的 `apache/seatunnel` 镜像能正常工作。
   
   ** Related Context**:
   - E2E 
测试位置:`seatunnel-e2e/seatunnel-e2e-common/src/test/java/org/apache/seatunnel/e2e/common/container/seatunnel/`
   - 已有测试使用 `seatunnelhub/openjdk:8u342`,但未测试最终的 `apache/seatunnel` 镜像
   
   ** Potential Risks**:
   - 镜像构建可能成功但运行时失败(如 JDK 不兼容、库缺失等)
   - 镜像可能在特定环境(ARM vs x86)下行为不一致
   - 用户拉取到无法使用的镜像,损害项目声誉
   
   ** Impact Scope**:
   - 直接影响:所有使用 Docker 镜像的用户
   - 间接影响:CI/CD 流水线(如果镜像构建未在 CI 中测试)
   - 影响面:Docker 部署用户
   
   ** Severity**: **CRITICAL**
   
   ** Improvement Suggestions**:
   添加 Docker 镜像的 E2E 测试:
   ```java
   // 
seatunnel-e2e/seatunnel-e2e-common/src/test/java/org/apache/seatunnel/e2e/docker/DockerImageITCase.java
   
   package org.apache.seatunnel.e2e.docker;
   
   import org.junit.jupiter.api.Test;
   import org.testcontainers.containers.GenericContainer;
   import org.testcontainers.junit.jupiter.Container;
   import org.testcontainers.junit.jupiter.Testcontainers;
   import org.testcontainers.images.PullPolicy;
   
   import static org.junit.jupiter.api.Assertions.assertTrue;
   
   @Testcontainers
   public class DockerImageITCase {
       
       @Container
       private GenericContainer<?> seatunnelContainer = new GenericContainer<>(
           "apache/seatunnel:latest" // or use locally built image
       ).withCommand("./bin/seatunnel.sh", "--version")
        .withImagePullPolicy(PullPolicy.ifNotPresent());
       
       @Test
       public void testDockerImageCanExecute() {
           // Verify the container starts and command executes
           assertTrue(seatunnelContainer.isRunning());
       }
       
       @Test
       public void testDockerImageCanRunSimpleJob() {
           // Mount a test config and verify job completes
           // ...
       }
   }
   ```
   
   或在 CI/CD 中添加构建验证:
   ```yaml
   # .github/workflows/docker-build.yml
   name: Docker Image Build Test
   on: [pull_request]
   jobs:
     build-test:
       runs-on: ubuntu-latest
       steps:
         - uses: actions/checkout@v3
         - name: Build Docker image
           run: |
             cd seatunnel-dist
             docker build -f src/main/docker/Dockerfile --build-arg 
VERSION=test -t seatunnel:test .
         - name: Test Docker image
           run: |
             docker run --rm seatunnel:test ./bin/seatunnel.sh --version
   ```
   
   ** Reason**: Infrastructure changes (such as Docker base images) must have 
automated testing verification; this is a CI/CD best practice.
   
   ---
   
   # ## Issue 4: Image Supply Chain Security Not Documented
   
   ** Location**: `seatunnel-dist/src/main/docker/Dockerfile:1,15`
   
   ** Issue Description**:
   使用 `seatunnelhub/openjdk:8u342` 引入了外部依赖,但未说明:
   1. `seatunnelhub` 组织的归属和维护责任
   2. 镜像的构建流程和源代码
   3. 镜像的安全性保证(签名、SBOM、扫描)
   
   ** Related Context**:
   - Apache 项目对第三方依赖有严格要求(LICENSE、THIRD-PARTY 文件)
   - Docker 镜像作为二进制依赖,同样需要审计
   
   ** Potential Risks**:
   - 镜像可能包含恶意代码或后门
   - 镜像可能未及时更新安全补丁
   - 供应链攻击风险(如果 `seatunnelhub` 账户被攻破)
   
   ** Impact Scope**:
   - 直接影响:所有 Docker 镜像用户
   - 间接影响:Apache SeaTunnel 项目声誉
   - 影响面:生产环境安全性
   
   ** Severity**: **CRITICAL**
   
   ** Improvement Suggestions**:
   1. **公开说明 `seatunnelhub` 的维护**:
      - 在项目文档中说明 `seatunnelhub` Docker Hub 组织的维护者(PMC/Committer)
      - 在 CI/CD 中公开镜像构建脚本(如 `seatunnel-hub/docker-images/` 仓库)
   
   2. **添加镜像验证**:
      ```dockerfile
      # Verify image digests in Dockerfile or CI
      FROM seatunnelhub/openjdk:8u342@sha256:<expected-digest>
      ```
   
   3. **使用 Docker Content Trust**:
      ```bash
      export DOCKER_CONTENT_TRUST=1
      docker pull seatunnelhub/openjdk:8u342
      ```
   
   4. **添加 SBOM**:
      在镜像构建时生成 SBOM:
      ```bash
      syft scan seatunnelhub/openjdk:8u342 -o spdx-json > sbom.json
      ```
   
   5. **在 PR 描述中说明**:
      ```
      The `seatunnelhub/openjdk:8u342` image is maintained by the SeaTunnel PMC 
      and built from source at [repository-link]. The image is signed and 
scanned 
      for vulnerabilities in CI/CD.
      ```
   
   ** Reason**: Apache top-level projects have strict requirements for supply 
chain security; external image dependencies must be auditable.
   
   ---
   
   # ## Issue 5: Image Versions Pinned But No Upgrade Strategy
   
   ** Location**: `seatunnel-dist/src/main/docker/Dockerfile:1,15`
   
   ** Issue Description**:
   虽然使用固定版本 `8u342` 是好的实践,但未说明:
   1. 为何选择 `8u342` 而非更新的版本(如 `8u431`)
   2. 何时以及如何升级到更新的 JDK 版本
   3. 是否考虑 Java 11/17/21
   
   ** Related Context**:
   - Java 8 于 2014 年发布,已超过 10 年
   - Oracle JDK 8 公共更新于 2020 年结束(需要商业许可)
   - OpenJDK 8 由其他组织维护,版本号不一致
   
   ** Potential Risks**:
   - 长期使用旧版本 JDK 可能错过安全补丁
   - 未来升级 JDK 版本可能需要大量测试和适配工作
   - 用户可能对 JDK 版本选择有疑问
   
   ** Impact Scope**:
   - 直接影响:Docker 镜像的长期维护
   - 间接影响:用户对项目技术栈现代性的信心
   - 影响面:未来升级成本
   
   ** Severity**: **MINOR**
   
   ** Improvement Suggestions**:
   1. **在文档中说明 JDK 版本策略**:
      ```markdown
      ## JDK Version Policy
      
      - SeaTunnel Docker images currently use Java 8 (update 342) for maximum 
        compatibility with connectors and dependencies.
      - We plan to evaluate Java 11 support in 2.4.0 release.
      - JDK updates will be backported to stable releases for security fixes.
      ```
   
   2. **在 CI/CD 中添加 JDK 版本检查**:
      ```yaml
      - name: Check for JDK security updates
        run: |
          LATEST=$(curl -s 
https://api.github.com/repos/adoptium/temurin8-binaries/releases/latest | jq -r 
.tag_name)
          CURRENT="8u342"
          if [ "$LATEST" != "$CURRENT" ]; then
            echo "Warning: New JDK version available: $LATEST"
          fi
      ```
   
   3. **考虑 ARG 参数化版本**:
      ```dockerfile
      ARG JDK_VERSION=8u342
      FROM seatunnelhub/openjdk:${JDK_VERSION} as builder
      ```
   
   ** Reason**: Clear tech stack upgrade strategies help users plan and 
demonstrate the project's long-term maintenance commitment.
   
   ---
   
   # ## Issue 6: Missing Image Build Information Labels
   
   ** Location**: `seatunnel-dist/src/main/docker/Dockerfile`
   
   ** Issue Description**:
   Dockerfile 缺少 OCI 标准的 LABEL 元数据,不利于镜像的可追溯性和管理。
   
   ** Potential Risks**:
   - 无法从镜像本身获取构建信息(版本、源码、许可证)
   - 不符合 OCI(Open Container Initiative)最佳实践
   - 难以进行镜像生命周期管理
   
   ** Impact Scope**:
   - 直接影响:镜像的可管理性
   - 间接影响:问题排查和审计
   
   ** Severity**: **MINOR**
   
   ** Improvement Suggestions**:
   添加标准 LABEL:
   ```dockerfile
   FROM seatunnelhub/openjdk:8u342 as builder
   
   ARG VERSION
   ARG BUILD_DATE
   ARG COMMIT_SHA
   
   LABEL org.apache.seatunnel.version="${VERSION}"
   LABEL org.apache.seatunnel.build-date="${BUILD_DATE}"
   LABEL org.apache.seatunnel.commit="${COMMIT_SHA}"
   LABEL org.opencontainers.image.title="Apache SeaTunnel"
   LABEL org.opencontainers.image.description="SeaTunnel is a distributed, 
high-performance data integration platform"
   LABEL org.opencontainers.image.licenses="Apache-2.0"
   LABEL org.opencontainers.image.source="https://github.com/apache/seatunnel";
   LABEL org.opencontainers.image.vendor="Apache Software Foundation"
   ```
   
   **Rationale**: OCI standard labels are a best practice for container images, 
helping with image management and traceability.
   
   ---


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to