This is an automated email from the ASF dual-hosted git repository.

ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new ccf44c0c197a [SPARK-55960][INFRA][CONNECT][PYTHON] Add a docker image for spark connect codegen
ccf44c0c197a is described below

commit ccf44c0c197a1b96c594983e1db9d725d7d15313
Author: Ruifeng Zheng <[email protected]>
AuthorDate: Wed Mar 11 16:59:55 2026 +0800

    [SPARK-55960][INFRA][CONNECT][PYTHON] Add a docker image for spark connect codegen
    
    ### What changes were proposed in this pull request?
    Add a docker image for spark connect codegen
    
    ### Why are the changes needed?
    Add this image so that Spark Connect codegen can be run locally without setting up buf and a Python environment.
    
    ### Does this PR introduce _any_ user-facing change?
    No
    
    ### How was this patch tested?
    Manually checked:
    
    ```bash
    docker build -t connect-cg dev/spark-test-image/connect-gen-protos/
    
    docker run -it --rm -v "$(pwd)":/spark connect-cg
    ```
    After running the commands above, the files in `python/pyspark/sql/connect/proto/` are correctly regenerated.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    No
    
    Closes #54755 from zhengruifeng/infra_buf_docker.
    
    Authored-by: Ruifeng Zheng <[email protected]>
    Signed-off-by: Ruifeng Zheng <[email protected]>
---
 dev/spark-test-image/connect-gen-protos/Dockerfile | 62 ++++++++++++++++++++++
 1 file changed, 62 insertions(+)

diff --git a/dev/spark-test-image/connect-gen-protos/Dockerfile b/dev/spark-test-image/connect-gen-protos/Dockerfile
new file mode 100644
index 000000000000..9f29c178aff1
--- /dev/null
+++ b/dev/spark-test-image/connect-gen-protos/Dockerfile
@@ -0,0 +1,62 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# Usage:
+# 1. Build the image
+# docker build -t connect-cg dev/spark-test-image/connect-gen-protos/
+# 2. Run the image from the root of the Spark repo
+# docker run -it --rm -v "$(pwd)":/spark connect-cg
+
+# Image for generating Spark Connect protobuf files. Based on Ubuntu 24.04.
+FROM ubuntu:noble
+LABEL org.opencontainers.image.authors="Apache Spark project <[email protected]>"
+LABEL org.opencontainers.image.licenses="Apache-2.0"
+LABEL org.opencontainers.image.ref.name="Apache Spark Infra Image For Spark Connect CodeGen"
+# Overwrite this label to avoid exposing the underlying Ubuntu OS version label
+LABEL org.opencontainers.image.version=""
+
+ENV FULL_REFRESH_DATE=20260311
+
+RUN apt-get update && apt-get install -y \
+    ca-certificates \
+    curl \
+    python3.12 \
+    python3.12-venv \
+    && apt-get autoremove --purge -y \
+    && apt-get clean \
+    && rm -rf /var/lib/apt/lists/*
+
+# Install buf binary from GitHub releases
+# See https://buf.build/docs/cli/installation/#github
+ARG BUF_VERSION="1.66.1"
+RUN curl -fsSL "https://github.com/bufbuild/buf/releases/download/v$BUF_VERSION/buf-Linux-$(uname -m)" -o /usr/local/bin/buf && \
+    chmod +x /usr/local/bin/buf
+
+# Setup virtual environment and install Python dependencies
+ENV VIRTUAL_ENV=/opt/spark-venv
+RUN python3.12 -m venv $VIRTUAL_ENV
+ENV PATH="$VIRTUAL_ENV/bin:$PATH"
+
+RUN python3.12 -m pip install \
+    'mypy==1.19.1' \
+    'mypy-protobuf==3.3.0' \
+    'black==23.12.1'
+
+# Mount the Spark repo at /spark
+WORKDIR /spark
+
+ENTRYPOINT [ "/spark/dev/connect-gen-protos.sh" ]
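
The buf download step in the Dockerfile above selects the release asset by machine architecture via `uname -m`. As a minimal sketch (not part of the commit), the URL that step resolves to can be previewed on the host before building the image. The variable name `BUF_URL` is an assumption introduced here for illustration only; the Dockerfile itself does not define it.

```bash
# Same default as the Dockerfile's ARG BUF_VERSION
BUF_VERSION="1.66.1"

# uname -m reports the machine architecture, e.g. x86_64 or aarch64.
# BUF_URL is a hypothetical helper variable, not used by the Dockerfile.
BUF_URL="https://github.com/bufbuild/buf/releases/download/v${BUF_VERSION}/buf-Linux-$(uname -m)"

# Print the URL that the curl command in the image build would fetch
echo "$BUF_URL"
```

Because `BUF_VERSION` is declared as an `ARG`, passing `--build-arg BUF_VERSION=<version>` to `docker build` should override the default without editing the Dockerfile.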

