This is an automated email from the ASF dual-hosted git repository.
ruifengz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new ccf44c0c197a [SPARK-55960][INFRA][CONNECT][PYTHON] Add a docker image for spark connect codegen
ccf44c0c197a is described below
commit ccf44c0c197a1b96c594983e1db9d725d7d15313
Author: Ruifeng Zheng <[email protected]>
AuthorDate: Wed Mar 11 16:59:55 2026 +0800
[SPARK-55960][INFRA][CONNECT][PYTHON] Add a docker image for spark connect codegen
### What changes were proposed in this pull request?
Add a Docker image for Spark Connect codegen.
### Why are the changes needed?
With this image, the Spark Connect codegen can be run locally without
setting up buf and a Python environment.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Manually checked:
```bash
docker build -t connect-cg dev/spark-test-image/connect-gen-protos/
docker run -it --rm -v "$(pwd)":/spark connect-cg
```
After running the commands above, the files in `python/pyspark/sql/connect/proto/` are
correctly regenerated.
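
The manual check assumes both commands are run from the root of the Spark checkout, since the image's entrypoint is `/spark/dev/connect-gen-protos.sh` inside the mounted repo. A minimal sketch of a wrapper that verifies this first and then lists the regenerated files; `check_spark_root` and `regen_connect_protos` are hypothetical names, not part of this commit, and the sketch assumes `docker` and `git` are on the `PATH`:

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not part of this commit).

# Return 0 if the given directory looks like the root of a Spark checkout
# that contains the files the image needs; print a hint and return 1 otherwise.
check_spark_root() {
  local root="${1:-$(pwd)}"
  if [ ! -f "$root/dev/spark-test-image/connect-gen-protos/Dockerfile" ]; then
    echo "Dockerfile not found under $root" >&2
    return 1
  fi
  # The container's ENTRYPOINT is this script, seen through the /spark mount.
  if [ ! -x "$root/dev/connect-gen-protos.sh" ]; then
    echo "dev/connect-gen-protos.sh missing or not executable under $root" >&2
    return 1
  fi
}

# Build the image, run the codegen, then show which generated files changed.
regen_connect_protos() {
  local root="${1:-$(pwd)}"
  check_spark_root "$root" || return 1
  docker build -t connect-cg "$root/dev/spark-test-image/connect-gen-protos/" || return 1
  docker run -it --rm -v "$root":/spark connect-cg || return 1
  git -C "$root" status --short -- python/pyspark/sql/connect/proto/
}

# Usage, from the Spark repo root:
#   regen_connect_protos "$(pwd)"
```

An empty listing from the last step would mean the checked-in files already match the generator's output.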
### Was this patch authored or co-authored using generative AI tooling?
No
Closes #54755 from zhengruifeng/infra_buf_docker.
Authored-by: Ruifeng Zheng <[email protected]>
Signed-off-by: Ruifeng Zheng <[email protected]>
---
dev/spark-test-image/connect-gen-protos/Dockerfile | 62 ++++++++++++++++++++++
1 file changed, 62 insertions(+)
diff --git a/dev/spark-test-image/connect-gen-protos/Dockerfile b/dev/spark-test-image/connect-gen-protos/Dockerfile
new file mode 100644
index 000000000000..9f29c178aff1
--- /dev/null
+++ b/dev/spark-test-image/connect-gen-protos/Dockerfile
@@ -0,0 +1,62 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# Usage:
+# 1, Build the image
+# docker build -t connect-cg dev/spark-test-image/connect-gen-protos/
+# 2, Run the image under spark repo
+# docker run -it --rm -v "$(pwd)":/spark connect-cg
+
+# Image for generating Spark Connect protobuf files. Based on Ubuntu 24.04.
+FROM ubuntu:noble
+LABEL org.opencontainers.image.authors="Apache Spark project <[email protected]>"
+LABEL org.opencontainers.image.licenses="Apache-2.0"
+LABEL org.opencontainers.image.ref.name="Apache Spark Infra Image For Spark Connect CodeGen"
+# Overwrite this label to avoid exposing the underlying Ubuntu OS version label
+LABEL org.opencontainers.image.version=""
+
+ENV FULL_REFRESH_DATE=20260311
+
+RUN apt-get update && apt-get install -y \
+ ca-certificates \
+ curl \
+ python3.12 \
+ python3.12-venv \
+ && apt-get autoremove --purge -y \
+ && apt-get clean \
+ && rm -rf /var/lib/apt/lists/*
+
+# Install buf binary from GitHub releases
+# See https://buf.build/docs/cli/installation/#github
+ARG BUF_VERSION="1.66.1"
+RUN curl -fsSL "https://github.com/bufbuild/buf/releases/download/v$BUF_VERSION/buf-Linux-$(uname -m)" -o /usr/local/bin/buf && \
+ chmod +x /usr/local/bin/buf
+
+# Setup virtual environment and install Python dependencies
+ENV VIRTUAL_ENV=/opt/spark-venv
+RUN python3.12 -m venv $VIRTUAL_ENV
+ENV PATH="$VIRTUAL_ENV/bin:$PATH"
+
+RUN python3.12 -m pip install \
+ 'mypy==1.19.1' \
+ 'mypy-protobuf==3.3.0' \
+ 'black==23.12.1'
+
+# Mount the Spark repo at /spark
+WORKDIR /spark
+
+ENTRYPOINT [ "/spark/dev/connect-gen-protos.sh" ]