andygrove opened a new issue, #3537:
URL: https://github.com/apache/datafusion-comet/issues/3537
### What is the problem the feature request solves?
### Description
Add support for running benchmarks on a Kubernetes cluster using Spark's
`spark-submit --master k8s://...` client mode. This was explored during #3534
but removed as out of scope for the initial PR.
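
For illustration, a client-mode invocation could be assembled roughly as follows. This is a minimal sketch, not the benchmark runner's actual code: the function name `build_k8s_submit_args`, the API server URL, the image tag, and the script name are all hypothetical. The `spark.kubernetes.*` keys are standard Spark-on-Kubernetes settings.

```python
from typing import Dict, List, Optional


def build_k8s_submit_args(
    api_server: str,
    image: str,
    namespace: str,
    app: str,
    extra_conf: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Build a spark-submit argument list for Kubernetes client mode."""
    conf = {
        # Image used for executor pods (the driver runs outside in client mode).
        "spark.kubernetes.container.image": image,
        # Namespace in which executor pods are created.
        "spark.kubernetes.namespace": namespace,
    }
    conf.update(extra_conf or {})

    args = [
        "spark-submit",
        "--master", f"k8s://{api_server}",
        "--deploy-mode", "client",
    ]
    # Sort keys so the generated command line is deterministic.
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    args.append(app)
    return args


if __name__ == "__main__":
    # All values below are placeholders, not settings from the project.
    print(" ".join(build_k8s_submit_args(
        api_server="https://k8s.example.internal:6443",
        image="registry.example.internal/comet-bench:latest",
        namespace="comet-bench",
        app="run_benchmark.py",
    )))
```

Returning a list rather than a shell string keeps the result safe to pass to `subprocess.run` without quoting concerns.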
### Motivation
The current benchmark runner supports local and standalone Spark clusters
via docker-compose. Adding K8s support would enable:
- Running benchmarks on multi-node clusters with realistic resource
constraints
- Leveraging existing K8s infrastructure (e.g., K3s, EKS, GKE) without
managing standalone Spark clusters
- Better reproducibility via containerized executor pods with defined
resource limits
### Proposed Scope
- **K8s profile config** (`conf/profiles/k8s.conf`) with
`spark.master=k8s://...`, executor pod templates, and container image settings
- **RBAC manifests** (namespace, service account, role, role binding) for
the `comet-bench` namespace
- **PV/PVC definitions** for mounting benchmark data and engine JARs into
executor pods
- **Documentation** for pushing the `comet-bench` image to a
cluster-accessible registry and running benchmarks
- **Validation** with at least one TPC-H query on a multi-node cluster
(e.g., K3s)
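
As a rough sketch of what the RBAC item might cover, the four objects can be generated as JSON, which `kubectl` accepts alongside YAML. The service account name, role name, and the exact resource and verb lists below are assumptions for illustration; the permissions Spark actually needs on a given cluster may differ and should be checked before use.

```python
import json
from typing import Any, Dict, List

NAMESPACE = "comet-bench"
SERVICE_ACCOUNT = "comet-bench-spark"  # hypothetical name


def rbac_manifests(
    namespace: str = NAMESPACE,
    service_account: str = SERVICE_ACCOUNT,
) -> List[Dict[str, Any]]:
    """Return namespace, service account, role, and role binding manifests."""
    role_name = f"{service_account}-role"  # hypothetical name
    return [
        {"apiVersion": "v1", "kind": "Namespace",
         "metadata": {"name": namespace}},
        {"apiVersion": "v1", "kind": "ServiceAccount",
         "metadata": {"name": service_account, "namespace": namespace}},
        {"apiVersion": "rbac.authorization.k8s.io/v1", "kind": "Role",
         "metadata": {"name": role_name, "namespace": namespace},
         # Assumed permission set: enough to create and clean up executor pods
         # and the objects Spark commonly creates next to them.
         "rules": [{
             "apiGroups": [""],
             "resources": ["pods", "services", "configmaps",
                           "persistentvolumeclaims"],
             "verbs": ["create", "get", "list", "watch", "delete"],
         }]},
        {"apiVersion": "rbac.authorization.k8s.io/v1", "kind": "RoleBinding",
         "metadata": {"name": f"{role_name}-binding", "namespace": namespace},
         "subjects": [{"kind": "ServiceAccount", "name": service_account,
                       "namespace": namespace}],
         "roleRef": {"apiGroup": "rbac.authorization.k8s.io",
                     "kind": "Role", "name": role_name}},
    ]


def as_kubectl_list(manifests: List[Dict[str, Any]]) -> str:
    """Wrap manifests in a v1 List so they can be applied in one call."""
    return json.dumps(
        {"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2
    )


if __name__ == "__main__":
    print(as_kubectl_list(rbac_manifests()))
```

The output could be piped to `kubectl apply -f -`. Static YAML files checked into the repository would work equally well; generating them is only convenient if the namespace needs to be parameterized.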
### Key Considerations
- The `comet-bench` Docker image already includes both Java 8 and Java 17
runtimes and the TPC query files, so it can serve as the executor image
- Spark client mode requires the driver pod (or host) to be reachable from
executor pods; the network configuration needed for this may vary by cluster
- Engine JARs (Comet, Gluten) need to be accessible to executors, either
baked into the image or mounted via PVCs
- Gluten requires a `JAVA_HOME` override pointing to Java 8 on all executor pods
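
The considerations above map fairly directly onto Spark configuration keys. The following is a sketch under stated assumptions: the helper name `executor_conf`, the driver address, the Java 8 path inside the image, and the PVC and volume names are hypothetical placeholders. The keys themselves (`spark.driver.host`, `spark.executorEnv.*`, and `spark.kubernetes.executor.volumes.persistentVolumeClaim.*`) are documented Spark settings.

```python
from typing import Dict, Optional, Tuple


def executor_conf(
    driver_host: str,
    java_home: Optional[str] = None,
    pvc_mounts: Optional[Dict[str, Tuple[str, str]]] = None,
) -> Dict[str, str]:
    """Build Spark conf entries for client-mode executors on Kubernetes.

    pvc_mounts maps a volume name to (claim name, mount path).
    """
    # Executors connect back to the driver at this address, so it must be
    # routable from inside the cluster.
    conf = {"spark.driver.host": driver_host}

    if java_home:
        # Sets JAVA_HOME in every executor container (e.g. Java 8 for Gluten).
        conf["spark.executorEnv.JAVA_HOME"] = java_home

    for volume, (claim, path) in (pvc_mounts or {}).items():
        prefix = f"spark.kubernetes.executor.volumes.persistentVolumeClaim.{volume}"
        conf[f"{prefix}.options.claimName"] = claim
        conf[f"{prefix}.mount.path"] = path
    return conf


if __name__ == "__main__":
    # Placeholder values; the real paths depend on the comet-bench image.
    example = executor_conf(
        driver_host="10.0.0.5",
        java_home="/usr/lib/jvm/java-8-openjdk",
        pvc_mounts={"data": ("bench-data", "/data")},
    )
    for key, value in sorted(example.items()):
        print(f"{key}={value}")
```

If the engine JARs are baked into the image instead, `pvc_mounts` can simply be omitted for them.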
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]