[
https://issues.apache.org/jira/browse/SPARK-26398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17757293#comment-17757293
]
comet commented on SPARK-26398:
-------------------------------
I see that #23347 was closed without being merged. Does that mean GPU support
is not available in Spark and we need to build the Docker image ourselves? Is
there a step-by-step guide available?
> Support building GPU docker images
> ----------------------------------
>
> Key: SPARK-26398
> URL: https://issues.apache.org/jira/browse/SPARK-26398
> Project: Spark
> Issue Type: Improvement
> Components: Kubernetes, Spark Core
> Affects Versions: 2.4.0
> Reporter: Rong Ou
> Priority: Minor
>
> To run Spark on Kubernetes, a user first needs to build docker images using
> the `bin/docker-image-tool.sh` script. However, this script only supports
> building images for running on CPUs. As parts of Spark and related libraries
> (e.g. XGBoost) get accelerated on GPUs, it's desirable to build base images
> that can take advantage of GPU acceleration.
> This issue only addresses building docker images with CUDA support. Actually
> accelerating Spark on GPUs is outside the scope, as is supporting other types
> of GPUs.
> Today if anyone wants to experiment with running Spark on Kubernetes with GPU
> support, they have to write their own custom `Dockerfile`. By providing an
> "official" way to build GPU-enabled docker images, we can make it easier to
> get started.
> For now probably not that many people care about this, but it's a necessary
> first step towards GPU acceleration for Spark on Kubernetes.
> The risks are minimal as we only need to make minor changes to
> `bin/docker-image-tool.sh`. The PR is already done and will be attached.
> Success means anyone can easily build Spark docker images with GPU support.
> Proposed API changes: add an optional `-g` flag to
> `bin/docker-image-tool.sh` for building GPU versions of the JVM/Python/R
> docker images. When `-g` is omitted, existing behavior is preserved.
> Design sketch: when the `-g` flag is specified, we append `-gpu` to the
> docker image names, and switch to dockerfiles based on the official CUDA
> images. Since the CUDA images are based on Ubuntu while the Spark dockerfiles
> are based on Alpine, the steps for setting up additional packages are
> different, so there is a parallel set of `Dockerfile.gpu` files.
> Alternative: if we are willing to forgo Alpine and switch to Ubuntu for the
> CPU-only images, the two sets of dockerfiles can be unified, and we can just
> pass in a different base image depending on whether the `-g` flag is present
> or not.
>
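For readers trying to picture the design sketch quoted above, here is a minimal
shell sketch of the naming and Dockerfile-selection logic it describes. It is
not the code from the closed pull request: the helper names `gpu_enabled`,
`image_name`, and `dockerfile_path` are hypothetical, and only the `-g` flag,
the `-gpu` suffix, and the `Dockerfile.gpu` file name come from the issue text.

```shell
# Hypothetical helpers illustrating the proposal; not the actual PR code.

# gpu_enabled [ARGS...] prints "true" if -g is among the flags, else "false".
gpu_enabled() {
  local OPTIND=1
  local opt
  local gpu=false
  while getopts ":g" opt; do
    case "$opt" in
      g) gpu=true ;;
      *) ;;  # other flags are ignored in this sketch
    esac
  done
  printf '%s\n' "$gpu"
}

# image_name BASE GPU prints BASE, with "-gpu" appended when GPU is "true".
image_name() {
  if [ "$2" = "true" ]; then
    printf '%s-gpu\n' "$1"
  else
    printf '%s\n' "$1"
  fi
}

# dockerfile_path DIR GPU prints DIR/Dockerfile.gpu when GPU is "true",
# and DIR/Dockerfile otherwise.
dockerfile_path() {
  if [ "$2" = "true" ]; then
    printf '%s/Dockerfile.gpu\n' "$1"
  else
    printf '%s/Dockerfile\n' "$1"
  fi
}
```

With these definitions, `image_name spark "$(gpu_enabled -g)"` prints
`spark-gpu`, while omitting `-g` leaves the name as `spark`, which matches the
"existing behavior is preserved" requirement in the proposal.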