dongjoon-hyun commented on a change in pull request #35870:
URL: https://github.com/apache/spark/pull/35870#discussion_r837035197
##########
File path: docs/running-on-kubernetes.md
##########
@@ -1732,6 +1732,95 @@ Spark allows users to specify a custom Kubernetes schedulers.
 - Create additional Kubernetes custom resources for driver/executor scheduling.
 - Set scheduler hints according to configuration or existing Pod info dynamically.
+#### Using Volcano as Customized Scheduler for Spark on Kubernetes
+
+**This feature is currently experimental. In future versions, there may be behavioral changes around configuration, feature step improvement.**
+
+##### Prerequisites
+* Spark on Kubernetes with Volcano as a custom scheduler is supported since Spark v3.3.0 and Volcano v1.5.1.
+* See also [Volcano installation](https://volcano.sh/en/docs/installation).
+
+##### Build
+To create a Spark distribution along with Volcano suppport like those distributed by the Spark [Downloads page](https://spark.apache.org/downloads.html):
+
+```
+./dev/make-distribution.sh --name custom-spark --pip --r --tgz -Psparkr -Phive -Phive-thriftserver -Pmesos -Pyarn -Pkubernetes -Pvolcano
+```
+
+##### Usage
+Spark on Kubernetes allows using Volcano as a custom scheduler. Users can use Volcano to
+support more advanced resource scheduling: queue scheduling, resource reservation, priority scheduling, and more.
+
+To use Volcano as a custom scheduler the user needs to specify the following configuration options:
+
+```
+# Specify volcano scheduler
+--conf spark.kubernetes.scheduler.name=volcano
+# Specify driver/executor VolcanoFeatureStep
+--conf spark.kubernetes.driver.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
+--conf spark.kubernetes.executor.pod.featureSteps=org.apache.spark.deploy.k8s.features.VolcanoFeatureStep
+# Specify PodGroup template
+--conf spark.kubernetes.scheduler.volcano.podGroupTemplateFile=/path/to/podgroup-template.yaml
+```
+
+##### Volcano Feature Step
+Volcano feature steps help users to create a Volcano PodGroup and set driver/executor pod annotation to link with this PodGroup.
+
+Note that currently only driver/job level PodGroup is supported in Volcano Feature Step.
+
+##### Volcano PodGroup Template
+Volcano defines PodGroup spec using [CRD yaml](https://volcano.sh/en/docs/podgroup/#example)
+
+Similar to [Pod template](#pod-template), Spark users can similarly use Volcano PodGroup Template to define the PodGroup spec configurations.
+
+To do so, specify the Spark properties `spark.kubernetes.scheduler.volcano.podGroupTemplateFile` to point to files accessible to the `spark-submit` process.
+
+Below is an example of PodGroup template, see also [PodGroup Introduction](https://volcano.sh/en/docs/podgroup/#introduction):
+
+```
+apiVersion: scheduling.volcano.sh/v1beta1
+kind: PodGroup
+spec:
+  # Specify minMember to 1 to make driver
+  minMember: 1
+  # Specify minResources to support resource reservation
+  minResources:
+    cpu: "2"
+    memory: "3Gi"
+  # Specify the priority
+  priorityClassName: high-priority
+  queue: default

Review comment:
For consistency, please add a comment for `queue` like the other items. `queue` is a new concept to the users.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
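
One possible way to act on the review comment above, sketched as a revision of the PodGroup template from the diff. The wording of the `queue` comment is an assumption made for illustration and is not text from the pull request; it also assumes the `default` queue already exists in the cluster's Volcano installation.

```yaml
apiVersion: scheduling.volcano.sh/v1beta1
kind: PodGroup
spec:
  minMember: 1
  minResources:
    cpu: "2"
    memory: "3Gi"
  priorityClassName: high-priority
  # Specify the queue (hypothetical wording): the Volcano resource queue
  # that the PodGroup of this job is submitted to
  queue: default
```

Keeping the comment directly above `queue` matches how the other fields in the template are annotated, which is the consistency the reviewer asks for.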
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
