ryankert01 commented on code in PR #474: URL: https://github.com/apache/yunikorn-site/pull/474#discussion_r1797077542
##########
docs/user_guide/workloads/run_spark.md:
##########
@@ -25,12 +25,86 @@ specific language governing permissions and limitations under the License.
 -->
+## Deploy Spark job with Spark Operator and Helm
+
+:::note
+Pre-requisites:
+- This tutorial assumes YuniKorn is [installed](../../get_started/get_started.md) under the namespace `yunikorn`
+- Use spark-operator version >= 2.0 to enable support for YuniKorn gang scheduling
+:::
+
+:::warning
+This installation involves installing YuniKorn and Spark operator, which may take a few minutes to complete. To check the status we can use `kubectl get pods -n yunikorn` and `kubectl get pods -n spark-operator`
+:::
+
+### Install YuniKorn
+
+A simple script to install YuniKorn under the namespace `yunikorn`, refer to [Get Started](../../get_started/get_started.md) for more details.
+
+```shell script
+helm repo add yunikorn https://apache.github.io/yunikorn-release
+helm repo update
+helm install yunikorn yunikorn/yunikorn --create-namespace --namespace yunikorn
+```
+
+### Install spark operator
+
+We should install `spark-operator` with `controller.batchScheduler.enable=true` and set `controller.batchScheduler.default=yunikorn`. It's optional to set the default scheduler to YuniKorn since you can specify it later on, but it's recommended to do so.
+Also, note that our total allocated memory is `memory + memoryOverhead + spark.executor.pyspark.memory`, which will further propagate to the `yunikorn.apache.org/task-groups` annotation for subsequent use.

Review Comment:
Once [spark-operator#2209](https://github.com/kubeflow/spark-operator/pull/2209) is merged and released, the total will be `memory + memoryOverhead + spark.executor.pyspark.memory + spark.memory.offHeap.size`.

I opened a Jira to track it: https://issues.apache.org/jira/browse/YUNIKORN-2919
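
For illustration, the two memory totals discussed in this comment can be compared with a minimal sketch. The helper names (`to_mib`, `total_memory_mib`) and the sample sizes are hypothetical and are not taken from spark-operator or YuniKorn. The sketch only adds up the terms named in the comment; it does not model how the operator derives default overheads.

```python
# Hypothetical helpers for illustration only; not spark-operator or YuniKorn code.
UNITS_MIB = {"m": 1, "g": 1024}  # assumes only 'm' and 'g' suffixes are used


def to_mib(size: str) -> int:
    """Convert a size string such as '512m' or '2g' to MiB."""
    size = size.strip().lower()
    return int(size[:-1]) * UNITS_MIB[size[-1]]


def total_memory_mib(memory, memory_overhead, pyspark_memory="0m",
                     off_heap="0m", include_off_heap=False):
    """Sum the memory terms named in the review comment.

    include_off_heap=False: memory + memoryOverhead + spark.executor.pyspark.memory
    include_off_heap=True:  the same sum plus spark.memory.offHeap.size
    """
    total = to_mib(memory) + to_mib(memory_overhead) + to_mib(pyspark_memory)
    if include_off_heap:
        total += to_mib(off_heap)
    return total


# Sample values are made up for the example.
current = total_memory_mib("2g", "512m", "256m", "128m")
with_off_heap = total_memory_mib("2g", "512m", "256m", "128m", include_off_heap=True)
print(current, with_off_heap)
```

With these sample values, the second total exceeds the first by exactly the off-heap size, which is the difference the `yunikorn.apache.org/task-groups` annotation would need to reflect once the change is released.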
