I do not know about AWS, but in Google Kubernetes Engine (GKE) you can enable cluster autoscaling as follows:
gcloud container clusters create example-cluster \
    --num-nodes 2 \
    --zone us-central1-a \
    --node-locations us-central1-a,us-central1-b,us-central1-f \
    --enable-autoscaling --min-nodes 1 --max-nodes 4

https://cloud.google.com/kubernetes-engine/docs/concepts/cluster-autoscaler
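If the cluster already exists, the same autoscaling bounds can be applied with an update command. A sketch only, assuming the cluster name and zone above and GKE's default node pool name (default-pool); worth checking the flags against your gcloud version:

# Enable node autoscaling on an existing cluster's default node pool
gcloud container clusters update example-cluster \
    --enable-autoscaling --min-nodes 1 --max-nodes 4 \
    --zone us-central1-a \
    --node-pool default-pool

Note that this scales Kubernetes nodes; the Spark worker pods themselves still need their replica count raised (e.g. by KEDA, as discussed in the thread below) before Spark can use the extra capacity.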
HTH

view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

*Disclaimer:* Use it at your own risk. Any and all responsibility for any loss,
damage or destruction of data or any other property which may arise from relying
on this email's technical content is explicitly disclaimed. The author will in no
case be liable for any monetary damages arising from such loss, damage or
destruction.


On Wed, 2 Feb 2022 at 16:40, Han Lin <h...@twistbioscience.com> wrote:

> For Kubernetes, yes -- we use AWS EKS. We manage the Spark cluster ourselves,
> and it is deployed via a Helm chart.
>
> On Wed, Feb 2, 2022 at 8:32 AM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>
>> Is this part of cloud service offerings like Google GKE?
>>
>> On Wed, 2 Feb 2022 at 13:17, Han Lin <h...@twistbioscience.com> wrote:
>>
>>> Hello,
>>>
>>> Before I ask my questions, let me say what I am trying to do and briefly
>>> describe the setup I have so far. I am basically building an API service
>>> that serves an ML model which uses Spark ML. I have Spark deployed on
>>> Kubernetes in standalone mode (Spark's built-in cluster manager) with 2
>>> worker nodes, each worker running as a pod. The API app itself is written
>>> in Python using FastAPI, and uses PySpark to start a Spark application in
>>> the Spark cluster. The Spark driver is created on app startup and lives in
>>> the same pod as the app, so the Spark application is expected to run for
>>> as long as the API pod is alive, which is indefinitely. I understand that
>>> this is not the typical pattern, since usually a Spark application is
>>> expected to complete at some point, as a kind of workload that eventually
>>> returns a result.
>>>
>>> When load on the API increases, I would like the Spark cluster to scale
>>> up automatically by increasing the number of worker replicas, and have
>>> the Spark application scale up by using those new workers in the cluster.
>>> For controlling horizontal pod scaling in Kubernetes we use KEDA
>>> <https://keda.sh/>, and in this case I will probably use certain
>>> Prometheus metrics exposed by the Spark Master as the scaling trigger.
>>> This leads to my questions.
>>>
>>> 1. I initially thought the number of pending jobs (jobs waiting for
>>> available executors) in the Spark cluster would be a good metric to use
>>> as a scaling trigger. But after some digging, I found that "pending" is
>>> not one of the statuses of jobs. Spark doesn't seem to have an obvious
>>> concept of a job queue. Are there any metrics that would give some
>>> indication of the number of jobs in the cluster waiting to be run?
>>> 2. Are there other Spark metrics that would make sense to use as a
>>> scaling trigger?
>>> 3. Does this setup of having perpetually running Spark applications even
>>> make much sense in the first place?
>>>
>>> Thanks very much in advance for any advice!
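For the KEDA piece described in the quoted message, a minimal sketch of what a Prometheus-triggered ScaledObject could look like. The Deployment name (spark-worker), the Prometheus address, the PromQL query and the threshold are all placeholders/assumptions that would have to be matched to your actual chart and metric names:

# Sketch only: scale a (hypothetical) spark-worker Deployment between 2 and 10
# replicas based on a Prometheus metric scraped from the Spark master.
kubectl apply -f - <<'EOF'
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: spark-worker-scaler
spec:
  scaleTargetRef:
    name: spark-worker          # placeholder Deployment name
  minReplicaCount: 2
  maxReplicaCount: 10
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus-server.monitoring.svc:9090   # placeholder
        metricName: spark_worker_saturation
        query: sum(spark_master_metric_here)   # placeholder PromQL -- substitute the master metric you scrape
        threshold: "8"
EOF

Whether the already-running application actually picks up executors on the new workers also depends on settings such as spark.cores.max, which would be worth checking in this setup.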