Now I see what you want to do. If you have access to the cluster
configuration files, you can modify the spark-env.sh file on the worker
nodes to specify exactly which nodes should advertise GPU resources and
which should not. Only the nodes configured with GPU resources would
then be scheduled/acquired for your GPU tasks (see the RAPIDS user
guide at
https://nvidia.github.io/spark-rapids/docs/get-started/getting-started-on-prem.html).
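For example, on a standalone cluster each GPU worker's spark-env.sh
would carry something along these lines, while CPU-only workers simply
omit it (a minimal sketch; the GPU count and the discovery script path
are illustrative):

    # Advertise 4 GPUs on this worker; the discovery script reports their addresses
    SPARK_WORKER_OPTS="-Dspark.worker.resource.gpu.amount=4 \
      -Dspark.worker.resource.gpu.discoveryScript=/opt/sparkRapidsPlugin/getGpusResources.sh"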
We are using RAPIDS in our on-prem Spark environment with complete
control of the OS, file and network systems, containers, and even
hardware/GPU settings. I guess you are using one of the cloud services,
so I am not sure whether you have access to the low-level cluster
config on EMR or GCP, which give you cookie-cutter cluster settings
with limited configurability. But under the hood, I believe they do use
NVIDIA RAPIDS, which is currently the only option for GPU acceleration
in Spark (the Spark 3.x distribution package doesn't include RAPIDS or
any GPU integration libraries). So you may want to dive into the RAPIDS
instructions for more configuration and usage info (they provide
detailed instructions on how to run RAPIDS on EMR, Databricks, and GCP).
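If you do go the RAPIDS route, note that it is enabled as a plugin on
top of a stock Spark 3.x build, roughly like this (a sketch; the RAPIDS
jar still has to be put on the driver and executor classpath):

    spark.plugins=com.nvidia.spark.SQLPlugin
    spark.rapids.sql.enabled=true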
On 11/3/22 12:10 PM, Shay Elbaz wrote:
Thanks again Artemis, I really appreciate it. I have watched the video
but did not find an answer.
Please bear with me for just one more iteration 🙂
Let me be more specific:
Suppose I start the application with maxExecutors=500 and
executor.cores=2, because that's the amount of resources needed for
the ETL part. But for the DL part I only need 20 GPUs. The stage-level
scheduling API only allows setting the resources per executor/task, so
Spark would (try to) allocate up to 500 GPUs, assuming I configure the
profile with 1 GPU per executor.
So the question is: how do I limit the stage resources to 20 GPUs in total?
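In code, the profile I have in mind looks roughly like this (a sketch;
etlOutput and runInference stand in for our actual data and DL step):

    import org.apache.spark.resource.{ExecutorResourceRequests,
      ResourceProfileBuilder, TaskResourceRequests}

    // 1 GPU per executor and per task -- but nothing in the profile caps
    // the number of executors, so dynamic allocation can still scale the
    // stage toward maxExecutors (500) and thereby request up to 500 GPUs.
    val rpb = new ResourceProfileBuilder()
    rpb.require(new ExecutorResourceRequests().cores(2).resource("gpu", 1))
    rpb.require(new TaskResourceRequests().cpus(1).resource("gpu", 1.0))
    val gpuProfile = rpb.build

    val predictions = etlOutput.withResources(gpuProfile).mapPartitions(runInference)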
Thanks again,
Shay
------------------------------------------------------------------------
*From:* Artemis User <arte...@dtechspace.com>
*Sent:* Thursday, November 3, 2022 5:23 PM
*To:* user@spark.apache.org <user@spark.apache.org>
*Subject:* [EXTERNAL] Re: Re: Stage level scheduling - lower the
number of executors when using GPUs
Shay, you may find this video helpful (with some API code samples that
you are looking for):
https://www.youtube.com/watch?v=JNQu-226wUc&t=171s. The issue here
isn't how to limit the number of executors but how to request the
right GPU-enabled executors dynamically. The executors used in pre-GPU
stages should be returned to the resource manager with dynamic
resource allocation enabled (and with the right DRA policies). Hope
this helps.
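For example, the relevant dynamic allocation knobs look roughly like
this (a sketch; the timeout value is illustrative):

    spark.dynamicAllocation.enabled=true
    spark.dynamicAllocation.minExecutors=0
    spark.dynamicAllocation.executorIdleTimeout=60s

With an idle timeout set, the executors used by the CPU stages are
released once their tasks finish, so the resource manager can grant the
GPU-profile requests.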
Unfortunately there isn't much detailed documentation on this topic,
since GPU acceleration is relatively new in Spark (not as
straightforward as in TensorFlow). I wish the Spark doc team would
provide more details in the next release...
On 11/3/22 2:37 AM, Shay Elbaz wrote:
Thanks Artemis. We are *not* using RAPIDS, but rather using GPUs
through the Stage Level Scheduling feature with ResourceProfile. In
Kubernetes you have to turn on shuffle tracking for dynamic
allocation anyhow.
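Concretely, we run with both of the following, since Kubernetes has no
external shuffle service (a sketch of the two flags):

    spark.dynamicAllocation.enabled=true
    spark.dynamicAllocation.shuffleTracking.enabled=true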
The question is how we can limit the *number of executors* when
building a new ResourceProfile, directly (API) or indirectly (some
advanced workaround).
Thanks,
Shay
------------------------------------------------------------------------
*From:* Artemis User <arte...@dtechspace.com>
*Sent:* Thursday, November 3, 2022 1:16 AM
*To:* user@spark.apache.org <user@spark.apache.org>
*Subject:* [EXTERNAL] Re: Stage level scheduling - lower the number
of executors when using GPUs
Are you using RAPIDS for GPU support in Spark? A couple of options you
may want to try:
1. In addition to having dynamic allocation turned on, you may also
need to turn on the external shuffle service.
2. It sounds like you are using Kubernetes. In that case, you may also
need to turn on shuffle tracking instead.
3. The "stages" are controlled by the APIs. The APIs for dynamic
resource requests (change of stage) do exist, but only for RDDs
(e.g. TaskResourceRequest and ExecutorResourceRequest; see the
sketch after this list).
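A minimal sketch of that RDD-level API (inputRdd and gpuTask are
placeholders, and the amounts are illustrative):

    import org.apache.spark.resource.{ExecutorResourceRequests,
      ResourceProfileBuilder, TaskResourceRequests}

    // Ask for GPU-enabled executors only for the stages computing this
    // RDD; per-executor and per-task amounts are all the profile expresses.
    val profile = new ResourceProfileBuilder()
      .require(new ExecutorResourceRequests().cores(1).resource("gpu", 1))
      .require(new TaskResourceRequests().resource("gpu", 1.0))
      .build

    val out = inputRdd.withResources(profile).map(gpuTask)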
On 11/2/22 11:30 AM, Shay Elbaz wrote:
Hi,
Our typical applications need fewer *executors* for a GPU stage than
for a CPU stage. We are using dynamic allocation with stage-level
scheduling, and Spark tries to maximize the number of executors during
the GPU stage as well, causing a bit of resource chaos in the
cluster. This forces us to use a lower value for 'maxExecutors' in
the first place, at the cost of CPU-stage performance, or to try to
solve this at the Kubernetes scheduler level, which is not
straightforward and doesn't feel like the right way to go.
Is there a way to effectively use fewer executors in Stage Level
Scheduling? The API does not seem to include such an option, but
maybe there is some more advanced workaround?
Thanks,
Shay