Re: Re: should one every make a spark streaming job in pyspark

2022-11-03 Thread Lingzhe Sun
In addition to that: for now, some stateful operations in Structured Streaming don't have an equivalent Python API, e.g. flatMapGroupsWithState. However, Spark engineers are making it possible in the upcoming version. See more:

Re: [EXTERNAL] Re: Re: Re: Stage level scheduling - lower the number of executors when using GPUs

2022-11-03 Thread Shay Elbaz
This is exactly what we ended up doing! The only drawback I saw with this approach is that the GPU tasks get pretty big (in terms of data and compute time), and task failures become expensive. That's why I reached out to the mailing list in the first place. Normally I try to aim for anything

Re: [EXTERNAL] Re: Re: Stage level scheduling - lower the number of executors when using GPUs

2022-11-03 Thread Artemis User
Now I see what you want to do. If you have access to the cluster configuration files, you can modify the spark-env.sh file on the worker nodes to specify exactly which nodes should be linked with GPU cores and which should not. This would allow only those nodes configured with GPU-resources

Re: [EXTERNAL] Re: Re: Stage level scheduling - lower the number of executors when using GPUs

2022-11-03 Thread Tom Graves
Stage level scheduling does not allow you to change configs right now. This is something we thought about as a follow-on but have never implemented. How many tasks on the DL stage are you running? The typical case is run some ETL with lots of tasks... do mapPartitions and then run your DL stuff,

Re: [EXTERNAL] Re: Re: Stage level scheduling - lower the number of executors when using GPUs

2022-11-03 Thread Sean Owen
Er, wait, this is what stage-level scheduling is, right? This has existed since 3.1: https://issues.apache.org/jira/browse/SPARK-27495 On Thu, Nov 3, 2022 at 12:10 PM bo yang wrote: > Interesting discussion here, looks like Spark does not support configuring > different number of executors in

Re: [EXTERNAL] Re: Re: Stage level scheduling - lower the number of executors when using GPUs

2022-11-03 Thread bo yang
Interesting discussion here, looks like Spark does not support configuring a different number of executors in different stages. Would love to see the community come out with such a feature. On Thu, Nov 3, 2022 at 9:10 AM Shay Elbaz wrote: > Thanks again Artemis, I really appreciate it. I have watched

Re: [EXTERNAL] Re: Re: Stage level scheduling - lower the number of executors when using GPUs

2022-11-03 Thread Shay Elbaz
Thanks again Artemis, I really appreciate it. I have watched the video but did not find an answer. Please bear with me for just one more iteration. Maybe I'll be more specific: Suppose I start the application with maxExecutors=500, executors.cores=2, because that's the amount of resources needed

Re: [EXTERNAL] Re: Stage level scheduling - lower the number of executors when using GPUs

2022-11-03 Thread Artemis User
Shay, you may find this video helpful (with some API code samples that you are looking for): https://www.youtube.com/watch?v=JNQu-226wUc&t=171s. The issue here isn't how to limit the number of executors but how to request the right GPU-enabled executors dynamically. Those executors used in

Re: should one every make a spark streaming job in pyspark

2022-11-03 Thread Mich Talebzadeh
Well, your mileage varies, so to speak.
- Spark itself is written in Scala. However, that does not imply you should stick with Scala.
- I have used both for Spark Streaming and Spark Structured Streaming; they both work fine.
- PySpark has become popular with the widespread use of

Re: [EXTERNAL] Re: Stage level scheduling - lower the number of executors when using GPUs

2022-11-03 Thread Shay Elbaz
Thanks Artemis. We are not using Rapids, but rather using GPUs through the Stage Level Scheduling feature with ResourceProfile. In Kubernetes you have to turn on shuffle tracking for dynamic allocation, anyhow. The question is how we can limit the number of executors when building a new
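
For reference, the application-level settings mentioned in this thread (dynamic allocation with shuffle tracking, a maximum executor count, and cores per executor) could be collected roughly as in the sketch below. The helper names `dynamic_allocation_conf` and `to_submit_args` are hypothetical and exist only for illustration; the Spark property names are the standard ones, and the values 500 and 2 are taken from the example figures quoted earlier in the thread. Note that these settings apply to the whole application; they do not by themselves answer the question of lowering the executor count for a single stage.

```python
from typing import Dict, List


def dynamic_allocation_conf(max_executors: int, executor_cores: int) -> Dict[str, str]:
    """Build application-wide Spark settings for dynamic allocation (hypothetical helper)."""
    return {
        "spark.dynamicAllocation.enabled": "true",
        # Shuffle tracking lets dynamic allocation work without an external
        # shuffle service, which is the situation described for Kubernetes.
        "spark.dynamicAllocation.shuffleTracking.enabled": "true",
        # Upper bound on executors for the whole application, not per stage.
        "spark.dynamicAllocation.maxExecutors": str(max_executors),
        "spark.executor.cores": str(executor_cores),
    }


def to_submit_args(conf: Dict[str, str]) -> List[str]:
    """Render the settings as spark-submit style '--conf key=value' arguments."""
    args: List[str] = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


# Illustrative values only, matching the figures mentioned in the thread.
conf = dynamic_allocation_conf(max_executors=500, executor_cores=2)
print(" ".join(to_submit_args(conf)))
```

The same key/value pairs could equally be passed through a SparkConf or a properties file; rendering them as command-line arguments is just one convenient way to show them.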