Re: [EXTERNAL] Re: Re: Re: Stage level scheduling - lower the number of executors when using GPUs

2022-11-14 Thread Shay Elbaz
We're actually running on on-prem Kubernetes with a custom-built Spark image, 
with an altered entrypoint.sh and other "low-level" scripts and configs, but I 
don't think this is a good direction for solving this specific issue.

Shay


Re: [EXTERNAL] Re: Re: Re: Re: Re: Stage level scheduling - lower the number of executors when using GPUs

2022-11-06 Thread Shay Elbaz
I don't think there is a definitive right or wrong approach here. The SLS 
feature would not have been added to Spark if there were no real need for it, 
and AFAIK it required quite a bit of refactoring of Spark internals. So I'm 
sure this discussion has already taken place in the developer community :)

In my specific case, I also need it for interactive dev/research sessions on 
Jupyter notebooks, where it makes more sense to switch resources than to stop 
the session and start a new one (over and over again).

Shay


Re: [EXTERNAL] Re: Re: Re: Re: Stage level scheduling - lower the number of executors when using GPUs

2022-11-06 Thread ayan guha
May I ask why the ETL job and the DL (assuming you mean deep learning here)
task cannot be run as 2 separate Spark jobs?

IMHO it is better practice to split up the entire pipeline into logical steps
and orchestrate them.

That way you can pick the profile you need for 2 very different types of
workloads.

Ayan


Re: [EXTERNAL] Re: Re: Re: Re: Stage level scheduling - lower the number of executors when using GPUs

2022-11-05 Thread Shay Elbaz
Consider this:

  1.  The application is allowed to use only 20 GPUs.
  2.  To ensure exactly 20 GPUs, I use the 
df.rdd.repartition(20).withResources(gpus.build).mapPartitions(func) technique. 
(maxExecutors >> 20).
  3.  Given the volume of the input data, it takes 20 hours total to run the DL 
part (computer vision) on 20 GPUs, or 1 hour per GPU task.
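
For reference, the technique in step 2 looks roughly like this in PySpark (a 
minimal sketch; df, gpus (a ResourceProfileBuilder holding the GPU requests) 
and func are the placeholders used in this thread):

    # 20 partitions => at most 20 concurrent GPU tasks, so at most 20
    # one-GPU executors are ever busy, even if maxExecutors is 500.
    result = (df.rdd
                .repartition(20)
                .withResources(gpus.build)
                .mapPartitions(func)
                .collect())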

Normally, I would repartition to 200 partitions to get finer-grained ~6-minute 
tasks instead of 1-hour ones. But here we're "forced" to use only 20 
partitions. To be clear, I'm only referring to potential failures/lags here. 
The job needs at least 20 hours total (on 20 GPUs) no matter what, but if any 
task fails after 50 minutes for example, we have to re-process these 50 minutes 
again. Or if a task/executor lags behind due to environment issues, then 
speculative execution will only trigger another task after 1 hour. These issues 
would be avoided if we used 200 partitions, but then Spark will try to allocate 
more than 20 GPUs.
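
For what it's worth, the straggler side of this can be tuned somewhat with 
Spark's speculation settings; a hedged sketch, with the documented defaults 
shown for reference rather than as a recommendation:

    from pyspark import SparkConf

    conf = (SparkConf()
            .set("spark.speculation", "true")
            # A task only becomes a speculation candidate once the fastest
            # 75% of tasks in its stage have finished...
            .set("spark.speculation.quantile", "0.75")
            # ...and it has been running 1.5x longer than the median task.
            .set("spark.speculation.multiplier", "1.5"))

With 1-hour tasks, a speculative copy still launches very late, which is 
exactly the drawback described above.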

I hope that was more clear.
Thank you very much for helping.

Shay



Re: [EXTERNAL] Re: Re: Re: Stage level scheduling - lower the number of executors when using GPUs

2022-11-04 Thread Tom Graves
So I'm not sure I completely follow. Are you asking for a way to change the 
limit without having to do the repartition?  And your DL software doesn't care 
if you got, say, 30 executors instead of 20?  Normally I would expect the 
number of partitions at that point to be 200 (or whatever you set for your 
shuffle partitions) unless you are using the AQE coalescing partitions 
functionality, and then it could change. Are you using the latter?
> Normally I try to aim for anything between 30s-5m per task (failure-wise), 
> depending on the cluster, its stability, etc. But in this case, individual 
> tasks can take 30-60 minutes, if not much more. Any failure during this long 
> time is pretty expensive.
Are you saying that when you manually do the repartition, your DL tasks take 
30-60 minutes?  So again, you want something like AQE coalesce partitions to 
kick in to attempt to pick partition sizes for you?


Tom
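
For context, the AQE partition coalescing mentioned above is controlled by a 
couple of Spark 3.x configs; a minimal sketch:

    from pyspark import SparkConf

    conf = (SparkConf()
            # Adaptive Query Execution must be on for coalescing to apply:
            .set("spark.sql.adaptive.enabled", "true")
            # Lets AQE merge small shuffle partitions at runtime, changing
            # the post-shuffle partition count:
            .set("spark.sql.adaptive.coalescePartitions.enabled", "true"))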


Re: [EXTERNAL] Re: Re: Re: Stage level scheduling - lower the number of executors when using GPUs

2022-11-03 Thread Shay Elbaz
This is exactly what we ended up doing! The only drawback I saw with this 
approach is that the GPU tasks get pretty big (in terms of data and compute 
time), and task failures become expensive. That's why I reached out to the 
mailing list in the first place 🙂
Normally I try to aim for anything between 30s-5m per task (failure-wise), 
depending on the cluster, its stability, etc. But in this case, individual 
tasks can take 30-60 minutes, if not much more. Any failure during this long 
time is pretty expensive.


Shay


Re: [EXTERNAL] Re: Re: Stage level scheduling - lower the number of executors when using GPUs

2022-11-03 Thread Artemis User
Now I see what you want to do.  If you have access to the cluster 
configuration files, you can modify the spark-env.sh file on the worker 
nodes to specify exactly which nodes should expose GPU resources 
and which should not.  This ensures that only the nodes configured with 
GPU resources get scheduled/acquired for your GPU tasks (see the Rapids 
user guide at 
https://nvidia.github.io/spark-rapids/docs/get-started/getting-started-on-prem.html).


We are using Rapids in our on-prem Spark environment with complete 
control of the OS, file and network systems, containers, and even 
hardware/GPU settings.  I guess you are using one of the cloud services, 
so I am not sure if you have access to the low-level cluster config on 
EMR or GCP, which give you cookie-cutter cluster settings with 
limited configurability.  But under the hood, I believe they do use 
Nvidia Rapids, which is currently the only option for GPU acceleration in 
Spark (the Spark 3.x.x distribution package doesn't include Rapids or any 
GPU integration libs).  So you may want to dive into the Rapids 
instructions for more configuration and usage info (they do provide 
detailed instructions on how to run Rapids on EMR, Databricks and GCP).
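
For readers following along: on the application side, GPUs are advertised to 
Spark through the resource discovery configs. A hedged sketch (the script path 
is illustrative; any script that prints the GPU addresses in Spark's JSON 
resource format works, and Spark ships an example getGpusResources.sh for 
NVIDIA GPUs):

    from pyspark import SparkConf

    conf = (SparkConf()
            # Each executor advertises 1 GPU, located by the discovery script:
            .set("spark.executor.resource.gpu.amount", "1")
            .set("spark.executor.resource.gpu.discoveryScript",
                 "/opt/spark/scripts/getGpusResources.sh")
            # Each task claims 1 GPU, so one GPU task runs per executor:
            .set("spark.task.resource.gpu.amount", "1"))

On standalone clusters, the worker-side equivalent (spark.worker.resource.gpu.amount 
and a matching discovery script) can go into spark-env.sh via SPARK_WORKER_OPTS.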



Re: [EXTERNAL] Re: Re: Stage level scheduling - lower the number of executors when using GPUs

2022-11-03 Thread Tom Graves
Stage level scheduling does not allow you to change configs right now. This is 
something we thought about as a follow-on but have never implemented.  How many 
tasks in the DL stage are you running?  The typical case is: run some ETL with 
lots of tasks, do mapPartitions, and then run your DL stuff. Before that 
mapPartitions you could do a repartition if necessary to get exactly the 
number of tasks you want (20).  That way, even if maxExecutors=500, you will 
only ever need 20 (or whatever you repartition to), and Spark isn't going to 
ask for more than that.
Tom
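
Put together, the flow described here might look like the following (a hedged 
sketch; etl_df, gpu_profile and run_dl are hypothetical names for the ETL 
output, a one-GPU-per-executor ResourceProfile, and the DL function):

    # The ETL stages run under the default profile and can use all
    # 500 executors.
    prepared = etl_df.rdd

    # Exactly 20 partitions => exactly 20 DL tasks => at most 20 GPU
    # executors requested, regardless of maxExecutors.
    results = (prepared.repartition(20)
                       .withResources(gpu_profile)
                       .mapPartitions(run_dl)
                       .collect())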


Re: [EXTERNAL] Re: Re: Stage level scheduling - lower the number of executors when using GPUs

2022-11-03 Thread Sean Owen
Er, wait, this is what stage-level scheduling is, right? This has existed
since 3.1: https://issues.apache.org/jira/browse/SPARK-27495


Re: [EXTERNAL] Re: Re: Stage level scheduling - lower the number of executors when using GPUs

2022-11-03 Thread bo yang
Interesting discussion here. It looks like Spark does not support configuring
a different number of executors in different stages. Would love to see the
community come out with such a feature.


Re: [EXTERNAL] Re: Re: Stage level scheduling - lower the number of executors when using GPUs

2022-11-03 Thread Shay Elbaz
Thanks again Artemis, I really appreciate it. I have watched the video but did 
not find an answer.

Please bear with me just one more iteration 🙂

Maybe I'll be more specific:
Suppose I start the application with maxExecutors=500, executors.cores=2, 
because that's the amount of resources needed for the ETL part. But for the DL 
part I only need 20 GPUs. The SLS API only allows setting resources per 
executor/task, so Spark would (try to) allocate up to 500 GPUs, assuming I 
configure the profile with 1 GPU per executor.
So, the question is how do I limit the stage resources to 20 GPUs total?

Thanks again,
Shay
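
For readers who want the concrete API: building such a profile in PySpark 
looks roughly like this (a minimal sketch; the numbers mirror the scenario 
above, and func is a placeholder):

    from pyspark.resource import (ExecutorResourceRequests,
                                  TaskResourceRequests,
                                  ResourceProfileBuilder)

    # Shape of each DL executor: 2 cores and 1 GPU; each task claims 1 GPU.
    ereqs = ExecutorResourceRequests().cores(2).resource("gpu", 1)
    treqs = TaskResourceRequests().cpus(1).resource("gpu", 1)
    gpu_profile = ResourceProfileBuilder().require(ereqs).require(treqs).build

    # Note: the profile only describes the *shape* of one executor; there
    # is no per-profile cap on executor count, which is exactly the issue.
    df.rdd.withResources(gpu_profile).mapPartitions(func).collect()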



Re: [EXTERNAL] Re: Stage level scheduling - lower the number of executors when using GPUs

2022-11-03 Thread Artemis User
Shay,  You may find this video helpful (with some API code samples that 
you are looking for): 
https://www.youtube.com/watch?v=JNQu-226wUc&t=171s.  The issue here 
isn't how to limit the number of executors but how to request the right 
GPU-enabled executors dynamically.  The executors used in pre-GPU 
stages should be returned to the resource manager with dynamic 
resource allocation enabled (and with the right DRA policies).  Hope 
this helps.


Unfortunately there isn't much detailed documentation on this topic, since 
GPU acceleration is still fairly new in Spark (not as straightforward as in 
TF).  I wish the Spark doc team could provide more details in the next 
release...
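
The "returned to the resource manager" part is governed by the dynamic 
allocation timeouts; a hedged sketch (values illustrative, not 
recommendations):

    from pyspark import SparkConf

    conf = (SparkConf()
            .set("spark.dynamicAllocation.enabled", "true")
            # Release executors that sit idle this long, so the pre-GPU
            # (CPU) executors are handed back before the GPU stage runs:
            .set("spark.dynamicAllocation.executorIdleTimeout", "60s")
            # Idle executors holding cached blocks get their own timeout:
            .set("spark.dynamicAllocation.cachedExecutorIdleTimeout", "120s"))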



Re: [EXTERNAL] Re: Stage level scheduling - lower the number of executors when using GPUs

2022-11-02 Thread Shay Elbaz
Thanks Artemis. We are not using Rapids, but rather using GPUs through the 
Stage Level Scheduling feature with ResourceProfile. In Kubernetes you have to 
turn on shuffle tracking for dynamic allocation, anyhow.
The question is how we can limit the number of executors when building a new 
ResourceProfile, directly (API) or indirectly (some advanced workaround).

Thanks,
Shay




Re: Stage level scheduling - lower the number of executors when using GPUs

2022-11-02 Thread Artemis User
Are you using Rapids for GPU support in Spark?  A couple of options you 
may want to try (see the config sketch below):

1. In addition to having dynamic allocation turned on, you may also need to
   turn on the external shuffle service.
2. Sounds like you are using Kubernetes.  In that case, you may also
   need to turn on shuffle tracking.
3. The "stages" are controlled by the APIs.  The APIs for dynamic
   resource requests (change of stage) do exist, but only for RDDs (e.g.
   TaskResourceRequest and ExecutorResourceRequest).
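
Items 1 and 2 map to a couple of well-known configs; a minimal sketch:

    from pyspark import SparkConf

    conf = (SparkConf()
            .set("spark.dynamicAllocation.enabled", "true")
            # On YARN or standalone, pair this with the external shuffle
            # service (spark.shuffle.service.enabled=true). On Kubernetes
            # there is none, so shuffle tracking stands in for it:
            .set("spark.dynamicAllocation.shuffleTracking.enabled", "true"))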



Stage level scheduling - lower the number of executors when using GPUs

2022-11-02 Thread Shay Elbaz
Hi,

Our typical applications need fewer executors for a GPU stage than for a CPU
stage. We are using dynamic allocation with stage level scheduling, and Spark
tries to maximize the number of executors also during the GPU stage, causing a
bit of resource chaos in the cluster. This forces us to use a lower value for
'maxExecutors' in the first place, at the cost of CPU stage performance. Or we
try to solve this at the Kubernetes scheduler level, which is not
straightforward and doesn't feel like the right way to go.

Is there a way to effectively use fewer executors in Stage Level Scheduling? The
API does not seem to include such an option, but maybe there is some more
advanced workaround?

Thanks,
Shay







Spark streaming job not able to launch more number of executors

2020-09-18 Thread Vibhor Banga ( Engineering - VS)
Hi all,

We have a Spark Streaming job which reads from two Kafka topics with 10
partitions each, and we are running the streaming job with 3 concurrent
microbatches (so 20 partitions in total and a concurrency of 3).

We have following question:

In our processing DAG, we do an rdd.persist() at one stage, after which we
fork the DAG into two. Each of the forks has an action (forEach) at the
end. In this case, we are observing that the number of executors does not
exceed the number of input Kafka partitions: the job is not spawning more
than 60 executors (2*10*3). And we see that the tasks from the two actions
and the 3 concurrent microbatches are competing with each other for
resources. So even though the max processing time of a task is 'x', the
overall processing time of the stage is much greater than 'x'.

Is there a way by which we can ensure that the two forks of the DAG get
processed in parallel by spawning more executors? (We have not put any cap on
maxExecutors; one possible approach is sketched after the configs below.)

Following are the job configurations:
spark.dynamicAllocation.enabled: true
spark.dynamicAllocation.minExecutors: NOT_SET
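
The approach referenced above, sketched here (transformA/transformB and the
sink writers are illustrative placeholders, and spark.scheduler.mode=FAIR is an
assumption worth testing): submit the two actions from separate threads so
Spark can schedule both forks' jobs concurrently.

  import scala.concurrent.{Await, Future}
  import scala.concurrent.ExecutionContext.Implicits.global
  import scala.concurrent.duration.Duration

  val cached = inputRdd.persist()  // the shared prefix of the DAG is computed once

  // Each action runs in its own thread, so Spark can schedule the two
  // resulting jobs concurrently instead of one after the other.
  val forkA = Future { cached.map(transformA).foreach(writeToSinkA) }
  val forkB = Future { cached.map(transformB).foreach(writeToSinkB) }
  Await.result(Future.sequence(Seq(forkA, forkB)), Duration.Inf)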

Please let us know if you have any ideas that can be useful here.

Thanks,
-Vibhor




Re: Databricks - number of executors, shuffle.partitions etc

2019-05-16 Thread Rishi Shah
Thanks Ayan, I wasn't aware of such a user group specifically for Databricks.
Thanks for the input, much appreciated!

On Wed, May 15, 2019 at 10:07 PM ayan guha  wrote:

> Well, it's a Databricks question, so it's better asked in their forum.
>
> You can set up cluster-level params when you create a new cluster or add
> them later. Go to the cluster page, open one cluster, expand the additional
> config section and add your params there as key-value pairs separated by a
> space.
>
> On Thu, 16 May 2019 at 11:46 am, Rishi Shah 
> wrote:
>
>> Hi All,
>>
>> Any idea?
>>
>> Thanks,
>> -Rishi
>>
>> On Tue, May 14, 2019 at 11:52 PM Rishi Shah 
>> wrote:
>>
>>> Hi All,
>>>
>>> How can we set a spark conf parameter in a Databricks notebook? My cluster
>>> doesn't take into account any spark.conf.set properties... it creates 8
>>> worker nodes (i.e., executors) but doesn't honor the supplied conf
>>> parameters. Any idea?
>>>
>>> --
>>> Regards,
>>>
>>> Rishi Shah
>>>
>>
>>
>> --
>> Regards,
>>
>> Rishi Shah
>>
> --
> Best Regards,
> Ayan Guha
>


-- 
Regards,

Rishi Shah


Re: Databricks - number of executors, shuffle.partitions etc

2019-05-15 Thread ayan guha
Well, it's a Databricks question, so it's better asked in their forum.

You can set up cluster-level params when you create a new cluster or add them
later. Go to the cluster page, open one cluster, expand the additional config
section and add your params there as key-value pairs separated by a space, as
in the example below.
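
For example, entries in that config box go one per line, key and value
separated by a space (both keys and values here are only illustrative):

  spark.sql.shuffle.partitions 400
  spark.executor.memory 8g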

On Thu, 16 May 2019 at 11:46 am, Rishi Shah 
wrote:

> Hi All,
>
> Any idea?
>
> Thanks,
> -Rishi
>
> On Tue, May 14, 2019 at 11:52 PM Rishi Shah 
> wrote:
>
>> Hi All,
>>
>> How can we set a spark conf parameter in a Databricks notebook? My cluster
>> doesn't take into account any spark.conf.set properties... it creates 8
>> worker nodes (i.e., executors) but doesn't honor the supplied conf
>> parameters. Any idea?
>>
>> --
>> Regards,
>>
>> Rishi Shah
>>
>
>
> --
> Regards,
>
> Rishi Shah
>
-- 
Best Regards,
Ayan Guha


Re: Databricks - number of executors, shuffle.partitions etc

2019-05-15 Thread Rishi Shah
Hi All,

Any idea?

Thanks,
-Rishi

On Tue, May 14, 2019 at 11:52 PM Rishi Shah 
wrote:

> Hi All,
>
> How can we set a spark conf parameter in a Databricks notebook? My cluster
> doesn't take into account any spark.conf.set properties... it creates 8
> worker nodes (i.e., executors) but doesn't honor the supplied conf
> parameters. Any idea?
>
> --
> Regards,
>
> Rishi Shah
>


-- 
Regards,

Rishi Shah


Databricks - number of executors, shuffle.partitions etc

2019-05-14 Thread Rishi Shah
Hi All,

How can we set a spark conf parameter in a Databricks notebook? My cluster
doesn't take into account any spark.conf.set properties... it creates 8
worker nodes (i.e., executors) but doesn't honor the supplied conf
parameters. Any idea?

-- 
Regards,

Rishi Shah


Re: Spark Streaming - Increasing number of executors slows down processing rate

2017-06-20 Thread Biplob Biswas
Hi Edwin,

I have faced a similar issue as well and this behaviour is very abrupt. I
even created a question on StackOverflow but no solution yet.
https://stackoverflow.com/questions/43496205/spark-job-processing-time-increases-to-4s-without-explanation

For us, we sometimes had this constant delay of 4s (which increases to 8s
if we increase executors) whenever we started the job. But then we observed
something which you can see in the question above. The processing time
increases abruptly.

I read a lot about similar issues, but it was always recommended that
something else was causing the delay. Although I am not really sure, it feels
like some issue with the Kafka-Spark integration, but I can't say for sure.

Regards,
Biplob

Thanks & Regards
Biplob Biswas

On Tue, Jun 20, 2017 at 5:42 AM, Mal Edwin 
wrote:

> Hi All,
>
> I am struggling with an odd issue and would like your help in addressing
> it.
>
>
> *Environment*
>
> AWS Cluster (40 Spark Nodes & 4 node Kafka cluster)
>
> Spark Kafka Streaming submitted in Yarn cluster mode
>
> Kafka - Single topic, 400 partitions
>
> Spark 2.1 on Cloudera
>
> Kafka 10.0 on Cloudera
>
>
> We have zero messages in Kafka and are starting this spark job with 100
> executors, each with 14GB of RAM and a single executor core.
>
> The time to process 0 records (end of each batch) is 5 seconds.
>
>
> When we increase the executors to 400 and everything else remains the same
> except we reduce memory to 11GB, we see the time to process 0 records (end
> of each batch) increases 10 times to 50 seconds, and in some cases it goes
> to 103 seconds.
>
>
> Spark Streaming configs that we are setting are
>
> Batchwindow = 60 seconds
>
> Backpressure.enabled = true
>
> spark.memory.fraction=0.3 (we store more data in our own data structures)
>
> spark.streaming.kafka.consumer.poll.ms=1
>
>
> Have tried increasing driver memory to 4GB and also increased driver.cores
> to 4.
>
>
> If anybody has faced similar issues please provide some pointers to how to
> address this issue.
>
>
> Thanks a lot for your time.
>
>
> Regards,
>
> Edwin
>
>


Spark Streaming - Increasing number of executors slows down processing rate

2017-06-19 Thread Mal Edwin
Hi All,
I am struggling with an odd issue and would like your help in addressing it.

Environment
AWS Cluster (40 Spark Nodes & 4 node Kafka cluster)
Spark Kafka Streaming submitted in Yarn cluster mode
Kafka - Single topic, 400 partitions
Spark 2.1 on Cloudera
Kafka 10.0 on Cloudera

We have zero messages in Kafka and are starting this spark job with 100
executors, each with 14GB of RAM and a single executor core.
The time to process 0 records (end of each batch) is 5 seconds.

When we increase the executors to 400 and everything else remains the same
except we reduce memory to 11GB, we see the time to process 0 records (end of
each batch) increases 10 times to 50 seconds, and in some cases it goes to 103
seconds.

Spark Streaming configs that we are setting are
Batchwindow = 60 seconds
Backpressure.enabled = true
spark.memory.fraction=0.3 (we store more data in our own data structures)
spark.streaming.kafka.consumer.poll.ms=1

Have tried increasing driver memory to 4GB and also increased driver.cores to 4.

If anybody has faced similar issues please provide some pointers to how to 
address this issue.

Thanks a lot for your time.

Regards,
Edwin



Spark streaming uses lesser number of executors

2016-11-08 Thread Aravindh
Hi, I am using Spark Streaming to process some events. It is deployed in
standalone mode with 1 master and 3 workers. I have set the number of cores per
executor to 4 and the total number of cores (spark.cores.max) to 24. This means
a total of 6 executors will be spawned. I have set spread-out to true, so each
worker machine gets 2 executors. My batch interval is 1 second. While running,
what I observe from the event timeline is that only 3 of the executors are
being used; the other 3 are not. As far as I know, there is no parameter in
spark standalone mode to specify the number of executors. How do I make
Spark use all the available executors?
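
In standalone mode the executor count follows from the core settings: roughly
spark.cores.max / spark.executor.cores executors. A sketch matching the numbers
above (standard spark-submit options; the master URL, class and jar are
placeholders):

  spark-submit \
    --master spark://master:7077 \
    --conf spark.executor.cores=4 \
    --conf spark.cores.max=24 \
    --class com.example.StreamingApp app.jar

Whether all six executors then receive work depends on partitioning: with fewer
partitions than total cores, some executors sit idle, so repartitioning the
stream to at least 24 partitions may be needed.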



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-streaming-uses-lesser-number-of-executors-tp28042.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



What factors decide the number of executors when doing a Spark SQL insert in Mesos?

2016-05-20 Thread SRK
Hi,

What factors decide the number of executors when doing a Spark SQL insert?
Right now when I submit my job in Mesos I see only 2 executors getting
allocated all the time.

Thanks!



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/What-factors-decide-the-number-of-executors-when-doing-a-Spark-SQL-insert-in-Mesos-tp26990.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Number of executors change during job running

2016-05-02 Thread Vikash Pareek
Hi Bill,

You can try DirectStream and increase the number of partitions on the Kafka
topic; the input DStream will then have one partition per Kafka partition,
without using repartitioning (a sketch follows).
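
A sketch of the direct approach for the Spark 1.x / Kafka 0.8 API (broker and
topic names are illustrative; ssc is an existing StreamingContext):

  import kafka.serializer.StringDecoder
  import org.apache.spark.streaming.kafka.KafkaUtils

  val kafkaParams = Map("metadata.broker.list" -> "broker1:9092,broker2:9092")
  // One DStream partition per Kafka topic partition; no receivers involved.
  val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
    ssc, kafkaParams, Set("events"))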

Can you please share your event timeline chart from the Spark UI? You need to
tune your configuration as per your computation; the Spark UI will give a
deeper understanding of the problem.

Thanks!



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Number-of-executors-change-during-job-running-tp9243p26866.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Number of executors in spark-1.6 and spark-1.5

2016-04-10 Thread Vikash Pareek
Hi Talebzadeh,

Thanks for your quick response.

>>in 1.6, how many executors do you see for each node?
I have 1 executor per node with SPARK_WORKER_INSTANCES=1.

>>in standalone mode how are you increasing the number of worker instances.
Are you starting another slave on each node?
No, I am not starting another slave node; I just changed *spark-env.sh* on
each slave node, i.e., set SPARK_WORKER_INSTANCES=2.





Best Regards,


Vikash Pareek
Software Developer, *InfoObjects Inc.*
m: +918800206898 a: E5, Jhalana Institutional Area, Jaipur
s: vikaspareek1991 e: vikash.par...@infoobjects.com



On Sun, Apr 10, 2016 at 3:00 PM, Mich Talebzadeh 
wrote:

> Hi,
>
> in 1.6, how many executors do you see for each node?
> in standalone mode how are you increasing the number of worker instances.
> Are you starting another slave on each node?
>
> HTH
>
>
>
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn:
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
>
> http://talebzadehmich.wordpress.com
>
>
>
> On 10 April 2016 at 08:26, Vikash Pareek 
> wrote:
>
>> Hi,
>>
>> I have upgraded a 5-node spark cluster from spark-1.5 to spark-1.6 (to use
>> the mapWithState function).
>> After moving to spark-1.6, I am seeing strange behaviour from spark: jobs
>> are not using multiple executors on different nodes at a time, meaning
>> there is no parallel processing if each node has a single worker and
>> executor. I am running jobs in spark standalone mode.
>>
>> I observed the following points related to this issue.
>> 1. If I run the same job with spark-1.5, it will use multiple executors
>> across different nodes at a time.
>> 2. In Spark-1.6, if I increase the number of cores (spark.cores.max), then
>> jobs run in parallel threads but within the same executor.
>> 3. In Spark-1.6, if I increase the number of worker instances on each node,
>> then jobs run in parallel as per the number of workers, but within the same
>> executor.
>>
>> Can anyone suggest why spark 1.6 cannot use multiple executors across
>> different nodes at a time for parallel processing?
>> Your suggestion will be highly appreciated.
>>
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Number-of-executors-in-spark-1-6-and-spark-1-5-tp26733.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>> -
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>>
>>
>


Re: Number of executors in spark-1.6 and spark-1.5

2016-04-10 Thread Mich Talebzadeh
Hi,

in 1.6, how many executors do you see for each node?
in standalone mode how are you increasing the number of worker instances.
Are you starting another slave on each node?

HTH




Dr Mich Talebzadeh



LinkedIn:
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



On 10 April 2016 at 08:26, Vikash Pareek 
wrote:

> Hi,
>
> I have upgraded a 5-node spark cluster from spark-1.5 to spark-1.6 (to use
> the mapWithState function).
> After moving to spark-1.6, I am seeing strange behaviour from spark: jobs are
> not using multiple executors on different nodes at a time, meaning there is no
> parallel processing if each node has a single worker and executor.
> I am running jobs in spark standalone mode.
>
> I observed the following points related to this issue.
> 1. If I run the same job with spark-1.5, it will use multiple executors
> across different nodes at a time.
> 2. In Spark-1.6, if I increase the number of cores (spark.cores.max), then
> jobs run in parallel threads but within the same executor.
> 3. In Spark-1.6, if I increase the number of worker instances on each node,
> then jobs run in parallel as per the number of workers, but within the same
> executor.
>
> Can anyone suggest why spark 1.6 cannot use multiple executors across
> different nodes at a time for parallel processing?
> Your suggestion will be highly appreciated.
>
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Number-of-executors-in-spark-1-6-and-spark-1-5-tp26733.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> -
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>


Number of executors in spark-1.6 and spark-1.5

2016-04-10 Thread Vikash Pareek
Hi,

I have upgraded a 5-node spark cluster from spark-1.5 to spark-1.6 (to use
the mapWithState function).
After moving to spark-1.6, I am seeing strange behaviour from spark: jobs are
not using multiple executors on different nodes at a time, meaning there is no
parallel processing if each node has a single worker and executor.
I am running jobs in spark standalone mode.

I observed the following points related to this issue.
1. If I run the same job with spark-1.5, it will use multiple executors
across different nodes at a time.
2. In Spark-1.6, if I increase the number of cores (spark.cores.max), then
jobs run in parallel threads but within the same executor.
3. In Spark-1.6, if I increase the number of worker instances on each node,
then jobs run in parallel as per the number of workers, but within the same
executor.

Can anyone suggest why spark 1.6 cannot use multiple executors across
different nodes at a time for parallel processing?
Your suggestion will be highly appreciated.




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Number-of-executors-in-spark-1-6-and-spark-1-5-tp26733.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: reasonable number of executors

2016-02-24 Thread Alex Dzhagriev
Hi Igor,

That's a great talk and an exact answer to my question. Thank you.

Cheers, Alex.

On Tue, Feb 23, 2016 at 8:27 PM, Igor Berman  wrote:

>
> http://www.slideshare.net/cloudera/top-5-mistakes-to-avoid-when-writing-apache-spark-applications
>
> there is a section that is connected to your question
>
> On 23 February 2016 at 16:49, Alex Dzhagriev  wrote:
>
>> Hello all,
>>
>> Can someone please advise me on the pros and cons of how to allocate the
>> resources: many small-heap machines with 1 core, or a few machines with big
>> heaps and many cores? I'm sure that depends on the data flow and there is
>> no best-practice solution. E.g. with a bigger heap I can perform a map-side
>> join with a bigger table. What other considerations should I keep in mind in
>> order to choose the right configuration?
>>
>> Thanks, Alex.
>>
>
>


Re: reasonable number of executors

2016-02-23 Thread Igor Berman
http://www.slideshare.net/cloudera/top-5-mistakes-to-avoid-when-writing-apache-spark-applications

there is a section that is connected to your question

On 23 February 2016 at 16:49, Alex Dzhagriev  wrote:

> Hello all,
>
> Can someone please advise me on the pros and cons of how to allocate the
> resources: many small-heap machines with 1 core, or a few machines with big
> heaps and many cores? I'm sure that depends on the data flow and there is
> no best-practice solution. E.g. with a bigger heap I can perform a map-side
> join with a bigger table. What other considerations should I keep in mind in
> order to choose the right configuration?
>
> Thanks, Alex.
>


Re: reasonable number of executors

2016-02-23 Thread Jorge Machado
Hi Alex, 

take a look here : 
https://blogs.aws.amazon.com/bigdata/post/Tx3RD6EISZGHQ1C/The-Impact-of-Using-Latest-Generation-Instances-for-Your-Amazon-EMR-Job
 


Basically it depends on your type of workload. Will you need caching?



Jorge Machado
www.jmachado.me


> On 23/02/2016, at 15:49, Alex Dzhagriev  wrote:
> 
> Hello all,
> 
> Can someone please advise me on the pros and cons of how to allocate the
> resources: many small-heap machines with 1 core, or a few machines with big
> heaps and many cores? I'm sure that depends on the data flow and there is no
> best-practice solution. E.g. with a bigger heap I can perform a map-side join
> with a bigger table. What other considerations should I keep in mind in order
> to choose the right configuration?
> 
> Thanks, Alex.



reasonable number of executors

2016-02-23 Thread Alex Dzhagriev
Hello all,

Can someone please advise me on the pros and cons of how to allocate the
resources: many small-heap machines with 1 core, or a few machines with big
heaps and many cores? I'm sure that depends on the data flow and there is
no best-practice solution. E.g. with a bigger heap I can perform a map-side
join with a bigger table. What other considerations should I keep in mind in
order to choose the right configuration?

Thanks, Alex.


Re: Specify number of executors in standalone cluster mode

2016-02-21 Thread Hemant Bhanawat
The max number of cores per executor can be controlled using
spark.executor.cores, and the maximum number of executors on a single worker
can be set via the environment variable SPARK_WORKER_INSTANCES.

However, to ensure that all available cores are used, you will have to take
care of how the stream is partitioned. Copy-pasting the relevant help text
from Spark (a config sketch follows the quote):



*The number of tasks per receiver per batch will be approximately (batch
interval / block interval). For example, block interval of 200 ms will
create 10 tasks per 2 second batches. If the number of tasks is too low
(that is, less than the number of cores per machine), then it will be
inefficient as all available cores will not be used to process the data. To
increase the number of tasks for a given batch interval, reduce the block
interval. However, the recommended minimum value of block interval is about
50 ms, below which the task launching overheads may be a problem. An
alternative to receiving data with multiple input streams / receivers is to
explicitly repartition the input data stream (using
inputStream.repartition()). This distributes the
received batches of data across the specified number of machines in the
cluster before further processing.*
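
A sketch of that rule of thumb (all values illustrative): a 2-second batch with
a 200 ms block interval yields roughly 10 tasks per receiver per batch, and an
explicit repartition spreads the data over all 12 cores (6 machines * 2 cores):

  import org.apache.spark.SparkConf
  import org.apache.spark.streaming.{Seconds, StreamingContext}

  val conf = new SparkConf()
    .setAppName("streaming-app")
    .set("spark.streaming.blockInterval", "200ms")  // batch interval / block interval ~= 10 tasks
  val ssc = new StreamingContext(conf, Seconds(2))
  val lines = ssc.socketTextStream("host", 9999)
  val spread = lines.repartition(12)                // alternative to tuning the block interval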

Hemant Bhanawat <https://www.linkedin.com/in/hemant-bhanawat-92a3811>
www.snappydata.io

On Sun, Feb 21, 2016 at 11:01 PM, Saiph Kappa  wrote:

> Hi,
>
> I'm running a spark streaming application on a spark cluster that spans
> 6 machines/workers. I'm using spark standalone cluster mode. Each machine
> has 8 cores. Is there any way to specify that I want to run my application
> on all 6 machines and just use 2 cores on each machine?
>
> Thanks
>


Specify number of executors in standalone cluster mode

2016-02-21 Thread Saiph Kappa
Hi,

I'm running a spark streaming application on a spark cluster that spans 6
machines/workers. I'm using spark standalone cluster mode. Each machine has
8 cores. Is there any way to specify that I want to run my application on
all 6 machines and just use 2 cores on each machine?

Thanks


Re: Number of executors in Spark - Kafka

2016-01-21 Thread Cody Koeninger
6 Kafka partitions will result in 6 Spark partitions, not 6 Spark RDDs.

The question of whether you will have a backlog isn't just a matter of
having 1 executor per partition.  If a single executor can process all of
the partitions fast enough to complete a batch in under the required time,
you won't have a backlog.
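
For reference, the knob from the quoted paragraph below on the command line (a
hedged example; the jar name is a placeholder):

  spark-submit --master yarn --num-executors 6 --executor-cores 1 my-streaming-app.jar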

On Thu, Jan 21, 2016 at 5:35 AM, Guillermo Ortiz 
wrote:

>
> I'm using Spark Streaming and Kafka with the direct approach. I have created a
> topic with 6 partitions, so when I execute Spark there are six RDDs. I
> understand that ideally it should have six executors, one to process each
> RDD. To do this, when I execute spark-submit (I use YARN) I specify the
> number of executors as six.
> If I don't specify anything it just creates one executor. Looking for
> information, I have read:
>
> "The --num-executors command-line flag or spark.executor.instances 
> configuration
> property control the number of executors requested. Starting in CDH
> 5.4/Spark 1.3, you will be able to avoid setting this property by turning
> on dynamic allocation
> <https://spark.apache.org/docs/latest/job-scheduling.html#dynamic-resource-allocation>
>  with
> the spark.dynamicAllocation.enabled property. Dynamic allocation enables a
> Spark application to request executors when there is a backlog of pending
> tasks and free up executors when idle."
>
> I have this parameter enabled; I understand that if I don't set the
> --num-executors parameter it must create six executors, or am I wrong?
>


Number of executors in Spark - Kafka

2016-01-21 Thread Guillermo Ortiz
I'm using Spark Streaming and Kafka with the direct approach. I have created a
topic with 6 partitions, so when I execute Spark there are six RDDs. I
understand that ideally it should have six executors, one to process each
RDD. To do this, when I execute spark-submit (I use YARN) I specify the
number of executors as six.
If I don't specify anything it just creates one executor. Looking for
information, I have read:

"The --num-executors command-line flag or spark.executor.instances
configuration
property control the number of executors requested. Starting in CDH
5.4/Spark 1.3, you will be able to avoid setting this property by turning
on dynamic allocation
<https://spark.apache.org/docs/latest/job-scheduling.html#dynamic-resource-allocation>
with
the spark.dynamicAllocation.enabled property. Dynamic allocation enables a
Spark application to request executors when there is a backlog of pending
tasks and free up executors when idle."

I have this parameter enabled; I understand that if I don't set the
--num-executors parameter it must create six executors, or am I wrong?


Re: number of executors in sparkR.init()

2015-12-25 Thread Franc Carter
Thanks, that works

cheers

On 26 December 2015 at 16:53, Felix Cheung 
wrote:

> The equivalent for spark-submit --num-executors should be
> spark.executor.instances
> when used in SparkConf:
> http://spark.apache.org/docs/latest/running-on-yarn.html
>
> Could you try setting that with sparkR.init()?
>
>
> _
> From: Franc Carter 
> Sent: Friday, December 25, 2015 9:23 PM
> Subject: number of executors in sparkR.init()
> To: 
>
>
>
> Hi,
>
> I'm having trouble working out how to get the number of executors set when
> using sparkR.init().
>
> If I start sparkR with
>
>   sparkR  --master yarn --num-executors 6
>
> then I get 6 executors
>
> However, if I start sparkR with
>
>   sparkR
>
> followed by
>
>   sc <- sparkR.init(master="yarn-client",
> sparkEnvir=list(spark.num.executors='6'))
>
> then I only get 2 executors.
>
> Can anyone point me in the direction of what I might be doing wrong? I need
> to initialise it this way so that RStudio can hook in to SparkR
>
> thanks
>
> --
> Franc
>
>
>


-- 
Franc


Re: number of executors in sparkR.init()

2015-12-25 Thread Felix Cheung
The equivalent for spark-submit --num-executors should be
spark.executor.instances when used in SparkConf:
http://spark.apache.org/docs/latest/running-on-yarn.html
Could you try setting that with sparkR.init()?
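
The corrected call would then be (a sketch mirroring the snippet quoted below,
with the key swapped for spark.executor.instances from the linked YARN docs):

  sc <- sparkR.init(master="yarn-client",
                    sparkEnvir=list(spark.executor.instances="6"))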


_
From: Franc Carter 
Sent: Friday, December 25, 2015 9:23 PM
Subject: number of executors in sparkR.init()
To:  


Hi,

I'm having trouble working out how to get the number of executors set
when using sparkR.init().

If I start sparkR with

  sparkR --master yarn --num-executors 6

then I get 6 executors.

However, if I start sparkR with

  sparkR

followed by

  sc <- sparkR.init(master="yarn-client",
                    sparkEnvir=list(spark.num.executors='6'))

then I only get 2 executors.

Can anyone point me in the direction of what I might be doing wrong?
I need to initialise it this way so that RStudio can hook in to SparkR.

thanks

--
Franc

number of executors in sparkR.init()

2015-12-25 Thread Franc Carter
Hi,

I'm having trouble working out how to get the number of executors set when
using sparkR.init().

If I start sparkR with

  sparkR  --master yarn --num-executors 6

then I get 6 executors

However, if I start sparkR with

  sparkR

followed by

  sc <- sparkR.init(master="yarn-client",
sparkEnvir=list(spark.num.executors='6'))

then I only get 2 executors.

Can anyone point me in the direction of what I might be doing wrong? I need
to initialise it this way so that RStudio can hook in to SparkR

thanks

-- 
Franc


RE: ideal number of executors per machine

2015-12-16 Thread Bui, Tri
Article below gives a good idea.

http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/

Play around with the two configurations (a large number of executors with few
cores, and a small number of executors with many cores).  Calculated values
have to be conservative or they will make the spark jobs unstable.

Thx
tri

From: Veljko Skarich [mailto:veljko.skar...@gmail.com]
Sent: Tuesday, December 15, 2015 3:08 PM
To: user@spark.apache.org
Subject: ideal number of executors per machine

Hi,

I'm looking for suggestions on the ideal number of executors per machine. I run 
my jobs on 64G 32 core machines, and at the moment I have one executor running 
per machine, on the spark standalone cluster.

 I could not find many guidelines for figuring out the ideal number of
executors; the official Spark documentation merely recommends not having more
than 64G per executor to avoid GC issues. Anyone have any advice on this?

thank you.


Re: ideal number of executors per machine

2015-12-16 Thread Sean Owen
I don't think it has anything to do with using all the cores, since 1
executor can run as many tasks as you like. Yes, you'd want them to
request all cores in this case. YARN vs Mesos does not matter here.

On Wed, Dec 16, 2015 at 1:58 PM, Michael Segel
 wrote:
> Hmmm.
> This would go against the grain.
>
> I have to ask how you came to that conclusion…
>
> There are a lot of factors…  e.g. Yarn vs Mesos?
>
> What you’re suggesting would mean a loss of parallelism.
>
>
>> On Dec 16, 2015, at 12:22 AM, Sean Owen  wrote:
>>
>> 1 per machine is the right number. If you are running very large heaps
>> (>64GB) you may consider multiple per machine just to make sure each's
>> GC pauses aren't excessive, but even this might be better mitigated
>> with GC tuning.
>>
>> On Tue, Dec 15, 2015 at 9:07 PM, Veljko Skarich
>>  wrote:
>>> Hi,
>>>
>>> I'm looking for suggestions on the ideal number of executors per machine. I
>>> run my jobs on 64G 32 core machines, and at the moment I have one executor
>>> running per machine, on the spark standalone cluster.
>>>
>>> I could not find many guidelines for figuring out the ideal number of
>>> executors; the official Spark documentation merely recommends not having
>>> more than 64G per executor to avoid GC issues. Anyone have any advice on
>>> this?
>>>
>>> thank you.
>>
>> -
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>>
>>
>

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: ideal number of executors per machine

2015-12-15 Thread Sean Owen
1 per machine is the right number. If you are running very large heaps
(>64GB) you may consider multiple per machine just to make sure each's
GC pauses aren't excessive, but even this might be better mitigated
with GC tuning.

On Tue, Dec 15, 2015 at 9:07 PM, Veljko Skarich
 wrote:
> Hi,
>
> I'm looking for suggestions on the ideal number of executors per machine. I
> run my jobs on 64G 32 core machines, and at the moment I have one executor
> running per machine, on the spark standalone cluster.
>
>  I could not find many guidelines for figuring out the ideal number of
> executors; the official Spark documentation merely recommends not having
> more than 64G per executor to avoid GC issues. Anyone have any advice on
> this?
>
> thank you.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: ideal number of executors per machine

2015-12-15 Thread Jerry Lam
Hi Veljko,

I usually ask the following questions: “how much memory per task?” then “how
many CPUs per task?”, and then I calculate based on the memory and CPU
requirements per task. You might be surprised (maybe not you, but at least I
am :) ) that many OOM issues are actually caused by this.
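
As a worked example (all numbers are assumptions, not from this thread): on a
64G/32-core machine with one executor, 32 concurrent tasks at 1.5G each need
roughly 48G of heap, which fits; at 4G per task the same 32 tasks would want
128G, and the job will OOM unless cores per executor are reduced or per-task
memory comes down.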

Best Regards,

Jerry

> On Dec 15, 2015, at 5:18 PM, Jakob Odersky  wrote:
> 
> Hi Veljko,
> I would assume keeping the number of executors per machine to a minimum is
> best for performance (as long as you consider memory requirements as well).
> Each executor is a process that can run tasks in multiple threads. On a
> kernel/hardware level, thread switches are much cheaper than process switches,
> and therefore having a single executor with multiple threads gives better
> overall performance than multiple executors with fewer threads.
> 
> --Jakob
> 
> On 15 December 2015 at 13:07, Veljko Skarich  <mailto:veljko.skar...@gmail.com>> wrote:
> Hi, 
> 
> I'm looking for suggestions on the ideal number of executors per machine. I 
> run my jobs on 64G 32 core machines, and at the moment I have one executor 
> running per machine, on the spark standalone cluster.
> 
>  I could not find many guidelines for figuring out the ideal number of
> executors; the official Spark documentation merely recommends not having more
> than 64G per executor to avoid GC issues. Anyone have any advice on this?
> 
> thank you. 
> 



Re: ideal number of executors per machine

2015-12-15 Thread Jakob Odersky
Hi Veljko,
I would assume keeping the number of executors per machine to a minimum is
best for performance (as long as you consider memory requirements as well).
Each executor is a process that can run tasks in multiple threads. On a
kernel/hardware level, thread switches are much cheaper than process
switches, and therefore having a single executor with multiple threads gives
better overall performance than multiple executors with fewer threads.

--Jakob

On 15 December 2015 at 13:07, Veljko Skarich 
wrote:

> Hi,
>
> I'm looking for suggestions on the ideal number of executors per machine.
> I run my jobs on 64G 32 core machines, and at the moment I have one
> executor running per machine, on the spark standalone cluster.
>
>  I could not find many guidelines for figuring out the ideal number of
> executors; the official Spark documentation merely recommends not having
> more than 64G per executor to avoid GC issues. Anyone have any advice on
> this?
>
> thank you.
>


ideal number of executors per machine

2015-12-15 Thread Veljko Skarich
Hi,

I'm looking for suggestions on the ideal number of executors per machine. I
run my jobs on 64G 32 core machines, and at the moment I have one executor
running per machine, on the spark standalone cluster.

 I could not find many guidelines for figuring out the ideal number of
executors; the official Spark documentation merely recommends not having
more than 64G per executor to avoid GC issues. Anyone have any advice on
this?

thank you.


Re: How to set the number of executors and tasks in a Spark Streaming job in Mesos

2015-08-27 Thread Akhil Das
How many Mesos slaves do you have, and how many cores do you have in
total?

  sparkConf.set("spark.mesos.coarse", "true")
  sparkConf.set("spark.cores.max", "128")

These two configurations are sufficient. Now, regarding the active tasks:
how many partitions are you seeing for that job? You can try doing a
dstream.repartition to see if it increases from 11 to a higher number.
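
For example (the count is illustrative, chosen to match spark.cores.max above):

  val repartitioned = dstream.repartition(128)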


Thanks
Best Regards

On Thu, Aug 20, 2015 at 2:28 AM, swetha  wrote:

> Hi,
>
> How to set the number of executors and tasks in a Spark Streaming job in
> Mesos? I have the following settings but my job still shows me 11 active
> tasks and 11 executors. Any idea as to why this is happening
> ?
>
>  sparkConf.set("spark.mesos.coarse", "true")
>   sparkConf.set("spark.cores.max", "128")
>   sparkConf.set("spark.default.parallelism", "100")
>   //sparkConf.set("spark.locality.wait", "0")
>   sparkConf.set("spark.executor.memory", "32g")
>   sparkConf.set("spark.streaming.unpersist", "true")
>   sparkConf.set("spark.shuffle.io.numConnectionsPerPeer", "1")
>   sparkConf.set("spark.rdd.compress", "true")
>   sparkConf.set("spark.shuffle.memoryFraction", ".6")
>   sparkConf.set("spark.storage.memoryFraction", ".2")
>   sparkConf.set("spark.shuffle.spill", "true")
>   sparkConf.set("spark.shuffle.spill.compress", "true")
>   sparkConf.set("spark.streaming.receiver.writeAheadLog.enable",
> "true")
>   sparkConf.set("spark.streaming.blockInterval", "400")
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/How-to-set-the-number-of-executors-and-tasks-in-a-Spark-Streaming-job-in-Mesos-tp24348.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> -
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>


Re: Finding the number of executors.

2015-08-26 Thread Virgil Palanciuc
As I was writing a long-ish message to explain how it doesn't work, it
dawned on me that maybe the driver connects to executors only after there's
some work to do (while I was trying to find the number of executors BEFORE
starting the actual work).

So the solution was to simply execute a dummy task (
sparkContext.parallelize(1 until 1000, 200).reduce(_+_) ) before attempting
to retrieve the executors. It works now :)
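
Putting the two ideas from this thread together (a sketch; assumes an
already-created SparkContext named sc):

  // Run a trivial job first so executors register with the driver.
  sc.parallelize(1 until 1000, 200).reduce(_ + _)
  // Then count registered executors, excluding the driver (see Du Li's filter below).
  val driverHost = sc.getConf.get("spark.driver.host")
  val numExecutors = sc.getExecutorMemoryStatus.map(_._1)
    .count(_.split(":")(0) != driverHost)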

Virgil.

On Sat, Aug 22, 2015 at 12:44 AM, Du Li  wrote:

> Following is a method that retrieves the list of executors registered to a
> spark context. It worked perfectly with spark-submit in standalone mode for
> my project.
>
> /**
>* A simplified method that just returns the current active/registered
> executors
>* excluding the driver.
>* @param sc
>*   The spark context to retrieve registered executors.
>* @return
>* A list of executors each in the form of host:port.
>*/
>   def currentActiveExecutors(sc: SparkContext): Seq[String] = {
> val allExecutors = sc.getExecutorMemoryStatus.map(_._1)
> val driverHost: String = sc.getConf.get("spark.driver.host")
> allExecutors.filter(! _.split(":")(0).equals(driverHost)).toList
>   }
>
>
>
>
> On Friday, August 21, 2015 1:53 PM, Virgil Palanciuc 
> wrote:
>
>
> Hi Akhil,
>
> I'm using spark 1.4.1.
> Number of executors is not in the command line, not in the 
> getExecutorMemoryStatus
> (I already mentioned that I tried that, works in spark-shell but not when
> executed via spark-submit). I tried looking at "defaultParallelism" too,
> it's 112 (7 executors * 16 cores) when ran via spark-shell, but just 2 when
> ran via spark-submit.
>
> But the scheduler obviously knows this information. It *must* know it. How
> can I access it? Other than parsing the HTML of the WebUI, that is...
> that's pretty much guaranteed to work, and maybe I'll do that, but it's
> extremely convoluted.
>
> Regards,
> Virgil.
>
> On Fri, Aug 21, 2015 at 11:35 PM, Akhil Das 
> wrote:
>
> Which version spark are you using? There was a discussion happened over
> here
>
> http://apache-spark-user-list.1001560.n3.nabble.com/Determine-number-of-running-executors-td19453.html
>
> http://mail-archives.us.apache.org/mod_mbox/spark-user/201411.mbox/%3ccacbyxk+ya1rbbnkwjheekpnbsbh10rykuzt-laqgpdanvhm...@mail.gmail.com%3E
> On Aug 21, 2015 7:42 AM, "Virgil Palanciuc"  wrote:
>
> Is there any reliable way to find out the number of executors
> programmatically - regardless of how the job is run? A method that
> preferably works for spark-standalone, yarn, mesos, regardless of whether the
> code runs from the shell or not?
>
> Things that I tried and don't work:
> - sparkContext.getExecutorMemoryStatus.size - 1 // works from the shell,
> does not work if task submitted via  spark-submit
> - sparkContext.getConf.getInt("spark.executor.instances", 1) - doesn't
> work unless explicitly configured
> - call to http://master:8080/json (this used to work, but doesn't
> anymore?)
>
> I guess I could parse the output html from the Spark UI... but that seems
> dumb. Is there really no better way?
>
> Thanks,
> Virgil.
>
>
>
>
>
>


Re: Finding the number of executors.

2015-08-21 Thread Du Li
Following is a method that retrieves the list of executors registered to a
spark context. It worked perfectly with spark-submit in standalone mode for my
project.

  /**
   * A simplified method that just returns the current active/registered
   * executors, excluding the driver.
   * @param sc
   *   The spark context to retrieve registered executors.
   * @return
   *   A list of executors, each in the form of host:port.
   */
  def currentActiveExecutors(sc: SparkContext): Seq[String] = {
    val allExecutors = sc.getExecutorMemoryStatus.map(_._1)
    val driverHost: String = sc.getConf.get("spark.driver.host")
    allExecutors.filter(! _.split(":")(0).equals(driverHost)).toList
  }


On Friday, August 21, 2015 1:53 PM, Virgil Palanciuc wrote:

Hi Akhil,

I'm using spark 1.4.1. The number of executors is not in the command line, nor
in the getExecutorMemoryStatus (I already mentioned that I tried that; it works
in spark-shell but not when executed via spark-submit). I tried looking at
"defaultParallelism" too: it's 112 (7 executors * 16 cores) when run via
spark-shell, but just 2 when run via spark-submit.

But the scheduler obviously knows this information. It *must* know it. How can
I access it? Other than parsing the HTML of the WebUI, that is... that's pretty
much guaranteed to work, and maybe I'll do that, but it's extremely convoluted.

Regards,
Virgil.

On Fri, Aug 21, 2015 at 11:35 PM, Akhil Das wrote:

Which version of Spark are you using? There was a discussion over here:
http://apache-spark-user-list.1001560.n3.nabble.com/Determine-number-of-running-executors-td19453.html
http://mail-archives.us.apache.org/mod_mbox/spark-user/201411.mbox/%3ccacbyxk+ya1rbbnkwjheekpnbsbh10rykuzt-laqgpdanvhm...@mail.gmail.com%3E

On Aug 21, 2015 7:42 AM, "Virgil Palanciuc" wrote:

Is there any reliable way to find out the number of executors
programmatically - regardless of how the job is run? A method that preferably
works for spark-standalone, yarn, mesos, regardless of whether the code runs
from the shell or not?

Things that I tried and don't work:
- sparkContext.getExecutorMemoryStatus.size - 1 // works from the shell, does
  not work if the task is submitted via spark-submit
- sparkContext.getConf.getInt("spark.executor.instances", 1) - doesn't work
  unless explicitly configured
- call to http://master:8080/json (this used to work, but doesn't anymore?)

I guess I could parse the output html from the Spark UI... but that seems
dumb. Is there really no better way?

Thanks,
Virgil.

Re: Finding the number of executors.

2015-08-21 Thread Virgil Palanciuc
Hi Akhil,

I'm using spark 1.4.1.
The number of executors is not in the command line, nor in the
getExecutorMemoryStatus
(I already mentioned that I tried that; it works in spark-shell but not when
executed via spark-submit). I tried looking at "defaultParallelism" too:
it's 112 (7 executors * 16 cores) when run via spark-shell, but just 2 when
run via spark-submit.

But the scheduler obviously knows this information. It *must* know it. How
can I access it? Other than parsing the HTML of the WebUI, that is...
that's pretty much guaranteed to work, and maybe I'll do that, but it's
extremely convoluted.

Regards,
Virgil.

On Fri, Aug 21, 2015 at 11:35 PM, Akhil Das 
wrote:

> Which version of Spark are you using? There was a discussion over
> here:
>
> http://apache-spark-user-list.1001560.n3.nabble.com/Determine-number-of-running-executors-td19453.html
>
>
> http://mail-archives.us.apache.org/mod_mbox/spark-user/201411.mbox/%3ccacbyxk+ya1rbbnkwjheekpnbsbh10rykuzt-laqgpdanvhm...@mail.gmail.com%3E
> On Aug 21, 2015 7:42 AM, "Virgil Palanciuc"  wrote:
>
>> Is there any reliable way to find out the number of executors
>> programmatically - regardless of how the job is run? A method that
>> preferably works for spark-standalone, yarn, mesos, regardless of whether the
>> code runs from the shell or not?
>>
>> Things that I tried and don't work:
>> - sparkContext.getExecutorMemoryStatus.size - 1 // works from the shell,
>> does not work if task submitted via  spark-submit
>> - sparkContext.getConf.getInt("spark.executor.instances", 1) - doesn't
>> work unless explicitly configured
>> - call to http://master:8080/json (this used to work, but doesn't
>> anymore?)
>>
>> I guess I could parse the output html from the Spark UI... but that seems
>> dumb. Is there really no better way?
>>
>> Thanks,
>> Virgil.
>>
>>
>>


Re: Finding the number of executors.

2015-08-21 Thread Akhil Das
Which version of Spark are you using? There was a discussion over
here:
http://apache-spark-user-list.1001560.n3.nabble.com/Determine-number-of-running-executors-td19453.html

http://mail-archives.us.apache.org/mod_mbox/spark-user/201411.mbox/%3ccacbyxk+ya1rbbnkwjheekpnbsbh10rykuzt-laqgpdanvhm...@mail.gmail.com%3E
On Aug 21, 2015 7:42 AM, "Virgil Palanciuc"  wrote:

> Is there any reliable way to find out the number of executors
> programmatically - regardless of how the job is run? A method that
> preferably works for spark-standalone, yarn, mesos, regardless of whether the
> code runs from the shell or not?
>
> Things that I tried and don't work:
> - sparkContext.getExecutorMemoryStatus.size - 1 // works from the shell,
> does not work if task submitted via  spark-submit
> - sparkContext.getConf.getInt("spark.executor.instances", 1) - doesn't
> work unless explicitly configured
> - call to http://master:8080/json (this used to work, but doesn't
> anymore?)
>
> I guess I could parse the output html from the Spark UI... but that seems
> dumb. Is there really no better way?
>
> Thanks,
> Virgil.
>
>
>


Finding the number of executors.

2015-08-21 Thread Virgil Palanciuc
Is there any reliable way to find out the number of executors
programmatically - regardless of how the job is run? A method that
preferably works for spark-standalone, yarn, mesos, regardless of whether the
code runs from the shell or not?

Things that I tried and don't work:
- sparkContext.getExecutorMemoryStatus.size - 1 // works from the shell,
does not work if task submitted via  spark-submit
- sparkContext.getConf.getInt("spark.executor.instances", 1) - doesn't work
unless explicitly configured
- call to http://master:8080/json (this used to work, but doesn't anymore?)

I guess I could parse the output html from the Spark UI... but that seems
dumb. Is there really no better way?

Thanks,
Virgil.


How to set the number of executors and tasks in a Spark Streaming job in Mesos

2015-08-19 Thread swetha
Hi,

How to set the number of executors and tasks in a Spark Streaming job in
Mesos? I have the following settings but my job still shows me 11 active
tasks and 11 executors. Any idea as to why this is happening?

 sparkConf.set("spark.mesos.coarse", "true")
  sparkConf.set("spark.cores.max", "128")
  sparkConf.set("spark.default.parallelism", "100")
  //sparkConf.set("spark.locality.wait", "0")
  sparkConf.set("spark.executor.memory", "32g")
  sparkConf.set("spark.streaming.unpersist", "true")
  sparkConf.set("spark.shuffle.io.numConnectionsPerPeer", "1")
  sparkConf.set("spark.rdd.compress", "true")
  sparkConf.set("spark.shuffle.memoryFraction", ".6")
  sparkConf.set("spark.storage.memoryFraction", ".2")
  sparkConf.set("spark.shuffle.spill", "true")
  sparkConf.set("spark.shuffle.spill.compress", "true")
  sparkConf.set("spark.streaming.receiver.writeAheadLog.enable", "true")
  sparkConf.set("spark.streaming.blockInterval", "400")



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-set-the-number-of-executors-and-tasks-in-a-Spark-Streaming-job-in-Mesos-tp24348.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Controlling number of executors on Mesos vs YARN

2015-08-13 Thread Ajay Singal
Tim,

The ability to specify fine-grained configuration could be useful for many
reasons.  Let's take the example of a node with 32 cores.  First of all, as
per my understanding, having 5 executors each with 6 cores will almost
always perform better than having a single executor with 30 cores.  Also,
these 5 executors could be a) used by the same application, or b) shared
amongst multiple applications.  In the case of a single executor with 30
cores, some of the slots/cores could be wasted if there are fewer tasks
(from a single application) to be executed.

As I said, applications can specify a desirable number of executors.  If not
available, Mesos (in a simple implementation) can provide/offer whatever is
available. In a slightly more complex implementation, we could build a simple
protocol to negotiate.

Regards,
Ajay

On Wed, Aug 12, 2015 at 5:51 PM, Tim Chen  wrote:

> You're referring to both fine grain and coarse grain?
>
> A desirable number of executors per node could be interesting, but it can't
> be guaranteed (or we could try to, and abort the job when that fails).
>
> How would you imagine this new option to actually work?
>
>
> Tim
>
> On Wed, Aug 12, 2015 at 11:48 AM, Ajay Singal  wrote:
>
>> Hi Tim,
>>
>> An option like spark.mesos.executor.max to cap the number of executors
>> per node/application would be very useful.  However, having an option like
>> spark.mesos.executor.num to specify a desirable number of executors per
>> node would provide even better control.
>>
>> Thanks,
>> Ajay
>>
>> On Wed, Aug 12, 2015 at 4:18 AM, Tim Chen  wrote:
>>
>>> Yes the options are not that configurable yet but I think it's not hard
>>> to change it.
>>>
>>> I have a patch out actually specifically able to configure amount of
>>> cpus per executor in coarse grain mode, and hopefully merged next release.
>>>
>>> I think the open question now is for fine grain mode can we limit the
>>> number of maximum concurrent executors, and I think we can definitely just
>>> add a new option like spark.mesos.executor.max to cap it.
>>>
>>> I'll file a jira and hopefully to get this change in soon too.
>>>
>>> Tim
>>>
>>>
>>>
>>> On Tue, Aug 11, 2015 at 6:21 AM, Haripriya Ayyalasomayajula <
>>> aharipriy...@gmail.com> wrote:
>>>
>> Spark evolved as an example framework for Mesos - that's how I know it.
>> It is surprising to see that the options provided by Mesos in this case are
>> limited. As for tweaking the source code, I haven't done it yet, but I would
>> love to see what options could be there!
>>>>
>>>> On Tue, Aug 11, 2015 at 5:42 AM, Jerry Lam 
>>>> wrote:
>>>>
>>>>> My experience with Mesos + Spark is not great. I saw one executor with
>>>>> 30 CPU and the other executor with 6. So I don't think you can easily
>>>>> configure it without some tweaking at the source code.
>>>>>
>>>>> Sent from my iPad
>>>>>
>>>>> On 2015-08-11, at 2:38, Haripriya Ayyalasomayajula <
>>>>> aharipriy...@gmail.com> wrote:
>>>>>
>>>>> Hi Tim,
>>>>>
>>>>> Spark on Yarn allows us to do it using --num-executors and
>>>>> --executor_cores commandline arguments. I just got a chance to look at a
>>>>> similar spark user list mail, but no answer yet. So does mesos allow
>>>>> setting the number of executors and cores? Is there a default number it
>>>>> assumes?
>>>>>
>>>>> On Mon, Jan 5, 2015 at 5:07 PM, Tim Chen  wrote:
>>>>>
>>>>>> Forgot to hit reply-all.
>>>>>>
>>>>>> -- Forwarded message --
>>>>>> From: Tim Chen 
>>>>>> Date: Sun, Jan 4, 2015 at 10:46 PM
>>>>>> Subject: Re: Controlling number of executors on Mesos vs YARN
>>>>>> To: mvle 
>>>>>>
>>>>>>
>>>>>> Hi Mike,
>>>>>>
>>>>>> You're correct there is no such setting for Mesos coarse grain
>>>>>> mode, since the assumption is that each node is launched with one 
>>>>>> container
>>>>>> and Spark is launching multiple tasks in that container.
>>>>>>
>>>>>> In fine-grain mode there isn't a setting like that, as it currently
>>>>>> will launch an executor as long as it satisfies the minimum container
>>>>>> resource requirement.

Re: Controlling number of executors on Mesos vs YARN

2015-08-13 Thread Ajay Singal
Hi Tim,

An option like spark.mesos.executor.max to cap the number of executors per
node/application would be very useful.  However, having an option like
spark.mesos.executor.num to specify a desirable number of executors per node
would provide even better control.

Thanks,
Ajay

On Wed, Aug 12, 2015 at 4:18 AM, Tim Chen  wrote:

> Yes the options are not that configurable yet but I think it's not hard to
> change it.
>
> I have a patch out actually specifically able to configure amount of cpus
> per executor in coarse grain mode, and hopefully merged next release.
>
> I think the open question now is for fine grain mode can we limit the
> number of maximum concurrent executors, and I think we can definitely just
> add a new option like spark.mesos.executor.max to cap it.
>
> I'll file a jira and hopefully to get this change in soon too.
>
> Tim
>
>
>
> On Tue, Aug 11, 2015 at 6:21 AM, Haripriya Ayyalasomayajula <
> aharipriy...@gmail.com> wrote:
>
>> Spark evolved as an example framework for Mesos - that's how I know it. It
>> is surprising to see that the options provided by Mesos in this case are
>> limited. As for tweaking the source code, I haven't done it yet, but I would
>> love to see what options could be there!
>>
>> On Tue, Aug 11, 2015 at 5:42 AM, Jerry Lam  wrote:
>>
>>> My experience with Mesos + Spark is not great. I saw one executor with
>>> 30 CPU and the other executor with 6. So I don't think you can easily
>>> configure it without some tweaking at the source code.
>>>
>>> Sent from my iPad
>>>
>>> On 2015-08-11, at 2:38, Haripriya Ayyalasomayajula <
>>> aharipriy...@gmail.com> wrote:
>>>
>>> Hi Tim,
>>>
>>> Spark on Yarn allows us to do it using --num-executors and
>>> --executor_cores commandline arguments. I just got a chance to look at a
>>> similar spark user list mail, but no answer yet. So does mesos allow
>>> setting the number of executors and cores? Is there a default number it
>>> assumes?
>>>
>>> On Mon, Jan 5, 2015 at 5:07 PM, Tim Chen  wrote:
>>>
>>>> Forgot to hit reply-all.
>>>>
>>>> -- Forwarded message --
>>>> From: Tim Chen 
>>>> Date: Sun, Jan 4, 2015 at 10:46 PM
>>>> Subject: Re: Controlling number of executors on Mesos vs YARN
>>>> To: mvle 
>>>>
>>>>
>>>> Hi Mike,
>>>>
>>>> You're correct there is no such setting for Mesos coarse grain mode,
>>>> since the assumption is that each node is launched with one container and
>>>> Spark is launching multiple tasks in that container.
>>>>
>>>> In fine-grain mode there isn't a setting like that, as it currently
>>>> will launch an executor as long as it satisfies the minimum container
>>>> resource requirement.
>>>>
>>>> I've created a JIRA earlier about capping the number of executors or
>>>> better distribute the # of executors launched in each node. Since the
>>>> decision of choosing what node to launch containers is all in the Spark
>>>> scheduler side, it's very easy to modify it.
>>>>
>>>> Btw, what's the configuration to set the # of executors on YARN side?
>>>>
>>>> Thanks,
>>>>
>>>> Tim
>>>>
>>>>
>>>>
>>>> On Sun, Jan 4, 2015 at 9:37 PM, mvle  wrote:
>>>>
>>>>> I'm trying to compare the performance of Spark running on Mesos vs
>>>>> YARN.
>>>>> However, I am having problems being able to configure the Spark
>>>>> workload to
>>>>> run in a similar way on Mesos and YARN.
>>>>>
>>>>> When running Spark on YARN, you can specify the number of executors per
>>>>> node. So if I have a node with 4 CPUs, I can specify 6 executors on
>>>>> that
>>>>> node. When running Spark on Mesos, there doesn't seem to be an
>>>>> equivalent
>>>>> way to specify this. In Mesos, you can somewhat force this by
>>>>> specifying the
>>>>> number of CPU resources to be 6 when running the slave daemon.
>>>>> However, this
>>>>> seems to be a static configuration of the Mesos cluster rather than
>>>>> something that can be configured in the Spark framework.
>>>>>
>>>>> So here is my question:
>>>>>
>>>>> For Spark on Mesos, am I correct that there is no way to control the
>>>>> number of executors per node (assuming an idle cluster)? For Spark on
>>>>> Mesos coarse-grained mode, there is a way to specify max_cores but that
>>>>> is still not equivalent to specifying the number of executors per node
>>>>> as when Spark is run on YARN.
>

Re: Controlling number of executors on Mesos vs YARN

2015-08-12 Thread Tim Chen
Are you referring to both fine-grain and coarse-grain mode?

A desired number of executors per node could be interesting, but it can't be
guaranteed (or we could try to, and abort the job when that fails).

How would you imagine this new option to actually work?


Tim
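
One way to picture the proposed cap, as a purely hypothetical sketch (the
names below are illustrative, not real Spark or Mesos APIs): the scheduler
would track launched executors per host and decline resource offers from
hosts that have already reached the cap.

    // Hypothetical sketch of the proposed per-node executor cap.
    case class Offer(host: String, cpus: Int, memMb: Int)

    def shouldLaunchExecutor(
        offer: Offer,
        executorsPerHost: Map[String, Int],
        maxExecutorsPerHost: Int,
        minCpus: Int,
        minMemMb: Int): Boolean = {
      val running = executorsPerHost.getOrElse(offer.host, 0)
      // Decline if the host is at the cap or the offer is too small to launch on.
      running < maxExecutorsPerHost && offer.cpus >= minCpus && offer.memMb >= minMemMb
    }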

On Wed, Aug 12, 2015 at 11:48 AM, Ajay Singal  wrote:

> Hi Tim,
>
> An option like spark.mesos.executor.max to cap the number of executors
> per node/application would be very useful.  However, having an option like
> spark.mesos.executor.num to specify the desired number of executors per node
> would provide even better control.
>
> Thanks,
> Ajay
>
> On Wed, Aug 12, 2015 at 4:18 AM, Tim Chen  wrote:
>
>> Yes, the options are not that configurable yet, but I think it's not hard
>> to change.
>>
>> I actually have a patch out specifically to make the number of CPUs per
>> executor configurable in coarse-grain mode; hopefully it will be merged in
>> the next release.
>>
>> I think the open question now is whether, for fine-grain mode, we can limit
>> the maximum number of concurrent executors, and I think we can definitely
>> just add a new option like spark.mesos.executor.max to cap it.
>>
>> I'll file a JIRA and hopefully get this change in soon too.
>>
>> Tim
>>
>>
>>
>> On Tue, Aug 11, 2015 at 6:21 AM, Haripriya Ayyalasomayajula <
>> aharipriy...@gmail.com> wrote:
>>
>>> Spark evolved as an example framework for Mesos - that's how I know it.
>>> It is surprising to see that the options provided by Mesos in this case are
>>> fewer. As for tweaking the source code, I haven't done it yet, but I would
>>> love to see what options could be there!
>>>
>>> On Tue, Aug 11, 2015 at 5:42 AM, Jerry Lam  wrote:
>>>
>>>> My experience with Mesos + Spark is not great. I saw one executor with
>>>> 30 CPU and the other executor with 6. So I don't think you can easily
>>>> configure it without some tweaking at the source code.
>>>>
>>>> Sent from my iPad
>>>>
>>>> On 2015-08-11, at 2:38, Haripriya Ayyalasomayajula <
>>>> aharipriy...@gmail.com> wrote:
>>>>
>>>> Hi Tim,
>>>>
>>>> Spark on Yarn allows us to do it using --num-executors and
>>>> --executor-cores command-line arguments. I just got a chance to look at a
>>>> similar spark user list mail, but no answer yet. So does mesos allow
>>>> setting the number of executors and cores? Is there a default number it
>>>> assumes?
>>>>
>>>> On Mon, Jan 5, 2015 at 5:07 PM, Tim Chen  wrote:
>>>>
>>>>> Forgot to hit reply-all.
>>>>>
>>>>> -- Forwarded message --
>>>>> From: Tim Chen 
>>>>> Date: Sun, Jan 4, 2015 at 10:46 PM
>>>>> Subject: Re: Controlling number of executors on Mesos vs YARN
>>>>> To: mvle 
>>>>>
>>>>>
>>>>> Hi Mike,
>>>>>
>>>>> You're correct, there is no such setting for Mesos coarse-grain mode,
>>>>> since the assumption is that each node is launched with one container
>>>>> and Spark is launching multiple tasks in that container.
>>>>>
>>>>> In fine-grain mode there isn't a setting like that, as it currently
>>>>> will launch an executor as long as it satisfies the minimum container
>>>>> resource requirement.
>>>>>
>>>>> I've created a JIRA earlier about capping the number of executors, or
>>>>> better distributing the # of executors launched on each node. Since the
>>>>> decision of which node to launch containers on is all on the Spark
>>>>> scheduler side, it's very easy to modify.
>>>>>
>>>>> Btw, what's the configuration to set the # of executors on YARN side?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Tim
>>>>>
>>>>>
>>>>>
>>>>> On Sun, Jan 4, 2015 at 9:37 PM, mvle  wrote:
>>>>>
>>>>>> I'm trying to compare the performance of Spark running on Mesos vs
>>>>>> YARN.
>>>>>> However, I am having trouble configuring the Spark workload to
>>>>>> run in a similar way on Mesos and YARN.
>>>>>>
>>>>>> When running Spark on YARN, you can specify the number of executors per node.

Re: Controlling number of executors on Mesos vs YARN

2015-08-12 Thread Jerry Lam
Great stuff, Tim. This will definitely make Mesos users' lives easier.

Sent from my iPad

On 2015-08-12, at 11:52, Haripriya Ayyalasomayajula  
wrote:

> Thanks Tim, Jerry.
> 
> On Wed, Aug 12, 2015 at 1:18 AM, Tim Chen  wrote:
> Yes, the options are not that configurable yet, but I think it's not hard to
> change.
> 
> I actually have a patch out specifically to make the number of CPUs per
> executor configurable in coarse-grain mode; hopefully it will be merged in
> the next release.
> 
> I think the open question now is whether, for fine-grain mode, we can limit
> the maximum number of concurrent executors, and I think we can definitely
> just add a new option like spark.mesos.executor.max to cap it.
> 
> I'll file a JIRA and hopefully get this change in soon too.
> 
> Tim
> 
> 
> 
> On Tue, Aug 11, 2015 at 6:21 AM, Haripriya Ayyalasomayajula 
>  wrote:
> Spark evolved as an example framework for Mesos - that's how I know it. It is 
> surprising to see that the options provided by Mesos in this case are fewer. 
> As for tweaking the source code, I haven't done it yet, but I would love to 
> see what options could be there!
> 
> On Tue, Aug 11, 2015 at 5:42 AM, Jerry Lam  wrote:
> My experience with Mesos + Spark is not great. I saw one executor with 30 CPU 
> and the other executor with 6. So I don't think you can easily configure it 
> without some tweaking at the source code.
> 
> Sent from my iPad
> 
> On 2015-08-11, at 2:38, Haripriya Ayyalasomayajula  
> wrote:
> 
>> Hi Tim,
>> 
>> Spark on Yarn allows us to do it using --num-executors and --executor-cores 
>> command-line arguments. I just got a chance to look at a similar spark user 
>> list mail, but no answer yet. So does mesos allow setting the number of 
>> executors and cores? Is there a default number it assumes?
>> 
>> On Mon, Jan 5, 2015 at 5:07 PM, Tim Chen  wrote:
>> Forgot to hit reply-all.
>> 
>> -- Forwarded message --
>> From: Tim Chen 
>> Date: Sun, Jan 4, 2015 at 10:46 PM
>> Subject: Re: Controlling number of executors on Mesos vs YARN
>> To: mvle 
>> 
>> 
>> Hi Mike,
>> 
>> You're correct, there is no such setting for Mesos coarse-grain mode, 
>> since the assumption is that each node is launched with one container and 
>> Spark is launching multiple tasks in that container.
>> 
>> In fine-grain mode there isn't a setting like that, as it currently will 
>> launch an executor as long as it satisfies the minimum container resource 
>> requirement.
>> 
>> I've created a JIRA earlier about capping the number of executors, or better 
>> distributing the # of executors launched on each node. Since the decision of 
>> which node to launch containers on is all on the Spark scheduler side, it's 
>> very easy to modify.
>> 
>> Btw, what's the configuration to set the # of executors on YARN side?
>> 
>> Thanks,
>> 
>> Tim
>> 
>> 
>> 
>> On Sun, Jan 4, 2015 at 9:37 PM, mvle  wrote:
>> I'm trying to compare the performance of Spark running on Mesos vs YARN.
>> However, I am having trouble configuring the Spark workload to run in a
>> similar way on Mesos and YARN.
>> 
>> When running Spark on YARN, you can specify the number of executors per
>> node. So if I have a node with 4 CPUs, I can specify 6 executors on that
>> node. When running Spark on Mesos, there doesn't seem to be an equivalent
>> way to specify this. In Mesos, you can somewhat force this by specifying the
>> number of CPU resources to be 6 when running the slave daemon. However, this
>> seems to be a static configuration of the Mesos cluster rather than
>> something that can be configured in the Spark framework.
>> 
>> So here is my question:
>> 
>> For Spark on Mesos, am I correct that there is no way to control the number
>> of executors per node (assuming an idle cluster)? For Spark on Mesos
>> coarse-grained mode, there is a way to specify max_cores but that is still
>> not equivalent to specifying the number of executors per node as when Spark
>> is run on YARN.
>> 
>> If I am correct, then it seems Spark might be at a disadvantage running on
>> Mesos compared to YARN (since it lacks the fine tuning ability provided by
>> YARN).
>> 
>> Thanks,
>> Mike
>> 
>> 
>> 
>> --
>> View this message in context: 
>> http://apache-spark-user-list.1001560.n3.nabble.com/Controlling-number-of-executors-on-Mesos-vs-YARN-tp20966.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>> 
>> -
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>> 
>> 
>> 
>> 
>> 
>> 
>> -- 
>> Regards,
>> Haripriya Ayyalasomayajula 
>> 
> 
> 
> 
> -- 
> Regards,
> Haripriya Ayyalasomayajula 
> 
> 
> 
> 
> 
> -- 
> Regards,
> Haripriya Ayyalasomayajula 
> 


Re: Controlling number of executors on Mesos vs YARN

2015-08-12 Thread Tim Chen
Yes, the options are not that configurable yet, but I think it's not hard to
change.

I actually have a patch out specifically to make the number of CPUs per
executor configurable in coarse-grain mode; hopefully it will be merged in
the next release.

I think the open question now is whether, for fine-grain mode, we can limit
the maximum number of concurrent executors, and I think we can definitely
just add a new option like spark.mesos.executor.max to cap it.

I'll file a JIRA and hopefully get this change in soon too.

Tim



On Tue, Aug 11, 2015 at 6:21 AM, Haripriya Ayyalasomayajula <
aharipriy...@gmail.com> wrote:

> Spark evolved as an example framework for Mesos - that's how I know it. It
> is surprising to see that the options provided by Mesos in this case are
> fewer. As for tweaking the source code, I haven't done it yet, but I would
> love to see what options could be there!
>
> On Tue, Aug 11, 2015 at 5:42 AM, Jerry Lam  wrote:
>
>> My experience with Mesos + Spark is not great. I saw one executor with 30
>> CPU and the other executor with 6. So I don't think you can easily
>> configure it without some tweaking at the source code.
>>
>> Sent from my iPad
>>
>> On 2015-08-11, at 2:38, Haripriya Ayyalasomayajula <
>> aharipriy...@gmail.com> wrote:
>>
>> Hi Tim,
>>
>> Spark on Yarn allows us to do it using --num-executors and
>> --executor-cores command-line arguments. I just got a chance to look at a
>> similar spark user list mail, but no answer yet. So does mesos allow
>> setting the number of executors and cores? Is there a default number it
>> assumes?
>>
>> On Mon, Jan 5, 2015 at 5:07 PM, Tim Chen  wrote:
>>
>>> Forgot to hit reply-all.
>>>
>>> -- Forwarded message --
>>> From: Tim Chen 
>>> Date: Sun, Jan 4, 2015 at 10:46 PM
>>> Subject: Re: Controlling number of executors on Mesos vs YARN
>>> To: mvle 
>>>
>>>
>>> Hi Mike,
>>>
>>> You're correct, there is no such setting for Mesos coarse-grain mode,
>>> since the assumption is that each node is launched with one container and
>>> Spark is launching multiple tasks in that container.
>>>
>>> In fine-grain mode there isn't a setting like that, as it currently will
>>> launch an executor as long as it satisfies the minimum container resource
>>> requirement.
>>>
>>> I've created a JIRA earlier about capping the number of executors, or
>>> better distributing the # of executors launched on each node. Since the
>>> decision of which node to launch containers on is all on the Spark
>>> scheduler side, it's very easy to modify.
>>>
>>> Btw, what's the configuration to set the # of executors on YARN side?
>>>
>>> Thanks,
>>>
>>> Tim
>>>
>>>
>>>
>>> On Sun, Jan 4, 2015 at 9:37 PM, mvle  wrote:
>>>
>>>> I'm trying to compare the performance of Spark running on Mesos vs YARN.
>>>> However, I am having trouble configuring the Spark workload to
>>>> run in a similar way on Mesos and YARN.
>>>>
>>>> When running Spark on YARN, you can specify the number of executors per
>>>> node. So if I have a node with 4 CPUs, I can specify 6 executors on that
>>>> node. When running Spark on Mesos, there doesn't seem to be an
>>>> equivalent
>>>> way to specify this. In Mesos, you can somewhat force this by
>>>> specifying the
>>>> number of CPU resources to be 6 when running the slave daemon. However,
>>>> this
>>>> seems to be a static configuration of the Mesos cluster rather than
>>>> something that can be configured in the Spark framework.
>>>>
>>>> So here is my question:
>>>>
>>>> For Spark on Mesos, am I correct that there is no way to control the
>>>> number
>>>> of executors per node (assuming an idle cluster)? For Spark on Mesos
>>>> coarse-grained mode, there is a way to specify max_cores but that is
>>>> still
>>>> not equivalent to specifying the number of executors per node as when
>>>> Spark
>>>> is run on YARN.
>>>>
>>>> If I am correct, then it seems Spark might be at a disadvantage running
>>>> on
>>>> Mesos compared to YARN (since it lacks the fine tuning ability provided
>>>> by
>>>> YARN).
>>>>
>>>> Thanks,
>>>> Mike
>>>>
>>>>
>>>>
>>>> --
>>>> View this message in context:
>>>> http://apache-spark-user-list.1001560.n3.nabble.com/Controlling-number-of-executors-on-Mesos-vs-YARN-tp20966.html
>>>> Sent from the Apache Spark User List mailing list archive at Nabble.com
>>>> .
>>>>
>>>> -
>>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>>> For additional commands, e-mail: user-h...@spark.apache.org
>>>>
>>>>
>>>
>>>
>>
>>
>> --
>> Regards,
>> Haripriya Ayyalasomayajula
>>
>>
>
>
> --
> Regards,
> Haripriya Ayyalasomayajula
>
>


Re: Controlling number of executors on Mesos vs YARN

2015-08-11 Thread Haripriya Ayyalasomayajula
Spark evolved as an example framework for Mesos - that's how I know it. It
is surprising to see that the options provided by Mesos in this case are
fewer. As for tweaking the source code, I haven't done it yet, but I would
love to see what options could be there!

On Tue, Aug 11, 2015 at 5:42 AM, Jerry Lam  wrote:

> My experience with Mesos + Spark is not great. I saw one executor with 30
> CPU and the other executor with 6. So I don't think you can easily
> configure it without some tweaking at the source code.
>
> Sent from my iPad
>
> On 2015-08-11, at 2:38, Haripriya Ayyalasomayajula 
> wrote:
>
> Hi Tim,
>
> Spark on Yarn allows us to do it using --num-executors and
> --executor-cores command-line arguments. I just got a chance to look at a
> similar spark user list mail, but no answer yet. So does mesos allow
> setting the number of executors and cores? Is there a default number it
> assumes?
>
> On Mon, Jan 5, 2015 at 5:07 PM, Tim Chen  wrote:
>
>> Forgot to hit reply-all.
>>
>> -- Forwarded message --
>> From: Tim Chen 
>> Date: Sun, Jan 4, 2015 at 10:46 PM
>> Subject: Re: Controlling number of executors on Mesos vs YARN
>> To: mvle 
>>
>>
>> Hi Mike,
>>
>> You're correct, there is no such setting for Mesos coarse-grain mode,
>> since the assumption is that each node is launched with one container and
>> Spark is launching multiple tasks in that container.
>>
>> In fine-grain mode there isn't a setting like that, as it currently will
>> launch an executor as long as it satisfies the minimum container resource
>> requirement.
>>
>> I've created a JIRA earlier about capping the number of executors, or
>> better distributing the # of executors launched on each node. Since the
>> decision of which node to launch containers on is all on the Spark
>> scheduler side, it's very easy to modify.
>>
>> Btw, what's the configuration to set the # of executors on YARN side?
>>
>> Thanks,
>>
>> Tim
>>
>>
>>
>> On Sun, Jan 4, 2015 at 9:37 PM, mvle  wrote:
>>
>>> I'm trying to compare the performance of Spark running on Mesos vs YARN.
>>> However, I am having trouble configuring the Spark workload to
>>> run in a similar way on Mesos and YARN.
>>>
>>> When running Spark on YARN, you can specify the number of executors per
>>> node. So if I have a node with 4 CPUs, I can specify 6 executors on that
>>> node. When running Spark on Mesos, there doesn't seem to be an equivalent
>>> way to specify this. In Mesos, you can somewhat force this by specifying
>>> the
>>> number of CPU resources to be 6 when running the slave daemon. However,
>>> this
>>> seems to be a static configuration of the Mesos cluster rather than
>>> something that can be configured in the Spark framework.
>>>
>>> So here is my question:
>>>
>>> For Spark on Mesos, am I correct that there is no way to control the
>>> number
>>> of executors per node (assuming an idle cluster)? For Spark on Mesos
>>> coarse-grained mode, there is a way to specify max_cores but that is
>>> still
>>> not equivalent to specifying the number of executors per node as when
>>> Spark
>>> is run on YARN.
>>>
>>> If I am correct, then it seems Spark might be at a disadvantage running
>>> on
>>> Mesos compared to YARN (since it lacks the fine tuning ability provided
>>> by
>>> YARN).
>>>
>>> Thanks,
>>> Mike
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/Controlling-number-of-executors-on-Mesos-vs-YARN-tp20966.html
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>
>>> -
>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>> For additional commands, e-mail: user-h...@spark.apache.org
>>>
>>>
>>
>>
>
>
> --
> Regards,
> Haripriya Ayyalasomayajula
>
>


-- 
Regards,
Haripriya Ayyalasomayajula


Re: Controlling number of executors on Mesos vs YARN

2015-08-11 Thread Jerry Lam
My experience with Mesos + Spark is not great. I saw one executor with 30 CPU 
and the other executor with 6. So I don't think you can easily configure it 
without some tweaking at the source code.

Sent from my iPad

On 2015-08-11, at 2:38, Haripriya Ayyalasomayajula  
wrote:

> Hi Tim,
> 
> Spark on Yarn allows us to do it using --num-executors and --executor-cores 
> command-line arguments. I just got a chance to look at a similar spark user 
> list mail, but no answer yet. So does mesos allow setting the number of 
> executors and cores? Is there a default number it assumes?
> 
> On Mon, Jan 5, 2015 at 5:07 PM, Tim Chen  wrote:
> Forgot to hit reply-all.
> 
> -- Forwarded message --
> From: Tim Chen 
> Date: Sun, Jan 4, 2015 at 10:46 PM
> Subject: Re: Controlling number of executors on Mesos vs YARN
> To: mvle 
> 
> 
> Hi Mike,
> 
> You're correct, there is no such setting for Mesos coarse-grain mode, since 
> the assumption is that each node is launched with one container and Spark is 
> launching multiple tasks in that container.
> 
> In fine-grain mode there isn't a setting like that, as it currently will 
> launch an executor as long as it satisfies the minimum container resource 
> requirement.
> 
> I've created a JIRA earlier about capping the number of executors, or better 
> distributing the # of executors launched on each node. Since the decision of 
> which node to launch containers on is all on the Spark scheduler side, it's 
> very easy to modify.
> 
> Btw, what's the configuration to set the # of executors on YARN side?
> 
> Thanks,
> 
> Tim
> 
> 
> 
> On Sun, Jan 4, 2015 at 9:37 PM, mvle  wrote:
> I'm trying to compare the performance of Spark running on Mesos vs YARN.
> However, I am having trouble configuring the Spark workload to
> run in a similar way on Mesos and YARN.
> 
> When running Spark on YARN, you can specify the number of executors per
> node. So if I have a node with 4 CPUs, I can specify 6 executors on that
> node. When running Spark on Mesos, there doesn't seem to be an equivalent
> way to specify this. In Mesos, you can somewhat force this by specifying the
> number of CPU resources to be 6 when running the slave daemon. However, this
> seems to be a static configuration of the Mesos cluster rather than something
> that can be configured in the Spark framework.
> 
> So here is my question:
> 
> For Spark on Mesos, am I correct that there is no way to control the number
> of executors per node (assuming an idle cluster)? For Spark on Mesos
> coarse-grained mode, there is a way to specify max_cores but that is still
> not equivalent to specifying the number of executors per node as when Spark
> is run on YARN.
> 
> If I am correct, then it seems Spark might be at a disadvantage running on
> Mesos compared to YARN (since it lacks the fine tuning ability provided by
> YARN).
> 
> Thanks,
> Mike
> 
> 
> 
> --
> View this message in context: 
> http://apache-spark-user-list.1001560.n3.nabble.com/Controlling-number-of-executors-on-Mesos-vs-YARN-tp20966.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
> 
> -
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
> 
> 
> 
> 
> 
> 
> -- 
> Regards,
> Haripriya Ayyalasomayajula 
> 


Re: Controlling number of executors on Mesos vs YARN

2015-08-10 Thread Haripriya Ayyalasomayajula
Hi Tim,

Spark on Yarn allows us to do it using --num-executors and --executor-cores
command-line arguments. I just got a chance to look at a similar spark user
list mail, but no answer yet. So does mesos allow setting the number of
executors and cores? Is there a default number it assumes?

On Mon, Jan 5, 2015 at 5:07 PM, Tim Chen  wrote:

> Forgot to hit reply-all.
>
> -- Forwarded message --
> From: Tim Chen 
> Date: Sun, Jan 4, 2015 at 10:46 PM
> Subject: Re: Controlling number of executors on Mesos vs YARN
> To: mvle 
>
>
> Hi Mike,
>
> You're correct, there is no such setting for Mesos coarse-grain mode,
> since the assumption is that each node is launched with one container and
> Spark is launching multiple tasks in that container.
>
> In fine-grain mode there isn't a setting like that, as it currently will
> launch an executor as long as it satisfies the minimum container resource
> requirement.
>
> I've created a JIRA earlier about capping the number of executors, or
> better distributing the # of executors launched on each node. Since the
> decision of which node to launch containers on is all on the Spark
> scheduler side, it's very easy to modify.
>
> Btw, what's the configuration to set the # of executors on YARN side?
>
> Thanks,
>
> Tim
>
>
>
> On Sun, Jan 4, 2015 at 9:37 PM, mvle  wrote:
>
>> I'm trying to compare the performance of Spark running on Mesos vs YARN.
>> However, I am having trouble configuring the Spark workload to
>> run in a similar way on Mesos and YARN.
>>
>> When running Spark on YARN, you can specify the number of executors per
>> node. So if I have a node with 4 CPUs, I can specify 6 executors on that
>> node. When running Spark on Mesos, there doesn't seem to be an equivalent
>> way to specify this. In Mesos, you can somewhat force this by specifying
>> the
>> number of CPU resources to be 6 when running the slave daemon. However,
>> this
>> seems to be a static configuration of the Mesos cluster rather than
>> something that can be configured in the Spark framework.
>>
>> So here is my question:
>>
>> For Spark on Mesos, am I correct that there is no way to control the
>> number
>> of executors per node (assuming an idle cluster)? For Spark on Mesos
>> coarse-grained mode, there is a way to specify max_cores but that is still
>> not equivalent to specifying the number of executors per node as when
>> Spark
>> is run on YARN.
>>
>> If I am correct, then it seems Spark might be at a disadvantage running on
>> Mesos compared to YARN (since it lacks the fine tuning ability provided by
>> YARN).
>>
>> Thanks,
>> Mike
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Controlling-number-of-executors-on-Mesos-vs-YARN-tp20966.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>> -
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>>
>>
>
>


-- 
Regards,
Haripriya Ayyalasomayajula


Re: Determining number of executors within RDD

2015-06-10 Thread Nishkam Ravi
This PR adds support for multiple executors per worker:
https://github.com/apache/spark/pull/731 and should be available in 1.4.

Thanks,
Nishkam
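
A minimal sketch of what this enables, assuming Spark 1.4+ standalone mode
and a worker advertising 16 cores: once spark.executor.cores is set, a single
worker can host several executors (here up to 16 / 4 = 4 per worker), bounded
overall by spark.cores.max.

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setMaster("spark://master:7077")    // assumed standalone master URL
      .setAppName("multi-executor-per-worker-sketch")
      .set("spark.executor.cores", "4")    // cores per executor
      .set("spark.executor.memory", "4g")  // memory per executor
      .set("spark.cores.max", "32")        // cap on total cores across the app
    val sc = new SparkContext(conf)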

On Wed, Jun 10, 2015 at 1:35 PM, Evo Eftimov  wrote:

> We were discussing STANDALONE mode; besides, maxdml had already
> summarized what is available and possible under YARN.
>
> So let me recap: for standalone mode, if you need more than 1 executor per
> physical host, e.g. to partition its system resources more finely (especially
> RAM per JVM instance), you need to go for what is essentially a bit of a
> hack, i.e. running more than 1 worker per machine.
>
>
> Sent from Samsung Mobile
>
>
>  Original message 
> From: Sandy Ryza
> Date:2015/06/10 21:31 (GMT+00:00)
> To: Evo Eftimov
> Cc: maxdml ,user@spark.apache.org
> Subject: Re: Determining number of executors within RDD
>
> On YARN, there is no concept of a Spark Worker.  Multiple executors will
> be run per node without any effort required by the user, as long as all the
> executors fit within each node's resource limits.
>
> -Sandy
>
> On Wed, Jun 10, 2015 at 3:24 PM, Evo Eftimov 
> wrote:
>
>> Yes, I think it is ONE worker, ONE executor, as an executor is nothing but
>> a JVM instance spawned by the worker.
>>
>> To run more executors, i.e. JVM instances, on the same physical cluster node,
>> you need to run more than one worker on that node and then allocate only
>> part of the system resources to each worker/executor.
>>
>>
>> Sent from Samsung Mobile
>>
>>
>> ---- Original message 
>> From: maxdml
>> Date:2015/06/10 19:56 (GMT+00:00)
>> To: user@spark.apache.org
>> Subject: Re: Determining number of executors within RDD
>>
>> Actually this is somewhat confusing, for two reasons:
>>
>> - First, the option 'spark.executor.instances', which seems to be dealt
>> with only in the case of YARN in the source code of SparkSubmit.scala, is
>> also present in the conf/spark-env.sh file under the standalone section,
>> which would indicate that it is also available for this mode.
>>
>> - Second, a post from Andrew Or states that this property defines the
>> number of workers in the cluster, not the number of executors on a given
>> worker.
>> (
>> http://apache-spark-user-list.1001560.n3.nabble.com/clarification-for-some-spark-on-yarn-configuration-options-td13692.html
>> )
>>
>> Could anyone clarify this? :-)
>>
>> Thanks.
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Determining-number-of-executors-within-RDD-tp15554p23262.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>> -
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>>
>>
>


Re: Determining number of executors within RDD

2015-06-10 Thread Evo Eftimov
We were discussing STANDALONE mode; besides, maxdml had already summarized
what is available and possible under YARN.

So let me recap: for standalone mode, if you need more than 1 executor per
physical host, e.g. to partition its system resources more finely (especially
RAM per JVM instance), you need to go for what is essentially a bit of a hack,
i.e. running more than 1 worker per machine.


Sent from Samsung Mobile

 Original message From: Sandy Ryza 
 Date:2015/06/10  21:31  (GMT+00:00) 
To: Evo Eftimov  Cc: maxdml 
,user@spark.apache.org Subject: Re: Determining 
number of executors within RDD 
On YARN, there is no concept of a Spark Worker.  Multiple executors will 
be run per node without any effort required by the user, as long as all the 
executors fit within each node's resource limits.

-Sandy

On Wed, Jun 10, 2015 at 3:24 PM, Evo Eftimov  wrote:
Yes, I think it is ONE worker, ONE executor, as an executor is nothing but a
JVM instance spawned by the worker.

To run more executors, i.e. JVM instances, on the same physical cluster node,
you need to run more than one worker on that node and then allocate only part
of the system resources to each worker/executor.


Sent from Samsung Mobile


 Original message 
From: maxdml
Date:2015/06/10 19:56 (GMT+00:00)
To: user@spark.apache.org
Subject: Re: Determining number of executors within RDD

Actually this is somewhat confusing, for two reasons:

- First, the option 'spark.executor.instances', which seems to be dealt with
only in the case of YARN in the source code of SparkSubmit.scala, is also
present in the conf/spark-env.sh file under the standalone section, which
would indicate that it is also available for this mode.

- Second, a post from Andrew Or states that this property defines the
number of workers in the cluster, not the number of executors on a given
worker.
(http://apache-spark-user-list.1001560.n3.nabble.com/clarification-for-some-spark-on-yarn-configuration-options-td13692.html)

Could anyone clarify this? :-)

Thanks.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Determining-number-of-executors-within-RDD-tp15554p23262.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org




Re: Determining number of executors within RDD

2015-06-10 Thread Sandy Ryza
On YARN, there is no concept of a Spark Worker.  Multiple executors will be
run per node without any effort required by the user, as long as all the
executors fit within each node's resource limits.

-Sandy
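
For reference, a minimal sketch of the standard YARN-side knobs this refers
to (the values are illustrative):

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setMaster("yarn-client")               // Spark 1.x style YARN master string
      .setAppName("yarn-executors-sketch")
      .set("spark.executor.instances", "6")   // how many executors to request
      .set("spark.executor.cores", "2")       // cores per executor
      .set("spark.executor.memory", "3g")     // heap per executor
    val sc = new SparkContext(conf)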

On Wed, Jun 10, 2015 at 3:24 PM, Evo Eftimov  wrote:

> Yes, I think it is ONE worker, ONE executor, as an executor is nothing but
> a JVM instance spawned by the worker.
>
> To run more executors, i.e. JVM instances, on the same physical cluster node,
> you need to run more than one worker on that node and then allocate only
> part of the system resources to each worker/executor.
>
>
> Sent from Samsung Mobile
>
>
>  Original message 
> From: maxdml
> Date:2015/06/10 19:56 (GMT+00:00)
> To: user@spark.apache.org
> Subject: Re: Determining number of executors within RDD
>
> Actually this is somewhat confusing, for two reasons:
>
> - First, the option 'spark.executor.instances', which seems to be dealt
> with only in the case of YARN in the source code of SparkSubmit.scala, is
> also present in the conf/spark-env.sh file under the standalone section,
> which would indicate that it is also available for this mode.
>
> - Second, a post from Andrew Or states that this property defines the
> number of workers in the cluster, not the number of executors on a given
> worker.
> (
> http://apache-spark-user-list.1001560.n3.nabble.com/clarification-for-some-spark-on-yarn-configuration-options-td13692.html
> )
>
> Could anyone clarify this? :-)
>
> Thanks.
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Determining-number-of-executors-within-RDD-tp15554p23262.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> -
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>


Re: Determining number of executors within RDD

2015-06-10 Thread Evo Eftimov
Yes, I think it is ONE worker, ONE executor, as an executor is nothing but a
JVM instance spawned by the worker.

To run more executors, i.e. JVM instances, on the same physical cluster node,
you need to run more than one worker on that node and then allocate only part
of the system resources to each worker/executor.


Sent from Samsung Mobile
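
For concreteness, that multi-worker hack is typically configured in
conf/spark-env.sh on each node; a sketch with illustrative values:

    SPARK_WORKER_INSTANCES=2   # run two worker daemons on this machine
    SPARK_WORKER_CORES=8       # cores each worker advertises
    SPARK_WORKER_MEMORY=16g    # memory each worker advertises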

 Original message From: maxdml 
 Date:2015/06/10  19:56  (GMT+00:00) 
To: user@spark.apache.org Subject: Re: Determining number 
of executors within RDD 
Actually this is somewhat confusing, for two reasons:

- First, the option 'spark.executor.instances', which seems to be dealt with
only in the case of YARN in the source code of SparkSubmit.scala, is also
present in the conf/spark-env.sh file under the standalone section, which
would indicate that it is also available for this mode.

- Second, a post from Andrew Or states that this property defines the
number of workers in the cluster, not the number of executors on a given
worker.
(http://apache-spark-user-list.1001560.n3.nabble.com/clarification-for-some-spark-on-yarn-configuration-options-td13692.html)

Could anyone clarify this? :-)

Thanks.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Determining-number-of-executors-within-RDD-tp15554p23262.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Determining number of executors within RDD

2015-06-10 Thread maxdml
Actually this is somewhat confusing, for two reasons:

- First, the option 'spark.executor.instances', which seems to be dealt with
only in the case of YARN in the source code of SparkSubmit.scala, is also
present in the conf/spark-env.sh file under the standalone section, which
would indicate that it is also available for this mode.

- Second, a post from Andrew Or states that this property defines the
number of workers in the cluster, not the number of executors on a given
worker.
(http://apache-spark-user-list.1001560.n3.nabble.com/clarification-for-some-spark-on-yarn-configuration-options-td13692.html)

Could anyone clarify this? :-)

Thanks.
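
One way to sidestep the ambiguity is to ask the running application how many
executors it actually acquired; a small sketch, assuming an existing
SparkContext named sc:

    // getExecutorMemoryStatus has one entry per block manager, which includes
    // the driver, hence the "- 1".
    val executorCount = sc.getExecutorMemoryStatus.size - 1
    println(s"Live executors: $executorCount")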



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Determining-number-of-executors-within-RDD-tp15554p23262.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Determining number of executors within RDD

2015-06-10 Thread maxdml
Note that this property is only available for YARN



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Determining-number-of-executors-within-RDD-tp15554p23256.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Determining number of executors within RDD

2015-06-10 Thread Himanshu Mehra
Hi Akshat,

I assume what you want is to control the number of partitions in your RDD,
which is easily achievable by passing the numSlices or minSplits argument at
the time of RDD creation, for example:
val someRDD = sc.parallelize(someCollection, numSlices)
val someRDD = sc.textFile(pathToFile, minSplits)

You can check the number of partitions your RDD has with
'someRDD.partitions.size'. And if you want to reduce or increase the number
of partitions, you can call the 'repartition(numPartitions)' method, which
reshuffles the data and partitions it into 'numPartitions' partitions.

And of course, if you want, you can set the number of executors as well
by setting the 'spark.executor.instances' property in the 'sparkConf' object.

Thank you.
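
A compact, runnable version of the above, assuming an existing SparkContext
named sc (the numbers are arbitrary):

    val someRDD = sc.parallelize(1 to 1000, 8)   // numSlices = 8
    println(someRDD.partitions.size)             // prints 8
    val wider = someRDD.repartition(16)          // reshuffles into 16 partitions
    println(wider.partitions.size)               // prints 16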



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Determining-number-of-executors-within-RDD-tp15554p23241.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Determining number of executors within RDD

2015-06-09 Thread maxdml
You should try, from the SparkConf object, to issue a get.

I don't have the exact name for the matching key, but from reading the code
in SparkSubmit.scala, it should be something like:

conf.get("spark.executor.instances")
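
A slightly safer variant of that sketch, since a bare get throws a
NoSuchElementException when the key is unset (and the key is only meaningful
on YARN); assumes an existing SparkContext named sc:

    val numExecutors = sc.getConf.getOption("spark.executor.instances")
    println(numExecutors.getOrElse("not set (probably not running on YARN)"))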



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Determining-number-of-executors-within-RDD-tp15554p23234.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: number of executors

2015-05-18 Thread xiaohe lan
Yeah, I read that page before, but it does not mention that the options should
come before the application jar. Actually, if I put the --class option
before the application jar, I will get a ClassNotFoundException.

Anyway, thanks again Sandy.
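
The general shape documented on that page is roughly:

    ./bin/spark-submit --class <main-class> --master <master-url> [options] <application-jar> [application-arguments]

i.e. Spark options go before the application jar, and the application's own
arguments go after it.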

On Tue, May 19, 2015 at 11:06 AM, Sandy Ryza 
wrote:

> Awesome!
>
> It's documented here:
> https://spark.apache.org/docs/latest/submitting-applications.html
>
> -Sandy
>
> On Mon, May 18, 2015 at 8:03 PM, xiaohe lan 
> wrote:
>
>> Hi Sandy,
>>
>> Thanks for your information. Yes, spark-submit --master yarn
>> --num-executors 5 --executor-cores 4
>> target/scala-2.10/simple-project_2.10-1.0.jar --class scala.SimpleApp is
>> working awesomely. Is there any documentation pointing to this?
>>
>> Thanks,
>> Xiaohe
>>
>> On Tue, May 19, 2015 at 12:07 AM, Sandy Ryza 
>> wrote:
>>
>>> Hi Xiaohe,
>>>
>>> The all Spark options must go before the jar or they won't take effect.
>>>
>>> -Sandy
>>>
>>> On Sun, May 17, 2015 at 8:59 AM, xiaohe lan 
>>> wrote:
>>>
Sorry, they are both actually assigned tasks.

Aggregated Metrics by Executor
Executor ID | Address | Task Time | Total Tasks | Failed Tasks | Succeeded Tasks | Input Size / Records | Shuffle Write Size / Records | Shuffle Spill (Memory) | Shuffle Spill (Disk)
1 | host1:6184 | 1.7 min | 5 | 0 | 5 | 640.0 MB / 12318400 | 382.3 MB / 12100770 | 1630.4 MB | 295.4 MB
2 | host2:62072 | 1.7 min | 5 | 0 | 5 | 640.0 MB / 12014510 | 386.0 MB / 10926912 | 1646.6 MB | 304.8 MB

 On Sun, May 17, 2015 at 11:50 PM, xiaohe lan 
 wrote:

> bash-4.1$ ps aux | grep SparkSubmit
> xilan 1704 13.2  1.2 5275520 380244 pts/0  Sl+  08:39   0:13
> /scratch/xilan/jdk1.8.0_45/bin/java -cp
> /scratch/xilan/spark/conf:/scratch/xilan/spark/lib/spark-assembly-1.3.1-hadoop2.4.0.jar:/scratch/xilan/spark/lib/datanucleus-core-3.2.10.jar:/scratch/xilan/spark/lib/datanucleus-api-jdo-3.2.6.jar:/scratch/xilan/spark/lib/datanucleus-rdbms-3.2.9.jar:/scratch/xilan/hadoop/etc/hadoop
> -Xms512m -Xmx512m org.apache.spark.deploy.SparkSubmit --master yarn
> target/scala-2.10/simple-project_2.10-1.0.jar --class scala.SimpleApp
> --num-executors 5 --executor-cores 4
> xilan 1949  0.0  0.0 103292   800 pts/1S+   08:40   0:00 grep
> --color SparkSubmit
>
>
> When looking at the Spark UI, I see the following:
>
> Aggregated Metrics by Executor
> Executor ID | Address | Task Time | Total Tasks | Failed Tasks | Succeeded Tasks | Shuffle Read Size / Records
> 1 | host1:3048 | 36 s | 1 | 0 | 1 | 127.1 MB / 2808978
> 2 | host2:4997 | 0 ms | 0 | 0 | 0 | 63.4 MB / 1810945
>
> So executor 2 is not even assigned a task? Maybe I have some problems in my
> settings, but I don't know which settings I set wrong or have not set.
>
>
> Thanks,
> Xiaohe
>
> On Sun, May 17, 2015 at 11:16 PM, Akhil Das <
> ak...@sigmoidanalytics.com> wrote:
>
>> Did you try --executor-cores param? While you submit the job, do a ps
>> aux | grep spark-submit and see the exact command parameters.
>>
>> Thanks
>> Best Regards
>>
>> On Sat, May 16, 2015 at 12:31 PM, xiaohe lan 
>> wrote:
>>
>>> Hi,
>>>
>>> I have a 5-node YARN cluster, and I used spark-submit to submit a
>>> simple app.
>>>
>>>  spark-submit --master yarn
>>> target/scala-2.10/simple-project_2.10-1.0.jar --class scala.SimpleApp
>>> --num-executors 5
>>>
>>> I have set the number of executors to 5, but from the Spark UI I could see
>>> only two executors, and it ran very slowly. What did I miss?
>>>
>>> Thanks,
>>> Xiaohe
>>>
>>
>>
>

>>>
>>
>


Re: number of executors

2015-05-18 Thread Sandy Ryza
Awesome!

It's documented here:
https://spark.apache.org/docs/latest/submitting-applications.html

-Sandy

On Mon, May 18, 2015 at 8:03 PM, xiaohe lan  wrote:

> Hi Sandy,
>
> Thanks for your information. Yes, spark-submit --master yarn
> --num-executors 5 --executor-cores 4
> target/scala-2.10/simple-project_2.10-1.0.jar --class scala.SimpleApp is
> working awesomely. Is there any documentation pointing to this?
>
> Thanks,
> Xiaohe
>
> On Tue, May 19, 2015 at 12:07 AM, Sandy Ryza 
> wrote:
>
>> Hi Xiaohe,
>>
>> The all Spark options must go before the jar or they won't take effect.
>>
>> -Sandy
>>
>> On Sun, May 17, 2015 at 8:59 AM, xiaohe lan 
>> wrote:
>>
>>> Sorry, they are both actually assigned tasks.
>>>
>>> Aggregated Metrics by Executor
>>> Executor ID | Address | Task Time | Total Tasks | Failed Tasks | Succeeded Tasks | Input Size / Records | Shuffle Write Size / Records | Shuffle Spill (Memory) | Shuffle Spill (Disk)
>>> 1 | host1:6184 | 1.7 min | 5 | 0 | 5 | 640.0 MB / 12318400 | 382.3 MB / 12100770 | 1630.4 MB | 295.4 MB
>>> 2 | host2:62072 | 1.7 min | 5 | 0 | 5 | 640.0 MB / 12014510 | 386.0 MB / 10926912 | 1646.6 MB | 304.8 MB
>>>
>>> On Sun, May 17, 2015 at 11:50 PM, xiaohe lan 
>>> wrote:
>>>
 bash-4.1$ ps aux | grep SparkSubmit
 xilan 1704 13.2  1.2 5275520 380244 pts/0  Sl+  08:39   0:13
 /scratch/xilan/jdk1.8.0_45/bin/java -cp
 /scratch/xilan/spark/conf:/scratch/xilan/spark/lib/spark-assembly-1.3.1-hadoop2.4.0.jar:/scratch/xilan/spark/lib/datanucleus-core-3.2.10.jar:/scratch/xilan/spark/lib/datanucleus-api-jdo-3.2.6.jar:/scratch/xilan/spark/lib/datanucleus-rdbms-3.2.9.jar:/scratch/xilan/hadoop/etc/hadoop
 -Xms512m -Xmx512m org.apache.spark.deploy.SparkSubmit --master yarn
 target/scala-2.10/simple-project_2.10-1.0.jar --class scala.SimpleApp
 --num-executors 5 --executor-cores 4
 xilan 1949  0.0  0.0 103292   800 pts/1S+   08:40   0:00 grep
 --color SparkSubmit


When looking at the Spark UI, I see the following:

Aggregated Metrics by Executor
Executor ID | Address | Task Time | Total Tasks | Failed Tasks | Succeeded Tasks | Shuffle Read Size / Records
1 | host1:3048 | 36 s | 1 | 0 | 1 | 127.1 MB / 2808978
2 | host2:4997 | 0 ms | 0 | 0 | 0 | 63.4 MB / 1810945

So executor 2 is not even assigned a task? Maybe I have some problems in my
settings, but I don't know which settings I set wrong or have not set.


 Thanks,
 Xiaohe

 On Sun, May 17, 2015 at 11:16 PM, Akhil Das >>> > wrote:

> Did you try --executor-cores param? While you submit the job, do a ps
> aux | grep spark-submit and see the exact command parameters.
>
> Thanks
> Best Regards
>
> On Sat, May 16, 2015 at 12:31 PM, xiaohe lan 
> wrote:
>
>> Hi,
>>
>> I have a 5-node YARN cluster, and I used spark-submit to submit a simple
>> app.
>>
>>  spark-submit --master yarn
>> target/scala-2.10/simple-project_2.10-1.0.jar --class scala.SimpleApp
>> --num-executors 5
>>
>> I have set the number of executors to 5, but from the Spark UI I could see
>> only two executors, and it ran very slowly. What did I miss?
>>
>> Thanks,
>> Xiaohe
>>
>
>

>>>
>>
>


Re: number of executors

2015-05-18 Thread xiaohe lan
Hi Sandy,

Thanks for your information. Yes, spark-submit --master yarn
--num-executors 5 --executor-cores 4
target/scala-2.10/simple-project_2.10-1.0.jar --class scala.SimpleApp is
working awesomely. Is there any documentation pointing to this?

Thanks,
Xiaohe

On Tue, May 19, 2015 at 12:07 AM, Sandy Ryza 
wrote:

> Hi Xiaohe,
>
> The all Spark options must go before the jar or they won't take effect.
>
> -Sandy
>
> On Sun, May 17, 2015 at 8:59 AM, xiaohe lan 
> wrote:
>
>> Sorry, they are both actually assigned tasks.
>>
>> Aggregated Metrics by Executor
>> Executor ID | Address | Task Time | Total Tasks | Failed Tasks | Succeeded Tasks | Input Size / Records | Shuffle Write Size / Records | Shuffle Spill (Memory) | Shuffle Spill (Disk)
>> 1 | host1:6184 | 1.7 min | 5 | 0 | 5 | 640.0 MB / 12318400 | 382.3 MB / 12100770 | 1630.4 MB | 295.4 MB
>> 2 | host2:62072 | 1.7 min | 5 | 0 | 5 | 640.0 MB / 12014510 | 386.0 MB / 10926912 | 1646.6 MB | 304.8 MB
>>
>> On Sun, May 17, 2015 at 11:50 PM, xiaohe lan 
>> wrote:
>>
>>> bash-4.1$ ps aux | grep SparkSubmit
>>> xilan 1704 13.2  1.2 5275520 380244 pts/0  Sl+  08:39   0:13
>>> /scratch/xilan/jdk1.8.0_45/bin/java -cp
>>> /scratch/xilan/spark/conf:/scratch/xilan/spark/lib/spark-assembly-1.3.1-hadoop2.4.0.jar:/scratch/xilan/spark/lib/datanucleus-core-3.2.10.jar:/scratch/xilan/spark/lib/datanucleus-api-jdo-3.2.6.jar:/scratch/xilan/spark/lib/datanucleus-rdbms-3.2.9.jar:/scratch/xilan/hadoop/etc/hadoop
>>> -Xms512m -Xmx512m org.apache.spark.deploy.SparkSubmit --master yarn
>>> target/scala-2.10/simple-project_2.10-1.0.jar --class scala.SimpleApp
>>> --num-executors 5 --executor-cores 4
>>> xilan 1949  0.0  0.0 103292   800 pts/1S+   08:40   0:00 grep
>>> --color SparkSubmit
>>>
>>>
>>> When looking at the Spark UI, I see the following:
>>>
>>> Aggregated Metrics by Executor
>>> Executor ID | Address | Task Time | Total Tasks | Failed Tasks | Succeeded Tasks | Shuffle Read Size / Records
>>> 1 | host1:3048 | 36 s | 1 | 0 | 1 | 127.1 MB / 2808978
>>> 2 | host2:4997 | 0 ms | 0 | 0 | 0 | 63.4 MB / 1810945
>>>
>>> So executor 2 is not even assigned a task? Maybe I have some problems in my
>>> settings, but I don't know which settings I set wrong or have not set.
>>>
>>>
>>> Thanks,
>>> Xiaohe
>>>
>>> On Sun, May 17, 2015 at 11:16 PM, Akhil Das 
>>> wrote:
>>>
 Did you try --executor-cores param? While you submit the job, do a ps
 aux | grep spark-submit and see the exact command parameters.

 Thanks
 Best Regards

 On Sat, May 16, 2015 at 12:31 PM, xiaohe lan 
 wrote:

> Hi,
>
> I have a 5-node YARN cluster, and I used spark-submit to submit a simple
> app.
>
>  spark-submit --master yarn
> target/scala-2.10/simple-project_2.10-1.0.jar --class scala.SimpleApp
> --num-executors 5
>
> I have set the number of executors to 5, but from the Spark UI I could see
> only two executors, and it ran very slowly. What did I miss?
>
> Thanks,
> Xiaohe
>


>>>
>>
>


Re: number of executors

2015-05-18 Thread edward cui
Oh, BTW, it's Spark 1.3.1 on Hadoop 2.4, AMI 3.6.

Sorry for leaving out this information.

I'd appreciate any help!

Ed

2015-05-18 12:53 GMT-04:00 edward cui :

> I actually have the same problem, but I am not sure whether it is a Spark
> problem or a YARN problem.
>
> I set up a five-node cluster on AWS EMR and started the YARN daemon on the
> master (the node manager will not be started by default on the master; I
> don't want to waste any resources since I have to pay). I then submit the
> Spark task in yarn-cluster mode. The command is:
> ./spark/bin/spark-submit --master yarn-cluster --num-executors 5
> --executor-cores 4 --properties-file spark-application.conf myapp.py
>
> But the YARN resource manager only created 4 containers on 4 nodes, and
> one node was completely idle.
>
> More details about the setup:
> EMR node:
> m3.xlarge: 16g ram 4 cores 40g ssd (HDFS on EBS?)
>
> Yarn-site.xml:
> yarn.scheduler.maximum-allocation-mb=11520
> yarn.nodemanager.resource.memory-mb=11520
>
> Spark-conf:
>
> spark.executor.memory         10g
> spark.storage.memoryFraction  0.2
> spark.python.worker.memory    1500m
> spark.akka.frameSize          200
> spark.shuffle.memoryFraction  0.1
> spark.driver.memory           10g
>
>
> Hadoop behavior observed:
> YARN created 4 containers on four nodes, including the EMR master, but one
> EMR slave was completely idle (memory consumption around 2g and 0% CPU
> occupation).
> Spark used one container for the driver on an EMR slave node (makes sense
> since I required that much memory).
> It used the other three nodes for computing the tasks.
>
>
> If yarn can't use all the nodes and I have to pay for the node, it's just a 
> big waste : p
>
>
> Any thoughts on this?
>
>
> Great thanks,
>
> Ed
>
>
>
> 2015-05-18 12:07 GMT-04:00 Sandy Ryza :
>
> *All
>>
>> On Mon, May 18, 2015 at 9:07 AM, Sandy Ryza 
>> wrote:
>>
>>> Hi Xiaohe,
>>>
>>> The all Spark options must go before the jar or they won't take effect.
>>>
>>> -Sandy
>>>
>>> On Sun, May 17, 2015 at 8:59 AM, xiaohe lan 
>>> wrote:
>>>
Sorry, they are both actually assigned tasks.

Aggregated Metrics by Executor
Executor ID | Address | Task Time | Total Tasks | Failed Tasks | Succeeded Tasks | Input Size / Records | Shuffle Write Size / Records | Shuffle Spill (Memory) | Shuffle Spill (Disk)
1 | host1:6184 | 1.7 min | 5 | 0 | 5 | 640.0 MB / 12318400 | 382.3 MB / 12100770 | 1630.4 MB | 295.4 MB
2 | host2:62072 | 1.7 min | 5 | 0 | 5 | 640.0 MB / 12014510 | 386.0 MB / 10926912 | 1646.6 MB | 304.8 MB

 On Sun, May 17, 2015 at 11:50 PM, xiaohe lan 
 wrote:

> bash-4.1$ ps aux | grep SparkSubmit
> xilan 1704 13.2  1.2 5275520 380244 pts/0  Sl+  08:39   0:13
> /scratch/xilan/jdk1.8.0_45/bin/java -cp
> /scratch/xilan/spark/conf:/scratch/xilan/spark/lib/spark-assembly-1.3.1-hadoop2.4.0.jar:/scratch/xilan/spark/lib/datanucleus-core-3.2.10.jar:/scratch/xilan/spark/lib/datanucleus-api-jdo-3.2.6.jar:/scratch/xilan/spark/lib/datanucleus-rdbms-3.2.9.jar:/scratch/xilan/hadoop/etc/hadoop
> -Xms512m -Xmx512m org.apache.spark.deploy.SparkSubmit --master yarn
> target/scala-2.10/simple-project_2.10-1.0.jar --class scala.SimpleApp
> --num-executors 5 --executor-cores 4
> xilan 1949  0.0  0.0 103292   800 pts/1S+   08:40   0:00 grep
> --color SparkSubmit
>
>
> When looking at the Spark UI, I see the following:
>
> Aggregated Metrics by Executor
> Executor ID | Address | Task Time | Total Tasks | Failed Tasks | Succeeded Tasks | Shuffle Read Size / Records
> 1 | host1:3048 | 36 s | 1 | 0 | 1 | 127.1 MB / 2808978
> 2 | host2:4997 | 0 ms | 0 | 0 | 0 | 63.4 MB / 1810945
>
> So executor 2 is not even assigned a task? Maybe I have some problems in my
> settings, but I don't know which settings I set wrong or have not set.
>
>
> Thanks,
> Xiaohe
>
> On Sun, May 17, 2015 at 11:16 PM, Akhil Das <
> ak...@sigmoidanalytics.com> wrote:
>
>> Did you try --executor-cores param? While you submit the job, do a ps
>> aux | grep spark-submit and see the exact command parameters.
>>
>> Thanks
>> Best Regards
>>
>> On Sat, May 16, 2015 at 12:31 PM, xiaohe lan 
>> wrote:
>>
>>> Hi,
>>>
>>> I have a 5-node YARN cluster, and I used spark-submit to submit a
>>> simple app.
>>>
>>>  spark-submit --master yarn
>>> target/scala-2.10/simple-project_2.10-1.0.jar --class scala.SimpleApp
>>> --num-executors 5
>>>
>>> I have set the number of executors to 5, but from the Spark UI I could see
>>> only two executors, and it ran very slowly. What did I miss?
>>>
>>> Thanks,
>>> Xiaohe
>>>
>>
>>
>

>>>
>>
>


Re: number of executors

2015-05-18 Thread edward cui
I actually have the same problem, but I am not sure whether it is a Spark
problem or a YARN problem.

I set up a five-node cluster on AWS EMR and started the YARN daemon on the
master (the node manager will not be started by default on the master; I
don't want to waste any resources since I have to pay). I then submit the
Spark task in yarn-cluster mode. The command is:
./spark/bin/spark-submit --master yarn-cluster --num-executors 5
--executor-cores 4 --properties-file spark-application.conf myapp.py

But the YARN resource manager only created 4 containers on 4 nodes, and one
node was completely idle.

More details about the setup:
EMR node:
m3.xlarge: 16g ram 4 cores 40g ssd (HDFS on EBS?)

Yarn-site.xml:
yarn.scheduler.maximum-allocation-mb=11520
yarn.nodemanager.resource.memory-mb=11520

Spark-conf:

spark.executor.memory         10g
spark.storage.memoryFraction  0.2
spark.python.worker.memory    1500m
spark.akka.frameSize          200
spark.shuffle.memoryFraction  0.1
spark.driver.memory           10g


Hadoop behavior observed:
YARN created 4 containers on four nodes, including the EMR master, but one
EMR slave was completely idle (memory consumption around 2g and 0% CPU
occupation).
Spark used one container for the driver on an EMR slave node (makes sense
since I required that much memory).
It used the other three nodes for computing the tasks.


If yarn can't use all the nodes and I have to pay for the node, it's
just a big waste : p


Any thoughts on this?


Great thanks,

Ed
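
A rough sanity check on the memory math, assuming Spark 1.3's default YARN
memory overhead of roughly max(384 MB, 7% of the container's heap): a 10g
request becomes about 10240 + 717 = 10957 MB, and with
yarn.nodemanager.resource.memory-mb = 11520 each node can therefore hold only
one such container. In yarn-cluster mode the driver occupies one of those
containers as well, so at most one executor or driver fits per node, which is
consistent with fewer executors than requested and a node sitting idle.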



2015-05-18 12:07 GMT-04:00 Sandy Ryza :

> *All
>
> On Mon, May 18, 2015 at 9:07 AM, Sandy Ryza 
> wrote:
>
>> Hi Xiaohe,
>>
>> The all Spark options must go before the jar or they won't take effect.
>>
>> -Sandy
>>
>> On Sun, May 17, 2015 at 8:59 AM, xiaohe lan 
>> wrote:
>>
>>> Sorry, they are both actually assigned tasks.
>>>
>>> Aggregated Metrics by Executor
>>> Executor ID | Address | Task Time | Total Tasks | Failed Tasks | Succeeded Tasks | Input Size / Records | Shuffle Write Size / Records | Shuffle Spill (Memory) | Shuffle Spill (Disk)
>>> 1 | host1:6184 | 1.7 min | 5 | 0 | 5 | 640.0 MB / 12318400 | 382.3 MB / 12100770 | 1630.4 MB | 295.4 MB
>>> 2 | host2:62072 | 1.7 min | 5 | 0 | 5 | 640.0 MB / 12014510 | 386.0 MB / 10926912 | 1646.6 MB | 304.8 MB
>>>
>>> On Sun, May 17, 2015 at 11:50 PM, xiaohe lan 
>>> wrote:
>>>
 bash-4.1$ ps aux | grep SparkSubmit
 xilan 1704 13.2  1.2 5275520 380244 pts/0  Sl+  08:39   0:13
 /scratch/xilan/jdk1.8.0_45/bin/java -cp
 /scratch/xilan/spark/conf:/scratch/xilan/spark/lib/spark-assembly-1.3.1-hadoop2.4.0.jar:/scratch/xilan/spark/lib/datanucleus-core-3.2.10.jar:/scratch/xilan/spark/lib/datanucleus-api-jdo-3.2.6.jar:/scratch/xilan/spark/lib/datanucleus-rdbms-3.2.9.jar:/scratch/xilan/hadoop/etc/hadoop
 -Xms512m -Xmx512m org.apache.spark.deploy.SparkSubmit --master yarn
 target/scala-2.10/simple-project_2.10-1.0.jar --class scala.SimpleApp
 --num-executors 5 --executor-cores 4
 xilan 1949  0.0  0.0 103292   800 pts/1S+   08:40   0:00 grep
 --color SparkSubmit


When looking at the Spark UI, I see the following:

Aggregated Metrics by Executor
Executor ID | Address | Task Time | Total Tasks | Failed Tasks | Succeeded Tasks | Shuffle Read Size / Records
1 | host1:3048 | 36 s | 1 | 0 | 1 | 127.1 MB / 2808978
2 | host2:4997 | 0 ms | 0 | 0 | 0 | 63.4 MB / 1810945

So executor 2 is not even assigned a task? Maybe I have some problems in my
settings, but I don't know which settings I set wrong or have not set.


 Thanks,
 Xiaohe

 On Sun, May 17, 2015 at 11:16 PM, Akhil Das >>> > wrote:

> Did you try --executor-cores param? While you submit the job, do a ps
> aux | grep spark-submit and see the exact command parameters.
>
> Thanks
> Best Regards
>
> On Sat, May 16, 2015 at 12:31 PM, xiaohe lan 
> wrote:
>
>> Hi,
>>
>> I have a 5-node YARN cluster, and I used spark-submit to submit a simple
>> app.
>>
>>  spark-submit --master yarn
>> target/scala-2.10/simple-project_2.10-1.0.jar --class scala.SimpleApp
>> --num-executors 5
>>
>> I have set the number of executors to 5, but from the Spark UI I could see
>> only two executors, and it ran very slowly. What did I miss?
>>
>> Thanks,
>> Xiaohe
>>
>
>

>>>
>>
>


Re: number of executors

2015-05-18 Thread Sandy Ryza
*All

On Mon, May 18, 2015 at 9:07 AM, Sandy Ryza  wrote:

> Hi Xiaohe,
>
> The all Spark options must go before the jar or they won't take effect.
>
> -Sandy
>
> On Sun, May 17, 2015 at 8:59 AM, xiaohe lan 
> wrote:
>
>> Sorry, they are both actually assigned tasks.
>>
>> Aggregated Metrics by Executor
>> Executor ID | Address | Task Time | Total Tasks | Failed Tasks | Succeeded Tasks | Input Size / Records | Shuffle Write Size / Records | Shuffle Spill (Memory) | Shuffle Spill (Disk)
>> 1 | host1:6184 | 1.7 min | 5 | 0 | 5 | 640.0 MB / 12318400 | 382.3 MB / 12100770 | 1630.4 MB | 295.4 MB
>> 2 | host2:62072 | 1.7 min | 5 | 0 | 5 | 640.0 MB / 12014510 | 386.0 MB / 10926912 | 1646.6 MB | 304.8 MB
>>
>> On Sun, May 17, 2015 at 11:50 PM, xiaohe lan 
>> wrote:
>>
>>> bash-4.1$ ps aux | grep SparkSubmit
>>> xilan 1704 13.2  1.2 5275520 380244 pts/0  Sl+  08:39   0:13
>>> /scratch/xilan/jdk1.8.0_45/bin/java -cp
>>> /scratch/xilan/spark/conf:/scratch/xilan/spark/lib/spark-assembly-1.3.1-hadoop2.4.0.jar:/scratch/xilan/spark/lib/datanucleus-core-3.2.10.jar:/scratch/xilan/spark/lib/datanucleus-api-jdo-3.2.6.jar:/scratch/xilan/spark/lib/datanucleus-rdbms-3.2.9.jar:/scratch/xilan/hadoop/etc/hadoop
>>> -Xms512m -Xmx512m org.apache.spark.deploy.SparkSubmit --master yarn
>>> target/scala-2.10/simple-project_2.10-1.0.jar --class scala.SimpleApp
>>> --num-executors 5 --executor-cores 4
>>> xilan 1949  0.0  0.0 103292   800 pts/1S+   08:40   0:00 grep
>>> --color SparkSubmit
>>>
>>>
>>> When looking at the Spark UI, I see the following:
>>>
>>> Aggregated Metrics by Executor
>>> Executor ID | Address | Task Time | Total Tasks | Failed Tasks | Succeeded Tasks | Shuffle Read Size / Records
>>> 1 | host1:3048 | 36 s | 1 | 0 | 1 | 127.1 MB / 2808978
>>> 2 | host2:4997 | 0 ms | 0 | 0 | 0 | 63.4 MB / 1810945
>>>
>>> So executor 2 is not even assigned a task? Maybe I have some problems in my
>>> settings, but I don't know which settings I set wrong or have not set.
>>>
>>>
>>> Thanks,
>>> Xiaohe
>>>
>>> On Sun, May 17, 2015 at 11:16 PM, Akhil Das 
>>> wrote:
>>>
 Did you try --executor-cores param? While you submit the job, do a ps
 aux | grep spark-submit and see the exact command parameters.

 Thanks
 Best Regards

 On Sat, May 16, 2015 at 12:31 PM, xiaohe lan 
 wrote:

> Hi,
>
> I have a 5-node YARN cluster, and I used spark-submit to submit a simple
> app.
>
>  spark-submit --master yarn
> target/scala-2.10/simple-project_2.10-1.0.jar --class scala.SimpleApp
> --num-executors 5
>
> I have set the number of executors to 5, but from the Spark UI I could see
> only two executors, and it ran very slowly. What did I miss?
>
> Thanks,
> Xiaohe
>


>>>
>>
>


Re: number of executors

2015-05-18 Thread Sandy Ryza
Hi Xiaohe,

The all Spark options must go before the jar or they won't take effect.

-Sandy

On Sun, May 17, 2015 at 8:59 AM, xiaohe lan  wrote:

> Sorry, they were both actually assigned tasks.
>
> Aggregated Metrics by Executor
>
> Executor ID | Address     | Task Time | Total Tasks | Failed Tasks | Succeeded Tasks | Input Size / Records | Shuffle Write Size / Records | Shuffle Spill (Memory) | Shuffle Spill (Disk)
> 1           | host1:6184  | 1.7 min   | 5 | 0 | 5 | 640.0 MB / 12318400 | 382.3 MB / 12100770 | 1630.4 MB | 295.4 MB
> 2           | host2:62072 | 1.7 min   | 5 | 0 | 5 | 640.0 MB / 12014510 | 386.0 MB / 10926912 | 1646.6 MB | 304.8 MB
>
> On Sun, May 17, 2015 at 11:50 PM, xiaohe lan 
> wrote:
>
>> bash-4.1$ ps aux | grep SparkSubmit
>> xilan 1704 13.2  1.2 5275520 380244 pts/0  Sl+  08:39   0:13
>> /scratch/xilan/jdk1.8.0_45/bin/java -cp
>> /scratch/xilan/spark/conf:/scratch/xilan/spark/lib/spark-assembly-1.3.1-hadoop2.4.0.jar:/scratch/xilan/spark/lib/datanucleus-core-3.2.10.jar:/scratch/xilan/spark/lib/datanucleus-api-jdo-3.2.6.jar:/scratch/xilan/spark/lib/datanucleus-rdbms-3.2.9.jar:/scratch/xilan/hadoop/etc/hadoop
>> -Xms512m -Xmx512m org.apache.spark.deploy.SparkSubmit --master yarn
>> target/scala-2.10/simple-project_2.10-1.0.jar --class scala.SimpleApp
>> --num-executors 5 --executor-cores 4
>> xilan 1949  0.0  0.0 103292   800 pts/1S+   08:40   0:00 grep
>> --color SparkSubmit
>>
>>
>> When look at the sparkui, I see the following:
>> Aggregated Metrics by ExecutorExecutor IDAddressTask TimeTotal TasksFailed
>> TasksSucceeded TasksShuffle Read Size / Records1host1:304836 s101127.1
>> MB / 28089782host2:49970 ms00063.4 MB / 1810945
>>
>> So executor 2 is not even assigned a task ? Maybe I have some problems in
>> my setting, but I don't know what could be the possible settings I set
>> wrong or have not set.
>>
>>
>> Thanks,
>> Xiaohe
>>
>> On Sun, May 17, 2015 at 11:16 PM, Akhil Das 
>> wrote:
>>
>>> Did you try --executor-cores param? While you submit the job, do a ps
>>> aux | grep spark-submit and see the exact command parameters.
>>>
>>> Thanks
>>> Best Regards
>>>
>>> On Sat, May 16, 2015 at 12:31 PM, xiaohe lan 
>>> wrote:
>>>
 Hi,

 I have a 5 nodes yarn cluster, I used spark-submit to submit a simple
 app.

  spark-submit --master yarn
 target/scala-2.10/simple-project_2.10-1.0.jar --class scala.SimpleApp
 --num-executors 5

 I have set the number of executor to 5, but from sparkui I could see
 only two executors and it ran very slow. What did I miss ?

 Thanks,
 Xiaohe

>>>
>>>
>>
>


Re: number of executors

2015-05-17 Thread xiaohe lan
Sorry, they were both actually assigned tasks.

Aggregated Metrics by Executor

Executor ID | Address     | Task Time | Total Tasks | Failed Tasks | Succeeded Tasks | Input Size / Records | Shuffle Write Size / Records | Shuffle Spill (Memory) | Shuffle Spill (Disk)
1           | host1:6184  | 1.7 min   | 5 | 0 | 5 | 640.0 MB / 12318400 | 382.3 MB / 12100770 | 1630.4 MB | 295.4 MB
2           | host2:62072 | 1.7 min   | 5 | 0 | 5 | 640.0 MB / 12014510 | 386.0 MB / 10926912 | 1646.6 MB | 304.8 MB

On Sun, May 17, 2015 at 11:50 PM, xiaohe lan  wrote:

> bash-4.1$ ps aux | grep SparkSubmit
> xilan 1704 13.2  1.2 5275520 380244 pts/0  Sl+  08:39   0:13
> /scratch/xilan/jdk1.8.0_45/bin/java -cp
> /scratch/xilan/spark/conf:/scratch/xilan/spark/lib/spark-assembly-1.3.1-hadoop2.4.0.jar:/scratch/xilan/spark/lib/datanucleus-core-3.2.10.jar:/scratch/xilan/spark/lib/datanucleus-api-jdo-3.2.6.jar:/scratch/xilan/spark/lib/datanucleus-rdbms-3.2.9.jar:/scratch/xilan/hadoop/etc/hadoop
> -Xms512m -Xmx512m org.apache.spark.deploy.SparkSubmit --master yarn
> target/scala-2.10/simple-project_2.10-1.0.jar --class scala.SimpleApp
> --num-executors 5 --executor-cores 4
> xilan 1949  0.0  0.0 103292   800 pts/1S+   08:40   0:00 grep
> --color SparkSubmit
>
>
> When looking at the Spark UI, I see the following:
>
> Aggregated Metrics by Executor
>
> Executor ID | Address     | Task Time | Total Tasks | Failed Tasks | Succeeded Tasks | Shuffle Read Size / Records
> 1           | host1:30483 | 6 s  | 1 | 0 | 1 | 127.1 MB / 2808978
> 2           | host2:4997  | 0 ms | 0 | 0 | 0 | 63.4 MB / 1810945
>
> So executor 2 is not even assigned a task? Maybe I have some problems in
> my settings, but I don't know which settings I set wrong or failed to set.
>
>
> Thanks,
> Xiaohe
>
> On Sun, May 17, 2015 at 11:16 PM, Akhil Das 
> wrote:
>
>> Did you try --executor-cores param? While you submit the job, do a ps aux
>> | grep spark-submit and see the exact command parameters.
>>
>> Thanks
>> Best Regards
>>
>> On Sat, May 16, 2015 at 12:31 PM, xiaohe lan 
>> wrote:
>>
>>> Hi,
>>>
>>> I have a 5 nodes yarn cluster, I used spark-submit to submit a simple
>>> app.
>>>
>>>  spark-submit --master yarn
>>> target/scala-2.10/simple-project_2.10-1.0.jar --class scala.SimpleApp
>>> --num-executors 5
>>>
>>> I have set the number of executor to 5, but from sparkui I could see
>>> only two executors and it ran very slow. What did I miss ?
>>>
>>> Thanks,
>>> Xiaohe
>>>
>>
>>
>


Re: number of executors

2015-05-17 Thread xiaohe lan
bash-4.1$ ps aux | grep SparkSubmit
xilan 1704 13.2  1.2 5275520 380244 pts/0  Sl+  08:39   0:13
/scratch/xilan/jdk1.8.0_45/bin/java -cp
/scratch/xilan/spark/conf:/scratch/xilan/spark/lib/spark-assembly-1.3.1-hadoop2.4.0.jar:/scratch/xilan/spark/lib/datanucleus-core-3.2.10.jar:/scratch/xilan/spark/lib/datanucleus-api-jdo-3.2.6.jar:/scratch/xilan/spark/lib/datanucleus-rdbms-3.2.9.jar:/scratch/xilan/hadoop/etc/hadoop
-Xms512m -Xmx512m org.apache.spark.deploy.SparkSubmit --master yarn
target/scala-2.10/simple-project_2.10-1.0.jar --class scala.SimpleApp
--num-executors 5 --executor-cores 4
xilan 1949  0.0  0.0 103292   800 pts/1S+   08:40   0:00 grep
--color SparkSubmit


When looking at the Spark UI, I see the following:

Aggregated Metrics by Executor

Executor ID | Address     | Task Time | Total Tasks | Failed Tasks | Succeeded Tasks | Shuffle Read Size / Records
1           | host1:30483 | 6 s  | 1 | 0 | 1 | 127.1 MB / 2808978
2           | host2:4997  | 0 ms | 0 | 0 | 0 | 63.4 MB / 1810945

So executor 2 is not even assigned a task? Maybe I have some problems in
my settings, but I don't know which settings I set wrong or failed to set.


Thanks,
Xiaohe

On Sun, May 17, 2015 at 11:16 PM, Akhil Das 
wrote:

> Did you try --executor-cores param? While you submit the job, do a ps aux
> | grep spark-submit and see the exact command parameters.
>
> Thanks
> Best Regards
>
> On Sat, May 16, 2015 at 12:31 PM, xiaohe lan 
> wrote:
>
>> Hi,
>>
>> I have a 5 nodes yarn cluster, I used spark-submit to submit a simple app.
>>
>>  spark-submit --master yarn target/scala-2.10/simple-project_2.10-1.0.jar
>> --class scala.SimpleApp --num-executors 5
>>
>> I have set the number of executor to 5, but from sparkui I could see only
>> two executors and it ran very slow. What did I miss ?
>>
>> Thanks,
>> Xiaohe
>>
>
>


Re: number of executors

2015-05-17 Thread Akhil Das
Did you try the --executor-cores param? After you submit the job, do a ps aux |
grep spark-submit and check the exact command parameters.

Thanks
Best Regards

On Sat, May 16, 2015 at 12:31 PM, xiaohe lan  wrote:

> Hi,
>
> I have a 5 nodes yarn cluster, I used spark-submit to submit a simple app.
>
>  spark-submit --master yarn target/scala-2.10/simple-project_2.10-1.0.jar
> --class scala.SimpleApp --num-executors 5
>
> I have set the number of executor to 5, but from sparkui I could see only
> two executors and it ran very slow. What did I miss ?
>
> Thanks,
> Xiaohe
>


Re: number of executors

2015-05-16 Thread Ted Yu
What Spark release are you using?

Can you check the driver log to see if there is some clue there?

Thanks
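If the app runs on YARN, one way to pull the aggregated container logs after it finishes is the yarn CLI; a minimal sketch (the application ID below is a placeholder; the real one is printed at submission time):

yarn logs -applicationId application_1431700000000_0001 | less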

On Sat, May 16, 2015 at 12:01 AM, xiaohe lan  wrote:

> Hi,
>
> I have a 5 nodes yarn cluster, I used spark-submit to submit a simple app.
>
>  spark-submit --master yarn target/scala-2.10/simple-project_2.10-1.0.jar
> --class scala.SimpleApp --num-executors 5
>
> I have set the number of executor to 5, but from sparkui I could see only
> two executors and it ran very slow. What did I miss ?
>
> Thanks,
> Xiaohe
>


number of executors

2015-05-16 Thread xiaohe lan
Hi,

I have a 5-node YARN cluster, and I used spark-submit to submit a simple app.

 spark-submit --master yarn target/scala-2.10/simple-project_2.10-1.0.jar
--class scala.SimpleApp --num-executors 5

I have set the number of executors to 5, but in the Spark UI I could see only
two executors, and the job ran very slowly. What did I miss?

Thanks,
Xiaohe


Re: Number of Executors per worker process

2015-03-02 Thread Spico Florin
Hello!
  Thank you very much for your response. In the book "Learning Spark" I
found the following sentence:

"Each application will have at most one executor on each worker"

So a worker can have one executor process per application, or none (perhaps
depending on the workload distribution).


Best regards,

 Florin

On Thu, Feb 26, 2015 at 1:04 PM, Jeffrey Jedele 
wrote:

> Hi Spico,
>
> Yes, I think an "executor core" in Spark is basically a thread in a worker
> pool. It's recommended to have one executor core per physical core on your
> machine for best performance, but I think in theory you can create as many
> threads as your OS allows.
>
> For deployment:
> There seems to be the actual worker JVM which coordinates the work on a
> worker node. I don't think the actual thread pool lives in there, but a
> separate JVM is created for each application that has cores allocated on
> the node. Otherwise it would be rather hard to impose memory limits on
> application level and it would have serious disadvantages regarding
> stability.
>
> You can check this behavior by looking at the processes on your machine:
> ps aux | grep spark.deploy => will show the master, worker (coordinator) and
> driver JVMs
> ps aux | grep spark.executor => will show the actual executor JVMs
>
> 2015-02-25 14:23 GMT+01:00 Spico Florin :
>
>> Hello!
>>  I've read the documentation about the spark architecture, I have the
>> following questions:
>> 1: how many executors can be on a single worker process (JMV)?
>> 2:Should I think executor like a Java Thread Executor where the pool size
>> is equal with the number of the given cores (set up by the
>> SPARK_WORKER_CORES)?
>> 3. If the worker can have many executors, how this is handled by the
>> Spark? Or can I handle by myself to set up the number of executors per JVM?
>>
>> I look forward for your answers.
>>   Regards,
>>   Florin
>>
>
>


Re: Number of Executors per worker process

2015-02-26 Thread Jeffrey Jedele
Hi Spico,

Yes, I think an "executor core" in Spark is basically a thread in a worker
pool. It's recommended to have one executor core per physical core on your
machine for best performance, but I think in theory you can create as many
threads as your OS allows.

For deployment:
There seems to be the actual worker JVM which coordinates the work on a
worker node. I don't think the actual thread pool lives in there, but a
separate JVM is created for each application that has cores allocated on
the node. Otherwise it would be rather hard to impose memory limits on
application level and it would have serious disadvantages regarding
stability.

You can check this behavior by looking at the processes on your machine:
ps aux | grep spark.deploy => will show the master, worker (coordinator) and
driver JVMs
ps aux | grep spark.executor => will show the actual executor JVMs
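A minimal cross-check with jps (which ships with the JDK), assuming a Spark 1.x standalone deployment; the class names below are the standard daemon and executor main classes:

jps -l | grep -i spark
# expect org.apache.spark.deploy.master.Master and org.apache.spark.deploy.worker.Worker
# for the daemons, plus one org.apache.spark.executor.CoarseGrainedExecutorBackend
# per executor JVM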

2015-02-25 14:23 GMT+01:00 Spico Florin :

> Hello!
>  I've read the documentation about the spark architecture, I have the
> following questions:
> 1: how many executors can be on a single worker process (JMV)?
> 2:Should I think executor like a Java Thread Executor where the pool size
> is equal with the number of the given cores (set up by the
> SPARK_WORKER_CORES)?
> 3. If the worker can have many executors, how this is handled by the
> Spark? Or can I handle by myself to set up the number of executors per JVM?
>
> I look forward for your answers.
>   Regards,
>   Florin
>


Number of Executors per worker process

2015-02-25 Thread Spico Florin
Hello!
 I've read the documentation about the Spark architecture, and I have the
following questions:
1. How many executors can there be in a single worker process (JVM)?
2. Should I think of an executor like a Java thread-pool executor whose pool
size equals the number of given cores (set up by SPARK_WORKER_CORES)?
3. If a worker can have many executors, how is this handled by Spark? Or can
I set up the number of executors per JVM myself?

I look forward to your answers.
  Regards,
  Florin


Re: Setting the number of executors in standalone mode

2015-02-20 Thread Kelvin Chu
Hi,

Currently, there is only one executor per application on each worker. There
is a JIRA ticket to relax this:

https://issues.apache.org/jira/browse/SPARK-1706

But if you want to use more cores, you can try increasing
SPARK_WORKER_INSTANCES, which increases the number of workers per machine.
Take a look here:
http://spark.apache.org/docs/1.2.0/spark-standalone.html

Hope this helps!
Kelvin
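A minimal spark-env.sh sketch of that workaround (set on each worker machine, then restart the workers; the values are illustrative):

SPARK_WORKER_INSTANCES=4   # run 4 worker daemons per machine
SPARK_WORKER_CORES=2       # cores offered by each worker
SPARK_WORKER_MEMORY=2g     # memory offered by each worker

With spark.executor.memory no larger than SPARK_WORKER_MEMORY, an application can then get up to 4 executors per machine, one per worker.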


On Fri, Feb 20, 2015 at 10:08 AM, Mohammed Guller 
wrote:

>  AFAIK, in standalone mode, each Spark application gets one executor on
> each worker. You could run multiple workers on a machine, though.
>
>
>
> Mohammed
>
>
>
> *From:* Yiannis Gkoufas [mailto:johngou...@gmail.com]
> *Sent:* Friday, February 20, 2015 9:48 AM
> *To:* Mohammed Guller
> *Cc:* user@spark.apache.org
> *Subject:* Re: Setting the number of executors in standalone mode
>
>
>
> Hi Mohammed,
>
>
>
> thanks a lot for the reply.
>
> Ok, so from what I understand I cannot control the number of executors per
> worker in standalone cluster mode.
>
> Is that correct?
>
>
>
> BR
>
>
>
> On 20 February 2015 at 17:46, Mohammed Guller 
> wrote:
>
> SPARK_WORKER_MEMORY=8g
>
> Will allocate 8GB memory to Spark on each worker node. Nothing to do with
> # of executors.
>
>
>
>
>
> Mohammed
>
>
>
> *From:* Yiannis Gkoufas [mailto:johngou...@gmail.com]
> *Sent:* Friday, February 20, 2015 4:55 AM
> *To:* user@spark.apache.org
> *Subject:* Setting the number of executors in standalone mode
>
>
>
> Hi there,
>
>
>
> I try to increase the number of executors per worker in the standalone
> mode and I have failed to achieve that.
>
> I followed a bit the instructions of this thread:
> http://stackoverflow.com/questions/26645293/spark-configuration-memory-instance-cores
>
>
>
> and did that:
>
> spark.executor.memory   1g
>
> SPARK_WORKER_MEMORY=8g
>
>
>
> hoping to get 8 executors per worker but its still 1.
>
> And the option num-executors is not available in the standalone mode.
>
>
>
> Thanks a lot!
>
>
>


RE: Setting the number of executors in standalone mode

2015-02-20 Thread Mohammed Guller
AFAIK, in standalone mode, each Spark application gets one executor on each
worker. You could run multiple workers on a machine, though.

Mohammed

From: Yiannis Gkoufas [mailto:johngou...@gmail.com]
Sent: Friday, February 20, 2015 9:48 AM
To: Mohammed Guller
Cc: user@spark.apache.org
Subject: Re: Setting the number of executors in standalone mode

Hi Mohammed,

thanks a lot for the reply.
Ok, so from what I understand I cannot control the number of executors per 
worker in standalone cluster mode.
Is that correct?

BR

On 20 February 2015 at 17:46, Mohammed Guller wrote:
SPARK_WORKER_MEMORY=8g
Will allocate 8GB memory to Spark on each worker node. Nothing to do with # of 
executors.


Mohammed

From: Yiannis Gkoufas [mailto:johngou...@gmail.com]
Sent: Friday, February 20, 2015 4:55 AM
To: user@spark.apache.org
Subject: Setting the number of executors in standalone mode

Hi there,

I try to increase the number of executors per worker in the standalone mode and 
I have failed to achieve that.
I followed a bit the instructions of this thread: 
http://stackoverflow.com/questions/26645293/spark-configuration-memory-instance-cores

and did that:
spark.executor.memory   1g
SPARK_WORKER_MEMORY=8g

hoping to get 8 executors per worker but its still 1.
And the option num-executors is not available in the standalone mode.

Thanks a lot!



RE: Setting the number of executors in standalone mode

2015-02-20 Thread Mohammed Guller
SPARK_WORKER_MEMORY=8g
will allocate 8GB of memory to Spark on each worker node. It has nothing to do
with the # of executors.


Mohammed

From: Yiannis Gkoufas [mailto:johngou...@gmail.com]
Sent: Friday, February 20, 2015 4:55 AM
To: user@spark.apache.org
Subject: Setting the number of executors in standalone mode

Hi there,

I try to increase the number of executors per worker in the standalone mode and 
I have failed to achieve that.
I followed a bit the instructions of this thread: 
http://stackoverflow.com/questions/26645293/spark-configuration-memory-instance-cores

and did that:
spark.executor.memory   1g
SPARK_WORKER_MEMORY=8g

hoping to get 8 executors per worker but its still 1.
And the option num-executors is not available in the standalone mode.

Thanks a lot!


Re: Setting the number of executors in standalone mode

2015-02-20 Thread Yiannis Gkoufas
Hi Mohammed,

thanks a lot for the reply.
Ok, so from what I understand I cannot control the number of executors per
worker in standalone cluster mode.
Is that correct?

BR

On 20 February 2015 at 17:46, Mohammed Guller 
wrote:

>  SPARK_WORKER_MEMORY=8g
>
> Will allocate 8GB memory to Spark on each worker node. Nothing to do with
> # of executors.
>
>
>
>
>
> Mohammed
>
>
>
> *From:* Yiannis Gkoufas [mailto:johngou...@gmail.com]
> *Sent:* Friday, February 20, 2015 4:55 AM
> *To:* user@spark.apache.org
> *Subject:* Setting the number of executors in standalone mode
>
>
>
> Hi there,
>
>
>
> I try to increase the number of executors per worker in the standalone
> mode and I have failed to achieve that.
>
> I followed a bit the instructions of this thread:
> http://stackoverflow.com/questions/26645293/spark-configuration-memory-instance-cores
>
>
>
> and did that:
>
> spark.executor.memory   1g
>
> SPARK_WORKER_MEMORY=8g
>
>
>
> hoping to get 8 executors per worker but its still 1.
>
> And the option num-executors is not available in the standalone mode.
>
>
>
> Thanks a lot!
>


Setting the number of executors in standalone mode

2015-02-20 Thread Yiannis Gkoufas
Hi there,

I am trying to increase the number of executors per worker in standalone mode,
and I have failed to achieve that.
I loosely followed the instructions in this thread:
http://stackoverflow.com/questions/26645293/spark-configuration-memory-instance-cores

and did that:
spark.executor.memory 1g
SPARK_WORKER_MEMORY=8g

hoping to get 8 executors per worker, but it's still 1.
And the num-executors option is not available in standalone mode.

Thanks a lot!
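For what it's worth, the application-level knobs that do exist in standalone mode are totals rather than per-worker counts; a minimal spark-defaults.conf sketch, assuming the goal is to cap the app at 8 cores with 1g executors:

spark.executor.memory  1g
spark.cores.max        8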


Re: Problems with GC and time to execute with different number of executors.

2015-02-06 Thread Guillermo Ortiz
If I launch more executors, GC gets worse.

2015-02-06 10:47 GMT+01:00 Guillermo Ortiz :
> This is an execution with 80 executors
>
> Metric   | Min  | 25th percentile | Median | 75th percentile | Max
> Duration | 31s  | 44s             | 50s    | 1.1 min         | 2.6 min
> GC Time  | 70ms | 0.1s            | 0.3s   | 4s              | 53 s
> Input    | 128.0 MB | 128.0 MB | 128.0 MB | 128.0 MB | 128.0 MB
>
> I executed as well with 40 executors
>
> Metric   | Min  | 25th percentile | Median | 75th percentile | Max
> Duration | 26s  | 28s             | 28s    | 30s             | 35s
> GC Time  | 54ms | 60ms            | 66ms   | 80ms            | 0.4 s
> Input    | 128.0 MB | 128.0 MB | 128.0 MB | 128.0 MB | 128.0 MB
>
> I checked the %iowait and %steal in a worker it's all right in both of them
> I understand the value of yarn.nodemanager.resource.memory-mb is for
> each worker in the cluster and not the total value for YARN. it's
> configured at 196GB right now. (I have 5 workers)
> 80executors x 4Gb = 320Gb, it shouldn't be a problem.
>
>
> 2015-02-06 10:03 GMT+01:00 Sandy Ryza :
>> Yes, having many more cores than disks and all writing at the same time can
>> definitely cause performance issues.  Though that wouldn't explain the high
>> GC.  What percent of task time does the web UI report that tasks are
>> spending in GC?
>>
>> On Fri, Feb 6, 2015 at 12:56 AM, Guillermo Ortiz 
>> wrote:
>>>
>>> Yes, It's surpressing to me as well
>>>
>>> I tried to execute it with different configurations,
>>>
>>> sudo -u hdfs spark-submit  --master yarn-client --class
>>> com.mycompany.app.App --num-executors 40 --executor-memory 4g
>>> Example-1.0-SNAPSHOT.jar hdfs://ip:8020/tmp/sparkTest/ file22.bin
>>> parameters
>>>
>>> This is what I executed with different values in num-executors and
>>> executor-memory.
>>> What do you think there are too many executors for those HDDs? Could
>>> it be the reason because of each executor takes more time?
>>>
>>> 2015-02-06 9:36 GMT+01:00 Sandy Ryza :
>>> > That's definitely surprising to me that you would be hitting a lot of GC
>>> > for
>>> > this scenario.  Are you setting --executor-cores and --executor-memory?
>>> > What are you setting them to?
>>> >
>>> > -Sandy
>>> >
>>> > On Thu, Feb 5, 2015 at 10:17 AM, Guillermo Ortiz 
>>> > wrote:
>>> >>
>>> >> Any idea why if I use more containers I get a lot of stopped because
>>> >> GC?
>>> >>
>>> >> 2015-02-05 8:59 GMT+01:00 Guillermo Ortiz :
>>> >> > I'm not caching the data. with "each iteration I mean,, each 128mb
>>> >> > that a executor has to process.
>>> >> >
>>> >> > The code is pretty simple.
>>> >> >
>>> >> > final Conversor c = new Conversor(null, null, null,
>>> >> > longFields,typeFields);
>>> >> > SparkConf conf = new SparkConf().setAppName("Simple Application");
>>> >> > JavaSparkContext sc = new JavaSparkContext(conf);
>>> >> > JavaRDD rdd = sc.binaryRecords(path, c.calculaLongBlock());
>>> >> >
>>> >> >  JavaRDD rddString = rdd.map(new Function() {
>>> >> >  @Override
>>> >> >   public String call(byte[] arg0) throws Exception {
>>> >> >  String result = c.parse(arg0).toString();
>>> >> >   return result;
>>> >> > }
>>> >> >  });
>>> >> > rddString.saveAsTextFile(url + "/output/" +
>>> >> > System.currentTimeMillis()+
>>> >> > "/");
>>> >> >
>>> >> > The parse function just takes an array of bytes and applies some
>>> >> > transformations like,,,
>>> >> > [0..3] an integer, [4...20] an String, [21..27] another String and so
>>> >> > on.
>>> >> >
>>> >> > It's just a test code, I'd like to understand what it's happeing.
>>> >> >
>>> >> > 2015-02-04 18:57 GMT+01:00 Sandy Ryza :
>>> >> >> Hi Guillermo,
>>> >> >>
>>> >> >> What exactly do you mean by "each iteration"?  Are you caching data
>>> >> >> in
>>> >> >> memory?
>>> >> >>
>>> >> >> -Sandy
>>> >> >>
>>> >> >> On Wed, Feb 4, 2015 at 5:02 AM, Guillermo Ortiz
>>> >> >> 

Re: Problems with GC and time to execute with different number of executors.

2015-02-06 Thread Guillermo Ortiz
This is an execution with 80 executors

Metric   | Min  | 25th percentile | Median | 75th percentile | Max
Duration | 31s  | 44s             | 50s    | 1.1 min         | 2.6 min
GC Time  | 70ms | 0.1s            | 0.3s   | 4s              | 53 s
Input    | 128.0 MB | 128.0 MB | 128.0 MB | 128.0 MB | 128.0 MB

I executed as well with 40 executors

Metric   | Min  | 25th percentile | Median | 75th percentile | Max
Duration | 26s  | 28s             | 28s    | 30s             | 35s
GC Time  | 54ms | 60ms            | 66ms   | 80ms            | 0.4 s
Input    | 128.0 MB | 128.0 MB | 128.0 MB | 128.0 MB | 128.0 MB

I checked %iowait and %steal on a worker; they look fine in both runs.
I understand that the value of yarn.nodemanager.resource.memory-mb is for
each worker in the cluster and not the total value for YARN; it's
configured at 196GB right now (I have 5 workers).
80 executors x 4GB = 320GB, so it shouldn't be a problem.


2015-02-06 10:03 GMT+01:00 Sandy Ryza :
> Yes, having many more cores than disks and all writing at the same time can
> definitely cause performance issues.  Though that wouldn't explain the high
> GC.  What percent of task time does the web UI report that tasks are
> spending in GC?
>
> On Fri, Feb 6, 2015 at 12:56 AM, Guillermo Ortiz 
> wrote:
>>
>> Yes, It's surpressing to me as well
>>
>> I tried to execute it with different configurations,
>>
>> sudo -u hdfs spark-submit  --master yarn-client --class
>> com.mycompany.app.App --num-executors 40 --executor-memory 4g
>> Example-1.0-SNAPSHOT.jar hdfs://ip:8020/tmp/sparkTest/ file22.bin
>> parameters
>>
>> This is what I executed with different values in num-executors and
>> executor-memory.
>> What do you think there are too many executors for those HDDs? Could
>> it be the reason because of each executor takes more time?
>>
>> 2015-02-06 9:36 GMT+01:00 Sandy Ryza :
>> > That's definitely surprising to me that you would be hitting a lot of GC
>> > for
>> > this scenario.  Are you setting --executor-cores and --executor-memory?
>> > What are you setting them to?
>> >
>> > -Sandy
>> >
>> > On Thu, Feb 5, 2015 at 10:17 AM, Guillermo Ortiz 
>> > wrote:
>> >>
>> >> Any idea why if I use more containers I get a lot of stopped because
>> >> GC?
>> >>
>> >> 2015-02-05 8:59 GMT+01:00 Guillermo Ortiz :
>> >> > I'm not caching the data. with "each iteration I mean,, each 128mb
>> >> > that a executor has to process.
>> >> >
>> >> > The code is pretty simple.
>> >> >
>> >> > final Conversor c = new Conversor(null, null, null,
>> >> > longFields,typeFields);
>> >> > SparkConf conf = new SparkConf().setAppName("Simple Application");
>> >> > JavaSparkContext sc = new JavaSparkContext(conf);
>> >> > JavaRDD rdd = sc.binaryRecords(path, c.calculaLongBlock());
>> >> >
>> >> >  JavaRDD rddString = rdd.map(new Function() {
>> >> >  @Override
>> >> >   public String call(byte[] arg0) throws Exception {
>> >> >  String result = c.parse(arg0).toString();
>> >> >   return result;
>> >> > }
>> >> >  });
>> >> > rddString.saveAsTextFile(url + "/output/" +
>> >> > System.currentTimeMillis()+
>> >> > "/");
>> >> >
>> >> > The parse function just takes an array of bytes and applies some
>> >> > transformations like,,,
>> >> > [0..3] an integer, [4...20] an String, [21..27] another String and so
>> >> > on.
>> >> >
>> >> > It's just a test code, I'd like to understand what it's happeing.
>> >> >
>> >> > 2015-02-04 18:57 GMT+01:00 Sandy Ryza :
>> >> >> Hi Guillermo,
>> >> >>
>> >> >> What exactly do you mean by "each iteration"?  Are you caching data
>> >> >> in
>> >> >> memory?
>> >> >>
>> >> >> -Sandy
>> >> >>
>> >> >> On Wed, Feb 4, 2015 at 5:02 AM, Guillermo Ortiz
>> >> >> 
>> >> >> wrote:
>> >> >>>
>> >> >>> I execute a job in Spark where I'm processing a file of 80Gb in
>> >> >>> HDFS.
>> >> >>> I have 5 slaves:
>> >> >>> (32cores /256Gb / 7physical disks) x 5
>> >> >>>
>> >> >>> I have been trying many different configu

Re: Problems with GC and time to execute with different number of executors.

2015-02-06 Thread Sandy Ryza
Yes, having many more cores than disks and all writing at the same time can
definitely cause performance issues.  Though that wouldn't explain the high
GC.  What percent of task time does the web UI report that tasks are
spending in GC?

On Fri, Feb 6, 2015 at 12:56 AM, Guillermo Ortiz 
wrote:

> Yes, It's surpressing to me as well
>
> I tried to execute it with different configurations,
>
> sudo -u hdfs spark-submit  --master yarn-client --class
> com.mycompany.app.App --num-executors 40 --executor-memory 4g
> Example-1.0-SNAPSHOT.jar hdfs://ip:8020/tmp/sparkTest/ file22.bin
> parameters
>
> This is what I executed with different values in num-executors and
> executor-memory.
> What do you think there are too many executors for those HDDs? Could
> it be the reason because of each executor takes more time?
>
> 2015-02-06 9:36 GMT+01:00 Sandy Ryza :
> > That's definitely surprising to me that you would be hitting a lot of GC
> for
> > this scenario.  Are you setting --executor-cores and --executor-memory?
> > What are you setting them to?
> >
> > -Sandy
> >
> > On Thu, Feb 5, 2015 at 10:17 AM, Guillermo Ortiz 
> > wrote:
> >>
> >> Any idea why if I use more containers I get a lot of stopped because GC?
> >>
> >> 2015-02-05 8:59 GMT+01:00 Guillermo Ortiz :
> >> > I'm not caching the data. with "each iteration I mean,, each 128mb
> >> > that a executor has to process.
> >> >
> >> > The code is pretty simple.
> >> >
> >> > final Conversor c = new Conversor(null, null, null,
> >> > longFields,typeFields);
> >> > SparkConf conf = new SparkConf().setAppName("Simple Application");
> >> > JavaSparkContext sc = new JavaSparkContext(conf);
> >> > JavaRDD rdd = sc.binaryRecords(path, c.calculaLongBlock());
> >> >
> >> >  JavaRDD rddString = rdd.map(new Function() {
> >> >  @Override
> >> >   public String call(byte[] arg0) throws Exception {
> >> >  String result = c.parse(arg0).toString();
> >> >   return result;
> >> > }
> >> >  });
> >> > rddString.saveAsTextFile(url + "/output/" +
> System.currentTimeMillis()+
> >> > "/");
> >> >
> >> > The parse function just takes an array of bytes and applies some
> >> > transformations like,,,
> >> > [0..3] an integer, [4...20] an String, [21..27] another String and so
> >> > on.
> >> >
> >> > It's just a test code, I'd like to understand what it's happeing.
> >> >
> >> > 2015-02-04 18:57 GMT+01:00 Sandy Ryza :
> >> >> Hi Guillermo,
> >> >>
> >> >> What exactly do you mean by "each iteration"?  Are you caching data
> in
> >> >> memory?
> >> >>
> >> >> -Sandy
> >> >>
> >> >> On Wed, Feb 4, 2015 at 5:02 AM, Guillermo Ortiz <
> konstt2...@gmail.com>
> >> >> wrote:
> >> >>>
> >> >>> I execute a job in Spark where I'm processing a file of 80Gb in
> HDFS.
> >> >>> I have 5 slaves:
> >> >>> (32cores /256Gb / 7physical disks) x 5
> >> >>>
> >> >>> I have been trying many different configurations with YARN.
> >> >>> yarn.nodemanager.resource.memory-mb 196Gb
> >> >>> yarn.nodemanager.resource.cpu-vcores 24
> >> >>>
> >> >>> I have tried to execute the job with different number of executors a
> >> >>> memory (1-4g)
> >> >>> With 20 executors takes 25s each iteration (128mb) and it never has
> a
> >> >>> really long time waiting because GC.
> >> >>>
> >> >>> When I execute around 60 executors the process time it's about 45s
> and
> >> >>> some tasks take until one minute because GC.
> >> >>>
> >> >>> I have no idea why it's calling GC when I execute more executors
> >> >>> simultaneously.
> >> >>> The another question it's why it takes more time to execute each
> >> >>> block. My theory about the this it's because there're only 7
> physical
> >> >>> disks and it's not the same 5 processes writing than 20.
> >> >>>
> >> >>> The code is pretty simple, it's just a map function which parse a
> line
> >> >>> and write the output in HDFS. There're a lot of substrings inside of
> >> >>> the function what it could cause GC.
> >> >>>
> >> >>> Any theory about?
> >> >>>
> >> >>>
> -
> >> >>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> >> >>> For additional commands, e-mail: user-h...@spark.apache.org
> >> >>>
> >> >>
> >
> >
>


Re: Problems with GC and time to execute with different number of executors.

2015-02-06 Thread Guillermo Ortiz
Yes, it's surprising to me as well.

I tried to execute it with different configurations,

sudo -u hdfs spark-submit  --master yarn-client --class
com.mycompany.app.App --num-executors 40 --executor-memory 4g
Example-1.0-SNAPSHOT.jar hdfs://ip:8020/tmp/sparkTest/ file22.bin
parameters

This is what I executed, with different values for num-executors and
executor-memory.
Do you think there are too many executors for those HDDs? Could that be
the reason each executor takes more time?

2015-02-06 9:36 GMT+01:00 Sandy Ryza :
> That's definitely surprising to me that you would be hitting a lot of GC for
> this scenario.  Are you setting --executor-cores and --executor-memory?
> What are you setting them to?
>
> -Sandy
>
> On Thu, Feb 5, 2015 at 10:17 AM, Guillermo Ortiz 
> wrote:
>>
>> Any idea why if I use more containers I get a lot of stopped because GC?
>>
>> 2015-02-05 8:59 GMT+01:00 Guillermo Ortiz :
>> > I'm not caching the data. with "each iteration I mean,, each 128mb
>> > that a executor has to process.
>> >
>> > The code is pretty simple.
>> >
>> > final Conversor c = new Conversor(null, null, null,
>> > longFields,typeFields);
>> > SparkConf conf = new SparkConf().setAppName("Simple Application");
>> > JavaSparkContext sc = new JavaSparkContext(conf);
>> > JavaRDD rdd = sc.binaryRecords(path, c.calculaLongBlock());
>> >
>> >  JavaRDD rddString = rdd.map(new Function() {
>> >  @Override
>> >   public String call(byte[] arg0) throws Exception {
>> >  String result = c.parse(arg0).toString();
>> >   return result;
>> > }
>> >  });
>> > rddString.saveAsTextFile(url + "/output/" + System.currentTimeMillis()+
>> > "/");
>> >
>> > The parse function just takes an array of bytes and applies some
>> > transformations like,,,
>> > [0..3] an integer, [4...20] an String, [21..27] another String and so
>> > on.
>> >
>> > It's just a test code, I'd like to understand what it's happeing.
>> >
>> > 2015-02-04 18:57 GMT+01:00 Sandy Ryza :
>> >> Hi Guillermo,
>> >>
>> >> What exactly do you mean by "each iteration"?  Are you caching data in
>> >> memory?
>> >>
>> >> -Sandy
>> >>
>> >> On Wed, Feb 4, 2015 at 5:02 AM, Guillermo Ortiz 
>> >> wrote:
>> >>>
>> >>> I execute a job in Spark where I'm processing a file of 80Gb in HDFS.
>> >>> I have 5 slaves:
>> >>> (32cores /256Gb / 7physical disks) x 5
>> >>>
>> >>> I have been trying many different configurations with YARN.
>> >>> yarn.nodemanager.resource.memory-mb 196Gb
>> >>> yarn.nodemanager.resource.cpu-vcores 24
>> >>>
>> >>> I have tried to execute the job with different number of executors a
>> >>> memory (1-4g)
>> >>> With 20 executors takes 25s each iteration (128mb) and it never has a
>> >>> really long time waiting because GC.
>> >>>
>> >>> When I execute around 60 executors the process time it's about 45s and
>> >>> some tasks take until one minute because GC.
>> >>>
>> >>> I have no idea why it's calling GC when I execute more executors
>> >>> simultaneously.
>> >>> The another question it's why it takes more time to execute each
>> >>> block. My theory about the this it's because there're only 7 physical
>> >>> disks and it's not the same 5 processes writing than 20.
>> >>>
>> >>> The code is pretty simple, it's just a map function which parse a line
>> >>> and write the output in HDFS. There're a lot of substrings inside of
>> >>> the function what it could cause GC.
>> >>>
>> >>> Any theory about?
>> >>>
>> >>> -
>> >>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> >>> For additional commands, e-mail: user-h...@spark.apache.org
>> >>>
>> >>
>
>




Re: Problems with GC and time to execute with different number of executors.

2015-02-06 Thread Sandy Ryza
That's definitely surprising to me that you would be hitting a lot of GC
for this scenario.  Are you setting --executor-cores and
--executor-memory?  What are you setting them to?

-Sandy

On Thu, Feb 5, 2015 at 10:17 AM, Guillermo Ortiz 
wrote:

> Any idea why if I use more containers I get a lot of stopped because GC?
>
> 2015-02-05 8:59 GMT+01:00 Guillermo Ortiz :
> > I'm not caching the data. with "each iteration I mean,, each 128mb
> > that a executor has to process.
> >
> > The code is pretty simple.
> >
> > final Conversor c = new Conversor(null, null, null,
> longFields,typeFields);
> > SparkConf conf = new SparkConf().setAppName("Simple Application");
> > JavaSparkContext sc = new JavaSparkContext(conf);
> > JavaRDD rdd = sc.binaryRecords(path, c.calculaLongBlock());
> >
> >  JavaRDD rddString = rdd.map(new Function() {
> >  @Override
> >   public String call(byte[] arg0) throws Exception {
> >  String result = c.parse(arg0).toString();
> >   return result;
> > }
> >  });
> > rddString.saveAsTextFile(url + "/output/" + System.currentTimeMillis()+
> "/");
> >
> > The parse function just takes an array of bytes and applies some
> > transformations like,,,
> > [0..3] an integer, [4...20] an String, [21..27] another String and so on.
> >
> > It's just a test code, I'd like to understand what it's happeing.
> >
> > 2015-02-04 18:57 GMT+01:00 Sandy Ryza :
> >> Hi Guillermo,
> >>
> >> What exactly do you mean by "each iteration"?  Are you caching data in
> >> memory?
> >>
> >> -Sandy
> >>
> >> On Wed, Feb 4, 2015 at 5:02 AM, Guillermo Ortiz 
> >> wrote:
> >>>
> >>> I execute a job in Spark where I'm processing a file of 80Gb in HDFS.
> >>> I have 5 slaves:
> >>> (32cores /256Gb / 7physical disks) x 5
> >>>
> >>> I have been trying many different configurations with YARN.
> >>> yarn.nodemanager.resource.memory-mb 196Gb
> >>> yarn.nodemanager.resource.cpu-vcores 24
> >>>
> >>> I have tried to execute the job with different number of executors a
> >>> memory (1-4g)
> >>> With 20 executors takes 25s each iteration (128mb) and it never has a
> >>> really long time waiting because GC.
> >>>
> >>> When I execute around 60 executors the process time it's about 45s and
> >>> some tasks take until one minute because GC.
> >>>
> >>> I have no idea why it's calling GC when I execute more executors
> >>> simultaneously.
> >>> The another question it's why it takes more time to execute each
> >>> block. My theory about the this it's because there're only 7 physical
> >>> disks and it's not the same 5 processes writing than 20.
> >>>
> >>> The code is pretty simple, it's just a map function which parse a line
> >>> and write the output in HDFS. There're a lot of substrings inside of
> >>> the function what it could cause GC.
> >>>
> >>> Any theory about?
> >>>
> >>> -
> >>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> >>> For additional commands, e-mail: user-h...@spark.apache.org
> >>>
> >>
>


Re: Problems with GC and time to execute with different number of executors.

2015-02-05 Thread Guillermo Ortiz
Any idea why, if I use more containers, I get a lot of stops because of GC?

2015-02-05 8:59 GMT+01:00 Guillermo Ortiz :
> I'm not caching the data. with "each iteration I mean,, each 128mb
> that a executor has to process.
>
> The code is pretty simple.
>
> final Conversor c = new Conversor(null, null, null, longFields,typeFields);
> SparkConf conf = new SparkConf().setAppName("Simple Application");
> JavaSparkContext sc = new JavaSparkContext(conf);
> JavaRDD rdd = sc.binaryRecords(path, c.calculaLongBlock());
>
>  JavaRDD rddString = rdd.map(new Function() {
>  @Override
>   public String call(byte[] arg0) throws Exception {
>  String result = c.parse(arg0).toString();
>   return result;
> }
>  });
> rddString.saveAsTextFile(url + "/output/" + System.currentTimeMillis()+ "/");
>
> The parse function just takes an array of bytes and applies some
> transformations like,,,
> [0..3] an integer, [4...20] an String, [21..27] another String and so on.
>
> It's just a test code, I'd like to understand what it's happeing.
>
> 2015-02-04 18:57 GMT+01:00 Sandy Ryza :
>> Hi Guillermo,
>>
>> What exactly do you mean by "each iteration"?  Are you caching data in
>> memory?
>>
>> -Sandy
>>
>> On Wed, Feb 4, 2015 at 5:02 AM, Guillermo Ortiz 
>> wrote:
>>>
>>> I execute a job in Spark where I'm processing a file of 80Gb in HDFS.
>>> I have 5 slaves:
>>> (32cores /256Gb / 7physical disks) x 5
>>>
>>> I have been trying many different configurations with YARN.
>>> yarn.nodemanager.resource.memory-mb 196Gb
>>> yarn.nodemanager.resource.cpu-vcores 24
>>>
>>> I have tried to execute the job with different number of executors a
>>> memory (1-4g)
>>> With 20 executors takes 25s each iteration (128mb) and it never has a
>>> really long time waiting because GC.
>>>
>>> When I execute around 60 executors the process time it's about 45s and
>>> some tasks take until one minute because GC.
>>>
>>> I have no idea why it's calling GC when I execute more executors
>>> simultaneously.
>>> The another question it's why it takes more time to execute each
>>> block. My theory about the this it's because there're only 7 physical
>>> disks and it's not the same 5 processes writing than 20.
>>>
>>> The code is pretty simple, it's just a map function which parse a line
>>> and write the output in HDFS. There're a lot of substrings inside of
>>> the function what it could cause GC.
>>>
>>> Any theory about?
>>>
>>> -
>>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>> For additional commands, e-mail: user-h...@spark.apache.org
>>>
>>




Re: Problems with GC and time to execute with different number of executors.

2015-02-05 Thread Guillermo Ortiz
I'm not caching the data. By "each iteration" I mean each 128MB block
that an executor has to process.

The code is pretty simple.

final Conversor c = new Conversor(null, null, null, longFields, typeFields);
SparkConf conf = new SparkConf().setAppName("Simple Application");
JavaSparkContext sc = new JavaSparkContext(conf);
// one byte[] per fixed-length record
JavaRDD<byte[]> rdd = sc.binaryRecords(path, c.calculaLongBlock());

JavaRDD<String> rddString = rdd.map(new Function<byte[], String>() {
    @Override
    public String call(byte[] arg0) throws Exception {
        String result = c.parse(arg0).toString();
        return result;
    }
});
rddString.saveAsTextFile(url + "/output/" + System.currentTimeMillis() + "/");

The parse function just takes an array of bytes and applies some
transformations: bytes [0..3] become an integer, [4..20] a String, [21..27]
another String, and so on.

It's just test code; I'd like to understand what is happening.

2015-02-04 18:57 GMT+01:00 Sandy Ryza :
> Hi Guillermo,
>
> What exactly do you mean by "each iteration"?  Are you caching data in
> memory?
>
> -Sandy
>
> On Wed, Feb 4, 2015 at 5:02 AM, Guillermo Ortiz 
> wrote:
>>
>> I execute a job in Spark where I'm processing a file of 80Gb in HDFS.
>> I have 5 slaves:
>> (32cores /256Gb / 7physical disks) x 5
>>
>> I have been trying many different configurations with YARN.
>> yarn.nodemanager.resource.memory-mb 196Gb
>> yarn.nodemanager.resource.cpu-vcores 24
>>
>> I have tried to execute the job with different number of executors a
>> memory (1-4g)
>> With 20 executors takes 25s each iteration (128mb) and it never has a
>> really long time waiting because GC.
>>
>> When I execute around 60 executors the process time it's about 45s and
>> some tasks take until one minute because GC.
>>
>> I have no idea why it's calling GC when I execute more executors
>> simultaneously.
>> The another question it's why it takes more time to execute each
>> block. My theory about the this it's because there're only 7 physical
>> disks and it's not the same 5 processes writing than 20.
>>
>> The code is pretty simple, it's just a map function which parse a line
>> and write the output in HDFS. There're a lot of substrings inside of
>> the function what it could cause GC.
>>
>> Any theory about?
>>
>> -
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>>
>




Re: Problems with GC and time to execute with different number of executors.

2015-02-04 Thread Sandy Ryza
Hi Guillermo,

What exactly do you mean by "each iteration"?  Are you caching data in
memory?

-Sandy

On Wed, Feb 4, 2015 at 5:02 AM, Guillermo Ortiz 
wrote:

> I execute a job in Spark where I'm processing a file of 80Gb in HDFS.
> I have 5 slaves:
> (32cores /256Gb / 7physical disks) x 5
>
> I have been trying many different configurations with YARN.
> yarn.nodemanager.resource.memory-mb 196Gb
> yarn.nodemanager.resource.cpu-vcores 24
>
> I have tried to execute the job with different number of executors a
> memory (1-4g)
> With 20 executors takes 25s each iteration (128mb) and it never has a
> really long time waiting because GC.
>
> When I execute around 60 executors the process time it's about 45s and
> some tasks take until one minute because GC.
>
> I have no idea why it's calling GC when I execute more executors
> simultaneously.
> The another question it's why it takes more time to execute each
> block. My theory about the this it's because there're only 7 physical
> disks and it's not the same 5 processes writing than 20.
>
> The code is pretty simple, it's just a map function which parse a line
> and write the output in HDFS. There're a lot of substrings inside of
> the function what it could cause GC.
>
> Any theory about?
>
> -
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>


Problems with GC and time to execute with different number of executors.

2015-02-04 Thread Guillermo Ortiz
I am running a Spark job that processes an 80GB file in HDFS.
I have 5 slaves:
(32 cores / 256GB / 7 physical disks) x 5

I have been trying many different configurations with YARN.
yarn.nodemanager.resource.memory-mb 196Gb
yarn.nodemanager.resource.cpu-vcores 24

I have tried executing the job with different numbers of executors and
memory (1-4g).
With 20 executors, each iteration (128MB) takes 25s, and there is never a
really long wait because of GC.

With around 60 executors, the processing time is about 45s, and some
tasks take up to one minute because of GC.

I have no idea why GC kicks in when I run more executors simultaneously.
The other question is why each block takes more time to execute. My theory
is that there are only 7 physical disks, and 5 processes writing is not the
same as 20.

The code is pretty simple: it's just a map function which parses a line
and writes the output to HDFS. There are a lot of substring operations inside
the function, which could cause GC.

Any theories?
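One way to confirm where the time goes is to enable GC logging in the executors; a sketch reusing the command from this thread (spark.executor.extraJavaOptions is the standard property for passing JVM flags to executors, and the flags shown are the usual HotSpot GC-logging ones):

spark-submit --master yarn-client --class com.mycompany.app.App \
  --num-executors 40 --executor-memory 4g \
  --conf "spark.executor.extraJavaOptions=-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps" \
  Example-1.0-SNAPSHOT.jar hdfs://ip:8020/tmp/sparkTest/ file22.bin parameters

The per-task GC time also shows up in the web UI's task table, which makes it easy to compare the 20-executor and 60-executor runs.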




Fwd: Controlling number of executors on Mesos vs YARN

2015-01-05 Thread Tim Chen
Forgot to hit reply-all.

-- Forwarded message --
From: Tim Chen 
Date: Sun, Jan 4, 2015 at 10:46 PM
Subject: Re: Controlling number of executors on Mesos vs YARN
To: mvle 


Hi Mike,

You're correct that there is no such setting for Mesos coarse-grained mode,
since the assumption is that each node is launched with one container and
Spark launches multiple tasks in that container.

In fine-grained mode there isn't a setting like that either, as it will
currently launch an executor as long as the offer satisfies the minimum
container resource requirement.

I created a JIRA earlier about capping the number of executors, or better
distributing the # of executors launched on each node. Since the decision
about which node to launch containers on is entirely on the Spark scheduler
side, it's easy to modify.

Btw, what's the configuration to set the # of executors on YARN side?

Thanks,

Tim
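On the YARN side, the per-application executor count is requested with --num-executors on spark-submit, or equivalently the spark.executor.instances property; a minimal sketch (the class and jar names are placeholders):

spark-submit --master yarn \
  --num-executors 6 --executor-cores 1 --executor-memory 2g \
  --class com.example.App app.jar

Note that this is a total per application, not a per-node count; YARN still decides container placement.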



On Sun, Jan 4, 2015 at 9:37 PM, mvle  wrote:

> I'm trying to compare the performance of Spark running on Mesos vs YARN.
> However, I am having problems being able to configure the Spark workload to
> run in a similar way on Mesos and YARN.
>
> When running Spark on YARN, you can specify the number of executors per
> node. So if I have a node with 4 CPUs, I can specify 6 executors on that
> node. When running Spark on Mesos, there doesn't seem to be an equivalent
> way to specify this. In Mesos, you can somewhat force this by specifying
> the
> number of CPU resources to be 6 when running the slave daemon. However,
> this
> seems to be a static configuration of the Mesos cluster rather than something
> that can be configured in the Spark framework.
>
> So here is my question:
>
> For Spark on Mesos, am I correct that there is no way to control the number
> of executors per node (assuming an idle cluster)? For Spark on Mesos
> coarse-grained mode, there is a way to specify max_cores but that is still
> not equivalent to specifying the number of executors per node as when Spark
> is run on YARN.
>
> If I am correct, then it seems Spark might be at a disadvantage running on
> Mesos compared to YARN (since it lacks the fine tuning ability provided by
> YARN).
>
> Thanks,
> Mike
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Controlling-number-of-executors-on-Mesos-vs-YARN-tp20966.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> -
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>


Controlling number of executors on Mesos vs YARN

2015-01-04 Thread mvle
I'm trying to compare the performance of Spark running on Mesos vs YARN.
However, I am having problems configuring the Spark workload to
run in a similar way on Mesos and on YARN.

When running Spark on YARN, you can specify the number of executors per
node. So if I have a node with 4 CPUs, I can specify 6 executors on that
node. When running Spark on Mesos, there doesn't seem to be an equivalent
way to specify this. In Mesos, you can somewhat force this by specifying the
number of CPU resources to be 6 when running the slave daemon. However, this
seems to be a static configuration of the Mesos cluster rather than something
that can be configured in the Spark framework.

So here is my question:

For Spark on Mesos, am I correct that there is no way to control the number
of executors per node (assuming an idle cluster)? For Spark on Mesos
coarse-grained mode, there is a way to specify max_cores but that is still
not equivalent to specifying the number of executors per node as when Spark
is run on YARN.

If I am correct, then it seems Spark might be at a disadvantage running on
Mesos compared to YARN (since it lacks the fine tuning ability provided by
YARN).

Thanks,
Mike



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Controlling-number-of-executors-on-Mesos-vs-YARN-tp20966.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.




Re: Specifying number of executors in Mesos

2014-12-11 Thread Andrew Ash
Gerard,

Are you familiar with spark.deploy.spreadOut in standalone mode? It sounds
like you want the same thing in Mesos mode.
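For reference, a sketch of that standalone setting (spark.deploy.spreadOut is documented in the standalone guide; true, the default, spreads each application's executors across as many nodes as possible, while false consolidates them onto few nodes):

# spark-env.sh on the standalone master
SPARK_MASTER_OPTS="-Dspark.deploy.spreadOut=true"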

On Thu, Dec 11, 2014 at 6:48 AM, Tathagata Das 
wrote:

> Not that I am aware of. Spark will try to spread the tasks evenly
> across executors; it's not aware of the workers at all. So if the
> executor-to-worker allocation is uneven, I am not sure what can be
> done. Maybe others can get some ideas.
>
> On Tue, Dec 9, 2014 at 6:20 AM, Gerard Maas  wrote:
> > Hi,
> >
> > We've a number of Spark Streaming /Kafka jobs that would benefit of an
> even
> > spread of consumers over physical hosts in order to maximize network
> usage.
> > As far as I can see, the Spark Mesos scheduler accepts resource offers
> until
> > all required Mem + CPU allocation has been satisfied.
> >
> > This basic resource allocation policy results in large executors spread
> over
> > few nodes, resulting in many Kafka consumers in a single node (e.g. from
> 12
> > consumers, I've seen allocations of 7/3/2)
> >
> > Is there a way to tune this behavior to achieve executor allocation on a
> > given number of hosts?
> >
> > -kr, Gerard.
>
> -
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>

