Re: Apache Spark Executor - number of threads

2015-03-17 Thread Sameer Farooqui
Hi Igor & Nirandap,

There is a setting in Spark called "cores" or "num_cores" that you should
look into. Depending on your cluster manager, it is exposed as the
spark.executor.cores property or the --executor-cores flag to spark-submit.
This number sets the number of task threads running in each Executor JVM.
The name of the setting is a bit misleading: you don't have to match the
num_cores of the Executor to the actual number of CPU cores on the machine.
You can, but you don't have to. In general, it is best to start by
oversubscribing by a factor of 2x or 3x. So if you have 16 cores on the
machine, set this to between 32 and 48 to start with.
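
A rough sketch of that arithmetic, in case it helps. The helper names are
made up for illustration, and it assumes spark.executor.cores is the
property behind the "cores" setting on your cluster manager. It only builds
the arguments you would pass to spark-submit; it does not launch anything.

```python
def oversubscribed_cores(physical_cores, factors=(2, 3)):
    """Candidate executor core counts, one per oversubscription factor."""
    if physical_cores < 1:
        raise ValueError("physical_cores must be at least 1")
    return [physical_cores * factor for factor in factors]


def submit_args(executor_cores):
    """spark-submit arguments for one candidate value.

    Assumption: spark.executor.cores is the property that applies to
    your deployment; check the docs for your cluster manager.
    """
    return ["--conf", "spark.executor.cores={}".format(executor_cores)]


if __name__ == "__main__":
    # 16 physical cores, as in the example above.
    for cores in oversubscribed_cores(16):
        print(" ".join(submit_args(cores)))
```

Running the same job once per candidate value and comparing wall-clock
times is the simplest way to see which factor suits your workload.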

Note that there are other internal threads in the Executor JVM used for
things like shuffle. There are about 15-20 of them, I think. These internal
threads usually sit idle and are only used when needed. They are not the
threads you're setting with num_cores for the Executor. The threads
allocated by num_cores are for the user tasks (like map or reduce) that you
run for your transformations.

Think of num_cores as the number of task slots per Executor (like slots in
the old MapReduce world).
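
To make the slot analogy concrete, here is a minimal sketch of how many
tasks can run at once. The function name is hypothetical. It also assumes
the spark.task.cpus property (not mentioned above, default 1), which says
how many of those cores each task claims.

```python
def task_slots(num_executors, executor_cores, cpus_per_task=1):
    """Concurrent task slots across all executors (illustrative only).

    executor_cores: the per-Executor "cores" setting discussed above.
    cpus_per_task:  assumed to correspond to spark.task.cpus.
    """
    if num_executors < 0 or executor_cores < 0:
        raise ValueError("counts must be non-negative")
    if cpus_per_task < 1:
        raise ValueError("cpus_per_task must be at least 1")
    # Each Executor can run as many tasks as fit into its cores.
    return num_executors * (executor_cores // cpus_per_task)


if __name__ == "__main__":
    # One Executor per 16-core machine on 4 machines, at 1x, 2x and 3x.
    for cores in (16, 32, 48):
        print(cores, task_slots(num_executors=4, executor_cores=cores))
```

If a stage has fewer partitions than slots, the extra slots stay empty, so
raising the setting only helps when there are enough tasks to fill them.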



On Wed, Mar 18, 2015 at 12:32 AM, nirandap wrote:

> Hi devs,
>
> I would like to know this as well. It would be great if someone could
> provide this information.
>
> cheers
>
>
> On Tue, Mar 17, 2015 at 3:06 PM, Igor Petrov [via Apache Spark User List]
> <[hidden email]> wrote:
>
>> Hello,
>>
>> is it possible to set the number of threads in the Executor's pool?
>> I see no such setting in the docs. The reason we want to try it: we want
>> to see the performance impact of different levels of parallelism (having
>> one thread per CPU, two threads per CPU, N threads per CPU).
>>
>> Thank You
>>
>
>
>
> --
> Niranda
>


Re: Apache Spark Executor - number of threads

2015-03-17 Thread nirandap
Hi devs,

I would like to know this as well. It would be great if someone could
provide this information.

cheers


On Tue, Mar 17, 2015 at 3:06 PM, Igor Petrov [via Apache Spark User List] <
ml-node+s1001560n22095...@n3.nabble.com> wrote:

> Hello,
>
> is it possible to set the number of threads in the Executor's pool?
> I see no such setting in the docs. The reason we want to try it: we want
> to see the performance impact of different levels of parallelism (having
> one thread per CPU, two threads per CPU, N threads per CPU).
>
> Thank You
>



-- 
Niranda




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Apache-Spark-Executor-number-of-threads-tp22095p22110.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

Apache Spark Executor - number of threads

2015-03-17 Thread Igor Petrov
Hello,

is it possible to set the number of threads in the Executor's pool?
I see no such setting in the docs. The reason we want to try it: we want to
see the performance impact of different levels of parallelism (having one
thread per CPU, two threads per CPU, N threads per CPU).

Thank You



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Apache-Spark-Executor-number-of-threads-tp22095.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
