It's not difficult to achieve by setting the application's parameters properly. Some background you should know: an application has only one executor on each machine (or per container on YARN). So if you set executor-cores to 1, each executor will run only one task at a time.
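For example, a minimal sketch of that configuration, assuming a YARN deployment where spark.executor.cores / spark.executor.instances control the task slots and executor count (the app name, paths, and executor count below are just placeholders):

    import org.apache.spark.{SparkConf, SparkContext}

    object OneTaskPerExecutor {
      def main(args: Array[String]): Unit = {
        // Assumption: running on YARN, where spark.executor.cores
        // controls the number of concurrent task slots per executor.
        val conf = new SparkConf()
          .setAppName("one-task-per-executor")   // placeholder name
          .set("spark.executor.cores", "1")      // one task at a time per executor
          .set("spark.executor.instances", "2")  // assumption: one executor per node

        val sc = new SparkContext(conf)
        // ... your job here: each map task now runs alone on its executor,
        // so a multi-threaded map function can use the node's remaining cores.
        sc.stop()
      }
    }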
2014-10-28 19:00 GMT+08:00 <jan.zi...@centrum.cz>:

> But I guess that this makes only one task over all the cluster's nodes. I
> would like to run several tasks, but I would like Spark not to run more
> than one map on each of my nodes at a time. That means I would like to,
> let's say, have 4 different tasks and 2 nodes where each node has 2 cores.
> Currently Hadoop runs 2 maps in parallel on each node (all 4 tasks in
> parallel), but I would like to somehow force it to run only 1 task on each
> node and to give it another task after the first task finishes.
>
> ______________________________________________________________
>
> The number of tasks is decided by the number of input partitions.
> If you want only one map or flatMap at once, just call coalesce() or
> repartition() to gather the data into one partition.
> However, this is not recommended because it cannot be executed in
> parallel efficiently.
>
> 2014-10-28 17:27 GMT+08:00 <jan.zi...@centrum.cz>:
>
>> Hi,
>>
>> I am currently struggling with how to properly set up Spark to perform
>> only one map, flatMap, etc. at once. In other words, my map uses a
>> multi-core algorithm, so I would like to have only one map running so it
>> can use all of the machine's cores.
>>
>> Thank you in advance for advice and replies.
>>
>> Jan
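For completeness, a minimal sketch of the coalesce()/repartition() approach quoted above; the input/output paths and the map function are placeholders, not from the original thread:

    import org.apache.spark.{SparkConf, SparkContext}

    object OnePartitionDemo {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("one-partition-demo"))

        // Placeholder input path.
        val data = sc.textFile("hdfs:///path/to/input")

        // coalesce(1) gathers all data into a single partition, so the map()
        // below runs as exactly one task. As the quoted reply notes, this
        // forfeits parallelism across the cluster.
        val result = data.coalesce(1).map(line => line.toUpperCase)

        result.saveAsTextFile("hdfs:///path/to/output")  // placeholder output path
        sc.stop()
      }
    }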