It's not difficult to achieve by setting the application's parameters properly. Some background you should know: an application has only one executor on each machine (or per container on YARN). So if you set executor-cores to 1, each executor will run only one task at a time.
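For example, a minimal sketch of that configuration, assuming a YARN deployment where spark.executor.cores / spark.executor.instances control the task slots and executor count (the app name, paths, and executor count below are just placeholders):

    import org.apache.spark.{SparkConf, SparkContext}

    object OneTaskPerExecutor {
      def main(args: Array[String]): Unit = {
        // Assumption: running on YARN, where spark.executor.cores
        // controls the number of concurrent task slots per executor.
        val conf = new SparkConf()
          .setAppName("one-task-per-executor")   // placeholder name
          .set("spark.executor.cores", "1")      // one task at a time per executor
          .set("spark.executor.instances", "2")  // assumption: one executor per node

        val sc = new SparkContext(conf)
        // ... your job here: each map task now runs alone on its executor,
        // so a multi-threaded map function can use the node's remaining cores.
        sc.stop()
      }
    }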
2014-10-28 19:00 GMT+08:00 <jan.zi...@centrum.cz>:

> But I guess that this makes only one task over all the cluster's nodes. I
> would like to run several tasks, but I would like Spark not to run more
> than one map on each of my nodes at a time. That means I would like to,
> let's say, have 4 different tasks and 2 nodes where each node has 2 cores.
> Currently Hadoop runs 2 maps in parallel on each node (all 4 tasks in
> parallel), but I would like to somehow force it to run only 1 task on each
> node and to give it another task after the first task finishes.
>
> ______________________________________________________________
>
> The number of tasks is decided by the number of input partitions.
> If you want only one map or flatMap at once, just call coalesce() or
> repartition() to gather the data into one partition.
> However, this is not recommended because it cannot be executed in
> parallel efficiently.
>
> 2014-10-28 17:27 GMT+08:00 <jan.zi...@centrum.cz>:
>
>> Hi,
>>
>> I am currently struggling with how to properly set up Spark to perform
>> only one map, flatMap, etc. at once. In other words, my map uses a
>> multi-core algorithm, so I would like to have only one map running so it
>> can use all of the machine's cores.
>>
>> Thank you in advance for advice and replies.
>>
>> Jan
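For completeness, a minimal sketch of the coalesce()/repartition() approach quoted above; the input/output paths and the map function are placeholders, not from the original thread:

    import org.apache.spark.{SparkConf, SparkContext}

    object OnePartitionDemo {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("one-partition-demo"))

        // Placeholder input path.
        val data = sc.textFile("hdfs:///path/to/input")

        // coalesce(1) gathers all data into a single partition, so the map()
        // below runs as exactly one task. As the quoted reply notes, this
        // forfeits parallelism across the cluster.
        val result = data.coalesce(1).map(line => line.toUpperCase)

        result.saveAsTextFile("hdfs:///path/to/output")  // placeholder output path
        sc.stop()
      }
    }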