Hi Paolo,

You might want to check the Ignite Spark module code:
https://github.com/apache/ignite/blob/1.6.0/modules/spark/src/main/scala/org/apache/ignite/spark/IgniteContext.scala

Basically, they use sc.parallelize to execute the function that starts the Ignite node.

Regards,
Luis

On 6 July 2016 at 14:20, Paolo Di Tommaso <[email protected]> wrote:

> Hi,
>
> I'm using the Ignite embedded deployment to run an Ignite workload in a
> Spark cluster.
>
> My use case requires deploying exactly one Ignite worker on each node in
> the Spark cluster. However, I haven't found a way to do that.
>
> Consider this scenario: I'm running a 3-node Spark cluster on AWS
> (1 driver, 2 workers, each node with 3 cores). I would like to run 2 Ignite
> workers, one for each Spark worker.
>
> I'm using the following script:
>
> https://gist.github.com/pditommaso/660cbee09755b2b880099ab3bf2c609a
>
> I've set `spark.executor.instances = 2` in order to deploy two Ignite
> workers; indeed, in the main log I can read the following:
>
> 16/07/06 18:58:02 INFO spark.IgniteContext: Will start Ignite nodes on 2 workers
>
> However, what happens is that Ignite is launched on only one Spark node.
>
> Looking at the same log, the following lines seem to suggest the reason:
>
> 16/07/06 18:58:05 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, *ip-10-37-175-68*.eu-west-1.compute.internal, partition 0,PROCESS_LOCAL, 2137 bytes)
>
> 16/07/06 18:58:05 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, *ip-10-37-175-68*.eu-west-1.compute.internal, partition 1,PROCESS_LOCAL, 2194 bytes)
>
> Spark is running two tasks to deploy the Ignite workers, but both of them
> on the same node (*ip-10-37-175-68*).
>
> Is there any workaround to avoid this? Or, more generally, is it possible
> to deploy exactly one Ignite worker on each node in the Spark cluster?
>
> Thanks a lot.
>
> Cheers,
> Paolo
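
For reference, here is a minimal sketch of the sc.parallelize pattern mentioned above. It is not the IgniteContext code itself: `startNode` is a hypothetical placeholder that only returns the host name, the object name is made up, and the default of 2 for `spark.executor.instances` is an assumption taken from the scenario in the quoted message. It creates one partition per requested worker and reports which hosts the tasks actually ran on.

```scala
import java.net.InetAddress

import org.apache.spark.{SparkConf, SparkContext}

object StartOnWorkersSketch {
  // Hypothetical placeholder for the function that would start an Ignite node.
  // Here it only reports the host the task is running on.
  def startNode(): String = InetAddress.getLocalHost.getHostName

  def main(args: Array[String]): Unit = {
    // The master URL is expected to come from spark-submit.
    val sc = new SparkContext(new SparkConf().setAppName("start-on-workers-sketch"))

    // Same setting as in the quoted message; the fallback of 2 is an assumption.
    val workers = sc.getConf.getInt("spark.executor.instances", 2)

    // One partition per requested worker, so Spark schedules `workers` tasks.
    // Spark chooses where each task runs; nothing here forces distinct hosts.
    val hosts = sc.parallelize(1 to workers, workers)
      .mapPartitions(_ => Iterator(startNode()))
      .collect()

    println(s"Tasks ran on: ${hosts.mkString(", ")} (${hosts.distinct.length} distinct hosts)")

    sc.stop()
  }
}
```

If the number of distinct hosts printed is lower than `workers`, the tasks were co-located, which would be consistent with the TaskSetManager lines in the quoted log where both tasks were assigned to the same node.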
