I remember asking this question on the Spark user list, and parallelize() was
the suggested option for running a closure on all Spark workers. Paolo, I like
the idea with foreachPartition() - maybe we can create a fake RDD with the
number of partitions equal to the number of Spark workers and then map each
partition to the corresponding worker.
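For reference, here is a minimal sketch of that idea in Scala. The value numWorkers is a hypothetical placeholder for the number of workers/executors in the cluster, and note that Spark does not strictly guarantee a one-to-one mapping of partitions to workers - with one partition per worker and enough executor slots, each worker should typically pick up one partition:

    import org.apache.spark.{SparkConf, SparkContext}

    object RunOnAllWorkers {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("run-on-all-workers"))

        // Assumption: numWorkers is set to the number of Spark workers in the cluster.
        val numWorkers = 4

        // Create a "fake" RDD with one partition per worker to force the desired parallelism.
        val dummy = sc.parallelize(1 to numWorkers, numWorkers)

        // Run the closure once per partition; ideally each worker executes it once.
        dummy.foreachPartition { _ =>
          // Put the per-worker side effect / initialization here.
          println(s"Running on host: ${java.net.InetAddress.getLocalHost.getHostName}")
        }

        sc.stop()
      }
    }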
