Suppose I also want to run n concurrent jobs of the following type: each RDD has the same form (JavaPairRDD), and I would like to apply the same transformation to all of the RDDs. The brute-force way would be to start n threads and submit a job from each thread.
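To make the brute-force variant concrete, here is a rough sketch of the threading pattern I have in mind. It uses only the JDK so that it runs on its own: plain lists stand in for the n JavaPairRDDs, and the hypothetical `job` function stands in for "apply the shared transformation and trigger an action". The class and method names (`ConcurrentJobsSketch`, `runAll`) are made up for illustration and are not part of Spark.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

public class ConcurrentJobsSketch {

    // Runs `job` once per input, each on its own pool thread, and returns the
    // results in input order. Assumption: in Spark, each input would be one
    // JavaPairRDD, and `job` would apply the shared transformation and end
    // with an action, so that one Spark job is submitted per thread.
    public static <I, O> List<O> runAll(List<I> inputs, Function<I, O> job) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, inputs.size()));
        try {
            // Submit all jobs first so they can run concurrently.
            List<CompletableFuture<O>> futures = new ArrayList<>();
            for (I input : inputs) {
                futures.add(CompletableFuture.supplyAsync(() -> job.apply(input), pool));
            }
            // Then wait for each result, preserving the input order.
            List<O> results = new ArrayList<>();
            for (CompletableFuture<O> future : futures) {
                results.add(future.join());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Two small lists stand in for two RDDs; the "job" just sums each list.
        List<List<Integer>> inputs = Arrays.asList(Arrays.asList(1, 2, 3), Arrays.asList(10, 20));
        List<Integer> sums = runAll(inputs, xs -> xs.stream().mapToInt(Integer::intValue).sum());
        System.out.println(sums);
    }
}
```

In the real case the pool would probably be capped at some fixed size instead of one thread per RDD, since the cluster has to share its resources among all of the submitted jobs anyway.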
Would the following approach be valid as well? Create a new RDD that is a combination of the n RDDs (something like a group-by across multiple RDDs), and run the transformation once on the combined RDD.
Is there a way to implement this using the existing Java API?

Yadid
