Hi there, I am new to Spark and new to Scala, although I have lots of experience on the Java side. I am experimenting with Spark for a new project where it seems like it could be a good fit. As I go through the examples, there is one scenario that I am trying to figure out: comparing the contents of an RDD to itself to produce a new RDD.
In an overly simple example, I have:

JavaSparkContext sc = new JavaSparkContext ...
JavaRDD<String> data = sc.parallelize(buildData());

I then want to compare each entry in data to the other entries and end up with:

JavaPairRDD<String, List<String>> mapped = data.???

Is this something easily handled by Spark? My apologies if this is a stupid question; I have spent less than 10 hours tinkering with Spark and am trying to come up to speed.

-- Jared Rodriguez
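To make the comparison concrete, here is the plain-Java shape of what I am after. The isMatch predicate is just a placeholder for whatever real comparison I need, and my untested guess is that the Spark equivalents are cartesian, filter, and groupByKey (noted in the comments) -- though I believe groupByKey would give me an Iterable<String> rather than a List<String>:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SelfCompareSketch {

    // Placeholder comparison: here, "matches" means a different entry
    // that starts with the same letter. The real logic would differ.
    static boolean isMatch(String a, String b) {
        return !a.equals(b) && a.charAt(0) == b.charAt(0);
    }

    // Compare every entry against every other entry and collect matches per key.
    static Map<String, List<String>> compareAll(List<String> data) {
        Map<String, List<String>> result = new HashMap<>();
        for (String a : data) {                  // guessed Spark analog: data.cartesian(data)
            for (String b : data) {
                if (isMatch(a, b)) {             // guessed Spark analog: .filter(pair -> ...)
                    result.computeIfAbsent(a, k -> new ArrayList<>())
                          .add(b);               // guessed Spark analog: .groupByKey()
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("apple", "avocado", "banana");
        // "apple" matches "avocado" and vice versa; "banana" matches nothing.
        System.out.println(compareAll(data));
    }
}
```

Note that this brute-force version is O(n^2) comparisons, which is exactly why I am hoping Spark can distribute the work.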