Hi there,

I am new to Spark and new to Scala, although I have lots of experience on the
Java side.  I am experimenting with Spark for a new project where it seems
like it could be a good fit.  As I go through the examples, there is one
scenario that I am trying to figure out: comparing the contents of an
RDD to itself to produce a new RDD.

In an overly simple example, I have:

JavaSparkContext sc = new JavaSparkContext ...
JavaRDD<String> data = sc.parallelize(buildData());

I then want to compare each entry in data to other entries and end up with:

JavaPairRDD<String, List<String>> mapped = data.???
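From reading the docs, my best guess so far (and it is only a guess) is that cartesian plus groupByKey might get close to this, although it seems to yield an Iterable rather than a List for the values. Here is the rough sketch I had in mind, where buildData() is just my own helper that returns the input strings:

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

// Pair every entry with every other entry, drop the self-pairs,
// then collect all comparison partners for each entry under one key.
JavaSparkContext sc = new JavaSparkContext(
        new SparkConf().setMaster("local[*]").setAppName("pairwise"));
JavaRDD<String> data = sc.parallelize(buildData());

// cartesian(data) gives every (a, b) combination as a JavaPairRDD
JavaPairRDD<String, String> pairs = data.cartesian(data)
        .filter(t -> !t._1().equals(t._2())); // skip comparing an entry to itself

// Note: groupByKey gives Iterable<String>, not List<String>
JavaPairRDD<String, Iterable<String>> grouped = pairs.groupByKey();
```

I realize cartesian is O(n^2), so maybe there is a smarter way to do this for larger data sets.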

Is this something easily handled by Spark?  My apologies if this is a
stupid question; I have spent less than 10 hours tinkering with Spark and
am trying to come up to speed.


-- 
Jared Rodriguez
