Wow, it really was that easy! The implicit joining works a treat. Many thanks, Jon
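For anyone landing on this thread from the archive, here is a minimal self-contained sketch of the pattern that worked. The maps `ages` and `cities` are made-up stand-ins for the real source data, and the SparkContext setup is only needed in a standalone app (in the spark-shell, `sc` already exists):

    import org.apache.spark.{SparkConf, SparkContext}
    // In Spark 1.x this import supplies the implicit conversion to
    // PairRDDFunctions, which is where join (and friends) live
    import org.apache.spark.SparkContext._

    object MapToRdd {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("map-to-rdd").setMaster("local[*]"))

        // Hypothetical local maps standing in for the real data
        val ages   = Map("alice" -> 30, "bob" -> 25)
        val cities = Map("alice" -> "Leeds", "bob" -> "York")

        // parallelize wants a Seq, so convert each Map first
        val agesRdd   = sc.parallelize(ages.toSeq)   // RDD[(String, Int)]
        val citiesRdd = sc.parallelize(cities.toSeq) // RDD[(String, String)]

        // Because the elements are key-value pairs, join is available
        // through the implicit conversion imported above
        val joined = agesRdd.join(citiesRdd)         // RDD[(String, (Int, String))]
        joined.collect().foreach(println)            // e.g. (alice,(30,Leeds))

        sc.stop()
      }
    }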
On 13 October 2014 22:58, Stephen Boesch <java...@gmail.com> wrote:

> Is the following what you are looking for?
>
> scala> sc.parallelize(myMap.map{ case (k,v) => (k,v) }.toSeq)
> res2: org.apache.spark.rdd.RDD[(String, Int)] = ParallelCollectionRDD[0] at parallelize at <console>:21
>
> 2014-10-13 14:02 GMT-07:00 jon.g.massey <jon.g.mas...@gmail.com>:
>
>> Hi guys,
>> Just starting out with Spark and following a few tutorials, it seems the
>> easiest way to get one's source data into an RDD is the sc.parallelize
>> function. Unfortunately, my local data is in multiple instances of
>> Map<K,V>, and parallelize only works on objects with the Seq trait,
>> producing an RDD which seemingly doesn't have the notion of keys and
>> values that I need for joins (among other operations).
>>
>> Is there a way of using a SparkContext to create a distributed RDD from
>> a local Map, rather than from a Hadoop or text file source?
>>
>> Thanks,
>> Jon