Hi Ratika, I tried the following:
val l = List("apple", "orange", "banana") var inner = new scala.collection.mutable.HashMap[String, List[String]] inner.put("fruits",l) var list = new scala.collection.mutable.HashMap[String, scala.collection.mutable.HashMap[String, List[String]]] list.put("food", inner) import scala.collection.JavaConverters._ val rdd = sc.parallelize(list.toSeq) Now, the O(1) look up for value for a key is lost here. See the discussion below: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-create-RDD-over-hashmap-td893.html On Wed, Aug 19, 2015 at 10:28 AM, Ratika Prasad <rpra...@couponsinc.com> wrote: > > We need to create RDDas below > > JavaPairRDD<String,List<HashMap<String,List<String>>>> > > The idea is we need to do lookup() on Key which will return a list of hash > maps kind of structure and then do lookup on subkey which is the key in the > HashMap returned > > > > _____________________________ > From: Silas Davis <si...@silasdavis.net> > Sent: Wednesday, August 19, 2015 10:34 pm > Subject: Re: Creating RDD with key and Subkey > To: Ratika Prasad <rpra...@couponsinc.com>, <dev@spark.apache.org> > > > > This should be sent to the user mailing list, I think. > > It depends what you want to do with the RDD, so yes you could throw around > (String, HashMap<String,List<String>>) tuples or perhaps you'd like to be > able to groupByKey, reduceByKey on the key and sub-key as a composite in > which case JavaPairRDD<Tuple2<String,String>, List<String>> might be more > appropriate. Not really clear what you are asking. > > > On Wed, 19 Aug 2015 at 17:15 Ratika Prasad < rpra...@couponsinc.com> > wrote: > >> Hi, >> >> >> >> We have a need where we need the RDD with the following format >> JavaPairRDD<String,HashMap<String,List<String>>>, mostly RDD with a Key and >> Subkey kind of a structure, how is that doable in Spark ? >> >> >> >> Thanks >> >> R >> > > >