Re: Creating RDD with key and Subkey

Ranjana Rajendran Wed, 19 Aug 2015 13:49:59 -0700

Hi Ratika,

I tried the following:


val l = List("apple", "orange", "banana")

var inner = new scala.collection.mutable.HashMap[String, List[String]]

inner.put("fruits",l)

var list = new scala.collection.mutable.HashMap[String,
scala.collection.mutable.HashMap[String, List[String]]]

list.put("food", inner)

import scala.collection.JavaConverters._

val rdd = sc.parallelize(list.toSeq)

Now, the O(1) look up for value for a key is lost here. See the discussion
below:
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-create-RDD-over-hashmap-td893.html

On Wed, Aug 19, 2015 at 10:28 AM, Ratika Prasad <rpra...@couponsinc.com>
wrote:

>
> We need to create RDDas below
>
> JavaPairRDD<String,List<HashMap<String,List<String>>>>
>
> The idea is we need to do lookup() on Key which will return a list of hash
> maps kind of structure and then do lookup on subkey which is the key in the
> HashMap returned
>
>
>
> _____________________________
> From: Silas Davis <si...@silasdavis.net>
> Sent: Wednesday, August 19, 2015 10:34 pm
> Subject: Re: Creating RDD with key and Subkey
> To: Ratika Prasad <rpra...@couponsinc.com>, <dev@spark.apache.org>
>
>
>
> This should be sent to the user mailing list, I think.
>
> It depends what you want to do with the RDD, so yes you could throw around
> (String, HashMap<String,List<String>>) tuples or perhaps you'd like to be
> able to groupByKey, reduceByKey on the key and sub-key as a composite in
> which case JavaPairRDD<Tuple2<String,String>, List<String>> might be more
> appropriate. Not really clear what you are asking.
>
>
> On Wed, 19 Aug 2015 at 17:15 Ratika Prasad < rpra...@couponsinc.com>
> wrote:
>
>> Hi,
>>
>>
>>
>> We have a need where we need the RDD with the following format
>> JavaPairRDD<String,HashMap<String,List<String>>>, mostly RDD with a Key and
>> Subkey kind of a structure, how is that doable in Spark ?
>>
>>
>>
>> Thanks
>>
>> R
>>
>
>
>

Re: Creating RDD with key and Subkey

Reply via email to