I did try this earlier, but I got an error that I couldn’t make sense of:
scala> var hobbies = results.flatMap(row => row(1)) 
<console>:16: error: type mismatch;
 found   : Any
 required: TraversableOnce[?]
       var hobbies = results.flatMap(row => row(1)) 

I must be missing something, perhaps a cast.
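
To make the suspicion about a cast concrete, here is a minimal sketch in plain Scala, with no Spark dependency. It models each row as a Seq[Any], so that row(1) has static type Any, which is what the compiler reports above. The sample data and the names HobbyCount and counts are made up for illustration, and the cast assumes the second column really holds a sequence of strings.

```scala
// Plain-Scala stand-in for the rows: each row is a Seq[Any],
// so row(1) has static type Any, as in the error message.
object HobbyCount {
  // Hypothetical sample data; the real rows would come from the JSON input.
  val results: Seq[Seq[Any]] = Seq(
    Seq("alice", Seq("swimming", "hiking")),
    Seq("bob", Seq("swimming"))
  )

  // flatMap needs a collection, so cast the Any down to Seq[String].
  // Assumption: column 1 always holds a sequence of strings.
  val hobbies: Seq[String] =
    results.flatMap(row => row(1).asInstanceOf[Seq[String]])

  // Count each hobby, then sort by count, highest first.
  val counts: Seq[(String, Int)] =
    hobbies
      .groupBy(identity)
      .map { case (hobby, occurrences) => (hobby, occurrences.size) }
      .toSeq
      .sortBy { case (_, count) => -count }

  // Prints (swimming,2) and then (hiking,1).
  def main(args: Array[String]): Unit = counts.foreach(println)
}
```

If the same assumption holds for the rows in the Spark shell, the equivalent change would presumably be results.flatMap(row => row(1).asInstanceOf[Seq[String]]), but I have not verified that against the actual schema.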

On 6 Jan, 2015, at 12:17 am, Pankaj Narang [via Apache Spark User List] 
<ml-node+s1001560n2097...@n3.nabble.com> wrote:

> try as below 
> 
> results.map(row => row(1)).collect 
> 
> try 
> 
> var hobbies = results.flatMap(row => row(1)) 
> 
> It will now put all the hobbies in a simple array.
> 
> val hbmap = hobbies.map(hobby => (hobby, 1))
>   .reduceByKey((hobcnt1, hobcnt2) => hobcnt1 + hobcnt2)
> 
> It will aggregate the hobbies as below:
> 
> {swimming,2}, {hiking,1} 
> 
> 
> Now
> 
> hbmap.map { case (hobby, count) => (count, hobby) }
>   .sortByKey(ascending = false).collect
> 
> will give you the hobbies sorted in descending order by their count.
>   
> This is pseudocode, but it should help you.
> 
> Regards 
> Pankaj 
> 
> 
> 
> 





--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Finding-most-occurrences-in-a-JSON-Nested-Array-tp20971p20977.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
