I did try this earlier, but I got an error that I couldn’t comprehend:

scala> var hobbies = results.flatMap(row => row(1))
<console>:16: error: type mismatch;
 found   : Any
 required: TraversableOnce[?]
       var hobbies = results.flatMap(row => row(1))

I must be missing something, perhaps a cast.

On 6 Jan, 2015, at 12:17 am, Pankaj Narang [via Apache Spark User List] <ml-node+s1001560n2097...@n3.nabble.com> wrote:

> try as below
>
> results.map(row => row(1)).collect
>
> try
>
> var hobbies = results.flatMap(row => row(1))
>
> It will create all the hobbies in a simple array now
>
> hbmap = hobbies.map(hobby => (hobby, 1)).reduceByKey((hobcnt1, hobcnt2) => hobcnt1 + hobcnt2)
>
> It will aggregate hobbies as below
>
> {swimming,2}, {hiking,1}
>
> Now
>
> hbmap.map { case (hobby, count) => (count, hobby) }.sortByKey(ascending = false).collect
>
> will give you hobbies sorted in descending order by their count
>
> This is pseudo code and should help you
>
> Regards
> Pankaj
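
The type mismatch above arises because row(1) has static type Any, while flatMap needs a function that returns a collection (a TraversableOnce). Below is a minimal sketch of the cast in plain Scala collections, not the Spark API itself: the object name HobbyCount, the sample rows, and the assumption that the hobbies field holds a Seq[String] are all hypothetical, and groupBy stands in for reduceByKey. On a real Spark result the same kind of cast would presumably be needed, with the element type depending on the inferred schema.

```scala
object HobbyCount {
  // Hypothetical stand-in for the query result: each row is a Seq[Any],
  // with a name at index 0 and the nested hobbies array at index 1.
  val results: Seq[Seq[Any]] = Seq(
    Seq("alice", Seq("swimming", "hiking")),
    Seq("bob", Seq("swimming"))
  )

  // row(1) is typed Any, so it is cast to a collection before flattening.
  // Assumption: every hobbies field really is a Seq[String].
  val hobbies: Seq[String] =
    results.flatMap(row => row(1).asInstanceOf[Seq[String]])

  // Count occurrences per hobby (plain-collections analogue of reduceByKey).
  val counts: Map[String, Int] =
    hobbies.groupBy(identity).map { case (hobby, hs) => (hobby, hs.size) }

  // Sort by count, highest first.
  val sorted: Seq[(String, Int)] =
    counts.toSeq.sortBy { case (_, count) => -count }
}
```

With the sample rows above, HobbyCount.sorted lists swimming (2) ahead of hiking (1). The cast is unchecked, so a row whose second field is not a sequence of strings would fail at runtime rather than at compile time.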