I tried this earlier, but I got an error that I couldn’t understand:
scala> var hobbies = results.flatMap(row => row(1))
<console>:16: error: type mismatch;
found : Any
required: TraversableOnce[?]
var hobbies = results.flatMap(row => row(1))
I must be missing something, perhaps a cast.
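
For reference, here is a minimal sketch of the kind of cast the error seems to call for. It uses plain Scala collections in place of the Spark results so it runs without a cluster. It assumes column 1 of each row holds the hobbies as a Seq[String]; the sample names and hobbies are made up for illustration.

```scala
// Stand-in for the query results: each row is a Seq[Any], so row(1) is typed Any.
// Assumption: column 1 holds the hobbies as a Seq[String]. Sample data is invented.
val rows: Seq[Seq[Any]] = Seq(
  Seq("alice", Seq("swimming", "hiking")),
  Seq("bob", Seq("swimming"))
)

// flatMap needs a collection, but row(1) is only known to be Any.
// The cast tells the compiler that the value is a sequence of strings.
val hobbies: Seq[String] =
  rows.flatMap(row => row(1).asInstanceOf[Seq[String]])

// Count each hobby, then sort by descending count.
val counts: Seq[(String, Int)] = hobbies
  .groupBy(identity)
  .map { case (hobby, occurrences) => (hobby, occurrences.size) }
  .toSeq
  .sortBy { case (_, count) => -count }
```

On an RDD the counting step would be the map/reduceByKey pipeline from the quoted reply rather than groupBy. The cast is unchecked, so it fails at runtime if the column holds something other than a sequence.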
On 6 Jan, 2015, at 12:17 am, Pankaj Narang [via Apache Spark User List]
<[email protected]> wrote:
> try as below
>
> results.map(row => row(1)).collect
>
> try
>
> var hobbies = results.flatMap(row => row(1))
>
> It will create all the hobbies in a simple array. Now
>
> hbmap = hobbies.map(hobby => (hobby, 1)).reduceByKey((hobcnt1, hobcnt2)
> => hobcnt1 + hobcnt2)
>
> It will aggregate hobbies as below
>
> {swimming,2}, {hiking,1}
>
>
> Now
>
> hbmap.map { case (hobby, count) => (count, hobby) }.sortByKey(ascending
> = false).collect
>
> will give you the hobbies sorted in descending order by their count
>
> This is pseudocode, but it should help you
>
> Regards
> Pankaj
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Finding-most-occurrences-in-a-JSON-Nested-Array-tp20971p20977.html