Hey Spark users,

I'm trying to group a DataFrame by a column, collecting the occurrences into a
list instead of counting them.

Let's say we have a dataframe as shown below:

| category | id |
| -------- |:--:|
| A        | 1  |
| A        | 2  |
| B        | 3  |
| B        | 4  |
| C        | 5  |

Ideally, after some magic group-by (a reverse explode?), I'd get:

| category | id_list  |
| -------- | -------- |
| A        | 1,2      |
| B        | 3,4      |
| C        | 5        |

Any tricks to achieve that? The Scala Spark API is preferred. =D
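From what I can tell, `collect_list` from `org.apache.spark.sql.functions` might do this, something along the lines of the sketch below (the SparkSession setup and sample data are just my guess at a minimal reproduction of the table above), but I'm not sure it's the idiomatic way:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.collect_list

// Hypothetical session and sample data mirroring the table above
val spark = SparkSession.builder().appName("GroupToList").getOrCreate()
import spark.implicits._

val df = Seq(
  ("A", 1), ("A", 2), ("B", 3), ("B", 4), ("C", 5)
).toDF("category", "id")

// Group by category and collect each group's ids into an array column
val grouped = df.groupBy("category")
  .agg(collect_list("id").as("id_list"))

grouped.show()
// +--------+-------+
// |category|id_list|
// +--------+-------+
// |       A| [1, 2]|
// |       B| [3, 4]|
// |       C|    [5]|
// +--------+-------+
// (row order of the groups may differ)
```

If duplicates should be dropped rather than kept, I suppose `collect_set` would be the alternative, but please correct me if there's a better approach.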

BR,
Todd Leo
