You probably want to do .select(explode($"to"), ...) first, which
produces a new row for each value in the input array. You can then
group by the exploded column instead of the whole array.
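
To make the semantics concrete, here is a minimal plain-Python sketch of what explode followed by a group-by count computes. The sample rows, the explode_column helper, and the "recipient" alias are made up for illustration and are not taken from the original data. The commented lines show roughly what the analogous PySpark calls might look like, assuming a DataFrame df with an array column "To".

```python
from collections import Counter

# Hypothetical sample data: each row carries a list of recipients in "To".
rows = [
    {"To": ["alice", "bob"]},
    {"To": ["alice"]},
    {"To": ["carol", "bob", "alice"]},
]


def explode_column(rows, column):
    """Yield one value per array element, i.e. one output 'row' per recipient."""
    for row in rows:
        for value in row[column]:
            yield value


# Group by the individual recipient (not the whole array) and count.
counts = Counter(explode_column(rows, "To"))

for recipient, n in sorted(counts.items()):
    print(recipient, n)

# In PySpark (1.4+), the analogous calls would be roughly:
#   from pyspark.sql.functions import explode
#   df.select(explode(df["To"]).alias("recipient")).groupBy("recipient").count()
```

The point of exploding first is that groupBy then sees one recipient per row, so an address that appears in many different "To" arrays ends up in a single group.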

On Fri, Jun 19, 2015 at 12:02 AM, Suraj Shetiya <[email protected]>
wrote:

>
> Hi,
>
> I want to obtain a grouped DataFrame from a DataFrame.
>
> A snippet of the column I need to group by is below.
>
> > df.select("To").show()
>
> To
> ArrayBuffer(vance...
> ArrayBuffer(vance...
> ArrayBuffer(rober...
> ArrayBuffer(richa...
> ArrayBuffer(guill...
> ArrayBuffer(m..pr...
> ArrayBuffer(rich....
> ArrayBuffer(issue...
> ArrayBuffer(jim.f...
> ArrayBuffer(richa...
>
>
> A sample field is shown below:
>
> > df.select("To").collect()[0]
>
> Row(To=[u'[email protected]', u'[email protected]',
> u'[email protected]', u'[email protected]',
> u'[email protected]', u'[email protected]',
> u'[email protected]', u'[email protected]',
> u'[email protected]', u'[email protected]',
> u'[email protected]', u'[email protected]',
> u'[email protected]'])
>
> I want to group by the "To" column, but by each individual
> recipient of the email rather than by the entire field.
>
> Is there a way to do this using the DataFrame groupBy command?
>
>
> Regards,
> Suraj
>
