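You probably want to do .select(explode($"to"), ...) first, which will produce a new row for each value in the input array. You can then run groupBy on the exploded column to group by individual recipient rather than by the whole array.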
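
To make the semantics concrete, here is a minimal plain-Python sketch (no Spark required) of what an explode followed by a grouped count does. The sample addresses and the helper names explode_column and count_by are made up for illustration and are not part of any Spark API. In PySpark the corresponding call should be along the lines of the one shown in the comment, assuming a version where pyspark.sql.functions.explode is available (1.4 or later); treat that line as an untested sketch.

```python
from collections import Counter

# PySpark equivalent (assumption: Spark 1.4+, `df` has an array column "To"):
#   from pyspark.sql.functions import explode
#   df.select(explode(df["To"]).alias("recipient")).groupBy("recipient").count()

# Hypothetical sample data: each row mimics one email whose "To" field is a list.
rows = [
    {"To": ["alice@example.com", "bob@example.com"]},
    {"To": ["alice@example.com"]},
    {"To": ["carol@example.com", "bob@example.com", "alice@example.com"]},
]


def explode_column(rows, column):
    """Yield one single-value row per element of the list stored in `column`."""
    for row in rows:
        for value in row[column]:
            yield {column: value}


def count_by(rows, column):
    """Group rows by the value in `column` and count the rows in each group."""
    return Counter(row[column] for row in rows)


# Explode first, then group: each recipient becomes its own grouping key.
exploded = list(explode_column(rows, "To"))
counts = count_by(exploded, "To")

for recipient, n in sorted(counts.items()):
    print(recipient, n)
```

Grouping before exploding would instead use the entire recipient list as the key, which is the behaviour the original question is trying to avoid.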

On Fri, Jun 19, 2015 at 12:02 AM, Suraj Shetiya <[email protected]> wrote:
>
> Hi,
>
> I wanted to obtain a grouped by frame from a dataframe.
>
> A snippet of the column on which I need to perform groupby is below.
>
> > df.select("To").show()
>
> To
> ArrayBuffer(vance...
> ArrayBuffer(vance...
> ArrayBuffer(rober...
> ArrayBuffer(richa...
> ArrayBuffer(guill...
> ArrayBuffer(m..pr...
> ArrayBuffer(rich....
> ArrayBuffer(issue...
> ArrayBuffer(jim.f...
> ArrayBuffer(richa...
>
>
> A sample field is as below
>
> > df.select("To").collect()[0]
>
> Row(To=[u'[email protected]', u'[email protected]',
> u'[email protected]', u'[email protected]',
> u'[email protected]', u'[email protected]',
> u'[email protected]', u'[email protected]',
> u'[email protected]', u'[email protected]',
> u'[email protected]', u'[email protected]',
> u'[email protected]'])
>
> I want to perform a group by on "To" column but perform it by each
> recipient of the email rather than the entire field.
>
> Is there a way to do this using the dataframe groupBy command ?
>
>
> Regards,
> Suraj
>
