A dataframe with the following contents is given:

ID PART DETAILS
 1    1 A1
 1    2 A2
 1    3 A3
 2    1 B1
 3    1 C1

The target format should be as follows:

ID DETAILS
 1 A1+A2+A3
 2 B1
 3 C1

Note that the order of A1 to A3 is important.

Currently I am using this alternative:

ID DETAIL_1 DETAIL_2 DETAIL_3
 1 A1       A2       A3
 2 B1
 3 C1

What would be the best method to do such a transformation on a large dataset?
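
For reference, here is a minimal sketch of the transformation I am after, assuming PART is numeric and defines the required order within each ID. The function names concat_details_local and concat_details_spark are hypothetical. The first is a plain-Python reference that pins down the expected result. The second is one possible PySpark formulation (collect the (PART, DETAILS) pairs per ID, sort them, then join the DETAILS). It is not necessarily the best method for a large dataset.

```python
from itertools import groupby
from operator import itemgetter


def concat_details_local(rows, sep="+"):
    """Plain-Python reference. rows are (ID, PART, DETAILS) tuples."""
    # Sort by ID, then PART, so DETAILS are joined in PART order.
    ordered = sorted(rows, key=itemgetter(0, 1))
    return [
        (key, sep.join(details for _, _, details in group))
        for key, group in groupby(ordered, key=itemgetter(0))
    ]


def concat_details_spark(df, sep="+"):
    """Possible PySpark equivalent (assumes columns ID, PART, DETAILS)."""
    # Imported lazily so the plain-Python part runs without Spark installed.
    from pyspark.sql import functions as F

    # collect_list alone does not guarantee order after a shuffle, so each
    # DETAILS value is paired with its PART and the array of structs is
    # sorted (by PART first) before the DETAILS field is extracted.
    parts = F.sort_array(F.collect_list(F.struct("PART", "DETAILS")))
    return (
        df.groupBy("ID")
        .agg(parts.alias("parts"))
        .select("ID", F.concat_ws(sep, F.col("parts.DETAILS")).alias("DETAILS"))
    )


# The sample data from above, deliberately shuffled.
rows = [(1, 3, "A3"), (1, 1, "A1"), (2, 1, "B1"), (1, 2, "A2"), (3, 1, "C1")]
result = concat_details_local(rows)
print(result)  # [(1, 'A1+A2+A3'), (2, 'B1'), (3, 'C1')]
```

The sort step is there because the order matters. If PART were stored as a string, parts 10 and above would sort before 2, so it would need a cast to a numeric type first.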



