GroupBy and Spark Performance issue

KhajaAsmath Mohammed Mon, 16 Jan 2017 21:39:54 -0800

Hi,

I am trying to group data in Spark and find the maximum value for each
group. I have to use group by because I need to transpose the data based
on the values.
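
For illustration only, here is a minimal sketch of the computation described
above, written in plain Python rather than Spark. The records, the field
layout (group key, column name, value), and the function names are all
hypothetical and are not taken from the actual job. It keeps a running
maximum per group instead of collecting every row of a group first, then
transposes the result so that each column name becomes a field.

```python
from collections import defaultdict

# Hypothetical sample records: (group_key, column_name, value).
# These are made up for illustration and are not from the real data set.
records = [
    ("a", "x", 3),
    ("a", "x", 7),
    ("a", "y", 2),
    ("b", "x", 5),
    ("b", "y", 9),
    ("b", "y", 4),
]


def max_per_group(rows):
    """Return {(key, column): max value}, keeping only one value per group."""
    best = {}
    for key, column, value in rows:
        group = (key, column)
        # Update the running maximum; the rows of a group are never stored.
        if group not in best or value > best[group]:
            best[group] = value
    return best


def transpose(best):
    """Pivot {(key, column): max} into {key: {column: max}}."""
    wide = defaultdict(dict)
    for (key, column), value in best.items():
        wide[key][column] = value
    return dict(wide)


result = transpose(max_per_group(records))
print(result)
```

In this sketch the memory needed grows with the number of distinct groups,
not with the number of rows in each group.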


I tried repartitioning the data, increasing the number of partitions from
1 to 10000. The job runs up to the stage shown below and then takes a long
time to move ahead. It has never succeeded; the job gets killed after some
time with GC overhead limit errors.


[image: Inline image 1]

I increased the memory limits too. I am not sure what is going wrong. Can
anyone guide me to the right approach?

Thanks,
Asmath
