Hi, I have submitted a job on a 4-node cluster, and I see most of the operations happening on one of the worker nodes while the other two are simply sitting idle.
The attached screenshot illustrates this. How do I distribute the load properly?

My cluster configuration (4-node cluster: 1 driver, 3 workers), per node:
- Cores: 6
- RAM: 12 GB
- HDD: 60 GB

My spark-submit command is:

spark-submit --master spark://192.168.49.37:7077 --num-executors 3 --executor-cores 5 --executor-memory 4G /appdata/bblite-codebase/prima_diabetes_indians.py

What should I do?

Thanks,
Aakash.
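For context, one detail worth checking (a sketch, not verified against this exact setup): `--num-executors` is documented as a YARN/Kubernetes option, so a standalone master like `spark://...:7077` may ignore it. On standalone, the total core budget is usually set with `--total-executor-cores`, and capping it per the executor size encourages the scheduler to spread executors across workers. A hedged variant of the same command, reusing the master URL and script path from above:

```shell
# Hypothetical standalone-mode variant: --num-executors (YARN/K8s only) is
# replaced by --total-executor-cores. With 5 cores per executor and a
# 15-core total budget, the scheduler can place one executor per worker
# (3 workers x 5 cores) rather than packing work onto a single node.
spark-submit \
  --master spark://192.168.49.37:7077 \
  --total-executor-cores 15 \
  --executor-cores 5 \
  --executor-memory 4G \
  /appdata/bblite-codebase/prima_diabetes_indians.py
```

Even with this, a single busy node can also come from the input data having too few partitions, so the partition count of the source DataFrame/RDD is worth checking as well.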