What does your code look like? Perhaps it performs an operation that is bound to a single host, or your data volume is too small to be split across multiple hosts.
> On 11. Jun 2018, at 11:13, Aakash Basu <aakash.spark....@gmail.com> wrote:
>
> Hi,
>
> I have submitted a job on 4 node cluster, where I see, most of the operations
> happening at one of the worker nodes and other two are simply chilling out.
>
> Picture below puts light on that -
> <image.png>
> How to properly distribute the load?
>
> My cluster conf (4 node cluster [1 driver; 3 slaves]) -
>
> Cores - 6
> RAM - 12 GB
> HDD - 60 GB
>
> My Spark Submit command is as follows -
>
> spark-submit --master spark://192.168.49.37:7077 --num-executors 3 \
>   --executor-cores 5 --executor-memory 4G \
>   /appdata/bblite-codebase/prima_diabetes_indians.py
>
> What to do?
>
> Thanks,
> Aakash.