Hi, I have submitted a job on a 4-node cluster, and I see most of the operations happening on one of the worker nodes while the other two are simply sitting idle.
The attached screenshot illustrates this. How do I distribute the load properly?

My cluster configuration (4-node cluster: 1 driver, 3 workers), per node:
- Cores: 6
- RAM: 12 GB
- HDD: 60 GB

My spark-submit command is:

spark-submit --master spark://192.168.49.37:7077 --num-executors 3 --executor-cores 5 --executor-memory 4G /appdata/bblite-codebase/prima_diabetes_indians.py

What should I do?

Thanks,
Aakash.
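For context, one detail worth checking (a sketch, not verified against this exact setup): `--num-executors` is documented as a YARN/Kubernetes option, so a standalone master like `spark://...:7077` may ignore it. On standalone, the total core budget is usually set with `--total-executor-cores`, and capping it per the executor size encourages the scheduler to spread executors across workers. A hedged variant of the same command, reusing the master URL and script path from above:

```shell
# Hypothetical standalone-mode variant: --num-executors (YARN/K8s only) is
# replaced by --total-executor-cores. With 5 cores per executor and a
# 15-core total budget, the scheduler can place one executor per worker
# (3 workers x 5 cores) rather than packing work onto a single node.
spark-submit \
  --master spark://192.168.49.37:7077 \
  --total-executor-cores 15 \
  --executor-cores 5 \
  --executor-memory 4G \
  /appdata/bblite-codebase/prima_diabetes_indians.py
```

Even with this, a single busy node can also come from the input data having too few partitions, so the partition count of the source DataFrame/RDD is worth checking as well.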