Hi,
When running in yarn-client mode, the following options can be specified:

- --executor-cores
- --num-executors

Suppose we have the following machines:

- 3 data nodes
- 8 cores on each node

Which is better?

1. --executor-cores 7 --num-executors 3 (more cores per executor, leaving a few cores for other processes)
2. --executor-cores 2 --num-executors 12 (more executors)

I thought option 1 would be better, since it should save some communication overhead between tasks. But in practice, option 2 is faster by a large margin (34 minutes vs. 49 minutes).

Are there best practices for setting these options?

Thanks.
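
P.S. For reference, here is a rough tally of how many cores each option asks for, written as a minimal Python sketch. The function name and the labels are made up for illustration and are not part of Spark. It only multiplies the two flag values and compares the result with the 3 x 8 = 24 cores in the cluster; it does not model memory, YARN's own overhead, or whether YARN actually grants every requested container.

```python
def total_requested_cores(executor_cores: int, num_executors: int) -> int:
    """Cores requested overall: cores per executor times number of executors."""
    return executor_cores * num_executors


# Cluster described in the question: 3 data nodes with 8 cores each.
CLUSTER_CORES = 3 * 8

# (executor_cores, num_executors) for the two options in the question.
OPTIONS = {
    "option 1": (7, 3),
    "option 2": (2, 12),
}

for label, (cores, executors) in OPTIONS.items():
    requested = total_requested_cores(cores, executors)
    print(f"{label}: {requested} of {CLUSTER_CORES} cores requested")
```

So option 1 requests 21 of the 24 cores and option 2 requests all 24, which means the two runs do not have the same number of concurrent task slots. That difference alone may not explain the gap between 34 and 49 minutes, though.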