Thanks, Keith. We have set SPARK_WORKER_INSTANCES=8. So that means we are running 8 workers on a single machine, each with 1 thread, which gives us the 8 threads?
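For reference, the two layouts being weighed could be sketched in spark-env.sh roughly as below. Only one layout is used at a time; the memory values are illustrative assumptions based on the figures in this thread, not a recommendation:

```shell
# Layout A: 8 single-threaded workers per machine (the current setup)
SPARK_WORKER_INSTANCES=8
SPARK_WORKER_CORES=1
SPARK_WORKER_MEMORY=16G

# Layout B: 1 worker per machine exposing 8 cores
# (the single worker must now be sized to host all executors on the node)
SPARK_WORKER_INSTANCES=1
SPARK_WORKER_CORES=8
SPARK_WORKER_MEMORY=128G
```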
Is there a preference for running 1 worker with 8 threads inside it? These are dual-CPU machines, so I believe we need at least 2 worker instances per machine. If that is the case, I can use 2 worker instances, each having 4 threads. Another question: how can we avoid the disk for the shuffle operation?

Best,
Supun..

On Mon, Jul 15, 2019 at 8:49 PM Keith Chapman <keithgchap...@gmail.com> wrote:
> Hi Supun,
>
> A couple of things with regard to your question.
>
> --executor-cores means the number of worker threads per VM. According to
> your requirement this should be set to 8.
>
> *repartitionAndSortWithinPartitions* is an RDD operation, and RDD
> operations in Spark are not performant in terms of either execution or
> memory. I would rather use the DataFrame sort operation if performance
> is key.
>
> Regards,
> Keith.
>
> http://keith-chapman.com
>
>
> On Mon, Jul 15, 2019 at 8:45 AM Supun Kamburugamuve <
> supun.kamburugam...@gmail.com> wrote:
>
>> Hi all,
>>
>> We are trying to measure the sorting performance of Spark. We have a
>> 16-node cluster with 48 cores and 256GB of RAM in each machine and a
>> 10Gbps network.
>>
>> Let's say we are running with 128 parallel tasks and each partition
>> generates about 1GB of data (128GB in total).
>>
>> We are using the method *repartitionAndSortWithinPartitions*.
>>
>> A standalone cluster is used with the following configuration:
>>
>> SPARK_WORKER_CORES=1
>> SPARK_WORKER_MEMORY=16G
>> SPARK_WORKER_INSTANCES=8
>>
>> --executor-memory 16G --executor-cores 1 --num-executors 128
>>
>> I believe this sets 128 executors to run the job, each having 16GB of
>> memory, spread across 16 nodes with 8 threads on each node. This
>> configuration runs very slowly. The program doesn't use disks to read
>> or write data (the data is generated in memory, and we don't write to
>> a file after sorting).
>>
>> It seems that even though the data size is small, Spark uses disk for
>> the shuffle. We are not sure our configuration is optimal to achieve
>> the best performance.
>>
>> Best,
>> Supun..
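Keith's suggestion of replacing the RDD-level repartitionAndSortWithinPartitions with a DataFrame sort could look roughly like the following Scala sketch. The column name, data generation, and SparkSession setup are illustrative assumptions, not taken from the original job:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

val spark = SparkSession.builder()
  .appName("sort-benchmark")  // hypothetical app name
  .getOrCreate()

// Stand-in for the in-memory generated data: a single "key" column.
val df = spark.range(0, 128L * 1000000).toDF("key")

// DataFrame-level equivalent of repartitionAndSortWithinPartitions:
// hash-partition on the key into 128 partitions, then sort each
// partition locally. This path benefits from Tungsten's binary row
// format and whole-stage code generation, unlike the RDD path.
val sorted = df.repartition(128, col("key")).sortWithinPartitions("key")

// Force execution without writing the result anywhere.
val n = sorted.count()
```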
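On the question of avoiding disk for the shuffle: Spark's shuffle map outputs are always written through spark.local.dir, so they cannot be skipped outright, but pointing that directory at a tmpfs mount keeps the files in RAM. A minimal spark-defaults.conf sketch, assuming a tmpfs mount at /dev/shm is available on every node:

```
# spark-defaults.conf (illustrative; the path is an assumption)
spark.local.dir               /dev/shm/spark-local
spark.shuffle.compress        true
spark.shuffle.spill.compress  true
```

With 256GB of RAM per machine and roughly 8GB of shuffle data per node in this benchmark, a tmpfs-backed local dir should fit comfortably.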