More cores finishing faster is expected. My guess is that you are getting more parallelism overall, and that speeds things up. However, with more tasks executing concurrently on one machine you get some contention, so it's plausible that individual tasks take a little longer - some I/O contention, CPU throttling under load, etc. There could be other reasons, but this is probably quite normal.
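If you want to check the contention theory, one option is to pull the per-task durations for that stage out of Spark's monitoring REST API and compare the distributions between runs. A rough sketch follows (the host, application id, and stage id are placeholders you would replace with your own; it assumes the taskList endpoint and the duration field from the monitoring API, so double-check against the docs for your Spark version):

import requests

# Placeholders - point this at your history server (port 18080) or a live
# driver UI (port 4040), and fill in your own application id.
BASE = "http://xx.xx.xx.xx:18080/api/v1"
APP_ID = "app-XXXXXXXXXXXX-XXXX"
STAGE_ID = 4  # the long stage you identified in the web UI

def task_durations(app_id, stage_id, attempt=0):
    # taskList returns one JSON object per task in the stage attempt;
    # 'duration' is in milliseconds for tasks that have finished.
    url = f"{BASE}/applications/{app_id}/stages/{stage_id}/{attempt}/taskList"
    tasks = requests.get(url, params={"length": 1000}).json()
    return sorted(t["duration"] for t in tasks if t.get("duration") is not None)

durations = task_durations(APP_ID, STAGE_ID)
if durations:
    n = len(durations)
    print(f"tasks={n}  min={durations[0]}ms  "
          f"median={durations[n // 2]}ms  max={durations[-1]}ms")

If the whole distribution shifts right as you add cores (not just the first wave of tasks on each executor), that points at contention on the node rather than scheduling or startup overhead.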
On Thu, Feb 10, 2022 at 7:02 AM 15927907...@163.com <15927907...@163.com> wrote:

> Hello,
> I recently ran a test with Spark 3.2 on the TPC-DS dataset; the total
> TPC-DS data scale is 1TB (located in HDFS). I ran into a result I can't
> explain and hope to get your help.
>
> The SQL statement tested is as follows:
>
> select count(*) from (
>   select distinct c_last_name, c_first_name, d_date
>   from store_sales, date_dim, customer
>   where store_sales.ss_sold_date_sk = date_dim.d_date_sk
>     and store_sales.ss_customer_sk = customer.c_customer_sk
>     and d_month_seq between 1200 and 1200 + 11
> ) hot_cust
> limit 100;
>
> Three tables are used here, and their sizes are: store_sales (387.97GB),
> date_dim (9.84GB), customer (1.52GB).
>
> When submitting the application, the driver and executor parameters are
> set as follows:
>
> DRIVER_OPTIONS="--master spark://xx.xx.xx.xx:7078 --total-executor-cores 48
>   --conf spark.driver.memory=100g --conf spark.default.parallelism=100
>   --conf spark.sql.adaptive.enabled=false"
> EXECUTOR_OPTIONS="--executor-cores 12 --conf spark.executor.memory=50g
>   --conf spark.task.cpus=1 --conf spark.sql.shuffle.partitions=100
>   --conf spark.sql.crossJoin.enabled=true"
>
> The Spark cluster runs on a single physical node with only one worker, and
> the application is submitted in standalone mode. The node has 56 cores and
> 376GB of memory in total. I kept the number of executors fixed at 4 and
> varied executor-cores across 2, 4, 8, 12. The corresponding application
> times were 96s, 66s, 50s, 49s. Once total-executor-cores >= 32, the total
> time is almost unchanged.
>
> I monitored CPU utilization and IO and found no bottlenecks. I then looked
> at the task times in the longest stage of each application in the Spark
> web UI (the longest stage in all four test cases is stage 4, with times of
> 48s, 28s, 22s, 23s). With more CPU cores, the individual task times in
> stage 4 also increase; the average task times are 3s, 4s, 5s, 7s. In
> addition, each executor takes longer to run its first batch of tasks than
> the subsequent ones, and the more cores there are, the larger the gap:
>
> Test-Case-ID  total-executor-cores  executor-cores  total app time  stage4 time  avg task time  first batch of tasks  remaining batches
> 1             8                     2               96s             48s          3s             5-6s                  3-4s
> 2             16                    4               66s             28s          4s             7-8s                  ~4s
> 3             32                    8               50s             23s          5s             10-12s                ~5s
> 4             48                    12              49s             22s          7s             14-15s                ~6s
>
> This result confuses me: why do tasks take longer to execute after
> increasing the number of cores? I compared the amount of data processed by
> each task in the different cases and it is the same. And why does the
> execution efficiency stop improving once the number of cores passes a
> certain point?
>
> Looking forward to your reply.
>
> ------------------------------
> 15927907...@163.com