Github user gczsjdy commented on the issue:
https://github.com/apache/spark/pull/19763
This happens a lot in our TPC-DS 100TB test. Our master node has an Intel Xeon
E5-2699 v4 @ 2.2GHz CPU, which affects the driver's performance.
We also set `spark.sql.shuffle.partitions` to 10976. The driver's workload
grows with shuffle partitions * number of mappers.
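To make the scaling concrete, here is a rough back-of-the-envelope sketch. The partition count is the one quoted above; the mapper count is a made-up placeholder, since the actual number is not given here, and `driver_entries` is a hypothetical helper, not a Spark API.

```python
def driver_entries(num_mappers: int, shuffle_partitions: int) -> int:
    """Number of per-(mapper, partition) size entries the driver has to sum.

    Hypothetical helper for illustration only; not part of Spark.
    """
    return num_mappers * shuffle_partitions


SHUFFLE_PARTITIONS = 10976   # value of spark.sql.shuffle.partitions from this comment
ASSUMED_MAPPERS = 1000       # assumption: placeholder, not a measured value

if __name__ == "__main__":
    print(driver_entries(ASSUMED_MAPPERS, SHUFFLE_PARTITIONS))
```

With the placeholder of 1000 mappers this is already about 11 million additions, all on the driver.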
Let's take TPC-DS q67 as an example:
Without this PR, there's a 47:39-(41:16+6.3min) ~ 5s gap between the map and
reduce stages, most of which is spent aggregating map statistics on a single
thread.
<img width="927" alt="single_thread_q67"
src="https://user-images.githubusercontent.com/7685352/32893095-49216a4a-cb13-11e7-82fe-ccb552a6a625.PNG">
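As a rough illustration of the idea (not the code in this PR), the sketch below sums per-partition sizes across all map outputs, first on one thread and then with the partition range split into contiguous chunks, one per worker thread. The function names and the list-of-lists input are assumptions made for the example. In CPython the GIL prevents a real speedup for this loop, so the sketch only shows how the work is divided; JVM threads, as used by the driver, can run the chunks in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def aggregate_single(map_sizes, num_partitions):
    """Sum the size of each reduce partition over all map outputs, on one thread."""
    totals = [0] * num_partitions
    for sizes in map_sizes:
        for p in range(num_partitions):
            totals[p] += sizes[p]
    return totals


def aggregate_parallel(map_sizes, num_partitions, num_threads=4):
    """Same result, but each thread sums a disjoint range of partition indices."""
    totals = [0] * num_partitions
    # Ceiling division so that every partition index lands in exactly one chunk.
    chunk = max(1, (num_partitions + num_threads - 1) // num_threads)
    ranges = [(start, min(start + chunk, num_partitions))
              for start in range(0, num_partitions, chunk)]

    def sum_range(bounds):
        start, end = bounds
        # Threads write to disjoint slots of `totals`, so no locking is needed.
        for sizes in map_sizes:
            for p in range(start, end):
                totals[p] += sizes[p]

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # list() forces completion and re-raises any exception from a worker.
        list(pool.map(sum_range, ranges))
    return totals
```

Splitting by partition range rather than by mapper keeps each thread's writes disjoint, so the partial results do not need to be merged afterwards.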
With this PR, the gap is 25:32-(18:58+6.6min) ~ 0s:
<img width="926" alt="multi-thread_q67"
src="https://user-images.githubusercontent.com/7685352/32893264-beb31b82-cb13-11e7-954f-a893f6a9966f.PNG">
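For anyone checking the arithmetic, the two gaps can be reproduced as below. The helper names are made up for this example, and the timestamps are read as mm:ss, which is an assumption about how the UI screenshots should be interpreted.

```python
def to_seconds(mmss: str) -> int:
    """Convert an 'mm:ss' timestamp to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def gap_seconds(reduce_start: str, map_start: str, map_duration_min: float) -> int:
    """Gap = reduce stage start - (map stage start + map stage duration)."""
    map_end = to_seconds(map_start) + round(map_duration_min * 60)
    return to_seconds(reduce_start) - map_end


if __name__ == "__main__":
    print(gap_seconds("47:39", "41:16", 6.3))  # without this PR
    print(gap_seconds("25:32", "18:58", 6.6))  # with this PR
```

The second value comes out slightly negative because the stage durations are only reported to 0.1 min (6 s), so a result within a few seconds of zero is effectively "no gap".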