Github user gczsjdy commented on the issue:

    https://github.com/apache/spark/pull/19763
  
    This happens a lot in our TPC-DS 100TB test. The master node has an Intel 
Xeon E5-2699 v4 CPU @ 2.2GHz, which affects the driver's performance. We also 
set `spark.sql.shuffle.partitions` to 10976. The amount of work the driver does 
scales with shuffle partitions * number of mappers.
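    To make the "shuffle partitions * number of mappers" cost concrete, here is a minimal Python sketch of the aggregation step, first as a single pass and then split across threads. It is not the code from this PR or from Spark: `aggregate_map_sizes`, `map_sizes` and `num_threads` are hypothetical names used only for illustration, and Python threads will not show a real speedup because of the GIL. The sketch only shows how the per-partition sums can be divided into independent ranges.

```python
from concurrent.futures import ThreadPoolExecutor


def aggregate_map_sizes(map_sizes, num_threads=1):
    """Sum per-mapper output sizes into one total per reduce partition.

    map_sizes[m][p] is the size mapper m reported for reduce partition p
    (hypothetical layout). Total work is len(map_sizes) * num_partitions.
    """
    if not map_sizes:
        return []
    num_partitions = len(map_sizes[0])
    if num_partitions == 0:
        return []
    num_threads = max(1, num_threads)

    def sum_range(bounds):
        # Each worker handles a contiguous range of reduce partitions.
        start, end = bounds
        return [sum(row[p] for row in map_sizes) for p in range(start, end)]

    # Split the partition indices into contiguous chunks, one per thread.
    chunk = -(-num_partitions // num_threads)  # ceiling division
    ranges = [(s, min(s + chunk, num_partitions))
              for s in range(0, num_partitions, chunk)]

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        parts = pool.map(sum_range, ranges)  # results keep range order
    return [total for part in parts for total in part]
```

    With 10976 shuffle partitions, every additional mapper adds 10976 more values for the driver to sum, so the single-threaded pass grows linearly with the number of mappers. The ranges are independent of each other, which is why the work can be divided across threads without locking.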
    
    Let's take TPC-DS q67 as an example:
    Without this PR, there's a 47:39-(41:16+6.3min) ~ 5s gap between the map 
and reduce stages, most of which is spent aggregating map statistics on a 
single thread. 
    <img width="927" alt="single_thread_q67" 
src="https://user-images.githubusercontent.com/7685352/32893095-49216a4a-cb13-11e7-82fe-ccb552a6a625.PNG">
    With this PR, there's a 25:32-(18:58+6.6min) ~ 0s gap:
    <img width="926" alt="multi-thread_q67" 
src="https://user-images.githubusercontent.com/7685352/32893264-beb31b82-cb13-11e7-954f-a893f6a9966f.PNG">
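    The two gap figures can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, assuming the values read as mm:ss timestamps: the first is taken to be the reduce stage's start time, the second the map stage's start time, and the minutes value the map stage's duration. The function names are hypothetical.

```python
def to_seconds(mmss):
    """Convert an 'mm:ss' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def stage_gap_seconds(reduce_start, map_start, map_minutes):
    """Time between the end of the map stage and the start of the reduce stage."""
    # The duration is given to 0.1 min, so round to whole seconds.
    map_end = to_seconds(map_start) + round(map_minutes * 60)
    return to_seconds(reduce_start) - map_end


print(stage_gap_seconds("47:39", "41:16", 6.3))  # without the PR: 5
print(stage_gap_seconds("25:32", "18:58", 6.6))  # with the PR: -2
```

    The second result is -2 rather than exactly 0 because the durations are only given to 0.1 min (about 3s either way), which is consistent with the "~ 0s" above.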
    


