leixm commented on PR #307:
URL: 
https://github.com/apache/incubator-uniffle/pull/307#issuecomment-1311162922

   I reused the environment from 
https://github.com/apache/incubator-uniffle/pull/190 to compare performance 
before and after issue #136, and to measure the improvement after merging this PR.
   ### Environment
   
   Shuffle Server Num: 5
   Shuffle Write: 48G
   Configuration: --conf spark.sql.shuffle.partitions=5000 --conf spark.sql.adaptive.enabled=true --conf spark.sql.adaptive.shuffle.targetPostShuffleInputSize=64MB --conf spark.dynamicAllocation.maxExecutors=200 --conf spark.executor.cores=6
   
   We measure the performance of get_shuffle_result by the following metrics:
   
   - get_shuffle_result_times: The number of calls to the get_shuffle_result interface
   - get_shuffle_result_cost: Time consumed by the get_shuffle_result interface
   - get_shuffle_result_for_multi_part_times: The number of calls to the get_shuffle_result_for_multi_part interface
   - get_shuffle_result_for_multi_part_cost: Time consumed by the get_shuffle_result_for_multi_part interface
   
   ### Test Results
   
   Before issue_136
   
   | serverId | get_shuffle_result_times | get_shuffle_result_cost(ms) |
   | -------- | ------------------------ | --------------------------- |
   | Server1  | 1000                     | 157614                      |
   | Server2  | 1000                     | 426897                      |
   | Server3  | 1000                     | 269488                      |
   | Server4  | 1000                     | 906758                      |
   | Server5  | 1001                     | 123217                      |
   | sum      | 5001                     | 1883974                     |
   
   After issue_136
   
   | serverId | get_shuffle_result_for_multi_part_times | get_shuffle_result_for_multi_part_cost(ms) |
   | -------- | --------------------------------------- | ------------------------------------------ |
   | Server1  | 833                                     | 870720                                     |
   | Server2  | 833                                     | 260865                                     |
   | Server3  | 834                                     | 333202                                     |
   | Server4  | 833                                     | 90277                                      |
   | Server5  | 835                                     | 94113                                      |
   | sum      | 4168                                    | 1649177                                    |
   
   After this PR
   
   | serverId | get_shuffle_result_for_multi_part_times | get_shuffle_result_for_multi_part_cost(ms) |
   | -------- | --------------------------------------- | ------------------------------------------ |
   | Server1  | 168                                     | 40355                                      |
   | Server2  | 167                                     | 43852                                      |
   | Server3  | 167                                     | 98452                                      |
   | Server4  | 167                                     | 91838                                      |
   | Server5  | 168                                     | 25479                                      |
   | sum      | 837                                     | 299976                                     |
   
   
   
   
   
   ### Summary
   
   Compared with the results after issue_136, this PR reduces the number of 
interface requests by 79.9% (from 4168 to 837) and the total time by 81.8% 
(from 1649177 ms to 299976 ms).
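
The percentages can be re-derived from the "sum" rows of the tables. The snippet below is only a sketch for checking the arithmetic: the dictionary keys and the helper names `reduction_pct` and `compare` are made up for illustration and are not part of Uniffle. It also shows the comparison against the "Before issue_136" totals, which uses a different interface (get_shuffle_result), so that second comparison is indicative only.

```python
# Totals copied from the "sum" rows of the three result tables.
# The keys are illustrative labels, not Uniffle identifiers.
totals = {
    "before_issue_136": {"calls": 5001, "cost_ms": 1883974},
    "after_issue_136": {"calls": 4168, "cost_ms": 1649177},
    "after_this_pr": {"calls": 837, "cost_ms": 299976},
}


def reduction_pct(baseline, current):
    """Percentage by which `current` is lower than `baseline`."""
    return (baseline - current) / baseline * 100


def compare(baseline_key, current_key="after_this_pr"):
    """Return (call reduction %, cost reduction %), rounded to one decimal."""
    base, cur = totals[baseline_key], totals[current_key]
    return (
        round(reduction_pct(base["calls"], cur["calls"]), 1),
        round(reduction_pct(base["cost_ms"], cur["cost_ms"]), 1),
    )


for key in ("after_issue_136", "before_issue_136"):
    calls_pct, cost_pct = compare(key)
    print(f"vs {key}: calls -{calls_pct}%, total cost -{cost_pct}%")
```

The first printed line corresponds to the 79.9% and 81.8% figures quoted in this summary, which means the baseline for those figures is the "After issue_136" run.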
   
   

