fitermay commented on issue #23986: [SPARK-27070] Fix performance bug in DefaultPartitionCoalescer
URL: https://github.com/apache/spark/pull/23986#issuecomment-470943732

By the way, these are the results from the original PR, before `min` was replaced with `minBy`. It seems to be twice as fast. I'm guessing that's because passing an implicit ordering involves less indirection than passing a `minBy` lambda.

```
Intel64 Family 6 Model 63 Stepping 2, GenuineIntel
Coalesced RDD:                                Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
----------------------------------------------------------------------------------------------------------------------
Coalesce Num Partitions: 100 Num Hosts: 1               264           289         26        0.4       2644.9      1.0X
Coalesce Num Partitions: 100 Num Hosts: 5               211           220          8        0.5       2110.6      1.3X
Coalesce Num Partitions: 100 Num Hosts: 10              215           225         10        0.5       2149.9      1.2X
Coalesce Num Partitions: 100 Num Hosts: 20              200           203          6        0.5       1996.6      1.3X
Coalesce Num Partitions: 100 Num Hosts: 40              198           205         11        0.5       1983.7      1.3X
Coalesce Num Partitions: 100 Num Hosts: 80              199           203          4        0.5       1992.8      1.3X
Coalesce Num Partitions: 500 Num Hosts: 1               465           477         15        0.2       4654.2      0.6X
Coalesce Num Partitions: 500 Num Hosts: 5               271           280         11        0.4       2707.9      1.0X
Coalesce Num Partitions: 500 Num Hosts: 10              232           250         18        0.4       2320.5      1.1X
Coalesce Num Partitions: 500 Num Hosts: 20              213           222         14        0.5       2130.8      1.2X
Coalesce Num Partitions: 500 Num Hosts: 40              210           215          9        0.5       2102.9      1.3X
Coalesce Num Partitions: 500 Num Hosts: 80              206           206          0        0.5       2062.4      1.3X
Coalesce Num Partitions: 1000 Num Hosts: 1              715           716          1        0.1       7149.7      0.4X
Coalesce Num Partitions: 1000 Num Hosts: 5              310           311          1        0.3       3098.5      0.9X
Coalesce Num Partitions: 1000 Num Hosts: 10             255           266         17        0.4       2553.8      1.0X
Coalesce Num Partitions: 1000 Num Hosts: 20             230           238         12        0.4       2304.1      1.1X
Coalesce Num Partitions: 1000 Num Hosts: 40             227           242         21        0.4       2271.1      1.2X
Coalesce Num Partitions: 1000 Num Hosts: 80             211           217         10        0.5       2114.5      1.3X
Coalesce Num Partitions: 5000 Num Hosts: 1             3043          3616        634        0.0      30428.0      0.1X
Coalesce Num Partitions: 5000 Num Hosts: 5             1035          1069         52        0.1      10353.4      0.3X
Coalesce Num Partitions: 5000 Num Hosts: 10             613           617          3        0.2       6134.6      0.4X
Coalesce Num Partitions: 5000 Num Hosts: 20             408           419         11        0.2       4082.8      0.6X
Coalesce Num Partitions: 5000 Num Hosts: 40             315           340         24        0.3       3153.3      0.8X
Coalesce Num Partitions: 5000 Num Hosts: 80             258           262          5        0.4       2577.7      1.0X
Coalesce Num Partitions: 10000 Num Hosts: 1            5385          5470        124        0.0      53848.7      0.0X
Coalesce Num Partitions: 10000 Num Hosts: 5            1856          1861          7        0.1      18561.0      0.1X
Coalesce Num Partitions: 10000 Num Hosts: 10           1022          1075         48        0.1      10223.2      0.3X
Coalesce Num Partitions: 10000 Num Hosts: 20            619           626          8        0.2       6185.8      0.4X
Coalesce Num Partitions: 10000 Num Hosts: 40            417           422          5        0.2       4168.2      0.6X
Coalesce Num Partitions: 10000 Num Hosts: 80            312           316          4        0.3       3119.6      0.8X
```
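
For readers unfamiliar with the two call shapes being compared, here is a minimal Scala sketch. It is not the code from the PR: `Group`, `GroupOrdering`, and the two helper methods are hypothetical names made up for illustration. It only shows that `min` with an implicit `Ordering` and `minBy` with a key lambda pick the same element; it does not measure which is faster, and the indirection explanation above remains a guess.

```scala
// Hypothetical stand-in for a group of partitions on one host.
// This is NOT Spark's PartitionGroup class.
final case class Group(host: String, numPartitions: Int)

object MinVsMinBy {
  // Variant 1: an implicit Ordering that compares two groups directly.
  implicit object GroupOrdering extends Ordering[Group] {
    def compare(a: Group, b: Group): Int =
      java.lang.Integer.compare(a.numPartitions, b.numPartitions)
  }

  // Uses the implicit GroupOrdering above.
  def smallestWithMin(groups: Seq[Group]): Group = groups.min

  // Variant 2: extract a key with a lambda, then compare keys via Ordering[Int].
  def smallestWithMinBy(groups: Seq[Group]): Group = groups.minBy(_.numPartitions)
}
```

Both helpers throw `UnsupportedOperationException` on an empty sequence, so a caller would need to guard against that case.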
