Kent Yao created SPARK-35879:
--------------------------------

             Summary: Fix performance regression caused by collectFetchRequests
                 Key: SPARK-35879
                 URL: https://issues.apache.org/jira/browse/SPARK-35879
             Project: Spark
          Issue Type: Bug
          Components: Shuffle, Spark Core
    Affects Versions: 3.1.0, 3.2.0
            Reporter: Kent Yao


{code:java}
```sql
 SET spark.sql.adaptive.enabled=true;
 SET spark.sql.shuffle.partitions=3000;
 SELECT /*+ REPARTITION */ 1 as pid, id from range(1, 1000000, 1, 500);
 SELECT /*+ REPARTITION(pid, id) */ 1 as pid, id from range(1, 1000000, 1, 500);
 ```{code}
{code:java}
```log
 21/06/23 13:54:22 DEBUG ShuffleBlockFetcherIterator: maxBytesInFlight: 
50331648, targetRemoteRequestSize: 10066329, maxBlocksInFlightPerAddress: 
2147483647
 21/06/23 13:54:38 DEBUG ShuffleBlockFetcherIterator: Creating fetch request of 
2314708 at BlockManagerId(2, 10.1.3.114, 36423, None) with 86 blocks
 21/06/23 13:54:59 DEBUG ShuffleBlockFetcherIterator: Creating fetch request of 
2636612 at BlockManagerId(3, 10.1.3.115, 34293, None) with 87 blocks
 21/06/23 13:55:18 DEBUG ShuffleBlockFetcherIterator: Creating fetch request of 
2508706 at BlockManagerId(4, 10.1.3.116, 41869, None) with 90 blocks
 21/06/23 13:55:34 DEBUG ShuffleBlockFetcherIterator: Creating fetch request of 
2350854 at BlockManagerId(5, 10.1.3.117, 45787, None) with 85 blocks
 21/06/23 13:55:34 INFO ShuffleBlockFetcherIterator: Getting 438 (11.8 MiB) 
non-empty blocks including 90 (2.5 MiB) local and 0 (0.0 B) host-local and 348 
(9.4 MiB) remote blocks
 21/06/23 13:55:34 DEBUG ShuffleBlockFetcherIterator: Sending request for 87 
blocks (2.5 MiB) from 10.1.3.115:34293
 21/06/23 13:55:34 INFO TransportClientFactory: Successfully created connection 
to /10.1.3.115:34293 after 1 ms (0 ms spent in bootstraps)
 21/06/23 13:55:34 DEBUG ShuffleBlockFetcherIterator: Sending request for 90 
blocks (2.4 MiB) from 10.1.3.116:41869
 21/06/23 13:55:34 INFO TransportClientFactory: Successfully created connection 
to /10.1.3.116:41869 after 2 ms (0 ms spent in bootstraps)
 21/06/23 13:55:34 DEBUG ShuffleBlockFetcherIterator: Sending request for 85 
blocks (2.2 MiB) from 10.1.3.117:45787
 ```{code}
{code:java}
```log
 21/06/23 14:00:45 INFO MapOutputTracker: Broadcast outputstatuses size = 411, 
actual size = 828997
 21/06/23 14:00:45 INFO MapOutputTrackerWorker: Got the map output locations
 21/06/23 14:00:45 DEBUG ShuffleBlockFetcherIterator: maxBytesInFlight: 
50331648, targetRemoteRequestSize: 10066329, maxBlocksInFlightPerAddress: 
2147483647
 21/06/23 14:00:55 DEBUG ShuffleBlockFetcherIterator: Creating fetch request of 
1894389 at BlockManagerId(2, 10.1.3.114, 36423, None) with 99 blocks
 21/06/23 14:01:04 DEBUG ShuffleBlockFetcherIterator: Creating fetch request of 
1919993 at BlockManagerId(3, 10.1.3.115, 34293, None) with 100 blocks
 21/06/23 14:01:14 DEBUG ShuffleBlockFetcherIterator: Creating fetch request of 
1977186 at BlockManagerId(5, 10.1.3.117, 45787, None) with 103 blocks
 21/06/23 14:01:23 DEBUG ShuffleBlockFetcherIterator: Creating fetch request of 
1938336 at BlockManagerId(4, 10.1.3.116, 41869, None) with 101 blocks
 21/06/23 14:01:23 INFO ShuffleBlockFetcherIterator: Getting 500 (9.1 MiB) 
non-empty blocks including 97 (1820.3 KiB) local and 0 (0.0 B) host-local and 
403 (7.4 MiB) remote blocks
 21/06/23 14:01:23 DEBUG ShuffleBlockFetcherIterator: Sending request for 101 
blocks (1892.9 KiB) from 10.1.3.116:41869
 21/06/23 14:01:23 DEBUG ShuffleBlockFetcherIterator: Sending request for 103 
blocks (1930.8 KiB) from 10.1.3.117:45787
 21/06/23 14:01:23 DEBUG ShuffleBlockFetcherIterator: Sending request for 99 
blocks (1850.0 KiB) from 10.1.3.114:36423
 21/06/23 14:01:23 DEBUG ShuffleBlockFetcherIterator: Sending request for 100 
blocks (1875.0 KiB) from 10.1.3.115:34293
 21/06/23 14:01:23 INFO ShuffleBlockFetcherIterator: Started 4 remote fetches 
in 37889 ms
 ```{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to