zhengchenyu commented on PR #44512:
URL: https://github.com/apache/spark/pull/44512#issuecomment-1874780989

   > > but combine in ExternalSorter will never trigger extra spill.
   > 
   > It could be triggered by `ExternalSorter#maybeSpillCollection`?
   
   @Ngone51 Thanks for your reply!
   
   Yes, it could be triggered by `ExternalSorter#maybeSpillCollection`. What I mean is that the extra spill is triggered by `ExternalAppendOnlyMap#spill`.
   
   Before the change, both `ExternalAppendOnlyMap#spill` and `ExternalSorter#maybeSpillCollection` spill. After the change, only `ExternalSorter#maybeSpillCollection` spills.
   
   I generated a large amount of test data and found the following logs:
   
   ```
   23/12/26 16:55:52 INFO ExternalAppendOnlyMap: Thread 48 spilling in-memory map of 522.3 MiB to disk (1 time so far)
   23/12/26 16:56:09 INFO ExternalAppendOnlyMap: Thread 48 spilling in-memory map of 629.8 MiB to disk (2 times so far)
   23/12/26 16:56:40 INFO ExternalAppendOnlyMap: Thread 48 spilling in-memory map of 491.5 MiB to disk (3 times so far)
   23/12/26 16:56:58 INFO ExternalAppendOnlyMap: Thread 48 spilling in-memory map of 553.0 MiB to disk (4 times so far)
   23/12/26 16:57:16 INFO ExternalAppendOnlyMap: Thread 48 spilling in-memory map of 442.9 MiB to disk (5 times so far)
   23/12/26 16:57:47 INFO ExternalAppendOnlyMap: Thread 48 spilling in-memory map of 576.1 MiB to disk (6 times so far)
   23/12/26 16:58:04 INFO ExternalAppendOnlyMap: Thread 48 spilling in-memory map of 540.2 MiB to disk (7 times so far)
   23/12/26 16:58:22 INFO ExternalAppendOnlyMap: Thread 48 spilling in-memory map of 686.3 MiB to disk (8 times so far)
   23/12/26 16:58:52 INFO ExternalAppendOnlyMap: Thread 48 spilling in-memory map of 540.2 MiB to disk (9 times so far)
   23/12/26 17:00:06 INFO ExternalAppendOnlyMap: Task 5 force spilling in-memory map to disk and it will release 942.2 MiB memory
   23/12/26 17:00:27 INFO ExternalSorter: Thread 48 spilling in-memory map of 954.0 MiB to disk (1 time so far)
   23/12/26 17:01:24 INFO ExternalSorter: Thread 48 spilling in-memory map of 954.0 MiB to disk (2 times so far)
   23/12/26 17:02:24 INFO ExternalSorter: Thread 48 spilling in-memory map of 959.1 MiB to disk (3 times so far)
   23/12/26 17:03:23 INFO ExternalSorter: Thread 48 spilling in-memory map of 954.0 MiB to disk (4 times so far)
   23/12/26 17:04:30 INFO ExternalSorter: Thread 48 spilling in-memory map of 948.8 MiB to disk (5 times so far)
   23/12/26 17:05:24 INFO ExternalSorter: Thread 48 spilling in-memory map of 943.7 MiB to disk (6 times so far)
   23/12/26 17:06:25 INFO ExternalSorter: Thread 48 spilling in-memory map of 954.0 MiB to disk (7 times so far)
   23/12/26 17:07:08 INFO ExternalSorter: Thread 48 spilling in-memory map of 959.1 MiB to disk (8 times so far)
   23/12/26 17:07:58 INFO ExternalSorter: Thread 48 spilling in-memory map of 959.1 MiB to disk (9 times so far)
   ```
   
   After the change, only the `ExternalSorter` spill logs appear.
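
One rough way to check this on an executor log is to tally spill messages per logger. The snippet below is a minimal sketch and not part of the PR: it assumes the `INFO <Logger>: ... spilling ...` line format seen in my logs, and `count_spills` and the file name `executor.log` are hypothetical names chosen for illustration.

```python
import re
from collections import Counter

# Captures the logger name from lines such as
# "23/12/26 17:00:27 INFO ExternalSorter: Thread 48 spilling in-memory map of ..."
# (assumed format; adjust the pattern if the log layout differs).
SPILL_LINE = re.compile(r"INFO (\w+): .*\bspilling\b")


def count_spills(log_lines):
    """Return a Counter of spill messages keyed by logger name."""
    counts = Counter()
    for line in log_lines:
        match = SPILL_LINE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    import os

    path = "executor.log"  # hypothetical path to an executor log
    if os.path.exists(path):
        with open(path, encoding="utf-8") as fh:
            for logger, n in sorted(count_spills(fh).items()):
                print(f"{logger}: {n}")
```

If the change behaves as described, a log from a patched run should report no `ExternalAppendOnlyMap` entries at all, while a log from an unpatched run should report both loggers.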
   
   
   



