cloud-fan commented on issue #22173: [SPARK-24355] Spark external shuffle server improvement to better handle block fetch requests.
URL: https://github.com/apache/spark/pull/22173#issuecomment-571181860

It's good to know that the underlying channel write thread pool has the same concurrency as the request handling thread pool. So there are two thread pools: one to handle requests, and one to write data to the channel.

Previously, fetch requests were handled by the request handling thread pool and returned immediately after reading the shuffle blocks. Now, fetch requests are handled by a new fetch request thread pool and do not return until the channel write is completed. This effectively handles fetch requests in sync mode, whereas previously both thread pools were more likely to be kept busy, one reading shuffle blocks and the other writing data to the channel.

Unfortunately I can't share our internal workload (we don't have special settings). I'll try to write a microbenchmark.
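
To make the difference between the two handling modes concrete, here is a minimal conceptual sketch in Python. It is not Spark's code (the shuffle server is written in Java on top of Netty); every name in it (`read_block`, `write_to_channel`, `handle_fetch_async`, `handle_fetch_sync`, `run`) is hypothetical, and the pool sizes are arbitrary. It only models the behaviour described above: in the earlier mode a handler thread hands the write to the writer pool and returns, while in the new mode the handler thread waits for the write to finish.

```python
from concurrent.futures import Future, ThreadPoolExecutor


def read_block(block_id: str) -> bytes:
    """Hypothetical stand-in for reading a shuffle block from disk."""
    return f"data-for-{block_id}".encode()


def write_to_channel(data: bytes, sink: list) -> int:
    """Hypothetical stand-in for a channel write; appends to an in-memory sink."""
    sink.append(data)
    return len(data)


def handle_fetch_async(block_id: str, writer_pool: ThreadPoolExecutor, sink: list) -> Future:
    """Earlier behaviour as described: read, hand off the write, return immediately."""
    data = read_block(block_id)
    return writer_pool.submit(write_to_channel, data, sink)


def handle_fetch_sync(block_id: str, writer_pool: ThreadPoolExecutor, sink: list) -> int:
    """New behaviour as described: the handler thread blocks until the write completes."""
    data = read_block(block_id)
    return writer_pool.submit(write_to_channel, data, sink).result()


def run(mode: str, block_ids: list, handlers: int = 2, writers: int = 2):
    """Serve all fetch requests in the given mode and return (bytes written, sink)."""
    sink: list = []
    handler = handle_fetch_sync if mode == "sync" else handle_fetch_async
    # The handler pool is shut down first on exit, then the writer pool,
    # so every handed-off write is finished before run() returns.
    with ThreadPoolExecutor(max_workers=writers) as writer_pool, \
            ThreadPoolExecutor(max_workers=handlers) as handler_pool:
        results = list(handler_pool.map(lambda b: handler(b, writer_pool, sink), block_ids))
        if mode != "sync":
            # In async mode the handlers returned futures; resolve them here.
            results = [future.result() for future in results]
    return results, sink
```

Both modes write the same data. What differs is how long a handler thread stays occupied per request: in the sync variant it is held for the read plus the write, which is the reason the comment suspects the two pools overlap less than before.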
