Victsm opened a new pull request #25907: [SPARK-29206] Make number of shuffle server threads a multiple of number of chunk fetch handler threads.
URL: https://github.com/apache/spark/pull/25907

### What changes were proposed in this pull request?

We propose to change how the number of Netty server threads and the number of chunk fetch handler threads are configured, so that the former is always a multiple of the latter. This change is needed for the dedicated chunk fetch handler EventLoopGroup to fully resolve the RPC timeout issues. SPARK-29206 explains the reasoning behind this change in more detail.

### How was this patch tested?

We verified that, with the number of threads configured appropriately for both thread pools, the RPC timeout issues no longer appear when the shuffle service is under heavy load. A custom Spark shuffle service stress testing suite was used for this purpose.
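To make the "multiple of" constraint concrete, here is a minimal sketch of one way to enforce it: round the server thread count up to the nearest multiple of the chunk fetch handler thread count. The class `ThreadPoolSizing` and the method `roundUpToMultiple` are hypothetical names used only for illustration. They are not taken from the patch, and the patch may adjust a different pool or round in a different direction.

```java
// Hypothetical helper, not part of Spark: illustrates the sizing constraint only.
public final class ThreadPoolSizing {
    private ThreadPoolSizing() {}

    /**
     * Returns the smallest value >= serverThreads that is an exact
     * multiple of chunkFetchThreads.
     */
    public static int roundUpToMultiple(int serverThreads, int chunkFetchThreads) {
        if (serverThreads <= 0 || chunkFetchThreads <= 0) {
            throw new IllegalArgumentException("thread counts must be positive");
        }
        int remainder = serverThreads % chunkFetchThreads;
        // Already a multiple: keep as is. Otherwise pad up to the next multiple.
        return remainder == 0
            ? serverThreads
            : serverThreads + (chunkFetchThreads - remainder);
    }

    public static void main(String[] args) {
        int requestedServerThreads = 10; // assumed example value
        int chunkFetchThreads = 4;       // assumed example value
        int adjusted = roundUpToMultiple(requestedServerThreads, chunkFetchThreads);
        System.out.println("server threads = " + adjusted
            + ", chunk fetch handler threads = " + chunkFetchThreads);
    }
}
```

Rounding up rather than down is an assumption of this sketch; it avoids shrinking the server pool below what was requested.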
