Victsm commented on issue #22173: [SPARK-24355] Spark external shuffle server improvement to better handle block fetch requests.
URL: https://github.com/apache/spark/pull/22173#issuecomment-578414568

@cloud-fan What do you think of SPARK-30602 in the context of the perf regression you are seeing?

We have been operating our Spark infrastructure with this change for quite some time, and we have not generally noticed performance regressions. When shuffling in a large-scale multi-tenant cluster, the issues described in SPARK-30602's SPIP doc become much more dominant. Without the change in SPARK-24355, the disks are saturated by small random reads before the underlying network is saturated, and that saturation then propagates until control plane RPCs start timing out.

SPARK-24355 is basically an attempt to stop the small random reads from impacting control plane RPCs, in order to improve the reliability of the shuffle service. On top of this, SPARK-30602 will significantly improve the overall throughput and efficiency of Spark shuffle.
