Github user redsanket commented on a diff in the pull request:
https://github.com/apache/spark/pull/22173#discussion_r216069578
--- Diff: common/network-common/src/main/java/org/apache/spark/network/util/TransportConf.java ---
@@ -281,4 +282,31 @@ public Properties cryptoConf() {
public long maxChunksBeingTransferred() {
return conf.getLong("spark.shuffle.maxChunksBeingTransferred",
Long.MAX_VALUE);
}
+
+  /**
+   * Percentage of io.serverThreads used by netty to process ChunkFetchRequest.
+   * The shuffle server will use a separate EventLoopGroup to process ChunkFetchRequest messages.
+   * Although when calling the async writeAndFlush on the underlying channel to send the
+   * response back to the client, the I/O on the channel is still handled by
+   * {@link org.apache.spark.network.server.TransportServer}'s default EventLoopGroup
+   * that's registered with the Channel, by waiting inside the ChunkFetchRequest handler
+   * threads for the completion of sending back responses, we are able to put a limit on
+   * the max number of threads from TransportServer's default EventLoopGroup that are
+   * consumed by writing responses to ChunkFetchRequest messages, which are I/O intensive
+   * and could take a long time to process due to disk contention. By configuring a slightly
+   * higher number of shuffle server threads, we are able to reserve some threads for
+   * handling other RPC messages, thus making the client less likely to experience timeouts
+   * when sending RPC messages to the shuffle server. Defaults to 0, which is 2*#cores
+   * or io.serverThreads. A value of 10 would mean 10% of 2*#cores or 10% of io.serverThreads,
+   * which equals 0.1 * 2*#cores or 0.1 * io.serverThreads.
+   */
+  public int chunkFetchHandlerThreads() {
+    if(!this.getModuleName().equalsIgnoreCase("shuffle")) {
+      return 0;
+    }
+    int chunkFetchHandlerThreadsPercent =
+      conf.getInt("spark.shuffle.server.chunkFetchHandlerThreadsPercent", 0);
+    return this.serverThreads() > 0? (this.serverThreads() * chunkFetchHandlerThreadsPercent)/100:
--- End diff ---
I think it is a good idea to document both, as this is an important config.
Let me know your thoughts.
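
For reference, here is a minimal sketch of how the percent config maps to a handler thread count, following the math described in the Javadoc above. This is not the actual TransportConf code; the class name, method parameters, and hard-coded values are illustrative only, and it mirrors the integer arithmetic shown in the diff:

```java
// Illustrative sketch only; not part of TransportConf.
public class ChunkFetchThreadsSketch {

  // serverThreads: value of io.serverThreads (0 means "unset, use 2 * #cores")
  // percent: value of spark.shuffle.server.chunkFetchHandlerThreadsPercent
  static int chunkFetchHandlerThreads(int serverThreads, int percent, int numCores) {
    int effectiveServerThreads = serverThreads > 0 ? serverThreads : 2 * numCores;
    return (effectiveServerThreads * percent) / 100;
  }

  public static void main(String[] args) {
    // 8 cores, io.serverThreads unset, percent = 10:
    // 10% of 2 * 8 = 1.6, truncated to 1 handler thread by integer division
    System.out.println(chunkFetchHandlerThreads(0, 10, 8));   // prints 1
    // io.serverThreads = 40, percent = 25: 25% of 40 = 10 handler threads
    System.out.println(chunkFetchHandlerThreads(40, 25, 8));  // prints 10
  }
}
```

One thing the docs could also call out: with plain integer division, a small percentage on a machine with few cores can round down to zero handler threads, so it may be worth mentioning how that case behaves.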
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]