vanzin commented on a change in pull request #23560: [SPARK-26632][Core] Separate Thread Configurations of Driver and Executor
URL: https://github.com/apache/spark/pull/23560#discussion_r281297348
 
 

 ##########
 File path: docs/configuration.md
 ##########
 @@ -1954,6 +1954,46 @@ Apart from these, the following properties are also available, and may be useful
 </tr>
 </table>
 
+### Thread Configurations
+
+Depending on jobs and cluster configurations, we can set number of threads in several places in Spark to utilize
+available resources efficiently to get better performance. Prior to Spark 3.0, these thread configurations apply
+to all roles of Spark, such as driver, executor, worker and master. From Spark 3.0, we can configure threads in
+finer granularity starting from driver and executor. Take RPC module as example in below table. For other modules,
+like shuffle, just replace "rpc" with "shuffle" in the property names except
+<code>spark.{driver|executor}.rpc.netty.dispatcher.numThreads</code>, which is only for RPC module.
+
+<table class="table">
+<tr><th>Property Name</th><th>Default</th><th>Meaning</th></tr>
+<tr>
+  <td><code>spark.{driver|executor}.rpc.io.serverThreads</code></td>
+  <td>
+    Fall back on spark.rpc.io.serverThreads
+  </td>
+  <td>Number of threads used in the server thread pool</td>
+</tr>
+<tr>
+  <td><code>spark.{driver|executor}.rpc.io.clientThreads</code></td>
+  <td>
+    Fall back on spark.rpc.io.clientThreads
+  </td>
+  <td>Number of threads used in the client thread pool</td>
+</tr>
+<tr>
+  <td><code>spark.{driver|executor}.rpc.netty.dispatcher.numThreads</code></td>
+  <td>
+    Fall back on spark.rpc.netty.dispatcher.numThreads
+  </td>
+  <td>Number of threads used in RPC message dispatcher thread pool</td>
+</tr>
+</table>
+
+The default values of spark.rpc.io.serverThreads, spark.rpc.io.clientThreads and spark.rpc.netty.dispatcher.numThreads
+are same. It's <br>
+number of CPU cores if specified. Otherwise, the available processors to the JVM. In either cases, the default value
 
 Review comment:
   This whole paragraph is a little hard to read, and it shouldn't reference private APIs in Spark. Better wording:
   
   ```
   The default value for the thread-count config keys is the number of cores requested for the driver or executor or, in the absence of that value, the number of cores available to the JVM, capped at a hardcoded upper limit of 8.
   ```
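
The rule in the suggested wording can be sketched roughly as follows. This is only a Python illustration of the described behavior, not Spark's actual code: the function name `default_thread_count` and the constant `MAX_DEFAULT_THREADS` are hypothetical, and `os.cpu_count()` stands in for the number of cores available to the JVM.

```python
import os
from typing import Optional

# Assumption: mirrors the "hardcoded upper limit of 8" from the wording above.
MAX_DEFAULT_THREADS = 8


def default_thread_count(requested_cores: Optional[int] = None) -> int:
    """Return the default thread count as described in the suggested wording.

    requested_cores is the number of cores requested for the driver or
    executor; None or a non-positive value means "not specified".
    """
    if requested_cores is not None and requested_cores > 0:
        cores = requested_cores
    else:
        # Stand-in for the cores available to the JVM (hypothetical analogue).
        cores = os.cpu_count() or 1
    # Apply the upper limit in either case.
    return min(cores, MAX_DEFAULT_THREADS)
```

Under this sketch, a request for 4 cores yields 4 threads, a request for 16 cores is capped at 8, and an unspecified request falls back to the machine's core count, also capped at 8.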
   

