GitHub user pwendell commented on a diff in the pull request:
https://github.com/apache/spark/pull/3713#discussion_r21990371
--- Diff: docs/configuration.md ---
@@ -852,6 +852,59 @@ Apart from these, the following properties are also available, and may be useful
between nodes leading to flooding the network with those.
</td>
</tr>
+<tr>
+ <td><code>spark.shuffle.io.preferDirectBufs</code></td>
+ <td>true</td>
+ <td>
+ (Netty only) Off-heap buffers are used to reduce garbage collection during shuffle and cache
+ block transfer. For environments where off-heap memory is tightly limited, users may wish to
+ turn this off to force all allocations from Netty to be on-heap.
+ </td>
+</tr>
+<tr>
+ <td><code>spark.shuffle.io.numConnectionsPerPeer</code></td>
+ <td>1</td>
+ <td>
+ (Netty only) Connections between hosts are reused in order to reduce connection buildup for
+ large clusters. For small clusters with many hard disks, this may result in insufficient
+ concurrency to saturate all disks, and so users may consider increasing this value.
+ </td>
+</tr>
+<tr>
+ <td><code>spark.shuffle.io.serverThreads</code></td>
--- End diff --
Or at least, could you give some indication of exactly why they might
increase it (e.g. if you find you can't saturate the network)?
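
For illustration only, here is a rough sketch of how the two properties documented in this diff might be overridden in `conf/spark-defaults.conf`. It assumes a small cluster with many disks where shuffle fetches are not saturating the disks or the network, and an environment where off-heap memory is tightly limited. The value 4 is an arbitrary placeholder, not a tested recommendation.

```
# Illustrative overrides only; the defaults in the diff are 1 and true.
# Open more connections per peer to raise fetch concurrency (assumed value).
spark.shuffle.io.numConnectionsPerPeer   4
# Force Netty allocations on-heap where off-heap memory is constrained.
spark.shuffle.io.preferDirectBufs        false
```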