This is an automated email from the ASF dual-hosted git repository.
zhouky pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-celeborn.git
The following commit(s) were added to refs/heads/main by this push:
new 42f08ca21 [CELEBORN-985] Change default value of numConnectionsPerPeer to 1
42f08ca21 is described below
commit 42f08ca21a73bc93b8368e51fa4a7b7a22604117
Author: sychen <[email protected]>
AuthorDate: Wed Sep 27 22:50:23 2023 +0800
[CELEBORN-985] Change default value of numConnectionsPerPeer to 1
### What changes were proposed in this pull request?
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
Closes #1943 from cxzl25/CELEBORN-985.
Authored-by: sychen <[email protected]>
Signed-off-by: zky.zhoukeyong <[email protected]>
---
common/src/main/scala/org/apache/celeborn/common/CelebornConf.scala | 2 +-
docs/configuration/network.md | 2 +-
docs/migration.md | 2 ++
3 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/common/src/main/scala/org/apache/celeborn/common/CelebornConf.scala b/common/src/main/scala/org/apache/celeborn/common/CelebornConf.scala
index 162e6e179..078a9fe88 100644
--- a/common/src/main/scala/org/apache/celeborn/common/CelebornConf.scala
+++ b/common/src/main/scala/org/apache/celeborn/common/CelebornConf.scala
@@ -1369,7 +1369,7 @@ object CelebornConf extends Logging {
.categories("network")
.doc("Number of concurrent connections between two nodes.")
.intConf
- .createWithDefault(2)
+ .createWithDefault(1)
val NETWORK_IO_BACKLOG: ConfigEntry[Int] =
buildConf("celeborn.<module>.io.backLog")
diff --git a/docs/configuration/network.md b/docs/configuration/network.md
index 2d28b81f7..225c9de7b 100644
--- a/docs/configuration/network.md
+++ b/docs/configuration/network.md
@@ -30,7 +30,7 @@ license: |
| celeborn.<module>.io.lazyFD | true | Whether to initialize FileDescriptor lazily or not. If true, file descriptors are created only when data is going to be transferred. This can reduce the number of open files. | |
| celeborn.<module>.io.maxRetries | 3 | Max number of times we will try IO exceptions (such as connection timeouts) per request. If set to 0, we will not do any retries. | |
| celeborn.<module>.io.mode | NIO | Netty EventLoopGroup backend, available options: NIO, EPOLL. | |
-| celeborn.<module>.io.numConnectionsPerPeer | 2 | Number of concurrent connections between two nodes. | |
+| celeborn.<module>.io.numConnectionsPerPeer | 1 | Number of concurrent connections between two nodes. | |
| celeborn.<module>.io.preferDirectBufs | true | If true, we will prefer allocating off-heap byte buffers within Netty. | |
| celeborn.<module>.io.receiveBuffer | 0b | Receive buffer size (SO_RCVBUF). Note: the optimal size for receive buffer and send buffer should be latency * network_bandwidth. Assuming latency = 1ms, network_bandwidth = 10Gbps buffer size should be ~ 1.25MB. | 0.2.0 |
| celeborn.<module>.io.retryWait | 5s | Time that we will wait in order to perform a retry after an IOException. Only relevant if maxIORetries > 0. | 0.2.0 |
diff --git a/docs/migration.md b/docs/migration.md
index 280b52fe0..4e58086ea 100644
--- a/docs/migration.md
+++ b/docs/migration.md
@@ -28,6 +28,8 @@ license: |
- Since 0.4.0, Celeborn won't support `org.apache.spark.shuffle.celeborn.RssShuffleManager`.
+- Since 0.4.0, Celeborn changed the default value of `celeborn.<module>.io.numConnectionsPerPeer` from `2` to `1`.
+
## Upgrading from 0.3.1 to 0.3.2
- Since 0.3.2, Celeborn changed the default value of `celeborn.worker.monitor.disk.check.interval` from `60` to `30`.