This is an automated email from the ASF dual-hosted git repository.

nicholasjiang pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/celeborn.git


The following commit(s) were added to refs/heads/main by this push:
     new 16762c659 [CELEBORN-1774][FOLLOWUP] Change celeborn.<module>.io.mode optional to explain default behavior in description
16762c659 is described below

commit 16762c659c88837a534f0010415ecaef5b5bdc31
Author: SteNicholas <[email protected]>
AuthorDate: Thu Jan 2 21:15:19 2025 +0800

    [CELEBORN-1774][FOLLOWUP] Change celeborn.<module>.io.mode optional to explain default behavior in description
    
    ### What changes were proposed in this pull request?
    
    Make `celeborn.<module>.io.mode` an optional config entry and explain its default behavior in the description.
    
    ### Why are the changes needed?
    
    The default value of `celeborn.<module>.io.mode` shown in the documentation can vary depending on whether epoll mode is available on the operating system. Therefore, `celeborn.<module>.io.mode` should be made optional, with the default behavior explained in the option's description.
    
    Follow-up to https://github.com/apache/celeborn/pull/3039#discussion_r1899340272.
    
    ### Does this PR introduce _any_ user-facing change?
    
    `celeborn.<module>.io.mode` is now optional, and its description explains the default behavior.
    
    ### How was this patch tested?
    
    CI.
    
    Closes #3044 from SteNicholas/CELEBORN-1774.
    
    Authored-by: SteNicholas <[email protected]>
    Signed-off-by: SteNicholas <[email protected]>
---
 .../src/main/scala/org/apache/celeborn/common/CelebornConf.scala | 9 +++++----
 docs/configuration/network.md                                    | 2 +-
 docs/migration.md                                                | 2 +-
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/common/src/main/scala/org/apache/celeborn/common/CelebornConf.scala b/common/src/main/scala/org/apache/celeborn/common/CelebornConf.scala
index 7280a1cde..80d31d747 100644
--- a/common/src/main/scala/org/apache/celeborn/common/CelebornConf.scala
+++ b/common/src/main/scala/org/apache/celeborn/common/CelebornConf.scala
@@ -539,7 +539,9 @@ class CelebornConf(loadDefaults: Boolean) extends Cloneable with Logging with Se
   def rpcDumpIntervalMs(): Long = get(RPC_SUMMARY_DUMP_INTERVAL)
 
   def networkIoMode(module: String): String = {
-    getTransportConf(module, NETWORK_IO_MODE)
+    get(
+      NETWORK_IO_MODE.key.replace("<module>", module),
+      if (Epoll.isAvailable) IOMode.EPOLL.name() else IOMode.NIO.name())
   }
 
   def networkIoPreferDirectBufs(module: String): Boolean = {
@@ -1931,15 +1933,14 @@ object CelebornConf extends Logging {
       .timeConf(TimeUnit.MILLISECONDS)
       .createWithDefaultString("60s")
 
-  val NETWORK_IO_MODE: ConfigEntry[String] =
+  val NETWORK_IO_MODE: OptionalConfigEntry[String] =
     buildConf("celeborn.<module>.io.mode")
       .categories("network")
       .doc("Netty EventLoopGroup backend, available options: NIO, EPOLL. If epoll mode is available, the default IO mode is EPOLL; otherwise, the default is NIO.")
       .stringConf
       .transform(_.toUpperCase)
       .checkValues(Set(IOMode.NIO.name(), IOMode.EPOLL.name()))
-      .createWithDefaultFunction(() =>
-        if (Epoll.isAvailable) IOMode.EPOLL.name() else IOMode.NIO.name())
+      .createOptional
 
   val NETWORK_IO_PREFER_DIRECT_BUFS: ConfigEntry[Boolean] =
     buildConf("celeborn.<module>.io.preferDirectBufs")
diff --git a/docs/configuration/network.md b/docs/configuration/network.md
index f690d205e..4a5f8c943 100644
--- a/docs/configuration/network.md
+++ b/docs/configuration/network.md
@@ -29,7 +29,7 @@ license: |
 | celeborn.&lt;module&gt;.io.enableVerboseMetrics | false | false | Whether to track Netty memory detailed metrics. If true, the detailed metrics of Netty PoolByteBufAllocator will be gotten, otherwise only general memory usage will be tracked. |  |  | 
 | celeborn.&lt;module&gt;.io.lazyFD | true | false | Whether to initialize FileDescriptor lazily or not. If true, file descriptors are created only when data is going to be transferred. This can reduce the number of open files. If setting <module> to `fetch`, it works for worker fetch server. |  |  | 
 | celeborn.&lt;module&gt;.io.maxRetries | 3 | false | Max number of times we will try IO exceptions (such as connection timeouts) per request. If set to 0, we will not do any retries. If setting <module> to `data`, it works for shuffle client push and fetch data. If setting <module> to `replicate`, it works for replicate client of worker replicating data to peer worker. If setting <module> to `push`, it works for Flink shuffle client push data. |  |  | 
-| celeborn.&lt;module&gt;.io.mode | EPOLL | false | Netty EventLoopGroup backend, available options: NIO, EPOLL. If epoll mode is available, the default IO mode is EPOLL; otherwise, the default is NIO. |  |  | 
+| celeborn.&lt;module&gt;.io.mode | &lt;undefined&gt; | false | Netty EventLoopGroup backend, available options: NIO, EPOLL. If epoll mode is available, the default IO mode is EPOLL; otherwise, the default is NIO. |  |  | 
 | celeborn.&lt;module&gt;.io.numConnectionsPerPeer | 1 | false | Number of concurrent connections between two nodes. If setting <module> to `rpc_app`, works for shuffle client. If setting <module> to `rpc_service`, works for master or worker. If setting <module> to `data`, it works for shuffle client push and fetch data. If setting <module> to `replicate`, it works for replicate client of worker replicating data to peer worker. |  |  | 
 | celeborn.&lt;module&gt;.io.preferDirectBufs | true | false | If true, we will prefer allocating off-heap byte buffers within Netty. If setting <module> to `rpc_app`, works for shuffle client. If setting <module> to `rpc_service`, works for master or worker. If setting <module> to `data`, it works for shuffle client push and fetch data. If setting <module> to `push`, it works for worker receiving push data. If setting <module> to `replicate`, it works for replicate server or client of w [...]
 | celeborn.&lt;module&gt;.io.receiveBuffer | 0b | false | Receive buffer size (SO_RCVBUF). Note: the optimal size for receive buffer and send buffer should be latency * network_bandwidth. Assuming latency = 1ms, network_bandwidth = 10Gbps buffer size should be ~ 1.25MB. If setting <module> to `rpc_app`, works for shuffle client. If setting <module> to `rpc_service`, works for master or worker. If setting <module> to `data`, it works for shuffle client push and fetch data. If setting <mod [...]
diff --git a/docs/migration.md b/docs/migration.md
index e9ceb8c03..800655e5e 100644
--- a/docs/migration.md
+++ b/docs/migration.md
@@ -31,7 +31,7 @@ license: |
 
 - Since 0.6.0, Celeborn changed the default value of `celeborn.client.spark.fetch.throwsFetchFailure` from `false` to `true`, which means Celeborn will enable spark stage rerun at default.
 
-- Since 0.6.0, Celeborn changed the default value of `celeborn.<module>.io.mode` from `NIO` to `EPOLL` if epoll mode is available, falling back to `NIO` otherwise.
+- Since 0.6.0, Celeborn changed `celeborn.<module>.io.mode` optional, of which the default value changed from `NIO` to `EPOLL` if epoll mode is available, falling back to `NIO` otherwise.
 
 - Since 0.6.0, Celeborn has introduced a new RESTful API namespace: /api/v1, which uses the application/json media type for requests and responses.
    The `celeborn-openapi-client` SDK is also available to help users interact with the new RESTful APIs.
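
The fallback that this commit moves from the config entry into `networkIoMode` can be sketched in isolation as follows. This is a minimal illustration under stated assumptions, not the Celeborn implementation: `IoModeSketch`, the `settings` map, and the `epollAvailable` flag are hypothetical stand-ins for `CelebornConf` and Netty's `Epoll.isAvailable`. The upper-casing and value check mirror the `transform` and `checkValues` declared on the entry; whether the real `get(key, default)` path applies them is not shown in this diff.

```scala
// Sketch only: a plain Map stands in for CelebornConf, and the
// epollAvailable flag stands in for Netty's Epoll.isAvailable.
object IoModeSketch {
  private val KeyTemplate = "celeborn.<module>.io.mode"
  private val Allowed = Set("NIO", "EPOLL")

  def networkIoMode(
      settings: Map[String, String],
      module: String,
      epollAvailable: Boolean): String = {
    // Resolve the per-module key, e.g. celeborn.data.io.mode.
    val key = KeyTemplate.replace("<module>", module)
    settings.get(key) match {
      case Some(raw) =>
        // An explicit setting wins; normalize and validate it.
        val mode = raw.toUpperCase(java.util.Locale.ROOT)
        require(Allowed.contains(mode), s"$key must be one of $Allowed, got $raw")
        mode
      case None =>
        // Entry is unset: decide at call time from the platform.
        if (epollAvailable) "EPOLL" else "NIO"
    }
  }
}
```

Because the entry itself no longer carries a default, generated documentation can show `<undefined>` regardless of the build machine, while the effective mode is still chosen at runtime.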
