This is an automated email from the ASF dual-hosted git repository.
zhouky pushed a commit to branch branch-0.3
in repository https://gitbox.apache.org/repos/asf/incubator-celeborn.git
The following commit(s) were added to refs/heads/branch-0.3 by this push:
new 37d29f76e [CELEBORN-982] Improve RPC bind port tips
37d29f76e is described below
commit 37d29f76e2677ff6745be92dd3d9570b0c021cf7
Author: sychen <[email protected]>
AuthorDate: Mon Sep 18 16:00:22 2023 +0800
[CELEBORN-982] Improve RPC bind port tips
### What changes were proposed in this pull request?
Current output:
```
23/09/18 11:35:07,506 WARN [main] Utils: Service 'MasterSys' could not bind
on port 9097. Attempting port 9098.
23/09/18 11:35:07,506 INFO [main] NettyRpcEnvFactory: Starting RPC Server
[MasterSys] on clb-master:9098 with advisor endpoint clb-master:9098
Exception in thread "main" java.net.BindException: Address already in use:
Service 'MasterSys' failed after 1 retries (starting from 9097)! Consider
explicitly setting the appropriate port for the service 'MasterSys' (for
example spark.ui.port for SparkUI) to an available port or increasing
spark.port.maxRetries.
at sun.nio.ch.Net.bind0(Native Method)
```
Output with this PR:
```
23/09/18 11:43:03,157 WARN [main] Utils: Service 'MasterSys' could not bind
on port 9097. Attempting port 9098.
23/09/18 11:43:03,157 INFO [main] NettyRpcEnvFactory: Starting RPC Server
[MasterSys] on clb-master:9098 with advisor endpoint clb-master:9098
Exception in thread "main" java.net.BindException: Address already in use:
Service 'MasterSys' failed after 1 retries (starting from 9097)! Consider
explicitly setting the appropriate port for the service 'MasterSys' to an
available port or increasing celeborn.port.maxRetries.
at sun.nio.ch.Net.bind0(Native Method)
```
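The change in message construction can be sketched roughly as follows. This is a minimal standalone illustration, not the actual `Utils.scala` code: `ConfigEntry`, `BindMessageSketch`, `PortMaxRetry`, and `bindFailureMessage` are hypothetical names, and the `startPort == 0` branch condition and the way `serviceString` is derived are assumptions. Only the message text and the key `celeborn.port.maxRetries` are taken from the output above.

```scala
import java.net.BindException

// Hypothetical stand-in for a config entry; only the key matters for this sketch.
final case class ConfigEntry(key: String)

object BindMessageSketch {
  // Key copied from the log output above; the real constant is
  // CelebornConf.PORT_MAX_RETRY.
  val PortMaxRetry: ConfigEntry = ConfigEntry("celeborn.port.maxRetries")

  def bindFailureMessage(
      cause: String,
      serviceName: String,
      startPort: Int,
      maxRetries: Int): String = {
    // Assumption: the service name is quoted only when it is non-empty.
    val serviceString = if (serviceName.isEmpty) "" else s" '$serviceName'"
    if (startPort == 0) {
      // Assumption: port 0 means "bind to a random free port".
      s"$cause: Service$serviceString failed after " +
        s"$maxRetries retries (on a random free port)! " +
        "Consider explicitly setting the appropriate binding address for " +
        s"the service$serviceString to the correct binding address."
    } else {
      // The hint names the config key itself instead of a hard-coded Spark key.
      s"$cause: Service$serviceString failed after " +
        s"$maxRetries retries (starting from $startPort)! Consider explicitly setting " +
        s"the appropriate port for the service$serviceString to an available port " +
        s"or increasing ${PortMaxRetry.key}."
    }
  }

  def main(args: Array[String]): Unit = {
    val msg = bindFailureMessage("Address already in use", "MasterSys", 9097, 1)
    // Wrap the message in a BindException, as the patched code does.
    println(new BindException(msg).getMessage)
  }
}
```

Interpolating the key from the config entry keeps the hint in sync if the key is ever renamed, which a hard-coded string such as `spark.port.maxRetries` cannot do.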
### Why are the changes needed?
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
Closes #1918 from cxzl25/CELEBORN-982.
Authored-by: sychen <[email protected]>
Signed-off-by: zky.zhoukeyong <[email protected]>
(cherry picked from commit fbeb5a62ecf2451f15c8b05491c1a6ef2e46a106)
Signed-off-by: zky.zhoukeyong <[email protected]>
---
common/src/main/scala/org/apache/celeborn/common/util/Utils.scala | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/common/src/main/scala/org/apache/celeborn/common/util/Utils.scala b/common/src/main/scala/org/apache/celeborn/common/util/Utils.scala
index 2d55bd227..18956f34b 100644
--- a/common/src/main/scala/org/apache/celeborn/common/util/Utils.scala
+++ b/common/src/main/scala/org/apache/celeborn/common/util/Utils.scala
@@ -42,6 +42,7 @@ import org.apache.hadoop.fs.{FSDataInputStream, Path}
import org.roaringbitmap.RoaringBitmap
import org.apache.celeborn.common.CelebornConf
+import org.apache.celeborn.common.CelebornConf.PORT_MAX_RETRY
import org.apache.celeborn.common.exception.CelebornException
import org.apache.celeborn.common.internal.Logging
import org.apache.celeborn.common.meta.{DiskStatus, FileInfo, WorkerInfo}
@@ -274,13 +275,12 @@ object Utils extends Logging {
s"${e.getMessage}: Service$serviceString failed after " +
s"$maxRetries retries (on a random free port)! " +
s"Consider explicitly setting the appropriate binding address for " +
- s"the service$serviceString (for example spark.driver.bindAddress " +
- s"for SparkDriver) to the correct binding address."
+ s"the service$serviceString to the correct binding address."
} else {
s"${e.getMessage}: Service$serviceString failed after " +
s"$maxRetries retries (starting from $startPort)! Consider explicitly setting " +
- s"the appropriate port for the service$serviceString (for example spark.ui.port " +
- s"for SparkUI) to an available port or increasing spark.port.maxRetries."
+ s"the appropriate port for the service$serviceString to an available port " +
+ s"or increasing ${PORT_MAX_RETRY.key}."
}
val exception = new BindException(exceptionMessage)
// restore original stack trace