This is an automated email from the ASF dual-hosted git repository.
zhouky pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/incubator-celeborn.git
The following commit(s) were added to refs/heads/main by this push:
new 107f3df8b [CELEBORN-979] Reduce default disk Check Interval
107f3df8b is described below
commit 107f3df8ba79036ccaec54d0e095d69643570d02
Author: jiaoqingbo <[email protected]>
AuthorDate: Mon Sep 18 14:54:22 2023 +0800
[CELEBORN-979] Reduce default disk Check Interval
### What changes were proposed in this pull request?
Reduce the default value of `celeborn.worker.monitor.disk.check.interval` from `60s` to `30s`.
### Why are the changes needed?
Since https://github.com/apache/incubator-celeborn/pull/1909, the
`PushDataHandler#checkDiskFull` method also checks the `DiskInfo` status, so the
default disk check interval should be reduced.
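
A deployment that prefers the previous behavior could pin the old value explicitly. The fragment below is a minimal sketch, assuming the worker reads its settings from `conf/celeborn-defaults.conf` in the usual whitespace-separated key/value form; only the key and the two values come from this change.

```properties
# Assumed location: conf/celeborn-defaults.conf on each worker.
# Restores the pre-change device monitor check interval (new default is 30s).
celeborn.worker.monitor.disk.check.interval 60s
```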
### Does this PR introduce _any_ user-facing change?
NO
### How was this patch tested?
Passed GA (GitHub Actions).
Closes #1915 from jiaoqingbo/979.
Authored-by: jiaoqingbo <[email protected]>
Signed-off-by: zky.zhoukeyong <[email protected]>
---
common/src/main/scala/org/apache/celeborn/common/CelebornConf.scala | 2 +-
docs/configuration/worker.md | 2 +-
docs/migration.md | 4 ++++
3 files changed, 6 insertions(+), 2 deletions(-)
diff --git a/common/src/main/scala/org/apache/celeborn/common/CelebornConf.scala b/common/src/main/scala/org/apache/celeborn/common/CelebornConf.scala
index 1534a6921..ecf35f67a 100644
--- a/common/src/main/scala/org/apache/celeborn/common/CelebornConf.scala
+++ b/common/src/main/scala/org/apache/celeborn/common/CelebornConf.scala
@@ -2406,7 +2406,7 @@ object CelebornConf extends Logging {
.version("0.3.0")
.doc("Intervals between device monitor to check disk.")
.timeConf(TimeUnit.MILLISECONDS)
- .createWithDefaultString("60s")
+ .createWithDefaultString("30s")
val WORKER_DISK_MONITOR_SYS_BLOCK_DIR: ConfigEntry[String] =
buildConf("celeborn.worker.monitor.disk.sys.block.dir")
diff --git a/docs/configuration/worker.md b/docs/configuration/worker.md
index 8b05d5ad7..52719bf66 100644
--- a/docs/configuration/worker.md
+++ b/docs/configuration/worker.md
@@ -60,7 +60,7 @@ license: |
| celeborn.worker.graceful.shutdown.saveCommittedFileInfo.interval | 5s | Interval for a Celeborn worker to flush committed file infos into Level DB. | 0.3.1 |
| celeborn.worker.graceful.shutdown.saveCommittedFileInfo.sync | false | Whether to call sync method to save committed file infos into Level DB to handle OS crash. | 0.3.1 |
| celeborn.worker.graceful.shutdown.timeout | 600s | The worker's graceful shutdown timeout time. | 0.2.0 |
-| celeborn.worker.monitor.disk.check.interval | 60s | Intervals between device monitor to check disk. | 0.3.0 |
+| celeborn.worker.monitor.disk.check.interval | 30s | Intervals between device monitor to check disk. | 0.3.0 |
| celeborn.worker.monitor.disk.check.timeout | 30s | Timeout time for worker check device status. | 0.3.0 |
| celeborn.worker.monitor.disk.checklist | readwrite,diskusage | Monitor type for disk, available items are: iohang, readwrite and diskusage. | 0.2.0 |
| celeborn.worker.monitor.disk.enabled | true | When true, worker will monitor device and report to master. | 0.3.0 |
diff --git a/docs/migration.md b/docs/migration.md
index a76c34301..280b52fe0 100644
--- a/docs/migration.md
+++ b/docs/migration.md
@@ -28,6 +28,10 @@ license: |
- Since 0.4.0, Celeborn won't support `org.apache.spark.shuffle.celeborn.RssShuffleManager`.
+## Upgrading from 0.3.1 to 0.3.2
+
+- Since 0.3.2, Celeborn changed the default value of `celeborn.worker.monitor.disk.check.interval` from `60s` to `30s`.
+
## Upgrading from 0.3.0 to 0.3.1
- Since 0.3.1, Celeborn changed the default value of `celeborn.worker.directMemoryRatioToResume` from `0.5` to `0.7`.