This is an automated email from the ASF dual-hosted git repository.
rexxiong pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/celeborn.git
The following commit(s) were added to refs/heads/main by this push:
new 2a57fab86 [CELEBORN-1400] Bump Ratis version from 2.5.1 to 3.0.1
2a57fab86 is described below
commit 2a57fab8698a93049021caf41169c9df2f18bb0e
Author: SteNicholas <[email protected]>
AuthorDate: Thu May 30 17:22:22 2024 +0800
[CELEBORN-1400] Bump Ratis version from 2.5.1 to 3.0.1
### What changes were proposed in this pull request?
Bump Ratis version from 2.5.1 to 3.0.1. Address incompatible changes:
- RATIS-589. Eliminate buffer copying in
SegmentedRaftLogOutputStream.(https://github.com/apache/ratis/pull/964)
- RATIS-1677. Do not auto format RaftStorage in
RECOVER.(https://github.com/apache/ratis/pull/718)
- RATIS-1710. Refactor metrics api and implementation to separated modules.
(https://github.com/apache/ratis/pull/749)
### Why are the changes needed?
Bump Ratis version from 2.5.1 to 3.0.1. Ratis has released v3.0.0, v3.0.1,
which release note refers to [3.0.0](https://ratis.apache.org/post/3.0.0.html),
[3.0.1](https://ratis.apache.org/post/3.0.1.html). The 3.0.x version include
new features like pluggable metrics and lease read, etc, some improvements and
bugfixes including:
- 3.0.0: Change list of ratis 3.0.0 In total, there are roughly 100 commits
diffing from 2.5.1 including:
- Incompatible Changes
- RaftStorage Auto-Format
- RATIS-1677. Do not auto format RaftStorage in RECOVER.
(https://github.com/apache/ratis/pull/718)
- RATIS-1694. Fix the compatibility issue of RATIS-1677.
(https://github.com/apache/ratis/pull/731)
- RATIS-1871. Auto format RaftStorage when there is only one
directory configured. (https://github.com/apache/ratis/pull/903)
- Pluggable Ratis-Metrics (RATIS-1688)
- RATIS-1689. Remove the use of the thirdparty Gauge.
(https://github.com/apache/ratis/pull/728)
- RATIS-1692. Remove the use of the thirdparty Counter.
(https://github.com/apache/ratis/pull/732)
- RATIS-1693. Remove the use of the thirdparty Timer.
(https://github.com/apache/ratis/pull/734)
- RATIS-1703. Move MetricsReporting and JvmMetrics to impl.
(https://github.com/apache/ratis/pull/741)
- RATIS-1704. Fix SuppressWarnings(“VisibilityModifier”) in
RatisMetrics. (https://github.com/apache/ratis/pull/742)
- RATIS-1710. Refactor metrics api and implementation to separated
modules. (https://github.com/apache/ratis/pull/749)
- RATIS-1712. Add a dropwizard 3 implementation of ratis-metrics-api.
(https://github.com/apache/ratis/pull/751)
- RATIS-1391. Update library dropwizard.metrics version to 4.x
(https://github.com/apache/ratis/pull/632)
- RATIS-1601. Use the shaded dropwizard metrics and remove the
dependency (https://github.com/apache/ratis/pull/671)
- Streaming Protocol Change
- RATIS-1569. Move the asyncRpcApi.sendForward(..) call to the client
side. (https://github.com/apache/ratis/pull/635)
- New Features
- Leader Lease (RATIS-1864)
- RATIS-1865. Add leader lease bound ratio configuration
(https://github.com/apache/ratis/pull/897)
- RATIS-1866. Maintain leader lease after AppendEntries
(https://github.com/apache/ratis/pull/898)
- RATIS-1894. Implement ReadOnly based on leader lease
(https://github.com/apache/ratis/pull/925)
- RATIS-1882. Support read-after-write consistency
(https://github.com/apache/ratis/pull/913)
- StateMachine API
- RATIS-1874. Add notifyLeaderReady function in IStateMachine
(https://github.com/apache/ratis/pull/906)
- RATIS-1897. Make TransactionContext available in DataApi.write(..).
(https://github.com/apache/ratis/pull/930)
- New Configuration Properties
- RATIS-1862. Add the parameter whether to take Snapshot when
stopping to adapt to different services
(https://github.com/apache/ratis/pull/896)
- RATIS-1930. Add a conf for enable/disable majority-add.
(https://github.com/apache/ratis/pull/961)
- RATIS-1918. Introduces parameters that separately control the
shutdown of RaftServerProxy by JVMPauseMonitor.
(https://github.com/apache/ratis/pull/950)
- RATIS-1636. Support re-config ratis properties
(https://github.com/apache/ratis/pull/800)
- RATIS-1860. Add ratis-shell cmd to generate a new raft-meta.conf.
(https://github.com/apache/ratis/pull/901)
- Improvements & Bug Fixes
- Netty
- RATIS-1898. Netty should use EpollEventLoopGroup by default
(https://github.com/apache/ratis/pull/931)
- RATIS-1899. Use EpollEventLoopGroup for Netty Proxies
(https://github.com/apache/ratis/pull/932)
- RATIS-1921. Shared worker group in WorkerGroupGetter should be
closed. (https://github.com/apache/ratis/pull/955)
- RATIS-1923. Netty: atomic operations require side-effect-free
functions. (https://github.com/apache/ratis/pull/956)
- RaftServer
- RATIS-1924. Increase the default of
raft.server.log.segment.size.max. (https://github.com/apache/ratis/pull/957)
- RATIS-1892. Unify the lifetime of the RaftServerProxy thread
pool (https://github.com/apache/ratis/pull/923)
- RATIS-1889. NoSuchMethodError:
RaftServerMetricsImpl.addNumPendingRequestsGauge
https://github.com/apache/ratis/pull/922
(https://github.com/apache/ratis/pull/922)
- RATIS-761. Handle writeStateMachineData failure in leader.
(https://github.com/apache/ratis/pull/927)
- RATIS-1902. The snapshot index is set incorrectly in
InstallSnapshotReplyProto. (https://github.com/apache/ratis/pull/933)
- RATIS-1912. Fix infinity election when perform membership
change. (https://github.com/apache/ratis/pull/954)
- RATIS-1858. Follower keeps logging first election timeout.
(https://github.com/apache/ratis/pull/894)
- 3.0.1:This is a bugfix release. See the [changes between 3.0.0 and
3.0.1](https://github.com/apache/ratis/compare/ratis-3.0.0...ratis-3.0.1)
releases.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Cluster manual test.
Closes #2480 from SteNicholas/CELEBORN-1400.
Authored-by: SteNicholas <[email protected]>
Signed-off-by: Shuang <[email protected]>
---
.../org/apache/celeborn/common/CelebornConf.scala | 20 ++++++++++++++++++++
dev/deps/dependencies-client-flink-1.14 | 8 ++++----
dev/deps/dependencies-client-flink-1.15 | 8 ++++----
dev/deps/dependencies-client-flink-1.17 | 8 ++++----
dev/deps/dependencies-client-flink-1.18 | 8 ++++----
dev/deps/dependencies-client-flink-1.19 | 8 ++++----
dev/deps/dependencies-client-mr | 8 ++++----
dev/deps/dependencies-client-spark-2.4 | 8 ++++----
dev/deps/dependencies-client-spark-3.0 | 8 ++++----
dev/deps/dependencies-client-spark-3.1 | 8 ++++----
dev/deps/dependencies-client-spark-3.2 | 8 ++++----
dev/deps/dependencies-client-spark-3.3 | 8 ++++----
dev/deps/dependencies-client-spark-3.4 | 8 ++++----
dev/deps/dependencies-client-spark-3.5 | 8 ++++----
dev/deps/dependencies-server | 20 ++++++++++----------
docs/configuration/ha.md | 3 ++-
master/pom.xml | 4 ++++
.../deploy/master/clustermeta/ha/HARaftServer.java | 19 +++++++++++++++++++
pom.xml | 7 ++++++-
project/CelebornBuild.scala | 4 +++-
20 files changed, 116 insertions(+), 65 deletions(-)
diff --git
a/common/src/main/scala/org/apache/celeborn/common/CelebornConf.scala
b/common/src/main/scala/org/apache/celeborn/common/CelebornConf.scala
index 36635e84e..e4e7b9026 100644
--- a/common/src/main/scala/org/apache/celeborn/common/CelebornConf.scala
+++ b/common/src/main/scala/org/apache/celeborn/common/CelebornConf.scala
@@ -743,8 +743,10 @@ class CelebornConf(loadDefaults: Boolean) extends
Cloneable with Logging with Se
def haMasterRatisRpcType: String = get(HA_MASTER_RATIS_RPC_TYPE)
def haMasterRatisStorageDir: String = get(HA_MASTER_RATIS_STORAGE_DIR)
+ def haMasterRatisStorageStartupOption: String =
get(HA_MASTER_RATIS_STORAGE_STARTUP_OPTION)
def haMasterRatisLogSegmentSizeMax: Long =
get(HA_MASTER_RATIS_LOG_SEGMENT_SIZE_MAX)
def haMasterRatisLogPreallocatedSize: Long =
get(HA_MASTER_RATIS_LOG_PREALLOCATED_SIZE)
+ def haMasterRatisLogWriteBufferSize: Long =
get(HA_MASTER_RATIS_LOG_WRITE_BUFFER_SIZE)
def haMasterRatisLogAppenderQueueNumElements: Int =
get(HA_MASTER_RATIS_LOG_APPENDER_QUEUE_NUM_ELEMENTS)
def haMasterRatisLogAppenderQueueBytesLimit: Long =
@@ -2263,10 +2265,20 @@ object CelebornConf extends Logging {
buildConf("celeborn.master.ha.ratis.raft.server.storage.dir")
.withAlternative("celeborn.ha.master.ratis.raft.server.storage.dir")
.categories("ha")
+ .doc("Root storage directory to hold RaftServer data.")
.version("0.3.0")
.stringConf
.createWithDefault("/tmp/ratis")
+ val HA_MASTER_RATIS_STORAGE_STARTUP_OPTION: ConfigEntry[String] =
+ buildConf("celeborn.master.ha.ratis.raft.server.storage.startup.option")
+ .categories("ha")
+ .doc("Startup option of RaftServer storage. Available options: RECOVER,
FORMAT.")
+ .version("0.5.0")
+ .stringConf
+ .checkValues(Set("RECOVER", "FORMAT"))
+ .createWithDefault("RECOVER")
+
val HA_MASTER_RATIS_LOG_SEGMENT_SIZE_MAX: ConfigEntry[Long] =
buildConf("celeborn.master.ha.ratis.raft.server.log.segment.size.max")
.withAlternative("celeborn.ha.master.ratis.raft.server.log.segment.size.max")
@@ -2285,6 +2297,14 @@ object CelebornConf extends Logging {
.bytesConf(ByteUnit.BYTE)
.createWithDefaultString("4MB")
+ val HA_MASTER_RATIS_LOG_WRITE_BUFFER_SIZE: ConfigEntry[Long] =
+ buildConf("celeborn.master.ha.ratis.raft.server.log.write.buffer.size")
+ .internal
+ .categories("ha")
+ .version("0.5.0")
+ .bytesConf(ByteUnit.BYTE)
+ .createWithDefaultString("36MB")
+
val HA_MASTER_RATIS_LOG_APPENDER_QUEUE_NUM_ELEMENTS: ConfigEntry[Int] =
buildConf("celeborn.master.ha.ratis.raft.server.log.appender.buffer.element-limit")
.withAlternative("celeborn.ha.master.ratis.raft.server.log.appender.buffer.element-limit")
diff --git a/dev/deps/dependencies-client-flink-1.14
b/dev/deps/dependencies-client-flink-1.14
index 7a7f49b4a..f3ce797ac 100644
--- a/dev/deps/dependencies-client-flink-1.14
+++ b/dev/deps/dependencies-client-flink-1.14
@@ -73,10 +73,10 @@
netty-transport-udt/4.1.109.Final//netty-transport-udt-4.1.109.Final.jar
netty-transport/4.1.109.Final//netty-transport-4.1.109.Final.jar
paranamer/2.8//paranamer-2.8.jar
protobuf-java/3.21.7//protobuf-java-3.21.7.jar
-ratis-client/2.5.1//ratis-client-2.5.1.jar
-ratis-common/2.5.1//ratis-common-2.5.1.jar
-ratis-proto/2.5.1//ratis-proto-2.5.1.jar
-ratis-thirdparty-misc/1.0.4//ratis-thirdparty-misc-1.0.4.jar
+ratis-client/3.0.1//ratis-client-3.0.1.jar
+ratis-common/3.0.1//ratis-common-3.0.1.jar
+ratis-proto/3.0.1//ratis-proto-3.0.1.jar
+ratis-thirdparty-misc/1.0.5//ratis-thirdparty-misc-1.0.5.jar
scala-library/2.12.18//scala-library-2.12.18.jar
scala-reflect/2.12.18//scala-reflect-2.12.18.jar
slf4j-api/1.7.36//slf4j-api-1.7.36.jar
diff --git a/dev/deps/dependencies-client-flink-1.15
b/dev/deps/dependencies-client-flink-1.15
index 7a7f49b4a..f3ce797ac 100644
--- a/dev/deps/dependencies-client-flink-1.15
+++ b/dev/deps/dependencies-client-flink-1.15
@@ -73,10 +73,10 @@
netty-transport-udt/4.1.109.Final//netty-transport-udt-4.1.109.Final.jar
netty-transport/4.1.109.Final//netty-transport-4.1.109.Final.jar
paranamer/2.8//paranamer-2.8.jar
protobuf-java/3.21.7//protobuf-java-3.21.7.jar
-ratis-client/2.5.1//ratis-client-2.5.1.jar
-ratis-common/2.5.1//ratis-common-2.5.1.jar
-ratis-proto/2.5.1//ratis-proto-2.5.1.jar
-ratis-thirdparty-misc/1.0.4//ratis-thirdparty-misc-1.0.4.jar
+ratis-client/3.0.1//ratis-client-3.0.1.jar
+ratis-common/3.0.1//ratis-common-3.0.1.jar
+ratis-proto/3.0.1//ratis-proto-3.0.1.jar
+ratis-thirdparty-misc/1.0.5//ratis-thirdparty-misc-1.0.5.jar
scala-library/2.12.18//scala-library-2.12.18.jar
scala-reflect/2.12.18//scala-reflect-2.12.18.jar
slf4j-api/1.7.36//slf4j-api-1.7.36.jar
diff --git a/dev/deps/dependencies-client-flink-1.17
b/dev/deps/dependencies-client-flink-1.17
index 7a7f49b4a..f3ce797ac 100644
--- a/dev/deps/dependencies-client-flink-1.17
+++ b/dev/deps/dependencies-client-flink-1.17
@@ -73,10 +73,10 @@
netty-transport-udt/4.1.109.Final//netty-transport-udt-4.1.109.Final.jar
netty-transport/4.1.109.Final//netty-transport-4.1.109.Final.jar
paranamer/2.8//paranamer-2.8.jar
protobuf-java/3.21.7//protobuf-java-3.21.7.jar
-ratis-client/2.5.1//ratis-client-2.5.1.jar
-ratis-common/2.5.1//ratis-common-2.5.1.jar
-ratis-proto/2.5.1//ratis-proto-2.5.1.jar
-ratis-thirdparty-misc/1.0.4//ratis-thirdparty-misc-1.0.4.jar
+ratis-client/3.0.1//ratis-client-3.0.1.jar
+ratis-common/3.0.1//ratis-common-3.0.1.jar
+ratis-proto/3.0.1//ratis-proto-3.0.1.jar
+ratis-thirdparty-misc/1.0.5//ratis-thirdparty-misc-1.0.5.jar
scala-library/2.12.18//scala-library-2.12.18.jar
scala-reflect/2.12.18//scala-reflect-2.12.18.jar
slf4j-api/1.7.36//slf4j-api-1.7.36.jar
diff --git a/dev/deps/dependencies-client-flink-1.18
b/dev/deps/dependencies-client-flink-1.18
index 7a7f49b4a..f3ce797ac 100644
--- a/dev/deps/dependencies-client-flink-1.18
+++ b/dev/deps/dependencies-client-flink-1.18
@@ -73,10 +73,10 @@
netty-transport-udt/4.1.109.Final//netty-transport-udt-4.1.109.Final.jar
netty-transport/4.1.109.Final//netty-transport-4.1.109.Final.jar
paranamer/2.8//paranamer-2.8.jar
protobuf-java/3.21.7//protobuf-java-3.21.7.jar
-ratis-client/2.5.1//ratis-client-2.5.1.jar
-ratis-common/2.5.1//ratis-common-2.5.1.jar
-ratis-proto/2.5.1//ratis-proto-2.5.1.jar
-ratis-thirdparty-misc/1.0.4//ratis-thirdparty-misc-1.0.4.jar
+ratis-client/3.0.1//ratis-client-3.0.1.jar
+ratis-common/3.0.1//ratis-common-3.0.1.jar
+ratis-proto/3.0.1//ratis-proto-3.0.1.jar
+ratis-thirdparty-misc/1.0.5//ratis-thirdparty-misc-1.0.5.jar
scala-library/2.12.18//scala-library-2.12.18.jar
scala-reflect/2.12.18//scala-reflect-2.12.18.jar
slf4j-api/1.7.36//slf4j-api-1.7.36.jar
diff --git a/dev/deps/dependencies-client-flink-1.19
b/dev/deps/dependencies-client-flink-1.19
index 7a7f49b4a..f3ce797ac 100644
--- a/dev/deps/dependencies-client-flink-1.19
+++ b/dev/deps/dependencies-client-flink-1.19
@@ -73,10 +73,10 @@
netty-transport-udt/4.1.109.Final//netty-transport-udt-4.1.109.Final.jar
netty-transport/4.1.109.Final//netty-transport-4.1.109.Final.jar
paranamer/2.8//paranamer-2.8.jar
protobuf-java/3.21.7//protobuf-java-3.21.7.jar
-ratis-client/2.5.1//ratis-client-2.5.1.jar
-ratis-common/2.5.1//ratis-common-2.5.1.jar
-ratis-proto/2.5.1//ratis-proto-2.5.1.jar
-ratis-thirdparty-misc/1.0.4//ratis-thirdparty-misc-1.0.4.jar
+ratis-client/3.0.1//ratis-client-3.0.1.jar
+ratis-common/3.0.1//ratis-common-3.0.1.jar
+ratis-proto/3.0.1//ratis-proto-3.0.1.jar
+ratis-thirdparty-misc/1.0.5//ratis-thirdparty-misc-1.0.5.jar
scala-library/2.12.18//scala-library-2.12.18.jar
scala-reflect/2.12.18//scala-reflect-2.12.18.jar
slf4j-api/1.7.36//slf4j-api-1.7.36.jar
diff --git a/dev/deps/dependencies-client-mr b/dev/deps/dependencies-client-mr
index 309398c85..e952be60b 100644
--- a/dev/deps/dependencies-client-mr
+++ b/dev/deps/dependencies-client-mr
@@ -180,10 +180,10 @@ okhttp/4.9.3//okhttp-4.9.3.jar
okio/2.8.0//okio-2.8.0.jar
paranamer/2.8//paranamer-2.8.jar
protobuf-java/3.21.7//protobuf-java-3.21.7.jar
-ratis-client/2.5.1//ratis-client-2.5.1.jar
-ratis-common/2.5.1//ratis-common-2.5.1.jar
-ratis-proto/2.5.1//ratis-proto-2.5.1.jar
-ratis-thirdparty-misc/1.0.4//ratis-thirdparty-misc-1.0.4.jar
+ratis-client/3.0.1//ratis-client-3.0.1.jar
+ratis-common/3.0.1//ratis-common-3.0.1.jar
+ratis-proto/3.0.1//ratis-proto-3.0.1.jar
+ratis-thirdparty-misc/1.0.5//ratis-thirdparty-misc-1.0.5.jar
re2j/1.1//re2j-1.1.jar
reload4j/1.2.22//reload4j-1.2.22.jar
scala-library/2.12.18//scala-library-2.12.18.jar
diff --git a/dev/deps/dependencies-client-spark-2.4
b/dev/deps/dependencies-client-spark-2.4
index 5b0e11c52..dc0d1a718 100644
--- a/dev/deps/dependencies-client-spark-2.4
+++ b/dev/deps/dependencies-client-spark-2.4
@@ -73,10 +73,10 @@
netty-transport-udt/4.1.109.Final//netty-transport-udt-4.1.109.Final.jar
netty-transport/4.1.109.Final//netty-transport-4.1.109.Final.jar
paranamer/2.8//paranamer-2.8.jar
protobuf-java/3.21.7//protobuf-java-3.21.7.jar
-ratis-client/2.5.1//ratis-client-2.5.1.jar
-ratis-common/2.5.1//ratis-common-2.5.1.jar
-ratis-proto/2.5.1//ratis-proto-2.5.1.jar
-ratis-thirdparty-misc/1.0.4//ratis-thirdparty-misc-1.0.4.jar
+ratis-client/3.0.1//ratis-client-3.0.1.jar
+ratis-common/3.0.1//ratis-common-3.0.1.jar
+ratis-proto/3.0.1//ratis-proto-3.0.1.jar
+ratis-thirdparty-misc/1.0.5//ratis-thirdparty-misc-1.0.5.jar
scala-library/2.11.12//scala-library-2.11.12.jar
scala-reflect/2.11.12//scala-reflect-2.11.12.jar
slf4j-api/1.7.36//slf4j-api-1.7.36.jar
diff --git a/dev/deps/dependencies-client-spark-3.0
b/dev/deps/dependencies-client-spark-3.0
index 856c67a17..ffb7b0599 100644
--- a/dev/deps/dependencies-client-spark-3.0
+++ b/dev/deps/dependencies-client-spark-3.0
@@ -73,10 +73,10 @@
netty-transport-udt/4.1.109.Final//netty-transport-udt-4.1.109.Final.jar
netty-transport/4.1.109.Final//netty-transport-4.1.109.Final.jar
paranamer/2.8//paranamer-2.8.jar
protobuf-java/3.21.7//protobuf-java-3.21.7.jar
-ratis-client/2.5.1//ratis-client-2.5.1.jar
-ratis-common/2.5.1//ratis-common-2.5.1.jar
-ratis-proto/2.5.1//ratis-proto-2.5.1.jar
-ratis-thirdparty-misc/1.0.4//ratis-thirdparty-misc-1.0.4.jar
+ratis-client/3.0.1//ratis-client-3.0.1.jar
+ratis-common/3.0.1//ratis-common-3.0.1.jar
+ratis-proto/3.0.1//ratis-proto-3.0.1.jar
+ratis-thirdparty-misc/1.0.5//ratis-thirdparty-misc-1.0.5.jar
scala-library/2.12.10//scala-library-2.12.10.jar
scala-reflect/2.12.10//scala-reflect-2.12.10.jar
slf4j-api/1.7.36//slf4j-api-1.7.36.jar
diff --git a/dev/deps/dependencies-client-spark-3.1
b/dev/deps/dependencies-client-spark-3.1
index a92623e1f..381b67ff4 100644
--- a/dev/deps/dependencies-client-spark-3.1
+++ b/dev/deps/dependencies-client-spark-3.1
@@ -73,10 +73,10 @@
netty-transport-udt/4.1.109.Final//netty-transport-udt-4.1.109.Final.jar
netty-transport/4.1.109.Final//netty-transport-4.1.109.Final.jar
paranamer/2.8//paranamer-2.8.jar
protobuf-java/3.21.7//protobuf-java-3.21.7.jar
-ratis-client/2.5.1//ratis-client-2.5.1.jar
-ratis-common/2.5.1//ratis-common-2.5.1.jar
-ratis-proto/2.5.1//ratis-proto-2.5.1.jar
-ratis-thirdparty-misc/1.0.4//ratis-thirdparty-misc-1.0.4.jar
+ratis-client/3.0.1//ratis-client-3.0.1.jar
+ratis-common/3.0.1//ratis-common-3.0.1.jar
+ratis-proto/3.0.1//ratis-proto-3.0.1.jar
+ratis-thirdparty-misc/1.0.5//ratis-thirdparty-misc-1.0.5.jar
scala-library/2.12.10//scala-library-2.12.10.jar
scala-reflect/2.12.10//scala-reflect-2.12.10.jar
slf4j-api/1.7.36//slf4j-api-1.7.36.jar
diff --git a/dev/deps/dependencies-client-spark-3.2
b/dev/deps/dependencies-client-spark-3.2
index 3bf69bab0..ba87c7a60 100644
--- a/dev/deps/dependencies-client-spark-3.2
+++ b/dev/deps/dependencies-client-spark-3.2
@@ -73,10 +73,10 @@
netty-transport-udt/4.1.109.Final//netty-transport-udt-4.1.109.Final.jar
netty-transport/4.1.109.Final//netty-transport-4.1.109.Final.jar
paranamer/2.8//paranamer-2.8.jar
protobuf-java/3.21.7//protobuf-java-3.21.7.jar
-ratis-client/2.5.1//ratis-client-2.5.1.jar
-ratis-common/2.5.1//ratis-common-2.5.1.jar
-ratis-proto/2.5.1//ratis-proto-2.5.1.jar
-ratis-thirdparty-misc/1.0.4//ratis-thirdparty-misc-1.0.4.jar
+ratis-client/3.0.1//ratis-client-3.0.1.jar
+ratis-common/3.0.1//ratis-common-3.0.1.jar
+ratis-proto/3.0.1//ratis-proto-3.0.1.jar
+ratis-thirdparty-misc/1.0.5//ratis-thirdparty-misc-1.0.5.jar
scala-library/2.12.15//scala-library-2.12.15.jar
scala-reflect/2.12.15//scala-reflect-2.12.15.jar
slf4j-api/1.7.36//slf4j-api-1.7.36.jar
diff --git a/dev/deps/dependencies-client-spark-3.3
b/dev/deps/dependencies-client-spark-3.3
index d647abdf1..a5da10cf3 100644
--- a/dev/deps/dependencies-client-spark-3.3
+++ b/dev/deps/dependencies-client-spark-3.3
@@ -73,10 +73,10 @@
netty-transport-udt/4.1.109.Final//netty-transport-udt-4.1.109.Final.jar
netty-transport/4.1.109.Final//netty-transport-4.1.109.Final.jar
paranamer/2.8//paranamer-2.8.jar
protobuf-java/3.21.7//protobuf-java-3.21.7.jar
-ratis-client/2.5.1//ratis-client-2.5.1.jar
-ratis-common/2.5.1//ratis-common-2.5.1.jar
-ratis-proto/2.5.1//ratis-proto-2.5.1.jar
-ratis-thirdparty-misc/1.0.4//ratis-thirdparty-misc-1.0.4.jar
+ratis-client/3.0.1//ratis-client-3.0.1.jar
+ratis-common/3.0.1//ratis-common-3.0.1.jar
+ratis-proto/3.0.1//ratis-proto-3.0.1.jar
+ratis-thirdparty-misc/1.0.5//ratis-thirdparty-misc-1.0.5.jar
scala-library/2.12.15//scala-library-2.12.15.jar
scala-reflect/2.12.15//scala-reflect-2.12.15.jar
slf4j-api/1.7.36//slf4j-api-1.7.36.jar
diff --git a/dev/deps/dependencies-client-spark-3.4
b/dev/deps/dependencies-client-spark-3.4
index aa4db0c96..6f0005936 100644
--- a/dev/deps/dependencies-client-spark-3.4
+++ b/dev/deps/dependencies-client-spark-3.4
@@ -73,10 +73,10 @@
netty-transport-udt/4.1.109.Final//netty-transport-udt-4.1.109.Final.jar
netty-transport/4.1.109.Final//netty-transport-4.1.109.Final.jar
paranamer/2.8//paranamer-2.8.jar
protobuf-java/3.21.7//protobuf-java-3.21.7.jar
-ratis-client/2.5.1//ratis-client-2.5.1.jar
-ratis-common/2.5.1//ratis-common-2.5.1.jar
-ratis-proto/2.5.1//ratis-proto-2.5.1.jar
-ratis-thirdparty-misc/1.0.4//ratis-thirdparty-misc-1.0.4.jar
+ratis-client/3.0.1//ratis-client-3.0.1.jar
+ratis-common/3.0.1//ratis-common-3.0.1.jar
+ratis-proto/3.0.1//ratis-proto-3.0.1.jar
+ratis-thirdparty-misc/1.0.5//ratis-thirdparty-misc-1.0.5.jar
scala-library/2.12.17//scala-library-2.12.17.jar
scala-reflect/2.12.17//scala-reflect-2.12.17.jar
slf4j-api/1.7.36//slf4j-api-1.7.36.jar
diff --git a/dev/deps/dependencies-client-spark-3.5
b/dev/deps/dependencies-client-spark-3.5
index c93ed6eff..bce13ca7c 100644
--- a/dev/deps/dependencies-client-spark-3.5
+++ b/dev/deps/dependencies-client-spark-3.5
@@ -73,10 +73,10 @@
netty-transport-udt/4.1.109.Final//netty-transport-udt-4.1.109.Final.jar
netty-transport/4.1.109.Final//netty-transport-4.1.109.Final.jar
paranamer/2.8//paranamer-2.8.jar
protobuf-java/3.21.7//protobuf-java-3.21.7.jar
-ratis-client/2.5.1//ratis-client-2.5.1.jar
-ratis-common/2.5.1//ratis-common-2.5.1.jar
-ratis-proto/2.5.1//ratis-proto-2.5.1.jar
-ratis-thirdparty-misc/1.0.4//ratis-thirdparty-misc-1.0.4.jar
+ratis-client/3.0.1//ratis-client-3.0.1.jar
+ratis-common/3.0.1//ratis-common-3.0.1.jar
+ratis-proto/3.0.1//ratis-proto-3.0.1.jar
+ratis-thirdparty-misc/1.0.5//ratis-thirdparty-misc-1.0.5.jar
scala-library/2.12.18//scala-library-2.12.18.jar
scala-reflect/2.12.18//scala-reflect-2.12.18.jar
slf4j-api/1.7.36//slf4j-api-1.7.36.jar
diff --git a/dev/deps/dependencies-server b/dev/deps/dependencies-server
index c420faf98..ca9a26f94 100644
--- a/dev/deps/dependencies-server
+++ b/dev/deps/dependencies-server
@@ -117,16 +117,16 @@
netty-transport/4.1.109.Final//netty-transport-4.1.109.Final.jar
osgi-resource-locator/1.0.3//osgi-resource-locator-1.0.3.jar
paranamer/2.8//paranamer-2.8.jar
protobuf-java/3.21.7//protobuf-java-3.21.7.jar
-ratis-client/2.5.1//ratis-client-2.5.1.jar
-ratis-common/2.5.1//ratis-common-2.5.1.jar
-ratis-grpc/2.5.1//ratis-grpc-2.5.1.jar
-ratis-metrics/2.5.1//ratis-metrics-2.5.1.jar
-ratis-netty/2.5.1//ratis-netty-2.5.1.jar
-ratis-proto/2.5.1//ratis-proto-2.5.1.jar
-ratis-server-api/2.5.1//ratis-server-api-2.5.1.jar
-ratis-server/2.5.1//ratis-server-2.5.1.jar
-ratis-shell/2.5.1//ratis-shell-2.5.1.jar
-ratis-thirdparty-misc/1.0.4//ratis-thirdparty-misc-1.0.4.jar
+ratis-client/3.0.1//ratis-client-3.0.1.jar
+ratis-common/3.0.1//ratis-common-3.0.1.jar
+ratis-grpc/3.0.1//ratis-grpc-3.0.1.jar
+ratis-metrics-default/3.0.1/ratis-metrics-default-3.0.1.jar
+ratis-netty/3.0.1//ratis-netty-3.0.1.jar
+ratis-proto/3.0.1//ratis-proto-3.0.1.jar
+ratis-server-api/3.0.1//ratis-server-api-3.0.1.jar
+ratis-server/3.0.1//ratis-server-3.0.1.jar
+ratis-shell/3.0.1//ratis-shell-3.0.1.jar
+ratis-thirdparty-misc/1.0.5//ratis-thirdparty-misc-1.0.5.jar
reflections/0.10.2//reflections-0.10.2.jar
rocksdbjni/8.11.3//rocksdbjni-8.11.3.jar
scala-library/2.12.18//scala-library-2.12.18.jar
diff --git a/docs/configuration/ha.md b/docs/configuration/ha.md
index 15044ea61..c88c1d2ad 100644
--- a/docs/configuration/ha.md
+++ b/docs/configuration/ha.md
@@ -25,5 +25,6 @@ license: |
| celeborn.master.ha.node.<id>.port | 9097 | false | Port to bind of
master node <id> in HA mode. | 0.3.0 | celeborn.ha.master.node.<id>.port
|
| celeborn.master.ha.node.<id>.ratis.port | 9872 | false | Ratis port to
bind of master node <id> in HA mode. | 0.3.0 |
celeborn.ha.master.node.<id>.ratis.port |
| celeborn.master.ha.ratis.raft.rpc.type | netty | false | RPC type for Ratis,
available options: netty, grpc. | 0.3.0 |
celeborn.ha.master.ratis.raft.rpc.type |
-| celeborn.master.ha.ratis.raft.server.storage.dir | /tmp/ratis | false | |
0.3.0 | celeborn.ha.master.ratis.raft.server.storage.dir |
+| celeborn.master.ha.ratis.raft.server.storage.dir | /tmp/ratis | false | Root
storage directory to hold RaftServer data. | 0.3.0 |
celeborn.ha.master.ratis.raft.server.storage.dir |
+| celeborn.master.ha.ratis.raft.server.storage.startup.option | RECOVER |
false | Startup option of RaftServer storage. Available options: RECOVER,
FORMAT. | 0.5.0 | |
<!--end-include-->
diff --git a/master/pom.xml b/master/pom.xml
index bad0dfabf..abe061473 100644
--- a/master/pom.xml
+++ b/master/pom.xml
@@ -74,6 +74,10 @@
<groupId>org.apache.ratis</groupId>
<artifactId>ratis-shell</artifactId>
</dependency>
+ <dependency>
+ <groupId>org.apache.ratis</groupId>
+ <artifactId>ratis-metrics-default</artifactId>
+ </dependency>
<dependency>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-slf4j-impl</artifactId>
diff --git
a/master/src/main/java/org/apache/celeborn/service/deploy/master/clustermeta/ha/HARaftServer.java
b/master/src/main/java/org/apache/celeborn/service/deploy/master/clustermeta/ha/HARaftServer.java
index f0a7560d7..0c4118373 100644
---
a/master/src/main/java/org/apache/celeborn/service/deploy/master/clustermeta/ha/HARaftServer.java
+++
b/master/src/main/java/org/apache/celeborn/service/deploy/master/clustermeta/ha/HARaftServer.java
@@ -50,6 +50,7 @@ import org.apache.ratis.rpc.RpcType;
import org.apache.ratis.rpc.SupportedRpcType;
import org.apache.ratis.server.RaftServer;
import org.apache.ratis.server.RaftServerConfigKeys;
+import org.apache.ratis.server.storage.RaftStorage.StartupOption;
import org.apache.ratis.thirdparty.com.google.protobuf.ByteString;
import org.apache.ratis.util.LifeCycle;
import org.apache.ratis.util.SizeInBytes;
@@ -144,6 +145,8 @@ public class HARaftServer {
.setProperties(serverProperties)
.setParameters(sslParameters)
.setStateMachine(masterStateMachine)
+ // RATIS-1677. Do not auto format RaftStorage in RECOVER.
+
.setOption(StartupOption.valueOf(conf.haMasterRatisStorageStartupOption()))
.build();
StringBuilder raftPeersStr = new StringBuilder();
@@ -306,8 +309,22 @@ public class HARaftServer {
// Set RAFT segment pre-allocated size
long raftSegmentPreallocatedSize = conf.haMasterRatisLogPreallocatedSize();
+ long raftSegmentWriteBufferSize = conf.haMasterRatisLogWriteBufferSize();
int logAppenderQueueNumElements =
conf.haMasterRatisLogAppenderQueueNumElements();
long logAppenderQueueByteLimit =
conf.haMasterRatisLogAppenderQueueBytesLimit();
+ // RATIS-589. Eliminate buffer copying in SegmentedRaftLogOutputStream.
+ // 4 bytes (serialized size) + logAppenderQueueByteLimit + 4 bytes
(checksum)
+ if (raftSegmentWriteBufferSize < logAppenderQueueByteLimit + 8) {
+ throw new IllegalArgumentException(
+ CelebornConf.HA_MASTER_RATIS_LOG_WRITE_BUFFER_SIZE().key()
+ + " (= "
+ + raftSegmentWriteBufferSize
+ + ") is less than "
+ +
CelebornConf.HA_MASTER_RATIS_LOG_APPENDER_QUEUE_BYTE_LIMIT().key()
+ + " + 8 (= "
+ + (logAppenderQueueByteLimit + 8)
+ + ")");
+ }
boolean shouldInstallSnapshot =
conf.haMasterRatisLogInstallSnapshotEnabled();
RaftServerConfigKeys.Log.Appender.setBufferElementLimit(
properties, logAppenderQueueNumElements);
@@ -315,6 +332,8 @@ public class HARaftServer {
properties, SizeInBytes.valueOf(logAppenderQueueByteLimit));
RaftServerConfigKeys.Log.setPreallocatedSize(
properties, SizeInBytes.valueOf(raftSegmentPreallocatedSize));
+ RaftServerConfigKeys.Log.setWriteBufferSize(
+ properties, SizeInBytes.valueOf(raftSegmentWriteBufferSize));
RaftServerConfigKeys.Log.Appender.setInstallSnapshotEnabled(properties,
shouldInstallSnapshot);
int logPurgeGap = conf.haMasterRatisLogPurgeGap();
RaftServerConfigKeys.Log.setPurgeGap(properties, logPurgeGap);
diff --git a/pom.xml b/pom.xml
index 307b95227..0b7fd07af 100644
--- a/pom.xml
+++ b/pom.xml
@@ -89,7 +89,7 @@
<netty.version>4.1.109.Final</netty.version>
<bouncycastle.version>1.77</bouncycastle.version>
<protobuf.version>3.21.7</protobuf.version>
- <ratis.version>2.5.1</ratis.version>
+ <ratis.version>3.0.1</ratis.version>
<scalatest.version>3.2.16</scalatest.version>
<slf4j.version>1.7.36</slf4j.version>
<roaringbitmap.version>1.0.6</roaringbitmap.version>
@@ -283,6 +283,11 @@
</exclusion>
</exclusions>
</dependency>
+ <dependency>
+ <groupId>org.apache.ratis</groupId>
+ <artifactId>ratis-metrics-default</artifactId>
+ <version>${ratis.version}</version>
+ </dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
diff --git a/project/CelebornBuild.scala b/project/CelebornBuild.scala
index c8a61cbaa..c616237b0 100644
--- a/project/CelebornBuild.scala
+++ b/project/CelebornBuild.scala
@@ -56,7 +56,7 @@ object Dependencies {
val metricsVersion = "3.2.6"
val mockitoVersion = "4.11.0"
val nettyVersion = "4.1.109.Final"
- val ratisVersion = "2.5.1"
+ val ratisVersion = "3.0.1"
val roaringBitmapVersion = "1.0.6"
val rocksdbJniVersion = "8.11.3"
val jacksonVersion = "2.15.3"
@@ -127,6 +127,7 @@ object Dependencies {
val ratisClient = "org.apache.ratis" % "ratis-client" % ratisVersion
val ratisCommon = "org.apache.ratis" % "ratis-common" % ratisVersion
val ratisGrpc = "org.apache.ratis" % "ratis-grpc" % ratisVersion
+ val ratisMetricsDefault = "org.apache.ratis" % "ratis-metrics-default" %
ratisVersion
val ratisNetty = "org.apache.ratis" % "ratis-netty" % ratisVersion
val ratisServer = "org.apache.ratis" % "ratis-server" % ratisVersion
val ratisShell = "org.apache.ratis" % "ratis-shell" % ratisVersion
excludeAll(
@@ -551,6 +552,7 @@ object CelebornMaster {
Dependencies.ratisClient,
Dependencies.ratisCommon,
Dependencies.ratisGrpc,
+ Dependencies.ratisMetricsDefault,
Dependencies.ratisNetty,
Dependencies.ratisServer,
Dependencies.ratisShell