TaoZex commented on code in PR #3696:
URL:
https://github.com/apache/incubator-seatunnel/pull/3696#discussion_r1050627166
##########
docs/en/seatunnel-engine/cluster-mode.md:
##########
@@ -1,6 +1,24 @@
---
-sidebar_position: 4
+sidebar_position: 3
---
# Run Job With Cluster Mode
+This is the recommended way to use SeaTunnel Engine in a production environment. The full functionality of SeaTunnel Engine is supported in this mode, and cluster mode offers better performance and stability.
+
+In the cluster mode, the SeaTunnel Engine cluster needs to be deployed first,
and the client will submit the job to the SeaTunnel Engine cluster for running.
+
+## Deploy SeaTunnel Engine Cluster
+Deploy a SeaTunnel Engine Cluster reference [SeaTunnel Engine Cluster
Deploy](deployment.md)
Review Comment:
Put a blank line in the middle
##########
docs/en/seatunnel-engine/deployment.md:
##########
@@ -1,8 +1,189 @@
---
-sidebar_position: 2
+sidebar_position: 4
---
# Deployment SeaTunnel Engine
+## 1. Download
+
SeaTunnel Engine is the default engine of SeaTunnel. The installation package
of SeaTunnel already contains all the contents of SeaTunnel Engine.
+## 2. Config SEATUNNEL_HOME
+
+You can configure `SEATUNNEL_HOME` by adding an `/etc/profile.d/seatunnel.sh` file with the following content:
+
+```
+export SEATUNNEL_HOME=${seatunnel install path}
+export PATH=$PATH:$SEATUNNEL_HOME/bin
+```
+
+## 3. Config SeaTunnel Engine JVM options
+
+SeaTunnel Engine supports two ways to set JVM options.
+
+1. Add JVM Options to `$SEATUNNEL_HOME/bin/seatunnel-cluster.sh`.
+
+ Modify the `$SEATUNNEL_HOME/bin/seatunnel-cluster.sh` file and add
`JAVA_OPTS="-Xms2G -Xmx2G"` in the first line.
+2. Add JVM options when starting SeaTunnel Engine, for example `seatunnel-cluster.sh -DJvmOption="-Xms2G -Xmx2G"`.
+
+## 4. Config SeaTunnel Engine
+
+SeaTunnel Engine provides many functions, which need to be configured in
seatunnel.yaml.
+
+### 4.1 Backup count
+
+SeaTunnel Engine implements cluster management based on [Hazelcast IMDG](https://docs.hazelcast.com/imdg/4.1/). The state data of the cluster (Job Running State, Resource State) is stored in [Hazelcast IMap](https://docs.hazelcast.com/imdg/4.1/data-structures/map).
+The data saved in Hazelcast IMap is distributed and stored across all nodes of the cluster. Hazelcast partitions the data stored in IMap, and the number of backups can be specified for each partition.
+Therefore, SeaTunnel Engine can achieve cluster HA without using other services (for example ZooKeeper).
+
+`backup-count` defines the number of synchronous backups. For example, if it is set to 1, the backup of a partition is placed on one other member. If it is set to 2, it is placed on two other members.
+
+We suggest setting `backup-count` to `max(1, min(5, N/2))`, where `N` is the number of cluster nodes.
+
+```
+seatunnel:
+    engine:
+        backup-count: 1
+        # other config
+
+```
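As a worked example of the sizing suggestion above, here is a minimal sketch. It assumes the suggestion is meant as "half the node count, clamped to the range 1 to 5". The function name `suggested_backup_count` and the use of integer division are illustrative assumptions, not part of SeaTunnel.

```python
def suggested_backup_count(node_count: int) -> int:
    """Half the node count, clamped to [1, 5] (assumed reading of the suggestion)."""
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    # min(...) caps the value at 5; max(...) keeps at least one backup.
    return max(1, min(5, node_count // 2))


# Print the suggested value for a few cluster sizes.
for n in (1, 2, 4, 10, 20):
    print(n, suggested_backup_count(n))
```

Under this reading, small clusters keep a single backup and large clusters stop at five, so the backup overhead does not grow with the cluster.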
+
+### 4.2 Slot service
+
+The number of Slots determines the number of TaskGroups the cluster node can
run in parallel. SeaTunnel Engine is a data synchronization engine and most
jobs are IO intensive.
+
+Dynamic slots are recommended.
+
+```
+seatunnel:
+    engine:
+        slot-service:
+            dynamic-slot: true
+        # other config
+```
+
+### 4.3 Checkpoint Manager
+
+Like Flink, SeaTunnel Engine supports the Chandy–Lamport algorithm. Therefore, SeaTunnel Engine can synchronize data without data loss or duplication.
+
+**interval**
+
+The interval between two checkpoints, in milliseconds. If the `checkpoint.interval` parameter is configured in the `env` of the job config file, that job-level value overrides the value set here.
+
+**timeout**
+
+The timeout of a checkpoint. If a checkpoint cannot be completed within the
timeout period, a checkpoint failure will be triggered. Job will be restored.
+
+**max-concurrent**
+
+The maximum number of checkpoints that can be performed simultaneously.
+
+**tolerable-failure**
+
+Maximum number of retries after checkpoint failure.
+
+Example
+
+```
+seatunnel:
+    engine:
+        backup-count: 1
+        print-execution-info-interval: 10
+        slot-service:
+            dynamic-slot: true
+        checkpoint:
+            interval: 300000
+            timeout: 10000
+            max-concurrent: 1
+            tolerable-failure: 2
+
+```
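The precedence described for **interval** can be sketched as follows. This is only an illustration of the documented rule (a `checkpoint.interval` in the job's `env` wins over the engine-level value). How SeaTunnel resolves it internally is not shown in the source, and the function name `effective_checkpoint_interval` and the dict-based `job_env` are hypothetical.

```python
def effective_checkpoint_interval(engine_interval_ms: int, job_env: dict) -> int:
    """Return the job-level `checkpoint.interval` if set, else the engine-level value."""
    job_value = job_env.get("checkpoint.interval")
    return engine_interval_ms if job_value is None else int(job_value)


engine_interval_ms = 300000  # value from the example above (milliseconds)

# No job-level setting: the engine-level interval applies.
print(effective_checkpoint_interval(engine_interval_ms, {}))
# Job-level setting present: it takes precedence.
print(effective_checkpoint_interval(engine_interval_ms, {"checkpoint.interval": 10000}))
```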
+
+**checkpoint storage**
+
+For details about checkpoint storage, see [checkpoint storage](checkpoint-storage.md).
+
+## 5. Config SeaTunnel Engine Server
+
+All SeaTunnel Engine Server configuration is in the `hazelcast.yaml` file.
+
+### 5.1 cluster-name
+
+SeaTunnel Engine nodes use the cluster name to determine whether another node belongs to the same cluster. If the cluster names of two nodes differ, SeaTunnel Engine will reject the service request.
+
+### 5.2 Network
+
+Based on [Hazelcast](https://docs.hazelcast.com/imdg/4.1/clusters/discovery-mechanisms), a SeaTunnel Engine cluster is a network of cluster members that run SeaTunnel Engine Server. Cluster members automatically join together to form a cluster. This automatic joining takes place through various discovery mechanisms that the cluster members use to find each other.
+
+Please note that, after a cluster is formed, communication between cluster
members is always via TCP/IP, regardless of the discovery mechanism used.
+
+SeaTunnel Engine uses the following discovery mechanisms.
+
+#### TCP
+
+You can configure SeaTunnel Engine to be a full TCP/IP cluster. See the
[Discovering Members by TCP section](tcp.md) for configuration details.
+
+An example `hazelcast.yaml`:
+
+```yaml
+hazelcast:
+  cluster-name: seatunnel
+  network:
+    join:
+      tcp-ip:
+        enabled: true
+        member-list:
+          - hostname1
+    port:
+      auto-increment: false
+      port: 5801
+  properties:
+    hazelcast.logging.type: log4j2
+
+```
+
+TCP is the recommended discovery mechanism for a standalone SeaTunnel Engine cluster.
+
+Hazelcast also provides other service discovery methods. For details, refer to [hazelcast network](https://docs.hazelcast.com/imdg/4.1/clusters/setting-up-clusters).
+
+## 6. Config SeaTunnel Engine Client
+
+All SeaTunnel Engine Client config in `hazelcast-client.yaml`
Review Comment:
```suggestion
All SeaTunnel Engine Client config in `hazelcast-client.yaml`.
```
##########
docs/en/seatunnel-engine/deployment.md:
##########
@@ -1,8 +1,189 @@
+**timeout**
+
+The timeout of a checkpoint. If a checkpoint cannot be completed within the
timeout period, a checkpoint failure will be triggered. Job will be restored.
Review Comment:
```suggestion
The timeout of a checkpoint. If a checkpoint cannot be completed within the
timeout period, a checkpoint failure will be triggered. Therefore, Job will be
restored.
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.