[dolphinscheduler-website] branch master updated: proof writing documents under architecture directory (#718)

zhongjiajie Mon, 07 Mar 2022 18:45:45 -0800

This is an automated email from the ASF dual-hosted git repository.

zhongjiajie pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/dolphinscheduler-website.git



The following commit(s) were added to refs/heads/master by this push:
     new 1315037  proof writing documents under architecture directory (#718)
1315037 is described below

commit 1315037f5f5c8443d67c9ad96c4f19ad1e933155
Author: Tq <[email protected]>
AuthorDate: Tue Mar 8 10:40:27 2022 +0800

    proof writing documents under architecture directory (#718)
---
 .../About_DolphinScheduler.md                      |  16 +-
 docs/en-us/dev/user_doc/architecture/cache.md      |  16 +-
 .../dev/user_doc/architecture/configuration.md     | 131 ++++++++--------
 docs/en-us/dev/user_doc/architecture/design.md     | 164 ++++++++++-----------
 .../dev/user_doc/architecture/load-balance.md      |  28 ++--
 docs/en-us/dev/user_doc/architecture/metadata.md   | 135 ++++++++---------
 .../dev/user_doc/architecture/task-structure.md    |  40 ++---
 7 files changed, 268 insertions(+), 262 deletions(-)

diff --git 
a/docs/en-us/dev/user_doc/About_DolphinScheduler/About_DolphinScheduler.md 
b/docs/en-us/dev/user_doc/About_DolphinScheduler/About_DolphinScheduler.md
index aafcca1..a0d314e 100644
--- a/docs/en-us/dev/user_doc/About_DolphinScheduler/About_DolphinScheduler.md
+++ b/docs/en-us/dev/user_doc/About_DolphinScheduler/About_DolphinScheduler.md
@@ -1,19 +1,19 @@
 # About DolphinScheduler
 
-Apache DolphinScheduler is a cloud-native visual Big Data workflow scheduler 
system, committed to “solving complex big-data task dependencies and triggering 
relationships in data OPS orchestration so that various types of big data tasks 
can be used out of the box”.
+Apache DolphinScheduler is a distributed, easily extensible, open-source 
system for visual DAG workflow task scheduling. It solves the intricate 
dependencies of data R&D ETL and the inability to monitor the health status of 
tasks. DolphinScheduler assembles tasks as a streaming DAG, which makes it 
possible to monitor task execution status in time, and supports operations 
such as retry, failure recovery from specified nodes, pause, resume, and kill.
 
-## High Reliability
+## Simple to Use
 
-- Decentralized multi-master and multi-worker, HA is supported by itself, 
overload processing
+- DolphinScheduler provides DAG monitoring user interfaces, and users can 
customize DAGs by dragging and dropping. All process definitions are 
visualized, and it supports rich third-party system APIs and one-click 
deployment.
 
-## User-Friendly
+## High Reliability
 
-- All process definition operations are visualized, Visualization process 
defines key information at a glance, One-click deployment
+- Decentralized multi-master and multi-worker design, supports HA, and uses 
queue selection to avoid overload.
 
 ## Rich Scenarios
 
-- Support multi-tenant. Support many task types e.g., spark,flink,hive, mr, 
shell, python, sub_process
+- Support features like multi-tenants, suspend and resume operations to cope 
with big data scenarios. Support many task types like Spark, Flink, Hive, MR, 
shell, python, sub_process.
 
-## High Expansibility
+## High Scalability
 
-- Support custom task types, Distributed scheduling, and the overall 
scheduling capability will increase linearly with the scale of the cluster
+- Supports customized task types, distributed scheduling, and the overall 
scheduling capability increases linearly with the scale of the cluster.
\ No newline at end of file
diff --git a/docs/en-us/dev/user_doc/architecture/cache.md 
b/docs/en-us/dev/user_doc/architecture/cache.md
index a07190d..a0251e3 100644
--- a/docs/en-us/dev/user_doc/architecture/cache.md
+++ b/docs/en-us/dev/user_doc/architecture/cache.md
@@ -2,9 +2,9 @@
 
 ## Purpose
 
-Due to the master-server scheduling process, there will be a large number of 
database read operations, such as `tenant`, `user`, `processDefinition`, etc. 
On the one hand, it will put a lot of pressure on the DB, and on the other 
hand, it will slow down the entire core scheduling process.
+The master-server scheduling process performs a large number of database read 
operations on tables such as `tenant`, `user`, and `processDefinition`. These 
reads put pressure on the DB and slow down the entire core scheduling process.
 
-Considering that this part of the business data is a scenario where more reads 
and less writes are performed, a cache module is introduced to reduce the DB 
read pressure and speed up the core scheduling process;
+Because this part of the business data is read often and written rarely, a 
cache module is introduced to reduce the DB read pressure and speed up the 
core scheduling process.
 
 ## Cache Settings
 
@@ -23,20 +23,20 @@ spring:
       spec: maximumSize=100,expireAfterWrite=300s,recordStats
 ```
 
-The cache-module use [spring-cache](https://spring.io/guides/gs/caching/), so 
you can set cache config in the spring application.yaml directly. Default 
disable cache, and you can enable it by `type: caffeine`.
+The cache module uses [spring-cache](https://spring.io/guides/gs/caching/), so 
you can configure it directly in the Spring `application.yaml`, for example 
whether to enable the cache (the default `none` disables it) and which cache 
type to use.
 
-With the config of [caffeine](https://github.com/ben-manes/caffeine), you can 
set the cache size, expire time, etc.
+Currently, the module implements the 
[caffeine](https://github.com/ben-manes/caffeine) configuration, so you can set 
cache options such as cache size and expiration time.
 
 ## Cache Read
 
-The cache adopts the annotation `@Cacheable` of spring-cache and is configured 
in the mapper layer. For example: `TenantMapper`.
+The cache module adopts the `@Cacheable` annotation from spring-cache, which 
you can apply in the related mapper layer. See `TenantMapper` for an example.
 
 ## Cache Evict
 
-The business data update comes from the api-server, and the cache end is in 
the master-server. So it is necessary to monitor the data update of the 
api-server (aspect intercept `@CacheEvict`), and the master-server will be 
notified when the cache eviction is required.
+Business data updates come from the api-server, while the cache lives in the 
master-server. It is therefore necessary to monitor data updates in the 
api-server (using an aspect pointcut that intercepts `@CacheEvict`) and to 
notify the master-server with a `cacheEvictCommand` when a cache eviction is 
required.
 
-It should be noted that the final strategy for cache update comes from the 
user's expiration strategy configuration in caffeine, so please configure it in 
conjunction with the business;
+Note: the final cache update strategy comes from the expiration strategy 
configured in caffeine, so configure it to suit your business scenario.
 
-The sequence diagram is shown in the following figure:
+The sequence diagram is shown below:
 
 <img src="/img/cache-evict.png" alt="cache-evict" style="zoom: 67%;" />
\ No newline at end of file
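
As a rough illustration of the read and evict flow that the cache.md changes above describe, the sketch below models a master-side read-through cache with a size cap and expire-after-write, plus an `evict` call standing in for the notification sent after an api-server update. It is plain Python, not spring-cache or caffeine, and the names `TtlCache`, `load_tenant`, and `evict` are hypothetical and do not come from DolphinScheduler. Dropping the oldest-written entry when the cap is exceeded is a simplification assumed here, not caffeine's actual eviction policy.

```python
import time
from collections import OrderedDict


class TtlCache:
    """Tiny read-through cache with a size cap and expire-after-write."""

    def __init__(self, maximum_size=100, expire_after_write=300.0,
                 clock=time.monotonic):
        self.maximum_size = maximum_size
        self.expire_after_write = expire_after_write  # seconds
        self.clock = clock
        self._entries = OrderedDict()  # key -> (value, written_at)
        self.hits = 0
        self.misses = 0

    def get(self, key, loader):
        """Return the cached value, or call loader(key) on a miss or expiry."""
        entry = self._entries.get(key)
        if entry is not None and self.clock() - entry[1] < self.expire_after_write:
            self.hits += 1
            return entry[0]
        self.misses += 1
        value = loader(key)
        self._entries[key] = (value, self.clock())
        self._entries.move_to_end(key)
        while len(self._entries) > self.maximum_size:
            self._entries.popitem(last=False)  # drop the oldest-written entry
        return value

    def evict(self, key):
        """Stand-in for handling an eviction notification for one key."""
        self._entries.pop(key, None)


# Hypothetical "database" and loader that records every DB read.
db = {1: "tenant_a"}
db_reads = []


def load_tenant(tenant_id):
    db_reads.append(tenant_id)
    return db[tenant_id]


cache = TtlCache(maximum_size=100, expire_after_write=300.0)
first = cache.get(1, load_tenant)    # miss: reads the DB
second = cache.get(1, load_tenant)   # hit: served from the cache
db[1] = "tenant_b"                   # update made on the api-server side
cache.evict(1)                       # eviction notification reaches the master
third = cache.get(1, load_tenant)    # miss again: reloads the fresh value
```

Without the `evict` call, the third read would keep returning the stale value until the expire-after-write window elapsed, which is why the expiration setting acts as the final fallback for cache updates.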
diff --git a/docs/en-us/dev/user_doc/architecture/configuration.md 
b/docs/en-us/dev/user_doc/architecture/configuration.md
index 37063ea..7f3afb2 100644
--- a/docs/en-us/dev/user_doc/architecture/configuration.md
+++ b/docs/en-us/dev/user_doc/architecture/configuration.md
@@ -1,4 +1,5 @@
 <!-- markdown-link-check-disable -->
+
 # Configuration
 
 ## Preface
@@ -7,26 +8,28 @@ This document explains the DolphinScheduler application 
configurations according
 
 ## Directory Structure
 
-Currently, all the configuration files are under [conf ] directory. Please 
check the following simplified DolphinScheduler installation directories to 
have a direct view about the position [conf] directory in and configuration 
files inside. This document only describes DolphinScheduler configurations and 
other modules are not going into.
+Currently, all the configuration files are under the [conf] directory. 
+Check the following simplified DolphinScheduler installation directories to 
see where the [conf] directory is located and which configuration files it 
contains. 
+This document only describes DolphinScheduler configurations and does not 
cover other topics.
 
 [Note: the DolphinScheduler (hereinafter called the ‘DS’) .]
 
 ```
 ├─bin                               DS application commands directory
-│  ├─dolphinscheduler-daemon.sh         startup/shutdown DS application 
-│  ├─start-all.sh                  A     startup all DS services with 
configurations
+│  ├─dolphinscheduler-daemon.sh         startup or shutdown DS application 
+│  ├─start-all.sh                       startup all DS services with 
configurations
 │  ├─stop-all.sh                        shutdown all DS services with 
configurations
 ├─conf                              configurations directory
 │  ├─application-api.properties         API-service config properties
 │  ├─datasource.properties              datasource config properties
 │  ├─zookeeper.properties               ZooKeeper config properties
-│  ├─master.properties                  master config properties
-│  ├─worker.properties                  worker config properties
+│  ├─master.properties                  master-service config properties
+│  ├─worker.properties                  worker-service config properties
 │  ├─quartz.properties                  quartz config properties
-│  ├─common.properties                  common-service[storage] config 
properties
+│  ├─common.properties                  common-service [storage] config 
properties
 │  ├─alert.properties                   alert-service config properties
 │  ├─config                             environment variables config directory
-│      ├─install_config.conf                DS environment variables 
configuration script[install/start DS]
+│      ├─install_config.conf                DS environment variables 
configuration script [install or start DS]
 │  ├─env                                load environment variables configs 
script directory
 │      ├─dolphinscheduler_env.sh            load environment variables configs 
[eg: JAVA_HOME,HADOOP_HOME, HIVE_HOME ...]
 │  ├─org                                mybatis mapper files directory
@@ -35,13 +38,13 @@ Currently, all the configuration files are under [conf ] 
directory. Please check
 │  ├─logback-master.xml                 master-service log config
 │  ├─logback-worker.xml                 worker-service log config
 │  ├─logback-alert.xml                  alert-service log config
-├─sql                                   DS metadata to create/upgrade .sql 
directory
+├─sql                                   .sql files to create or upgrade DS 
metadata
 │  ├─create                             create SQL scripts directory
 │  ├─upgrade                            upgrade SQL scripts directory
-│  ├─dolphinscheduler_postgre.sql       postgre database init script
-│  ├─dolphinscheduler_mysql.sql         mysql database init script
+│  ├─dolphinscheduler_postgre.sql       PostgreSQL database init script
+│  ├─dolphinscheduler_mysql.sql         MySQL database init script
 │  ├─soft_version                       current DS version-id file
-├─script                            DS services deployment, database 
create/upgrade scripts directory
+├─script                            DS services deployment, database create or 
upgrade scripts directory
 │  ├─create-dolphinscheduler.sh         DS database init script
 │  ├─upgrade-dolphinscheduler.sh        DS database upgrade script
 │  ├─monitor-server.sh                  DS monitor-server start script       
@@ -56,13 +59,13 @@ Currently, all the configuration files are under [conf ] 
directory. Please check
 
 serial number| service classification| config file|
 |--|--|--|
-1|startup/shutdown DS application|dolphinscheduler-daemon.sh
-2|datasource config properties| datasource.properties
+1|startup or shutdown DS application|dolphinscheduler-daemon.sh
+2|datasource config properties|datasource.properties
 3|ZooKeeper config properties|zookeeper.properties
 4|common-service[storage] config properties|common.properties
 5|API-service config properties|application-api.properties
-6|master config properties|master.properties
-7|worker config properties|worker.properties
+6|master-service config properties|master.properties
+7|worker-service config properties|worker.properties
 8|alert-service config properties|alert.properties
 9|quartz config properties|quartz.properties
 10|DS environment variables configuration script[install/start 
DS]|install_config.conf
@@ -70,11 +73,11 @@ serial number| service classification| config file|
 12|services log config files|API-service log config : logback-api.xml  <br /> 
master-service log config  : logback-master.xml    <br /> worker-service log 
config : logback-worker.xml  <br /> alert-service log config : logback-alert.xml
 
 
-### dolphinscheduler-daemon.sh [startup/shutdown DS application]
+### dolphinscheduler-daemon.sh [startup or shutdown DS application]
 
 dolphinscheduler-daemon.sh is responsible for DS startup and shutdown.
-Essentially, start-all.sh/stop-all.sh startup/shutdown the cluster via 
dolphinscheduler-daemon.sh.
-Currently, DS just makes a basic config, please config further JVＭ options 
based on your practical situation of resources.
+Essentially, start-all.sh and stop-all.sh start up and shut down the cluster 
via dolphinscheduler-daemon.sh.
+Currently, DS only sets a basic config; remember to configure further JVM 
options based on your available resources.
 
 Default simplified parameters are:
 ```bash
@@ -102,10 +105,10 @@ spring.datasource.driver-class-name||datasource driver
 spring.datasource.url||datasource connection url
 spring.datasource.username||datasource username
 spring.datasource.password||datasource password
-spring.datasource.initialSize|5| initail connection pool size number
+spring.datasource.initialSize|5| initial connection pool size number
 spring.datasource.minIdle|5| minimum connection pool size number
 spring.datasource.maxActive|5| maximum connection pool size number
-spring.datasource.maxWait|60000| max wait mili-seconds
+spring.datasource.maxWait|60000| max wait milliseconds
 spring.datasource.timeBetweenEvictionRunsMillis|60000| idle connection check 
interval
 spring.datasource.timeBetweenConnectErrorMillis|60000| retry interval
 spring.datasource.minEvictableIdleTimeMillis|300000| connections over 
minEvictableIdleTimeMillis will be collect when idle check
@@ -116,7 +119,7 @@ spring.datasource.testOnBorrow|true| validity check when 
the program requests a
 spring.datasource.testOnReturn|false| validity check when the program recalls 
a connection
 spring.datasource.defaultAutoCommit|true| whether auto commit
 spring.datasource.keepAlive|true| runs validationQuery SQL to avoid the 
connection closed by pool when the connection idles over 
minEvictableIdleTimeMillis
-spring.datasource.poolPreparedStatements|true| Open PSCache
+spring.datasource.poolPreparedStatements|true| open PSCache
 spring.datasource.maxPoolPreparedStatementPerConnectionSize|20| specify the 
size of PSCache on each connection
 
 
@@ -135,7 +138,7 @@ zookeeper.retry.maxtime|10| maximum retry times
 
 ### common.properties [hadoop、s3、yarn config properties]
 
-Currently, common.properties mainly configures hadoop/s3a related 
configurations.
+Currently, common.properties mainly holds Hadoop and s3a related 
configurations.
 |Parameters | Default value| Description|
 |--|--|--|
 data.basedir.path|/tmp/dolphinscheduler| local directory used to store temp 
files
@@ -148,12 +151,12 @@ login.user.keytab.path|/opt/hdfs.headless.keytab|kerberos 
user keytab
 kerberos.expire.time|2|kerberos expire time,integer,the unit is hour
 resource.view.suffixs| txt,log,sh,conf,cfg,py,java,sql,hql,xml,properties| 
file types supported by resource center
 hdfs.root.user|hdfs| configure users with corresponding permissions if storage 
type is HDFS
-fs.defaultFS|hdfs://mycluster:8020|If resource.storage.type=S3, then the 
request url would be similar to 's3a://dolphinscheduler'. Otherwise if 
resource.storage.type=HDFS and hadoop supports HA, please copy core-site.xml 
and hdfs-site.xml into 'conf' directory
+fs.defaultFS|hdfs://mycluster:8020|If resource.storage.type=S3, then the 
request url would be similar to 's3a://dolphinscheduler'. Otherwise if 
resource.storage.type=HDFS and hadoop supports HA, copy core-site.xml and 
hdfs-site.xml into 'conf' directory
 fs.s3a.endpoint||s3 endpoint url
 fs.s3a.access.key||s3 access key
 fs.s3a.secret.key||s3 secret key
 yarn.resourcemanager.ha.rm.ids||specify the yarn resourcemanager url. if 
resourcemanager supports HA, input HA IP addresses (separated by comma), or 
input null for standalone
-yarn.application.status.address|http://ds1:8088/ws/v1/cluster/apps/%s|keep 
default if resourcemanager supports HA or not use resourcemanager. Or replace 
ds1 with corresponding hostname if resourcemanager in standalone mode
+yarn.application.status.address|http://ds1:8088/ws/v1/cluster/apps/%s|keep 
default if ResourceManager supports HA or not use ResourceManager, or replace 
ds1 with corresponding hostname if ResourceManager in standalone mode
 dolphinscheduler.env.path|env/dolphinscheduler_env.sh|load environment 
variables configs [eg: JAVA_HOME,HADOOP_HOME, HIVE_HOME ...]
 development.state|false| specify whether in development state
 
@@ -179,10 +182,10 @@ security.authentication.type|PASSWORD| authentication type
 |Parameters | Default value| Description|
 |--|--|--|
 master.listen.port|5678|master listen port
-master.exec.threads|100|master execute thread number to limit process 
instances in parallel
-master.exec.task.num|20|master execute task number in parallel per process 
instance
-master.dispatch.task.num|3|master dispatch task number per batch
-master.host.selector|LowerWeight|master host selector to select a suitable 
worker, default value: LowerWeight. Optional values include Random, RoundRobin, 
LowerWeight
+master.exec.threads|100|master-service execute thread number, used to limit 
the number of process instances in parallel
+master.exec.task.num|20|defines the number of parallel tasks for each process 
instance of the master-service
+master.dispatch.task.num|3|defines the number of dispatch tasks for each batch 
of the master-service
+master.host.selector|LowerWeight|master host selector, used to select a 
suitable worker to run the task. Optional values: Random, RoundRobin, LowerWeight
 master.heartbeat.interval|10|master heartbeat interval, the unit is second
 master.task.commit.retryTimes|5|master commit task retry times
 master.task.commit.interval|1000|master commit task interval, the unit is 
millisecond
@@ -194,12 +197,12 @@ master.reserved.memory|0.3|master reserved memory, only 
lower than system availa
 
 |Parameters | Default value| Description|
 |--|--|--|
-worker.listen.port|1234|worker listen port
-worker.exec.threads|100|worker execute thread number to limit task instances 
in parallel
-worker.heartbeat.interval|10|worker heartbeat interval, the unit is second
+worker.listen.port|1234|worker-service listen port
+worker.exec.threads|100|worker-service execute thread number, used to limit 
the number of task instances in parallel
+worker.heartbeat.interval|10|worker-service heartbeat interval, the unit is 
second
 worker.max.cpuload.avg|-1|worker max CPU load avg, only higher than the system 
CPU load average, worker server can be dispatched tasks. default value -1: the 
number of CPU cores * 2
 worker.reserved.memory|0.3|worker reserved memory, only lower than system 
available memory, worker server can be dispatched tasks. default value 0.3, the 
unit is G
-worker.groups|default|worker groups separated by comma, like 
'worker.groups=default,test' <br> worker will join corresponding group 
according to this config when startup
+worker.groups|default|worker groups separated by comma, e.g., 
'worker.groups=default,test' <br> worker will join corresponding group 
according to this config when startup
 
 
 ### alert.properties [alert-service log config]
@@ -232,40 +235,40 @@ plugin.dir|/Users/xx/your/path/to/plugin/dir|plugin 
directory
 
 ### quartz.properties [quartz config properties]
 
-This part describes quartz configs and please configure them based on your 
practical situation and resources.
+This part describes quartz configs; configure them based on your practical 
situation and resources.
 |Parameters | Default value| Description|
 |--|--|--|
-org.quartz.jobStore.driverDelegateClass | 
org.quartz.impl.jdbcjobstore.StdJDBCDelegate
-org.quartz.jobStore.driverDelegateClass | 
org.quartz.impl.jdbcjobstore.PostgreSQLDelegate
-org.quartz.scheduler.instanceName | DolphinScheduler
-org.quartz.scheduler.instanceId | AUTO
-org.quartz.scheduler.makeSchedulerThreadDaemon | true
-org.quartz.jobStore.useProperties | false
-org.quartz.threadPool.class | org.quartz.simpl.SimpleThreadPool
-org.quartz.threadPool.makeThreadsDaemons | true
-org.quartz.threadPool.threadCount | 25
-org.quartz.threadPool.threadPriority | 5
-org.quartz.jobStore.class | org.quartz.impl.jdbcjobstore.JobStoreTX
-org.quartz.jobStore.tablePrefix | QRTZ_
-org.quartz.jobStore.isClustered | true
-org.quartz.jobStore.misfireThreshold | 60000
-org.quartz.jobStore.clusterCheckinInterval | 5000
-org.quartz.jobStore.acquireTriggersWithinLock|true
-org.quartz.jobStore.dataSource | myDs
-org.quartz.dataSource.myDs.connectionProvider.class | 
org.apache.dolphinscheduler.service.quartz.DruidConnectionProvider
-
-
-### install_config.conf [DS environment variables configuration 
script[install/start DS]]
+org.quartz.jobStore.driverDelegateClass | 
org.quartz.impl.jdbcjobstore.StdJDBCDelegate |
+org.quartz.jobStore.driverDelegateClass | 
org.quartz.impl.jdbcjobstore.PostgreSQLDelegate |
+org.quartz.scheduler.instanceName | DolphinScheduler |
+org.quartz.scheduler.instanceId | AUTO |
+org.quartz.scheduler.makeSchedulerThreadDaemon | true |
+org.quartz.jobStore.useProperties | false |
+org.quartz.threadPool.class | org.quartz.simpl.SimpleThreadPool |
+org.quartz.threadPool.makeThreadsDaemons | true |
+org.quartz.threadPool.threadCount | 25 |
+org.quartz.threadPool.threadPriority | 5 |
+org.quartz.jobStore.class | org.quartz.impl.jdbcjobstore.JobStoreTX |
+org.quartz.jobStore.tablePrefix | QRTZ_ |
+org.quartz.jobStore.isClustered | true |
+org.quartz.jobStore.misfireThreshold | 60000 |
+org.quartz.jobStore.clusterCheckinInterval | 5000 |
+org.quartz.jobStore.acquireTriggersWithinLock|true |
+org.quartz.jobStore.dataSource | myDs |
+org.quartz.dataSource.myDs.connectionProvider.class | 
org.apache.dolphinscheduler.service.quartz.DruidConnectionProvider |
+
+
+### install_config.conf [DS environment variables configuration script[install 
or start DS]]
 
 install_config.conf is a bit complicated and is mainly used in the following 
two places.
-* DS Cluster Auto Installation
+* DS Cluster Auto Installation.
 
 > System will load configs in the install_config.conf and auto-configure files 
 > below, based on the file content when executing 'install.sh'.
-> Files such as 
dolphinscheduler-daemon.sh、datasource.properties、zookeeper.properties、common.properties、application-api.properties、master.properties、worker.properties、alert.properties、quartz.properties
 and etc.
+> Files such as dolphinscheduler-daemon.sh, datasource.properties, 
zookeeper.properties, common.properties, application-api.properties, 
master.properties, worker.properties, alert.properties, quartz.properties, etc.
 
+* Startup and Shutdown DS Cluster.
 
-* Startup/Shutdown DS Cluster
-> The system will load masters, workers, alertServer, apiServers and other 
parameters inside the file to startup/shutdown DS cluster.
+> The system will load masters, workers, alert-server, API-servers and other 
parameters inside the file to startup or shutdown DS cluster.
 
 #### File Content as Follows:
 
@@ -297,7 +300,7 @@ 
zkQuorum="192.168.xx.xx:2181,192.168.xx.xx:2181,192.168.xx.xx:2181"
 installPath="/data1_1T/dolphinscheduler"
 
 # Deployment user
-# Note: Deployment user needs 'sudo' privilege and has rights to operate HDFS
+# Note: Deployment user needs 'sudo' privilege and has rights to operate HDFS.
 #     Root directory must be created by the same user if using HDFS, otherwise 
permission related issues will be raised.
 deployUser="dolphinscheduler"
 
@@ -318,16 +321,16 @@ mailUser="xxxxxxxxxx"
 # Mail password
 mailPassword="xxxxxxxxxx"
 
-# Mail supports TLS set true if not set false
+# Whether mail supports TLS
 starttlsEnable="true"
 
-# Mail supports SSL set true if not set false. Note: starttlsEnable and 
sslEnable cannot both set true
+# Whether mail supports SSL. Note: starttlsEnable and sslEnable cannot both be 
set to true.
 sslEnable="false"
 
 # Mail server host, same as mailServerHost
 sslTrust="smtp.exmail.qq.com"
 
-# Specify which resource upload function to use for resources storage such as 
sql files. And supported options are HDFS, S3 and NONE. HDFS for upload to HDFS 
and NONE for not using this function.
+# Specify which resource upload function to use for resources storage, such as 
sql files. And supported options are HDFS, S3 and NONE. HDFS for upload to HDFS 
and NONE for not using this function.
 resourceStorageType="NONE"
 
 # if S3, write S3 address. HA, for example: s3a://dolphinscheduler，
@@ -355,7 +358,7 @@ hdfsRootUser="hdfs"
 
 # Followings are Kerberos configs
 
-# Spicify Kerberos enable or not
+# Specify Kerberos enable or not
 kerberosStartUp="false"
 
 # Kdc krb5 config file path
@@ -395,7 +398,7 @@ apiServers="ds1"
 ### dolphinscheduler_env.sh [load environment variables configs]
 
 When using shell to commit tasks, DS will load environment variables inside 
dolphinscheduler_env.sh into the host.
-Types of tasks involved are: Shell task、Python task、Spark task、Flink 
task、Datax task and etc.
+Types of tasks involved are: Shell, Python, Spark, Flink, DataX, etc.
 ```bash
 export HADOOP_HOME=/opt/soft/hadoop
 export HADOOP_CONF_DIR=/opt/soft/hadoop/etc/hadoop
diff --git a/docs/en-us/dev/user_doc/architecture/design.md 
b/docs/en-us/dev/user_doc/architecture/design.md
index 396e29c..5e0bdf8 100644
--- a/docs/en-us/dev/user_doc/architecture/design.md
+++ b/docs/en-us/dev/user_doc/architecture/design.md
@@ -1,9 +1,10 @@
 # System Architecture Design
 
-Before explaining the architecture of the scheduling system, let's first 
understand the commonly used terms of the scheduling system
+Before explaining the architecture of the scheduling system, let's first get 
to know the terms commonly used in scheduling systems.
 
 ## Glossary
-**DAG：** The full name is Directed Acyclic Graph, referred to as DAG. Task 
tasks in the workflow are assembled in the form of a directed acyclic graph, 
and topological traversal is performed from nodes with zero degrees of entry 
until there are no subsequent nodes. Examples are as follows:
+
+**DAG**: Short for Directed Acyclic Graph. Tasks in the workflow are assembled 
in the form of a directed acyclic graph, and topological traversal starts from 
nodes with zero in-degree and continues until there are no subsequent nodes. 
Examples are as follows:
 
 <p align="center">
   <img src="/img/dag_examples_cn.jpg" alt="dag example"  width="60%" />
@@ -12,27 +13,27 @@ Before explaining the architecture of the scheduling 
system, let's first underst
   </p>
 </p>
 
-**Process definition**: Visualization formed by dragging task nodes and 
establishing task node associations**DAG**
+**Process definition**: A visualized **DAG** formed by dragging and dropping 
task nodes and establishing associations between them.
 
-**Process instance**: The process instance is the instantiation of the process 
definition, which can be generated by manual start or scheduled scheduling. 
Each time the process definition runs, a process instance is generated
+**Process instance**: The process instance is the instantiation of a process 
definition, which can be generated by a manual start or by scheduled 
scheduling. A process instance is generated every time the process definition 
runs.
 
-**Task instance**: The task instance is the instantiation of the task node in 
the process definition, which identifies the specific task execution status
+**Task instance**: The task instance is the instantiation of the task node in 
the process definition, which identifies specific task execution status.
 
-**Task type**: Currently supports SHELL, SQL, SUB_PROCESS (sub-process), 
PROCEDURE, MR, SPARK, PYTHON, DEPENDENT (depends), and plans to support dynamic 
plug-in expansion, note: **SUB_PROCESS**  It is also a separate process 
definition that can be started and executed separately
+**Task type**: Currently supports shell, SQL, SUB_PROCESS (sub-process), 
PROCEDURE, MR, SPARK, PYTHON, DEPENDENT (dependent), and plans to support 
dynamic plug-in extension. Note: **SUB_PROCESS** is also an individual process 
definition that can be started and executed separately.
 
-**Scheduling method**: The system supports scheduled scheduling and manual 
scheduling based on cron expressions. Command type support: start workflow, 
start execution from current node, resume fault-tolerant workflow, resume pause 
process, start execution from failed node, complement, timing, rerun, pause, 
stop, resume waiting thread. Among them **Resume fault-tolerant workflow** and 
**Resume waiting thread** The two command types are used by the internal 
control of scheduling, and canno [...]
+**Scheduling method**: The system supports cron expressions based scheduling 
and manual scheduling. Command type support: start workflow, start execution 
from current node, resume fault-tolerant workflow, resume pause process, start 
execution from failed node, complement, timing, rerun, pause, stop, resume 
waiting thread. Among them **Resume fault-tolerant workflow** and **Resume 
waiting thread** these command types are controlled by the internal scheduling, 
and cannot be called from the [...]
 
-**Scheduled**: System adopts **quartz** distributed scheduler, and supports 
the visual generation of cron expressions
+**Scheduled**: System adopts **quartz** distributed scheduler, and supports 
the visual generation of cron expressions.
 
-**Rely**: The system not only supports **DAG** simple dependencies between the 
predecessor and successor nodes, but also provides **task dependent** nodes, 
supporting **between processes**
+**Rely**: The system not only supports **DAG** simple dependencies between the 
predecessor and successor nodes, but also provides **task dependent** nodes, 
supporting **dependencies between customized tasks of processes**.
 
-**Priority**: Support the priority of process instances and task instances, if 
the priority of process instances and task instances is not set, the default is 
first-in-first-out
+**Priority**: Support the priority of process instances and task instances. By 
default, the priority is first-in-first-out.
 
-**Email alert**: Support **SQL task** Query result email sending, process 
instance running result email alert and fault tolerance alert notification
+**Email alert**: Supports sending **SQL task** query results by email, process 
instance execution result email alerts, and fault tolerance alert notifications.
 
-**Failure strategy**: For tasks running in parallel, if a task fails, two 
failure strategy processing methods are provided. **Continue** refers to 
regardless of the status of the task running in parallel until the end of the 
process failure. **End** means that once a failed task is found, Kill will also 
run the parallel task at the same time, and the process fails and ends
+**Failure strategy**: For parallel tasks, if a task fails, two failure 
strategies are provided. **Continue** means that the other tasks running in 
parallel keep running regardless of the failure, until the process ends as 
failed. **End** means that once a failed task is found, the tasks running in 
parallel are killed and the process ends as failed.
 
-**Complement**: Supplement historical data，Supports **interval parallel and 
serial** two complement methods
+**Complement**: Complements historical data. Supports two complement methods: 
**interval parallel and serial**.
 
 ## System Structure
 
@@ -58,40 +59,40 @@ Before explaining the architecture of the scheduling 
system, let's first underst
 
 * **MasterServer** 
 
-    MasterServer adopts a distributed and centerless design concept. 
MasterServer is mainly responsible for DAG task segmentation, task submission 
monitoring, and monitoring the health status of other MasterServer and 
WorkerServer at the same time.
+    MasterServer adopts a distributed and decentralized design concept. 
MasterServer is mainly responsible for DAG task segmentation, task submission 
monitoring, and monitoring the health status of other MasterServer and 
WorkerServer at the same time.
     When the MasterServer service starts, it registers a temporary node with 
ZooKeeper and performs fault tolerance by monitoring changes in the temporary 
nodes of ZooKeeper.
     MasterServer provides monitoring services based on netty.
 
     #### The Service Mainly Includes:
 
-    - **Distributed Quartz** distributed scheduling component, which is mainly 
responsible for the start and stop operations of scheduled tasks. When Quartz 
starts the task, there will be a thread pool inside the Master that is 
specifically responsible for the follow-up operation of the processing task
+    - The **Distributed Quartz** scheduling component is mainly responsible 
for starting and stopping scheduled tasks. When Quartz starts a task, a thread 
pool inside the Master handles the follow-up processing of the task.
 
-    - **MasterSchedulerThread** is a scanning thread that regularly scans the 
**command** table in the database and performs different business operations 
according to different **command types**
+    - **MasterSchedulerThread** is a scanning thread that regularly scans the 
**command** table in the database and runs different business operations 
according to different **command types**.
 
-    - **MasterExecThread** is mainly responsible for DAG task segmentation, 
task submission monitoring, and logical processing of various command types
+    - **MasterExecThread** is mainly responsible for DAG task segmentation, 
task submission monitoring, and logical processing of the various command types.
 
-    - **MasterTaskExecThread** is mainly responsible for the persistence of 
tasks
+    - **MasterTaskExecThread** is mainly responsible for the persistence of 
tasks.
 
 * **WorkerServer** 
 
      WorkerServer also adopts a distributed and decentralized design concept. 
WorkerServer is mainly responsible for task execution and providing log 
services.
 
      When the WorkerServer service starts, register a temporary node with 
ZooKeeper and maintain a heartbeat.
-     Server provides monitoring services based on netty. Worker
+     Server provides monitoring services based on netty.
   
      #### The Service Mainly Includes:
   
-     - **Fetch TaskThread** is mainly responsible for continuously getting 
tasks from **Task Queue**, and calling **TaskScheduleThread** corresponding 
executor according to different task types.
+     - **Fetch TaskThread** is mainly responsible for continuously getting 
tasks from the **Task Queue** and calling the **TaskScheduleThread** executor 
that corresponds to each task type.
 
 * **ZooKeeper** 
 
-    ZooKeeper service, MasterServer and WorkerServer nodes in the system all 
use ZooKeeper for cluster management and fault tolerance. In addition, the 
system is based on ZooKeeper for event monitoring and distributed locks.
+    The MasterServer and WorkerServer nodes in the system all use the 
ZooKeeper service for cluster management and fault tolerance. In addition, the 
system implements event monitoring and distributed locks based on ZooKeeper.
 
-    We have also implemented queues based on Redis, but we hope that 
DolphinScheduler depends on as few components as possible, so we finally 
removed the Redis implementation.
+    We also implemented queues based on Redis, but because we want 
DolphinScheduler to depend on as few components as possible, we eventually 
removed the Redis implementation.
 
 * **Task Queue** 
 
-    Provide task queue operation, the current queue is also implemented based 
on ZooKeeper. Because there is less information stored in the queue, there is 
no need to worry about too much data in the queue. In fact, we have tested the 
millions of data storage queues, which has no impact on system stability and 
performance.
+    Provides task queue operations. The current queue is also implemented on 
ZooKeeper. Because little information is stored in the queue, there is no need 
to worry about excessive data in it. In fact, we have tested queues holding 
millions of entries, with no impact on system stability or performance.
 
 * **Alert** 
 
@@ -99,12 +100,12 @@ Before explaining the architecture of the scheduling 
system, let's first underst
 
 * **API** 
 
-    The API interface layer is mainly responsible for processing requests from 
the front-end UI layer. The service uniformly provides RESTful APIs to provide 
request services to the outside world. Interfaces include workflow creation, 
definition, query, modification, release, logoff, manual start, stop, pause, 
resume, start execution from the node and so on.
+    The API interface layer is mainly responsible for processing requests from 
the front-end UI layer. The service exposes uniform RESTful APIs to external 
callers.
+    Interfaces include workflow creation, definition, query, modification, 
release, logoff, manual start, stop, pause, resume, start execution from a 
specific node, etc.
 
 * **UI** 
 
-  The front-end page of the system provides various visual operation 
interfaces of the system,See more
-  at [Introduction to Functions](../guide/homepage.md) section。
+  The front-end pages provide the visual operation interfaces of the system. 
See the [Introduction to Functions](../guide/homepage.md) section for more.
 
 ### Architecture Design Ideas
 
@@ -112,17 +113,17 @@ Before explaining the architecture of the scheduling 
system, let's first underst
 
 ##### Centralized Thinking
 
-The centralized design concept is relatively simple. The nodes in the 
distributed cluster are divided into roles according to roles, which are 
roughly divided into two roles:
+The centralized design concept is relatively simple. The nodes in the 
distributed cluster are roughly divided into two roles according to 
responsibilities:
 <p align="center">
    <img 
src="https://analysys.github.io/easyscheduler_docs_cn/images/master_slave.png" 
alt="master-slave character"  width="50%" />
  </p>
 
-- The role of the master is mainly responsible for task distribution and 
monitoring the health status of the slave, and can dynamically balance the task 
to the slave, so that the slave node will not be in a "busy dead" or "idle 
dead" state.
-- The role of Worker is mainly responsible for task execution and maintenance 
and Master's heartbeat, so that Master can assign tasks to Slave.
+- The Master is mainly responsible for task distribution and for monitoring 
the health status of the Slaves. It can dynamically balance tasks across the 
Slaves, so that no Slave node ends up in a "busy dead" or "idle dead" state.
+- The Worker is mainly responsible for task execution and for maintaining a 
heartbeat with the Master, so that the Master can assign tasks to the Slave.
 
 Problems in centralized thought design:
 
-- Once there is a problem with the Master, the dragons are headless and the 
entire cluster will collapse. In order to solve this problem, most of the 
Master/Slave architecture models adopt the design scheme of active and standby 
Master, which can be hot standby or cold standby, or automatic switching or 
manual switching, and more and more new systems are beginning to have The 
ability to automatically elect and switch Master to improve the availability of 
the system.
+- Once there is a problem with the Master, the cluster is left leaderless and 
collapses entirely. To solve this problem, most Master/Slave architectures 
adopt an active and standby Master design, which can use hot or cold standby 
with automatic or manual switching. More and more new systems are able to 
elect and switch the Master automatically to improve the availability of the 
system.
 - Another problem is that if the Scheduler is on the Master, although it can 
support different tasks in a DAG running on different machines, it will cause 
the Master to be overloaded. If the Scheduler is on the slave, all tasks in a 
DAG can only submit jobs on a certain machine. When there are more parallel 
tasks, the pressure on the slave may be greater.
 
 ##### Decentralized
@@ -131,23 +132,20 @@ Problems in centralized thought design:
    <img 
src="https://analysys.github.io/easyscheduler_docs_cn/images/decentralization.png"
 alt="Decentralization"  width="50%" />
  </p>
 
-- In the decentralized design, there is usually no concept of Master/Slave, 
all roles are the same, the status is equal, the global Internet is a typical 
decentralized distributed system, any node equipment connected to the network 
is down, All will only affect a small range of functions.
-- The core design of decentralized design is that there is no "manager" 
different from other nodes in the entire distributed system, so there is no 
single point of failure. However, because there is no "manager" node, each node 
needs to communicate with other nodes to obtain the necessary machine 
information, and the unreliability of distributed system communication greatly 
increases the difficulty of implementing the above functions.
+- In a decentralized design, there is usually no concept of Master or Slave. 
All roles are the same and all nodes have equal status. The global Internet is 
a typical decentralized distributed system: when any node connected to the 
network goes down, only a small range of functions is affected.
+- The core of decentralized design is that no node in the entire distributed 
system acts as a "manager" distinct from the others, so there is no single 
point of failure. However, because there is no "manager" node, each node needs 
to communicate with other nodes to obtain the necessary machine information, 
and the unreliability of communication in distributed systems greatly 
increases the difficulty of implementing the functions above.
 - In fact, truly decentralized distributed systems are rare. Instead, 
dynamically centralized distributed systems keep emerging. Under this 
architecture, the managers in the cluster are dynamically elected rather than 
preset, and when the cluster fails, its nodes automatically hold "meetings" to 
elect new "managers" to preside over the work. The most typical examples are 
ZooKeeper and Etcd, which is implemented in Go.
-
-
-
-- The decentralization of DolphinScheduler is that the Master/Worker is 
registered in ZooKeeper, and the Master cluster and Worker cluster are 
centerless, and the ZooKeeper distributed lock is used to elect one of the 
Master or Worker as the "manager" to perform the task.
+- DolphinScheduler achieves decentralization by registering the Masters and 
Workers in ZooKeeper, so that neither the Master cluster nor the Worker 
cluster has a center. A ZooKeeper distributed lock is used to elect one of the 
Masters or Workers as the "manager" that performs the task.
 
 #### Distributed Lock Practice
 
-DolphinScheduler uses ZooKeeper distributed lock to realize that only one 
Master executes Scheduler at the same time, or only one Worker executes the 
submission of tasks.
+DolphinScheduler uses a ZooKeeper distributed lock to ensure that only one 
Master executes the Scheduler at a time, or that only one Worker submits tasks 
at a time.
 1. The core process algorithm for acquiring distributed locks is as follows:
  <p align="center">
    <img 
src="https://analysys.github.io/easyscheduler_docs_cn/images/distributed_lock.png"
 alt="Obtain distributed lock process"  width="50%" />
  </p>
 
-2. Flow chart of implementation of Scheduler thread distributed lock in 
DolphinScheduler:
+2. Flow diagram of implementation of Scheduler thread distributed lock in 
DolphinScheduler:
  <p align="center">
    <img src="/img/distributed_lock_procss.png" alt="Obtain distributed lock 
process"  width="50%" />
  </p>
@@ -155,36 +153,37 @@ DolphinScheduler uses ZooKeeper distributed lock to 
realize that only one Master
 
 #### Insufficient Thread Loop Waiting Problem
 
--  If there is no sub-process in a DAG, if the number of data in the Command 
is greater than the threshold set by the thread pool, the process directly 
waits or fails.
--  If many sub-processes are nested in a large DAG, the following figure will 
produce a "dead" state:
+-  If a DAG has no sub-processes, the process simply waits or fails when the 
number of records in the Command exceeds the threshold set by the thread pool.
+-  If a large DAG nests many sub-processes, a "dead" state can arise, as shown 
in the following figure:
 
  <p align="center">
    <img 
src="https://analysys.github.io/easyscheduler_docs_cn/images/lack_thread.png" 
alt="Insufficient threads waiting loop problem"  width="50%" />
  </p>
-In the above figure, MainFlowThread waits for the end of SubFlowThread1, 
SubFlowThread1 waits for the end of SubFlowThread2, SubFlowThread2 waits for 
the end of SubFlowThread3, and SubFlowThread3 waits for a new thread in the 
thread pool, then the entire DAG process cannot end, so that the threads cannot 
be released. In this way, the state of the child-parent process loop waiting is 
formed. At this time, unless a new Master is started to add threads to break 
such a "stalemate", the sched [...]
+In the above figure, MainFlowThread waits for the end of SubFlowThread1, 
SubFlowThread1 waits for the end of SubFlowThread2, SubFlowThread2 waits for 
the end of SubFlowThread3, and SubFlowThread3 waits for a new thread in the 
thread pool, then the entire DAG process cannot finish, and the threads cannot 
be released. In this situation, the state of the child-parent process loop 
waiting is formed. At this moment, unless a new Master is started and add 
threads to break such a "stalemate", t [...]
 
 It seems a bit unsatisfactory to start a new Master to break the deadlock, so 
we proposed the following three solutions to reduce this risk:
 
-1. Calculate the sum of all Master threads, and then calculate the number of 
threads required for each DAG, that is, pre-calculate before the DAG process is 
executed. Because it is a multi-master thread pool, the total number of threads 
is unlikely to be obtained in real time. 
-2. Judge the single-master thread pool. If the thread pool is full, let the 
thread fail directly.
+1. Calculate the sum of all Master threads, and then calculate the number of 
threads required for each DAG, that is, pre-calculate before the DAG process 
executes. Because the thread pools are spread across multiple Masters, the 
total number of threads is unlikely to be obtainable in real time. 
+2. Check the single-Master thread pool. If the thread pool is full, let the 
thread fail directly.
 3. Add a Command type for insufficient resources. If the thread pool is 
insufficient, suspend the main process. Once new threads become available in 
the thread pool, the process suspended for insufficient resources is woken up 
and executed again.
 
-note: The Master Scheduler thread is executed by FIFO when acquiring the 
Command.
+Note: The Master Scheduler thread executes by FIFO when acquiring the Command.
 
-So we chose the third way to solve the problem of insufficient threads.
+So we choose the third way to solve the problem of insufficient threads.
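
The third option can be pictured with a small simulation. This is only a sketch under assumptions: the class name, the `RESUME_WAITING_THREAD` command string, and the single in-memory queue are hypothetical stand-ins for the Command table and the Master thread pool, not DolphinScheduler code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical model: a fixed-size pool plus a FIFO queue of "resume" commands.
final class WaitingThreadSketch {
    private final int poolSize;
    private int busy;
    private final Deque<String> commands = new ArrayDeque<>();

    WaitingThreadSketch(int poolSize) {
        this.poolSize = poolSize;
    }

    // Starts the process if a thread is free; otherwise suspends it by
    // queueing a resume command instead of blocking a thread.
    boolean submit(String processName) {
        if (busy < poolSize) {
            busy++;
            return true;
        }
        commands.addLast("RESUME_WAITING_THREAD:" + processName);
        return false;
    }

    // Frees one thread and, in FIFO order, wakes up one suspended process.
    // Returns the resumed process name, or null if nothing was waiting.
    String release() {
        busy--;
        String command = commands.pollFirst();
        if (command == null) {
            return null;
        }
        busy++;
        return command.substring(command.indexOf(':') + 1);
    }

    public static void main(String[] args) {
        WaitingThreadSketch master = new WaitingThreadSketch(1);
        System.out.println(master.submit("main-flow")); // prints true
        System.out.println(master.submit("sub-flow"));  // prints false
        System.out.println(master.release());           // prints sub-flow
    }
}
```

The point of the design is that a suspended process holds no thread while it waits, so nested sub-processes cannot starve each other.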
 
 
 #### Fault-Tolerant Design
-Fault tolerance is divided into service downtime fault tolerance and task 
retry, and service downtime fault tolerance is divided into master fault 
tolerance and worker fault tolerance.
+
+Fault tolerance is divided into service downtime fault tolerance and task 
retry, and service downtime fault tolerance is further divided into Master 
fault tolerance and Worker fault tolerance.
 
 ##### Downtime Fault Tolerance
 
-The service fault-tolerance design relies on ZooKeeper's Watcher mechanism, 
and the implementation principle is shown in the figure:
+The service fault-tolerance design relies on ZooKeeper's Watcher mechanism, 
and the implementation principle is shown in the figure:
 
  <p align="center">
    <img 
src="https://analysys.github.io/easyscheduler_docs_cn/images/fault-tolerant.png"
 alt="DolphinScheduler fault-tolerant design"  width="40%" />
  </p>
-Among them, the Master monitors the directories of other Masters and Workers. 
If the remove event is heard, fault tolerance of the process instance or task 
instance will be performed according to the specific business logic.
+The Master monitors the directories of other Masters and Workers. When a 
remove event is triggered, it performs fault tolerance for the process 
instance or task instance according to the specific business logic.
 
 
 
@@ -194,11 +193,11 @@ Among them, the Master monitors the directories of other 
Masters and Workers. If
    <img src="/img/failover-master.jpg" alt="failover-master"  width="50%" />
  </p>
 
-Fault tolerance range: From the perspective of host, the fault tolerance range 
of Master includes: own host + node host that does not exist in the registry, 
and the entire process of fault tolerance will be locked;
+Fault tolerance range: From the host perspective, the fault tolerance range 
of the Master covers its own host and node hosts that do not exist in the 
registry, and the entire fault tolerance process is locked;
 
-Fault-tolerant content: Master's fault-tolerant content includes: 
fault-tolerant process instances and task instances. Before fault-tolerant, it 
compares the start time of the instance with the server start-up time, and 
skips fault-tolerance if after the server start time;
+Fault-tolerant content: The Master's fault tolerance covers process instances 
and task instances. Before performing fault tolerance, it compares the start 
time of the instance with the server start-up time, and skips fault tolerance 
if the instance started after the server start-up time;
 
-Fault-tolerant post-processing: After the fault tolerance of ZooKeeper Master 
is completed, it is re-scheduled by the Scheduler thread in DolphinScheduler, 
traverses the DAG to find the "running" and "submit successful" tasks, monitors 
the status of its task instances for the "running" tasks, and "commits 
successful" tasks It is necessary to determine whether the task queue already 
exists. If it exists, the status of the task instance is also monitored. If it 
does not exist, resubmit the [...]
+Fault-tolerant post-processing: After ZooKeeper Master fault tolerance 
completes, the Scheduler thread in DolphinScheduler re-schedules the process 
and traverses the DAG to find the "running" and "submit successful" tasks. For 
"running" tasks, it monitors the status of their task instances. For "submit 
successful" tasks, it checks whether the task already exists in the task 
queue: if it does, it monitors the status of the task instance; otherwise, it 
resubmits the task instance.
 
 - Worker fault tolerance:
 
@@ -208,60 +207,60 @@ Fault-tolerant post-processing: After the fault tolerance 
of ZooKeeper Master is
 
 Fault tolerance range: From the perspective of process instance, each Master 
is only responsible for fault tolerance of its own process instances; it takes 
the lock only during `handleDeadServer`;
 
-Fault-tolerant content: When sending the remove event of the Worker node, the 
Master only fault-tolerant task instances. Before fault-tolerant, it compares 
the start time of the instance with the server start-up time, and skips 
fault-tolerance if after the server start time;
+Fault-tolerant content: When the remove event of a Worker node is sent, the 
Master performs fault tolerance only for task instances. Before doing so, it 
compares the start time of the instance with the server start-up time, and 
skips fault tolerance if the instance started after the server start-up time;
 
 Fault-tolerant post-processing: Once the Master Scheduler thread finds that 
the task instance is in the "fault-tolerant" state, it takes over the task and 
resubmits it.
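
The start-time check described above can be sketched as a single comparison. This is a minimal illustration, assuming the two timestamps are already known; the class and method names are hypothetical and not taken from the DolphinScheduler code base.

```java
import java.time.Instant;

// Hypothetical helper: decides whether an instance should be failed over.
final class FailoverCheckSketch {
    // An instance that started after the server start-up time is skipped;
    // anything that started at or before it is a failover candidate.
    static boolean needsFailover(Instant instanceStart, Instant serverStart) {
        return !instanceStart.isAfter(serverStart);
    }

    public static void main(String[] args) {
        Instant serverStart = Instant.parse("2022-03-08T10:05:00Z");
        Instant olderInstance = Instant.parse("2022-03-08T10:00:00Z");
        Instant newerInstance = Instant.parse("2022-03-08T10:10:00Z");
        System.out.println(needsFailover(olderInstance, serverStart)); // prints true
        System.out.println(needsFailover(newerInstance, serverStart)); // prints false
    }
}
```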
 
- Note: Due to "network jitter", the node may lose its heartbeat with ZooKeeper 
in a short period of time, and the node's remove event may occur. For this 
situation, we use the simplest way, that is, once the node and ZooKeeper 
timeout connection occurs, then directly stop the Master or Worker service.
+ Note: Due to "network jitter", a node may briefly lose its heartbeat with 
ZooKeeper, which can trigger the node's remove event. For this situation we 
take the simplest approach: once the connection between a node and ZooKeeper 
times out, the Master or Worker service is stopped directly.
 
 ##### Task Failed and Try Again
 
-Here we must first distinguish the concepts of task failure retry, process 
failure recovery, and process failure rerun:
+Here we must first distinguish the concepts of task failure retry, process 
failure recovery, and process failure re-run:
 
-- Task failure retry is at the task level and is automatically performed by 
the scheduling system. For example, if a Shell task is set to retry for 3 
times, it will try to run it again up to 3 times after the Shell task fails.
-- Process failure recovery is at the process level and is performed manually. 
Recovery can only be performed **from the failed node** or **from the current 
node**
-- Process failure rerun is also at the process level and is performed 
manually, rerun is performed from the start node
+- Task failure retry is at the task level and is performed automatically by 
the scheduling system. For example, if a Shell task is set to retry 3 times, 
it will be run again up to 3 times after the Shell task fails.
+- Process failure recovery is at the process level and is performed manually. 
Recovery can only be performed **from the failed node** or **from the current 
node**.
+- Process failure re-run is also at the process level and is performed 
manually. A re-run starts from the beginning node.
 
-Next to the topic, we divide the task nodes in the workflow into two types.
+Back to the main point: we divide the task nodes in the workflow into two 
types.
 
-- One is a business node, which corresponds to an actual script or processing 
statement, such as Shell node, MR node, Spark node, and dependent node.
+- One is a business node, which corresponds to an actual script or process 
command, such as shell node, MR node, Spark node, and dependent node.
 
-- There is also a logical node, which does not do actual script or statement 
processing, but only logical processing of the entire process flow, such as 
sub-process sections.
+- The other is a logical node, which does not run an actual script or process 
command but only handles the logic of the entire process flow, such as 
sub-process sections.
 
-Each **business node** can be configured with the number of failed retries. 
When the task node fails, it will automatically retry until it succeeds or 
exceeds the configured number of retries. **Logical node** Failure retry is not 
supported. But the tasks in the logical node support retry.
+Each **business node** can be configured with a number of failure retries. 
When the task node fails, it retries automatically until it succeeds or 
exceeds the configured number of retries. **Logical nodes** do not support 
failure retry, but the tasks inside a logical node do.
 
-If there is a task failure in the workflow that reaches the maximum number of 
retries, the workflow will fail to stop, and the failed workflow can be 
manually rerun or process recovery operation
+If a task in the workflow fails and reaches the maximum number of retries, 
the workflow fails and stops. The failed workflow can then be manually re-run 
or recovered.
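
The task-level retry rule described above can be sketched as a bounded loop. This is an illustrative sketch only: the class name and the use of a `BooleanSupplier` to stand for a business node are assumptions, not the scheduler's real task interface.

```java
import java.util.function.BooleanSupplier;

// Hypothetical retry loop for a single business node.
final class TaskRetrySketch {
    // Runs the task once, then retries up to maxRetries more times while it
    // keeps failing. Returns the number of attempts used on success, or -1
    // if the task still fails after the last retry.
    static int runWithRetry(BooleanSupplier task, int maxRetries) {
        for (int attempt = 1; attempt <= maxRetries + 1; attempt++) {
            if (task.getAsBoolean()) {
                return attempt;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // A task that fails twice and succeeds on its third run.
        int attempts = runWithRetry(() -> ++calls[0] >= 3, 3);
        System.out.println(attempts); // prints 3
    }
}
```

A return value of -1 corresponds to the case where the workflow fails and stops, leaving manual re-run or recovery as the remaining options.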
 
 #### Task Priority Design
 
-In the early scheduling design, if there is no priority design and the fair 
scheduling design is used, the task submitted first may be completed at the 
same time as the task submitted later, and the process or task priority cannot 
be set, so We have redesigned this, and our current design is as follows:
+In the early scheduling design, there was no priority design and fair 
scheduling was used, so a task submitted first might complete at the same time 
as a task submitted later, and process or task priority could not be set. We 
therefore redesigned this, and our current design is as follows:
 
--  According to **priority of different process instances** priority over 
**priority of the same process instance** priority over **priority of tasks 
within the same process**priority over **tasks within the same 
process**submission order from high to Low task processing.
-    - The specific implementation is to parse the priority according to the 
JSON of the task instance, and then save the **process instance 
priority_process instance id_task priority_task id** information in the 
ZooKeeper task queue, when obtained from the task queue, pass String comparison 
can get the tasks that need to be executed first
+-  Tasks are processed from highest to lowest in this order: **priority of 
different process instances** takes precedence over **priority of the same 
process instance**, which takes precedence over **priority of tasks within the 
same process**, which takes precedence over the submission order of **tasks 
within the same process**.
+    - The specific implementation parses the priority from the JSON of the 
task instance and then saves the **process instance priority_process instance 
id_task priority_task id** information to the ZooKeeper task queue. When tasks 
are fetched from the task queue, a string comparison yields the task that 
should be executed first.
 
-        - The priority of the process definition is to consider that some 
processes need to be processed before other processes. This can be configured 
when the process is started or scheduled to start. There are 5 levels in total, 
which are HIGHEST, HIGH, MEDIUM, LOW, and LOWEST. As shown below
+          - Process definition priority exists because some processes need to 
be handled before others. It can be configured when the process is started or 
scheduled. There are 5 levels in total: HIGHEST, HIGH, MEDIUM, LOW, and 
LOWEST, as shown below:
             <p align="center">
                <img 
src="https://user-images.githubusercontent.com/10797147/146744784-eb351b14-c94a-4ed6-8ba4-5132c2a3d116.png"
 alt="Process priority configuration"  width="40%" />
              </p>
 
-        - The priority of the task is also divided into 5 levels, followed by 
HIGHEST, HIGH, MEDIUM, LOW, LOWEST. As shown below
+        - The priority of the task is also divided into 5 levels, in the order 
HIGHEST, HIGH, MEDIUM, LOW, LOWEST, as shown below:
             <p align="center">
                <img 
src="https://user-images.githubusercontent.com/10797147/146744830-5eac611f-5933-4f53-a0c6-31613c283708.png"
 alt="Task priority configuration"  width="35%" />
              </p>
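
The queue-key idea above can be sketched in a few lines. The sketch assumes that each priority level is encoded as a number where a lower value means a higher priority (HIGHEST = 0 through LOWEST = 4); that encoding, the class name, and the method names are hypothetical and only illustrate the string-comparison approach.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical builder for "processPriority_processInstanceId_taskPriority_taskId" keys.
final class TaskPriorityKeySketch {
    // Assumed encoding: ordinal 0 is the highest priority, 4 the lowest.
    enum Priority { HIGHEST, HIGH, MEDIUM, LOW, LOWEST }

    static String key(Priority processPriority, int processInstanceId,
                      Priority taskPriority, int taskId) {
        return processPriority.ordinal() + "_" + processInstanceId + "_"
                + taskPriority.ordinal() + "_" + taskId;
    }

    // Picks the key that sorts first under plain string comparison.
    static String next(List<String> queue) {
        return Collections.min(queue);
    }

    public static void main(String[] args) {
        List<String> queue = new ArrayList<>();
        queue.add(key(Priority.MEDIUM, 7, Priority.HIGH, 3));   // "2_7_1_3"
        queue.add(key(Priority.HIGHEST, 9, Priority.LOW, 1));   // "0_9_3_1"
        queue.add(key(Priority.HIGHEST, 9, Priority.HIGH, 2));  // "0_9_1_2"
        System.out.println(next(queue)); // prints 0_9_1_2
    }
}
```

One caveat of plain string comparison is that ids with different digit counts sort lexicographically ("10" before "9"), so a real implementation would likely need fixed-width or zero-padded ids.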
 
 #### Logback and Netty Implement Log Access
 
--  Since Web (UI) and Worker are not necessarily on the same machine, viewing 
the log cannot be like querying a local file. There are two options:
-  -  Put logs on the ES search engine
-  -  Obtain remote log information through netty communication
+-  Since the Web (UI) and the Worker are not always on the same machine, 
viewing a log is not like querying a local file. There are two options:
+  -  Put logs on the ES search engine.
+  -  Obtain remote log information through netty communication.
 
--  In consideration of the lightness of DolphinScheduler as much as possible, 
so I chose gRPC to achieve remote access to log information.
+-  To keep DolphinScheduler as lightweight as possible, gRPC was chosen to 
provide remote access to log information.
 
  <p align="center">
    <img src="https://analysys.github.io/easyscheduler_docs_cn/images/grpc.png" 
alt="grpc remote access"  width="50%" />
  </p>
 
-- We use the FileAppender and Filter functions of the custom Logback to 
realize that each task instance generates a log file.
+- We use customized Logback FileAppender and Filter functions so that each 
task instance generates its own log file.
 - FileAppender is mainly implemented as follows:
 
 ```java
@@ -290,7 +289,7 @@ In the early scheduling design, if there is no priority 
design and the fair sche
 }
 ```
 
-Generate logs in the form of /process definition id/process instance id/task 
instance id.log
+Generate logs in the form of /process definition id /process instance id /task 
instance id.log
 
 - Filter to match the thread name starting with TaskLogInfo:
 
@@ -314,22 +313,23 @@ public class TaskLogFilter extends Filter<ILoggingEvent> {
 
 ## Module Introduction
 
-- dolphinscheduler-alert alarm module, providing AlertServer service.
+- dolphinscheduler-alert: alarm module, providing AlertServer service.
 
-- dolphinscheduler-api web application module, providing ApiServer service.
+- dolphinscheduler-api: web application module, providing ApiServer service.
 
-- dolphinscheduler-common General constant enumeration, utility class, data 
structure or base class
+- dolphinscheduler-common: contains general constant enumeration, utility 
class, data structure and base class.
 
-- dolphinscheduler-dao provides operations such as database access.
+- dolphinscheduler-dao: provides operations such as database access.
 
-- dolphinscheduler-remote client and server based on netty
+- dolphinscheduler-remote: client and server based on netty.
 
-- dolphinscheduler-server MasterServer and WorkerServer services
+- dolphinscheduler-server: MasterServer and WorkerServer services.
 
-- dolphinscheduler-service service module, including Quartz, ZooKeeper, log 
client access service, easy to call server module and api module
+- dolphinscheduler-service: service module, including Quartz, ZooKeeper, log 
client access service, convenient for calling from server module and API module.
 
-- dolphinscheduler-ui front-end module
+- dolphinscheduler-ui: front-end module.
 
 ## Sum Up
-From the perspective of scheduling, this article preliminarily introduces the 
architecture principles and implementation ideas of the big data distributed 
workflow scheduling system-DolphinScheduler. To be continued
+
+From the perspective of scheduling, this article preliminarily introduces the 
architecture principles and implementation ideas of the big data distributed 
workflow scheduling system: DolphinScheduler. To be continued.
 
diff --git a/docs/en-us/dev/user_doc/architecture/load-balance.md 
b/docs/en-us/dev/user_doc/architecture/load-balance.md
index e21abba..5cf2d4e 100644
--- a/docs/en-us/dev/user_doc/architecture/load-balance.md
+++ b/docs/en-us/dev/user_doc/architecture/load-balance.md
@@ -6,17 +6,17 @@ Load balancing refers to the reasonable allocation of server 
pressure through ro
 
 DolphinScheduler-Master allocates tasks to workers, and by default provides 
three algorithms:
 
-Weighted random (random)
+- Weighted random (random)
 
-Smoothing polling (roundrobin)
+- Smoothing polling (roundrobin)
 
-Linear load (lowerweight)
+- Linear load (lowerweight)
 
 The default configuration is the linear load.
 
-As the routing is done on the client side, the master service, you can change 
master.host.selector in master.properties to configure the algorithm what you 
want.
+As the routing is done on the client side (the master service), you can change 
master.host.selector in master.properties to choose the algorithm.
 
-e.g. master.host.selector = random (case-insensitive)
+e.g. master.host.selector=random (case-insensitive)
 
 ## Worker Load Balancing Configuration
 
@@ -24,34 +24,34 @@ The configuration file is worker.properties
 
 ### Weight
 
-All of the above load algorithms are weighted based on weights, which affect 
the outcome of the triage. You can set different weights for different machines 
by modifying the worker.weight value.
+All of the load algorithms above take worker weights into account, and the 
weights affect the routing outcome. You can set different weights for different 
machines by modifying the `worker.weight` value.
 
 ### Preheating
 
-With JIT optimisation in mind, we will let the worker run at low power for a 
period of time after startup so that it can gradually reach its optimal state, 
a process we call preheating. If you are interested, you can read some articles 
about JIT.
+With JIT optimization in mind, a worker runs at low power for a period of time 
after startup so that it can gradually reach its optimal state, a process we 
call preheating. If you are interested, you can read some articles about JIT.
 
-So the worker will gradually reach its maximum weight over time after it 
starts (by default ten minutes, we don't provide a configuration item, you can 
change it and submit a PR if needed).
+So the worker gradually reaches its maximum weight over time after it starts 
up (ten minutes by default; there is no configuration item for the preheating 
duration, so please submit a PR if you need to change it).
 
-## Load Balancing Algorithm Breakdown
+## Load Balancing Algorithm in Detail
 
 ### Random (Weighted)
 
-This algorithm is relatively simple, one of the matched workers is selected at 
random (the weighting affects his weighting).
+This algorithm is relatively simple: one of the matched workers is selected at 
random (its weight affects how likely it is to be chosen).
 
 ### Smoothed Polling (Weighted)
 
-An obvious drawback of the weighted polling algorithm. Namely, under certain 
specific weights, weighted polling scheduling generates an uneven sequence of 
instances, and this unsmoothed load may cause some instances to experience 
transient high loads, leading to a risk of system downtime. To address this 
scheduling flaw, we provide a smooth weighted polling algorithm.
+The plain weighted polling algorithm has an obvious drawback: under certain 
weights, it generates an imbalanced sequence of instances, and this unsmooth 
load may cause some instances to experience transient high loads, leading to a 
risk of system crash. To address this scheduling flaw, we provide a smooth 
weighted polling algorithm.
 
-Each worker is given two weights, weight (which remains constant after warm-up 
is complete) and current_weight (which changes dynamically), for each route. 
The current_weight + weight is iterated over all the workers, and the weight of 
all the workers is added up and counted as total_weight, then the worker with 
the largest current_weight is selected as the worker for this task. 
current_weight-total_weight.
+Each worker has two weight parameters: weight (which remains constant after 
warm-up is complete) and current_weight (which changes dynamically). For every 
route, iterate over all the workers, add weight to each worker's 
current_weight, and sum the weights of all the workers as total_weight. The 
worker with the largest current_weight is then selected as the worker for this 
task, and total_weight is subtracted from that worker's current_weight.
 
 ### Linear Weighting (Default Algorithm)
 
-The algorithm reports its own load information to the registry at regular 
intervals. We base our judgement on two main pieces of information
+With this algorithm, each worker reports its own load information to the 
registry at regular intervals. The decision is based on two main pieces of 
information:
 
 - load average (default is the number of CPU cores * 2)
 - available physical memory (default is 0.3, in G)
 
-If either of the two is lower than the configured item, then this worker will 
not participate in the load. (no traffic will be allocated)
+If either of these is lower than the configured item, then this worker will 
not participate in the load. (no traffic will be allocated)
 
 You can customise the configuration by changing the following properties in 
worker.properties
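
The smoothed polling steps described in the load-balance.md hunk above can be sketched as follows. This is a minimal illustration in Python, not the DolphinScheduler implementation: the `Worker` class, the `select_worker` function, and the host names are hypothetical, and only the `weight`, `current_weight`, and `total_weight` quantities come from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Worker:
    """Hypothetical worker record, for illustration only."""
    host: str
    weight: int              # static weight (constant once warm-up is done)
    current_weight: int = 0  # dynamic weight, updated on every route


def select_worker(workers: List[Worker]) -> Worker:
    """Pick one worker using smooth weighted polling."""
    if not workers:
        raise ValueError("no workers available")
    total_weight = 0
    best = None
    for w in workers:
        # Step 1: raise every worker's current_weight by its static weight.
        w.current_weight += w.weight
        total_weight += w.weight
        # Step 2: remember the worker with the largest current_weight.
        if best is None or w.current_weight > best.current_weight:
            best = w
    # Step 3: push the chosen worker back by the sum of all weights.
    best.current_weight -= total_weight
    return best


# Hypothetical hosts with weights 5, 1 and 1.
workers = [Worker("a", 5), Worker("b", 1), Worker("c", 1)]
sequence = [select_worker(workers).host for _ in range(7)]
print(sequence)  # ['a', 'a', 'b', 'a', 'c', 'a', 'a']
```

Over one full cycle of seven routes, worker `a` is chosen five times, but its selections are interleaved with `b` and `c` instead of arriving as five consecutive picks, which is the smoothing the text refers to.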
 
diff --git a/docs/en-us/dev/user_doc/architecture/metadata.md 
b/docs/en-us/dev/user_doc/architecture/metadata.md
index 616e50f..54ebc56 100644
--- a/docs/en-us/dev/user_doc/architecture/metadata.md
+++ b/docs/en-us/dev/user_doc/architecture/metadata.md
@@ -1,18 +1,16 @@
 # MetaData
 
-<a name="V5KOl"></a>
-
 ## DolphinScheduler DB Table Overview
 
 | Table Name | Comment |
 | :---: | :---: |
-| t_ds_access_token | token for access ds backend |
+| t_ds_access_token | token for access DolphinScheduler backend |
 | t_ds_alert | alert detail |
 | t_ds_alertgroup | alert group |
 | t_ds_command | command detail |
 | t_ds_datasource | data source |
 | t_ds_error_command | error command detail |
-| t_ds_process_definition | process difinition |
+| t_ds_process_definition | process definition |
 | t_ds_process_instance | process instance |
 | t_ds_project | project |
 | t_ds_queue | queue |
@@ -20,62 +18,52 @@
 | t_ds_relation_process_instance | sub process |
 | t_ds_relation_project_user | project related to user |
 | t_ds_relation_resources_user | resource related to user |
-| t_ds_relation_udfs_user | UDF related to user |
+| t_ds_relation_udfs_user | UDF functions related to user |
 | t_ds_relation_user_alertgroup | alert group related to user |
 | t_ds_resources | resoruce center file |
-| t_ds_schedules | process difinition schedule |
+| t_ds_schedules | process definition schedule |
 | t_ds_session | user login session |
 | t_ds_task_instance | task instance |
 | t_ds_tenant | tenant |
 | t_ds_udfs | UDF resource |
 | t_ds_user | user detail |
-| t_ds_version | ds version |
+| t_ds_version | DolphinScheduler version |
 
 
 ---
 
-<a name="XCLy1"></a>
-
 ## E-R Diagram
 
-<a name="5hWWZ"></a>
-
 ### User Queue DataSource
 
 ![image.png](/img/metadata-erd/user-queue-datasource.png)
 
-- Multiple users can belong to one tenant
-- The queue field in the t_ds_user table stores the queue_name information in 
the t_ds_queue table, but t_ds_tenant stores queue information using queue_id. 
During the execution of the process definition, the user queue has the highest 
priority. If the user queue is empty, the tenant queue is used.
-- The user_id field in the t_ds_datasource table indicates the user who 
created the data source. The user_id in t_ds_relation_datasource_user indicates 
the user who has permission to the data source.
-<a name="7euSN"></a>
+- One tenant can own multiple users.
+- The queue field in the t_ds_user table stores the queue_name information from 
the t_ds_queue table, while t_ds_tenant stores queue information using the 
queue_id column. During the execution of the process definition, the user queue 
has the highest priority. If the user queue is null, the tenant queue is used.
+- The user_id field in the t_ds_datasource table shows the user who created the 
data source. The user_id in t_ds_relation_datasource_user shows the user who 
has permission to the data source.
   
 ### Project Resource Alert
 
 ![image.png](/img/metadata-erd/project-resource-alert.png)
 
-- User can have multiple projects, User project authorization completes the 
relationship binding using project_id and user_id in t_ds_relation_project_user 
table
-- The user_id in the t_ds_projcet table represents the user who created the 
project, and the user_id in the t_ds_relation_project_user table represents 
users who have permission to the project
-- The user_id in the t_ds_resources table represents the user who created the 
resource, and the user_id in t_ds_relation_resources_user represents the user 
who has permissions to the resource
-- The user_id in the t_ds_udfs table represents the user who created the UDF, 
and the user_id in the t_ds_relation_udfs_user table represents a user who has 
permission to the UDF
-<a name="JEw4v"></a>
+- A user can have multiple projects. User project authorization binds the 
relationship using project_id and user_id in the t_ds_relation_project_user 
table.
+- The user_id in the t_ds_project table represents the user who created the 
project, and the user_id in the t_ds_relation_project_user table represents 
users who have permission to the project.
+- The user_id in the t_ds_resources table represents the user who created the 
resource, and the user_id in t_ds_relation_resources_user represents the user 
who has permissions to the resource.
+- The user_id in the t_ds_udfs table represents the user who created the UDF, 
and the user_id in the t_ds_relation_udfs_user table represents a user who has 
permission to the UDF.
   
 ### Command Process Task
 
 ![image.png](/img/metadata-erd/command.png)<br 
/>![image.png](/img/metadata-erd/process-task.png)
 
-- A project has multiple process definitions, a process definition can 
generate multiple process instances, and a process instance can generate 
multiple task instances
-- The t_ds_schedulers table stores the timing schedule information for process 
difinition
-- The data stored in the t_ds_relation_process_instance table is used to deal 
with that the process definition contains sub-processes, 
parent_process_instance_id field represents the id of the main process instance 
containing the child process, process_instance_id field represents the id of 
the sub-process instance, parent_task_instance_id field represents the task 
instance id of the sub-process node
+- A project has multiple process definitions, a process definition can 
generate multiple process instances, and a process instance can generate 
multiple task instances.
+- The t_ds_schedules table stores the timing schedule information for a 
process definition.
+- The data stored in the t_ds_relation_process_instance table is used to handle 
the sub-processes of a process definition: the parent_process_instance_id 
field represents the id of the main process instance that contains the child 
processes, the process_instance_id field represents the id of the sub-process 
instance, and the parent_task_instance_id field represents the task instance id 
of the sub-process node.
 - The process instance table and the task instance table correspond to the 
t_ds_process_instance table and the t_ds_task_instance table, respectively.
 
 ---
 
-<a name="yd79T"></a>
-
 ## Core Table Schema
 
-<a name="6bVhH"></a>
-
 ### t_ds_process_definition
 
 | Field | Type | Comment |
@@ -86,10 +74,10 @@
 | release_state | tinyint | process definition release 
state：0:offline,1:online |
 | project_id | int | project id |
 | user_id | int | process definition creator id |
-| process_definition_json | longtext | process definition json content |
-| description | text | process difinition desc |
+| process_definition_json | longtext | process definition JSON content |
+| description | text | process definition description |
 | global_params | text | global parameters |
-| flag | tinyint | process is available: 0 not available, 1 available |
+| flag | tinyint | whether the process is available: 0 not available, 1 available |
 | locations | text | Node location information |
 | connects | text | Node connection information |
 | receivers | text | receivers |
@@ -98,8 +86,8 @@
 | timeout | int | timeout |
 | tenant_id | int | tenant id |
 | update_time | datetime | update time |
-
-<a name="t5uxM"></a>
+| modify_by | varchar | user who modified the process |
+| resource_ids | varchar | resource id set |
 
 ### t_ds_process_instance
 
@@ -108,38 +96,36 @@
 | id | int | primary key |
 | name | varchar | process instance name |
 | process_definition_id | int | process definition id |
-| state | tinyint | process instance Status: 0 commit succeeded, 1 running, 2 
prepare to pause, 3 pause, 4 prepare to stop, 5 stop, 6 fail, 7 succeed, 8 need 
fault tolerance, 9 kill, 10 wait for thread, 11 wait for dependency to complete 
|
-| recovery | tinyint | process instance failover flag：0:normal,1:failover 
instance |
+| state | tinyint | process instance Status: 0 successful commit, 1 running, 2 
prepare to pause, 3 pause, 4 prepare to stop, 5 stop, 6 fail, 7 succeed, 8 need 
fault tolerance, 9 kill, 10 wait for thread, 11 wait for dependency to complete 
|
+| recovery | tinyint | process instance failover flag：0: normal,1: failover 
instance needs restart |
 | start_time | datetime | process instance start time |
 | end_time | datetime | process instance end time |
 | run_times | int | process instance run times |
 | host | varchar | process instance host |
-| command_type | tinyint | command type：0 start ,1 Start from the current 
node,2 Resume a fault-tolerant process,3 Resume Pause Process, 4 Execute from 
the failed node,5 Complement, 6 dispatch, 7 re-run, 8 pause, 9 stop ,10 Resume 
waiting thread |
-| command_param | text | json command parameters |
-| task_depend_type | tinyint | task depend type. 0: only current node,1:before 
the node,2:later nodes |
+| command_type | tinyint | command type：0 start ,1 start from the current 
node,2 resume a fault-tolerant process,3 resume from pause process, 4 execute 
from the failed node,5 complement, 6 dispatch, 7 re-run, 8 pause, 9 stop, 10 
resume waiting thread |
+| command_param | text | JSON command parameters |
+| task_depend_type | tinyint | node dependency type: 0 current node, 1 
forward, 2 backward |
 | max_try_times | tinyint | max try times |
-| failure_strategy | tinyint | failure strategy. 0:end the process when node 
failed,1:continue running the other nodes when node failed |
-| warning_type | tinyint | warning type. 0:no warning,1:warning if process 
success,2:warning if process failed,3:warning if success |
+| failure_strategy | tinyint | failure strategy: 0 end the process when a node 
failed, 1 continue running the other nodes when a node failed |
+| warning_type | tinyint | warning type: 0 no warning, 1 warning if process 
success, 2 warning if process failed, 3 warning regardless of the result |
 | warning_group_id | int | warning group id |
 | schedule_time | datetime | schedule time |
 | command_start_time | datetime | command start time |
 | global_params | text | global parameters |
-| process_instance_json | longtext | process instance json |
-| flag | tinyint | process instance is available: 0 not available, 1 available 
|
+| process_instance_json | longtext | process instance JSON |
+| flag | tinyint | whether process instance is available: 0 not available, 1 
available |
 | update_time | timestamp | update time |
 | is_sub_process | int | whether the process is sub process: 1 sub-process, 0 
not sub-process |
 | executor_id | int | executor id |
-| locations | text | Node location information |
-| connects | text | Node connection information |
-| history_cmd | text | history commands of process instance operation |
-| dependence_schedule_times | text | depend schedule fire time |
-| process_instance_priority | int | process instance priority. 0 Highest,1 
High,2 Medium,3 Low,4 Lowest |
-| worker_group_id | int | worker group id |
-| timeout | int | time out |
+| locations | text | node location information |
+| connects | text | node connection information |
+| history_cmd | text | history commands, records all the commands sent to an 
instance |
+| dependence_schedule_times | text | depend schedule estimate time |
+| process_instance_priority | int | process instance priority. 0 highest,1 
high,2 medium,3 low,4 lowest |
+| worker_group | varchar | worker group who assign the task |
+| timeout | int | timeout |
 | tenant_id | int | tenant id |
 
-<a name="tHZsY"></a>
-
 ### t_ds_task_instance
 
 | Field | Type | Comment |
@@ -149,7 +135,7 @@
 | task_type | varchar | task type |
 | process_definition_id | int | process definition id |
 | process_instance_id | int | process instance id |
-| task_json | longtext | task content json |
+| task_json | longtext | task content JSON |
 | state | tinyint | Status: 0 commit succeeded, 1 running, 2 prepare to pause, 
3 pause, 4 prepare to stop, 5 stop, 6 fail, 7 succeed, 8 need fault tolerance, 
9 kill, 10 wait for thread, 11 wait for dependency to complete |
 | submit_time | datetime | task submit time |
 | start_time | datetime | task start time |
@@ -160,31 +146,48 @@
 | alert_flag | tinyint | whether alert |
 | retry_times | int | task retry times |
 | pid | int | pid of task |
-| app_link | varchar | yarn app id |
-| flag | tinyint | taskinstance is available: 0 not available, 1 available |
-| retry_interval | int | retry interval when task failed  |
+| app_link | varchar | Yarn app id |
+| flag | tinyint | whether the task instance is available: 0 not available, 1 available |
+| retry_interval | int | retry interval when task failed |
 | max_retry_times | int | max retry times |
-| task_instance_priority | int | task instance priority:0 Highest,1 High,2 
Medium,3 Low,4 Lowest |
-| worker_group_id | int | worker group id |
+| task_instance_priority | int | task instance priority:0 highest,1 high,2 
medium,3 low,4 lowest |
+| worker_group | varchar | worker group who assign the task |
 
-<a name="gLGtm"></a>
+#### t_ds_schedules
+
+| Field | Type | Comment |
+| --- | --- | --- |
+| id | int | primary key |
+| process_definition_id | int | process definition id |
+| start_time | datetime | schedule start time |
+| end_time | datetime | schedule end time |
+| crontab | varchar | crontab expression |
+| failure_strategy | tinyint | failure strategy: 0 end,1 continue |
+| user_id | int | user id |
+| release_state | tinyint | release status: 0 not yet released,1 released |
+| warning_type | tinyint | warning type: 0: no warning, 1: warning if process 
success, 2: warning if process failed, 3: warning whatever results |
+| warning_group_id | int | warning group id |
+| process_instance_priority | int | process instance priority:0 highest,1 
high,2 medium,3 low,4 lowest |
+| worker_group | varchar | worker group who assign the task |
+| create_time | datetime | create time |
+| update_time | datetime | update time |
 
 ### t_ds_command
 
 | Field | Type | Comment |
 | --- | --- | --- |
 | id | int | primary key |
-| command_type | tinyint | Command type: 0 start workflow, 1 start execution 
from current node, 2 resume fault-tolerant workflow, 3 resume pause process, 4 
start execution from failed node, 5 complement, 6 schedule, 7 rerun, 8 pause, 9 
stop, 10 resume waiting thread |
+| command_type | tinyint | command type: 0 start workflow, 1 start execution 
from current node, 2 resume fault-tolerant workflow, 3 resume pause process, 4 
start execution from failed node, 5 complement, 6 schedule, 7 re-run, 8 pause, 
9 stop, 10 resume waiting thread |
 | process_definition_id | int | process definition id |
-| command_param | text | json command parameters |
-| task_depend_type | tinyint | Node dependency type: 0 current node, 1 
forward, 2 backward |
-| failure_strategy | tinyint | Failed policy: 0 end, 1 continue |
-| warning_type | tinyint | Alarm type: 0 is not sent, 1 process is sent 
successfully, 2 process is sent failed, 3 process is sent successfully and all 
failures are sent |
-| warning_group_id | int | warning group |
+| command_param | text | JSON command parameters |
+| task_depend_type | tinyint | node dependency type: 0 current node, 1 
forward, 2 backward |
+| failure_strategy | tinyint | failed policy: 0 end, 1 continue |
+| warning_type | tinyint | alarm type: 0 no alarm, 1 alarm if process success, 
2: alarm if process failed, 3: warning whatever results |
+| warning_group_id | int | warning group id |
 | schedule_time | datetime | schedule time |
 | start_time | datetime | start time |
 | executor_id | int | executor id |
-| dependence | varchar | dependence |
+| dependence | varchar | dependence column |
 | update_time | datetime | update time |
-| process_instance_priority | int | process instance priority: 0 Highest,1 
High,2 Medium,3 Low,4 Lowest |
-| worker_group_id | int | worker group id |
\ No newline at end of file
+| process_instance_priority | int | process instance priority: 0 highest,1 
high,2 medium,3 low,4 lowest |
+| worker_group_id | int |  worker group who assign the task |
\ No newline at end of file
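
Two of the rules documented in the metadata.md hunks above lend themselves to a short sketch: decoding the numeric `state` column of t_ds_process_instance and t_ds_task_instance, and the rule that the user queue takes priority over the tenant queue. The Python below is an illustration only. The numeric codes are taken from the table comments, but the enum name, member names, and helper functions are hypothetical and are not DolphinScheduler's own identifiers. Treating an empty string like null in the queue fallback is also an assumption.

```python
from enum import IntEnum
from typing import Optional


class InstanceState(IntEnum):
    """Hypothetical names for the documented `state` codes."""
    COMMIT_SUCCEEDED = 0
    RUNNING = 1
    PREPARE_TO_PAUSE = 2
    PAUSE = 3
    PREPARE_TO_STOP = 4
    STOP = 5
    FAIL = 6
    SUCCEED = 7
    NEED_FAULT_TOLERANCE = 8
    KILL = 9
    WAIT_FOR_THREAD = 10
    WAIT_FOR_DEPENDENCY = 11


def describe_state(code: int) -> str:
    """Turn a raw `state` value from the table into a readable name."""
    return InstanceState(code).name


def resolve_queue(user_queue: Optional[str], tenant_queue: Optional[str]) -> Optional[str]:
    """Prefer the user queue; fall back to the tenant queue when it is unset.

    Assumption: an empty string is treated the same as null.
    """
    return user_queue if user_queue else tenant_queue


print(describe_state(7))                     # SUCCEED
print(resolve_queue(None, "tenant_queue"))   # tenant_queue
print(resolve_queue("user_queue", "tenant_queue"))  # user_queue
```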
diff --git a/docs/en-us/dev/user_doc/architecture/task-structure.md 
b/docs/en-us/dev/user_doc/architecture/task-structure.md
index 0b1c28a..37864fe 100644
--- a/docs/en-us/dev/user_doc/architecture/task-structure.md
+++ b/docs/en-us/dev/user_doc/architecture/task-structure.md
@@ -2,16 +2,16 @@
 
 ## Overall Tasks Storage Structure
 
-All tasks created in DolphinScheduler are saved in the t_ds_process_definition 
table.
+All tasks in DolphinScheduler are saved in the `t_ds_process_definition` table.
 
-The following shows the 't_ds_process_definition' table structure:
+The following shows the `t_ds_process_definition` table structure:
 
 No. | field  | type  |  description
 -------- | ---------| -------- | ---------
 1|id|int(11)|primary key
 2|name|varchar(255)|process definition name
 3|version|int(11)|process definition version
-4|release_state|tinyint(4)|release status of process definition: 0 not online, 
1 online
+4|release_state|tinyint(4)|release status of process definition: 0 not 
released, 1 released
 5|project_id|int(11)|project id
 6|user_id|int(11)|user id of the process definition
 7|process_definition_json|longtext|process definition JSON
@@ -26,10 +26,10 @@ No. | field  | type  |  description
 16|timeout|int(11) |timeout
 17|tenant_id|int(11) |tenant id
 18|update_time|datetime|update time
-19|modify_by|varchar(36)|specifics of the user that made the modification
+19|modify_by|varchar(36)|the user that made the modification
 20|resource_ids|varchar(255)|resource ids
 
-The 'process_definition_json' field is the core field, which defines the task 
information in the DAG diagram, and it is stored in JSON format.
+The `process_definition_json` field is the core field, which defines the task 
information in the DAG diagram, and it is stored in JSON format.
 
 The following table describes the common data structure.
 No. | field  | type  |  description
@@ -64,9 +64,9 @@ Data example:
 No.|parameter name||type|description |notes
 -------- | ---------| ---------| -------- | --------- | ---------
 1|id | |String| task Id|
-2|type ||String |task type |SHELL
+2|type | |String |task type |SHELL
 3| name| |String|task name |
-4| params| |Object|customized parameters |Json format
+4| params| |Object|customized parameters |JSON format
 5| |rawScript |String| Shell script |
 6| | localParams| Array|customized local parameters||
 7| | resourceList| Array|resource files||
@@ -141,7 +141,7 @@ No.|parameter name||type|description |note
 1|id | |String|task id|
 2|type ||String |task type |SQL
 3| name| |String|task name|
-4| params| |Object|customized parameters|Json format
+4| params| |Object|customized parameters|JSON format
 5| |type |String |database type
 6| |datasource |Int |datasource id
 7| |sql |String |query SQL statement
@@ -150,10 +150,10 @@ No.|parameter name||type|description |note
 10| |title |String | mail title
 11| |receivers |String |receivers
 12| |receiversCc |String |CC receivers
-13| |showType | String|display type of mail|optionals: TABLE or ATTACHMENT
+13| |showType | String|display type of mail|options: TABLE or ATTACHMENT
 14| |connParams | String|connect parameters
 15| |preStatements | Array|preposition SQL statements
-16| | postStatements| Array|postposition SQL statements||
+16| | postStatements| Array|post-position SQL statements||
 17| | localParams| Array|customized parameters||
 18|description | |String|description | |
 19|runFlag | |String |execution flag| |
@@ -243,7 +243,7 @@ No.|parameter name||type|description |notes
 1|id | |String| task Id|
 2|type ||String |task type |SPARK
 3| name| |String|task name |
-4| params| |Object|customized parameters |Json format
+4| params| |Object|customized parameters |JSON format
 5| |mainClass |String | main class
 6| |mainArgs | String| execution arguments
 7| |others | String| other arguments
@@ -341,7 +341,7 @@ No.|parameter name||type|description |notes
 1|id | |String| task Id|
 2|type ||String |task type |MR
 3| name| |String|task name |
-4| params| |Object|customized parameters |Json format
+4| params| |Object|customized parameters |JSON format
 5| |mainClass |String | main class
 6| |mainArgs | String|execution arguments
 7| |others | String|other arguments
@@ -424,7 +424,7 @@ No.|parameter name||type|description |notes
 1|id | |String|  task Id|
 2|type ||String |task type|PYTHON
 3| name| |String|task name|
-4| params| |Object|customized parameters |Json format
+4| params| |Object|customized parameters |JSON format
 5| |rawScript |String| Python script|
 6| | localParams| Array|customized local parameters||
 7| | resourceList| Array|resource files||
@@ -498,7 +498,7 @@ No.|parameter name||type|description |notes
 1|id | |String|task Id|
 2|type ||String |task type|FLINK
 3| name| |String|task name|
-4| params| |Object|customized parameters |Json format
+4| params| |Object|customized parameters |JSON format
 5| |mainClass |String |main class
 6| |mainArgs | String|execution arguments
 7| |others | String|other arguments
@@ -593,7 +593,7 @@ No.|parameter name||type|description |notes
 1|id | |String|task Id|
 2|type ||String |task type|HTTP
 3| name| |String|task name|
-4| params| |Object|customized parameters |Json format
+4| params| |Object|customized parameters |JSON format
 5| |url |String |request url
 6| |httpMethod | String|http method|GET,POST,HEAD,PUT,DELETE
 7| | httpParams| Array|http parameters
@@ -677,7 +677,7 @@ No.|parameter name||type|description |notes
 1|id | |String| task Id|
 2|type ||String |task type|DATAX
 3| name| |String|task name|
-4| params| |Object|customized parameters |Json format
+4| params| |Object|customized parameters |JSON format
 5| |customConfig |Int |specify whether use customized config| 0 none 
customized, 1 customized
 6| |dsType |String | datasource type
 7| |dataSource |Int | datasource ID
@@ -688,7 +688,7 @@ No.|parameter name||type|description |notes
 12| |jobSpeedByte |Int |job speed limiting(bytes)
 13| |jobSpeedRecord | Int|job speed limiting(records)
 14| |preStatements | Array|preposition SQL
-15| | postStatements| Array|postposition SQL
+15| | postStatements| Array|post-position SQL
 16| | json| String|customized configs|valid if customConfig=1
 17| | localParams| Array|customized parameters|valid if customConfig=1
 18|description | |String|description| |
@@ -764,7 +764,7 @@ No.|parameter name||type|description |notes
 1|id | |String|task ID|
 2|type ||String |task type|SQOOP
 3| name| |String|task name|
-4| params| |Object|customized parameters |Json format
+4| params| |Object|customized parameters |JSON format
 5| | concurrency| Int|concurrency rate
 6| | modelType|String |flow direction|import,export
 7| |sourceType|String |datasource type|
@@ -903,7 +903,7 @@ No.|parameter name||type|description |notes
 1|id | |String| task ID|
 2|type ||String |task type|SHELL
 3| name| |String|task name|
-4| params| |Object|customized parameters |Json format
+4| params| |Object|customized parameters |JSON format
 5| |processDefinitionId |Int| process definition ID
 6|description | |String|description | |
 7|runFlag | |String |execution flag| |
@@ -962,7 +962,7 @@ No.|parameter name||type|description |notes
 1|id | |String| task ID|
 2|type ||String |task type|DEPENDENT
 3| name| |String|task name|
-4| params| |Object|customized parameters |Json format
+4| params| |Object|customized parameters |JSON format
 5| |rawScript |String|Shell script|
 6| | localParams| Array|customized local parameters||
 7| | resourceList| Array|resource files||
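
The task tables in task-structure.md describe each node inside `process_definition_json` as a JSON object with `id`, `type`, `name`, and a nested `params` object. As a rough sketch of how such a node could be read, the Python below parses a SHELL-style node. The field names come from the tables above, but every value in the sample (the id, name, and script) is made up for illustration, and `summarize_task` is a hypothetical helper rather than part of DolphinScheduler.

```python
import json

# Sample node: field names follow the SHELL task table; values are invented.
task_node_json = """
{
  "id": "tasks-0001",
  "type": "SHELL",
  "name": "example-shell-task",
  "params": {
    "rawScript": "echo hello",
    "localParams": [],
    "resourceList": []
  },
  "description": ""
}
"""


def summarize_task(node: dict) -> dict:
    """Extract the common fields plus the SHELL-specific script."""
    params = node.get("params", {})
    return {
        "id": node["id"],
        "type": node["type"],
        "name": node["name"],
        "script": params.get("rawScript"),
        "local_param_count": len(params.get("localParams", [])),
        "resource_count": len(params.get("resourceList", [])),
    }


node = json.loads(task_node_json)
summary = summarize_task(node)
print(summary["type"], summary["script"])  # SHELL echo hello
```

Only the `params` contents differ between task types, so the same outer fields (`id`, `type`, `name`) can be read uniformly before dispatching on `type`.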
