This is an automated email from the ASF dual-hosted git repository.
lgcareer pushed a commit to branch master
in repository
https://gitbox.apache.org/repos/asf/incubator-dolphinscheduler-website.git
The following commit(s) were added to refs/heads/master by this push:
new 929f198 add en doc of 1.2.1
new e54c58e Merge pull request #92 from lgcareer/master
929f198 is described below
commit 929f1988813318f3798aa173d7a195cb7f10537d
Author: lgcareer <[email protected]>
AuthorDate: Wed Feb 26 16:07:43 2020 +0800
add en doc of 1.2.1
---
docs/en-us/1.2.1/user_doc/architecture-design.md | 316 ++++++++++
docs/en-us/1.2.1/user_doc/metadata-1.2.md | 174 ++++++
docs/en-us/1.2.1/user_doc/plugin-development.md | 54 ++
docs/en-us/1.2.1/user_doc/quick-start.md | 65 ++
docs/en-us/1.2.1/user_doc/system-manual.md | 738 +++++++++++++++++++++++
docs/en-us/1.2.1/user_doc/upgrade.md | 39 ++
site_config/site.js | 6 +-
7 files changed, 1389 insertions(+), 3 deletions(-)
diff --git a/docs/en-us/1.2.1/user_doc/architecture-design.md
b/docs/en-us/1.2.1/user_doc/architecture-design.md
new file mode 100644
index 0000000..cdc1c89
--- /dev/null
+++ b/docs/en-us/1.2.1/user_doc/architecture-design.md
@@ -0,0 +1,316 @@
+## Architecture Design
+Before explaining the architecture of the scheduling system, let us first understand the commonly used terms of the scheduling system.
+
+### 1. Glossary
+
+**DAG:** Full name Directed Acyclic Graph, abbreviated as DAG. Tasks in a workflow are assembled as a directed acyclic graph, which is traversed topologically starting from the nodes with zero in-degree until there are no successor nodes. For example, the following picture:
+
+<p align="center">
+ <img src="/img/dag_examples_cn.jpg" alt="dag example" width="60%" />
+ <p align="center">
+ <em>dag example</em>
+ </p>
+</p>
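The zero in-degree traversal described above is essentially Kahn's topological sort. A minimal sketch of that rule on a toy DAG follows (this is an illustration only, not DolphinScheduler's actual implementation, which operates on its own DAG classes):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DagTraversalDemo {
    // Kahn's algorithm: repeatedly take nodes whose in-degree has dropped to
    // zero, which is how ready-to-run tasks are discovered in a workflow DAG.
    static List<String> topoOrder(Map<String, List<String>> edges) {
        Map<String, Integer> inDegree = new HashMap<>();
        edges.forEach((from, tos) -> {
            inDegree.putIfAbsent(from, 0);
            for (String to : tos) inDegree.merge(to, 1, Integer::sum);
        });
        Deque<String> ready = new ArrayDeque<>();
        inDegree.forEach((node, d) -> { if (d == 0) ready.add(node); });
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String node = ready.poll();
            order.add(node);
            for (String to : edges.getOrDefault(node, List.of())) {
                if (inDegree.merge(to, -1, Integer::sum) == 0) ready.add(to);
            }
        }
        return order; // shorter than the node count => the graph had a cycle
    }

    public static void main(String[] args) {
        Map<String, List<String>> dag = new HashMap<>();
        dag.put("A", List.of("B", "C"));
        dag.put("B", List.of("D"));
        dag.put("C", List.of("D"));
        System.out.println(topoOrder(dag)); // [A, B, C, D]
    }
}
```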
+
+**Process definition**: A visual **DAG** formed by dragging task nodes and establishing the associations between them
+
+**Process instance**: A process instance is an instantiation of a process definition, generated by manual start or by scheduling. Each run of a process definition generates a new process instance
+
+**Task instance**: A task instance is the instantiation of a specific task
node when a process instance runs, which indicates the specific task execution
status
+
+**Task type**: Currently supports SHELL, SQL, SUB_PROCESS (sub-process), PROCEDURE, MR, SPARK, PYTHON, and DEPENDENT (dependency) tasks, with plans to support dynamic plug-in extension. Note: a **SUB_PROCESS** is itself a separate process definition that can be launched on its own
+
+**Schedule mode**: The system supports timed scheduling based on cron expressions as well as manual scheduling. Supported command types: start workflow, start execution from the current node, resume fault-tolerant workflow, resume paused process, start execution from the failed node, complement (backfill), timer, rerun, pause, stop, and resume waiting thread. Of these, **resume fault-tolerant workflow** and **resume waiting thread** are used internally by the scheduler and cannot be called externally
+
+**Timed schedule**: The system uses the **quartz** distributed scheduler and supports visual generation of cron expressions
+
+**Dependency**: The system not only supports simple **DAG** dependencies between predecessor and successor nodes, but also provides **task dependency** nodes, supporting **custom task dependencies between processes**
+
+**Priority**: Supports priorities for process instances and task instances. If no priority is set for a process instance or task instance, the default is first in, first out.
+
+**Mail alert**: Supports emailing **SQL task** query results, as well as email alerts for process instance run results and fault-tolerance notifications
+
+**Failure policy**: For tasks running in parallel, if some tasks fail, two failure-policy options are provided. **Continue** means the remaining parallel tasks keep running until the end of the process regardless of the failed task's status. **End** means that once a failed task is found, the running parallel tasks are killed and the process ends.
+
+**Complement**: Backfills historical data, supporting two complement modes over an interval: **parallel and serial**
+
+
+
+### 2. System Architecture
+
+#### 2.1 System Architecture Diagram
+<p align="center">
+ <img src="/img/architecture.jpg" alt="System Architecture Diagram" />
+ <p align="center">
+ <em>System Architecture Diagram</em>
+ </p>
+</p>
+
+
+
+#### 2.2 Architectural description
+
+* **MasterServer**
+
+  MasterServer adopts a distributed, non-central design. It is mainly responsible for splitting DAG tasks, monitoring task submission, and monitoring the health of other MasterServer and WorkerServer nodes.
+  When the MasterServer service starts, it registers a temporary node with ZooKeeper and performs fault tolerance by listening for state changes of ZooKeeper temporary nodes.
+
+
+
+ ##### The service mainly contains:
+
+  - **Distributed Quartz**: a distributed scheduling component, mainly responsible for starting and stopping scheduled tasks. When quartz picks up a task, the Master uses an internal thread pool to handle the task's subsequent operations.
+
+ - **MasterSchedulerThread** is a scan thread that periodically scans the
**command** table in the database for different business operations based on
different **command types**
+
+ - **MasterExecThread** is mainly responsible for DAG task segmentation,
task submission monitoring, logic processing of various command types
+
+ - **MasterTaskExecThread** is mainly responsible for task persistence
+
+
+
+* **WorkerServer**
+
+  - WorkerServer also adopts a distributed, non-central design. It is mainly responsible for executing tasks and providing log services. When the WorkerServer service starts, it registers a temporary node with ZooKeeper and maintains a heartbeat.
+
+ ##### This service contains:
+
+  - **FetchTaskThread** is mainly responsible for continuously fetching tasks from the **Task Queue** and, according to the task type, calling **TaskScheduleThread** to invoke the corresponding executor.
+ - **LoggerServer** is an RPC service that provides functions such as
log fragment viewing, refresh and download.
+
+ - **ZooKeeper**
+
+    The MasterServer and WorkerServer nodes both use the ZooKeeper service for cluster management and fault tolerance. The system also performs event monitoring and distributed locking based on ZooKeeper.
+    We once implemented queues based on Redis as well, but since we want DolphinScheduler to depend on as few components as possible, the Redis implementation was eventually removed.
+
+ - **Task Queue**
+
+    Provides task queue operations. The queue is currently also implemented on ZooKeeper. Since each queue entry stores little information, there is no need to worry about the queue holding too much data; in fact, we have stress-tested the queue with millions of entries, with no effect on system stability or performance.
+
+ - **Alert**
+
+    Provides alarm-related interfaces, mainly the storage, query, and notification functions for the two types of alarm data. The notification function has two types: **mail notification** and **SNMP (not yet implemented)**.
+
+ - **API**
+
+    The API interface layer is mainly responsible for processing requests from the front-end UI layer. The service exposes a RESTful API externally.
+    Interfaces include workflow creation, definition, query, modification, release, taking offline, manual start, stop, pause, resume, starting execution from a given node, and more.
+
+ - **UI**
+
+ The front-end page of the system provides various visual operation
interfaces of the system. For details, see the <a
href="/en-us/docs/user_doc/system-manual.html" target="_self">System User
Manual</a> section.
+
+
+
+#### 2.3 Architectural Design Ideas
+
+##### I. Decentralization vs. Centralization
+
+###### Centralized design
+
+The centralized design concept is relatively simple: the nodes in a distributed cluster are divided into two roles:
+
+<p align="center">
+ <img
src="https://analysys.github.io/easyscheduler_docs_cn/images/master_slave.png"
alt="master-slave role" width="50%" />
+ </p>
+
+- The Master is mainly responsible for distributing tasks and supervising the health of the Slaves, and can dynamically balance tasks across Slaves so that no Slave node is overly "busy" or "idle".
+- The Worker is mainly responsible for executing tasks and maintaining a heartbeat with the Master so that the Master can assign tasks to it.
+
+Problems with the centralized design:
+
+- Once the Master has a problem, the cluster is leaderless and will collapse. To solve this, most Master/Slave architectures adopt an active/standby Master design, which can be hot or cold standby, with automatic or manual switching; more and more new systems can automatically elect and switch the Master to improve availability.
+- Another problem is that if the Scheduler runs on the Master, although different tasks of one DAG can run on different machines, the Master can become overloaded. If the Scheduler runs on the Slave, all tasks of one DAG can only be submitted on one machine, so with many parallel tasks the pressure on that Slave may be large.
+
+###### Decentralization
+
+ <p align="center">
+ <img
src="https://analysys.github.io/easyscheduler_docs_cn/images/decentralization.png"
alt="decentralized" width="50%" />
+ </p>
+
+- In a decentralized design there is usually no Master/Slave concept: all roles are the same and have equal status. The global Internet is a typical decentralized distributed system; any networked node going down only affects a small range of functionality.
+- The core of decentralized design is that there is no "manager" distinct from the other nodes in the distributed system, so there is no single-point-of-failure problem. However, because there is no "manager" node, each node needs to communicate with other nodes to obtain the necessary machine information, and the unreliability of distributed communication greatly increases the difficulty of implementing the functions above.
+- In fact, truly decentralized distributed systems are rare. Instead, dynamically centralized distributed systems keep emerging. Under this architecture the managers in the cluster are dynamically elected rather than preset, and when the cluster fails, the nodes spontaneously hold "meetings" to elect a new "manager" to preside over the work. The most typical cases are ZooKeeper and Etcd (implemented in Go).
+
+- DolphinScheduler's decentralization consists of registering Masters/Workers with ZooKeeper. The Master cluster and Worker cluster have no center, and a ZooKeeper distributed lock is used to elect one Master or Worker as the "manager" to perform tasks.
+
+##### II. Distributed Lock Practice
+
+DolphinScheduler uses ZooKeeper distributed locks to ensure that only one Master executes the Scheduler at a time, and that only one Master or Worker performs task submission at a time.
+
+1. The core process algorithm for obtaining distributed locks is as follows
+
+ <p align="center">
+ <img
src="https://analysys.github.io/easyscheduler_docs_cn/images/distributed_lock.png"
alt="Get Distributed Lock Process" width="50%" />
+ </p>
+
+2. Scheduler thread distributed lock implementation flow chart in
DolphinScheduler:
+
+ <p align="center">
+ <img src="/img/distributed_lock_procss.png" alt="Get Distributed Lock
Process" width="50%" />
+ </p>
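The usual ZooKeeper locking recipe behind this flow: each contender creates an ephemeral sequential znode under the lock path, and the contender holding the znode with the smallest sequence number owns the lock. A minimal sketch of just that ordering rule (the znode names below are hypothetical; real code would use a ZooKeeper client such as Curator):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class LockElectionDemo {
    // Given the children of the lock path, the holder is the znode with the
    // smallest sequence suffix; every other contender watches its predecessor.
    static String lockHolder(List<String> children) {
        return Collections.min(children);
    }

    public static void main(String[] args) {
        List<String> children = Arrays.asList(
                "lock-0000000003", "lock-0000000001", "lock-0000000002");
        System.out.println(lockHolder(children)); // lock-0000000001
    }
}
```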
+
+##### III. Loop Waiting Caused by Insufficient Threads
+
+- If a DAG has no sub-processes, then when the number of Commands exceeds the threshold set for the thread pool, the process waits or fails directly.
+- If many sub-processes are nested in a large DAG, the situation in the following figure produces a "deadlocked" state:
+
+ <p align="center">
+ <img
src="https://analysys.github.io/easyscheduler_docs_cn/images/lack_thread.png"
alt="Thread is not enough to wait for loop" width="50%" />
+ </p>
+
+In the figure above, MainFlowThread waits for SubFlowThread1 to end, SubFlowThread1 waits for SubFlowThread2 to end, SubFlowThread2 waits for SubFlowThread3 to end, and SubFlowThread3 waits for a new thread from the thread pool, so the whole DAG process can never end and no thread is released. The child and parent processes end up waiting on each other in a loop, and the scheduling cluster becomes unusable unless a new Master is started to add threads and break the deadlock.
+
+Starting a new Master just to break the deadlock seems unsatisfactory, so we proposed the following three options to reduce this risk:
+
+1. Calculate the total number of threads across all Masters, then calculate the number of threads each DAG needs, i.e. pre-compute before the DAG process executes. Because the thread pools span multiple Masters, the total thread count is unlikely to be obtainable in real time.
+2. Check the single Master's thread pool; if the pool is full, let the thread fail directly.
+3. Add a Command type for insufficient resources: if the thread pool is insufficient, suspend the main process; once the thread pool has a free thread, the suspended process is woken up again.
+
+Note: The Master Scheduler thread fetches Commands in FIFO order.
+
+So we chose the third way to solve the problem of insufficient threads.
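The chosen option can be sketched as a simple decision point. The `handleCommand` helper below is hypothetical, for illustration only; the real Master logic involves the Command table and its exec thread pool:

```java
public class ResourceCheckDemo {
    // If the Master exec thread pool is exhausted, mark the process as
    // "waiting for thread" instead of submitting it; it is woken up again
    // once the pool has a free thread.
    static String handleCommand(int activeThreads, int poolSize) {
        if (activeThreads >= poolSize) {
            return "WAITING_THREAD"; // suspend; the command is re-fetched later (FIFO)
        }
        return "SUBMITTED"; // a thread is free, run the DAG
    }

    public static void main(String[] args) {
        System.out.println(handleCommand(10, 10)); // WAITING_THREAD
        System.out.println(handleCommand(3, 10));  // SUBMITTED
    }
}
```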
+
+##### IV. Fault Tolerant Design
+
+Fault tolerance is divided into service fault tolerance and task retry.
Service fault tolerance is divided into two types: Master Fault Tolerance and
Worker Fault Tolerance.
+
+###### 1. Downtime fault tolerance
+
+Service fault tolerance design relies on ZooKeeper's Watcher mechanism. The
implementation principle is as follows:
+
+ <p align="center">
+ <img
src="https://analysys.github.io/easyscheduler_docs_cn/images/fault-tolerant.png"
alt="DolphinScheduler Fault Tolerant Design" width="40%" />
+ </p>
+
+The Master monitors the directories of other Masters and Workers. If a remove event is detected, fault tolerance is performed for the process instance or the task instance according to the specific business logic.
+
+
+
+- Master fault tolerance flow chart:
+
+ <p align="center">
+ <img
src="https://analysys.github.io/easyscheduler_docs_cn/images/fault-tolerant_master.png"
alt="Master Fault Tolerance Flowchart" width="40%" />
+ </p>
+
+After Master fault tolerance completes via ZooKeeper, the Scheduler thread in DolphinScheduler reschedules the process: it traverses the DAG to find the "running" and "submitted successfully" tasks. For "running" tasks it monitors their task instance status; for "submitted successfully" tasks it needs to check whether the task already exists in the Task Queue: if it exists, it likewise monitors the task instance status, and if not, it resubmits the task instance.
+
+
+
+- Worker fault tolerance flow chart:
+
+ <p align="center">
+ <img
src="https://analysys.github.io/easyscheduler_docs_cn/images/fault-tolerant_worker.png"
alt="Worker Fault Tolerance Flowchart" width="40%" />
+ </p>
+
+Once the Master Scheduler thread finds a task instance in the "needs fault tolerance" state, it takes over the task and resubmits it.
+
+ Note: Because "network jitter" may cause a node to briefly lose its ZooKeeper heartbeat and trigger a remove event for the node, we take the simplest approach here: once a node's connection with ZooKeeper times out, the Master or Worker service on it is stopped directly.
+
+###### 2. Task failure retry
+
+Here we must first distinguish between three concepts: task failure retry, process failure recovery, and process failure rerun:
+
+- Task failure retry is task-level and performed automatically by the scheduling system. For example, if a shell task is configured with 3 retries, the shell task will be retried up to 3 times after a failed run.
+- Process failure recovery is process-level and performed manually; recovery can only start **from the failed node** or **from the current node**
+- Process failure rerun is also process-level and performed manually; a rerun starts from the start node
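Task-level retry as described in the first bullet can be sketched as a simple loop (a simplified stand-in for illustration; the real scheduler also honors a retry interval between attempts):

```java
import java.util.function.BooleanSupplier;

public class RetryDemo {
    // Run the task once, then retry up to maxRetryTimes more times on failure.
    static boolean runWithRetry(BooleanSupplier task, int maxRetryTimes) {
        for (int attempt = 0; attempt <= maxRetryTimes; attempt++) {
            if (task.getAsBoolean()) {
                return true; // success: stop retrying
            }
        }
        return false; // retries exhausted: the task is marked failed
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // fails twice, succeeds on the third attempt; 3 retries configured
        boolean ok = runWithRetry(() -> ++calls[0] >= 3, 3);
        System.out.println(ok + " after " + calls[0] + " attempts"); // true after 3 attempts
    }
}
```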
+
+
+
+Back to the topic: we divide the task nodes in a workflow into two types.
+
+- One is a business node, which corresponds to an actual script or processing
statement, such as a Shell node, an MR node, a Spark node, a dependent node,
and so on.
+- The other is a logical node, which does not run an actual script or statement but handles the logical processing of the overall flow, such as a sub-process node.
+
+Each **business node** can be configured with a number of failure retries. When a task node fails, it is automatically retried until it succeeds or the configured retry count is exceeded. A **logical node** does not support failure retries, but the tasks inside a logical node do.
+
+If a task in the workflow fails and reaches its maximum retry count, the workflow fails and stops; the failed workflow can then be manually rerun or recovered.
+
+
+
+##### V. Task priority design
+
+In the early scheduling design, without a priority design and with only fair scheduling, a task submitted first might complete at the same time as a task submitted later, and the priority of a process or task could not be set. We have since redesigned this, and the current design is as follows:
+
+- Tasks are processed from high priority to low: **different process instance priorities** take precedence first, then **task priorities within the same process instance**, and finally the **submission order within the same process**.
+
+  - The specific implementation is to parse the priority from the task instance's json and then save the **process instance priority_process instance id_task priority_task id** string in the ZooKeeper task queue. When entries are fetched from the task queue, plain string comparison yields the task that should execute first.
+
+  - The priority of a process definition covers the case where some processes need to be handled before others; it can be configured when the process is started or when a scheduled start is set up. There are 5 levels: HIGHEST, HIGH, MEDIUM, LOW, and LOWEST, as shown below
+
+ <p align="center">
+ <img
src="https://analysys.github.io/easyscheduler_docs_cn/images/process_priority.png"
alt="Process Priority Configuration" width="40%" />
+ </p>
+
+  - The priority of a task is likewise divided into 5 levels: HIGHEST, HIGH, MEDIUM, LOW, and LOWEST, as shown below
+
+ <p align="center">
+ <img
src="https://analysys.github.io/easyscheduler_docs_cn/images/task_priority.png"
alt="task priority configuration" width="35%" />
+ </p>
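The string-comparison ordering described above can be sketched as follows. The `queueKey` helper is hypothetical, mirroring the documented `process instance priority_process instance id_task priority_task id` format; note that plain string comparison is safe for the single-digit priority levels 0–4:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class PriorityKeyDemo {
    // Build the ZooKeeper task-queue entry; the lowest string sorts first
    // and therefore runs first.
    static String queueKey(int processPriority, int processInstanceId,
                           int taskPriority, int taskId) {
        return processPriority + "_" + processInstanceId + "_"
                + taskPriority + "_" + taskId;
    }

    public static void main(String[] args) {
        List<String> queue = new ArrayList<>();
        queue.add(queueKey(2, 10, 1, 7)); // MEDIUM process, HIGH task
        queue.add(queueKey(0, 11, 3, 8)); // HIGHEST process, LOW task
        queue.add(queueKey(2, 10, 0, 9)); // MEDIUM process, HIGHEST task
        Collections.sort(queue); // plain string comparison, as in the queue
        System.out.println(queue); // [0_11_3_8, 2_10_0_9, 2_10_1_7]
    }
}
```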
+
+##### VI. Logback and gRPC implement log access
+
+- Since the Web (UI) and Worker are not necessarily on the same machine, viewing logs cannot be treated like querying local files. There are two options:
+  - Put the logs on an ES search engine
+  - Obtain remote log information through gRPC communication
+- To keep DolphinScheduler as lightweight as possible, gRPC was chosen to implement remote log access.
+
+ <p align="center">
+ <img src="https://analysys.github.io/easyscheduler_docs_cn/images/grpc.png"
alt="grpc remote access" width="50%" />
+ </p>
+
+- We use a custom Logback FileAppender and Filter function to generate a log
file for each task instance.
+- The main implementation of FileAppender is as follows:
+
+```java
+/**
+ * task log appender
+ */
+public class TaskLogAppender extends FileAppender<ILoggingEvent> {
+
+    ...
+
+    @Override
+    protected void append(ILoggingEvent event) {
+
+        if (currentlyActiveFile == null) {
+            currentlyActiveFile = getFile();
+        }
+        String activeFile = currentlyActiveFile;
+        // thread name: taskThreadName-processDefineId_processInstanceId_taskInstanceId
+        String threadName = event.getThreadName();
+        String[] threadNameArr = threadName.split("-");
+        // logId = processDefineId_processInstanceId_taskInstanceId
+        String logId = threadNameArr[1];
+        ...
+        super.subAppend(event);
+    }
+}
+```
+
+Logs are generated under paths of the form /process definition id/process instance id/task instance id.log
+
+- The Filter matches thread names starting with TaskLogInfo-:
+- TaskLogFilter is implemented as follows:
+
+```java
+/**
+ * task log filter
+ */
+public class TaskLogFilter extends Filter<ILoggingEvent> {
+
+    @Override
+    public FilterReply decide(ILoggingEvent event) {
+        if (event.getThreadName().startsWith("TaskLogInfo-")) {
+            return FilterReply.ACCEPT;
+        }
+        return FilterReply.DENY;
+    }
+}
+```
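Putting the appender and filter together, the per-task log path follows from the thread-name convention above. A minimal sketch (the `logPath` helper is hypothetical, mirroring the documented naming scheme):

```java
public class LogPathDemo {
    // Thread name: "TaskLogInfo-processDefineId_processInstanceId_taskInstanceId"
    static String logPath(String threadName, String baseDir) {
        String logId = threadName.split("-")[1];        // e.g. "1_2_3"
        return baseDir + "/" + logId.replace('_', '/') + ".log";
    }

    public static void main(String[] args) {
        System.out.println(logPath("TaskLogInfo-1_2_3", "/logs")); // /logs/1/2/3.log
    }
}
```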
+
+
+
+### Summary
+
+Starting from scheduling, this article has introduced the architecture and implementation ideas of DolphinScheduler, a distributed workflow scheduling system for big data. To be continued
diff --git a/docs/en-us/1.2.1/user_doc/metadata-1.2.md
b/docs/en-us/1.2.1/user_doc/metadata-1.2.md
new file mode 100644
index 0000000..2d706f9
--- /dev/null
+++ b/docs/en-us/1.2.1/user_doc/metadata-1.2.md
@@ -0,0 +1,174 @@
+# Dolphin Scheduler 1.2 MetaData
+
+<a name="V5KOl"></a>
+### Dolphin Scheduler 1.2 DB Table Overview
+| Table Name | Comment |
+| :---: | :---: |
+| t_ds_access_token | token for access ds backend |
+| t_ds_alert | alert detail |
+| t_ds_alertgroup | alert group |
+| t_ds_command | command detail |
+| t_ds_datasource | data source |
+| t_ds_error_command | error command detail |
+| t_ds_process_definition | process definition |
+| t_ds_process_instance | process instance |
+| t_ds_project | project |
+| t_ds_queue | queue |
+| t_ds_relation_datasource_user | datasource related to user |
+| t_ds_relation_process_instance | sub process |
+| t_ds_relation_project_user | project related to user |
+| t_ds_relation_resources_user | resource related to user |
+| t_ds_relation_udfs_user | UDF related to user |
+| t_ds_relation_user_alertgroup | alert group related to user |
+| t_ds_resources | resource center file |
+| t_ds_schedules | process definition schedule |
+| t_ds_session | user login session |
+| t_ds_task_instance | task instance |
+| t_ds_tenant | tenant |
+| t_ds_udfs | UDF resource |
+| t_ds_user | user detail |
+| t_ds_version | ds version |
+| t_ds_worker_group | worker group |
+
+
+---
+
+<a name="XCLy1"></a>
+### E-R Diagram
+<a name="5hWWZ"></a>
+#### User Queue DataSource
+
+
+- Multiple users can belong to one tenant
+- The queue field in the t_ds_user table stores the queue_name from the t_ds_queue table, while t_ds_tenant stores queue information using queue_id. During the execution of a process definition, the user's queue has the highest priority; if the user's queue is empty, the tenant's queue is used.
+- The user_id field in the t_ds_datasource table indicates the user who
created the data source. The user_id in t_ds_relation_datasource_user indicates
the user who has permission to the data source.
+<a name="7euSN"></a>
+#### Project Resource Alert
+
+
+- A user can have multiple projects; user-project authorization binds the relationship via project_id and user_id in the t_ds_relation_project_user table
+- The user_id in the t_ds_project table represents the user who created the project, and the user_id in the t_ds_relation_project_user table represents users who have permission on the project
+- The user_id in the t_ds_resources table represents the user who created the
resource, and the user_id in t_ds_relation_resources_user represents the user
who has permissions to the resource
+- The user_id in the t_ds_udfs table represents the user who created the UDF,
and the user_id in the t_ds_relation_udfs_user table represents a user who has
permission to the UDF
+<a name="JEw4v"></a>
+#### Command Process Task
+<br/>
+
+- A project has multiple process definitions, a process definition can
generate multiple process instances, and a process instance can generate
multiple task instances
+- The t_ds_schedules table stores the timed scheduling information for process definitions
+- The data stored in the t_ds_relation_process_instance table handles process definitions that contain sub-processes: the parent_process_instance_id field is the id of the main process instance containing the sub-process, the process_instance_id field is the id of the sub-process instance, and the parent_task_instance_id field is the task instance id of the sub-process node
+- The process instance table and the task instance table correspond to the
t_ds_process_instance table and the t_ds_task_instance table, respectively.
+
+---
+
+<a name="yd79T"></a>
+### Core Table Schema
+<a name="6bVhH"></a>
+#### t_ds_process_definition
+| Field | Type | Comment |
+| --- | --- | --- |
+| id | int | primary key |
+| name | varchar | process definition name |
+| version | int | process definition version |
+| release_state | tinyint | process definition release
state:0:offline,1:online |
+| project_id | int | project id |
+| user_id | int | process definition creator id |
+| process_definition_json | longtext | process definition json content |
+| description | text | process definition description |
+| global_params | text | global parameters |
+| flag | tinyint | process is available: 0 not available, 1 available |
+| locations | text | Node location information |
+| connects | text | Node connection information |
+| receivers | text | receivers |
+| receivers_cc | text | carbon copy list |
+| create_time | datetime | create time |
+| timeout | int | timeout |
+| tenant_id | int | tenant id |
+| update_time | datetime | update time |
+
+<a name="t5uxM"></a>
+#### t_ds_process_instance
+| Field | Type | Comment |
+| --- | --- | --- |
+| id | int | primary key |
+| name | varchar | process instance name |
+| process_definition_id | int | process definition id |
+| state | tinyint | process instance Status: 0 commit succeeded, 1 running, 2
prepare to pause, 3 pause, 4 prepare to stop, 5 stop, 6 fail, 7 succeed, 8 need
fault tolerance, 9 kill, 10 wait for thread, 11 wait for dependency to complete
|
+| recovery | tinyint | process instance failover flag:0:normal,1:failover
instance |
+| start_time | datetime | process instance start time |
+| end_time | datetime | process instance end time |
+| run_times | int | process instance run times |
+| host | varchar | process instance host |
+| command_type | tinyint | command type: 0 start, 1 start from the current node, 2 resume a fault-tolerant process, 3 resume a paused process, 4 execute from the failed node, 5 complement, 6 schedule, 7 rerun, 8 pause, 9 stop, 10 resume waiting thread |
+| command_param | text | json command parameters |
+| task_depend_type | tinyint | task depend type. 0: only current node,1:before
the node,2:later nodes |
+| max_try_times | tinyint | max try times |
+| failure_strategy | tinyint | failure strategy. 0:end the process when node
failed,1:continue running the other nodes when node failed |
+| warning_type | tinyint | warning type: 0 no warning, 1 warning if process succeeds, 2 warning if process fails, 3 warning on both success and failure |
+| warning_group_id | int | warning group id |
+| schedule_time | datetime | schedule time |
+| command_start_time | datetime | command start time |
+| global_params | text | global parameters |
+| process_instance_json | longtext | process instance json (a copy of the process definition json) |
+| flag | tinyint | process instance is available: 0 not available, 1 available
|
+| update_time | timestamp | update time |
+| is_sub_process | int | whether the process is sub process: 1 sub-process,0
not sub-process |
+| executor_id | int | executor id |
+| locations | text | Node location information |
+| connects | text | Node connection information |
+| history_cmd | text | history commands of process instance operation |
+| dependence_schedule_times | text | depend schedule fire time |
+| process_instance_priority | int | process instance priority. 0 Highest,1
High,2 Medium,3 Low,4 Lowest |
+| worker_group_id | int | worker group id |
+| timeout | int | time out |
+| tenant_id | int | tenant id |
+
+<a name="tHZsY"></a>
+#### t_ds_task_instance
+| Field | Type | Comment |
+| --- | --- | --- |
+| id | int | primary key |
+| name | varchar | task name |
+| task_type | varchar | task type |
+| process_definition_id | int | process definition id |
+| process_instance_id | int | process instance id |
+| task_json | longtext | task content json |
+| state | tinyint | Status: 0 commit succeeded, 1 running, 2 prepare to pause,
3 pause, 4 prepare to stop, 5 stop, 6 fail, 7 succeed, 8 need fault tolerance,
9 kill, 10 wait for thread, 11 wait for dependency to complete |
+| submit_time | datetime | task submit time |
+| start_time | datetime | task start time |
+| end_time | datetime | task end time |
+| host | varchar | host of task running on |
+| execute_path | varchar | task execute path in the host |
+| log_path | varchar | task log path |
+| alert_flag | tinyint | whether alert |
+| retry_times | int | task retry times |
+| pid | int | pid of task |
+| app_link | varchar | yarn app id |
+| flag | tinyint | taskinstance is available: 0 not available, 1 available |
+| retry_interval | int | retry interval when task failed |
+| max_retry_times | int | max retry times |
+| task_instance_priority | int | task instance priority:0 Highest,1 High,2
Medium,3 Low,4 Lowest |
+| worker_group_id | int | worker group id |
+
+<a name="gLGtm"></a>
+#### t_ds_command
+| Field | Type | Comment |
+| --- | --- | --- |
+| id | int | primary key |
+| command_type | tinyint | Command type: 0 start workflow, 1 start execution
from current node, 2 resume fault-tolerant workflow, 3 resume pause process, 4
start execution from failed node, 5 complement, 6 schedule, 7 rerun, 8 pause, 9
stop, 10 resume waiting thread |
+| process_definition_id | int | process definition id |
+| command_param | text | json command parameters |
+| task_depend_type | tinyint | Node dependency type: 0 current node, 1
forward, 2 backward |
+| failure_strategy | tinyint | Failed policy: 0 end, 1 continue |
+| warning_type | tinyint | Alarm type: 0 not sent, 1 sent on process success, 2 sent on process failure, 3 sent on success and on all failures |
+| warning_group_id | int | warning group |
+| schedule_time | datetime | schedule time |
+| start_time | datetime | start time |
+| executor_id | int | executor id |
+| dependence | varchar | dependence |
+| update_time | datetime | update time |
+| process_instance_priority | int | process instance priority: 0 Highest,1
High,2 Medium,3 Low,4 Lowest |
+| worker_group_id | int | worker group id |
+
+
+
diff --git a/docs/en-us/1.2.1/user_doc/plugin-development.md
b/docs/en-us/1.2.1/user_doc/plugin-development.md
new file mode 100644
index 0000000..eda2d82
--- /dev/null
+++ b/docs/en-us/1.2.1/user_doc/plugin-development.md
@@ -0,0 +1,54 @@
+## Task Plugin Development
+
+Note: Currently, task plugin development does not support hot deployment.
+
+### Shell-based tasks
+
+#### YARN-based calculations (see MapReduceTask)
+
+- Create a custom task in the **TaskManager** class under **cn.dolphinscheduler.server.worker.task** (and register the corresponding task type in TaskType)
+- Inherit from **AbstractYarnTask** under **cn.dolphinscheduler.server.worker.task**
+- In the constructor, call the **AbstractYarnTask** constructor
+- Inherit from **AbstractParameters** to define the custom task parameter entity
+- Override the **init** method of **AbstractTask** to parse the **custom task parameters**
+- Override **buildCommand** to encapsulate the command
+
+
+
+#### Non-YARN-based calculations (see ShellTask)
+- Create a custom task in the **TaskManager** class under **cn.dolphinscheduler.server.worker.task**
+
+- Inherit **AbstractTask** under **cn.dolphinscheduler.server.worker.task**
+
+- Instantiate a **ShellCommandExecutor** in the constructor:
+
+ ```
+ public ShellTask(TaskProps props, Logger logger) {
+ super(props, logger);
+
+ this.taskDir = props.getTaskDir();
+
+ this.processTask = new ShellCommandExecutor(this::logHandle,
+ props.getTaskDir(), props.getTaskAppId(),
+ props.getTenantCode(), props.getEnvFile(), props.getTaskStartTime(),
+ props.getTaskTimeout(), logger);
+ this.processDao = DaoFactory.getDaoInstance(ProcessDao.class);
+ }
+ ```
+
+  Pass in the custom task's **TaskProps** and a custom **Logger**: TaskProps encapsulates the task information, and the Logger carries the task's log output
+
+- Inherit **AbstractParameters** to define a custom task parameter entity
+
+- Override the **init** method of **AbstractTask** to parse the **custom task parameter entity**
+
+- Override the **handle** method: call the **run** method of **ShellCommandExecutor**, passing the **command** as the first parameter and the ProcessDao as the second, and set the corresponding **exitStatusCode**
+
+### Non-SHELL-based tasks (see SqlTask)
+
+- Create a custom task in the **TaskManager** class under **cn.dolphinscheduler.server.worker.task**
+- Inherit **AbstractTask** under **cn.dolphinscheduler.server.worker.task**
+- Inherit **AbstractParameters** to define a custom task parameter entity
+- Parse the custom task parameter entity in the constructor, or by overriding the **init** method of **AbstractTask**
+- Override the **handle** method to implement the business logic and set the corresponding **exitStatusCode**
+
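All three cases above share one inheritance pattern: parse parameters in `init`, run the work and record an exit code in `handle`. A self-contained sketch with simplified stand-in classes (AbstractTaskSketch and MyTaskSketch are illustrative only; a real plugin must use the classes under cn.dolphinscheduler.server.worker.task):

```java
// Simplified, self-contained illustration of the task-plugin pattern described
// above. AbstractTaskSketch is an illustrative stand-in for AbstractTask.
abstract class AbstractTaskSketch {
    protected int exitStatusCode = -1;
    public abstract void init();   // parse custom task parameters
    public abstract void handle(); // run the task, set exitStatusCode
    public int getExitStatusCode() { return exitStatusCode; }
}

class MyTaskSketch extends AbstractTaskSketch {
    private String command;
    @Override
    public void init() {
        // a real plugin would parse the JSON task parameters into its
        // AbstractParameters entity here
        command = "echo hello";
    }
    @Override
    public void handle() {
        // a real plugin would run the command (e.g. via ShellCommandExecutor)
        // and record its exit code
        exitStatusCode = 0;
    }
    public String getCommand() { return command; }
}
```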
diff --git a/docs/en-us/1.2.1/user_doc/quick-start.md
b/docs/en-us/1.2.1/user_doc/quick-start.md
new file mode 100644
index 0000000..7e4ac7d
--- /dev/null
+++ b/docs/en-us/1.2.1/user_doc/quick-start.md
@@ -0,0 +1,65 @@
+# Quick Start
+
+* Administrator user login
+
+ > Address: 192.168.xx.xx:8888  Username/password: admin/dolphinscheduler123
+
+<p align="center">
+ <img src="/img/login_en.png" width="60%" />
+ </p>
+
+* Create queue
+
+<p align="center">
+ <img src="/img/create-queue-en.png" width="60%" />
+ </p>
+
+ * Create tenant
+ <p align="center">
+ <img src="/img/create-tenant-en.png" width="60%" />
+ </p>
+
+ * Create an ordinary user
+<p align="center">
+ <img src="/img/create-user-en.png" width="60%" />
+ </p>
+
+ * Create an alarm group
+
+ <p align="center">
+ <img src="/img/alarm-group-en.png" width="60%" />
+ </p>
+
+
+ * Create a worker group
+
+ <p align="center">
+ <img src="/img/worker-group-en.png" width="60%" />
+ </p>
+
+ * Create a token
+
+ <p align="center">
+ <img src="/img/token-en.png" width="60%" />
+ </p>
+
+
+ * Log in as the ordinary user
+ > Click the user name in the upper right corner to "Exit", then log in again as the ordinary user.
+
+ * Project Management - > Create Project - > Click on Project Name
+<p align="center">
+ <img src="/img/create_project_en.png" width="60%" />
+ </p>
+
+ * Click Workflow Definition - > Create Workflow Definition - > Online
Process Definition
+
+<p align="center">
+ <img src="/img/process_definition_en.png" width="60%" />
+ </p>
+
+ * Running Process Definition - > Click Workflow Instance - > Click Process
Instance Name - > Double-click Task Node - > View Task Execution Log
+
+ <p align="center">
+ <img src="/img/log_en.png" width="60%" />
+</p>
diff --git a/docs/en-us/1.2.1/user_doc/system-manual.md
b/docs/en-us/1.2.1/user_doc/system-manual.md
new file mode 100644
index 0000000..29492b9
--- /dev/null
+++ b/docs/en-us/1.2.1/user_doc/system-manual.md
@@ -0,0 +1,738 @@
+# System Use Manual
+
+## Operational Guidelines
+
+### Home page
+The homepage contains task status statistics, process status statistics, and
workflow definition statistics for all user projects.
+
+<p align="center">
+ <img src="/img/home_en.png" width="80%" />
+ </p>
+
+### Create a project
+
+ - Click "Project - > Create Project", enter project name, description, and
click "Submit" to create a new project.
+ - Click on the project name to enter the project home page.
+<p align="center">
+ <img src="/img/project_home_en.png" width="80%" />
+ </p>
+
+> The project home page contains task status statistics, process status
statistics, and workflow definition statistics for the project.
+
+ - Task State Statistics: It refers to the statistics of the number of tasks
to be run, failed, running, completed and succeeded in a given time frame.
+ - Process State Statistics: It refers to the statistics of the number of
waiting, failing, running, completing and succeeding process instances in a
specified time range.
+ - Process Definition Statistics: The process definition created by the user
and the process definition granted by the administrator to the user are counted.
+
+
+### Creating Process definitions
+ - Go to the project home page, click "Process definitions" and enter the
list page of process definition.
+ - Click "Create process" to create a new process definition.
+ - Drag the "SHELL" node to the canvas and add a shell task.
+ - Fill in the Node Name, Description, and Script fields.
+ - Selecting "task priority" will give priority to high-level tasks in the
execution queue. Tasks with the same priority will be executed in the
first-in-first-out order.
+  - Timeout alarm: fill in the "timeout" value; when the task's execution time exceeds it, the task can raise an alarm and fail due to timeout.
+ - Fill in "Custom Parameters" and refer to [Custom
Parameters](#CustomParameters)
+ <p align="center">
+ <img src="/img/process_definitions_en.png" width="80%" />
+ </p>
+  - Set the execution order between nodes: click "line connection". As shown, task 2 and task 3 run in parallel: after task 1 finishes, task 2 and task 3 are executed simultaneously.
+
+<p align="center">
+ <img src="/img/task_en.png" width="80%" />
+ </p>
+
+ - Delete dependencies: Click on the arrow icon to "drag nodes and select
items", select the connection line, click on the delete icon to delete
dependencies between nodes.
+<p align="center">
+ <img src="/img/delete_dependencies_en.png" width="80%" />
+ </p>
+
+ - Click "Save", enter the name of the process definition, the description of
the process definition, and set the global parameters.
+
+<p align="center">
+ <img src="/img/global_parameters_en.png" width="80%" />
+ </p>
+
+ - For other types of nodes, refer to [task node types and parameter
settings](#TaskNodeType)
+
+### Execution process definition
+  - **An offline process definition can be edited but not run**, so bringing the workflow online is the first step.
+ > Click on the Process definition, return to the list of process
definitions, click on the icon "online", online process definition.
+
+  > Before taking a workflow offline, the timed tasks in timing management should be taken offline first; only then can the workflow definition be set offline successfully.
+
+ - Click "Run" to execute the process. Description of operation parameters:
+    * Failure strategy: **the strategy applied to other parallel task nodes when one task node fails**. "Continue" means the other task nodes run normally; "End" means all running tasks are terminated and the whole process ends.
+    * Notification strategy: when the process ends, send a process execution notification mail according to the process status.
+    * Process priority: the priority of process execution, in five levels: highest, high, medium, low, and lowest. Higher-priority processes are executed first in the execution queue, and processes with the same priority are executed in first-in-first-out order.
+    * Worker group: the process can only be executed on the specified worker group; with the Default setting it can be executed on any worker.
+    * Notification group: when the process ends or fault tolerance occurs, process information is mailed to all members of the notification group.
+    * Recipient: enter a mailbox and press Enter to save it. When the process ends or fault tolerance occurs, an alert mail is sent to the recipient list.
+    * Cc: enter a mailbox and press Enter to save it. When the process ends or fault tolerance occurs, the alert mail is copied to the Cc list.
+
+<p align="center">
+ <img src="/img/start-process-en.png" width="80%" />
+ </p>
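The priority rules above (higher priority first, first-in-first-out among equal priorities) can be sketched as follows; ProcessQueueSketch and its fields are a hypothetical illustration, not DolphinScheduler's scheduler code:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Illustrative sketch: a lower priority value means higher priority
// (0 Highest .. 4 Lowest); equal priorities leave the queue in FIFO order.
public class ProcessQueueSketch {
    private static class Item {
        final String name; final int priority; final long seq;
        Item(String name, int priority, long seq) {
            this.name = name; this.priority = priority; this.seq = seq;
        }
    }
    private final PriorityQueue<Item> queue = new PriorityQueue<>(
            Comparator.<Item>comparingInt(i -> i.priority)
                      .thenComparingLong(i -> i.seq)); // seq breaks ties FIFO
    private long counter = 0;

    public void submit(String name, int priority) {
        queue.add(new Item(name, priority, counter++));
    }
    public List<String> drain() {
        List<String> order = new ArrayList<>();
        while (!queue.isEmpty()) order.add(queue.poll().name);
        return order;
    }
}
```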
+
+  * Complement: execute the workflow definition for a specified date range. You can select the time range of the complement (currently only continuous days are supported), for example the data from May 1 to May 10, as shown in the figure:
+
+<p align="center">
+ <img src="/img/complement-en.png" width="80%" />
+ </p>
+
+> Complement execution mode includes serial execution and parallel execution.
In serial mode, the complement will be executed sequentially from May 1 to May
10. In parallel mode, the tasks from May 1 to May 10 will be executed
simultaneously.
+
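The set of complement runs for a date range can be sketched as below (ComplementDates is an illustrative helper, not part of DolphinScheduler): each date in the range yields one workflow run, executed one after another in serial mode or all at once in parallel mode.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Illustrative: the complement dates generated for a continuous range,
// e.g. May 1 to May 10 yields ten runs.
public class ComplementDates {
    public static List<LocalDate> between(LocalDate start, LocalDate end) {
        List<LocalDate> dates = new ArrayList<>();
        for (LocalDate d = start; !d.isAfter(end); d = d.plusDays(1)) {
            dates.add(d); // one workflow run per date
        }
        return dates;
    }
}
```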
+### Timing Process Definition
+ - Create Timing: "Process Definition - > Timing"
+  - Choose the start and stop time. Within this range, the timer fires normally; outside it, no further timed workflow instances are produced.
+
+<p align="center">
+ <img src="/img/timing-en.png" width="80%" />
+ </p>
+
+ - Add a timer to be executed once a day at 5:00 a.m. as shown below:
+<p align="center">
+ <img src="/img/timer-en.png" width="80%" />
+ </p>
+
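Assuming a Quartz-style crontab expression (an assumption, not stated in this page), a daily 5:00 a.m. timer would correspond to something like `0 0 5 * * ? *`. A minimal sketch of when such a timer fires next (DailyFiveAm is illustrative only):

```java
import java.time.LocalDateTime;
import java.time.LocalTime;

// Illustrative: the next fire time of a "daily at 05:00" schedule.
public class DailyFiveAm {
    public static LocalDateTime nextRun(LocalDateTime now) {
        LocalDateTime todayFive = now.toLocalDate().atTime(LocalTime.of(5, 0));
        // before 05:00 today -> fire today; otherwise fire tomorrow
        return now.isBefore(todayFive) ? todayFive : todayFive.plusDays(1);
    }
}
```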
+  - Bring the timer online: **a newly created timer is offline; you need to click "Timing Management - > Online" before it takes effect.**
+
+### View process instances
+ > Click on "Process Instances" to view the list of process instances.
+
+ > Click on the process name to see the status of task execution.
+
+ <p align="center">
+ <img src="/img/process-instances-en.png" width="80%" />
+ </p>
+
+ > Click on the task node, click "View Log" to view the task execution log.
+
+ <p align="center">
+ <img src="/img/view-log-en.png" width="80%" />
+ </p>
+
+ > Click on the task instance node, click **View History** to view the list of
task instances that the process instance runs.
+
+ <p align="center">
+ <img src="/img/instance-runs-en.png" width="80%" />
+ </p>
+
+
+ > Operations on workflow instances:
+
+<p align="center">
+ <img src="/img/workflow-instances-en.png" width="80%" />
+</p>
+
+ * Editor: You can edit the terminated process. When you save it after
editing, you can choose whether to update the process definition or not.
+ * Rerun: A process that has been terminated can be re-executed.
+ * Recovery failure: For a failed process, a recovery failure operation can
be performed, starting at the failed node.
+    * Stop: stop the running process; the backend first performs a `kill` on the worker process, then a `kill -9`.
+    * Pause: the running process can be **suspended**; the system state becomes **waiting to execute**, waiting for the currently running task to end and suspending the task that would execute next.
+    * Restore pause: **the suspended process** can be restored, running directly from the suspended node.
+ * Delete: Delete process instances and task instances under process instances
+ * Gantt diagram: The vertical axis of Gantt diagram is the topological
ordering of task instances under a process instance, and the horizontal axis is
the running time of task instances, as shown in the figure:
+<p align="center">
+ <img src="/img/gantt-en.png" width="80%" />
+</p>
+
+### View task instances
+  > Click on "Task Instance" to enter the task list page and query the execution status of tasks.
+ >
+ >
+
+<p align="center">
+ <img src="/img/task-instances-en.png" width="80%" />
+</p>
+
+ > Click "View Log" in the action column to view the log of task execution.
+
+<p align="center">
+ <img src="/img/task-execution-en.png" width="80%" />
+</p>
+
+### Create data source
+ > Data Source Center supports MySQL, POSTGRESQL, HIVE and Spark data sources.
+
+#### Create and edit MySQL data source
+
+ - Click on "Datasource - > Create Datasources" to create different types of
datasources according to requirements.
+- Datasource: select MYSQL
+- Datasource Name: the name of the datasource
+- Description: a description of the datasource
+- IP: the IP used to connect to MySQL
+- Port: the port used to connect to MySQL
+- Username: the username used to connect to MySQL
+- Password: the password used to connect to MySQL
+- Database name: the name of the MySQL database to connect to
+- Jdbc connection parameters: parameter settings for the MySQL connection, filled in as JSON
+
+<p align="center">
+ <img src="/img/mysql-en.png" width="80%" />
+ </p>
+
+ > Click "Test Connect" to test whether the data source can be successfully
connected.
+ >
+ >
+
+#### Create and edit POSTGRESQL data source
+
+- Datasource: select POSTGRESQL
+- Datasource Name: the name of the datasource
+- Description: a description of the datasource
+- IP: the IP used to connect to POSTGRESQL
+- Port: the port used to connect to POSTGRESQL
+- Username: the username used to connect to POSTGRESQL
+- Password: the password used to connect to POSTGRESQL
+- Database name: the name of the POSTGRESQL database to connect to
+- Jdbc connection parameters: parameter settings for the POSTGRESQL connection, filled in as JSON
+
+<p align="center">
+ <img src="/img/create-datasource-en.png" width="80%" />
+ </p>
+
+#### Create and edit HIVE data source
+
+1. Connect with HiveServer2
+
+ <p align="center">
+ <img src="/img/hive-en.png" width="80%" />
+ </p>
+
+- Datasource: select HIVE
+- Datasource Name: the name of the datasource
+- Description: a description of the datasource
+- IP: the IP used to connect to HIVE
+- Port: the port used to connect to HIVE
+- Username: the username used to connect to HIVE
+- Password: the password used to connect to HIVE
+- Database Name: the name of the HIVE database to connect to
+- Jdbc connection parameters: parameter settings for the HIVE connection, filled in as JSON
+
+2. Connect using HiveServer2 HA Zookeeper mode
+
+ <p align="center">
+ <img src="/img/zookeeper-en.png" width="80%" />
+ </p>
+
+
+Note: If **Kerberos** is turned on, you need to fill in the **Principal**
+<p align="center">
+ <img src="/img/principal-en.png" width="80%" />
+ </p>
+
+
+
+
+#### Create and Edit Spark Datasource
+
+<p align="center">
+ <img src="/img/edit-datasource-en.png" width="80%" />
+ </p>
+
+- Datasource: select Spark
+- Datasource Name: the name of the datasource
+- Description: a description of the datasource
+- IP: the IP used to connect to Spark
+- Port: the port used to connect to Spark
+- Username: the username used to connect to Spark
+- Password: the password used to connect to Spark
+- Database name: the name of the Spark database to connect to
+- Jdbc Connection Parameters: parameter settings for the Spark connection, filled in as JSON
+
+
+
+Note: If **Kerberos** is turned on, you need to fill in the **Principal**
+
+<p align="center">
+ <img src="/img/kerberos-en.png" width="80%" />
+ </p>
+
+### Upload Resources
+ - Upload resource files and udf functions, all uploaded files and resources
will be stored on hdfs, so the following configuration items are required:
+
+```
+conf/common/common.properties
+ # Users who have permission to create directories under the HDFS root path
+ hdfs.root.user=hdfs
+    # data base dir, resource files will be stored under this hadoop hdfs path; configure it yourself and make sure the directory exists on hdfs with read/write permissions. "/escheduler" is recommended
+ data.store2hdfs.basepath=/dolphinscheduler
+ # resource upload startup type : HDFS,S3,NONE
+ res.upload.startup.type=HDFS
+ # whether kerberos starts
+ hadoop.security.authentication.startup.state=false
+ # java.security.krb5.conf path
+ java.security.krb5.conf.path=/opt/krb5.conf
+ # loginUserFromKeytab user
+ [email protected]
+ # loginUserFromKeytab path
+ login.user.keytab.path=/opt/hdfs.headless.keytab
+
+conf/common/hadoop.properties
+    # ha or single namenode; for namenode ha you need to copy core-site.xml and hdfs-site.xml
+    # to the conf directory; s3 is supported, for example: s3a://dolphinscheduler
+ fs.defaultFS=hdfs://mycluster:8020
+    # resourcemanager ha: this needs the IPs; leave it empty for a single resourcemanager
+ yarn.resourcemanager.ha.rm.ids=192.168.xx.xx,192.168.xx.xx
+ # If it is a single resourcemanager, you only need to configure one host
name. If it is resourcemanager HA, the default configuration is fine
+ yarn.application.status.address=http://xxxx:8088/ws/v1/cluster/apps/%s
+
+```
+- Only one of yarn.resourcemanager.ha.rm.ids and yarn.application.status.address needs to be configured; leave the other one empty.
+- You need to copy core-site.xml and hdfs-site.xml from the conf directory of
the Hadoop cluster to the conf directory of the dolphinscheduler project and
restart the api-server service.
+
+#### File Manage
+
+ > It is the management of various resource files, including creating basic
txt/log/sh/conf files, uploading jar packages and other types of files,
editing, downloading, deleting and other operations.
+ >
+ >
+ > <p align="center">
+ > <img src="/img/file-manage-en.png" width="80%" />
+ > </p>
+
+ * Create file
+ > File formats support the following types: txt, log, sh, conf, cfg, py, java, sql, xml, hql
+
+<p align="center">
+ <img src="/img/create-file.png" width="80%" />
+ </p>
+
+ * Upload Files
+
+> Upload files: click the upload button or drag the file to the upload area; the file name field is completed automatically with the uploaded file's name.
+
+<p align="center">
+ <img src="/img/file-upload-en.png" width="80%" />
+ </p>
+
+
+ * File View
+
+> For viewable file types, click on the file name to view file details
+
+<p align="center">
+ <img src="/img/file-view-en.png" width="80%" />
+ </p>
+
+ * Download files
+
+> You can download a file by clicking the download button in the top right corner of the file details, or the download button in the operation column of the file list.
+
+ * File rename
+
+<p align="center">
+ <img src="/img/rename-en.png" width="80%" />
+ </p>
+
+#### Delete
+> File List - > Click the Delete button to delete the specified file
+
+#### Resource management
+ > Resource management is similar to file management; the difference is that resource management is for uploading UDF functions, while file management is for uploading user programs, scripts and configuration files.
+
+ * Upload UDF resources
+ > The same as uploading files.
+
+#### Function management
+
+ * Create UDF Functions
+ > Click "Create UDF Function", enter parameters of udf function, select UDF
resources, and click "Submit" to create udf function.
+ >
+ >
+ >
+ > Currently only temporary udf functions for HIVE are supported
+ >
+ >
+ >
+ > - UDF function name: the name of the UDF function
+ > - Package Name: the full class path of the UDF function
+ > - Parameter: parameters used to annotate the function
+ > - Database Name: reserved field for creating permanent UDF functions
+ > - UDF Resources: the resource file corresponding to the created UDF function
+ >
+ >
+
+<p align="center">
+ <img src="/img/udf-function.png" width="80%" />
+ </p>
+
+## Security
+
+  - The security module provides queue management, tenant management, user management, warning group management, worker group management, token management and other functions; it can also authorize resources, data sources, projects, etc.
+- Administrator login, default username password: admin/dolphinscheduler123
+
+
+
+### Create queues
+
+
+
+  - Queues are used when executing spark, mapreduce and other programs that require a "queue" parameter.
+- "Security" - > "Queue Manage" - > "Create Queue"
+ <p align="center">
+ <img src="/img/create-queue-en.png" width="80%" />
+ </p>
+
+
+### Create Tenants
+  - The tenant corresponds to a Linux account, which the worker server uses to submit jobs. If the user does not exist on Linux, the worker creates the account when executing the task.
+  - Tenant Code: **the tenant code is the Linux account, and it must be unique.**
+
+ <p align="center">
+ <img src="/img/create-tenant-en.png" width="80%" />
+ </p>
+
+### Create Ordinary Users
+  - User types are **ordinary users** and **administrator users**.
+ * Administrators have **authorization and user management** privileges,
and no privileges to **create project and process-defined operations**.
+ * Ordinary users can **create projects and create, edit, and execute
process definitions**.
+ * Note: **If the user switches the tenant, all resources under the tenant
will be copied to the switched new tenant.**
+<p align="center">
+ <img src="/img/create-user-en.png" width="80%" />
+ </p>
+
+### Create alarm group
+ * The alarm group is a parameter set at start-up. After the process is
finished, the status of the process and other information will be sent to the
alarm group by mail.
+  * Create and edit alarm groups:
+ <p align="center">
+ <img src="/img/alarm-group-en.png" width="80%" />
+ </p>
+
+### Create Worker Group
+ - Worker group provides a mechanism for tasks to run on a specified worker.
Administrators create worker groups, which can be specified in task nodes and
operation parameters. If the specified grouping is deleted or no grouping is
specified, the task will run on any worker.
+- A worker group can contain multiple IP addresses (**aliases cannot be used**), separated by **commas**
+
+ <p align="center">
+ <img src="/img/worker-group-en.png" width="80%" />
+ </p>
+
+### Token manage
+  - Because the back-end interfaces require a login check, token management provides a way to operate the system by calling interfaces directly.
+ <p align="center">
+ <img src="/img/token-en.png" width="80%" />
+ </p>
+- Call examples:
+
+```java
+ /**
+ * test token
+ */
+ public void doPOSTParam()throws Exception{
+ // create HttpClient
+ CloseableHttpClient httpclient = HttpClients.createDefault();
+
+ // create http post request
+ HttpPost httpPost = new
HttpPost("http://127.0.0.1:12345/dolphinscheduler/projects/create");
+ httpPost.setHeader("token", "123");
+ // set parameters
+ List<NameValuePair> parameters = new ArrayList<NameValuePair>();
+ parameters.add(new BasicNameValuePair("projectName", "qzw"));
+ parameters.add(new BasicNameValuePair("desc", "qzw"));
+ UrlEncodedFormEntity formEntity = new UrlEncodedFormEntity(parameters);
+ httpPost.setEntity(formEntity);
+ CloseableHttpResponse response = null;
+ try {
+ // execute
+ response = httpclient.execute(httpPost);
+ // response status code 200
+ if (response.getStatusLine().getStatusCode() == 200) {
+ String content = EntityUtils.toString(response.getEntity(),
"UTF-8");
+ System.out.println(content);
+ }
+ } finally {
+ if (response != null) {
+ response.close();
+ }
+ httpclient.close();
+ }
+ }
+```
+
+### Grant authority
+ - Granting permissions includes project permissions, resource permissions,
datasource permissions, UDF Function permissions.
+> Administrators can authorize projects, resources, data sources and UDF
Functions that are not created by ordinary users. Because project, resource,
data source and UDF Function are all authorized in the same way, the project
authorization is introduced as an example.
+
+> Note: for projects created by the user himself, the user has all permissions, so these projects are not shown in the project list or the selected-project list.
+
+ - 1. Click the authorization button of the designated user, as follows:
+ <p align="center">
+ <img src="/img/operation-en.png" width="80%" />
+ </p>
+
+- 2. Click the project button to authorize projects
+
+<p align="center">
+ <img src="/img/auth-project-en.png" width="80%" />
+ </p>
+
+### Monitor center
+ - Service management is mainly to monitor and display the health status and
basic information of each service in the system.
+
+#### Master monitor
+ - Mainly related information about master.
+<p align="center">
+ <img src="/img/master-monitor-en.png" width="80%" />
+ </p>
+
+#### Worker monitor
+ - Mainly related information of worker.
+
+<p align="center">
+ <img src="/img/worker-monitor-en.png" width="80%" />
+ </p>
+
+#### Zookeeper monitor
+  - Mainly the configuration information of each worker and master in zookeeper.
+
+<p align="center">
+ <img src="/img/zookeeper-monitor-en.png" width="80%" />
+ </p>
+
+#### DB monitor
+ - Mainly the health status of DB
+
+<p align="center">
+ <img src="/img/db-monitor-en.png" width="80%" />
+ </p>
+
+#### Statistics Manage
+ <p align="center">
+ <img src="/img/statistics-en.png" width="80%" />
+ </p>
+
+ - Commands to be executed: statistics on t_ds_command table
+ - Number of commands that failed to execute: statistics on the
t_ds_error_command table
+ - Number of tasks to run: statistics of task_queue data in zookeeper
+ - Number of tasks to be killed: statistics of task_kill in zookeeper
+
+## <span id=TaskNodeType>Task Node Type and Parameter Setting</span>
+
+### Shell
+
+  - The shell node: when the worker executes it, a temporary shell script is generated and executed by a Linux user with the same name as the tenant.
+> Drag the SHELL task node from the toolbar onto the palette and double-click the task node, as follows:
+
+<p align="center">
+ <img src="/img/shell-en.png" width="80%" />
+  </p>
+
+- Node name: The node name in a process definition is unique
+- Run flag: Identify whether the node can be scheduled properly, and if it
does not need to be executed, you can turn on the forbidden execution switch.
+- Description : Describes the function of the node
+- Number of failed retries: the number of times a failed task is resubmitted; supports drop-down selection and manual entry
+- Failure Retry Interval: the interval between resubmissions of a failed task; supports drop-down selection and manual entry
+- Script: the SHELL program developed by the user
+- Resources: a list of resource files that the script needs to invoke
+- Custom parameters: user-defined local parameters of the SHELL task; they replace `${variable}` placeholders in the script
+
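A minimal sketch of the `${variable}` replacement described above (ParamSubstitution is an illustrative helper, not DolphinScheduler's implementation); placeholders without a matching parameter are left untouched:

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative: replace ${name} placeholders in a script with parameter values.
public class ParamSubstitution {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(\\w+)}");

    public static String replace(String script, Map<String, String> params) {
        Matcher m = PLACEHOLDER.matcher(script);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // keep the original ${...} text when no parameter is defined
            String value = params.getOrDefault(m.group(1), m.group(0));
            m.appendReplacement(sb, Matcher.quoteReplacement(value));
        }
        m.appendTail(sb);
        return sb.toString();
    }
}
```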
+### SUB_PROCESS
+  - The sub-process node executes an external workflow definition as a task node.
+> Drag the SUB_PROCESS task node from the toolbar onto the palette and double-click the task node, as follows:
+
+<p align="center">
+ <img src="/img/sub-process-en.png" width="80%" />
+ </p>
+
+- Node name: The node name in a process definition is unique
+- Run flag: Identify whether the node is scheduled properly
+- Description: Describes the function of the node
+- Sub-node: select the process definition of the sub-process; you can jump to it via "Enter sub-node" in the upper right corner.
+
+### DEPENDENT
+
+ - Dependent nodes are **dependent checking nodes**. For example, process A
depends on the successful execution of process B yesterday, and the dependent
node checks whether process B has a successful execution instance yesterday.
+
+> Drag the DEPENDENT task node from the toolbar onto the palette and double-click the task node, as follows:
+
+<p align="center">
+ <img src="/img/current-node-en.png" width="80%" />
+ </p>
+
+ > Dependent nodes provide logical judgment functions, such as checking
whether yesterday's B process was successful or whether the C process was
successfully executed.
+
+ <p align="center">
+ <img src="/img/weekly-A-en.png" width="80%" />
+ </p>
+
+ > For example, process A is a weekly task and process B and C are daily
tasks. Task A requires that task B and C be successfully executed every day of
the last week, as shown in the figure:
+
+ <p align="center">
+ <img src="/img/weekly-A1-en.png" width="80%" />
+ </p>
+
+ > If weekly task A also needs to have executed successfully on Tuesday:
+
+ <p align="center">
+ <img src="/img/weekly-A2-en.png" width="80%" />
+ </p>
+
+### PROCEDURE
+  - The procedure is executed according to the selected data source.
+> Drag the PROCEDURE task node from the toolbar onto the palette and double-click the task node, as follows:
+
+<p align="center">
+ <img src="/img/node-setting-en.png" width="80%" />
+ </p>
+
+- Datasource: The data source type of stored procedure supports MySQL and
POSTGRESQL, and chooses the corresponding data source.
+- Method: The method name of the stored procedure
+- Custom parameters: Custom parameter types of stored procedures support IN
and OUT, and data types support nine data types: VARCHAR, INTEGER, LONG, FLOAT,
DOUBLE, DATE, TIME, TIMESTAMP and BOOLEAN.
+
+### SQL
+  - Drag the SQL task node from the toolbar onto the palette.
+ - Execute non-query SQL functionality
+ <p align="center">
+ <img src="/img/dependent-nodes-en.png" width="80%" />
+ </p>
+
+  - When executing a query SQL, you can choose to send the results by mail, as tables or attachments, to the designated recipients.
+
+<p align="center">
+ <img src="/img/double-click-en.png" width="80%" />
+ </p>
+
+- Datasource: Select the corresponding datasource
+- sql type: supports query and non-query. A query is a select-type statement that returns a result set; its mail notification can use one of three templates: table, attachment, or table plus attachment. A non-query returns no result set and covers update, delete and insert operations
+- sql parameter: input parameter format is key1 = value1; key2 = value2...
+- sql statement: SQL statement
+- UDF function: For HIVE type data sources, you can refer to UDF functions
created in the resource center, other types of data sources do not support UDF
functions for the time being.
+- Custom parameters: for the stored procedure task type, custom parameters are ordered to set values for the method; for the SQL task type, custom parameters instead replace `${variable}` placeholders in the SQL statement. The custom parameter types and data types are the same as for the stored procedure task type
+- Pre Statement: Pre-sql is executed before the sql statement
+- Post Statement: Post-sql is executed after the sql statement
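The `key1=value1;key2=value2` sql parameter format above can be parsed as sketched below (SqlParamParser is an illustrative helper, not DolphinScheduler's code):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative: parse "key1=value1;key2=value2" into an ordered map.
public class SqlParamParser {
    public static Map<String, String> parse(String input) {
        Map<String, String> result = new LinkedHashMap<>();
        if (input == null || input.trim().isEmpty()) {
            return result;
        }
        for (String pair : input.split(";")) {
            String[] kv = pair.split("=", 2); // split on the first '=' only
            if (kv.length == 2) {
                result.put(kv[0].trim(), kv[1].trim());
            }
        }
        return result;
    }
}
```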
+
+
+
+### SPARK
+
+  - The SPARK node executes Spark programs directly; for the spark node, the worker submits the task using `spark-submit`.
+
+> Drag the SPARK task node from the toolbar onto the palette and double-click the task node, as follows:
+>
+>
+
+<p align="center">
+ <img src="/img/spark-submit-en.png" width="80%" />
+ </p>
+
+- Program Type: Support JAVA, Scala and Python
+- Class of the main function: The full path of Main Class, the entry to the
Spark program
+- Master jar package: It's Spark's jar package
+- Deployment: support three modes: yarn-cluster, yarn-client, and local
+- Driver cores: the number of driver cores and the driver memory can be set
+- Executor number: the number of executors, executor memory and executor cores can be set
+- Command Line Parameters: Setting the input parameters of Spark program to
support the replacement of custom parameter variables.
+- Other parameters: support the --jars, --files, --archives and --conf options
+- Resource: If a resource file is referenced in other parameters, you need to
select the specified resource.
+- Custom parameters: user-defined local parameters of the SPARK task; they replace `${variable}` placeholders in the script
+
+Note: JAVA and Scala are only used for identification; there is no difference between them. For a Spark program developed in Python there is no main-function class, and everything else is the same.
+
+### MapReduce(MR)
+  - Using the MR node, MR programs can be executed directly; for the MR node, the worker submits the task using `hadoop jar`
+
+
+> Drag the MR task node from the toolbar onto the palette and double-click the task node, as follows:
+
+ 1. JAVA program
+
+ <p align="center">
+ <img src="/img/java-program-en.png" width="80%" />
+ </p>
+
+- Class of the main function: The full path of the MR program's entry Main
Class
+- Program Type: Select JAVA Language
+- Master jar package: MR jar package
+- Command Line Parameters: Setting the input parameters of MR program to
support the replacement of custom parameter variables
+- Other parameters: support the -D, -files, -libjars and -archives options
+- Resource: If a resource file is referenced in other parameters, you need to
select the specified resource.
+- Custom parameters: User-defined parameters in MR locality that replace the
contents in scripts with ${variables}
+
+2. Python program
+
+<p align="center">
+ <img src="/img/python-program-en.png" width="80%" />
+ </p>
+
+- Program Type: select the Python language
+- Main jar package: the jar package used to run the Python MR program (the Hadoop streaming jar)
+- Other parameters: supports `-D`, `-mapper`, `-reducer`, `-input` and `-output`; user-defined parameters can also be set here, for example:
+
+  `-mapper "mapper.py 1" -file mapper.py -reducer reducer.py -file reducer.py -input /journey/words.txt -output /journey/out/mr/${currentTimeMillis}`
+
+  The `"mapper.py 1"` after `-mapper` contains two parts: the first is the script `mapper.py`, and the second is its argument `1`.
+- Resource: if a resource file is referenced in other parameters, the corresponding resource must be selected
+- Custom parameters: local user-defined parameters of the task that replace `${variable}` placeholders in the script
+
+### Python
+ - With the Python node, Python scripts can be executed directly. For Python nodes, the worker submits the task with `python <script>`.
+
+
+
+
+> Drag the task node from the toolbar onto the palette, then double-click it to open the configuration dialog:
+
+<p align="center">
+ <img src="/img/python-en.png" width="80%" />
+ </p>
+
+- Script: the Python program developed by the user
+- Resource: a list of resource files that the script needs to reference
+- Custom parameters: local user-defined parameters of the task that replace `${variable}` placeholders in the script
+
+### System parameter
+
+<table>
+ <tr><th>variable</th><th>meaning</th></tr>
+ <tr>
+ <td>${system.biz.date}</td>
+ <td>The day before the scheduled time of the daily scheduling instance, in yyyyMMdd format; when data is supplemented, this date + 1</td>
+ </tr>
+ <tr>
+ <td>${system.biz.curdate}</td>
+ <td>The scheduled time of the daily scheduling instance, in yyyyMMdd format; when data is supplemented, this date + 1</td>
+ </tr>
+ <tr>
+ <td>${system.datetime}</td>
+ <td>The scheduled time of the daily scheduling instance, in yyyyMMddHHmmss format; when data is supplemented, this date + 1</td>
+ </tr>
+</table>
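For a concrete illustration (assuming GNU `date` and a hypothetical scheduled trigger time of 2020-02-26 16:07:43), the three variables would resolve as:

```shell
# Resolve the three system parameters for an example trigger time.
TRIGGER="2020-02-26 16:07:43"
BIZ_DATE=$(date -d "$TRIGGER 1 day ago" +%Y%m%d)   # ${system.biz.date}
BIZ_CURDATE=$(date -d "$TRIGGER" +%Y%m%d)          # ${system.biz.curdate}
DATETIME=$(date -d "$TRIGGER" +%Y%m%d%H%M%S)       # ${system.datetime}
echo "$BIZ_DATE $BIZ_CURDATE $DATETIME"            # 20200225 20200226 20200226160743
```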
+
+
+### Time Customization Parameters
+
+ - Supports custom variable names in code; the declaration form is ${variable name}. A variable can refer to a "system parameter" or specify a "constant".
+
+ - When we define a benchmark variable as $[...], [yyyyMMddHHmmss] can be decomposed and combined arbitrarily, such as $[yyyyMMdd], $[HHmmss], $[yyyy-MM-dd], etc.
+
+ - Can also do this:
+
+
+
+ * N years later: $[add_months(yyyyMMdd, 12*N)]
+ * N years earlier: $[add_months(yyyyMMdd, -12*N)]
+ * N months later: $[add_months(yyyyMMdd, N)]
+ * N months earlier: $[add_months(yyyyMMdd, -N)]
+ * N weeks later: $[yyyyMMdd+7*N]
+ * N weeks earlier: $[yyyyMMdd-7*N]
+ * N days later: $[yyyyMMdd+N]
+ * N days earlier: $[yyyyMMdd-N]
+ * N hours later: $[HHmmss+N/24]
+ * N hours earlier: $[HHmmss-N/24]
+ * N minutes later: $[HHmmss+N/24/60]
+ * N minutes earlier: $[HHmmss-N/24/60]
+
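As a sanity check of the day and week offsets above (assuming GNU `date`, a hypothetical base date of 20200226, and N=2):

```shell
# Verify the arithmetic behind $[yyyyMMdd+N] and $[yyyyMMdd+7*N] for N=2.
BASE="20200226"
PLUS_N=$(date -d "$BASE +2 days" +%Y%m%d)      # $[yyyyMMdd+N]   -> 20200228
PLUS_7N=$(date -d "$BASE +14 days" +%Y%m%d)    # $[yyyyMMdd+7*N] -> 20200311
MINUS_N=$(date -d "$BASE -2 days" +%Y%m%d)     # $[yyyyMMdd-N]   -> 20200224
echo "$PLUS_N $PLUS_7N $MINUS_N"
```

Note that 2020 is a leap year, so two weeks after 2020-02-26 crosses the 29-day February into 2020-03-11.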
+
+### <span id=CustomParameters>User-defined parameters</span>
+
+ - User-defined parameters are divided into global parameters and local parameters. Global parameters are passed in when a process definition or process instance is saved, and can be referenced by the local parameters of any task node in the whole process.
+
+ For example:
+<p align="center">
+ <img src="/img/user-defined-en.png" width="80%" />
+ </p>
+
+ - global_bizdate is a global parameter that refers to a system parameter.
+
+<p align="center">
+ <img src="/img/user-defined1-en.png" width="80%" />
+ </p>
+
+ - In a task, local_param_bizdate references the global parameter via \${global_bizdate}; in a script, the value of local_param_bizdate can be referenced via \${local_param_bizdate}, or the value of local_param_bizdate can be set directly through JDBC.
diff --git a/docs/en-us/1.2.1/user_doc/upgrade.md
b/docs/en-us/1.2.1/user_doc/upgrade.md
new file mode 100644
index 0000000..7118acc
--- /dev/null
+++ b/docs/en-us/1.2.1/user_doc/upgrade.md
@@ -0,0 +1,39 @@
+
+# DolphinScheduler upgrade documentation
+
+## 1. Back up the previous version of the files and database
+
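No specific commands are prescribed for this step; as a hedged sketch (the installation path, database name and user below are hypothetical, not defaults), a backup might look like:

```shell
# Hypothetical backup sketch; adjust paths and credentials to your deployment.
DS_HOME="/opt/dolphinscheduler"            # previous installation directory
BACKUP="${DS_HOME}_bak_$(date +%Y%m%d)"
echo "Would copy ${DS_HOME} to ${BACKUP}"
# cp -r "$DS_HOME" "$BACKUP"
# mysqldump -u ds_user -p dolphinscheduler > "${BACKUP}.sql"
```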
+## 2. Stop all services of dolphinscheduler
+
+ `sh ./script/stop-all.sh`
+
+## 3. Download the new version of the installation package
+
+-
[download](https://dolphinscheduler.apache.org/en-us/docs/user_doc/download.html),
download the latest version of the front and back installation packages
(backend referred to as dolphinscheduler-backend, front end referred to as
dolphinscheduler-front)
+- The following upgrade operations need to be performed in the new version of
the directory
+
+## 4. Database upgrade
+- Modify the following properties in conf/application-dao.properties
+
+```
+ spring.datasource.url
+ spring.datasource.username
+ spring.datasource.password
+```
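For example, pointing the upgrade at a MySQL instance might look like this (host, schema name and credentials below are placeholders, not shipped defaults):

```
spring.datasource.url=jdbc:mysql://127.0.0.1:3306/dolphinscheduler?useUnicode=true&characterEncoding=UTF-8
spring.datasource.username=ds_user
spring.datasource.password=your_password
```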
+
+- Execute database upgrade script
+
+`sh ./script/upgrade-dolphinscheduler.sh`
+
+## 5. Backend service upgrade
+
+- Modify the content of the install.sh configuration and execute the upgrade
script
+
+ `sh install.sh`
+
+## 6. Frontend service upgrade
+
+- Overwrite the previous version of the dist directory
+- Restart the nginx service
+
+ `systemctl restart nginx`
diff --git a/site_config/site.js b/site_config/site.js
index eba850b..ff17e6d 100755
--- a/site_config/site.js
+++ b/site_config/site.js
@@ -18,7 +18,7 @@ export default {
children: [
{
key: 'docs1',
- text: '1.2.0',
+ text: '1.2.1',
link: '/en-us/docs/1.2.1/user_doc/quick-start.html',
},
{
@@ -123,12 +123,12 @@ export default {
link: '/zh-cn/docs/1.2.1/user_doc/quick-start.html',
},
{
- key: 'docs1',
+ key: 'docs2',
text: '1.2.0',
link: '/zh-cn/docs/1.2.0/user_doc/quick-start.html',
},
{
- key: 'docs2',
+ key: 'docs3',
text: '1.1.0(Not Apache Release)',
link: 'https://analysys.github.io/easyscheduler_docs_cn/',
}