+1
------------------ Original Message ------------------
From: "dev" <[email protected]>;
Sent: Friday, November 27, 2020, 4:01 PM
To: "[email protected]" <[email protected]>;
Subject: Re: [DISCUSS] Process definition json split design
hi:
- The Master is only responsible for DAG scheduling. When a job is
executed, it sends the job code to the Worker, and the Worker is
responsible for querying the job details and executing the job
+1
Not implementing option 1 would have a great impact.
--------------------------------------
BoYi Zhang
E-mail: [email protected]
On 11/27/2020 15:53, Hemin Wen <[email protected]> wrote:
There are two solutions for the query timing of job details:
- The Master is only responsible for DAG scheduling. When a job is
executed, it sends the job code to the Worker, and the Worker is
responsible for querying the job details and executing the job
- The Master executes DAG scheduling, queries the job details when a job is
executed, and then sends them to the Worker to execute the job
--------------------
DolphinScheduler(Incubator) Committer
Hemin Wen
[email protected]
--------------------
Hemin Wen <[email protected]> ??2020??11??25?????? ????10:01??????
Hi!
Regarding the json splitting of the workflow definition, the following is
the design plan for splitting it into three tables.
Everyone is welcome to discuss it together.
--------------------------------------------------------------------------------------------------------------
## 1. Current situation
The workflow definition of the current DS system includes task definition
data and task relationship data. In the database design, task data and task
relationship data are stored as a string field (process_definition_json) in
the workflow definition table (t_ds_process_definition).
As workflows and tasks increase, the following problems arise:
- Task data, relational data, and workflow data are coupled together, which
is not friendly to single-task scheduling scenarios: a task must be created
inside a workflow
- Tasks cannot be reused, because a task is created inside a workflow
- High maintenance cost: modifying any task requires updating the workflow
data as a whole, which also increases the log cost
- When there are many tasks in a workflow, global search and statistical
analysis are inefficient, e.g. querying which tasks use which data source
- Poor scalability; for example, implementing a lineage feature in the
future would only make the workflow definition more and more bloated
- The boundaries between tasks, relationships, and workflows are blurred.
Condition nodes and delay nodes are treated as tasks, while they are
actually a combination of relationships and conditions
Based on the above pain points, we need to redefine the business boundaries
of tasks, relationships, and workflows, and redesign their data structures
accordingly.
## 2. Design Ideas
### 2.1 Workflow, relation, job
First of all, setting aside the current implementation, let's clarify the
business boundaries of tasks (referred to as jobs below), relationships,
and workflows, and how to decouple them:
- Job: the task that the scheduling system actually needs to execute; a job
only contains the data and resources needed to execute it
- Relation: the relationship between jobs and its execution conditions,
including the execution relationship (after A completes, execute B) and
execution conditions (after A completes successfully, execute B; after A
completes and fails, execute C; 30 minutes after A completes, execute D)
- Workflow: the carrier of a set of relationships; the workflow only saves
the relationships between jobs (the DAG is a display form of the workflow,
a way to create relationships)
Combining the functions supported by the current DS, we can classify them
as follows:
- Job: dependency check, sub-process, Shell, stored procedure, Sql, Spark,
Flink, MR, Python, Http, DataX, Sqoop
- Relation: serial execution, parallel execution, aggregate execution,
conditional branch, delayed execution
- Workflow: the boundary of scheduling execution, containing a set of
relationships
#### 2.1.1 Further refinement
The new job definition data is not much different from the current one:
both consist of common fields and custom fields; only the
relationship-related fields need to be removed.
The new workflow definition data is likewise close to the current one; just
remove the json field.
Relational data can be abstracted, according to the classification above,
into two nodes and one path. A node is a job, and the path carries the
conditional rules that must be met to go from the pre-node to the
post-node. The conditional rules include: unconditional, judgment
condition, and delay condition.
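To make this abstraction concrete, below is a minimal Java sketch of the
node/path model (the class, enum, and field names are illustrative
assumptions, not the final implementation):

```java
// Illustrative sketch (not the final implementation) of the "two nodes,
// one path" relation model: a path carries the conditional rule that must
// be met to go from the pre-node to the post-node.
public class TaskRelation {

    // The three kinds of conditional rules a path can carry.
    public enum ConditionType {
        NONE,   // unconditional: run the post-node once the pre-node completes
        JUDGE,  // judgment condition: e.g. run only if the pre-node succeeded
        DELAY   // delay condition: e.g. run N minutes after the pre-node completes
    }

    private String processDefinitionCode; // workflow this relation belongs to
    private String nodeCode;              // pre-node (workflow code / job code)
    private String postNodeCode;          // post-node (workflow code / job code)
    private ConditionType conditionType;  // rule type on the path
    private String conditionParams;       // rule parameters, e.g. {"delay": 30}

    public String getNodeCode()     { return nodeCode; }
    public String getPostNodeCode() { return postNodeCode; }
    // remaining getters/setters omitted for brevity
}
```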
### 2.2 Version Management
With the business boundaries clarified, after decoupling they become
reference relationships: workflow to relation is one-to-many, and relation
to job is one-to-many. Besides the definition data, we also need to
consider instance data: every time a workflow is scheduled and executed, a
workflow instance is generated. Jobs and workflows can change, while
workflow instances must still support viewing, rerun, failure recovery,
etc. This requires introducing version management for the definition data:
every change to a workflow, relation, or job needs to save the old version
data and generate new version data.
So the design idea here is:
- Definition data needs an added version field
- Each definition table needs a corresponding log table
- When creating definition data, double write to the definition table and
the log table; when modifying definition data, save the modified version to
the log table
- Reference data in the definition tables does not need to store version
information (it refers to the latest version); the version information at
execution time is saved in the instance data
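As a minimal in-memory sketch of these version rules (the names here are
assumptions for illustration; the real implementation would write through
mappers to the definition and log tables in section 3.1):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal in-memory sketch of the version rules above; real code would
// persist to t_ds_task_definition / t_ds_task_definition_log instead.
public class VersionSketch {

    static class TaskDefinition {
        final String code;    // union_code
        final int version;
        final String name;
        TaskDefinition(String code, int version, String name) {
            this.code = code; this.version = version; this.name = name;
        }
    }

    // Definition "table": holds only the latest version, keyed by union_code.
    private final Map<String, TaskDefinition> definitionTable = new HashMap<>();
    // Log "table": holds every version ever written.
    private final List<TaskDefinition> logTable = new ArrayList<>();

    // Create: double write the first version to both tables.
    public void create(String code, String name) {
        TaskDefinition def = new TaskDefinition(code, 1, name);
        definitionTable.put(code, def);
        logTable.add(def);
    }

    // Update: the definition table keeps only the latest version; the
    // modified version is appended to the log table as a new row.
    public void update(String code, String newName) {
        TaskDefinition next = new TaskDefinition(
                code, definitionTable.get(code).version + 1, newName);
        definitionTable.put(code, next);
        logTable.add(next);
    }
}
```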
### 2.3 Coding Design
This also involves the import and export of workflow and job definition
data. According to the previous community discussion, a coding scheme needs
to be introduced: each piece of workflow, relation, and job data will have
a unique code. Related issue:
https://github.com/apache/incubator-dolphinscheduler/issues/3820
Resource: RESOURCE_xxx
Task: TASK_xxx
Relation: RELATION_xxx
Workflow: PROCESS_xxx
Project: PROJECT_xxx
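The exact generation scheme is still open in the linked issue; purely for
illustration, here is a sketch that combines the type prefix with a UUID
suffix (the UUID choice is an assumption, not the decided scheme):

```java
import java.util.UUID;

// Hypothetical code generator: type prefix + unique suffix. The suffix
// scheme (UUID, snowflake ID, ...) is still under discussion in the issue
// linked above; UUID is used here only for illustration.
public final class CodeGenerator {
    public static String resourceCode() { return "RESOURCE_" + suffix(); }
    public static String taskCode()     { return "TASK_" + suffix(); }
    public static String relationCode() { return "RELATION_" + suffix(); }
    public static String processCode()  { return "PROCESS_" + suffix(); }
    public static String projectCode()  { return "PROJECT_" + suffix(); }

    private static String suffix() {
        return UUID.randomUUID().toString().replace("-", "");
    }

    private CodeGenerator() { }
}
```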
## 3. Design Plan
### 3.1 Table model design
#### 3.1.1 Job definition table: t_ds_task_definition
| Column Name | Description |
| ----------------------- | -------------- |
| id | Self-incrementing ID |
| union_code | unique code |
| version | Version |
| name | Job name |
| description | description |
| task_type | Job type |
| task_params | Job custom parameters |
| run_flag | Run flag |
| task_priority | Job priority |
| worker_group | worker group |
| fail_retry_times | Number of failed retries |
| fail_retry_interval | Failure retry interval |
| timeout_flag | Timeout flag |
| timeout_notify_strategy | Timeout notification strategy |
| timeout_duration | Timeout duration |
| create_time | Creation time |
| update_time | Modification time |
#### 3.1.2 Task relation table: t_ds_task_relation
| Column Name | Description |
| ----------------------- | --------------------------------------- |
| id | Self-incrementing ID |
| union_code | unique code |
| version | Version |
| process_definition_code | Workflow code |
| node_code | Node code (workflow code/job code) |
| post_node_code | Post node code (workflow code/job code) |
| condition_type | Condition type 0: None 1: Judgment condition 2: Delay condition |
| condition_params | Condition parameters |
| create_time | Creation time |
| update_time | Modification time |
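For example, a branch where TASK_B runs after TASK_A succeeds and TASK_C
runs after TASK_A fails could be stored as two rows like these (the codes
and the condition_params format are illustrative only):

| process_definition_code | node_code | post_node_code | condition_type | condition_params |
| ---- | ---- | ---- | ---- | ---- |
| PROCESS_1 | TASK_A | TASK_B | 1 | {"state": "success"} |
| PROCESS_1 | TASK_A | TASK_C | 1 | {"state": "failure"} |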
#### 3.1.3 Workflow definition table: t_ds_process_definition
| Column Name | Description |
| ---- | ---- |
| id | Self-incrementing ID |
| union_code | unique code |
| version | Version |
| name | Workflow name |
| project_code | Project code |
| release_state | Release state |
| user_id | Owning user ID |
| description | description |
| global_params | Global parameters |
| flag | Whether the process is available: 0 not available, 1 available |
| receivers | recipients |
| receivers_cc | CC |
| timeout | Timeout time |
| tenant_id | tenant ID |
| create_time | Creation time |
| update_time | Modification time |
#### 3.1.4 Job definition log table: t_ds_task_definition_log
Add operation type (add, modify, delete), operator, and operation time
based on the job definition table
#### 3.1.5 Job relation log table: t_ds_task_relation_log
Add operation type (add, modify, delete), operator, and operation time
based on the job relationship table
#### 3.1.6 Workflow definition log table: t_ds_process_definition_log
Add operation type (add, modify, delete), operator, and operation time
based on the workflow definition table
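For instance, the extra columns shared by the three log tables could look
like this (the column names here are illustrative, not final):

| Column Name | Description |
| ---- | ---- |
| operate_type | Operation type (add, modify, delete) |
| operator | Operator (user ID) |
| operate_time | Operation time |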
### 3.2 Frontend
*The design here is just a personal idea; front-end help is needed to
design the interaction.*
- Job management functions need to be added, including: a job list, and job
create, update, delete, and view-details operations
- The create-workflow page needs to split the json into workflow definition
data and job relationship data and send them to the back-end API layer to
save/update (a sketch of such a request follows this list)
- On the workflow page, when dragging task nodes, add an option to
reference an existing job
- Conditional branch nodes and delay nodes need to be resolved into the
conditional rule data of the relationship; conversely, the conditional rule
data returned by the backend needs to be displayed as the corresponding
node when querying the workflow
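As referenced in the list above, here is a rough sketch of the request body
the back-end API layer might receive after the split, reusing the
TaskRelation sketch from section 2.1.1 (the class and field names are
assumptions):

```java
import java.util.List;

// Hypothetical request body for saving/updating a workflow after the json
// split: instead of one big process_definition_json string, the front end
// sends the workflow definition data plus the relation rows from the DAG.
public class SaveProcessDefinitionRequest {
    private String code;                  // workflow union_code (empty on create)
    private String name;                  // workflow name
    private String globalParams;          // global parameters
    private List<TaskRelation> relations; // DAG edges, including condition rules
    private List<String> taskCodes;       // referenced job definitions, by code
    // getters/setters omitted for brevity
}
```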
### 3.3 Master
When the Master schedules a workflow, <Build dag from json> needs to be
changed to <Build dag from relational data>. When executing a workflow,
first load the relational data in full (no job data is loaded here),
generate the DAG, and traverse the DAG for execution, fetching the job data
of each job only when it needs to be executed.
Other execution processes remain consistent with the existing ones.
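A rough sketch of <Build dag from relational data>, reusing the
TaskRelation sketch from section 2.1.1 (the adjacency-map representation is
an illustrative choice; note that only node codes are needed to build the
DAG):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch: build the DAG from relation rows instead of parsing
// process_definition_json. Only node codes are needed at this point; the
// job definition data is loaded later, when a node is actually executed.
public class DagBuilder {
    public static Map<String, Set<String>> buildDag(List<TaskRelation> relations) {
        Map<String, Set<String>> dag = new HashMap<>(); // node -> post nodes
        for (TaskRelation r : relations) {
            dag.computeIfAbsent(r.getNodeCode(), k -> new HashSet<>())
               .add(r.getPostNodeCode());
            dag.computeIfAbsent(r.getPostNodeCode(), k -> new HashSet<>());
        }
        return dag;
    }
}
```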
--------------------
DolphinScheduler(Incubator) Committer
Hemin Wen
[email protected]
--------------------