And thanks to Furkan and Sheng for the reminder: everything that happens around DolphinScheduler should be done in an open way.

Best Regards
---------------
DolphinScheduler(Incubator) PPMC
Lidong Dai 代立冬
[email protected]
---------------

On Mon, Nov 23, 2020 at 10:25 AM lidong dai <[email protected]> wrote:

> hi,
> This was not a formal meeting, just a way to understand what the developers
> think about DolphinScheduler, so I didn't announce it on the mailing list
> in advance. I think we could hold an online meeting this week to discuss
> the topic of "how to refactor".
>
> Best Regards
> ---------------
> DolphinScheduler(Incubator) PPMC
> Lidong Dai 代立冬
> [email protected]
> ---------------
>
> On Mon, Nov 23, 2020 at 10:01 AM Sheng Wu <[email protected]> wrote:
>
>> Hi Calvin
>>
>> I think you misunderstood the point of Furkan's question. He was asking
>> how the meeting was set up, which should be done in an open way.
>>
>> Sheng Wu 吴晟
>> Twitter, wusheng1108
>>
>> CalvinKirs <[email protected]> wrote on Mon, Nov 23, 2020 at 9:47 AM:
>>
>> > Hi Furkan, you are very welcome. This meeting is a Chinese video
>> > conference. We will send meeting invitations in the Chinese contributor
>> > exchange group. The relevant design conclusions will be reflected on the
>> > dev mailing list, and you are very welcome to discuss them with us by
>> > mail.
>> >
>> > In addition, thank you very much for your reminder. For later online
>> > meetings, we will send meeting invitations in advance on the mailing
>> > list, and you can also communicate with us during the meeting.
>> >
>> > Best wishes!
>> > CalvinKirs
>> >
>> > On 11/22/2020 23:05, Furkan KAMACI <[email protected]> wrote:
>> > Hi BoYi,
>> >
>> > May I ask where you decided to hold this meeting (Slack, mailing list,
>> > etc.)?
>> >
>> > Kind Regards,
>> > Furkan KAMACI
>> >
>> > On Thu, Nov 19, 2020 at 10:21 AM wu shaoj <[email protected]> wrote:
>> >
>> > Cool, excellent meeting
>> >
>> > From: boyi <[email protected]>
>> > Date: Thursday, November 19, 2020 at 15:18
>> > To: [email protected] <[email protected]>
>> > Subject: [Online Meeting Invitation]DolphinScheduler refactor meeting(Second)
>> >
>> > Hi, DolphinScheduler Community:
>> >
>> > We discussed restructuring how DolphinScheduler stores workflow
>> > definitions (splitting the JSON data) at 2020-11-17 19:00 Beijing time.
>> > A total of 10+ partners participated in this meeting. The points
>> > discussed are as follows:
>> >
>> > 1: Currently, we propose splitting the workflow definition table
>> > (t_ds_process_definition) into two tables:
>> > t_ds_process_definition [subject table] and
>> > t_ds_process_definition_task [task detail table].
>> > The data structures are as follows:
>> >
>> > | t_ds_process_definition |
>> > | name | type | description |
>> > | id | int(11) | id |
>> > | name | varchar(255) | process definition name |
>> > | version | int(11) | process definition version |
>> > | release_state | tinyint(4) | 0 is not online, 1 is online |
>> > | project_id | int(11) | project id |
>> > | user_id | int(11) | user id |
>> > | description | text | description |
>> > | global_params | text | global params |
>> > | flag | tinyint(4) | 0 is not available, 1 is available |
>> > | receivers | text | recipients |
>> > | receivers_cc | text | CC recipients |
>> > | create_time | datetime | create time |
>> > | timeout | int(11) | timeout |
>> > | tenant_id | int(11) | tenant id |
>> > | update_time | datetime | update time |
>> > | modify_by | varchar(36) | name of the user who made the last change |
>> >
>> > | t_ds_process_definition_task |
>> > | name | type | description |
>> > | id | int(11) | task id |
>> > | name | varchar(255) | task name |
>> > | type | varchar(64) | type [SHELL, PYTHON, DATAX, SPARK, etc.] |
>> > | process_definition_id | int(11) | process definition id |
>> > | params | longtext | custom parameters [JSON] |
>> > | description | text | description |
>> > | runFlag | tinyint(4) | run flag |
>> > | conditionResult | longtext | conditional branch [JSON] |
>> > | dependence | longtext | task dependency [JSON] |
>> > | maxRetryTimes | tinyint(4) | max retry times |
>> > | retryInterval | tinyint(4) | retry interval |
>> > | timeout | varchar(128) | timeout [JSON] |
>> > | taskInstancePriority | varchar(16) | task priority |
>> > | workerGroup | varchar(64) | worker group name |
>> > | preTasks | varchar(128) | preceding tasks |
>> > | locations | text | DAG node locations |
>> > | connects | text | DAG node connections |
>> > | resource | varchar(255) | resource mark |
>> > | datasource | varchar(255) | datasource mark |
>> >
>> > 2: We are considering whether a third table is needed to store the
>> > dependencies between tasks, as a higher-level abstraction of workflow
>> > dependent nodes, condition judgments, and task lineage. These would be
>> > split out into the third table.
>> >
>> > 3. Issues to be discussed in the next meeting
>> > 3.1. Should the workflow definition table be split into two tables or
>> > into three tables?
>> > 3.2. How to store data sources and resource files
>> > 3.3. How to store workflow task instances without affecting
>> > re-running, while still supporting existing functions such as editing
>> > workflow definitions.
>> > 3.4. Workflow definition versioning
>> >
>> > We are very grateful to the following friends for their discussions:
>> > dailidong, lgcareer, CalvinKirs, Rubik-W, leonbao, zixi0825,
>> > JinyLeeChina, chenxingchun, BoYiZhang, etc. They provided a lot of
>> > effective suggestions for this meeting.
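>> >
>> > At the same time, the community also hopes that more people can
>> > participate. Thank you very much.
>> >
>> > Best wishes!
>> > BoYiZhang
>> >
>> > ------------------------------------------------
>> >
>> > Hi, DolphinScheduler community:
>> >
>> > At 2020-11-17 19:00 Beijing time we discussed refactoring the storage
>> > structure of DolphinScheduler workflow definitions (splitting the large
>> > JSON). 10+ partners took part in this meeting. The points discussed are
>> > as follows:
>> >
>> > 1: Currently, we propose splitting the workflow definition table
>> > (t_ds_process_definition) into two tables, with the data structures
>> > shown below:
>> > t_ds_process_definition [subject table] and
>> > t_ds_process_definition_task [task detail table]
>> >
>> > 2: We are considering whether a third table is needed to store the
>> > dependencies between tasks, as a higher-level abstraction of workflow
>> > dependent nodes, condition judgments, and task lineage. These would be
>> > split out into the third table.
>> >
>> > | t_ds_process_definition [subject table] |
>> > | No. | field | type | description |
>> > | 1 | id | int(11) | primary key |
>> > | 2 | name | varchar(255) | process definition name |
>> > | 3 | version | int(11) | process definition version |
>> > | 4 | release_state | tinyint(4) | release state of the process definition: 0 not online, 1 online |
>> > | 5 | project_id | int(11) | project id |
>> > | 6 | user_id | int(11) | id of the user who owns the process definition |
>> > | 7 | description | text | process definition description |
>> > | 8 | global_params | text | global parameters |
>> > | 9 | flag | tinyint(4) | whether the process is available: 0 not available, 1 available |
>> > | 10 | receivers | text | recipients |
>> > | 11 | receivers_cc | text | CC recipients |
>> > | 12 | create_time | datetime | create time |
>> > | 13 | timeout | int(11) | timeout |
>> > | 14 | tenant_id | int(11) | tenant id |
>> > | 15 | update_time | datetime | update time |
>> > | 16 | modify_by | varchar(36) | user who made the last change |
>> >
>> > | t_ds_process_definition_task [task detail table] |
>> > | No. | parameter | type | description |
>> > | 1 | id | int(11) | task id |
>> > | 2 | name | varchar(255) | task name |
>> > | 3 | type | varchar(64) | type [SHELL, PYTHON, DATAX, SPARK, etc.] |
>> > | 4 | process_definition_id | int(11) | process definition id |
>> > | 5 | params | longtext | custom parameters [JSON format; keeps the original params field; whether custom parameters and resource file parameters should be split out is still open] |
>> > | 6 | description | text | description |
>> > | 7 | runFlag | tinyint(4) | run flag |
>> > | 8 | conditionResult | longtext | conditional branch [JSON format] |
>> > | 9 | dependence | longtext | task dependency [JSON format] |
>> > | 10 | maxRetryTimes | tinyint(4) | max retry times |
>> > | 11 | retryInterval | tinyint(4) | retry interval |
>> > | 12 | timeout | varchar(128) | timeout control strategy [JSON format] |
>> > | 13 | taskInstancePriority | varchar(16) | task priority |
>> > | 14 | workerGroup | varchar(64) | worker group name |
>> > | 15 | preTasks | varchar(128) | preceding tasks |
>> > | 16 | locations | text | node coordinate information |
>> > | 17 | connects | text | node connection information |
>> > | 18 | resource | varchar(255) | resource file marks, comma-separated |
>> > | 19 | datasource | varchar(255) | data source marks, comma-separated |
>> >
>> > 3. Issues to be discussed in the next meeting
>> > 3.1. Should the workflow definition table be split into two tables or
>> > into three tables?
>> > 3.2. How to store data sources and resource files
>> > 3.3. How to store workflow task instances without affecting
>> > re-running, while still supporting existing functions such as editing
>> > workflow definitions.
>> > 3.4. Workflow definition versioning
>> >
>> > We are very grateful to the following friends for their discussions:
>> > dailidong, lgcareer, CalvinKirs, Rubik-W, leonbao, zixi0825,
>> > JinyLeeChina, chenxingchun, BoYiZhang, etc. They provided a lot of
>> > effective suggestions for this meeting.
>> >
>> > At the same time, the community also hopes that more people can
>> > participate. Thank you very much.
>> >
>> > --------------------------------------
>> > BoYi Zhang
>> > E-mail : [email protected]
>> >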
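
To make the proposed split easier to discuss, here is a minimal sketch of how one workflow definition stored as a single JSON document might be written into the two tables from the meeting notes. It is only an illustration, not the project's implementation: the shape of the input dictionary, the sample values, the `script` key inside `params`, and the helper names `create_tables`, `split_definition` and `task_edges` are all assumptions, only a subset of the proposed columns is used, and SQLite types stand in for the MySQL types listed in the tables. The last helper shows how (pre-task, task) pairs could be derived from `preTasks`, which is roughly the information a possible third table would hold.

```python
import json
import sqlite3

# Hypothetical workflow definition kept as one JSON-like document.
# Column names come from the proposed tables; the nesting is assumed.
definition = {
    "name": "daily_etl",
    "version": 1,
    "project_id": 7,
    "global_params": [],
    "tasks": [
        {"name": "extract", "type": "SHELL", "params": {"script": "echo extract"},
         "maxRetryTimes": 0, "retryInterval": 1, "workerGroup": "default",
         "preTasks": []},
        {"name": "load", "type": "PYTHON", "params": {"script": "print('load')"},
         "maxRetryTimes": 3, "retryInterval": 1, "workerGroup": "default",
         "preTasks": ["extract"]},
    ],
}


def create_tables(conn):
    """Create reduced versions of the two proposed tables."""
    conn.executescript("""
        CREATE TABLE t_ds_process_definition (
            id INTEGER PRIMARY KEY,
            name TEXT, version INTEGER, project_id INTEGER, global_params TEXT);
        CREATE TABLE t_ds_process_definition_task (
            id INTEGER PRIMARY KEY,
            process_definition_id INTEGER,
            name TEXT, type TEXT, params TEXT,
            maxRetryTimes INTEGER, retryInterval INTEGER,
            workerGroup TEXT, preTasks TEXT);
    """)


def split_definition(conn, definition):
    """Write one row to the subject table and one row per task to the task table."""
    cur = conn.execute(
        "INSERT INTO t_ds_process_definition "
        "(name, version, project_id, global_params) VALUES (?, ?, ?, ?)",
        (definition["name"], definition["version"], definition["project_id"],
         json.dumps(definition["global_params"])),
    )
    definition_id = cur.lastrowid
    for task in definition["tasks"]:
        conn.execute(
            "INSERT INTO t_ds_process_definition_task "
            "(process_definition_id, name, type, params, maxRetryTimes, "
            "retryInterval, workerGroup, preTasks) VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
            (definition_id, task["name"], task["type"], json.dumps(task["params"]),
             task["maxRetryTimes"], task["retryInterval"], task["workerGroup"],
             json.dumps(task["preTasks"])),
        )
    return definition_id


def task_edges(conn, definition_id):
    """Derive (pre_task, task) pairs from preTasks for one definition."""
    rows = conn.execute(
        "SELECT name, preTasks FROM t_ds_process_definition_task "
        "WHERE process_definition_id = ? ORDER BY id",
        (definition_id,),
    )
    return [(pre, name) for name, pre_tasks in rows for pre in json.loads(pre_tasks)]


conn = sqlite3.connect(":memory:")
create_tables(conn)
definition_id = split_definition(conn, definition)
edges = task_edges(conn, definition_id)
task_count = conn.execute(
    "SELECT COUNT(*) FROM t_ds_process_definition_task WHERE process_definition_id = ?",
    (definition_id,),
).fetchone()[0]
```

In this sketch the dependency information still lives in the `preTasks` column of the task table. Whether it should move to a separate relation table is exactly the open question 3.1 from the notes.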
