Re: [Online Meeting Invitation]DolphinScheduler refactor meeting(Second)

lidong dai Sun, 22 Nov 2020 18:39:41 -0800

Got it, we will keep that in mind.


Best Regards
---------------
DolphinScheduler(Incubator) PPMC
Lidong Dai 代立冬
[email protected]
---------------


On Mon, Nov 23, 2020 at 10:33 AM Sheng Wu <[email protected]> wrote:

> This is about informing everyone; it is about open discussion and decision-making.
> People may or may not join the meeting, depending on the language, time zone, or time
> slot. But the key is that by sharing the information, you are
> welcoming anyone to join.
>
> The conclusion should be brought back to the mailing list, yes, that is true. But the
> workflow is also important: the way consensus is reached.
>
> Sheng Wu 吴晟
> Twitter, wusheng1108
>
>
> lidong dai <[email protected]> wrote on Mon, Nov 23, 2020 at 10:26 AM:
>
> > Hi,
> > This was not a formal meeting; it was just to understand what the developers
> > think about DolphinScheduler, so I didn't announce it on the mailing list in
> > advance. I think we could hold an online meeting this week to
> > discuss the topic of "how to refactor".
> >
> >
> > Best Regards
> > ---------------
> > DolphinScheduler(Incubator) PPMC
> > Lidong Dai 代立冬
> > [email protected]
> > ---------------
> >
> >
> > On Mon, Nov 23, 2020 at 10:01 AM Sheng Wu <[email protected]>
> > wrote:
> >
> > > Hi Calvin
> > >
> > > I think you misunderstood the point of Furkan's question. He was asking
> > > how the meeting was set up, which should be done in an open way.
> > >
> > > Sheng Wu 吴晟
> > > Twitter, wusheng1108
> > >
> > >
> > > CalvinKirs <[email protected]> wrote on Mon, Nov 23, 2020 at 9:47 AM:
> > >
> > > > Hi Furkan, you are very welcome. This meeting is a video conference held
> > > > in Chinese. We will send meeting invitations in the Chinese contributor
> > > > exchange group. The relevant design conclusions will be posted to the dev
> > > > mailing list, and you are very welcome to discuss them with us by mail.
> > > >
> > > >
> > > > In addition, thank you very much for your reminder. For future
> > > > online meetings, we will send meeting invitations in advance on the
> > > > mailing list, and you can also communicate with us during the meeting.
> > > >
> > > >
> > > > Best wishes!
> > > > CalvinKirs
> > > >
> > > >
> > > > On 11/22/2020 23:05, Furkan KAMACI <[email protected]> wrote:
> > > > Hi BoYi,
> > > >
> > > > May I ask where you decided to hold this meeting (Slack, mailing list,
> > > > etc.)?
> > > >
> > > > Kind Regards,
> > > > Furkan KAMACI
> > > >
> > > > On Thu, Nov 19, 2020 at 10:21 AM wu shaoj <[email protected]> wrote:
> > > >
> > > > Cool, excellent meeting
> > > >
> > > >
> > > > From: boyi <[email protected]>
> > > > Date: Thursday, November 19, 2020 at 15:18
> > > > To: [email protected] <[email protected]>
> > > > Subject: [Online Meeting Invitation]DolphinScheduler refactor meeting(Second)
> > > > [Online Meeting Invitation]DolphinScheduler refactor meeting(Second)
> > > >
> > > >
> > > > Hi, DolphinScheduler Community:
> > > >
> > > >
> > > >
> > > >
> > > > We discussed refactoring DolphinScheduler's workflow definition
> > > > storage structure (splitting the large JSON data) at 2020-11-17 19:00
> > > > Beijing time. More than 10 participants joined this meeting. The points
> > > > discussed are as follows:
> > > >
> > > >
> > > > 1: Currently, the proposal is to split the workflow definition
> > > > table (t_ds_process_definition) into two tables. The data structure
> > > > is as follows:
> > > >
> > > >
> > > > t_ds_process_definition [main table] and t_ds_process_definition_task
> > > > [task detail table]
> > > >
> > > >
> > > > | t_ds_process_definition |
> > > > | name | type | description |
> > > > | id | int(11) | id |
> > > > | name | varchar(255) | process definition name |
> > > > | version | int(11) | process definition version |
> > > > | release_state | tinyint(4) | 0 is not online, 1 is online |
> > > > | project_id | int(11) | project id |
> > > > | user_id | int(11) | user id |
> > > > | description | text | description |
> > > > | global_params | text | global params |
> > > > | flag | tinyint(4) | 0 is not available, 1 is available |
> > > > | receivers | text | recipients |
> > > > | receivers_cc | text | CC recipients |
> > > > | create_time | datetime | create time |
> > > > | timeout | int(11) | timeout |
> > > > | tenant_id | int(11) | tenant id |
> > > > | update_time | datetime | update time |
> > > > | modify_by | varchar(36) | name of the user who made the modification |
> > > >
> > > >
> > > > | t_ds_process_definition_task |
> > > > | name | type | description |
> > > > | id | int(11) | task id |
> > > > | name | varchar(255) | task name |
> > > > | type | varchar(64) | type [SHELL, PYTHON, DATAX, SPARK, etc.] |
> > > > | process_definition_id | int(11) | process definition id |
> > > > | params | longtext | custom parameters [JSON] |
> > > > | description | text | description |
> > > > | runFlag | tinyint(4) | run flag |
> > > > | conditionResult | longtext | conditional branch [JSON] |
> > > > | dependence | longtext | task dependency [JSON] |
> > > > | maxRetryTimes | tinyint(4) | max retry times |
> > > > | retryInterval | tinyint(4) | retry interval |
> > > > | timeout | varchar(128) | timeout control strategy [JSON] |
> > > > | taskInstancePriority | varchar(16) | task priority |
> > > > | workerGroup | varchar(64) | worker group name |
> > > > | preTasks | varchar(128) | predecessor tasks |
> > > > | locations | text | DAG node coordinates |
> > > > | connects | text | DAG node connections |
> > > > | resource | varchar(255) | resource file identifiers, comma-separated |
> > > > | datasource | varchar(255) | data source identifiers, comma-separated |
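
The two-table split proposed above can be sketched roughly as follows. This is only an illustration, not the project's actual migration: it uses an in-memory SQLite database in place of MySQL, keeps just a few of the listed columns, and assumes a hypothetical layout for the legacy JSON (a top-level "tasks" list whose entries carry "name", "type", "params" and "preTasks").

```python
import json
import sqlite3

# Hypothetical legacy record: one definition whose JSON blob holds every task.
legacy_definition = {
    "name": "daily_etl",
    "version": 1,
    "tasks": [
        {"name": "extract", "type": "SHELL",
         "params": {"rawScript": "echo extract"}, "preTasks": []},
        {"name": "load", "type": "PYTHON",
         "params": {"rawScript": "print('load')"}, "preTasks": ["extract"]},
    ],
}


def create_schema(conn):
    # Subset of the columns from the proposal; SQLite types stand in for MySQL's.
    conn.executescript("""
        CREATE TABLE t_ds_process_definition (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            version INTEGER NOT NULL
        );
        CREATE TABLE t_ds_process_definition_task (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            type TEXT NOT NULL,
            process_definition_id INTEGER NOT NULL
                REFERENCES t_ds_process_definition(id),
            params TEXT,
            preTasks TEXT
        );
    """)


def split_definition(conn, definition):
    # One row in the main table, then one row per task in the detail table.
    cur = conn.execute(
        "INSERT INTO t_ds_process_definition (name, version) VALUES (?, ?)",
        (definition["name"], definition["version"]),
    )
    definition_id = cur.lastrowid
    for task in definition["tasks"]:
        conn.execute(
            "INSERT INTO t_ds_process_definition_task "
            "(name, type, process_definition_id, params, preTasks) "
            "VALUES (?, ?, ?, ?, ?)",
            (task["name"], task["type"], definition_id,
             json.dumps(task["params"]), ",".join(task["preTasks"])),
        )
    return definition_id


conn = sqlite3.connect(":memory:")
create_schema(conn)
definition_id = split_definition(conn, legacy_definition)
task_rows = conn.execute(
    "SELECT name, type, preTasks FROM t_ds_process_definition_task "
    "WHERE process_definition_id = ? ORDER BY id",
    (definition_id,),
).fetchall()
```

With the tasks stored as rows, a single task can be read or updated without parsing the whole workflow JSON, which appears to be the motivation for the split.
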
> > > >
> > > >
> > > >
> > > >
> > > > 2: Consider whether a third table is needed to store the dependencies
> > > > between tasks, as a higher-level abstraction mainly for workflow
> > > > dependent nodes, conditional branches, and task lineage. These would be
> > > > split out into the third table.
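
One way to picture the third-table option is an edge table with one row per dependency between two tasks. The sketch below is an assumption for illustration only: the table name task_relation and its columns are hypothetical, the meeting did not settle on a design, and SQLite stands in for MySQL. It shows how task lineage (everything downstream of a task) could then be queried with a recursive CTE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical third table: one row per "pre_task -> post_task" dependency.
conn.execute("""
    CREATE TABLE task_relation (
        process_definition_id INTEGER NOT NULL,
        pre_task TEXT NOT NULL,
        post_task TEXT NOT NULL,
        PRIMARY KEY (process_definition_id, pre_task, post_task)
    )
""")

# Example DAG: extract -> transform -> load, and extract -> audit.
edges = [
    (1, "extract", "transform"),
    (1, "transform", "load"),
    (1, "extract", "audit"),
]
conn.executemany("INSERT INTO task_relation VALUES (?, ?, ?)", edges)


def downstream(conn, definition_id, task):
    """Return every task reachable from `task`, sorted by name."""
    rows = conn.execute(
        """
        WITH RECURSIVE reach(name) AS (
            SELECT post_task FROM task_relation
            WHERE process_definition_id = ? AND pre_task = ?
            UNION
            SELECT r.post_task FROM task_relation r
            JOIN reach ON r.pre_task = reach.name
            WHERE r.process_definition_id = ?
        )
        SELECT name FROM reach ORDER BY name
        """,
        (definition_id, task, definition_id),
    ).fetchall()
    return [name for (name,) in rows]


lineage = downstream(conn, 1, "extract")
```

Compared with a comma-separated preTasks column, an edge table lets the database answer dependency questions directly, at the cost of one more table to keep consistent when a workflow is edited.
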
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > 3: Issues to be discussed in the next meeting
> > > > 3.1. Should the workflow definition table be split into two tables or
> > > > into three?
> > > > 3.2. How to store data sources and resource files.
> > > > 3.3. How to store workflow task instances without affecting
> > > > re-running, while still supporting existing functions such as editing
> > > > workflow definitions.
> > > > 3.4. Workflow definition versioning.
> > > >
> > > >
> > > > We are very grateful to the following friends for their discussions:
> > > > dailidong, lgcareer, CalvinKirs, Rubik-W, leonbao, zixi0825, JinyLeeChina,
> > > > chenxingchun, BoYiZhang, etc. They provided many useful suggestions
> > > > for this meeting.
> > > >
> > > >
> > > > At the same time, the community also hopes that more people can
> > > > participate. Thank you very much.
> > > >
> > > >
> > > > Best wishes!
> > > > BoYiZhang
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > ------------------------------------------------
> > > >
> > > >
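> > > > Hi, DolphinScheduler community:
> > > >
> > > >
> > > > At 19:00 Beijing time on 2020-11-17, we discussed refactoring the
> > > > DolphinScheduler workflow definition storage structure (splitting the large
> > > > JSON). More than 10 participants joined the meeting. The points discussed
> > > > are as follows:
> > > >
> > > >
> > > > 1: Currently, the proposal is to split the workflow definition table
> > > > (t_ds_process_definition) into two tables, with the following data structure:
> > > > t_ds_process_definition [main table] and t_ds_process_definition_task [task detail table]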
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
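> > > > 2: Consider whether a third table is needed to store the dependencies
> > > > between tasks, as a higher-level abstraction mainly for workflow dependent
> > > > nodes, conditional branches, and task lineage, split out into a third table.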
> > > >
> > > >
> > > > | t_ds_process_definition [main table] |
> > > > | No. | field | type | description |
> > > > | 1 | id | int(11) | primary key |
> > > > | 2 | name | varchar(255) | process definition name |
> > > > | 3 | version | int(11) | process definition version |
> > > > | 4 | release_state | tinyint(4) | release state of the process definition: 0 not online, 1 online |
> > > > | 5 | project_id | int(11) | project id |
> > > > | 6 | user_id | int(11) | id of the user who owns the process definition |
> > > > | 7 | description | text | process definition description |
> > > > | 8 | global_params | text | global parameters |
> > > > | 9 | flag | tinyint(4) | whether the process is available: 0 not available, 1 available |
> > > > | 10 | receivers | text | recipients |
> > > > | 11 | receivers_cc | text | CC recipients |
> > > > | 12 | create_time | datetime | creation time |
> > > > | 13 | timeout | int(11) | timeout |
> > > > | 14 | tenant_id | int(11) | tenant id |
> > > > | 15 | update_time | datetime | update time |
> > > > | 16 | modify_by | varchar(36) | user who made the modification |
> > > >
> > > >
> > > > | t_ds_process_definition_task [task detail table] |
> > > > | No. | parameter | type | description |
> > > > | 1 | id | int(11) | task id |
> > > > | 2 | name | varchar(255) | task name |
> > > > | 3 | type | varchar(64) | type [SHELL, PYTHON, DATAX, SPARK, etc.] |
> > > > | 4 | process_definition_id | int(11) | process definition id |
> > > > | 5 | params | longtext | custom parameters [JSON format; keeps the original params field; open question: whether custom parameters and resource file parameters should be split out] |
> > > > | 6 | description | text | description |
> > > > | 7 | runFlag | tinyint(4) | run flag |
> > > > | 8 | conditionResult | longtext | conditional branch [JSON format] |
> > > > | 9 | dependence | longtext | task dependency [JSON format] |
> > > > | 10 | maxRetryTimes | tinyint(4) | max retry times |
> > > > | 11 | retryInterval | tinyint(4) | retry interval |
> > > > | 12 | timeout | varchar(128) | timeout control strategy [JSON format] |
> > > > | 13 | taskInstancePriority | varchar(16) | task priority |
> > > > | 14 | workerGroup | varchar(64) | worker group name |
> > > > | 15 | preTasks | varchar(128) | predecessor tasks |
> > > > | 16 | locations | text | node coordinate information |
> > > > | 17 | connects | text | node connection information |
> > > > | 18 | resource | varchar(255) | resource file identifiers, comma-separated |
> > > > | 19 | datasource | varchar(255) | data source identifiers, comma-separated |
> > > >
> > > >
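> > > > 3: Issues to be discussed in the next meeting
> > > > 3.1. Should the workflow definition table be split into two tables or into three?
> > > > 3.2. How to store data sources and resource files.
> > > > 3.3. How to store workflow task instances without affecting re-running, while still supporting existing functions such as editing workflow definitions.
> > > > 3.4. Workflow definition versioning.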
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
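> > > > We are very grateful to the following friends for their discussions:
> > > > dailidong, lgcareer, CalvinKirs, Rubik-W, leonbao, zixi0825, JinyLeeChina,
> > > > chenxingchun, BoYiZhang, etc. They provided many useful suggestions for
> > > > this meeting.
> > > >
> > > >
> > > > At the same time, the community hopes that more people can participate. Thank you very much.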
> > > >
> > > >
> > > > --------------------------------------
> > > > BoYi Zhang
> > > > E-mail: [email protected]
> > > >
> > > >
> > >
> >
>
