Hi, you can send an email to [email protected] to unsubscribe from the mailing list.
Best Regards
---------------
Apache DolphinScheduler PMC Chair
David Dai
[email protected]
Linkedin: https://www.linkedin.com/in/dailidong
Twitter: @WorkflowEasy
---------------

On Sat, Oct 2, 2021 at 9:18 PM Marlone <[email protected]> wrote:
>
> I am being told the mailbox is full. Who is responsible for handling this?
>
> --- Original message ---
> From: "感谢政府" <[email protected]>
> Sent: Thursday, September 30, 2021, 10:42 AM
> To: "dev" <[email protected]>;
> Subject: Re: Re: [PROPOSAL] Add Python API implementation of workflows-as-code
>
> The first email that was sent to you contains the unsubscribe address.
>
> ------------------ Original message ------------------
> From: "dev" <[email protected]>;
> Sent: Thursday, September 30, 2021, 10:36 AM
> To: "dev" <[email protected]>;
> Subject: Re: Re: [PROPOSAL] Add Python API implementation of workflows-as-code
>
> Unable to unsubscribe?
>
> --- Original message ---
> From: "zhang junfan" <[email protected]>
> Sent: Thursday, September 30, 2021, 10:35 AM
> To: "[email protected]" <[email protected]>;
> Subject: Re: [PROPOSAL] Add Python API implementation of workflows-as-code
>
> Good job, and thanks for focusing on multi-language support.
>
> Minor discussion:
>
>   1. Could you please provide some Spark/Flink process examples?
>   2. I'm confused about workflows-as-code. Do you mean it only defines the DAG and the workflow parameters? Could we combine the workflow with user task code (such as Spark/Flink programs)?
>
> ________________________________
> From: Jiajie Zhong <[email protected]>
> Sent: September 28, 2021, 11:42
> To: [email protected] <[email protected]>
> Subject: [PROPOSAL] Add Python API implementation of workflows-as-code
>
> Hey guys,
>
> Apache DolphinScheduler is a good workflow scheduler: it is easy to extend, distributed, and has a nice UI for creating and maintaining workflows. Currently, workflows can only be defined in the UI, which is easy to use and user friendly. That is good, but it could be better if we extended the API so that workflows could also be defined as code or in a YAML file.
> And because a YAML file is hard to maintain by hand, I think it is better to define workflows in code, aka workflows-as-code.
>
> When workflows are defined as code, it is easy to modify configuration and to apply batch changes. It also becomes easier to define similar tasks with a loop statement, and it makes it possible to add unit tests for workflows. I hope Apache DolphinScheduler can combine the benefits of defining workflows by code and by UI, so I am raising this proposal to add workflows-as-code to Apache DolphinScheduler.
>
> Actually, I have already started on it with a POC PR[1]. In this PR, I add a Python API that lets users define workflows in Python code. The feature uses *Py4J* to connect Java and Python, which means I do not add any new database models or infrastructure to Apache DolphinScheduler; I just reuse the service layer in the dolphinscheduler-api package to create workflows. We could consider the Python API just another interface to Apache DolphinScheduler, like our UI: it lets us define and maintain workflows following its rules.
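To make the point about loops and unit tests concrete, here is a minimal sketch. It deliberately does not use the Python API from the PR: `ShellTask`, `Workflow`, and `build_export_workflow` are hypothetical stand-ins invented for illustration, and the table names are made up. It only shows the general idea that a code-defined workflow can be generated in a loop and then checked by an ordinary test.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ShellTask:
    """Hypothetical stand-in for a shell task (not the class from the PR)."""
    name: str
    command: str


@dataclass
class Workflow:
    """Hypothetical in-memory workflow definition, keyed by task name."""
    name: str
    tasks: Dict[str, ShellTask] = field(default_factory=dict)

    def add(self, task: ShellTask) -> ShellTask:
        # Reject duplicates so a loop bug cannot silently overwrite a task.
        if task.name in self.tasks:
            raise ValueError(f"duplicate task name: {task.name}")
        self.tasks[task.name] = task
        return task


def build_export_workflow(tables: List[str]) -> Workflow:
    """Create one export task per table with a loop instead of by hand."""
    wf = Workflow(name="export_tables")
    for table in tables:
        wf.add(ShellTask(name=f"export_{table}", command=f"echo export {table}"))
    return wf


def test_one_task_per_table() -> None:
    """A plain assertion-based test over the generated definition."""
    wf = build_export_workflow(["users", "orders", "events"])
    assert sorted(wf.tasks) == ["export_events", "export_orders", "export_users"]
    assert wf.tasks["export_users"].command == "echo export users"


test_one_task_per_table()
```

Because the definition is ordinary data, the test needs no running scheduler. Whether the real Python API can be tested the same way depends on how it talks to the Java side.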
>
> Here is a tutorial workflow definition using the Python API, which you can find in the PR file[2]:
>
> ```python
> from pydolphinscheduler.core.process_definition import ProcessDefinition
> from pydolphinscheduler.tasks.shell import Shell
>
> # Tasks created inside the context are added to the "tutorial" definition.
> with ProcessDefinition(name="tutorial") as pd:
>     task_parent = Shell(name="task_parent", command="echo hello pydolphinscheduler")
>     task_child_one = Shell(name="task_child_one", command="echo 'child one'")
>     task_child_two = Shell(name="task_child_two", command="echo 'child two'")
>     task_union = Shell(name="task_union", command="echo union")
>
>     # Both children run after task_parent.
>     task_group = [task_child_one, task_child_two]
>     task_parent.set_downstream(task_group)
>
>     # task_union runs after both children.
>     task_union << task_group
>
>     pd.run()
> ```
>
> In the tutorial, we define a new ProcessDefinition named 'tutorial' using a Python context manager and then add four Shell tasks to it. With just five lines we create one process definition with four tasks.
>
> Besides the process definition and the tasks, the other thing we have to add to a workflow is the task dependencies. We add the functions `set_downstream` and `set_upstream` to describe them. We also overload the bit-shift operators, so `>>` and `<<` work as shortcuts for the same thing.
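For readers unfamiliar with operator overloading, the sketch below shows one way `>>` and `<<` can be mapped onto `set_downstream` and `set_upstream` in Python. It is not the implementation from the PR: the `Task` class and its `upstream`/`downstream` sets are assumptions made for illustration. Only the method names `set_downstream` and `set_upstream` and the operator shortcuts come from the proposal.

```python
from typing import Iterable, List, Set, Union


class Task:
    """Hypothetical stand-in for a workflow task (not the class from the PR)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.upstream: Set[str] = set()    # names of tasks that run before this one
        self.downstream: Set[str] = set()  # names of tasks that run after this one

    def set_downstream(self, others: "TaskOrTasks") -> None:
        """Declare that every task in `others` runs after this task."""
        for other in _as_list(others):
            self.downstream.add(other.name)
            other.upstream.add(self.name)

    def set_upstream(self, others: "TaskOrTasks") -> None:
        """Declare that this task runs after every task in `others`."""
        for other in _as_list(others):
            other.set_downstream(self)

    def __rshift__(self, others: "TaskOrTasks") -> "TaskOrTasks":
        # a >> b: b runs after a. Returning `others` allows a >> b >> c.
        self.set_downstream(others)
        return others

    def __lshift__(self, others: "TaskOrTasks") -> "TaskOrTasks":
        # a << b: a runs after b.
        self.set_upstream(others)
        return others

    def __rrshift__(self, others: "TaskOrTasks") -> "Task":
        # [a, b] >> c: list does not implement >>, so Python falls back to c.
        self.set_upstream(others)
        return self

    def __rlshift__(self, others: "TaskOrTasks") -> "Task":
        # [a, b] << c: a and b run after c.
        self.set_downstream(others)
        return self


TaskOrTasks = Union[Task, Iterable[Task]]


def _as_list(others: TaskOrTasks) -> List[Task]:
    """Accept either a single task or an iterable of tasks."""
    return [others] if isinstance(others, Task) else list(others)


# Same dependency shape as the tutorial: parent -> two children -> union.
task_parent = Task("task_parent")
task_child_one = Task("task_child_one")
task_child_two = Task("task_child_two")
task_union = Task("task_union")

task_group = [task_child_one, task_child_two]
task_parent.set_downstream(task_group)
task_union << task_group

print(sorted(task_union.upstream))  # ['task_child_one', 'task_child_two']
```

The reflected methods (`__rrshift__`, `__rlshift__`) are what make a plain list usable on the left-hand side of the operator. Without them, only expressions with a task on the left would work.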
>
> Once the dependencies are set, our workflow definition is done. However, the whole definition still lives on the Python API side, which means it is not persisted to the Apache DolphinScheduler database and cannot be run by Apache DolphinScheduler until we call `pd.submit()`, or run it directly with `pd.run()`.
>
>
> [1]: https://github.com/apache/dolphinscheduler/pull/6269
> [2]: https://github.com/apache/dolphinscheduler/pull/6269/files#diff-5561fec6b57cc611bee2b0d8f030965d76bdd202801d9f8a1e2e74c21769bc41
>
>
> Best wishes,
> Jiajie
