Glad to hear that you will implement this feature.

Best Regards
---------------
DolphinScheduler(Incubator) PPMC
Lidong Dai 代立冬
dailidon...@gmail.com
---------------

On Wed, 20 May 2020 at 15:47, 裴龙武 <peilon...@qq.com> wrote:

> My code is not perfect yet. I will write a detailed design document, and
> then implement this feature according to the outcome of our discussion.
>
> ------------------ Original Message ------------------
> From: "wenhemin" <whm_...@163.com>
> Sent: Monday, 18 May 2020, 7:50 PM
> To: "裴龙武" <peilon...@qq.com>; "dev" <dev@dolphinscheduler.apache.org>
> Subject: Re: [Feature] Support SSH Task and Support dummy task like airflow
>
> Thanks for writing the detailed documentation. I agree that this is a
> feature DS is missing.
> About the extension points:
> 1. Can SSH tasks be merged into shell tasks? Essentially, both execute
>    shell commands.
> 2. About the dummy task: DS already has a function to disable nodes. I
>    don't know whether that meets this requirement.
>
> The script that converts from AirFlow to Dolphin is great.
>
> > On 18 May 2020, at 09:28, 裴龙武 <peilon...@qq.com> wrote:
> >
> > OK, thank you!
> >
> > First, I will make sure the additional jar can be used in an open
> > source project.
> >
> > Second, I think we need to discuss this in more depth. I have written
> > a more detailed document; please check the attachment. I have also
> > sent the document to DaiLidong.
> >
> > Third, I will send you the error that occurs when the SSH connection
> > pool is not used.
> >
> > ------------------ Original Message ------------------
> > From: "wenhemin" <whm_...@163.com>
> > Sent: Thursday, 14 May 2020, 7:26 PM
> > To: "裴龙武" <peilon...@qq.com>
> > Subject: Re: [Feature] Support SSH Task and Support dummy task like airflow
> >
> > Great!
> > I wonder whether SSH tasks can be merged into shell tasks, so that a
> > script is executed locally or remotely depending on front-end
> > configuration.
> > As for the SSH connection pool, I do not see why a connection pool is
> > necessary.
> >
> > BTW, the code appears to introduce additional jar packages. You also
> > need to make sure the licenses of these jars allow open source use.
> >
> >> On 14 May 2020, at 16:20, 裴龙武 <peilon...@qq.com> wrote:
> >>
> >> 1. The priority between these tasks also depends on the Dolphin DAG
> >>    definition. A task is not executed until its upstream task has
> >>    finished.
> >>
> >> 2. I extended the task types with an SSH task. I also use local
> >>    params to configure the SSH host, user, and password.
> >>
> >> E.g.:
> >> public static AbstractTask newTask(TaskExecutionContext taskExecutionContext, Logger logger)
> >>         throws IllegalArgumentException {
> >>     // A task whose params set "enable" to false runs as a DummyTask instead.
> >>     Boolean enable = JSONUtils.parseObject(taskExecutionContext.getTaskParams()).getBoolean("enable");
> >>     if (enable != null && !enable) {
> >>         return new DummyTask(taskExecutionContext, logger);
> >>     }
> >>     switch (EnumUtils.getEnum(TaskType.class, taskExecutionContext.getTaskType())) {
> >>         case SHELL:
> >>             return new ShellTask(taskExecutionContext, logger);
> >>         case PROCEDURE:
> >>             return new ProcedureTask(taskExecutionContext, logger);
> >>         case SQL:
> >>             return new SqlTask(taskExecutionContext, logger);
> >>         case MR:
> >>             return new MapReduceTask(taskExecutionContext, logger);
> >>         case SPARK:
> >>             return new SparkTask(taskExecutionContext, logger);
> >>         case FLINK:
> >>             return new FlinkTask(taskExecutionContext, logger);
> >>         case PYTHON:
> >>             return new PythonTask(taskExecutionContext, logger);
> >>         case HTTP:
> >>             return new HttpTask(taskExecutionContext, logger);
> >>         case DATAX:
> >>             return new DataxTask(taskExecutionContext, logger);
> >>         case SQOOP:
> >>             return new SqoopTask(taskExecutionContext, logger);
> >>         case SSH:
> >>             return new SSHTask(taskExecutionContext, logger);
> >>         default:
> >>             logger.error("unsupport task type: {}", taskExecutionContext.getTaskType());
> >>             throw new IllegalArgumentException("not support task type");
> >>     }
> >> }
> >>
> >> 3. I am not sure whether it supports Windows.
> >>
> >> ------------------ Original Message ------------------
> >> From: "wenhemin" <whm_...@163.com>
> >> Sent: Thursday, 14 May 2020, 3:46 PM
> >> To: "裴龙武" <peilon...@qq.com>
> >> Subject: Re: [Feature] Support SSH Task and Support dummy task like airflow
> >>
> >> Sorry, my previous description was not very clear.
> >>
> >> I would like to ask a few questions:
> >> 1. How do you control the priority between SSH tasks? Some SSH tasks
> >>    may already have been waiting for execution.
> >> 2. As I understand it, the problem you want to solve is executing
> >>    remote SSH scripts in batches, so I am not sure how this function
> >>    would be used.
> >> 3. I don't know whether this supports Windows.
> >>
> >>> On 13 May 2020, at 20:56, 裴龙武 <peilon...@qq.com> wrote:
> >>>
> >>> I implemented it by simulating a spin lock. My code is attached. Of
> >>> course, it is not perfect; I only used it for a test. To my
> >>> surprise, the execution result was the same as with AirFlow.
> >>>
> >>> ------------------ Original Message ------------------
> >>> From: "whm_777" <whm_...@163.com>
> >>> Sent: Wednesday, 13 May 2020, 7:21 PM
> >>> To: "裴龙武" <peilon...@qq.com>
> >>> Subject: Re: [Feature] Support SSH Task and Support dummy task like airflow
> >>>
> >>> You can raise the maximum number of SSH connections on the Linux
> >>> server.
> >>> If you use an SSH connection pool, how do you control the priority
> >>> of the SSH tasks?
> >>>
> >>>> On 13 May 2020, at 18:01, 裴龙武 <peilon...@qq.com> wrote:
> >>>>
> >>>> First of all, thank you.
> >>>>
> >>>> I use more than 100 task nodes, but the number of SSH connections
> >>>> is limited.
> >>>>
> >>>> Once the server's limit is exceeded, errors are reported. Below is
> >>>> a screenshot taken after I added the SSH task node. This DAG was
> >>>> converted from AirFlow.
> >>>> <330be...@f7f80e73.76c5bb5e.jpg>
> >>>>
> >>>> ------------------ Original Message ------------------
> >>>> From: "whm_777" <whm_...@163.com>
> >>>> Sent: Wednesday, 13 May 2020, 5:50 PM
> >>>> To: "裴龙武" <peilon...@qq.com>
> >>>> Subject: Re: [Feature] Support SSH Task and Support dummy task like airflow
> >>>>
> >>>> E.g.
> >>>> rtn_code=`ssh -o ServerAliveInterval=60 -p xxxx r...@xxx.xxx.xxx.xxx 'shell command >/dev/null 2>&1; echo $?'`
> >>>> if [ "$rtn_code" -eq 0 ]; then
> >>>>     echo "success"
> >>>>     exit 0
> >>>> else
> >>>>     echo "failure"
> >>>>     exit 1
> >>>> fi
> >>>>
> >>>> Batch shell commands are not supported.
> >>>> Multiple servers can be split across multiple task nodes.
> >>>>
> >>>>> On 13 May 2020, at 17:40, 裴龙武 <peilon...@qq.com> wrote:
> >>>>>
> >>>>> Could you give me an example? Thank you.
> >>>>>
> >>>>> By the way, in my actual scenario I have more than 100 SSH tasks
> >>>>> in one DAG. These tasks connect to two other task servers to
> >>>>> execute, so the SSH tasks must use a connection pool to manage
> >>>>> their connections. Currently I use JSch and have implemented a
> >>>>> simple pool.
> >>>>>
> >>>>> ------------------ Original Message ------------------
> >>>>> From: "wenhemin" <whm_...@163.com>
> >>>>> Sent: Wednesday, 13 May 2020, 5:24 PM
> >>>>> To: "dev" <dev@dolphinscheduler.apache.org>
> >>>>> Subject: Re: [Feature] Support SSH Task and Support dummy task like airflow
> >>>>>
> >>>>> The shell node supports remote calls and can obtain the remote
> >>>>> command's result code.
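
The exchange above turns on how to keep more than 100 SSH tasks from exceeding a server's connection limit. One possible way to cap concurrency is sketched below with a fair `java.util.concurrent.Semaphore`. This is only an illustration under stated assumptions: `BoundedSshRunner`, `maxSessions`, and the `Supplier` standing in for the remote command are hypothetical names, and nothing here is taken from the attached SSHPool.java or from JSch.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical helper: caps how many SSH sessions run at the same time. */
class BoundedSshRunner {
    private final Semaphore permits;

    BoundedSshRunner(int maxSessions) {
        // Fair mode grants permits in arrival order, so waiting tasks are not starved.
        this.permits = new Semaphore(maxSessions, true);
    }

    /** Waits for a free session slot, runs the command, then always frees the slot. */
    <T> T run(Supplier<T> remoteCommand) {
        permits.acquireUninterruptibly();
        try {
            return remoteCommand.get();
        } finally {
            permits.release();
        }
    }

    /** Number of session slots that are currently unused. */
    int availableSlots() {
        return permits.availablePermits();
    }
}
```

With `new BoundedSshRunner(10)`, at most ten commands passed to `run` execute at once and the rest wait their turn. Fair mode is first come, first served, so it does not by itself answer the priority question raised in the thread.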
> >>>>>
> >>>>> > On 13 May 2020, at 15:16, 裴龙武 <peilon...@qq.com> wrote:
> >>>>> >
> >>>>> > Dear all,
> >>>>> >
> >>>>> > Support Linux SSH Task
> >>>>> >
> >>>>> > Scenario: in my current project, the workflow's tasks need to
> >>>>> > execute shell scripts on different servers. The scripts are
> >>>>> > stored in fixed directories on the business servers. When a
> >>>>> > worker schedules a task, it must log in to these servers as a
> >>>>> > fixed user, run the shell script, and obtain the execution
> >>>>> > state of the task. The server host, user name, and password
> >>>>> > are configurable.
> >>>>> >
> >>>>> > SSH Task is very useful for most users
> >>>>> >
> >>>>> > The shell scripts that the scheduled tasks execute live on
> >>>>> > different business servers, each with its own fixed services.
> >>>>> > These servers are not workers; they only need a worker to
> >>>>> > trigger execution. We just pass different parameters and let
> >>>>> > the server run the task script.
> >>>>> >
> >>>>> > Python has a module for executing SSH scripts
> >>>>> >
> >>>>> > Python has a module that can execute shell scripts on a remote
> >>>>> > server over SSH. It is called paramiko.
> >>>>> >
> >>>>> > Others
> >>>>> >
> >>>>> > I found this described in a previous feature request, but the
> >>>>> > description was relatively simple.
> >>>>> > Feature URL
> >>>>> >
> >>>>> > In addition, it is very inconvenient for me to run remote tasks
> >>>>> > through a Shell Task. Here is my script. I don't know whether
> >>>>> > there is a better way.
> >>>>> > sshpass -p 'password' ssh user@host
> >>>>> > echo 'ssh success'
> >>>>> > echo 'Hello World' -> /home/dolphinscheduler/test/hello.txt
> >>>>> > echo 'end'
> >>>>> >
> >>>>> > Support dummy task like airflow
> >>>>> >
> >>>>> > Scenario: my project has a productized DAG file that contains
> >>>>> > different modules. Some tasks in these modules depend on each
> >>>>> > other and some do not. When a customer purchases only some of
> >>>>> > the modules, the tasks that belong to unpurchased modules and
> >>>>> > that no purchased module depends on must be set as dummy tasks,
> >>>>> > so that they are not actually executed. The benefits of this
> >>>>> > setup are product uniformity and an intact graph. In AirFlow,
> >>>>> > this is done with DummyOperator.
> >>>>> >
> >>>>> > **Implementation**
> >>>>> >
> >>>>> > The Dummy Task itself is easy to implement, but it has to work
> >>>>> > together with the other task types: when a task's execution
> >>>>> > mode is set to dummy, the real task is not executed and a
> >>>>> > Dummy Task runs in its place.
> >>>>> >
> >>>>> > By the way, because the project urgently needed this for
> >>>>> > testing, I forked the development version and implemented both
> >>>>> > task types in my fork. Could a follow-up release support them?
> >>>>>
> >>>>
> >>>
> >>> <SSHClient.java><SSHPool.java><SSHTask.java>
> >>
> >
> > <项目场景中关于Dolphin的一些扩展点.pdf>
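
The dummy-task behavior described in the original proposal can be sketched in a self-contained form: when a task is flagged as disabled, a no-op stand-in runs in its place, so the DAG keeps its shape and downstream nodes can still proceed. This is a minimal illustration under stated assumptions. `Task`, `RealTask`, `DummyTask`, and `TaskSelector` are hypothetical stand-ins, not DolphinScheduler's `AbstractTask` API, and the `enable` flag only mirrors the parameter name used in the quoted `newTask` snippet.

```java
import java.util.List;

/** Hypothetical stand-in for a schedulable task; not DolphinScheduler's AbstractTask. */
interface Task {
    void handle(List<String> log);
}

/** Stands in for a real task such as a shell or SSH task. */
class RealTask implements Task {
    private final String name;

    RealTask(String name) {
        this.name = name;
    }

    @Override
    public void handle(List<String> log) {
        log.add(name + ": executed");
    }
}

/** Does no work; it only records that the node was skipped. */
class DummyTask implements Task {
    private final String name;

    DummyTask(String name) {
        this.name = name;
    }

    @Override
    public void handle(List<String> log) {
        log.add(name + ": skipped (dummy)");
    }
}

class TaskSelector {
    /** Returns the real task unless enable is explicitly false; null counts as enabled. */
    static Task select(String name, Boolean enable) {
        if (enable != null && !enable) {
            return new DummyTask(name);
        }
        return new RealTask(name);
    }
}
```

Treating a missing flag as enabled means existing workflow definitions keep running unchanged, and only nodes explicitly marked `false` become dummies.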