Thanks for writing such detailed documentation. I think this is also a missing 
feature of DS.
About the extension points:
1. Can SSH tasks be merged into shell tasks? Essentially, they all execute shell 
commands.
2. About the dummy task: DS already has a function to disable nodes. I don't know 
if this meets the requirement.
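
For point 1, here is a minimal sketch of how one task could cover both cases: the same script is wrapped either in a local `sh -c` call or in an `ssh` call, depending on whether a remote host is configured. `ShellCommandBuilder`, `build` and its parameters are hypothetical names for illustration, not existing DS code. The `ssh` options are the ones used in the example further down this thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of DS): builds the argv for a local or remote shell run. */
final class ShellCommandBuilder {

    private ShellCommandBuilder() {
    }

    /**
     * @param script     the shell command line to run
     * @param remoteHost target host; null or blank means "run locally"
     * @param user       remote login user (ignored for local runs)
     * @param port       remote ssh port (ignored for local runs)
     * @return the argument vector that could be handed to a ProcessBuilder
     */
    static List<String> build(String script, String remoteHost, String user, int port) {
        List<String> argv = new ArrayList<>();
        if (remoteHost == null || remoteHost.trim().isEmpty()) {
            // Local execution: let the local shell interpret the script.
            argv.add("sh");
            argv.add("-c");
            argv.add(script);
        } else {
            // Remote execution: ssh passes the script to the remote user's shell.
            argv.add("ssh");
            argv.add("-o");
            argv.add("ServerAliveInterval=60");
            argv.add("-p");
            argv.add(String.valueOf(port));
            argv.add(user + "@" + remoteHost);
            argv.add(script);
        }
        return argv;
    }
}
```

The returned list could be passed to `new ProcessBuilder(argv)`, so the exit code is read the same way for local and remote runs. Authentication, connection limits and priority are deliberately left out of this sketch.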

The script from AirFlow to Dolphin is great.

> On May 18, 2020, at 09:28, 裴龙武 <peilon...@qq.com> wrote:
> 
> 
> OK, thank you!
> 
> First, I will make sure the jar's license permits use in open source.
> 
> Second, I think we need to discuss this in depth. I have written a more detailed document; you 
> can check the attachment. I have also sent the document to DaiLidong.
> 
> Third, I'll send you the error that occurs when the SSH connection pool is not used.
> 
> 
> 
> 
> ------------------ Original Message ------------------
> From: "wenhemin"<whm_...@163.com>;
> Sent: Thursday, May 14, 2020, 7:26 PM
> To: "裴龙武"<peilon...@qq.com>;
> Subject: Re: [Feature] Support SSH Task and Support dummy task like airflow
> 
> Great!
> I think SSH tasks could be merged into shell tasks: the script runs locally or 
> remotely, depending on a setting configured on the front end.
> As for the SSH connection pool, I have not found it necessary to use one.
> 
> BTW, the code appears to introduce additional jar packages. You also need to 
> make sure the license of each jar permits use in open source.
> 
>> On May 14, 2020, at 16:20, 裴龙武 <peilon...@qq.com> wrote:
>> 
>> 
>> 1. The priority between these tasks also depends on the Dolphin DAG 
>> definition. Until the upstream task has finished, the next task is not executed.
>> 
>> 2. I added an SSH task type. I also use local params to configure the SSH host, 
>> user and password.
>> 
>> E.g.:
>> public static AbstractTask newTask(TaskExecutionContext taskExecutionContext, Logger logger)
>>     throws IllegalArgumentException {
>>   // A task whose params contain "enable": false runs as a DummyTask instead of the real task.
>>   Boolean enable =
>>       JSONUtils.parseObject(taskExecutionContext.getTaskParams()).getBoolean("enable");
>>   if (Boolean.FALSE.equals(enable)) {
>>     return new DummyTask(taskExecutionContext, logger);
>>   }
>>   // Otherwise dispatch on the configured task type.
>>   switch (EnumUtils.getEnum(TaskType.class, taskExecutionContext.getTaskType())) {
>>     case SHELL:
>>       return new ShellTask(taskExecutionContext, logger);
>>     case PROCEDURE:
>>       return new ProcedureTask(taskExecutionContext, logger);
>>     case SQL:
>>       return new SqlTask(taskExecutionContext, logger);
>>     case MR:
>>       return new MapReduceTask(taskExecutionContext, logger);
>>     case SPARK:
>>       return new SparkTask(taskExecutionContext, logger);
>>     case FLINK:
>>       return new FlinkTask(taskExecutionContext, logger);
>>     case PYTHON:
>>       return new PythonTask(taskExecutionContext, logger);
>>     case HTTP:
>>       return new HttpTask(taskExecutionContext, logger);
>>     case DATAX:
>>       return new DataxTask(taskExecutionContext, logger);
>>     case SQOOP:
>>       return new SqoopTask(taskExecutionContext, logger);
>>     case SSH:
>>       // New task type added by this extension.
>>       return new SSHTask(taskExecutionContext, logger);
>>     default:
>>       logger.error("unsupport task type: {}", taskExecutionContext.getTaskType());
>>       throw new IllegalArgumentException("not support task type");
>>   }
>> }
>> 3. I am not sure whether it supports Windows.
>> 
>> 
>> 
>> ------------------ Original Message ------------------
>> From: "wenhemin"<whm_...@163.com>;
>> Sent: Thursday, May 14, 2020, 3:46 PM
>> To: "裴龙武"<peilon...@qq.com>;
>> Subject: Re: [Feature] Support SSH Task and Support dummy task like airflow
>> 
>> Sorry, my previous description was not very clear.
>> 
>> I want to ask some questions:
>> 1. How do you control the priority between SSH tasks? There may be some SSH tasks 
>> that are already waiting for execution.
>> 2. I understand that what you want to solve is the problem of executing remote SSH 
>> scripts in batches.
>>   So I am not sure how this function would be used.
>> 3. I don't know if this supports Windows.
>> 
>>> On May 13, 2020, at 20:56, 裴龙武 <peilon...@qq.com> wrote:
>>> 
>>> 
>>> I implemented it by simulating a spin lock. My code is attached. Of course, it is 
>>> not perfect; I only used it for a test. To my surprise, the execution result is 
>>> the same as with AirFlow.
>>> 
>>> 
>>> 
>>> 
>>> ------------------ Original Message ------------------
>>> From: "whm_777"<whm_...@163.com>;
>>> Sent: Wednesday, May 13, 2020, 7:21 PM
>>> To: "裴龙武"<peilon...@qq.com>;
>>> Subject: Re: [Feature] Support SSH Task and Support dummy task like airflow
>>> 
>>> You can raise the maximum number of SSH connections on Linux.
>>> If you use an SSH connection pool, how do you control the priority of the SSH tasks?
>>> 
>>>> On May 13, 2020, at 18:01, 裴龙武 <peilon...@qq.com> wrote:
>>>> 
>>>> 
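>>>> First, thank you.
>>>> 
>>>> I use more than 100 task nodes, but the server limits the number of SSH 
>>>> connections; once the limit is exceeded, an error is reported. Below is a 
>>>> screenshot taken after I extended the SSH task node. This DAG was converted 
>>>> from AirFlow.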
>>>> <330be...@f7f80e73.76c5bb5e.jpg>
>>>> 
>>>> 
>>>> 
>>>> ------------------ Original Message ------------------
>>>> From: "whm_777"<whm_...@163.com>;
>>>> Sent: Wednesday, May 13, 2020, 5:50 PM
>>>> To: "裴龙武"<peilon...@qq.com>;
>>>> Subject: Re: [Feature] Support SSH Task and Support dummy task like airflow
>>>> 
>>>> E.g.
>>>> # Run the command remotely; the remote "echo $?" prints its exit code, which is captured here.
>>>> rtn_code=`ssh -o ServerAliveInterval=60 -p xxxx r...@xxx.xxx.xxx.xxx 'shell command >/dev/null 2>&1; echo $?'`
>>>> if [ "$rtn_code" -eq 0 ]; then
>>>>         echo "success"
>>>>         exit 0
>>>> else
>>>>         echo "failure"
>>>>         exit 1
>>>> fi
>>>> 
>>>> Batch shell commands are not supported.
>>>> Work for multiple servers can be split into multiple task nodes.
>>>> 
>>>>> On May 13, 2020, at 17:40, 裴龙武 <peilon...@qq.com> wrote:
>>>>> 
>>>>> 
>>>>> Could you give me an example? Thank you.
>>>>> 
>>>>> By the way, in my actual scenario I have more than 100 SSH tasks in one DAG. 
>>>>> These tasks connect to two other servers to execute, so the SSH connections 
>>>>> must be managed by a pool. For now I use JSch and have implemented a simple 
>>>>> pool.
>>>>> 
>>>>> ------------------ Original Message ------------------
>>>>> From: "wenhemin"<whm_...@163.com>;
>>>>> Sent: Wednesday, May 13, 2020, 5:24 PM
>>>>> To: "dev"<dev@dolphinscheduler.apache.org>;
>>>>> Subject: Re: [Feature] Support SSH Task and Support dummy task like airflow
>>>>> 
>>>>> The shell node supports remote calls and can get the result code of the 
>>>>> remote command.
>>>>> 
>>>>> 
>>>>> > On May 13, 2020, at 15:16, 裴龙武 <peilon...@qq.com> wrote:
>>>>> > 
>>>>> > Dear ALL:
>>>>> > 
>>>>> > 
>>>>> > Support Linux SSH Task
>>>>> > 
>>>>> > Scenario: in my current project, the workflow's tasks execute shell 
>>>>> > scripts on different servers. The scripts are stored in fixed directories 
>>>>> > on the business servers. When a worker schedules a task, it must log in to 
>>>>> > the target server as a fixed user, execute the shell script, and get the 
>>>>> > execution status of the task. The server's host, user name and password 
>>>>> > are configurable.
>>>>> > 
>>>>> > SSH Task is very useful for most users
>>>>> > 
>>>>> > In distributed scheduling, the shell scripts to be executed live on 
>>>>> > different business servers, each with its own fixed services. These 
>>>>> > servers are not workers; they only need to be scheduled by a worker. We 
>>>>> > just pass different parameters and let the server execute the task script.
>>>>> > 
>>>>> > Python has a module for executing SSH shell scripts
>>>>> > 
>>>>> > Python has a module that can execute shell scripts on a remote server over 
>>>>> > SSH. It is called paramiko.
>>>>> > 
>>>>> > Others
>>>>> > 
>>>>> > I found that an earlier feature request also describes this, but it was 
>>>>> > relatively simple.
>>>>> > Feature URL
>>>>> > 
>>>>> > In addition, it is very inconvenient for me to run remote tasks 
>>>>> > through a Shell Task. Here is my script. I don't know if there is a better 
>>>>> > way.
>>>>> > sshpass -p 'password' ssh user@host echo 'ssh success' echo 'Hello 
>>>>> > World' -> /home/dolphinscheduler/test/hello.txt echo 'end'
>>>>> > 
>>>>> > 
>>>>> > 
>>>>> > Support dummy task like airflow
>>>>> > 
>>>>> > Scenario: my project ships a productized DAG file. The file contains 
>>>>> > different modules; some points between the modules depend on each other 
>>>>> > and some do not. When customers purchase different modules, we need to set 
>>>>> > as dummy tasks those tasks that belong to unpurchased modules and that no 
>>>>> > purchased module depends on, so that they are not actually executed. The 
>>>>> > benefits of this setup are product uniformity and an intact graph. In 
>>>>> > AirFlow, this is done with DummyOperator.
>>>>> > 
>>>>> > **Implementation**
>>>>> > 
>>>>> > A Dummy Task is simple to implement, but it has to work together with the 
>>>>> > other task types: when a task's execution mode is set to dummy, the real 
>>>>> > task is not executed and a Dummy Task is executed in its place.
>>>>> > 
>>>>> > 
>>>>> > 
>>>>> > 
>>>>> > By the way, because the project urgently needed this for testing, I forked 
>>>>> > the development version and implemented both task types there. Could a 
>>>>> > later release support them?
>>>>> 
>>>> 
>>> 
>>> <SSHClient.java><SSHPool.java><SSHTask.java>
>> 
> 
> <项目场景中关于Dolphin的一些扩展点.pdf>
