Hi Yann,

The picture does not display. Could you please upload it to issue #11652 or somewhere else and add links?

Thanks,
Wenjun

On Mon, Oct 17, 2022 at 9:06 PM yann ann <[email protected]> wrote:

> Dear All,
>
> Related issue: #11652
> <https://github.com/apache/dolphinscheduler/issues/11652>
>
> According to the conclusion of the discussion at last week's regular
> meeting (2022/10/12), the development of this feature is approved, and
> the solution below has been added to the issue.
>
>    - Add a table named *t_ds_task_remote_server* in the DB.
>    - Add a *TaskRemoteServer* entity in the dolphinscheduler-dao module.
>    - Add a *task remote server* management page in the Security menu,
>    supporting create, delete, edit and connection test.
>    - Add a task remote server select input (field:
>    taskRemoteServerCode) to the Shell and Python task forms.
>    - Add a *task_remote_server_code* column to the
>    *t_ds_task_instance*, *t_ds_task_definition* and
>    *t_ds_task_definition_log* tables.
>    - Add a *taskRemoteServerCode* field to *TaskInstance,
>    TaskDefinition, TaskDefinitionLog, TaskNode*, and add a
>    taskRemoteServerInfo field to TaskInstance (not a table field).
>    - Add a *taskRemoteServerInfo* entity (containing only ip, user,
>    password, name) in the dolphinscheduler-task-plugin module, and add
>    a taskRemoteServerInfo field to TaskExecutionContext.
>    - Shell and Python tasks will check
>    TaskExecutionContext.getTaskRemoteServerInfo(). If it is not null,
>    the task will scp the command files and resource files to the task
>    remote server and send the start command to the remote server to
>    execute them. Other task plugins that need this feature can check
>    the same field.
>    - Use the third-party JSch library for ssh and scp (to copy resource
>    files and command files to the remote server).
>    - The task will ssh and scp to the task server as the DS running
>    user, not the tenant.
>
> Answers to the questions you are most concerned about:
>
> *1. How to avoid causing IO overload?*
> The IO load will be limited by the following measures:
>
>    1. When the IO usage of the Worker is high, the current task will
>    not be executed.
>    2. Set a total size threshold for the files transferred by a single
>    task.
>    3. Control the transfer rate.
>    4. Monitor the file transfer process. If the IO usage is too high,
>    the transfer will be terminated and the task will be set to failed.
>
> *2. Why not use a remote jar and let the remote server download the
> resource files from the resource center?*
> I think this would increase the complexity of the remote server's
> environment, because the remote server is not part of the DS cluster.
> For example, the remote user would need permission to download the
> resource files, and a JRE would need to be installed.
>
> If there are no more suggestions or questions, I will start development
> and submit the relevant PRs asap for everyone to review. Thanks.
>
> B. R.
> Yann
>
> On Wed, Aug 31, 2022 at 7:54 PM yann ann <[email protected]> wrote:
>
>> Dear All,
>>
>> Sorry for the inaccuracy of the previous description of Issue #11652
>> <https://github.com/apache/dolphinscheduler/issues/11652>.
>>
>> *What is the purpose of this feature?*
>> DS can support task instances running on remote servers (task
>> servers), not just worker nodes.
>>
>>    - Users can manage these task servers on the page: create, edit,
>>    delete and test the connection.
>>    - Each task node in a DAG can specify the task server it should be
>>    executed on.
>>    - The task server property belongs only to task instances, not
>>    workflow instances.
>>
>> *MOP*
>> 1. Add a *TaskServer* entity in the *dolphinscheduler-dao* module, and
>> create a table named *t_ds_task_server* in the DB. Add a TaskServer API.
>> [image: image.png]
>> 2. Add a task server management page in the Security menu, supporting
>> create, delete, edit and connection test.
>> 3. Add a task server select input (field: taskServerCode) to the Shell
>> and Python task forms.
>> 4. Add a *taskServerCode* column to the *t_ds_task_instance*,
>> *t_ds_task_definition* and *t_ds_task_definition_log* tables, and
>> change the related APIs.
>> 5. Add a *taskServerCode* field to *TaskInstance, TaskDefinition,
>> TaskDefinitionLog, TaskNode*, and add a *taskServerInfo* field to
>> *TaskInstance* (not a table field).
>> 6. Add a *taskServerInfo* entity (containing only ip, user, password,
>> name) in the *dolphinscheduler-task-plugin* module, and add a
>> taskServerInfo field to *TaskExecutionContext*.
>> 7. Shell and Python tasks will check
>> TaskExecutionContext.getTaskServerInfo(). If it is not null, the task
>> will scp the command files and resource files to the task server and
>> send the start command to the remote server to execute them. Other
>> task plugins that need this feature can check the same field.
>> 8. The task will ssh and scp to the task server as the DS running
>> user, not the tenant.
>>
>> I look forward to your comments. If there are no comments, I will try
>> to complete the tests and submit a PR. Thanks
>>
>> B. R.
>> Yann
>>
>> On Sat, Aug 27, 2022 at 9:31 PM yann ann <[email protected]>
>> wrote:
>>
>>> Hi,
>>>
>>> The current DS only supports tasks running on the Worker servers, but
>>> in most real business scenarios tasks need to run on a specified
>>> server, and that server should not need the Worker service deployed
>>> on it. Please check Issue #11652
>>> <https://github.com/apache/dolphinscheduler/issues/11652>.
>>> Thanks to @SbloodyS for suggesting that I provide the details of the
>>> architecture design.
>>>
>>> Please check the architecture design. I'm not sure if DS needs to use
>>> a fixed design template, so I'll try to be as detailed as possible.
>>> This design mainly consists of four parts:
>>>
>>> *Part 1: Create a host table.*
>>> Create a table named t_ds_host in the DB. Please check the following
>>> E-R diagram (T_DS_HOST table). The purpose of this table is to save
>>> the host information configured in the UI.
>>>
>>> [image: image.png]
>>>
>>> *Part 2: Add host manage page.*
>>> Add a *Host Manage* module to the Security page, similar to the
>>> *Environment Manage* page. Users can add host records and test the
>>> connection to them.
>>> [image: image.png]
>>>
>>> *Part 3: Add host manage API*
>>> Add a host manage API, the same as for the environment manage module.
>>> For the connection test, I think DS needs to test SSH and SFTP at the
>>> same time, because we need to scp our task script to this host and
>>> run shell commands. I plan to use a third-party library, JSch or SSHJ.
>>>
>>> *Part 4: Add host selection*
>>> Add host selection to the Shell and Python task pages.
>>> Then shellTask or pythonTask can check this host param. If the host
>>> param is not null, the task will scp the command files to the remote
>>> server and dispatch the exec command.
>>>
>>> Why add it only to Shell and Python tasks?
>>> Because I think each task type needs to decide for itself whether it
>>> needs to be executed on a remote server, and I think Shell and Python
>>> should support it.
>>>
>>> I look forward to your comments.
>>>
>>> B. R.
>>> Yann
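
For reference, the dispatch check described in the quoted proposal (check TaskExecutionContext.getTaskRemoteServerInfo(), then scp and exec) could be sketched roughly as below. This is only an illustration under assumptions, not DolphinScheduler code: RemoteServerInfo, planDispatch and maxTotalBytes are hypothetical names, the sketch only plans the steps as strings, and a real implementation would perform them through a JSch session. It also folds in the "total size threshold" idea from the IO overload answer.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not DolphinScheduler code: plans what a Shell/Python
// task would do depending on whether a task remote server is configured.
class RemoteDispatchSketch {

    // Assumed shape of the proposal's taskRemoteServerInfo (ip, user, password, name).
    static final class RemoteServerInfo {
        final String ip;
        final String user;
        final String password;
        final String name;

        RemoteServerInfo(String ip, String user, String password, String name) {
            this.ip = ip;
            this.user = user;
            this.password = password;
            this.name = name;
        }
    }

    // Returns the ordered steps for one task. files maps path -> size in bytes.
    // maxTotalBytes is an assumed per-task transfer threshold.
    static List<String> planDispatch(RemoteServerInfo server,
                                     Map<String, Long> files,
                                     String startCommand,
                                     long maxTotalBytes) {
        List<String> steps = new ArrayList<>();
        if (server == null) {
            // No remote server selected: run on the Worker as today.
            steps.add("local-exec " + startCommand);
            return steps;
        }
        long total = 0;
        for (long size : files.values()) {
            total += size;
        }
        if (total > maxTotalBytes) {
            // Over the threshold: fail the task instead of transferring.
            throw new IllegalStateException(
                    "transfer size " + total + " exceeds limit " + maxTotalBytes);
        }
        String target = server.user + "@" + server.ip;
        for (String path : files.keySet()) {
            steps.add("scp " + path + " " + target);
        }
        steps.add("ssh " + target + " " + startCommand);
        return steps;
    }

    public static void main(String[] args) {
        Map<String, Long> files = new LinkedHashMap<>();
        files.put("task.command", 200L);
        files.put("data.csv", 800L);
        RemoteServerInfo server = new RemoteServerInfo("10.0.0.5", "ds", "secret", "etl-1");
        for (String step : planDispatch(server, files, "sh task.command", 4096L)) {
            System.out.println(step);
        }
    }
}
```

Keeping the null check as the only branch point matches the proposal's intent that other task plugins can opt in by checking the same field.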
