Hi Wenjun,

Thanks for the reminder. I have updated the images in
https://github.com/apache/dolphinscheduler/issues/11652

Thanks.

B. R.
Yann

On Tue, Oct 18, 2022 at 11:00 AM wenjun <[email protected]> wrote:

> Hi Yann,
>
> The pictures cannot be displayed. Could you please upload them to issue #11652
> or another place and add links?
>
> Thanks,
> Wenjun
>
> On Mon, Oct 17, 2022 at 9:06 PM yann ann <[email protected]> wrote:
>
> > Dear All,
> >
> > Related issue: #11652
> > <https://github.com/apache/dolphinscheduler/issues/11652>
> >
> > According to the conclusion of the discussion at the regular meeting last
> > week (2022/10/12), the development of this function is approved, and the
> > solution has been added to the issue.
> >
> > - Add a table named *t_ds_task_remote_server* in the DB.
> > - Add a *TaskRemoteServer* entity in the dolphinscheduler-dao module.
> > - Add a *task remote server* management page in the Security menu,
> >   supporting create, delete, edit and connection test.
> > - Add a task remote server select input (field: taskRemoteServerCode)
> >   to the Shell and Python task forms.
> > - Add a *task_remote_server_code* column to the
> >   *t_ds_task_instance*, *t_ds_task_definition* and
> >   *t_ds_task_definition_log* tables.
> > - Add a *taskRemoteServerCode* field to
> >   *TaskInstance, TaskDefinition, TaskDefinitionLog, TaskNode*, and add a
> >   taskRemoteServerInfo field to TaskInstance (not a table field).
> > - Add a *taskRemoteServerInfo* entity (containing only ip, user, password,
> >   name) in the dolphinscheduler-task-plugin module, and add a
> >   taskRemoteServerInfo field to TaskExecutionContext.
> > - Shell and Python tasks will check
> >   TaskExecutionContext.getTaskRemoteServerInfo(). If it is not null, the
> >   task will scp the command files and resource files to the task remote
> >   server and send the start command to the remote server to execute it.
> >   Other task plugins that need this feature can check the same field.
> > - Use the third-party JSch library for ssh and scp (scp resource files and
> >   command files to the remote server).
> > - The task will ssh and scp to the task server as the DS running user,
> >   not the tenant.
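
The dispatch decision described in the list above can be sketched as follows.
This is only a minimal illustration in Python, not the proposed Java
implementation: the function name plan_execution, the remote_dir parameter and
the use of plain scp/ssh command lines are assumptions made for the example
(the proposal itself uses JSch), and the remote directory is assumed to exist
already. Only the four fields ip, user, password and name come from the
proposal.

```python
import posixpath
import shlex
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TaskRemoteServerInfo:
    # The four fields named in the proposal.
    ip: str
    user: str
    password: str  # a real client (e.g. JSch) would use this to authenticate
    name: str


def plan_execution(
    info: Optional[TaskRemoteServerInfo],
    command_file: str,
    resource_files: List[str],
    remote_dir: str,  # hypothetical: working directory on the remote server
) -> List[List[str]]:
    """Return the ordered commands a Shell task would run (hypothetical helper)."""
    if info is None:
        # No remote server selected: run on the worker as today.
        return [["sh", command_file]]

    target = f"{info.user}@{info.ip}"
    steps = []
    # 1. Copy the command file and every resource file to the remote server.
    for path in [command_file, *resource_files]:
        steps.append(["scp", path, f"{target}:{remote_dir}/"])
    # 2. Send the start command so the remote server executes the command file.
    remote_cmd = posixpath.join(remote_dir, posixpath.basename(command_file))
    steps.append(["ssh", target, "sh " + shlex.quote(remote_cmd)])
    return steps
```

Returning a plan instead of executing it keeps the local/remote branch easy to
unit-test without a real server.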
> >
> > Answers to the questions you are most concerned about:
> >
> > *1. How to avoid causing IO overload?*
> > The IO load will be limited by the following measures:
> >
> > 1. When the IO usage of the Worker is high, the current task will not be
> >    executed.
> > 2. Set a total size threshold for the files transferred by a single task.
> > 3. Control the transfer rate.
> > 4. Monitor the file transfer process. If the IO usage is too high, the
> >    transfer will be terminated and the task will be set to failed.
> >
> > *2. Why not use a remote jar and let the remote server download the
> > resource files from the resource center?*
> > I think this would increase the complexity of the remote server's
> > environment, because the remote server is not part of the DS cluster. For
> > example, the remote user would need permission to download the resource
> > files, and a JRE would need to be installed.
> >
> > If there are no more suggestions or questions, I will start development
> > and submit the relevant PRs asap for everyone to review. Thanks.
> >
> > B. R.
> > Yann
> >
> > On Wed, Aug 31, 2022 at 7:54 PM yann ann <[email protected]> wrote:
> >
> >> Dear All,
> >>
> >> Sorry for the inaccuracy of the previous description of Issue #11652
> >> <https://github.com/apache/dolphinscheduler/issues/11652>.
> >>
> >> *What is the purpose of this feature?*
> >> DS can support task instances running on remote servers (task
> >> servers), not just worker nodes.
> >>
> >> - Users can manage these task servers on the page: create,
> >>   edit, delete and test the connection.
> >> - Each task node in a DAG can specify the task server on which it
> >>   should be executed.
> >> - The task server property belongs only to task instances, not
> >>   workflow instances.
> >>
> >> *MOP*
> >> 1. Add a *TaskServer* entity in the *dolphinscheduler-dao* module, and
> >>    create a table named *t_ds_task_server* in the DB. Add a TaskServer API.
> >>    [image: image.png]
> >> 2. Add a task server management page in the Security menu, supporting
> >>    create, delete, edit and connection test.
> >> 3. Add a task server select input (field: taskServerCode) to the Shell
> >>    and Python task forms.
> >> 4. Add a *taskServerCode* column to the
> >>    *t_ds_task_instance*, *t_ds_task_definition* and
> >>    *t_ds_task_definition_log* tables, and update the related APIs.
> >> 5. Add a *taskServerCode* field to *TaskInstance, TaskDefinition,
> >>    TaskDefinitionLog, TaskNode*, and add a *taskServerInfo* field to
> >>    *TaskInstance* (not a table field).
> >> 6. Add a *taskServerInfo* entity (containing only ip, user, password, name)
> >>    in the *dolphinscheduler-task-plugin* module, and add a taskServerInfo
> >>    field to *TaskExecutionContext*.
> >> 7. Shell and Python tasks will check
> >>    TaskExecutionContext.getTaskServerInfo(). If it is not null, the task
> >>    will scp the command files and resource files to the task server and
> >>    send the start command to the remote server to execute it. Other task
> >>    plugins that need this feature can check the same field.
> >> 8. The task will ssh and scp to the task server as the DS running user,
> >>    not the tenant.
> >>
> >> I look forward to your comments. If there are no comments, I will try to
> >> complete the tests and submit a PR. Thanks.
> >>
> >> B. R.
> >> Yann
> >>
> >> On Sat, Aug 27, 2022 at 9:31 PM yann ann <[email protected]> wrote:
> >>
> >>> Hi,
> >>>
> >>> The current DS only supports tasks running on the Worker servers, but in
> >>> most real business scenarios, tasks need to run on a specified server,
> >>> and that server does not need to deploy the Worker service. Please check
> >>> Issue #11652 <https://github.com/apache/dolphinscheduler/issues/11652>.
> >>> Thanks to @SbloodyS for suggesting that I provide the details of the
> >>> architecture design.
> >>>
> >>> Please check the architecture design. I'm not sure if DS needs to use a
> >>> fixed design template, so I'll try to be as detailed as possible.
> >>> This design mainly consists of four parts:
> >>>
> >>> *Part 1: Create a host table.*
> >>> Create a table named t_ds_host in the DB. Please check the following E-R
> >>> diagram of the T_DS_HOST table. The purpose of this table is to save the
> >>> host information configured in the UI.
> >>>
> >>> [image: image.png]
> >>>
> >>> *Part 2: Add host manage page.*
> >>> Add a *Host Manage* module to the Security page, similar to the
> >>> *Environment Manage* page. Users can add host records and test the
> >>> connection to them.
> >>> [image: image.png]
> >>>
> >>> *Part 3: Add host manage API*
> >>> Add a host manage API, following the same model as environment manage.
> >>> For the connection test function, I think DS needs to test SSH and
> >>> SFTP at the same time, because we need to scp our task script to this host
> >>> and run shell commands. I plan to use a third-party library, JSch or SSHJ.
> >>>
> >>> *Part 4: Add host selection*
> >>> Add host selection to the Shell and Python task pages.
> >>> Then shellTask or pythonTask can check this host param. If the host param
> >>> is not null, the task will scp the command files to the remote server and
> >>> dispatch the exec command.
> >>>
> >>> Why add it only to Shell and Python tasks?
> >>> Because I think each task needs to decide for itself whether it needs to
> >>> be executed on a remote server, and I think shell and python should
> >>> support it.
> >>>
> >>> Looking forward to your comments.
> >>>
> >>> B. R.
> >>> Yann
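
For the connection test mentioned in Part 3, a cheap pre-check before any
JSch or SSHJ login could be to confirm that the host's port actually speaks
SSH. The sketch below is only an illustration in Python with hypothetical
helper names (find_ssh_banner, probe_ssh); it relies on the fact that an SSH
server sends an identification line starting with "SSH-" right after the TCP
connection is opened. It does not authenticate and does not check SFTP, so
the library-based test described in Part 3 would still be required.

```python
import itertools
import socket
from typing import Iterable, Optional


def find_ssh_banner(lines: Iterable[bytes]) -> Optional[str]:
    """Return the first line starting with 'SSH-', without its line ending."""
    for line in lines:
        if line.startswith(b"SSH-"):
            return line.rstrip(b"\r\n").decode("ascii", "replace")
    return None


def probe_ssh(host: str, port: int = 22, timeout: float = 5.0,
              max_lines: int = 10) -> Optional[str]:
    """Connect to host:port and return the SSH identification line, if any.

    Returns None when the port is unreachable, times out, or does not
    announce itself as SSH within the first max_lines lines.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with sock.makefile("rb") as stream:
                # Servers may send a few other lines before the SSH banner,
                # so read a bounded number of lines until EOF.
                lines = iter(lambda: stream.readline(256), b"")
                return find_ssh_banner(itertools.islice(lines, max_lines))
    except OSError:
        return None
```

A host manage page could call probe_ssh first and report "port is not an SSH
service" separately from authentication failures.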
