Hi Wenjun,

Thanks for the reminder. I have updated the images in
https://github.com/apache/dolphinscheduler/issues/11652

Thanks.

B. R.
Yann

On Tue, Oct 18, 2022 at 11:00 AM wenjun <[email protected]> wrote:

> Hi Yann,
>
> The picture cannot be displayed. Could you please upload it to issue #11652
> or somewhere else and add a link?
>
> Thanks,
> Wenjun
>
> On Mon, Oct 17, 2022 at 9:06 PM yann ann <[email protected]> wrote:
>
> > Dear All,
> >
> > Related issue: #11652
> > <https://github.com/apache/dolphinscheduler/issues/11652>
> >
> > According to the conclusion of the discussion at the regular meeting last
> > week (2022/10/12), the development of this feature is approved, and the
> > solution has been updated in the issue.
> >
> >    - Add a table named *t_ds_task_remote_server* in the DB.
> >    - Add a *TaskRemoteServer* entity in the dolphinscheduler-dao model.
> >    - Add a *task remote server* management page in the Security menu, with
> >    create, delete, edit and connection test operations.
> >    - Add a task remote server select input (field: taskRemoteServerCode)
> >    in the Shell and Python task forms.
> >    - Add a *task_remote_server_code* column to the
> >    *t_ds_task_instance, t_ds_task_definition, t_ds_task_definition_log*
> >    tables.
> >    - Add a *taskRemoteServerCode* field to
> >    *TaskInstance, TaskDefinition, TaskDefinitionLog, TaskNode*, and add a
> >    taskRemoteServerInfo field to TaskInstance (not a table field).
> >    - Add a *taskRemoteServerInfo* entity (containing only ip, user,
> >    password, name) in the dolphinscheduler-task-plugin model, and add a
> >    taskRemoteServerInfo field to TaskExecutionContext.
> >    - Shell and Python tasks will check
> >    TaskExecutionContext.getTaskRemoteServerInfo(). If it is not null, the
> >    task will scp the command files and resource files to the task remote
> >    server and send the start command to the remote server to execute it.
> >    Other task plugins that need this feature can check the same field.
> >    - Use the third-party JSch library for ssh and scp (to copy resource
> >    files and command files to the remote server).
> >    - The task will ssh and scp to the task server as the DS running user,
> >    not as the tenant.
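
The null check described in the list above could look roughly like the following. This is only a sketch under assumptions: the class TaskRemoteServerInfo here is a stand-in for the proposed entity, and RemoteTransport and dispatch() are hypothetical names introduced for illustration (the real implementation would wrap JSch behind something similar). A recording fake is used so the example runs without an SSH connection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RemoteDispatchSketch {

    /** Hypothetical stand-in for the proposed taskRemoteServerInfo entity. */
    static final class TaskRemoteServerInfo {
        final String ip;
        final String user;
        final String password;
        final String name;

        TaskRemoteServerInfo(String ip, String user, String password, String name) {
            this.ip = ip;
            this.user = user;
            this.password = password;
            this.name = name;
        }
    }

    /** Hypothetical abstraction over the ssh/scp client (JSch in the proposal). */
    interface RemoteTransport {
        void copy(TaskRemoteServerInfo server, String localPath);

        void exec(TaskRemoteServerInfo server, String command);
    }

    /**
     * Returns "local" when no remote server is configured. Otherwise copies
     * every file, sends the start command, and returns "remote".
     */
    static String dispatch(TaskRemoteServerInfo server, List<String> files,
                           String startCommand, RemoteTransport transport) {
        if (server == null) {
            return "local";
        }
        for (String file : files) {
            transport.copy(server, file);
        }
        transport.exec(server, startCommand);
        return "remote";
    }

    public static void main(String[] args) {
        List<String> calls = new ArrayList<>();
        // Recording fake: logs what would be sent instead of opening a connection.
        RemoteTransport fake = new RemoteTransport() {
            @Override
            public void copy(TaskRemoteServerInfo server, String localPath) {
                calls.add("scp " + localPath + " -> " + server.ip);
            }

            @Override
            public void exec(TaskRemoteServerInfo server, String command) {
                calls.add("ssh " + server.ip + ": " + command);
            }
        };
        TaskRemoteServerInfo server =
                new TaskRemoteServerInfo("10.0.0.8", "ds", "secret", "etl-box");
        List<String> files = Arrays.asList("task.command", "resource.sql");
        System.out.println(dispatch(null, files, "sh task.command", fake));
        System.out.println(dispatch(server, files, "sh task.command", fake));
        calls.forEach(System.out::println);
    }
}
```

Keeping the transport behind an interface is a design choice of this sketch, not part of the proposal. It would let other task plugins reuse the same check and make the logic testable without a remote host.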
> >
> > Answers to the questions you are most concerned about:
> >
> > *1. How to avoid causing IO overload?*
> > IO load will be kept in check by the following measures:
> >
> >    1. When the IO usage of the Worker is high, the current task will not
> >    be executed.
> >    2. Set a total size threshold for the files transferred by a single task.
> >    3. Control the transfer rate.
> >    4. Monitor the file transfer process. If the IO usage is too high, the
> >    transfer process will be terminated and the task will be set to fail.
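
As a rough illustration of measures 2 to 4, the sketch below checks a total size threshold up front and then copies in chunks, pacing itself to a byte rate and aborting when a monitored IO usage value crosses a limit. All names, the 8 KB chunk size, and the way IO usage is supplied (a DoubleSupplier) are assumptions for illustration only. The proposal does not specify how the Worker's IO usage is measured.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.function.DoubleSupplier;

public class TransferGuardSketch {

    /** Measure 2: true if the task's files fit under the total size threshold. */
    static boolean withinSizeLimit(long[] fileSizes, long maxTotalBytes) {
        long total = 0;
        for (long size : fileSizes) {
            total += size;
        }
        return total <= maxTotalBytes;
    }

    /**
     * Measures 3 and 4: copy in chunks, sleep to stay at or below
     * maxBytesPerSecond, and abort if the monitored IO usage exceeds maxIoUsage.
     */
    static long throttledCopy(InputStream in, OutputStream out, long maxBytesPerSecond,
                              DoubleSupplier ioUsage, double maxIoUsage)
            throws IOException, InterruptedException {
        byte[] buffer = new byte[8192];
        long copied = 0;
        long start = System.nanoTime();
        int n;
        while ((n = in.read(buffer)) != -1) {
            if (ioUsage.getAsDouble() > maxIoUsage) {
                // The caller would mark the task as failed when this is thrown.
                throw new IOException("IO usage too high, transfer aborted");
            }
            out.write(buffer, 0, n);
            copied += n;
            // Earliest time (ms since start) at which `copied` bytes are allowed.
            long allowedAtMillis = copied * 1000 / maxBytesPerSecond;
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
            if (allowedAtMillis > elapsedMillis) {
                Thread.sleep(allowedAtMillis - elapsedMillis);
            }
        }
        return copied;
    }

    public static void main(String[] args) throws Exception {
        byte[] payload = new byte[32 * 1024];
        System.out.println(withinSizeLimit(new long[] {payload.length}, 1024 * 1024));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // 64 KB/s limit with a constant, low IO usage reading.
        long copied = throttledCopy(new ByteArrayInputStream(payload), out,
                64 * 1024, () -> 0.2, 0.8);
        System.out.println(copied + " bytes copied");
    }
}
```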
> >
> > *2. Why not use a remote jar and let the remote server download the
> > resource files from the resource center?*
> > I think this would increase the complexity of the remote server's
> > environment, because the remote server is not part of the DS cluster. For
> > example, the remote user would need permission to download the resource
> > files, and a JRE would need to be installed.
> >
> >
> > If there are no further suggestions or questions, I will start development
> > and submit the relevant PRs asap for everyone to review. Thanks.
> >
> > B. R.
> > Yann
> >
> > On Wed, Aug 31, 2022 at 7:54 PM yann ann <[email protected]> wrote:
> >
> >> Dear All,
> >>
> >> Sorry for the inaccuracy of the previous description of Issue #11652
> >> <https://github.com/apache/dolphinscheduler/issues/11652>.
> >>
> >> *What is the purpose of this feature?*
> >> DS should be able to run task instances on remote servers (task
> >> servers), not just on worker nodes.
> >>
> >>    - Users can manage these task servers on a page, with create,
> >>    edit, delete and connection test operations.
> >>    - Each task node in a DAG can specify the task server on which it
> >>    should be executed.
> >>    - The task server property belongs only to task instances, not to
> >>    workflow instances.
> >>
> >> *MOP*
> >> 1. Add a *TaskServer* entity in the *dolphinscheduler-dao* model, and
> >> create a table named *t_ds_task_server* in the DB. Add a TaskServer API.
> >> [image: image.png]
> >> 2. Add a task server management page in the Security menu, with create,
> >> delete, edit and connection test operations.
> >> 3. Add a task server select input (field: taskServerCode) in the Shell
> >> and Python task forms.
> >> 4. Add a *taskServerCode* column to the
> >> *t_ds_task_instance, t_ds_task_definition, t_ds_task_definition_log*
> >> tables, and change the related APIs.
> >> 5. Add a *taskServerCode* field to *TaskInstance, TaskDefinition,
> >> TaskDefinitionLog, TaskNode*, and add a *taskServerInfo* field to
> >> *TaskInstance* (not a table field).
> >> 6. Add a *taskServerInfo* entity (containing only ip, user, password,
> >> name) in the *dolphinscheduler-task-plugin* model, and add a
> >> taskServerInfo field to *TaskExecutionContext*.
> >> 7. Shell and Python tasks will check
> >> TaskExecutionContext.getTaskServerInfo(). If it is not null, the task
> >> will scp the command files and resource files to the task server and
> >> send the start command to the remote server to execute it. Other task
> >> plugins that need this feature can check the same field.
> >> 8. The task will ssh and scp to the task server as the DS running user,
> >> not as the tenant.
> >>
> >> I look forward to your comments. If there are none, I will try to
> >> complete the test and submit a PR. Thanks.
> >>
> >> B. R.
> >> Yann
> >>
> >> On Sat, Aug 27, 2022 at 9:31 PM yann ann <[email protected]> wrote:
> >>
> >>> Hi,
> >>>
> >>> DS currently only supports tasks running on the Worker servers, but in
> >>> most real business scenarios, tasks need to run on a specified server,
> >>> and that server should not need the Worker service deployed. Please
> >>> check Issue
> >>> #11652 <https://github.com/apache/dolphinscheduler/issues/11652>.
> >>> Thanks to @SbloodyS for suggesting that I provide the details of the
> >>> architecture design.
> >>>
> >>> Please check the architecture design. I'm not sure if DS needs to use a
> >>> fixed design template, so I'll try to be as detailed as possible.
> >>> This design mainly consists of four parts:
> >>>
> >>> *Part 1: Create a host table.*
> >>> Create a table named t_ds_host in the DB. Please check the following E-R
> >>> diagram of the T_DS_HOST table. The purpose of this table is to save the
> >>> host information configured in the UI.
> >>>
> >>> [image: image.png]
> >>>
> >>> *Part 2: Add host manage page.*
> >>> Add a *Host Manage* module in the Security page, similar to the
> >>> *Environment Manage* page. Users can add host records and test the
> >>> connection to them.
> >>> [image: image.png]
> >>> *Part 3: Add host manage API*
> >>> Add a host manage API, following the same model as environment manage.
> >>> For the connection test function, I think DS needs to test SSH and
> >>> SFTP at the same time, because we need to scp our task script to this
> >>> host and run shell commands. I plan to use the third-party JSch or SSHJ
> >>> library.
> >>>
> >>> *Part 4: Add host selection*
> >>> Add host selection to the Shell and Python task pages.
> >>> shellTask or pythonTask can then check this host param. If the host
> >>> param is not null, the task will scp the command files to the remote
> >>> server and dispatch the exec command.
> >>>
> >>> Why add this only to Shell and Python tasks?
> >>> Because I think each task type needs to decide for itself whether it
> >>> should be executed on the remote server, and I think Shell and Python
> >>> should support it.
> >>>
> >>> Looking forward to your comments.
> >>>
> >>> B. R.
> >>> Yann
> >>>
> >>>
>
