I agree with the current implementation plan for global parameters. This is
great work.

Thanks,
Wenjun Ruan

On Sun, Jun 6, 2021 at 1:30 AM wenjun ruan <[email protected]> wrote:

> Hi,
>
> Xingjie's plan is great, I don't oppose the current plan.
>
> But there seems to be something in the current plan that may be confusing.
>
> 1. The current plan seems to store both the in param and the out param in
> the varpool. When an in param and an out param have the same name, the out
> param overrides the in param. This may be reasonable, but when we are
> troubleshooting it is hard to know where the current in param came from: it
> may come from an upstream task or be generated by the current task.
>
> 2. If multiple upstream tasks want to pass the same parameter, only one
> value will be kept in the varpool, and it is hard to tell which one.
>
> 3. If we store out params in every downstream varpool, the varpools of
> later nodes may grow large.
>
> You can find the details at
> https://github.com/apache/dolphinscheduler/issues/5565
>
> To be honest, I am not sure if the user will have these problems.
>
> My suggestion is not to store every param in every varpool: each task
> saves only its own out params in its varpool. I am not sure whether this is
> consistent with @Xingjie's second plan.
>
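The suggestion above could be sketched roughly as follows. This is only an illustration under stated assumptions, not DolphinScheduler code: the class `OwnOutputVarPoolSketch`, the `Sourced` type, and the "nearest upstream task first" lookup order are all hypothetical. It shows each task keeping only its own out params, so that a lookup also reports which task produced the value.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: every task stores only the OUT params it produced itself.
final class OwnOutputVarPoolSketch {

    // A resolved value together with the name of the task that produced it.
    static final class Sourced {
        final String value;
        final String sourceTask;

        Sourced(String value, String sourceTask) {
            this.value = value;
            this.sourceTask = sourceTask;
        }
    }

    // taskName -> (paramName -> value); holds only that task's own outputs.
    private final Map<String, Map<String, String>> outputsByTask = new LinkedHashMap<>();

    void recordOutput(String taskName, String paramName, String value) {
        outputsByTask.computeIfAbsent(taskName, k -> new LinkedHashMap<>()).put(paramName, value);
    }

    // Assumption: the caller passes upstream task names ordered nearest first.
    // The first task that produced the param wins, and its name is returned too.
    Optional<Sourced> resolve(String paramName, List<String> upstreamNearestFirst) {
        for (String task : upstreamNearestFirst) {
            Map<String, String> own = outputsByTask.get(task);
            if (own != null && own.containsKey(paramName)) {
                return Optional.of(new Sourced(own.get(paramName), task));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        OwnOutputVarPoolSketch pool = new OwnOutputVarPoolSketch();
        pool.recordOutput("Task1", "parameterA", "42");
        // Task4 looks upstream; Task2 and Task3 never had to copy parameterA.
        pool.resolve("parameterA", Arrays.asList("Task3", "Task2", "Task1"))
            .ifPresent(s -> System.out.println(s.value + " from " + s.sourceTask));
    }
}
```

With this shape, intermediate tasks carry nothing they did not produce, and the source of a value is available when troubleshooting.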
>
> This is my personal opinion, for reference only.
>
> Thanks,
> Wenjun Ruan
>
>
> On Sat, Jun 5, 2021 at 11:22 PM Lidong Dai <[email protected]> wrote:
>
>> hi,
>>   Any progress? Do we need a meeting to resolve this?
>> By the way, pictures are not displayed on the Apache mailing list. You can
>> upload the picture to GitHub and then paste the URL into the mail.
>>
>>
>> Best Regards
>> ---------------
>> DolphinScheduler PMC
>> Lidong Dai
>> [email protected]
>> ---------------
>>
>>
>> On Tue, Jun 1, 2021 at 5:04 PM Xingjie Wang(联通集团联通数字科技有限公司) <
>> [email protected]> wrote:
>>
>> > The plan really doesn't cover those scenarios.
>> > For the first one, if we want to do this, we should save the varPool at
>> > the processInstance level, so that Task4 can get the varPool from Task1
>> > without going through Task2 and Task3.
>> > That would obscure the relation between globalParam, localParam, and
>> > varPool.
>> > Another plan: when the user defines Task4's IN param, the user should
>> > choose Task1 to indicate that this param is Task1's OUT param. When
>> > initializing Task4's varPool, get Task1 from completeTaskList, then get
>> > its varPool. And when the taskInstance has a new property, record the
>> > taskInstance's name in that property and put the property into the varPool.
>> > This would cover those scenarios.
>> > What do you think?
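A minimal sketch of the second plan described above, in which an IN param names the task whose OUT param it reads. Only `completeTaskList` and `varPool` come from the message; the class `SourceBoundParamSketch`, the `CompletedTask` type, and the `resolveIn` method are hypothetical names used for illustration, and returning null for a missing task or param is an assumption.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: an IN param is bound to a named source task.
final class SourceBoundParamSketch {

    // Stand-in for a finished task instance and the varPool it produced.
    static final class CompletedTask {
        final String name;
        final Map<String, String> varPool;

        CompletedTask(String name, Map<String, String> varPool) {
            this.name = name;
            this.varPool = varPool;
        }
    }

    // Find the named source task in the completed list and read the param from it.
    // Returns null if the task is not in the list or did not produce the param.
    static String resolveIn(String sourceTask, String paramName, List<CompletedTask> completeTaskList) {
        for (CompletedTask task : completeTaskList) {
            if (task.name.equals(sourceTask)) {
                return task.varPool.get(paramName);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> task1Pool = new HashMap<>();
        task1Pool.put("parameterA", "42");
        List<CompletedTask> done = Arrays.asList(
            new CompletedTask("Task1", task1Pool),
            new CompletedTask("Task2", new HashMap<>()),
            new CompletedTask("Task3", new HashMap<>()));
        // Task4 declares that its IN param parameterA is Task1's OUT param.
        System.out.println(resolveIn("Task1", "parameterA", done));
    }
}
```

Because the source task is named explicitly, Task2 and Task3 do not need to carry parameterA, and there is no ambiguity when two upstream tasks emit the same name.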
>> > -----Original Message-----
>> > From: Ruan, Wenjun <[email protected]>
>> > Sent: June 1, 2021 16:07
>> > To: [email protected]
>> > Subject: Re: [DISCUSS]The new Plan of global params
>> >
>> > Sorry, it seems that the picture cannot be displayed properly. The DAG
>> > structure is as follows:
>> >
>> > Task1   ->  Task2   ->  Task3   ->  Task4
>> >
>> > From: Ruan, Wenjun <[email protected]>
>> > Date: Tuesday, June 1, 2021 at 3:55 PM
>> > To: [email protected] <[email protected]>
>> > Subject: Re: [DISCUSS]The new Plan of global params
>> >
>> > Hi Xingjie,
>> >
>> > I have two things I want to confirm.
>> > In your plan, it seems that we need to store the varpool in all the
>> > downstream nodes?
>> > For example, if I have a simple DAG like the one below:
>> > [cid:[email protected]]
>> >
>> > If I want to get an out param of Task1 in Task4 (call it parameterA),
>> > then we need to store parameterA in Task2's varpool and Task3's varpool,
>> > even if we don't need parameterA in Task2 and Task3. Am I right?
>> > Can we take it directly from Task1?
>> >
>> > The second thing is that if I want to troubleshoot, can I find the
>> > source of the parameters in the varpool? It seems I need to walk back
>> > through the upstream nodes and find the first one that contains the
>> > parameter. That is not convenient if I have many tasks in a DAG.
>> >
>> > Thanks,
>> > Wenjun Ruan
>> >
>> >
>> > From: Xingjie Wang(联通集团联通数字科技有限公司) <[email protected]>
>> > Date: Tuesday, June 1, 2021 at 3:21 PM
>> > To: [email protected] <[email protected]>
>> > Subject: [DISCUSS]The new Plan of global params
>> >
>> >
>> >
>> > Hi Dev Team
>> >
>> > The scheme for the global params that tasks use needs to change as
>> > described below.
>> > Here is the issue for this discussion:
>> >
>> >
>> > https://github.com/apache/dolphinscheduler/issues/5565
>> >
>> > I will change the global param scheme so that it depends on the
>> > relations between tasks.
>> > Here are the details.
>> >
>> > 1. Create the taskInstance
>> >
>> > Get the previous tasks, get their varPools, and put them into this
>> > task's varPool. If the previous tasks have a param with the same name,
>> > use the value that is not null. If all of the values are null, use the
>> > earlier one.
>> >
>> > 2. Worker gets the params
>> >
>> > The Master sends the varPool to the Worker, and the taskProcessor gets the
>> > varPool in the format List<Property>. The varPool is handled the same way
>> > as the localParam.
>> >
>> > 3. Worker returns the OUT params
>> >
>> > When the user defines an OUT param on the Task Definition page, the
>> > Worker gets the result from the Processor.
>> >
>> > Different Processors return different formats. For example, SQL returns
>> > List<Map<String,String>>, so users can get more than one row or more than
>> > one column; SHELL returns a Map<String,String> or a String. The OUT param
>> > is added to the varPool, sent to the Master, and saved into the database.
>> > The value is also saved into the localParam.
>> >
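The merge rule in step 1 can be sketched as below. This is an illustration under stated assumptions, not the actual implementation: the class `VarPoolMergeSketch` and its `merge` method are hypothetical, varPools are modeled as plain `Map<String, String>` rather than `List<Property>`, and upstream pools are assumed to arrive ordered from the earliest task to the latest. The message does not say what happens when two upstream tasks both provide a non-null value; the sketch keeps the earlier one, which is an assumption.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of merging upstream varPools into a new task's varPool.
final class VarPoolMergeSketch {

    // upstreamEarliestFirst: varPools of the previous tasks, earliest task first.
    static Map<String, String> merge(List<Map<String, String>> upstreamEarliestFirst) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> pool : upstreamEarliestFirst) {
            for (Map.Entry<String, String> entry : pool.entrySet()) {
                String name = entry.getKey();
                String value = entry.getValue();
                boolean firstSighting = !merged.containsKey(name);
                boolean replacesNull = merged.get(name) == null && value != null;
                // Prefer a non-null value; if every value is null, the earliest
                // (null) entry stays. Assumption: with several non-null values,
                // the earlier one is kept.
                if (firstSighting || replacesNull) {
                    merged.put(name, value);
                }
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> task1 = new HashMap<>();
        task1.put("x", null);
        task1.put("y", "1");
        Map<String, String> task2 = new HashMap<>();
        task2.put("x", "2");
        task2.put("y", "3");
        // x takes the non-null value from task2; y keeps the earlier value from task1.
        System.out.println(merge(Arrays.asList(task1, task2)));
    }
}
```

The undefined two-non-null case is exactly where it becomes hard to tell which upstream value survives, so any real implementation would need to pin that rule down.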
>> >
>> >
>> > If you have any questions or a better plan, please contact me.
>> > Thank you.
>> >
>> > If you have received this email in error please notify us immediately by
>> > e-mail. Please reply to [email protected] to unsubscribe from this mail.
>> > We will immediately remove your information from our mailing list.
>> >
>>
>
