Re: [DISCUSS] How to stopp SdkWorker in SdkHarness

2019-10-28 Thread jincheng sun
Sure, Thank you for your confirmation Luke! :)

Best,
Jincheng

Luke Cwik  于2019年10月29日周二 上午1:20写道:

> I would go with creating JIRAs and PRs directly since this doesn't seem to
> be contentious since you have received feedback from a few folks and they
> are all suggesting the same thing.
>
> On Sun, Oct 27, 2019 at 9:27 PM jincheng sun 
> wrote:
>
>> Hi all,
>>
>> Thanks a lot for your feedback. It seems that we have reached consensus
>> that both "Approach 2" and "Approach 3" are needed. "Approach 3" is a good
>> supplement for "Approach 2" and I also prefer "Approach 2" and "Approach 3"
>> for now.
>>
>> Do we need to vote on this discussion or I can create JIRAs and submit
>> the PRs directly?
>>
>> Best,
>> Jincheng
>>
>> Luke Cwik  于2019年10月26日周六 上午4:01写道:
>>
>>> Approach 3 is about caching the bundle descriptor forever but tearing
>>> down a "live" instance of the DoFns at some SDK chosen arbitrary point in
>>> time. This way if a future ProcessBundleRequest comes in, a new "live"
>>> instance can be constructed.
>>> Approach 2 is still needed so that when the workers are being
>>> shutdown all the "live" instances are torn down.
>>>
>>> On Fri, Oct 25, 2019 at 12:56 PM Robert Burke 
>>> wrote:
>>>
 Approach 2 isn't incompatible with approach 3. 3 simple sets down
 convention/configuration for the conditions when the SDK will do this after
 process bundle has completed.



 On Fri, Oct 25, 2019, 12:34 PM Robert Bradshaw 
 wrote:

> I think we'll still need approach (2) for when the pipeline finishes
> and a runner is tearing down workers.
>
> On Fri, Oct 25, 2019 at 10:36 AM Maximilian Michels 
> wrote:
> >
> > Hi Jincheng,
> >
> > Thanks for bringing this up and capturing the ideas in the doc.
> >
> > Intuitively, I would have also considered adding a new Proto message
> for
> > the teardown, but I think the idea to trigger this logic when the SDK
> > Harness evicts process bundle descriptors is more elegant.
> >
> > Thanks,
> > Max
> >
> > On 25.10.19 17:23, Luke Cwik wrote:
> > > I like approach 3 since it doesn't add additional complexity to
> the API
> > > and individual SDKs can choose to implement any clean-up strategy
> they
> > > want or none at all which is the simplest.
> > >
> > > On Thu, Oct 24, 2019 at 8:46 PM jincheng sun <
> sunjincheng...@gmail.com
> > > > wrote:
> > >
> > > Hi,
> > >
> > > Thanks for your comments in doc, I have add Approach 3 which
> you
> > > mentioned! @Luke
> > >
> > > For now, we should do a decision for Approach 3 and Approach 1.
> > > Detail can be found in doc [1]
> > >
> > > Welcome anyone's feedback :)
> > >
> > > Regards,
> > > Jincheng
> > >
> > > [1]
> > >
> https://docs.google.com/document/d/1sCgy9VQPf9zVXKRquK8P6N4x7aB62GEO8ozkujRSHZg/edit?usp=sharing
> > >
> > > jincheng sun  > > > 于2019年10月25日周五 上午10:40写道:
> > >
> > > Hi,
> > >
> > > Functionally capable of `abort`, but it will be called at
> the
> > > end of operator. So, I prefer `dispose` semantics. i.e.,
> all
> > > normal logic has been executed.
> > >
> > > Best,
> > > Jincheng
> > >
> > > Harsh Vardhan  anan...@google.com>>
> > > 于2019年10月23日周三 上午12:14写道:
> > >
> > > Would approach 1 be akin to abort semantics?
> > >
> > > On Mon, Oct 21, 2019 at 8:01 PM jincheng sun
> > >  sunjincheng...@gmail.com>>
> > > wrote:
> > >
> > > Hi Luke,
> > >
> > > Thanks a lot for your reply. Since it allows to
> share
> > > one SDK harness between multiple executable
> stages, the
> > > control service termination may occur much later
> than
> > > the completion of an executable stage. This is the
> main
> > > reason I prefer runners to control the teardown of
> DoFns.
> > >
> > > Regarding to "SDK harnesses can terminate
> instances any
> > > time they want and start new instances anytime as
> > > well.", personally I think it's not conflict with
> the
> > > proposed Approach 1 as the SDK harness could
> decide what
> > > to do when receiving the teardown request. It
> could do
> > > nothing if the DoFns has already been teared down
> and
> > > could also tear down the DoFns if needed.
> > >
> > > What do you think?
> > >
> > >   

Re: [DISCUSS] How to stopp SdkWorker in SdkHarness

2019-10-28 Thread Luke Cwik
I would go with creating JIRAs and PRs directly since this doesn't seem to
be contentious since you have received feedback from a few folks and they
are all suggesting the same thing.

On Sun, Oct 27, 2019 at 9:27 PM jincheng sun 
wrote:

> Hi all,
>
> Thanks a lot for your feedback. It seems that we have reached consensus
> that both "Approach 2" and "Approach 3" are needed. "Approach 3" is a good
> supplement for "Approach 2" and I also prefer "Approach 2" and "Approach 3"
> for now.
>
> Do we need to vote on this discussion or I can create JIRAs and submit the
> PRs directly?
>
> Best,
> Jincheng
>
> Luke Cwik  于2019年10月26日周六 上午4:01写道:
>
>> Approach 3 is about caching the bundle descriptor forever but tearing
>> down a "live" instance of the DoFns at some SDK chosen arbitrary point in
>> time. This way if a future ProcessBundleRequest comes in, a new "live"
>> instance can be constructed.
>> Approach 2 is still needed so that when the workers are being
>> shutdown all the "live" instances are torn down.
>>
>> On Fri, Oct 25, 2019 at 12:56 PM Robert Burke  wrote:
>>
>>> Approach 2 isn't incompatible with approach 3. 3 simple sets down
>>> convention/configuration for the conditions when the SDK will do this after
>>> process bundle has completed.
>>>
>>>
>>>
>>> On Fri, Oct 25, 2019, 12:34 PM Robert Bradshaw 
>>> wrote:
>>>
 I think we'll still need approach (2) for when the pipeline finishes
 and a runner is tearing down workers.

 On Fri, Oct 25, 2019 at 10:36 AM Maximilian Michels 
 wrote:
 >
 > Hi Jincheng,
 >
 > Thanks for bringing this up and capturing the ideas in the doc.
 >
 > Intuitively, I would have also considered adding a new Proto message
 for
 > the teardown, but I think the idea to trigger this logic when the SDK
 > Harness evicts process bundle descriptors is more elegant.
 >
 > Thanks,
 > Max
 >
 > On 25.10.19 17:23, Luke Cwik wrote:
 > > I like approach 3 since it doesn't add additional complexity to the
 API
 > > and individual SDKs can choose to implement any clean-up strategy
 they
 > > want or none at all which is the simplest.
 > >
 > > On Thu, Oct 24, 2019 at 8:46 PM jincheng sun <
 sunjincheng...@gmail.com
 > > > wrote:
 > >
 > > Hi,
 > >
 > > Thanks for your comments in doc, I have add Approach 3 which you
 > > mentioned! @Luke
 > >
 > > For now, we should do a decision for Approach 3 and Approach 1.
 > > Detail can be found in doc [1]
 > >
 > > Welcome anyone's feedback :)
 > >
 > > Regards,
 > > Jincheng
 > >
 > > [1]
 > >
 https://docs.google.com/document/d/1sCgy9VQPf9zVXKRquK8P6N4x7aB62GEO8ozkujRSHZg/edit?usp=sharing
 > >
 > > jincheng sun >>> > > > 于2019年10月25日周五 上午10:40写道:
 > >
 > > Hi,
 > >
 > > Functionally capable of `abort`, but it will be called at
 the
 > > end of operator. So, I prefer `dispose` semantics. i.e., all
 > > normal logic has been executed.
 > >
 > > Best,
 > > Jincheng
 > >
 > > Harsh Vardhan >>> anan...@google.com>>
 > > 于2019年10月23日周三 上午12:14写道:
 > >
 > > Would approach 1 be akin to abort semantics?
 > >
 > > On Mon, Oct 21, 2019 at 8:01 PM jincheng sun
 > > >>> sunjincheng...@gmail.com>>
 > > wrote:
 > >
 > > Hi Luke,
 > >
 > > Thanks a lot for your reply. Since it allows to
 share
 > > one SDK harness between multiple executable stages,
 the
 > > control service termination may occur much later
 than
 > > the completion of an executable stage. This is the
 main
 > > reason I prefer runners to control the teardown of
 DoFns.
 > >
 > > Regarding to "SDK harnesses can terminate instances
 any
 > > time they want and start new instances anytime as
 > > well.", personally I think it's not conflict with
 the
 > > proposed Approach 1 as the SDK harness could decide
 what
 > > to do when receiving the teardown request. It could
 do
 > > nothing if the DoFns has already been teared down
 and
 > > could also tear down the DoFns if needed.
 > >
 > > What do you think?
 > >
 > > Best,
 > > Jincheng
 > >
 > > Luke Cwik >>> lc...@google.com>>
 > > 于2019年10月22日周二 上午2:05写道:
 > >
 > > Approach 2 is currently the suggested
 approach[1]
 

Re: [DISCUSS] How to stopp SdkWorker in SdkHarness

2019-10-27 Thread jincheng sun
Hi all,

Thanks a lot for your feedback. It seems that we have reached consensus
that both "Approach 2" and "Approach 3" are needed. "Approach 3" is a good
supplement for "Approach 2" and I also prefer "Approach 2" and "Approach 3"
for now.

Do we need to vote on this discussion or I can create JIRAs and submit the
PRs directly?

Best,
Jincheng

Luke Cwik  于2019年10月26日周六 上午4:01写道:

> Approach 3 is about caching the bundle descriptor forever but tearing down
> a "live" instance of the DoFns at some SDK chosen arbitrary point in time.
> This way if a future ProcessBundleRequest comes in, a new "live" instance
> can be constructed.
> Approach 2 is still needed so that when the workers are being shutdown all
> the "live" instances are torn down.
>
> On Fri, Oct 25, 2019 at 12:56 PM Robert Burke  wrote:
>
>> Approach 2 isn't incompatible with approach 3. 3 simple sets down
>> convention/configuration for the conditions when the SDK will do this after
>> process bundle has completed.
>>
>>
>>
>> On Fri, Oct 25, 2019, 12:34 PM Robert Bradshaw 
>> wrote:
>>
>>> I think we'll still need approach (2) for when the pipeline finishes
>>> and a runner is tearing down workers.
>>>
>>> On Fri, Oct 25, 2019 at 10:36 AM Maximilian Michels 
>>> wrote:
>>> >
>>> > Hi Jincheng,
>>> >
>>> > Thanks for bringing this up and capturing the ideas in the doc.
>>> >
>>> > Intuitively, I would have also considered adding a new Proto message
>>> for
>>> > the teardown, but I think the idea to trigger this logic when the SDK
>>> > Harness evicts process bundle descriptors is more elegant.
>>> >
>>> > Thanks,
>>> > Max
>>> >
>>> > On 25.10.19 17:23, Luke Cwik wrote:
>>> > > I like approach 3 since it doesn't add additional complexity to the
>>> API
>>> > > and individual SDKs can choose to implement any clean-up strategy
>>> they
>>> > > want or none at all which is the simplest.
>>> > >
>>> > > On Thu, Oct 24, 2019 at 8:46 PM jincheng sun <
>>> sunjincheng...@gmail.com
>>> > > > wrote:
>>> > >
>>> > > Hi,
>>> > >
>>> > > Thanks for your comments in doc, I have add Approach 3 which you
>>> > > mentioned! @Luke
>>> > >
>>> > > For now, we should do a decision for Approach 3 and Approach 1.
>>> > > Detail can be found in doc [1]
>>> > >
>>> > > Welcome anyone's feedback :)
>>> > >
>>> > > Regards,
>>> > > Jincheng
>>> > >
>>> > > [1]
>>> > >
>>> https://docs.google.com/document/d/1sCgy9VQPf9zVXKRquK8P6N4x7aB62GEO8ozkujRSHZg/edit?usp=sharing
>>> > >
>>> > > jincheng sun >> > > > 于2019年10月25日周五 上午10:40写道:
>>> > >
>>> > > Hi,
>>> > >
>>> > > Functionally capable of `abort`, but it will be called at the
>>> > > end of operator. So, I prefer `dispose` semantics. i.e., all
>>> > > normal logic has been executed.
>>> > >
>>> > > Best,
>>> > > Jincheng
>>> > >
>>> > > Harsh Vardhan mailto:anan...@google.com
>>> >>
>>> > > 于2019年10月23日周三 上午12:14写道:
>>> > >
>>> > > Would approach 1 be akin to abort semantics?
>>> > >
>>> > > On Mon, Oct 21, 2019 at 8:01 PM jincheng sun
>>> > > >> sunjincheng...@gmail.com>>
>>> > > wrote:
>>> > >
>>> > > Hi Luke,
>>> > >
>>> > > Thanks a lot for your reply. Since it allows to share
>>> > > one SDK harness between multiple executable stages,
>>> the
>>> > > control service termination may occur much later than
>>> > > the completion of an executable stage. This is the
>>> main
>>> > > reason I prefer runners to control the teardown of
>>> DoFns.
>>> > >
>>> > > Regarding to "SDK harnesses can terminate instances
>>> any
>>> > > time they want and start new instances anytime as
>>> > > well.", personally I think it's not conflict with the
>>> > > proposed Approach 1 as the SDK harness could decide
>>> what
>>> > > to do when receiving the teardown request. It could
>>> do
>>> > > nothing if the DoFns has already been teared down and
>>> > > could also tear down the DoFns if needed.
>>> > >
>>> > > What do you think?
>>> > >
>>> > > Best,
>>> > > Jincheng
>>> > >
>>> > > Luke Cwik mailto:lc...@google.com
>>> >>
>>> > > 于2019年10月22日周二 上午2:05写道:
>>> > >
>>> > > Approach 2 is currently the suggested approach[1]
>>> > > for DoFn's to shutdown.
>>> > > Note that SDK harnesses can terminate instances
>>> any
>>> > > time they want and start new instances anytime
>>> as well.
>>> > >
>>> > > Why do you want to expose this logic so that
>>> Runners
>>> > > could control it?
>>> > >
>>> > > 

Re: [DISCUSS] How to stopp SdkWorker in SdkHarness

2019-10-25 Thread Luke Cwik
Approach 3 is about caching the bundle descriptor forever but tearing down
a "live" instance of the DoFns at some SDK chosen arbitrary point in time.
This way if a future ProcessBundleRequest comes in, a new "live" instance
can be constructed.
Approach 2 is still needed so that when the workers are being shutdown all
the "live" instances are torn down.

On Fri, Oct 25, 2019 at 12:56 PM Robert Burke  wrote:

> Approach 2 isn't incompatible with approach 3. 3 simple sets down
> convention/configuration for the conditions when the SDK will do this after
> process bundle has completed.
>
>
>
> On Fri, Oct 25, 2019, 12:34 PM Robert Bradshaw 
> wrote:
>
>> I think we'll still need approach (2) for when the pipeline finishes
>> and a runner is tearing down workers.
>>
>> On Fri, Oct 25, 2019 at 10:36 AM Maximilian Michels 
>> wrote:
>> >
>> > Hi Jincheng,
>> >
>> > Thanks for bringing this up and capturing the ideas in the doc.
>> >
>> > Intuitively, I would have also considered adding a new Proto message for
>> > the teardown, but I think the idea to trigger this logic when the SDK
>> > Harness evicts process bundle descriptors is more elegant.
>> >
>> > Thanks,
>> > Max
>> >
>> > On 25.10.19 17:23, Luke Cwik wrote:
>> > > I like approach 3 since it doesn't add additional complexity to the
>> API
>> > > and individual SDKs can choose to implement any clean-up strategy they
>> > > want or none at all which is the simplest.
>> > >
>> > > On Thu, Oct 24, 2019 at 8:46 PM jincheng sun <
>> sunjincheng...@gmail.com
>> > > > wrote:
>> > >
>> > > Hi,
>> > >
>> > > Thanks for your comments in doc, I have add Approach 3 which you
>> > > mentioned! @Luke
>> > >
>> > > For now, we should do a decision for Approach 3 and Approach 1.
>> > > Detail can be found in doc [1]
>> > >
>> > > Welcome anyone's feedback :)
>> > >
>> > > Regards,
>> > > Jincheng
>> > >
>> > > [1]
>> > >
>> https://docs.google.com/document/d/1sCgy9VQPf9zVXKRquK8P6N4x7aB62GEO8ozkujRSHZg/edit?usp=sharing
>> > >
>> > > jincheng sun > > > > 于2019年10月25日周五 上午10:40写道:
>> > >
>> > > Hi,
>> > >
>> > > Functionally capable of `abort`, but it will be called at the
>> > > end of operator. So, I prefer `dispose` semantics. i.e., all
>> > > normal logic has been executed.
>> > >
>> > > Best,
>> > > Jincheng
>> > >
>> > > Harsh Vardhan mailto:anan...@google.com
>> >>
>> > > 于2019年10月23日周三 上午12:14写道:
>> > >
>> > > Would approach 1 be akin to abort semantics?
>> > >
>> > > On Mon, Oct 21, 2019 at 8:01 PM jincheng sun
>> > > > sunjincheng...@gmail.com>>
>> > > wrote:
>> > >
>> > > Hi Luke,
>> > >
>> > > Thanks a lot for your reply. Since it allows to share
>> > > one SDK harness between multiple executable stages,
>> the
>> > > control service termination may occur much later than
>> > > the completion of an executable stage. This is the
>> main
>> > > reason I prefer runners to control the teardown of
>> DoFns.
>> > >
>> > > Regarding to "SDK harnesses can terminate instances
>> any
>> > > time they want and start new instances anytime as
>> > > well.", personally I think it's not conflict with the
>> > > proposed Approach 1 as the SDK harness could decide
>> what
>> > > to do when receiving the teardown request. It could do
>> > > nothing if the DoFns has already been teared down and
>> > > could also tear down the DoFns if needed.
>> > >
>> > > What do you think?
>> > >
>> > > Best,
>> > > Jincheng
>> > >
>> > > Luke Cwik mailto:lc...@google.com
>> >>
>> > > 于2019年10月22日周二 上午2:05写道:
>> > >
>> > > Approach 2 is currently the suggested approach[1]
>> > > for DoFn's to shutdown.
>> > > Note that SDK harnesses can terminate instances
>> any
>> > > time they want and start new instances anytime as
>> well.
>> > >
>> > > Why do you want to expose this logic so that
>> Runners
>> > > could control it?
>> > >
>> > > 1:
>> > >
>> https://docs.google.com/document/d/1n6s3BOxOPct3uF4UgbbI9O9rpdiKWFH9R6mtVmR7xp0/edit#
>> > >
>> > > On Mon, Oct 21, 2019 at 4:27 AM jincheng sun
>> > > > > > > wrote:
>> > >
>> > > Hi,
>> > > I found that in `SdkHarness` do not  stop the
>> > > `SdkWorker` when finish.  We should add the
>> > > logic for stop the `SdkWorker` in

Re: [DISCUSS] How to stopp SdkWorker in SdkHarness

2019-10-25 Thread Robert Burke
Approach 2 isn't incompatible with approach 3. 3 simple sets down
convention/configuration for the conditions when the SDK will do this after
process bundle has completed.



On Fri, Oct 25, 2019, 12:34 PM Robert Bradshaw  wrote:

> I think we'll still need approach (2) for when the pipeline finishes
> and a runner is tearing down workers.
>
> On Fri, Oct 25, 2019 at 10:36 AM Maximilian Michels 
> wrote:
> >
> > Hi Jincheng,
> >
> > Thanks for bringing this up and capturing the ideas in the doc.
> >
> > Intuitively, I would have also considered adding a new Proto message for
> > the teardown, but I think the idea to trigger this logic when the SDK
> > Harness evicts process bundle descriptors is more elegant.
> >
> > Thanks,
> > Max
> >
> > On 25.10.19 17:23, Luke Cwik wrote:
> > > I like approach 3 since it doesn't add additional complexity to the API
> > > and individual SDKs can choose to implement any clean-up strategy they
> > > want or none at all which is the simplest.
> > >
> > > On Thu, Oct 24, 2019 at 8:46 PM jincheng sun  > > > wrote:
> > >
> > > Hi,
> > >
> > > Thanks for your comments in doc, I have add Approach 3 which you
> > > mentioned! @Luke
> > >
> > > For now, we should do a decision for Approach 3 and Approach 1.
> > > Detail can be found in doc [1]
> > >
> > > Welcome anyone's feedback :)
> > >
> > > Regards,
> > > Jincheng
> > >
> > > [1]
> > >
> https://docs.google.com/document/d/1sCgy9VQPf9zVXKRquK8P6N4x7aB62GEO8ozkujRSHZg/edit?usp=sharing
> > >
> > > jincheng sun  > > > 于2019年10月25日周五 上午10:40写道:
> > >
> > > Hi,
> > >
> > > Functionally capable of `abort`, but it will be called at the
> > > end of operator. So, I prefer `dispose` semantics. i.e., all
> > > normal logic has been executed.
> > >
> > > Best,
> > > Jincheng
> > >
> > > Harsh Vardhan mailto:anan...@google.com>>
> > > 于2019年10月23日周三 上午12:14写道:
> > >
> > > Would approach 1 be akin to abort semantics?
> > >
> > > On Mon, Oct 21, 2019 at 8:01 PM jincheng sun
> > > mailto:sunjincheng...@gmail.com
> >>
> > > wrote:
> > >
> > > Hi Luke,
> > >
> > > Thanks a lot for your reply. Since it allows to share
> > > one SDK harness between multiple executable stages, the
> > > control service termination may occur much later than
> > > the completion of an executable stage. This is the main
> > > reason I prefer runners to control the teardown of
> DoFns.
> > >
> > > Regarding to "SDK harnesses can terminate instances any
> > > time they want and start new instances anytime as
> > > well.", personally I think it's not conflict with the
> > > proposed Approach 1 as the SDK harness could decide
> what
> > > to do when receiving the teardown request. It could do
> > > nothing if the DoFns has already been teared down and
> > > could also tear down the DoFns if needed.
> > >
> > > What do you think?
> > >
> > > Best,
> > > Jincheng
> > >
> > > Luke Cwik mailto:lc...@google.com>>
> > > 于2019年10月22日周二 上午2:05写道:
> > >
> > > Approach 2 is currently the suggested approach[1]
> > > for DoFn's to shutdown.
> > > Note that SDK harnesses can terminate instances any
> > > time they want and start new instances anytime as
> well.
> > >
> > > Why do you want to expose this logic so that
> Runners
> > > could control it?
> > >
> > > 1:
> > >
> https://docs.google.com/document/d/1n6s3BOxOPct3uF4UgbbI9O9rpdiKWFH9R6mtVmR7xp0/edit#
> > >
> > > On Mon, Oct 21, 2019 at 4:27 AM jincheng sun
> > >  > > > wrote:
> > >
> > > Hi,
> > > I found that in `SdkHarness` do not  stop the
> > > `SdkWorker` when finish.  We should add the
> > > logic for stop the `SdkWorker` in `SdkHarness`.
> > > More detail can be found [1].
> > >
> > > There are two approaches to solve this issue:
> > >
> > > Approach 1:  We can add a Fn API for teardown
> > > purpose and the runner will teardown a specific
> > > bundle descriptor via this teardown Fn API
> > > during disposing.
> > > Approach 2: The control service termination
> > > could be seen as a signal and once SDK 

Re: [DISCUSS] How to stopp SdkWorker in SdkHarness

2019-10-25 Thread Robert Bradshaw
I think we'll still need approach (2) for when the pipeline finishes
and a runner is tearing down workers.

On Fri, Oct 25, 2019 at 10:36 AM Maximilian Michels  wrote:
>
> Hi Jincheng,
>
> Thanks for bringing this up and capturing the ideas in the doc.
>
> Intuitively, I would have also considered adding a new Proto message for
> the teardown, but I think the idea to trigger this logic when the SDK
> Harness evicts process bundle descriptors is more elegant.
>
> Thanks,
> Max
>
> On 25.10.19 17:23, Luke Cwik wrote:
> > I like approach 3 since it doesn't add additional complexity to the API
> > and individual SDKs can choose to implement any clean-up strategy they
> > want or none at all which is the simplest.
> >
> > On Thu, Oct 24, 2019 at 8:46 PM jincheng sun  > > wrote:
> >
> > Hi,
> >
> > Thanks for your comments in doc, I have add Approach 3 which you
> > mentioned! @Luke
> >
> > For now, we should do a decision for Approach 3 and Approach 1.
> > Detail can be found in doc [1]
> >
> > Welcome anyone's feedback :)
> >
> > Regards,
> > Jincheng
> >
> > [1]
> > 
> > https://docs.google.com/document/d/1sCgy9VQPf9zVXKRquK8P6N4x7aB62GEO8ozkujRSHZg/edit?usp=sharing
> >
> > jincheng sun  > > 于2019年10月25日周五 上午10:40写道:
> >
> > Hi,
> >
> > Functionally capable of `abort`, but it will be called at the
> > end of operator. So, I prefer `dispose` semantics. i.e., all
> > normal logic has been executed.
> >
> > Best,
> > Jincheng
> >
> > Harsh Vardhan mailto:anan...@google.com>>
> > 于2019年10月23日周三 上午12:14写道:
> >
> > Would approach 1 be akin to abort semantics?
> >
> > On Mon, Oct 21, 2019 at 8:01 PM jincheng sun
> > mailto:sunjincheng...@gmail.com>>
> > wrote:
> >
> > Hi Luke,
> >
> > Thanks a lot for your reply. Since it allows to share
> > one SDK harness between multiple executable stages, the
> > control service termination may occur much later than
> > the completion of an executable stage. This is the main
> > reason I prefer runners to control the teardown of DoFns.
> >
> > Regarding to "SDK harnesses can terminate instances any
> > time they want and start new instances anytime as
> > well.", personally I think it's not conflict with the
> > proposed Approach 1 as the SDK harness could decide what
> > to do when receiving the teardown request. It could do
> > nothing if the DoFns has already been teared down and
> > could also tear down the DoFns if needed.
> >
> > What do you think?
> >
> > Best,
> > Jincheng
> >
> > Luke Cwik mailto:lc...@google.com>>
> > 于2019年10月22日周二 上午2:05写道:
> >
> > Approach 2 is currently the suggested approach[1]
> > for DoFn's to shutdown.
> > Note that SDK harnesses can terminate instances any
> > time they want and start new instances anytime as well.
> >
> > Why do you want to expose this logic so that Runners
> > could control it?
> >
> > 1:
> > 
> > https://docs.google.com/document/d/1n6s3BOxOPct3uF4UgbbI9O9rpdiKWFH9R6mtVmR7xp0/edit#
> >
> > On Mon, Oct 21, 2019 at 4:27 AM jincheng sun
> >  > > wrote:
> >
> > Hi,
> > I found that in `SdkHarness` do not  stop the
> > `SdkWorker` when finish.  We should add the
> > logic for stop the `SdkWorker` in `SdkHarness`.
> > More detail can be found [1].
> >
> > There are two approaches to solve this issue:
> >
> > Approach 1:  We can add a Fn API for teardown
> > purpose and the runner will teardown a specific
> > bundle descriptor via this teardown Fn API
> > during disposing.
> > Approach 2: The control service termination
> > could be seen as a signal and once SDK harness
> > receives this signal, the teardown of the bundle
> > descriptor will be performed.
> >
> > More detail can be found in [2].
> >
> > As the Approach 2, SDK harness could be shared
> > between multiple executable stages. The control
> > service termination only occurs when all the
> >

Re: [DISCUSS] How to stopp SdkWorker in SdkHarness

2019-10-25 Thread Luke Cwik
I like approach 3 since it doesn't add additional complexity to the API and
individual SDKs can choose to implement any clean-up strategy they want or
none at all which is the simplest.

On Thu, Oct 24, 2019 at 8:46 PM jincheng sun 
wrote:

> Hi,
>
> Thanks for your comments in doc, I have add Approach 3 which you
> mentioned! @Luke
>
> For now, we should do a decision for Approach 3 and Approach 1. Detail can
> be found in doc [1]
>
> Welcome anyone's feedback :)
>
> Regards,
> Jincheng
>
> [1]
> https://docs.google.com/document/d/1sCgy9VQPf9zVXKRquK8P6N4x7aB62GEO8ozkujRSHZg/edit?usp=sharing
>
> jincheng sun  于2019年10月25日周五 上午10:40写道:
>
>> Hi,
>>
>> Functionally capable of `abort`, but it will be called at the end of
>> operator. So, I prefer `dispose` semantics. i.e., all normal logic has been
>> executed.
>>
>> Best,
>> Jincheng
>>
>> Harsh Vardhan  于2019年10月23日周三 上午12:14写道:
>>
>>> Would approach 1 be akin to abort semantics?
>>>
>>> On Mon, Oct 21, 2019 at 8:01 PM jincheng sun 
>>> wrote:
>>>
 Hi Luke,

 Thanks a lot for your reply. Since it allows to share one SDK harness
 between multiple executable stages, the control service termination may
 occur much later than the completion of an executable stage. This is the
 main reason I prefer runners to control the teardown of DoFns.

 Regarding to "SDK harnesses can terminate instances any time they want
 and start new instances anytime as well.", personally I think it's not
 conflict with the proposed Approach 1 as the SDK harness could decide what
 to do when receiving the teardown request. It could do nothing if the DoFns
 has already been teared down and could also tear down the DoFns if needed.

 What do you think?

 Best,
 Jincheng

 Luke Cwik  于2019年10月22日周二 上午2:05写道:

> Approach 2 is currently the suggested approach[1] for DoFn's to
> shutdown.
> Note that SDK harnesses can terminate instances any time they want and
> start new instances anytime as well.
>
> Why do you want to expose this logic so that Runners could control it?
>
> 1:
> https://docs.google.com/document/d/1n6s3BOxOPct3uF4UgbbI9O9rpdiKWFH9R6mtVmR7xp0/edit#
>
> On Mon, Oct 21, 2019 at 4:27 AM jincheng sun 
> wrote:
>
>> Hi,
>> I found that in `SdkHarness` do not  stop the `SdkWorker` when
>> finish.  We should add the logic for stop the `SdkWorker` in 
>> `SdkHarness`.
>> More detail can be found [1].
>>
>> There are two approaches to solve this issue:
>>
>> Approach 1:  We can add a Fn API for teardown purpose and the runner
>> will teardown a specific bundle descriptor via this teardown Fn API 
>> during
>> disposing.
>> Approach 2: The control service termination could be seen as a signal
>> and once SDK harness receives this signal, the teardown of the bundle
>> descriptor will be performed.
>>
>> More detail can be found in [2].
>>
>> As the Approach 2, SDK harness could be shared between multiple
>> executable stages. The control service termination only occurs when all 
>> the
>> executable stages sharing the same SDK harness finished. This means that
>> the teardown of DoFns may not be executed immediately after an executable
>> stage is finished.
>>
>> So, I prefer Approach 1. Welcome any feedback :)
>>
>> Best,
>> Jincheng
>>
>> [1]
>> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/worker/sdk_worker.py
>> [2]
>> https://docs.google.com/document/d/1sCgy9VQPf9zVXKRquK8P6N4x7aB62GEO8ozkujRSHZg/edit?usp=sharing
>>
> --
>>>
>>> Got feedback? go/harsh-feedback 
>>>
>>


Re: [DISCUSS] How to stopp SdkWorker in SdkHarness

2019-10-24 Thread jincheng sun
Hi,

Thanks for your comments in doc, I have add Approach 3 which you
mentioned! @Luke

For now, we should do a decision for Approach 3 and Approach 1. Detail can
be found in doc [1]

Welcome anyone's feedback :)

Regards,
Jincheng

[1]
https://docs.google.com/document/d/1sCgy9VQPf9zVXKRquK8P6N4x7aB62GEO8ozkujRSHZg/edit?usp=sharing

jincheng sun  于2019年10月25日周五 上午10:40写道:

> Hi,
>
> Functionally capable of `abort`, but it will be called at the end of
> operator. So, I prefer `dispose` semantics. i.e., all normal logic has been
> executed.
>
> Best,
> Jincheng
>
> Harsh Vardhan  于2019年10月23日周三 上午12:14写道:
>
>> Would approach 1 be akin to abort semantics?
>>
>> On Mon, Oct 21, 2019 at 8:01 PM jincheng sun 
>> wrote:
>>
>>> Hi Luke,
>>>
>>> Thanks a lot for your reply. Since it allows to share one SDK harness
>>> between multiple executable stages, the control service termination may
>>> occur much later than the completion of an executable stage. This is the
>>> main reason I prefer runners to control the teardown of DoFns.
>>>
>>> Regarding to "SDK harnesses can terminate instances any time they want
>>> and start new instances anytime as well.", personally I think it's not
>>> conflict with the proposed Approach 1 as the SDK harness could decide what
>>> to do when receiving the teardown request. It could do nothing if the DoFns
>>> has already been teared down and could also tear down the DoFns if needed.
>>>
>>> What do you think?
>>>
>>> Best,
>>> Jincheng
>>>
>>> Luke Cwik  于2019年10月22日周二 上午2:05写道:
>>>
 Approach 2 is currently the suggested approach[1] for DoFn's to
 shutdown.
 Note that SDK harnesses can terminate instances any time they want and
 start new instances anytime as well.

 Why do you want to expose this logic so that Runners could control it?

 1:
 https://docs.google.com/document/d/1n6s3BOxOPct3uF4UgbbI9O9rpdiKWFH9R6mtVmR7xp0/edit#

 On Mon, Oct 21, 2019 at 4:27 AM jincheng sun 
 wrote:

> Hi,
> I found that in `SdkHarness` do not  stop the `SdkWorker` when
> finish.  We should add the logic for stop the `SdkWorker` in `SdkHarness`.
> More detail can be found [1].
>
> There are two approaches to solve this issue:
>
> Approach 1:  We can add a Fn API for teardown purpose and the runner
> will teardown a specific bundle descriptor via this teardown Fn API during
> disposing.
> Approach 2: The control service termination could be seen as a signal
> and once SDK harness receives this signal, the teardown of the bundle
> descriptor will be performed.
>
> More detail can be found in [2].
>
> As the Approach 2, SDK harness could be shared between multiple
> executable stages. The control service termination only occurs when all 
> the
> executable stages sharing the same SDK harness finished. This means that
> the teardown of DoFns may not be executed immediately after an executable
> stage is finished.
>
> So, I prefer Approach 1. Welcome any feedback :)
>
> Best,
> Jincheng
>
> [1]
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/worker/sdk_worker.py
> [2]
> https://docs.google.com/document/d/1sCgy9VQPf9zVXKRquK8P6N4x7aB62GEO8ozkujRSHZg/edit?usp=sharing
>
 --
>>
>> Got feedback? go/harsh-feedback 
>>
>


Re: [DISCUSS] How to stopp SdkWorker in SdkHarness

2019-10-24 Thread jincheng sun
Hi,

Functionally capable of `abort`, but it will be called at the end of
operator. So, I prefer `dispose` semantics. i.e., all normal logic has been
executed.

Best,
Jincheng

Harsh Vardhan  于2019年10月23日周三 上午12:14写道:

> Would approach 1 be akin to abort semantics?
>
> On Mon, Oct 21, 2019 at 8:01 PM jincheng sun 
> wrote:
>
>> Hi Luke,
>>
>> Thanks a lot for your reply. Since it allows to share one SDK harness
>> between multiple executable stages, the control service termination may
>> occur much later than the completion of an executable stage. This is the
>> main reason I prefer runners to control the teardown of DoFns.
>>
>> Regarding to "SDK harnesses can terminate instances any time they want
>> and start new instances anytime as well.", personally I think it's not
>> conflict with the proposed Approach 1 as the SDK harness could decide what
>> to do when receiving the teardown request. It could do nothing if the DoFns
>> has already been teared down and could also tear down the DoFns if needed.
>>
>> What do you think?
>>
>> Best,
>> Jincheng
>>
>> Luke Cwik  于2019年10月22日周二 上午2:05写道:
>>
>>> Approach 2 is currently the suggested approach[1] for DoFn's to shutdown.
>>> Note that SDK harnesses can terminate instances any time they want and
>>> start new instances anytime as well.
>>>
>>> Why do you want to expose this logic so that Runners could control it?
>>>
>>> 1:
>>> https://docs.google.com/document/d/1n6s3BOxOPct3uF4UgbbI9O9rpdiKWFH9R6mtVmR7xp0/edit#
>>>
>>> On Mon, Oct 21, 2019 at 4:27 AM jincheng sun 
>>> wrote:
>>>
 Hi,
 I found that in `SdkHarness` do not  stop the `SdkWorker` when finish.
 We should add the logic for stop the `SdkWorker` in `SdkHarness`.  More
 detail can be found [1].

 There are two approaches to solve this issue:

 Approach 1:  We can add a Fn API for teardown purpose and the runner
 will teardown a specific bundle descriptor via this teardown Fn API during
 disposing.
 Approach 2: The control service termination could be seen as a signal
 and once SDK harness receives this signal, the teardown of the bundle
 descriptor will be performed.

 More detail can be found in [2].

 As the Approach 2, SDK harness could be shared between multiple
 executable stages. The control service termination only occurs when all the
 executable stages sharing the same SDK harness finished. This means that
 the teardown of DoFns may not be executed immediately after an executable
 stage is finished.

 So, I prefer Approach 1. Welcome any feedback :)

 Best,
 Jincheng

 [1]
 https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/worker/sdk_worker.py
 [2]
 https://docs.google.com/document/d/1sCgy9VQPf9zVXKRquK8P6N4x7aB62GEO8ozkujRSHZg/edit?usp=sharing

>>> --
>
> Got feedback? go/harsh-feedback 
>


Re: [DISCUSS] How to stopp SdkWorker in SdkHarness

2019-10-22 Thread Harsh Vardhan
Would approach 1 be akin to abort semantics?

On Mon, Oct 21, 2019 at 8:01 PM jincheng sun 
wrote:

> Hi Luke,
>
> Thanks a lot for your reply. Since it allows to share one SDK harness
> between multiple executable stages, the control service termination may
> occur much later than the completion of an executable stage. This is the
> main reason I prefer runners to control the teardown of DoFns.
>
> Regarding to "SDK harnesses can terminate instances any time they want and
> start new instances anytime as well.", personally I think it's not conflict
> with the proposed Approach 1 as the SDK harness could decide what to do
> when receiving the teardown request. It could do nothing if the DoFns has
> already been teared down and could also tear down the DoFns if needed.
>
> What do you think?
>
> Best,
> Jincheng
>
> Luke Cwik  于2019年10月22日周二 上午2:05写道:
>
>> Approach 2 is currently the suggested approach[1] for DoFn's to shutdown.
>> Note that SDK harnesses can terminate instances any time they want and
>> start new instances anytime as well.
>>
>> Why do you want to expose this logic so that Runners could control it?
>>
>> 1:
>> https://docs.google.com/document/d/1n6s3BOxOPct3uF4UgbbI9O9rpdiKWFH9R6mtVmR7xp0/edit#
>>
>> On Mon, Oct 21, 2019 at 4:27 AM jincheng sun 
>> wrote:
>>
>>> Hi,
>>> I found that in `SdkHarness` do not  stop the `SdkWorker` when finish.
>>> We should add the logic for stop the `SdkWorker` in `SdkHarness`.  More
>>> detail can be found [1].
>>>
>>> There are two approaches to solve this issue:
>>>
>>> Approach 1:  We can add a Fn API for teardown purpose and the runner
>>> will teardown a specific bundle descriptor via this teardown Fn API during
>>> disposing.
>>> Approach 2: The control service termination could be seen as a signal
>>> and once SDK harness receives this signal, the teardown of the bundle
>>> descriptor will be performed.
>>>
>>> More detail can be found in [2].
>>>
>>> As the Approach 2, SDK harness could be shared between multiple
>>> executable stages. The control service termination only occurs when all the
>>> executable stages sharing the same SDK harness finished. This means that
>>> the teardown of DoFns may not be executed immediately after an executable
>>> stage is finished.
>>>
>>> So, I prefer Approach 1. Welcome any feedback :)
>>>
>>> Best,
>>> Jincheng
>>>
>>> [1]
>>> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/worker/sdk_worker.py
>>> [2]
>>> https://docs.google.com/document/d/1sCgy9VQPf9zVXKRquK8P6N4x7aB62GEO8ozkujRSHZg/edit?usp=sharing
>>>
>> --

Got feedback? go/harsh-feedback 


Re: [DISCUSS] How to stopp SdkWorker in SdkHarness

2019-10-21 Thread jincheng sun
Hi Luke,

Thanks a lot for your reply. Since it allows to share one SDK harness
between multiple executable stages, the control service termination may
occur much later than the completion of an executable stage. This is the
main reason I prefer runners to control the teardown of DoFns.

Regarding to "SDK harnesses can terminate instances any time they want and
start new instances anytime as well.", personally I think it's not conflict
with the proposed Approach 1 as the SDK harness could decide what to do
when receiving the teardown request. It could do nothing if the DoFns has
already been teared down and could also tear down the DoFns if needed.

What do you think?

Best,
Jincheng

Luke Cwik  于2019年10月22日周二 上午2:05写道:

> Approach 2 is currently the suggested approach[1] for DoFn's to shutdown.
> Note that SDK harnesses can terminate instances any time they want and
> start new instances anytime as well.
>
> Why do you want to expose this logic so that Runners could control it?
>
> 1:
> https://docs.google.com/document/d/1n6s3BOxOPct3uF4UgbbI9O9rpdiKWFH9R6mtVmR7xp0/edit#
>
> On Mon, Oct 21, 2019 at 4:27 AM jincheng sun 
> wrote:
>
>> Hi,
>> I found that in `SdkHarness` do not  stop the `SdkWorker` when finish.
>> We should add the logic for stop the `SdkWorker` in `SdkHarness`.  More
>> detail can be found [1].
>>
>> There are two approaches to solve this issue:
>>
>> Approach 1:  We can add a Fn API for teardown purpose and the runner will
>> teardown a specific bundle descriptor via this teardown Fn API during
>> disposing.
>> Approach 2: The control service termination could be seen as a signal and
>> once SDK harness receives this signal, the teardown of the bundle
>> descriptor will be performed.
>>
>> More detail can be found in [2].
>>
>> As the Approach 2, SDK harness could be shared between multiple
>> executable stages. The control service termination only occurs when all the
>> executable stages sharing the same SDK harness finished. This means that
>> the teardown of DoFns may not be executed immediately after an executable
>> stage is finished.
>>
>> So, I prefer Approach 1. Welcome any feedback :)
>>
>> Best,
>> Jincheng
>>
>> [1]
>> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/worker/sdk_worker.py
>> [2]
>> https://docs.google.com/document/d/1sCgy9VQPf9zVXKRquK8P6N4x7aB62GEO8ozkujRSHZg/edit?usp=sharing
>>
>


Re: [DISCUSS] How to stopp SdkWorker in SdkHarness

2019-10-21 Thread Luke Cwik
Approach 2 is currently the suggested approach[1] for DoFn's to shutdown.
Note that SDK harnesses can terminate instances any time they want and
start new instances anytime as well.

Why do you want to expose this logic so that Runners could control it?

1:
https://docs.google.com/document/d/1n6s3BOxOPct3uF4UgbbI9O9rpdiKWFH9R6mtVmR7xp0/edit#

On Mon, Oct 21, 2019 at 4:27 AM jincheng sun 
wrote:

> Hi,
> I found that in `SdkHarness` do not  stop the `SdkWorker` when finish.  We
> should add the logic for stop the `SdkWorker` in `SdkHarness`.  More detail
> can be found [1].
>
> There are two approaches to solve this issue:
>
> Approach 1:  We can add a Fn API for teardown purpose and the runner will
> teardown a specific bundle descriptor via this teardown Fn API during
> disposing.
> Approach 2: The control service termination could be seen as a signal and
> once SDK harness receives this signal, the teardown of the bundle
> descriptor will be performed.
>
> More detail can be found in [2].
>
> As the Approach 2, SDK harness could be shared between multiple executable
> stages. The control service termination only occurs when all the executable
> stages sharing the same SDK harness finished. This means that the teardown
> of DoFns may not be executed immediately after an executable stage is
> finished.
>
> So, I prefer Approach 1. Welcome any feedback :)
>
> Best,
> Jincheng
>
> [1]
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/worker/sdk_worker.py
> [2]
> https://docs.google.com/document/d/1sCgy9VQPf9zVXKRquK8P6N4x7aB62GEO8ozkujRSHZg/edit?usp=sharing
>


[DISCUSS] How to stopp SdkWorker in SdkHarness

2019-10-21 Thread jincheng sun
Hi,
I found that in `SdkHarness` do not  stop the `SdkWorker` when finish.  We
should add the logic for stop the `SdkWorker` in `SdkHarness`.  More detail
can be found [1].

There are two approaches to solve this issue:

Approach 1:  We can add a Fn API for teardown purpose and the runner will
teardown a specific bundle descriptor via this teardown Fn API during
disposing.
Approach 2: The control service termination could be seen as a signal and
once SDK harness receives this signal, the teardown of the bundle
descriptor will be performed.

More detail can be found in [2].

As the Approach 2, SDK harness could be shared between multiple executable
stages. The control service termination only occurs when all the executable
stages sharing the same SDK harness finished. This means that the teardown
of DoFns may not be executed immediately after an executable stage is
finished.

So, I prefer Approach 1. Welcome any feedback :)

Best,
Jincheng

[1]
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/worker/sdk_worker.py
[2]
https://docs.google.com/document/d/1sCgy9VQPf9zVXKRquK8P6N4x7aB62GEO8ozkujRSHZg/edit?usp=sharing