I like approach 3 since it doesn't add additional complexity to the API and
individual SDKs can choose to implement any clean-up strategy they want or
none at all which is the simplest.

On Thu, Oct 24, 2019 at 8:46 PM jincheng sun <[email protected]>
wrote:

> Hi,
>
> Thanks for your comments in doc, I have add Approach 3 which you
> mentioned! @Luke
>
> For now, we should do a decision for Approach 3 and Approach 1. Detail can
> be found in doc [1]
>
> Welcome anyone's feedback :)
>
> Regards,
> Jincheng
>
> [1]
> https://docs.google.com/document/d/1sCgy9VQPf9zVXKRquK8P6N4x7aB62GEO8ozkujRSHZg/edit?usp=sharing
>
> jincheng sun <[email protected]> 于2019年10月25日周五 上午10:40写道:
>
>> Hi,
>>
>> Functionally capable of `abort`, but it will be called at the end of
>> operator. So, I prefer `dispose` semantics. i.e., all normal logic has been
>> executed.
>>
>> Best,
>> Jincheng
>>
>> Harsh Vardhan <[email protected]> 于2019年10月23日周三 上午12:14写道:
>>
>>> Would approach 1 be akin to abort semantics?
>>>
>>> On Mon, Oct 21, 2019 at 8:01 PM jincheng sun <[email protected]>
>>> wrote:
>>>
>>>> Hi Luke,
>>>>
>>>> Thanks a lot for your reply. Since it allows to share one SDK harness
>>>> between multiple executable stages, the control service termination may
>>>> occur much later than the completion of an executable stage. This is the
>>>> main reason I prefer runners to control the teardown of DoFns.
>>>>
>>>> Regarding to "SDK harnesses can terminate instances any time they want
>>>> and start new instances anytime as well.", personally I think it's not
>>>> conflict with the proposed Approach 1 as the SDK harness could decide what
>>>> to do when receiving the teardown request. It could do nothing if the DoFns
>>>> has already been teared down and could also tear down the DoFns if needed.
>>>>
>>>> What do you think?
>>>>
>>>> Best,
>>>> Jincheng
>>>>
>>>> Luke Cwik <[email protected]> 于2019年10月22日周二 上午2:05写道:
>>>>
>>>>> Approach 2 is currently the suggested approach[1] for DoFn's to
>>>>> shutdown.
>>>>> Note that SDK harnesses can terminate instances any time they want and
>>>>> start new instances anytime as well.
>>>>>
>>>>> Why do you want to expose this logic so that Runners could control it?
>>>>>
>>>>> 1:
>>>>> https://docs.google.com/document/d/1n6s3BOxOPct3uF4UgbbI9O9rpdiKWFH9R6mtVmR7xp0/edit#
>>>>>
>>>>> On Mon, Oct 21, 2019 at 4:27 AM jincheng sun <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>> I found that in `SdkHarness` do not  stop the `SdkWorker` when
>>>>>> finish.  We should add the logic for stop the `SdkWorker` in 
>>>>>> `SdkHarness`.
>>>>>> More detail can be found [1].
>>>>>>
>>>>>> There are two approaches to solve this issue:
>>>>>>
>>>>>> Approach 1:  We can add a Fn API for teardown purpose and the runner
>>>>>> will teardown a specific bundle descriptor via this teardown Fn API 
>>>>>> during
>>>>>> disposing.
>>>>>> Approach 2: The control service termination could be seen as a signal
>>>>>> and once SDK harness receives this signal, the teardown of the bundle
>>>>>> descriptor will be performed.
>>>>>>
>>>>>> More detail can be found in [2].
>>>>>>
>>>>>> As the Approach 2, SDK harness could be shared between multiple
>>>>>> executable stages. The control service termination only occurs when all 
>>>>>> the
>>>>>> executable stages sharing the same SDK harness finished. This means that
>>>>>> the teardown of DoFns may not be executed immediately after an executable
>>>>>> stage is finished.
>>>>>>
>>>>>> So, I prefer Approach 1. Welcome any feedback :)
>>>>>>
>>>>>> Best,
>>>>>> Jincheng
>>>>>>
>>>>>> [1]
>>>>>> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/worker/sdk_worker.py
>>>>>> [2]
>>>>>> https://docs.google.com/document/d/1sCgy9VQPf9zVXKRquK8P6N4x7aB62GEO8ozkujRSHZg/edit?usp=sharing
>>>>>>
>>>>> --
>>>
>>> Got feedback? go/harsh-feedback <https://goto.google.com/harsh-feedback>
>>>
>>

Reply via email to