Re: @TearDown guarantees

Reuven Lax Fri, 16 Feb 2018 08:44:15 -0800

+1 I think @FinishBundle is the right thing to look at here.

On Fri, Feb 16, 2018, 8:41 AM Jean-Baptiste Onofré <[email protected]> wrote:


> Hi Romain
>
> Is it not @FinishBundle your solution ?
>
> Regards
> JB
> Le 16 févr. 2018, à 17:06, Romain Manni-Bucau <[email protected]> a
> écrit:
>>
>> I see Reuven, so it is actually a broken contract for end users more than
>> a bug. Concretely a user must have a way to execute code once the teardown
>> is no more used and a teardown is populated by the user in the context of
>> an execution.
>> It means that if the environment wants to pool (cache) the instances it
>> must provide a postBorrowFromCache and preReturnToCache to let the user
>> handle that - we'll get back to EJB and passivation ;).
>>
>> Personally I think it is fine to cache the instances for the duration of
>> an execution but not accross execution. Concretely if you check out the API
>> it should just not be possible for a runner since the lifecycle is not
>> covered and the fact teardown can not be called today is an implementation
>> bug/leak surfacing the API.
>>
>> So I see 2 options:
>>
>> 1. make it mandatory and get rid of the caching - which shouldnt help
>> much in current state in terms of perf
>> 2. keep teardown a final release object (which is not that useful cause
>> of the end of the sentence) and add a clean cache lifecycle management
>>
>> tempted to say 1 is saner short terms, in particular cause beam is 2.x
>> and users already use it this way.
>>
>>
>>
>> Romain Manni-Bucau
>> @rmannibucau <https://twitter.com/rmannibucau> |   Blog
>> <https://rmannibucau.metawerx.net/> | Old Blog
>> <http://rmannibucau.wordpress.com> |  Github
>> <https://github.com/rmannibucau> | LinkedIn
>> <https://www.linkedin.com/in/rmannibucau> | Book
>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>
>> 2018-02-16 16:53 GMT+01:00 Reuven Lax <[email protected]>:
>>
>>> So the concern is that @TearDown might not be called?
>>>
>>> Let's understand the reason for @TearDown. The runner is free to cache
>>> the DoFn object across many invocations, and indeed in streaming this is
>>> often a critical optimization. However if the runner does decide to destroy
>>> the DoFn object (e.g. because it's being evicted from cache), often users
>>> need a callback to tear down associated resources (file handles, RPC
>>> connections, etc.).
>>>
>>> Now @TearDown isn't guaranteed to be called for a simple reason: the
>>> runner might never tear down the DoFn object! The runner might well decide
>>> to cache the object forever, in which case there is never .a time to call
>>> @TearDown. There is no violation of semantics here.
>>>
>>> Also, the point about not calling teardown if the JVM crashes might well
>>> sound implicit with no need to mention it. However empirically users do
>>> misunderstand even this, so it's worth mentioning.
>>>
>>> Reuven
>>>
>>> On Fri, Feb 16, 2018 at 2:11 AM, Romain Manni-Bucau <
>>> [email protected]> wrote:
>>>
>>>> Hi guys,
>>>>
>>>> I'm a bit concerned of this PR
>>>> https://github.com/apache/beam/pull/4637
>>>>
>>>> I understand the intent but I'd like to share how I see it and why it
>>>> is an issue for me:
>>>>
>>>> 1. you can't help if the JVM crash in any case. Tomcat had a try to
>>>> preallocate some memory for instance to free it in case of OOME and try to
>>>> recover but it never prooved useful and got dropped recently. This is a
>>>> good example you can't do anything if there is a cataclism and therefore
>>>> any framework or lib will not be blamed for it
>>>> 2. if you expose an API, its behavior must be well defined. In the case
>>>> of a portable library like Beam it is even more important otherwise it
>>>> leads users to not use the API or the projet :(.
>>>>
>>>>
>>>> These two points lead to say that if the JVM crashes it is ok to not
>>>> call teardown and it is even implicit in any programming environment so no
>>>> need to mention it. However that a runner doesn't call teardown is a bug
>>>> and not a feature or something intended because it can have a huge impact
>>>> on the user flow.
>>>>
>>>> The user workarounds are to use custom threads with timeouts to execute
>>>> the actions or things like that, all bad solutions to replace a buggy API,
>>>> if you remove the contract guarantee.
>>>>
>>>> To make it obvious: substring(from, to): will substring the current
>>>> string between from and to...or not. Would you use the function?
>>>>
>>>> What I ask is to add in the javadoc that the contract enforces the
>>>> runner to call that. Which means the core doesn't guarantee it but imposes
>>>> the runner to do so. This way the not portable behavior is where it belongs
>>>> to, in the vendor specific code. It leads to a reliable API for the end
>>>> user and let runners document they don't respect - yet - the API when
>>>> relevant.
>>>>
>>>> wdyt?
>>>>
>>>> Romain Manni-Bucau
>>>> @rmannibucau <https://twitter.com/rmannibucau> |   Blog
>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>> <http://rmannibucau.wordpress.com> |  Github
>>>> <https://github.com/rmannibucau> | LinkedIn
>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>
>>>
>>>
>>

Re: @TearDown guarantees

Reply via email to