+1 I think @FinishBundle is the right thing to look at here. On Fri, Feb 16, 2018, 8:41 AM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
> Hi Romain > > Is it not @FinishBundle your solution ? > > Regards > JB > Le 16 févr. 2018, à 17:06, Romain Manni-Bucau <rmannibu...@gmail.com> a > écrit: >> >> I see Reuven, so it is actually a broken contract for end users more than >> a bug. Concretely a user must have a way to execute code once the teardown >> is no more used and a teardown is populated by the user in the context of >> an execution. >> It means that if the environment wants to pool (cache) the instances it >> must provide a postBorrowFromCache and preReturnToCache to let the user >> handle that - we'll get back to EJB and passivation ;). >> >> Personally I think it is fine to cache the instances for the duration of >> an execution but not accross execution. Concretely if you check out the API >> it should just not be possible for a runner since the lifecycle is not >> covered and the fact teardown can not be called today is an implementation >> bug/leak surfacing the API. >> >> So I see 2 options: >> >> 1. make it mandatory and get rid of the caching - which shouldnt help >> much in current state in terms of perf >> 2. keep teardown a final release object (which is not that useful cause >> of the end of the sentence) and add a clean cache lifecycle management >> >> tempted to say 1 is saner short terms, in particular cause beam is 2.x >> and users already use it this way. >> >> >> >> Romain Manni-Bucau >> @rmannibucau <https://twitter.com/rmannibucau> | Blog >> <https://rmannibucau.metawerx.net/> | Old Blog >> <http://rmannibucau.wordpress.com> | Github >> <https://github.com/rmannibucau> | LinkedIn >> <https://www.linkedin.com/in/rmannibucau> | Book >> <https://www.packtpub.com/application-development/java-ee-8-high-performance> >> >> 2018-02-16 16:53 GMT+01:00 Reuven Lax <re...@google.com>: >> >>> So the concern is that @TearDown might not be called? >>> >>> Let's understand the reason for @TearDown. The runner is free to cache >>> the DoFn object across many invocations, and indeed in streaming this is >>> often a critical optimization. However if the runner does decide to destroy >>> the DoFn object (e.g. because it's being evicted from cache), often users >>> need a callback to tear down associated resources (file handles, RPC >>> connections, etc.). >>> >>> Now @TearDown isn't guaranteed to be called for a simple reason: the >>> runner might never tear down the DoFn object! The runner might well decide >>> to cache the object forever, in which case there is never .a time to call >>> @TearDown. There is no violation of semantics here. >>> >>> Also, the point about not calling teardown if the JVM crashes might well >>> sound implicit with no need to mention it. However empirically users do >>> misunderstand even this, so it's worth mentioning. >>> >>> Reuven >>> >>> On Fri, Feb 16, 2018 at 2:11 AM, Romain Manni-Bucau < >>> rmannibu...@gmail.com> wrote: >>> >>>> Hi guys, >>>> >>>> I'm a bit concerned of this PR >>>> https://github.com/apache/beam/pull/4637 >>>> >>>> I understand the intent but I'd like to share how I see it and why it >>>> is an issue for me: >>>> >>>> 1. you can't help if the JVM crash in any case. Tomcat had a try to >>>> preallocate some memory for instance to free it in case of OOME and try to >>>> recover but it never prooved useful and got dropped recently. This is a >>>> good example you can't do anything if there is a cataclism and therefore >>>> any framework or lib will not be blamed for it >>>> 2. if you expose an API, its behavior must be well defined. In the case >>>> of a portable library like Beam it is even more important otherwise it >>>> leads users to not use the API or the projet :(. >>>> >>>> >>>> These two points lead to say that if the JVM crashes it is ok to not >>>> call teardown and it is even implicit in any programming environment so no >>>> need to mention it. However that a runner doesn't call teardown is a bug >>>> and not a feature or something intended because it can have a huge impact >>>> on the user flow. >>>> >>>> The user workarounds are to use custom threads with timeouts to execute >>>> the actions or things like that, all bad solutions to replace a buggy API, >>>> if you remove the contract guarantee. >>>> >>>> To make it obvious: substring(from, to): will substring the current >>>> string between from and to...or not. Would you use the function? >>>> >>>> What I ask is to add in the javadoc that the contract enforces the >>>> runner to call that. Which means the core doesn't guarantee it but imposes >>>> the runner to do so. This way the not portable behavior is where it belongs >>>> to, in the vendor specific code. It leads to a reliable API for the end >>>> user and let runners document they don't respect - yet - the API when >>>> relevant. >>>> >>>> wdyt? >>>> >>>> Romain Manni-Bucau >>>> @rmannibucau <https://twitter.com/rmannibucau> | Blog >>>> <https://rmannibucau.metawerx.net/> | Old Blog >>>> <http://rmannibucau.wordpress.com> | Github >>>> <https://github.com/rmannibucau> | LinkedIn >>>> <https://www.linkedin.com/in/rmannibucau> | Book >>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance> >>>> >>> >>> >>