I see Reuven, so it is actually a broken contract for end users more than a bug. Concretely a user must have a way to execute code once the teardown is no more used and a teardown is populated by the user in the context of an execution. It means that if the environment wants to pool (cache) the instances it must provide a postBorrowFromCache and preReturnToCache to let the user handle that - we'll get back to EJB and passivation ;).
Personally I think it is fine to cache the instances for the duration of an execution but not accross execution. Concretely if you check out the API it should just not be possible for a runner since the lifecycle is not covered and the fact teardown can not be called today is an implementation bug/leak surfacing the API. So I see 2 options: 1. make it mandatory and get rid of the caching - which shouldnt help much in current state in terms of perf 2. keep teardown a final release object (which is not that useful cause of the end of the sentence) and add a clean cache lifecycle management tempted to say 1 is saner short terms, in particular cause beam is 2.x and users already use it this way. Romain Manni-Bucau @rmannibucau <https://twitter.com/rmannibucau> | Blog <https://rmannibucau.metawerx.net/> | Old Blog <http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> | LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book <https://www.packtpub.com/application-development/java-ee-8-high-performance> 2018-02-16 16:53 GMT+01:00 Reuven Lax <re...@google.com>: > So the concern is that @TearDown might not be called? > > Let's understand the reason for @TearDown. The runner is free to cache the > DoFn object across many invocations, and indeed in streaming this is often > a critical optimization. However if the runner does decide to destroy the > DoFn object (e.g. because it's being evicted from cache), often users need > a callback to tear down associated resources (file handles, RPC > connections, etc.). > > Now @TearDown isn't guaranteed to be called for a simple reason: the > runner might never tear down the DoFn object! The runner might well decide > to cache the object forever, in which case there is never .a time to call > @TearDown. There is no violation of semantics here. > > Also, the point about not calling teardown if the JVM crashes might well > sound implicit with no need to mention it. However empirically users do > misunderstand even this, so it's worth mentioning. > > Reuven > > On Fri, Feb 16, 2018 at 2:11 AM, Romain Manni-Bucau <rmannibu...@gmail.com > > wrote: > >> Hi guys, >> >> I'm a bit concerned of this PR https://github.com/apache/beam/pull/4637 >> >> I understand the intent but I'd like to share how I see it and why it is >> an issue for me: >> >> 1. you can't help if the JVM crash in any case. Tomcat had a try to >> preallocate some memory for instance to free it in case of OOME and try to >> recover but it never prooved useful and got dropped recently. This is a >> good example you can't do anything if there is a cataclism and therefore >> any framework or lib will not be blamed for it >> 2. if you expose an API, its behavior must be well defined. In the case >> of a portable library like Beam it is even more important otherwise it >> leads users to not use the API or the projet :(. >> >> >> These two points lead to say that if the JVM crashes it is ok to not call >> teardown and it is even implicit in any programming environment so no need >> to mention it. However that a runner doesn't call teardown is a bug and not >> a feature or something intended because it can have a huge impact on the >> user flow. >> >> The user workarounds are to use custom threads with timeouts to execute >> the actions or things like that, all bad solutions to replace a buggy API, >> if you remove the contract guarantee. >> >> To make it obvious: substring(from, to): will substring the current >> string between from and to...or not. Would you use the function? >> >> What I ask is to add in the javadoc that the contract enforces the runner >> to call that. Which means the core doesn't guarantee it but imposes the >> runner to do so. This way the not portable behavior is where it belongs to, >> in the vendor specific code. It leads to a reliable API for the end user >> and let runners document they don't respect - yet - the API when relevant. >> >> wdyt? >> >> Romain Manni-Bucau >> @rmannibucau <https://twitter.com/rmannibucau> | Blog >> <https://rmannibucau.metawerx.net/> | Old Blog >> <http://rmannibucau.wordpress.com> | Github >> <https://github.com/rmannibucau> | LinkedIn >> <https://www.linkedin.com/in/rmannibucau> | Book >> <https://www.packtpub.com/application-development/java-ee-8-high-performance> >> > >