@TearDown refers to DoFn teardown not process teardown (it's basically a destructor). So it's also runner defined.
There may be a place for a container that lives as long as the process (not tied to the DoFn life). However that would be something new to add. On Fri, Feb 16, 2018, 8:52 AM Romain Manni-Bucau <rmannibu...@gmail.com> wrote: > finish bundle is well defined and must be called, right, not at the end so > you still miss teardown as a user. Bundles are defined by the runner and > you can have 100000 bundles per batch (even more for a stream ;)) so you > dont want to release your resources or handle you execution auditing in it, > you want it at the end so in tear down. > > So yes we must have teardown reliable somehow. > > > Romain Manni-Bucau > @rmannibucau <https://twitter.com/rmannibucau> | Blog > <https://rmannibucau.metawerx.net/> | Old Blog > <http://rmannibucau.wordpress.com> | Github > <https://github.com/rmannibucau> | LinkedIn > <https://www.linkedin.com/in/rmannibucau> | Book > <https://www.packtpub.com/application-development/java-ee-8-high-performance> > > 2018-02-16 17:43 GMT+01:00 Reuven Lax <re...@google.com>: > >> +1 I think @FinishBundle is the right thing to look at here. >> >> On Fri, Feb 16, 2018, 8:41 AM Jean-Baptiste Onofré <j...@nanthrax.net> >> wrote: >> >>> Hi Romain >>> >>> Is it not @FinishBundle your solution ? >>> >>> Regards >>> JB >>> Le 16 févr. 2018, à 17:06, Romain Manni-Bucau <rmannibu...@gmail.com> a >>> écrit: >>>> >>>> I see Reuven, so it is actually a broken contract for end users more >>>> than a bug. Concretely a user must have a way to execute code once the >>>> teardown is no more used and a teardown is populated by the user in the >>>> context of an execution. >>>> It means that if the environment wants to pool (cache) the instances it >>>> must provide a postBorrowFromCache and preReturnToCache to let the user >>>> handle that - we'll get back to EJB and passivation ;). >>>> >>>> Personally I think it is fine to cache the instances for the duration >>>> of an execution but not accross execution. Concretely if you check out the >>>> API it should just not be possible for a runner since the lifecycle is not >>>> covered and the fact teardown can not be called today is an implementation >>>> bug/leak surfacing the API. >>>> >>>> So I see 2 options: >>>> >>>> 1. make it mandatory and get rid of the caching - which shouldnt help >>>> much in current state in terms of perf >>>> 2. keep teardown a final release object (which is not that useful cause >>>> of the end of the sentence) and add a clean cache lifecycle management >>>> >>>> tempted to say 1 is saner short terms, in particular cause beam is 2.x >>>> and users already use it this way. >>>> >>>> >>>> >>>> Romain Manni-Bucau >>>> @rmannibucau <https://twitter.com/rmannibucau> | Blog >>>> <https://rmannibucau.metawerx.net/> | Old Blog >>>> <http://rmannibucau.wordpress.com> | Github >>>> <https://github.com/rmannibucau> | LinkedIn >>>> <https://www.linkedin.com/in/rmannibucau> | Book >>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance> >>>> >>>> 2018-02-16 16:53 GMT+01:00 Reuven Lax <re...@google.com>: >>>> >>>>> So the concern is that @TearDown might not be called? >>>>> >>>>> Let's understand the reason for @TearDown. The runner is free to cache >>>>> the DoFn object across many invocations, and indeed in streaming this is >>>>> often a critical optimization. However if the runner does decide to >>>>> destroy >>>>> the DoFn object (e.g. because it's being evicted from cache), often users >>>>> need a callback to tear down associated resources (file handles, RPC >>>>> connections, etc.). >>>>> >>>>> Now @TearDown isn't guaranteed to be called for a simple reason: the >>>>> runner might never tear down the DoFn object! The runner might well decide >>>>> to cache the object forever, in which case there is never .a time to call >>>>> @TearDown. There is no violation of semantics here. >>>>> >>>>> Also, the point about not calling teardown if the JVM crashes might >>>>> well sound implicit with no need to mention it. However empirically users >>>>> do misunderstand even this, so it's worth mentioning. >>>>> >>>>> Reuven >>>>> >>>>> On Fri, Feb 16, 2018 at 2:11 AM, Romain Manni-Bucau < >>>>> rmannibu...@gmail.com> wrote: >>>>> >>>>>> Hi guys, >>>>>> >>>>>> I'm a bit concerned of this PR >>>>>> https://github.com/apache/beam/pull/4637 >>>>>> >>>>>> I understand the intent but I'd like to share how I see it and why it >>>>>> is an issue for me: >>>>>> >>>>>> 1. you can't help if the JVM crash in any case. Tomcat had a try to >>>>>> preallocate some memory for instance to free it in case of OOME and try >>>>>> to >>>>>> recover but it never prooved useful and got dropped recently. This is a >>>>>> good example you can't do anything if there is a cataclism and therefore >>>>>> any framework or lib will not be blamed for it >>>>>> 2. if you expose an API, its behavior must be well defined. In the >>>>>> case of a portable library like Beam it is even more important otherwise >>>>>> it >>>>>> leads users to not use the API or the projet :(. >>>>>> >>>>>> >>>>>> These two points lead to say that if the JVM crashes it is ok to not >>>>>> call teardown and it is even implicit in any programming environment so >>>>>> no >>>>>> need to mention it. However that a runner doesn't call teardown is a bug >>>>>> and not a feature or something intended because it can have a huge impact >>>>>> on the user flow. >>>>>> >>>>>> The user workarounds are to use custom threads with timeouts to >>>>>> execute the actions or things like that, all bad solutions to replace a >>>>>> buggy API, if you remove the contract guarantee. >>>>>> >>>>>> To make it obvious: substring(from, to): will substring the current >>>>>> string between from and to...or not. Would you use the function? >>>>>> >>>>>> What I ask is to add in the javadoc that the contract enforces the >>>>>> runner to call that. Which means the core doesn't guarantee it but >>>>>> imposes >>>>>> the runner to do so. This way the not portable behavior is where it >>>>>> belongs >>>>>> to, in the vendor specific code. It leads to a reliable API for the end >>>>>> user and let runners document they don't respect - yet - the API when >>>>>> relevant. >>>>>> >>>>>> wdyt? >>>>>> >>>>>> Romain Manni-Bucau >>>>>> @rmannibucau <https://twitter.com/rmannibucau> | Blog >>>>>> <https://rmannibucau.metawerx.net/> | Old Blog >>>>>> <http://rmannibucau.wordpress.com> | Github >>>>>> <https://github.com/rmannibucau> | LinkedIn >>>>>> <https://www.linkedin.com/in/rmannibucau> | Book >>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance> >>>>>> >>>>> >>>>> >>>> >