Re: @TearDown guarantees

2018-02-21 Thread Ismaël Mejía
Hello, thanks Eugene for improving the documentation so we can close this thread. Reuven, I understood the semantics of the methods; what surprised me was that I read the new documentation as saying a runner could simply skip calling @Teardown, and we have already dealt with the issues of not

Re: @TearDown guarantees

2018-02-20 Thread Romain Manni-Bucau
On Feb 21, 2018 07:26, "Reuven Lax" wrote: To close the loop here: Romain, I think your actual concern was that the Javadoc made it sound like a runner could simply decide not to call Teardown. If so, then I agree with you - the Javadoc was misleading (and it appears it was confusing to Ismael

Re: @TearDown guarantees

2018-02-20 Thread Reuven Lax
To close the loop here: Romain, I think your actual concern was that the Javadoc made it sound like a runner could simply decide not to call Teardown. If so, then I agree with you - the Javadoc was misleading (and it appears it was confusing to Ismael as well). If a runner destroys a DoFn, it _must_

Re: @TearDown guarantees

2018-02-19 Thread Reuven Lax
+1 This PR clarifies the semantics quite a bit. On Mon, Feb 19, 2018 at 3:24 PM, Eugene Kirpichov wrote: > I've sent out a PR editing the Javadoc https://github.com/apache/beam/pull/4711 . Hopefully, that should be sufficient. > > On Mon, Feb 19, 2018 at 3:20 PM Reuven Lax wrote: > >> Ismael

Re: @TearDown guarantees

2018-02-19 Thread Romain Manni-Bucau
+1, thanks Eugene

Re: @TearDown guarantees

2018-02-19 Thread Eugene Kirpichov
I've sent out a PR editing the Javadoc https://github.com/apache/beam/pull/4711 . Hopefully, that should be sufficient. On Mon, Feb 19, 2018 at 3:20 PM Reuven Lax wrote: > Ismael, your understanding is appropriate for FinishBundle. > > One basic issue with this understanding, is that the lifecyc

Re: @TearDown guarantees

2018-02-19 Thread Reuven Lax
Ismael, your understanding is appropriate for FinishBundle. One basic issue with this understanding, is that the lifecycle of a DoFn is much longer than a single bundle (which I think you expressed by adding the *s). How long the DoFn lives is not defined. In fact a runner is completely free to de

Re: @TearDown guarantees

2018-02-19 Thread Romain Manni-Bucau
Agreed, let's try one more time: is there any issue with removing "best effort"? If yes, is there any issue with explaining that it is due to failure and not a runner choice? If either of the two is fine, then we close this thread and just decide who fixes it; if not, we must define and discuss why there is a teardown now and how to impl

Re: @TearDown guarantees

2018-02-19 Thread Ismaël Mejía
I also had a different understanding of the lifecycle of a DoFn. My understanding of the use case for every method in the DoFn was clear and perfectly aligned with Thomas's explanation, but what I understood was that, in general terms, ‘@Setup was where I got resources/prepared connections and @Teard

Re: @TearDown guarantees

2018-02-19 Thread Eugene Kirpichov
Romain, would it be fair to say that currently the goal of your participation in this discussion is to identify situations where @Teardown in principle could have been called, but some of the current runners don't make a good enough effort to call it? If yes - as I said before, please, by all means

Re: @TearDown guarantees

2018-02-19 Thread Romain Manni-Bucau
On Feb 19, 2018 22:56, "Reuven Lax" wrote: On Mon, Feb 19, 2018 at 1:51 PM, Romain Manni-Bucau wrote: > > > On Feb 19, 2018 21:28, "Reuven Lax" wrote: > > How do you call teardown? There are cases in which the Java code gets no > indication that the restart is happening (e.g. cases w

Re: @TearDown guarantees

2018-02-19 Thread Reuven Lax
On Mon, Feb 19, 2018 at 1:51 PM, Romain Manni-Bucau wrote: > > > On Feb 19, 2018 21:28, "Reuven Lax" wrote: > > How do you call teardown? There are cases in which the Java code gets no > indication that the restart is happening (e.g. cases where the machine > itself is taken down) > > > This

Re: @TearDown guarantees

2018-02-19 Thread Romain Manni-Bucau
On Feb 19, 2018 21:28, "Reuven Lax" wrote: How do you call teardown? There are cases in which the Java code gets no indication that the restart is happening (e.g. cases where the machine itself is taken down) This is a bug, zero-downtime maintenance is very doable in 2018 ;). Crashes are bugs,

Re: @TearDown guarantees

2018-02-19 Thread Romain Manni-Bucau
On Feb 19, 2018 21:24, "Eugene Kirpichov" wrote: Okay, so then this is exactly how Teardown works already, as we've discussed above - no change needed (except perhaps a clarification in docs, as also suggested above - feel free to send a PR). Do you agree with this much? This is all I asked

Re: @TearDown guarantees

2018-02-19 Thread Reuven Lax
How do you call teardown? There are cases in which the Java code gets no indication that the restart is happening (e.g. cases where the machine itself is taken down) On Mon, Feb 19, 2018, 12:24 PM Romain Manni-Bucau wrote: > Restarting doesn't mean you don't call teardown. Except a bug there is no

Re: @TearDown guarantees

2018-02-19 Thread Eugene Kirpichov
Okay, so then this is exactly how Teardown works already, as we've discussed above - no change needed (except perhaps a clarification in docs, as also suggested above - feel free to send a PR). Do you agree with this much? I'm not sure I understand what's left to discuss in this thread. Teardown i

Re: @TearDown guarantees

2018-02-19 Thread Romain Manni-Bucau
Restarting doesn't mean you don't call teardown. Short of a bug, there is no reason - technically - for that to happen. On Feb 19, 2018 21:14, "Reuven Lax" wrote: > Workers restarting is not a bug; it's standard and often expected. > > On Mon, Feb 19, 2018, 12:03 PM Romain Manni-Bucau > wrote: >

Re: @TearDown guarantees

2018-02-19 Thread Reuven Lax
Workers restarting is not a bug; it's standard and often expected. On Mon, Feb 19, 2018, 12:03 PM Romain Manni-Bucau wrote: > Nothing, as mentioned it is a bug so recovery is a bug recovery > (procedure) > > On Feb 19, 2018 19:42, "Eugene Kirpichov" wrote: > >> So what would you like to happ

Re: @TearDown guarantees

2018-02-19 Thread Romain Manni-Bucau
Nothing; as mentioned, it is a bug, so recovery is a bug-recovery procedure. On Feb 19, 2018 19:42, "Eugene Kirpichov" wrote: > So what would you like to happen if there is a crash? The DoFn instance no > longer exists because the JVM it ran on no longer exists. What should > Teardown be cal

Re: @TearDown guarantees

2018-02-19 Thread Eugene Kirpichov
So what would you like to happen if there is a crash? The DoFn instance no longer exists because the JVM it ran on no longer exists. What should Teardown be called on? On Mon, Feb 19, 2018, 10:20 AM Romain Manni-Bucau wrote: > This is what I want and not 99 teardowns for 100 setups until

Re: @TearDown guarantees

2018-02-19 Thread Romain Manni-Bucau
This is what I want, and not 99 teardowns for 100 setups until there is an unexpected crash (= a bug). On Feb 19, 2018 18:57, "Reuven Lax" wrote: > > > On Mon, Feb 19, 2018 at 7:11 AM, Romain Manni-Bucau > wrote: > >> >> >> 2018-02-19 15:57 GMT+01:00 Reuven Lax : >> >>> >>> >>> On Mo

Re: @TearDown guarantees

2018-02-19 Thread Reuven Lax
On Mon, Feb 19, 2018 at 7:11 AM, Romain Manni-Bucau wrote: > > > 2018-02-19 15:57 GMT+01:00 Reuven Lax : > >> >> >> On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau < >> rmannibu...@gmail.com> wrote: >> >>> @Reuven: in practice it is created by pool of 256 but leads to the same >>> pattern, t

Re: @TearDown guarantees

2018-02-19 Thread Romain Manni-Bucau
2018-02-19 15:57 GMT+01:00 Reuven Lax : > > > On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau < > rmannibu...@gmail.com> wrote: > >> @Reuven: in practice it is created by pool of 256 but leads to the same >> pattern, the teardown is just a "if (iCreatedThem) releaseThem();" >> > > How do you

Re: @TearDown guarantees

2018-02-19 Thread Reuven Lax
On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau wrote: > @Reuven: in practice it is created by pool of 256 but leads to the same > pattern, the teardown is just a "if (iCreatedThem) releaseThem();" > How do you control "256?" Even if you have a pool of 256 workers, nothing in Beam guarantee

Re: @TearDown guarantees

2018-02-19 Thread Romain Manni-Bucau
@Reuven: in practice it is created by a pool of 256 but leads to the same pattern, the teardown is just a "if (iCreatedThem) releaseThem();" @Eugene: 1. wait logic is about passing the value, which is not always possible (like 15% of cases from my rough estimate) 2. SDF: I'll try to detail why I mention

Re: @TearDown guarantees

2018-02-18 Thread Eugene Kirpichov
The kind of whole-transform lifecycle you're mentioning can be accomplished using the Wait transform as I suggested in the thread above, and I believe it should become the canonical way to do that. (Would like to reiterate one more time, as the main author of most design documents related to SDF a

Re: @TearDown guarantees

2018-02-18 Thread Reuven Lax
On Sun, Feb 18, 2018 at 11:07 AM, Reuven Lax wrote: > > > On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau < > rmannibu...@gmail.com> wrote: > >> >> >> On Feb 18, 2018 19:28, "Ben Chambers" wrote: >> >> It feels like this thread may be a bit off-track. Rather than focusing on >> the seman

Re: @TearDown guarantees

2018-02-18 Thread Romain Manni-Bucau
I kind of agree, except transforms lack a lifecycle too. My understanding is that SDF could be a way to unify it and clean up the API. Otherwise, how do we normalize - in a single API - the lifecycle of transforms? On Feb 18, 2018 21:32, "Ben Chambers" wrote: > Are you sure that focusing on the cleanup of

Re: @TearDown guarantees

2018-02-18 Thread Ben Chambers
Are you sure that focusing on the cleanup of specific DoFns is appropriate? In many cases where cleanup is necessary, it is around an entire composite PTransform. I think there have been discussions/proposals around a more methodical "cleanup" option, but those haven't been implemented, to the best o

Re: @TearDown guarantees

2018-02-18 Thread Romain Manni-Bucau
Yes, 1M. Let me try to explain by simplifying the overall execution. Each instance - one fn, so likely in a thread of a worker - has its lifecycle. As a caricature: "new" and garbage collection. In practice, new is often an unsafe allocate (deserialization) but it doesn't matter here. What I want is an

Re: @TearDown guarantees

2018-02-18 Thread Reuven Lax
On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau wrote: > > > On Feb 18, 2018 19:28, "Ben Chambers" wrote: > > It feels like this thread may be a bit off-track. Rather than focusing on > the semantics of the existing methods -- which have been noted to meet > many existing use cases --

Re: @TearDown guarantees

2018-02-18 Thread Romain Manni-Bucau
On Feb 18, 2018 19:28, "Ben Chambers" wrote: It feels like this thread may be a bit off-track. Rather than focusing on the semantics of the existing methods -- which have been noted to meet many existing use cases -- it would be helpful to focus more on the reason you are looking for som

Re: @TearDown guarantees

2018-02-18 Thread Eugene Kirpichov
On Sun, Feb 18, 2018 at 10:25 AM Romain Manni-Bucau wrote: > 2018-02-18 19:19 GMT+01:00 Eugene Kirpichov : > >> FinishBundle has a stronger guarantee: if the pipeline succeeded, then it >> has been called for every succeeded bundle, and succeeded bundles together >> cover the entire input PCollec

Re: @TearDown guarantees

2018-02-18 Thread Ben Chambers
It feels like this thread may be a bit off-track. Rather than focusing on the semantics of the existing methods -- which have been noted to meet many existing use cases -- it would be helpful to focus more on the reason you are looking for something with different semantics. Some possibilitie

Re: @TearDown guarantees

2018-02-18 Thread Romain Manni-Bucau
2018-02-18 19:19 GMT+01:00 Eugene Kirpichov : > FinishBundle has a stronger guarantee: if the pipeline succeeded, then it > has been called for every succeeded bundle, and succeeded bundles together > cover the entire input PCollection. Of course, it may not have been called > for failed bundles.

Re: @TearDown guarantees

2018-02-18 Thread Eugene Kirpichov
FinishBundle has a stronger guarantee: if the pipeline succeeded, then it has been called for every succeeded bundle, and succeeded bundles together cover the entire input PCollection. Of course, it may not have been called for failed bundles. To anticipate a possible objection "why not also keep r
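The guarantee described here can be sketched in plain Java, without Beam on the classpath. This is only an illustration under stated assumptions: `BufferingWriter`, `FinishBundleSketch`, and the "fail once, then retry" policy are hypothetical names and choices, and the methods merely mirror the DoFn hooks by name. The point it shows is that when every bundle eventually succeeds, the flushes done in `finishBundle` together cover the whole input, while an attempt that fails never reaches `finishBundle`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical buffering writer; method names mirror the DoFn hooks but this is not the Beam API. */
class BufferingWriter {
    private final List<Integer> buffer = new ArrayList<>();
    final List<Integer> sink = new ArrayList<>();

    void startBundle() { buffer.clear(); }                        // drop leftovers of a failed attempt
    void process(int element) { buffer.add(element); }            // buffer only, nothing is visible yet
    void finishBundle() { sink.addAll(buffer); buffer.clear(); }  // the reliable "flush" point
}

class FinishBundleSketch {
    /** Runs every bundle; the bundle at failOnceIndex fails on its first attempt and is retried. */
    static List<Integer> run(List<List<Integer>> bundles, int failOnceIndex) {
        BufferingWriter writer = new BufferingWriter();
        for (int i = 0; i < bundles.size(); i++) {
            boolean failedOnce = false;
            while (true) {
                writer.startBundle();
                for (int element : bundles.get(i)) {
                    writer.process(element);
                }
                if (i == failOnceIndex && !failedOnce) {
                    failedOnce = true; // simulated failure: finishBundle is never reached
                    continue;          // retry the same bundle from scratch
                }
                writer.finishBundle();
                break;
            }
        }
        return writer.sink;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of(List.of(1, 2), List.of(3), List.of(4, 5)), 1));
    }
}
```

In a real runner the retry may well happen on another instance or another worker; the sketch keeps a single instance only to stay short.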

Re: @TearDown guarantees

2018-02-18 Thread Romain Manni-Bucau
2018-02-18 18:36 GMT+01:00 Eugene Kirpichov : > "Machine state" is overly low-level because many of the possible reasons > can happen on a perfectly fine machine. > If you'd like to rephrase it to "it will be called except in various > situations where it's logically impossible or impractical to g

Re: @TearDown guarantees

2018-02-18 Thread Eugene Kirpichov
"Machine state" is overly low-level because many of the possible reasons can happen on a perfectly fine machine. If you'd like to rephrase it to "it will be called except in various situations where it's logically impossible or impractical to guarantee that it's called", that's fine. Or you can lis

Re: @TearDown guarantees

2018-02-18 Thread Romain Manni-Bucau
Agreed, Eugene, except that "best effort" means that. It is also often used to say "at will" and this is what triggered this thread. I'm fine using "except if the machine state prevents it" but "best effort" is too open and can be very badly and wrongly perceived by users (like I did).

Re: @TearDown guarantees

2018-02-18 Thread Eugene Kirpichov
It will not be called if it's impossible to call it: in the example situation you have (intergalactic crash), and in a number of more common cases: e.g. in case the worker container has crashed (e.g. user code in a different thread called a C library over JNI and it segfaulted), JVM bug, crash due to u

Re: @TearDown guarantees

2018-02-18 Thread Romain Manni-Bucau
2018-02-18 18:00 GMT+01:00 Eugene Kirpichov : > > > On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau > wrote: > >> >> >> On Feb 18, 2018 00:23, "Kenneth Knowles" wrote: >> >> On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau < >> rmannibu...@gmail.com> wrote: >>> >>> If you give an example

Re: @TearDown guarantees

2018-02-18 Thread Eugene Kirpichov
On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau wrote: > > > On Feb 18, 2018 00:23, "Kenneth Knowles" wrote: > > On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau > wrote: >> >> If you give an example of a high-level need (e.g. "I'm trying to write an >> IO for system $x and it requires

Re: @TearDown guarantees

2018-02-18 Thread Romain Manni-Bucau
Yes, exactly, JB. I just want to ensure the SDK/core API is clear and well defined, and that any failure to respect it counts as a runner bug. What I don't want is for a buggy implementation to leak into the SDK/core definition.

Re: @TearDown guarantees

2018-02-18 Thread Jean-Baptiste Onofré
My bad, I thought you were talking about a guarantee in the Runner API. If it's a semantic point in the SDK (enforcement instead of best effort), and then if the runner doesn't respect that, it's a limitation/bug in the runner, I would agree with that. Regards JB On 18/02/2018 16:58, Romain Manni-Buca

Re: @TearDown guarantees

2018-02-18 Thread Romain Manni-Bucau
On Feb 18, 2018 15:39, "Jean-Baptiste Onofré" wrote: Hi, I think, as you said, it depends on the protocol and the IO. For instance, in the first version of JdbcIO, I created the connections in @Setup and released them in @Teardown. But, in the case of a streaming system, it's not so good (especially for

Re: @TearDown guarantees

2018-02-18 Thread Jean-Baptiste Onofré
Hi, I think, as you said, it depends on the protocol and the IO. For instance, in the first version of JdbcIO, I created the connections in @Setup and released them in @Teardown. But, in the case of a streaming system, it's not so good (especially for pooling) as the connection stays open for a very long time.

Re: @TearDown guarantees

2018-02-18 Thread Romain Manni-Bucau
On Feb 18, 2018 00:23, "Kenneth Knowles" wrote: On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau wrote: > > If you give an example of a high-level need (e.g. "I'm trying to write an > IO for system $x and it requires the following initialization and the > following cleanup logic and the f

Re: @TearDown guarantees

2018-02-17 Thread Kenneth Knowles
On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau wrote: > > If you give an example of a high-level need (e.g. "I'm trying to write an > IO for system $x and it requires the following initialization and the > following cleanup logic and the following processing in between") I'll be > better able

Re: @TearDown guarantees

2018-02-17 Thread Romain Manni-Bucau
On Feb 17, 2018 22:31, "Eugene Kirpichov" wrote: On Sat, Feb 17, 2018 at 1:10 PM Romain Manni-Bucau wrote: > You phrased it right Eugene - thanks for that. > > However the solution is not functional I think - hope I missed something. > With distribution etc you can't use by reference param p

Re: @TearDown guarantees

2018-02-17 Thread Eugene Kirpichov
On Sat, Feb 17, 2018 at 1:10 PM Romain Manni-Bucau wrote: > You phrased it right Eugene - thanks for that. > > However the solution is not functional I think - hope I missed something. > With distribution etc you can't use by reference param passing, therefore no > way to clean up the internal sta

Re: @TearDown guarantees

2018-02-17 Thread Romain Manni-Bucau
You phrased it right Eugene - thanks for that. However the solution is not functional I think - hope I missed something. With distribution etc. you can't use by-reference parameter passing, therefore there is no way to clean up the internal states of another fn. So I kind of feel we are back to the original need :(. A

Re: @TearDown guarantees

2018-02-17 Thread Jean-Baptiste Onofré
I agree, it's a decent assumption. Regards JB On 02/17/2018 05:59 PM, Romain Manni-Bucau wrote: > Assuming a Pipeline.run(); the corresponding sequence: > > WorkerStartFn(); > WorkerEndFn(); > > So a single instance of the fn for the full pipeline execution. > > On Feb 17, 2018 17:42, "Reuven

Re: @TearDown guarantees

2018-02-17 Thread Eugene Kirpichov
Actually the initialization should be treated using Wait transform too. So basically the pattern is just:
  input.apply(Wait.on(...initialization result...))
       .apply(...your processing...)
       .apply(Wait.on(...finalization result...))
where initialization and finalization results can be computed us
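The ordering this pattern is after (initialization, then processing, then finalization) can be mirrored in plain Java. The sketch below is not Beam code and does not use `Wait.on`; it swaps in `java.util.concurrent.CompletableFuture` purely as an analogy, and it assumes that finalization is gated on processing having completed, which the truncated message does not spell out. `WaitOrderingSketch` and the log strings are hypothetical.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

/** Analogy only: each stage starts when the previous stage's "result" is available. */
class WaitOrderingSketch {
    static List<String> run(List<String> input) {
        List<String> log = new CopyOnWriteArrayList<>();

        // Stage 1: initialization produces a completion signal.
        CompletableFuture<Void> init = CompletableFuture.runAsync(() -> log.add("init"));

        // Stage 2: processing is gated on the initialization signal.
        CompletableFuture<Void> processing = init.thenRun(() -> {
            for (String element : input) {
                log.add("process:" + element);
            }
        });

        // Stage 3: finalization is gated on processing (an assumption of this sketch).
        CompletableFuture<Void> finalization = processing.thenRun(() -> log.add("finalize"));

        finalization.join(); // block until the whole chain has completed
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("a", "b")));
    }
}
```

Because the stages are chained rather than started independently, the log order is deterministic even though the work may run on a pool thread.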

Re: @TearDown guarantees

2018-02-17 Thread Eugene Kirpichov
"Single instance of the fn for the full pipeline execution", if taken literally, is incompatible:
- with parallelization: requiring a single instance rules out multiple parallel/distributed instances
- with fault tolerance: what if the worker running this "single instance" crashes or becomes a zomb

Re: @TearDown guarantees

2018-02-17 Thread Romain Manni-Bucau
Assuming a Pipeline.run(); the corresponding sequence: WorkerStartFn(); WorkerEndFn(); So a single instance of the fn for the full pipeline execution. On Feb 17, 2018 17:42, "Reuven Lax" wrote: > " and a transform is by design bound to an execution" > > What do you mean by execution? > > O

Re: @TearDown guarantees

2018-02-17 Thread Reuven Lax
" and a transform is by design bound to an execution" What do you mean by execution? On Sat, Feb 17, 2018 at 12:50 AM, Romain Manni-Bucau wrote: > > > On Feb 16, 2018 22:41, "Reuven Lax" wrote: > > Kenn is correct. Allowing Fn reuse across bundles was a major, major > performance improveme

Re: @TearDown guarantees

2018-02-17 Thread Romain Manni-Bucau
On Feb 16, 2018 22:41, "Reuven Lax" wrote: Kenn is correct. Allowing Fn reuse across bundles was a major, major performance improvement. Profiling on the old Dataflow SDKs consistently showed Java serialization being the number one performance bottleneck for streaming pipelines, and Beam fixe

Re: @TearDown guarantees

2018-02-16 Thread Reuven Lax
Kenn is correct. Allowing Fn reuse across bundles was a major, major performance improvement. Profiling on the old Dataflow SDKs consistently showed Java serialization being the number one performance bottleneck for streaming pipelines, and Beam fixed this. Romain - can you state precisely what yo

Re: @TearDown guarantees

2018-02-16 Thread Kenneth Knowles
On Fri, Feb 16, 2018 at 1:00 PM, Romain Manni-Bucau wrote: > > The serialization of fn being once per bundle, the perf impact is only > huge if there is a bug somewhere else, even java serialization is > negligible on big config compared to any small pipeline (seconds vs > minutes). > Profiling

Re: @TearDown guarantees

2018-02-16 Thread Romain Manni-Bucau
On Feb 16, 2018 19:28, "Kenneth Knowles" wrote: On Fri, Feb 16, 2018 at 9:39 AM, Romain Manni-Bucau wrote: > > 2018-02-16 18:18 GMT+01:00 Kenneth Knowles : > >> Which runner's bundling are you concerned with? It sounds like the Flink >> runner? >> > > Flink, Spark, DirectRunner, DataFlow at

Re: @TearDown guarantees

2018-02-16 Thread Kenneth Knowles
On Fri, Feb 16, 2018 at 9:39 AM, Romain Manni-Bucau wrote: > > 2018-02-16 18:18 GMT+01:00 Kenneth Knowles : > >> Which runner's bundling are you concerned with? It sounds like the Flink >> runner? >> > > Flink, Spark, DirectRunner, DataFlow at least (others would be good but > are out of scope) >

Re: @TearDown guarantees

2018-02-16 Thread Thomas Groh
On perf: Deserialization of an arbitrary object is expensive. This cost is amortized over all of the elements that the object processes, but for a runner with small bundles, that cost never gets meaningfully amortized - deserializing a DoFn instance of unknown complexity to process one element mean
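The amortization argument is simple arithmetic: if one deserialization is shared by a bundle of n elements, the average cost per element is the processing cost plus the deserialization cost divided by n. A small sketch follows; the class name and the millisecond figures are made up for illustration and are not measurements from any runner.

```java
/** Illustrative arithmetic only; the figures passed in main are hypothetical. */
class AmortizationSketch {
    /** Average per-element cost when one deserialization is shared by a whole bundle. */
    static double perElementMs(double deserializeMs, double processMs, int bundleSize) {
        return processMs + deserializeMs / bundleSize;
    }

    public static void main(String[] args) {
        double deserializeMs = 10.0; // hypothetical cost of deserializing one fn instance
        double processMs = 0.01;     // hypothetical cost of processing one element

        // Bundle of one element: the deserialization cost dominates.
        System.out.println(perElementMs(deserializeMs, processMs, 1));
        // Bundle of 1000 elements: the same cost is spread thin.
        System.out.println(perElementMs(deserializeMs, processMs, 1000));
    }
}
```

With these made-up numbers, a one-element bundle pays the full deserialization for a single element, while a 1000-element bundle pays a thousandth of it per element, which is why reusing an instance across bundles matters most when bundles are small.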

Re: @TearDown guarantees

2018-02-16 Thread Romain Manni-Bucau
So do I get it right that a leak of the Dataflow implementation impacts the API? It also sounds like this perf issue is due to blind serialization instead of modeling what is serialized - nothing should be slow enough in the serialization at that level, do you have more details on that particular point? I

Re: @TearDown guarantees

2018-02-16 Thread Kenneth Knowles
Which runner's bundling are you concerned with? It sounds like the Flink runner? Kenn On Fri, Feb 16, 2018 at 9:04 AM, Romain Manni-Bucau wrote: > > 2018-02-16 17:59 GMT+01:00 Kenneth Knowles : > >> What I am hearing is this: >> >> - @FinishBundle does what you want (a reliable "flush" call)

Re: @TearDown guarantees

2018-02-16 Thread Thomas Groh
I'll note as well that you don't need a well defined DoFn lifecycle method - you just want less granular bundling, which is a different requirement. Teardown has well-defined interactions with the rest of the DoFn methods, and what the runner is permitted to do when it calls Teardown - the fact th

Re: @TearDown guarantees

2018-02-16 Thread Thomas Groh
Given that I'm the original author of both the @Setup and @Teardown methods and the PR under discussion, I thought I'd drop in to give a bit of history and my thoughts on the issue. Originally (Dataflow 1.x), the spec required a Runner to deserialize a new instance of a DoFn for every Bundle. F

Re: @TearDown guarantees

2018-02-16 Thread Romain Manni-Bucau
2018-02-16 17:59 GMT+01:00 Kenneth Knowles : > What I am hearing is this: > > - @FinishBundle does what you want (a reliable "flush" call) but your > runner is not doing a good job of bundling > Nope, finishbundle is defined but a bundle is not. Typically for 1 million rows I'll get 1 million calls

Re: @TearDown guarantees

2018-02-16 Thread Kenneth Knowles
What I am hearing is this:
- @FinishBundle does what you want (a reliable "flush" call) but your runner is not doing a good job of bundling
- @Teardown has well-defined semantics and they are not what you want
So you are hoping for something that is called less frequently but is still mandatory

Re: @TearDown guarantees

2018-02-16 Thread Reuven Lax
@TearDown refers to DoFn teardown not process teardown (it's basically a destructor). So it's also runner defined. There may be a place for a container that lives as long as the process (not tied to the DoFn life). However that would be something new to add. On Fri, Feb 16, 2018, 8:52 AM Romain M

Re: @TearDown guarantees

2018-02-16 Thread Romain Manni-Bucau
Finish bundle is well defined and must be called, right, but not at the end, so you still miss teardown as a user. Bundles are defined by the runner and you can have 10 bundles per batch (even more for a stream ;)) so you don't want to release your resources or handle your execution auditing in it, yo

Re: @TearDown guarantees

2018-02-16 Thread Reuven Lax
+1 I think @FinishBundle is the right thing to look at here. On Fri, Feb 16, 2018, 8:41 AM Jean-Baptiste Onofré wrote: > Hi Romain > > Isn't @FinishBundle your solution? > > Regards > JB > On Feb 16, 2018, at 17:06, Romain Manni-Bucau wrote: >> >> I see Reuven, so it is actually a brok

Re: @TearDown guarantees

2018-02-16 Thread Jean-Baptiste Onofré
Hi Romain Isn't @FinishBundle your solution? Regards JB On Feb 16, 2018, at 17:06, Romain Manni-Bucau wrote: >I see Reuven, so it is actually a broken contract for end users more >than a >bug. Concretely a user must have a way to execute code once the >teardown is >no longer used

Re: @TearDown guarantees

2018-02-16 Thread Reuven Lax
On Fri, Feb 16, 2018 at 8:06 AM, Romain Manni-Bucau wrote: > I see Reuven, so it is actually a broken contract for end users more than > a bug. Concretely a user must have a way to execute code once the teardown > is no longer used and a teardown is populated by the user in the context of > an exec

Re: @TearDown guarantees

2018-02-16 Thread Kenneth Knowles
It sounds like you just want @FinishBundle On Fri, Feb 16, 2018 at 8:06 AM, Romain Manni-Bucau wrote: > I see Reuven, so it is actually a broken contract for end users more than > a bug. Concretely a user must have a way to execute code once the teardown > is no longer used and a teardown is popul

Re: @TearDown guarantees

2018-02-16 Thread Romain Manni-Bucau
I see Reuven, so it is actually a broken contract for end users more than a bug. Concretely a user must have a way to execute code once the teardown is no longer used and a teardown is populated by the user in the context of an execution. It means that if the environment wants to pool (cache) the ins

Re: @TearDown guarantees

2018-02-16 Thread Reuven Lax
So the concern is that @TearDown might not be called? Let's understand the reason for @TearDown. The runner is free to cache the DoFn object across many invocations, and indeed in streaming this is often a critical optimization. However if the runner does decide to destroy the DoFn object (e.g. be
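The relationship described here (a runner may keep one instance across many bundles, and calls teardown when it discards that instance) can be sketched with a toy harness in plain Java. Nothing below is the Beam API: `LifecycleFn`, `RecordingFn`, `CachingRunner`, and `TeardownSketch` are hypothetical names, and the sketch only covers the orderly case where the runner itself decides to discard the instance, not a crash.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a DoFn with lifecycle hooks; not the Beam API. */
interface LifecycleFn {
    void setup();
    void processBundle(List<String> bundle);
    void teardown();
}

/** Records every lifecycle call so the ordering can be inspected. */
class RecordingFn implements LifecycleFn {
    final List<String> events = new ArrayList<>();
    public void setup() { events.add("setup"); }
    public void processBundle(List<String> bundle) { events.add("bundle:" + bundle.size()); }
    public void teardown() { events.add("teardown"); }
}

/** Toy runner: caches one fn instance across bundles and tears it down when discarding it. */
class CachingRunner {
    private final LifecycleFn fn;
    private boolean initialized = false;

    CachingRunner(LifecycleFn fn) { this.fn = fn; }

    void runBundle(List<String> bundle) {
        if (!initialized) {     // setup happens once per instance, not once per bundle
            fn.setup();
            initialized = true;
        }
        fn.processBundle(bundle);
    }

    void discard() {
        if (initialized) {      // the runner chose to drop the instance, so it calls teardown
            fn.teardown();
            initialized = false;
        }
    }
}

class TeardownSketch {
    static List<String> demo() {
        RecordingFn fn = new RecordingFn();
        CachingRunner runner = new CachingRunner(fn);
        runner.runBundle(List.of("a", "b"));
        runner.runBundle(List.of("c"));
        runner.discard();
        return fn.events;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

If the process dies between `runBundle` and `discard`, there is no instance left on which `teardown` could be called, which is the failure case debated elsewhere in the thread.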