Workers restarting is not a bug; it's standard and often expected.

On Mon, Feb 19, 2018, 12:03 PM Romain Manni-Bucau <rmannibu...@gmail.com> wrote:
> Nothing, as mentioned it is a bug, so recovery is a bug recovery (procedure).
>
> Le 19 févr. 2018 19:42, "Eugene Kirpichov" <kirpic...@google.com> a écrit :
>
>> So what would you like to happen if there is a crash? The DoFn instance no longer exists because the JVM it ran on no longer exists. What should Teardown be called on?
>>
>> On Mon, Feb 19, 2018, 10:20 AM Romain Manni-Bucau <rmannibu...@gmail.com> wrote:
>>
>>> This is what I want, and not 999999 teardowns for 1000000 setups until there is an unexpected crash (= a bug).
>>>
>>> Le 19 févr. 2018 18:57, "Reuven Lax" <re...@google.com> a écrit :
>>>
>>>> On Mon, Feb 19, 2018 at 7:11 AM, Romain Manni-Bucau <rmannibu...@gmail.com> wrote:
>>>>
>>>>> 2018-02-19 15:57 GMT+01:00 Reuven Lax <re...@google.com>:
>>>>>
>>>>>> On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau <rmannibu...@gmail.com> wrote:
>>>>>>
>>>>>>> @Reuven: in practice it is created by pools of 256 but leads to the same pattern; the teardown is just an "if (iCreatedThem) releaseThem();"
>>>>>>
>>>>>> How do you control "256"? Even if you have a pool of 256 workers, nothing in Beam guarantees how many threads and DoFns are created per worker. In theory the runner might decide to create 1000 threads on each worker.
>>>>>
>>>>> Nope, it was the other way around: in this case on AWS you can get 256 instances at once but not 512 (which would be 2x256). So when you compute the distribution you allocate to some fn the role of owning the instance lookup and releasing.
>>>>
>>>> I still don't understand. Let's be more precise. If you write the following code:
>>>>
>>>> pCollection.apply(ParDo.of(new MyDoFn()));
>>>>
>>>> There is no way to control how many instances of MyDoFn are created.
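[Editor's note] Reuven's point above can be simulated in plain Java. This is a hedged sketch, not the Beam API: `MyDoFn`, `run`, and the counters are made-up names, and the loop stands in for the runner's (uncontrollable) choice of instance count. It shows why user code must tolerate an arbitrary number of setup/teardown pairs:

```java
import java.util.ArrayList;
import java.util.List;

public class LifecycleContract {
    // Stand-in for a user DoFn; the counters exist only for this demo.
    static class MyDoFn {
        static int setups = 0, teardowns = 0;
        void setup()    { setups++; }    // analogous to @Setup: once per instance
        void process()  { /* work */ }   // analogous to @ProcessElement
        void teardown() { teardowns++; } // analogous to @Teardown: best effort
    }

    /** The "runner" alone picks the instance count; user code cannot. */
    static void run(int instances) {
        List<MyDoFn> fns = new ArrayList<>();
        for (int i = 0; i < instances; i++) {
            MyDoFn fn = new MyDoFn();
            fn.setup();
            fn.process();
            fns.add(fn);
        }
        // In a healthy run each instance is torn down; a crash skips this loop.
        for (MyDoFn fn : fns) fn.teardown();
    }

    public static void main(String[] args) {
        run(1000); // could just as well be a million
        System.out.println(MyDoFn.setups + " setups, " + MyDoFn.teardowns + " teardowns");
    }
}
```

A JVM crash corresponds to the teardown loop never running, which is exactly the "best effort" caveat debated in this thread.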
>>>> The runner might decide to create a million instances of this class across your worker pool, which means that you will get a million Setup and Teardown calls.
>>>>
>>>>> Anyway, this was just an example of an external resource you must release. The real topic is that Beam should define ASAP a guaranteed generic lifecycle to let users embrace its programming model.
>>>>>
>>>>>>> @Eugene:
>>>>>>> 1. wait logic is about passing the value, which is not always possible (like 15% of cases from my raw estimate)
>>>>>>> 2. sdf: I'll try to detail why I mention SDF more here
>>>>>>>
>>>>>>> Concretely, Beam exposes a portable API (included in the SDK core). This API defines a *container* API and therefore implies bean lifecycles. I'll not detail them all but just use the sources and dofn (not sdf) to illustrate the idea I'm trying to develop.
>>>>>>>
>>>>>>> A. Source
>>>>>>>
>>>>>>> A source computes a partition plan with 2 primitives: estimateSize and split. As a user you can expect both to be called on the same bean instance, to avoid paying the same connection cost(s) twice. Concretely:
>>>>>>>
>>>>>>> connect()
>>>>>>> try {
>>>>>>>   estimateSize()
>>>>>>>   split()
>>>>>>> } finally {
>>>>>>>   disconnect()
>>>>>>> }
>>>>>>>
>>>>>>> This is not guaranteed by the API, so you must do:
>>>>>>>
>>>>>>> connect()
>>>>>>> try {
>>>>>>>   estimateSize()
>>>>>>> } finally {
>>>>>>>   disconnect()
>>>>>>> }
>>>>>>> connect()
>>>>>>> try {
>>>>>>>   split()
>>>>>>> } finally {
>>>>>>>   disconnect()
>>>>>>> }
>>>>>>>
>>>>>>> + a workaround with an internal estimated size, since this primitive is often called in split but you don't want to connect twice in the second phase.
>>>>>>>
>>>>>>> Why do you need that? Simply because you want to define an API to implement sources which initializes the source bean and destroys it.
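[Editor's note] The single-connection pattern Romain asks for can be sketched outside Beam. This is a hypothetical `CachingSource` (not Beam's `BoundedSource`; the names, the size `42L`, and the counter are invented for illustration): the connection is lazily opened once and shared by both primitives, instead of being opened and closed around each one:

```java
public class CachingSource implements AutoCloseable {
    static int connections = 0; // counts physical connect() calls, for the demo

    private Object connection;  // stands in for a real client/connection handle

    // Lazily connect once; every primitive reuses the same handle.
    private Object connection() {
        if (connection == null) {
            connections++;          // the expensive connect() happens here
            connection = new Object();
        }
        return connection;
    }

    long estimateSize()   { connection(); return 42L; }     // fake size
    int  split(int parts) { connection(); return parts; }   // fake split

    @Override public void close() { connection = null; }    // disconnect()

    public static void main(String[] args) {
        try (CachingSource src = new CachingSource()) {
            src.estimateSize();
            src.split(4); // split often re-reads the size: still one connect
        }
        System.out.println("connections=" + connections); // prints connections=1
    }
}
```

Without a guaranteed init/destroy pair, the caller cannot rely on `close()` ever running, which is the double-connect workaround described above.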
>>>>>>> I insist it is a very, very basic concern for such an API. However Beam doesn't embrace it and doesn't assume it, so building any API on top of Beam is very painful today, and direct Beam users hit the exact same issues - check how IOs are implemented: the static utilities which create volatile connections prevent reusing an existing connection in a single method (https://github.com/apache/beam/blob/master/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java#L862).
>>>>>>>
>>>>>>> The same logic applies to the reader which is then created.
>>>>>>>
>>>>>>> B. DoFn & SDF
>>>>>>>
>>>>>>> As a fn dev you expect the same from the Beam runtime: init(); try { while (...) process(); } finally { destroy(); }, and that it is executed on the exact same instance, to be able to be stateful at that level for expensive connections/operations/flow state handling.
>>>>>>>
>>>>>>> As you mentioned with the million example, this sequence should happen for each single instance, so 1M times in your example.
>>>>>>>
>>>>>>> Now why did I mention SDF several times? Because SDF is a generalisation of both cases (source and dofn). Therefore it creates way more instances and requires a way more strict/explicit definition of the exact lifecycle and of which instance does what. Since Beam handles the full lifecycle of the bean instances, it must provide init/destroy hooks (setup/teardown) which can be stateful.
>>>>>>>
>>>>>>> Take the JDBC example which was mentioned earlier. Today, because of the teardown issue, it uses bundles. Since bundle size is not defined - and will not be with SDF - it must use a pool to be able to reuse a connection instance so as not to hurt performance.
>>>>>>> Now with SDF and the split increase, how do you handle the pool size? Generally in batch you use a single connection per thread to avoid consuming all database connections. With a pool you have 2 choices: 1. use a pool of 1; 2. use a slightly bigger pool, but multiplied by the number of beans you will likely 2x or 3x the connection count and make the execution fail with "no more connection available". If you picked 1 (a pool of 1), then you still have to have a reliable teardown per pool instance (close() generally) to ensure you release the pool and don't leak the connection information in the JVM. In all cases you come back to the init()/destroy() lifecycle, even if you fake it by getting connections per bundle.
>>>>>>>
>>>>>>> Just to make it obvious: the SDF mentions are just because SDF implies all the current issues with the loose definition of the bean lifecycles at an exponential level, nothing else.
>>>>>>>
>>>>>>> Romain Manni-Bucau
>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> | Blog <https://rmannibucau.metawerx.net/> | Old Blog <http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> | LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>
>>>>>>> 2018-02-18 22:32 GMT+01:00 Eugene Kirpichov <kirpic...@google.com>:
>>>>>>>
>>>>>>>> The kind of whole-transform lifecycle you're mentioning can be accomplished using the Wait transform, as I suggested in the thread above, and I believe it should become the canonical way to do that.
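[Editor's note] The "pool of 1 with a reliable close()" option from the JDBC discussion above can be sketched in plain Java. `PoolOfOne` and its methods are hypothetical (no real pool library or JDBC driver involved); the point is that without a guaranteed `close()` in teardown, the lease state leaks:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class PoolOfOne implements AutoCloseable {
    private final AtomicBoolean leased = new AtomicBoolean(false);
    private boolean closed = false;

    /** Hands out the single "connection"; a second borrow fails. */
    Object borrow() {
        if (closed || !leased.compareAndSet(false, true)) {
            throw new IllegalStateException("no more connection available");
        }
        return new Object(); // stands in for a JDBC Connection
    }

    void release() { leased.set(false); }        // e.g. in finishBundle

    @Override public void close() { closed = true; } // must run in teardown

    public static void main(String[] args) {
        PoolOfOne pool = new PoolOfOne();
        Object c = pool.borrow(); // startBundle / processElement
        pool.release();           // finishBundle
        pool.close();             // teardown: skipping this leaks the pool
        System.out.println("closed cleanly");
    }
}
```

If teardown is skipped, `close()` never runs and the pooled connection (plus its server-side state) outlives the worker, which is the leak Romain describes.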
>>>>>>>> (I would like to reiterate one more time, as the main author of most design documents related to SDF and of its implementation in the Java direct and Dataflow runners, that SDF is fully unrelated to the topic of cleanup - I'm very confused as to why it keeps coming up.)
>>>>>>>>
>>>>>>>> On Sun, Feb 18, 2018, 1:15 PM Romain Manni-Bucau <rmannibu...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> I kind of agree, except transforms lack a lifecycle too. My understanding is that SDF could be a way to unify it and clean the API.
>>>>>>>>>
>>>>>>>>> Otherwise how do we normalize - with a single API - the lifecycle of transforms?
>>>>>>>>>
>>>>>>>>> Le 18 févr. 2018 21:32, "Ben Chambers" <bchamb...@apache.org> a écrit :
>>>>>>>>>
>>>>>>>>>> Are you sure that focusing on the cleanup of specific DoFns is appropriate? In many cases where cleanup is necessary, it is around an entire composite PTransform. I think there have been discussions/proposals around a more methodical "cleanup" option, but those haven't been implemented, to the best of my knowledge.
>>>>>>>>>>
>>>>>>>>>> For instance, consider the steps of a FileIO:
>>>>>>>>>> 1. Write to a bunch (N shards) of temporary files.
>>>>>>>>>> 2. When all temporary files are complete, attempt to do a bulk copy to put them in the final destination.
>>>>>>>>>> 3. Clean up all the temporary files.
>>>>>>>>>>
>>>>>>>>>> (This is often desirable because it minimizes the chance of seeing partial/incomplete results in the final destination.)
>>>>>>>>>>
>>>>>>>>>> In the above, you'd want step 1 to execute on many workers, likely using a ParDo (say N different workers). The move step should only happen once, so on one worker. This means it will be a different DoFn, likely with some stuff done to ensure it runs on one worker.
>>>>>>>>>> In such a case, cleanup / @TearDown of the DoFn is not enough. We need an API for a PTransform to schedule some cleanup work for when the transform is "done". In batch this is relatively straightforward, but it doesn't exist. This is the source of some problems, such as the BigQuery sink leaving files around that have failed to import into BigQuery.
>>>>>>>>>>
>>>>>>>>>> In streaming this is less straightforward -- do you want to wait until the end of the pipeline? Or do you want to wait until the end of the window? In practice, you just want to wait until you know nobody will need the resource anymore.
>>>>>>>>>>
>>>>>>>>>> This led to some discussions around a "cleanup" API, where you could have a transform that output resource objects. Each resource object would have logic for cleaning it up. And there would be something that indicated what parts of the pipeline needed that resource, and what kind of temporal lifetime those objects had. As soon as that part of the pipeline had advanced far enough that it would no longer need the resources, they would get cleaned up. This can be done at pipeline shutdown, or incrementally during a streaming pipeline, etc.
>>>>>>>>>>
>>>>>>>>>> Would something like this be a better fit for your use case? If not, why is handling teardown within a single DoFn sufficient?
>>>>>>>>>>
>>>>>>>>>> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau <rmannibu...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Yes, 1M. Let me try to explain by simplifying the overall execution. Each instance - one fn, so likely in a thread of a worker - has its lifecycle. Caricaturally: "new" and garbage collection.
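[Editor's note] The per-instance ordering argued for in this thread (new, then setup, then startBundle/process/finishBundle, then teardown, then GC) can be simulated on a plain object. The method names mirror the DoFn annotations, but this is a sketch, not Beam code:

```java
import java.util.ArrayList;
import java.util.List;

public class OrderedLifecycle {
    // Records the call order so the contract is visible.
    final List<String> calls = new ArrayList<>();

    void setup()        { calls.add("setup"); }        // after "new", before any bundle
    void startBundle()  { calls.add("startBundle"); }
    void process()      { calls.add("process"); }
    void finishBundle() { calls.add("finishBundle"); }
    void teardown()     { calls.add("teardown"); }     // after last finishBundle, before GC

    public static void main(String[] args) {
        OrderedLifecycle fn = new OrderedLifecycle();  // "new"
        fn.setup();
        fn.startBundle();
        fn.process();
        fn.finishBundle();
        fn.teardown();                                 // last call Beam makes before the instance is dropped
        System.out.println(fn.calls);
        // prints [setup, startBundle, process, finishBundle, teardown]
    }
}
```

The dispute in the thread is precisely whether the final `teardown()` call is guaranteed or merely best-effort when the JVM dies before reaching it.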
>>>>>>>>>>> In practice, "new" is often an unsafe allocation (deserialization), but it doesn't matter here.
>>>>>>>>>>>
>>>>>>>>>>> What I want is for any "new" to be followed by setup before any process or startBundle, and, the last time Beam has the instance before it is GC'd and after the last finishBundle, for it to call teardown.
>>>>>>>>>>>
>>>>>>>>>>> It is as simple as that. This way there is no need to combine fns in a way that makes a fn not self-contained to implement basic transforms.
>>>>>>>>>>>
>>>>>>>>>>> Le 18 févr. 2018 20:07, "Reuven Lax" <re...@google.com> a écrit :
>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau <rmannibu...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Le 18 févr. 2018 19:28, "Ben Chambers" <bchamb...@apache.org> a écrit :
>>>>>>>>>>>>>
>>>>>>>>>>>>> It feels like this thread may be a bit off-track. Rather than focusing on the semantics of the existing methods -- which have been noted to meet many existing use cases -- it would be helpful to focus more on the reason you are looking for something with different semantics.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Some possibilities (I'm not sure which one you are trying to do):
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1. Clean up some external, global resource that was initialized once during the startup of the pipeline. If this is the case, how are you ensuring it was really only initialized once (and not once per worker, per thread, per instance, etc.)? How do you know when the pipeline should release it? If the answer is "when it reaches step X", then what about a streaming pipeline?
>>>>>>>>>>>>> When the DoFn is logically no longer needed, i.e. when the batch is done or the stream is stopped (manually or by a JVM shutdown).
>>>>>>>>>>>>
>>>>>>>>>>>> I'm really not following what this means.
>>>>>>>>>>>>
>>>>>>>>>>>> Let's say that a pipeline is running 1000 workers, and each worker is running 1000 threads (each running a copy of the same DoFn). How many cleanups do you want (do you want 1000 * 1000 = 1M cleanups), and when do you want them called? When the entire pipeline is shut down? When an individual worker is about to shut down (which may be temporary - it may be about to start back up)? Something else?
>>>>>>>>>>>>
>>>>>>>>>>>>> 2. Finalize some resources that are used within some region of the pipeline. While the DoFn lifecycle methods are not a good fit for this (they are focused on managing resources within the DoFn), you could model this on how FileIO finalizes the files that it produced. For instance:
>>>>>>>>>>>>> a) ParDo generates "resource IDs" (or some token that stores information about resources)
>>>>>>>>>>>>> b) "Require Deterministic Input" (to prevent retries from changing resource IDs)
>>>>>>>>>>>>> c) ParDo that initializes the resources
>>>>>>>>>>>>> d) Pipeline segments that use the resources, and eventually output the fact they're done
>>>>>>>>>>>>> e) "Require Deterministic Input"
>>>>>>>>>>>>> f) ParDo that frees the resources
>>>>>>>>>>>>>
>>>>>>>>>>>>> By making the use of the resource part of the data, it is possible to "checkpoint" which resources may be in use or have been finished by using the require-deterministic input.
>>>>>>>>>>>>> This is important to ensure everything is actually cleaned up.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I need that, but generic and not case by case, to industrialize some API on top of Beam.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 3. Some other use case that I may be missing? If it is this case, could you elaborate on what you are trying to accomplish? That would help me understand both the problems with existing options and possibly what could be done to help.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I understand there are workarounds for almost all cases, but it means each transform is different in its lifecycle handling. I dislike that a lot at scale and as a user, since you can't put any unified practice on top of Beam; it also makes Beam very hard to integrate or to use to build higher-level libraries or software.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is why I tried not to start the workaround discussions and just stay at the API level.
>>>>>>>>>>>>>
>>>>>>>>>>>>> -- Ben
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau <rmannibu...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov <kirpic...@google.com>:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> "Machine state" is overly low-level because many of the possible reasons can happen on a perfectly fine machine. If you'd like to rephrase it to "it will be called except in various situations where it's logically impossible or impractical to guarantee that it's called", that's fine. Or you can list some of the examples above.
>>>>>>>>>>>>>> Sounds OK to me.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The main point for the user is: you *will* see non-preventable situations where it couldn't be called - it's not just intergalactic crashes - so if the logic is very important (e.g. cleaning up a large amount of temporary files, shutting down a large number of VMs you started, etc.), you have to express it using one of the other methods that have stricter guarantees (which obviously come at a cost, e.g. no pass-by-reference).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> FinishBundle has the exact same guarantee, sadly, so I'm not sure which other method you mean. Concretely, if you make it really unreliable - which is what "best effort" sounds like to me - then users can't use it to clean anything, but if you make it "can happen, but it is unexpected and means something happened" then it is fine to have a manual - or automatic if fancy - recovery procedure. This is where it makes all the difference and impacts the developers and ops (all users, basically).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau <rmannibu...@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Agreed, Eugene, except that "best effort" means that. It is also often used to say "at will", and this is what triggered this thread.
>>>>>>>>>>>>>>>> I'm fine using "except if the machine state prevents it", but "best effort" is too open and can be very badly and wrongly perceived by users (like I did).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Romain Manni-Bucau
>>>>>>>>>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> | Blog <https://rmannibucau.metawerx.net/> | Old Blog <http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> | LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov <kirpic...@google.com>:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> It will not be called if it's impossible to call it: in the example situation you have (intergalactic crash), and in a number of more common cases: e.g. in case the worker container has crashed (e.g. user code in a different thread called a C library over JNI and it segfaulted), a JVM bug, a crash due to user code OOM, in case the worker has lost network connectivity (then it may be called but it won't be able to do anything useful), in case this is running on a preemptible VM and it was preempted by the underlying cluster manager without notice, or the worker was too busy with other stuff (e.g. calling other Teardown functions) until the preemption timeout elapsed, in case the underlying hardware simply failed (which happens quite often at scale), and in many other
>>>>>>>>>>>>>>>>> conditions.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> "Best effort" is the commonly used term to describe such behavior. Please feel free to file bugs for cases where you observed a runner not calling Teardown in a situation where it was possible to call it but the runner made insufficient effort.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau <rmannibu...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov <kirpic...@google.com>:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau <rmannibu...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Le 18 févr. 2018 00:23, "Kenneth Knowles" <k...@google.com> a écrit :
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau <rmannibu...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> If you give an example of a high-level need (e.g. "I'm trying to write an IO for system $x and it requires the following initialization and the following cleanup logic and the following processing in between") I'll be better able to help you.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Take a simple example of a transform requiring a connection. Using bundles is a perf killer since their size is not controlled. Using teardown doesn't allow you to release the connection since it is a best-effort thing.
>>>>>>>>>>>>>>>>>>>>> Not releasing the connection makes you pay a lot - AWS ;) - or prevents you from launching other processing - concurrency limits.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> For this example @Teardown is an exact fit. If things die so badly that @Teardown is not called, then nothing else can be called to close the connection either. What AWS service are you thinking of that stays open for a long time when everything at the other end has died?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> You assume connections are kind of stateless, but some (proprietary) protocols require closing exchanges which are not only "I'm leaving".
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> For AWS I was thinking about starting some services - machines - on the fly in a pipeline startup and closing them at the end. If teardown is not called, you leak machines and money. You can say it can be done another way... as can the full pipeline ;).
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I don't want to be picky, but if Beam can't handle its components' lifecycle, it can't be used at scale for generic pipelines and ends up bound to some particular IOs.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> What prevents enforcing teardown - ignoring the interstellar-crash case, which can't be handled by any human system? Nothing, technically. Why do you push to not handle it? Is it due to some legacy code on Dataflow or something else?
>>>>>>>>>>>>>>>>>>> Teardown *is* already documented and implemented this way (best-effort). So I'm not sure what kind of change you're asking for.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Remove "best effort" from the javadoc. If it is not called then it is a bug and we are done :).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Also, what does it mean for users? The direct runner does it, so if a user uses the RI in tests, will he get a different behavior in prod? Also don't forget that the user doesn't know what the IOs he composes use, so this is so impactful for the whole product that it must be handled IMHO.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I understand the portability culture is new in the big data world, but it is not a reason to ignore what people did for years and do it wrong before doing it right ;).
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> My proposal is to list what can prevent guaranteeing - in normal IT conditions - the execution of teardown. Then we see if we can handle it, and only if there is a technical reason we can't do we make it experimental/unsupported in the API. I know Spark and Flink can; any unknown blocker for other runners?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Technical note: even a kill should go through Java shutdown hooks, otherwise your environment (the software enclosing Beam) is fully unhandled and your overall system is uncontrolled.
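[Editor's note] The shutdown-hook point can be illustrated with the standard JDK API (plain `java.lang.Runtime`, no Beam involved). Hooks run on normal exit and on an orderly `kill` (SIGTERM), but not on `kill -9`:

```java
public class HookDemo {
    public static void main(String[] args) {
        // Register cleanup that the JVM runs on orderly shutdown; a runner
        // could use the same mechanism to attempt teardown on JVM exit.
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            // place to close connections / release external resources
            System.out.println("teardown attempted on JVM exit");
        }));
        System.out.println("work done");
        // Normal return (or SIGTERM) triggers the hook; SIGKILL does not,
        // which is why even this mechanism remains best-effort.
    }
}
```

This is exactly the boundary debated in the thread: an orderly shutdown can run cleanup, but a hard kill or hardware failure can not, so some recovery procedure is still needed for the leftover resources.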
>>>>>>>>>>>>>>>>>>>> The only case where that is not true is when the software is always owned by a vendor and never installed on a customer environment. In that case it belongs to the vendor to handle the Beam API, and not to Beam to adjust its API for a vendor - otherwise all features unsupported by one runner should be made optional, right?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> All state is not about network, even in distributed systems, so this is key to having an explicit and defined lifecycle.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Kenn