I believe it doesn't need to be stable across refactoring, only across all
workers executing a specific version of the code. Specifically, it is used
as follows:

1. Create a pipeline on the user's machine. It walks the stack until the
static initializer block, which provides an ID.
2. Send the pipeline to many worker machines.
3. Each worker machine walks the stack until the static initializer block
(on the same version of the code), receiving the same ID.

This ensures that the TupleTag is the same on all the workers, as well as
on the user's machine, which is critical since it is used as an identifier
across these machines.
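
To make the steps above concrete, here is a minimal sketch of deriving an
ID from the stack. It is not Beam's actual genId code; the class name
StackIdSketch, the Holder class, and the "anonymous-" fallback are made up
for illustration. It relies only on the fact that the JVM reports a class's
static initializer as the method "<clinit>" in stack traces (the line
number is only meaningful when the class is compiled with debug info).

```java
import java.util.concurrent.atomic.AtomicInteger;

public class StackIdSketch {
    // Hypothetical fallback counter, used only when no static initializer
    // frame is found on the stack.
    private static final AtomicInteger COUNTER = new AtomicInteger();

    /** Returns "ClassName#line" for the nearest static initializer frame. */
    static String genId() {
        for (StackTraceElement frame : new Throwable().getStackTrace()) {
            // The JVM names a class's static initializer "<clinit>".
            if ("<clinit>".equals(frame.getMethodName())) {
                return frame.getClassName() + "#" + frame.getLineNumber();
            }
        }
        // Not called from a static initializer: no stable ID is available.
        return "anonymous-" + COUNTER.getAndIncrement();
    }

    static class Holder {
        // Initialized inside Holder's static initializer, so every JVM
        // running this exact version of the code computes the same string.
        static final String TAG_ID = genId();
    }

    public static void main(String[] args) {
        System.out.println(Holder.TAG_ID);
    }
}
```

Moving the TAG_ID declaration to another line or class changes the ID, which
is fine as long as every machine runs the same build.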

Assigning a UUID would work if all of the machines agreed on the same tuple
ID, which could be accomplished with serialization. Serialization, however,
doesn't work well with static initializers, since those will have been
called to initialize the class at load time.
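
As a rough illustration of why serialization does not help here, the sketch
below shows that standard Java serialization writes instance fields but not
static fields, so a random ID stored in a static field would be regenerated
by each JVM when it loads the class. The names StaticSerializationSketch,
Fn, STATIC_TAG_ID and instanceTagId are hypothetical and not part of Beam.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

public class StaticSerializationSketch {
    static class Fn implements Serializable {
        private static final long serialVersionUID = 1L;
        // Set by the static initializer in every JVM that loads this class,
        // so each machine would end up with a different random value.
        static final String STATIC_TAG_ID = UUID.randomUUID().toString();
        // Instance state: written to the stream and restored on the reader.
        final String instanceTagId = UUID.randomUUID().toString();
    }

    /** Serializes a value and returns the raw bytes as a Latin-1 string. */
    static String serializedForm(Serializable value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return new String(bytes.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws IOException {
        Fn fn = new Fn();
        String wire = serializedForm(fn);
        // The instance field's value travels with the object...
        System.out.println("instance id on the wire: " + wire.contains(fn.instanceTagId));
        // ...but the static field's value is never written.
        System.out.println("static id on the wire: " + wire.contains(Fn.STATIC_TAG_ID));
    }
}
```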

On Tue, Apr 10, 2018 at 10:27 AM Romain Manni-Bucau <rmannibu...@gmail.com>
wrote:

> Well, the issue is more about all the existing tests currently.
>
> Out of curiosity: how is walking the stack stable, since the stack can
> change? The stop condition is the static block of a class, which can call
> methods, so it is subject to refactoring and therefore not stable. Should
> it be deprecated?
>
>
> On Apr 10, 2018 at 19:17, "Robert Bradshaw" <rober...@google.com> wrote:
>
> If it's too slow perhaps you could use the constructor where you pass an
> explicit id (though in my experience walking the stack isn't that slow).
>
> On Tue, Apr 10, 2018 at 10:09 AM Romain Manni-Bucau <rmannibu...@gmail.com>
> wrote:
>
>> Oops, cross-post, sorry.
>>
>> The issue I hit on this thread is that it is used a lot in tests, and it
>> slows down tests for nothing, as with the GenerateSequence ones.
>>
>> On Apr 10, 2018 at 19:00, "Romain Manni-Bucau" <rmannibu...@gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Apr 10, 2018 at 18:40, "Robert Bradshaw" <rober...@google.com> wrote:
>>>
>>> These values should be, inasmuch as possible, stable across VMs. How
>>> slow is slow? Doesn't this happen only once per VM startup?
>>>
>>>
>>> Once per JVM, but IDEA launches a JVM per test, and the daemon does not
>>> save enough time: it seems you still go through the whole project and
>>> check all upstream deps.
>>>
>>> It is <1s with Maven vs 5-6s with Gradle.
>>>
>>>
>>> On Tue, Apr 10, 2018 at 9:33 AM Romain Manni-Bucau <
>>> rmannibu...@gmail.com> wrote:
>>>
>>>> Hi
>>>>
>>>> Does org.apache.beam.sdk.values.TupleTag#genId need to get the
>>>> stack trace, or can we use any ID generator (like
>>>> UUID.randomUUID().toString())? Using traces is quite slow under load and
>>>> in environments where the root stack is not just the "next" level, so
>>>> skipping it would be nice.
>>>>
>>>> Romain Manni-Bucau
>>>>
>>>
>>>
>
