It means that your DoFn class doesn't need to be thread safe, because when
a runner wants to run it in multiple threads, it will create multiple
copies of your DoFn.
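
To make this concrete, here is a minimal plain-Java sketch of that behavior. It does not use the Beam SDK: CountingFn and ToyRunner are hypothetical stand-ins made up for illustration, and copying the function through Java serialization is only an assumption about how a runner might produce its clones. The point is that the function keeps unsynchronized state, yet each copy is only ever touched by one thread.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a DoFn (not a Beam class). Deliberately NOT thread safe.
class CountingFn implements Serializable {
    private transient List<String> log; // per-instance state, created in setup()
    private int inBundle;               // plain int, no synchronization

    void setup() { log = new ArrayList<>(); log.add("setup"); }
    void startBundle() { inBundle = 0; log.add("startBundle"); }
    void processElement(String element) { inBundle++; }
    void finishBundle() { log.add("finishBundle:" + inBundle); }
    void teardown() { log.add("teardown"); }
    List<String> log() { return log; }
}

// Hypothetical toy "runner": one copy of the function per thread.
class ToyRunner {
    // Assumption: copies are made by a serialization round trip.
    static CountingFn cloneFn(CountingFn original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (CountingFn) in.readObject();
        }
    }

    // Each outer list entry is the sequence of bundles handled by one thread.
    static List<List<String>> run(CountingFn original,
                                  List<List<List<String>>> bundlesPerThread) throws Exception {
        ExecutorService pool =
                Executors.newFixedThreadPool(Math.max(1, bundlesPerThread.size()));
        try {
            List<Future<List<String>>> futures = new ArrayList<>();
            for (List<List<String>> bundles : bundlesPerThread) {
                CountingFn copy = cloneFn(original); // never shared between threads
                futures.add(pool.submit(() -> {
                    copy.setup();
                    for (List<String> bundle : bundles) {
                        copy.startBundle();
                        for (String element : bundle) {
                            copy.processElement(element);
                        }
                        copy.finishBundle();
                    }
                    copy.teardown();
                    return copy.log();
                }));
            }
            List<List<String>> logs = new ArrayList<>();
            for (Future<List<String>> future : futures) {
                logs.add(future.get());
            }
            return logs;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<List<String>> threadOne = List.of(List.of("a", "b"), List.of("c"));
        List<List<String>> threadTwo = List.of(List.of("d"));
        System.out.println(run(new CountingFn(), List.of(threadOne, threadTwo)));
    }
}
```

Two copies run at the same time on two threads, but the calls on any single copy happen one after another, so the plain int counter is safe.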

On Mon, Oct 16, 2017, 10:27 AM Derek Hao Hu <[email protected]> wrote:

> Hi Eugene,
>
> I'm not sure I understand what you mean - can you explain a bit more about
> "an individual instance will be accessed only serially but not
> concurrently"?
>
> Thanks,
>
> Derek
>
> On Mon, Oct 16, 2017 at 8:50 AM, Eugene Kirpichov <[email protected]>
> wrote:
>
>> A worker can execute several instances of the same DoFn at the same time.
>> They will be clones of the original DoFn specified in the pipeline and an
>> individual instance will be accessed only serially but not concurrently.
>>
>> On Mon, Oct 16, 2017, 8:38 AM Jacob Marble <[email protected]> wrote:
>>
>>> Perfect, thanks.
>>>
>>> Jacob
>>>
>>> On Sun, Oct 15, 2017 at 11:43 PM, Jean-Baptiste Onofré <[email protected]>
>>> wrote:
>>>
>>>> Yes, no problem at all. I meant that the DoFn is "attached" to a
>>>> pipeline.
>>>>
>>>> Regards
>>>> JB
>>>>
>>>> On 10/16/2017 08:25 AM, Derek Hao Hu wrote:
>>>>
>>>>> I believe a worker can execute multiple instances (i.e. threads) of a
>>>>> DoFn.
>>>>>
>>>>> Derek
>>>>>
>>>>> On Sun, Oct 15, 2017 at 10:46 PM, Jean-Baptiste Onofré <[email protected]>
>>>>> wrote:
>>>>>
>>>>>     Hi,
>>>>>
>>>>>     Correct. @Setup is called when bootstrapping the DoFn, @StartBundle
>>>>>     is called for a set of data (bundle), @ProcessElement is called for
>>>>>     each element in the bundle/collection, @FinishBundle is called at
>>>>>     the end of the dataset (bundle), and @Teardown is called when the
>>>>>     DoFn is "removed".
>>>>>
>>>>>     A DoFn is per pipeline.
>>>>>
>>>>>     Regards
>>>>>     JB
>>>>>
>>>>>
>>>>>     On 10/16/2017 07:31 AM, Jacob Marble wrote:
>>>>>
>>>>>         (there might be documentation on this that I didn't find; if so
>>>>>         a link is sufficient)
>>>>>
>>>>>         Good evening, this is just a check on my understanding. It looks
>>>>>         like an instance of a given DoFn goes through this lifecycle. Am
>>>>>         I correct?
>>>>>
>>>>>         - constructor
>>>>>         - @Setup (once)
>>>>>             - @StartBundle (zero to many times)
>>>>>               - @ProcessElement (zero to many times)
>>>>>             - @FinishBundle
>>>>>         - @Teardown (once)
>>>>>
>>>>>         Can any of these steps be called concurrently? (I believe no)
>>>>>         Can one worker execute multiple instances of a DoFn? (I believe
>>>>>         yes)
>>>>>
>>>>>         Thank you,
>>>>>
>>>>>         Jacob
>>>>>
>>>>>
>>>>>     --
>>>>>     Jean-Baptiste Onofré
>>>>>     [email protected]
>>>>>     http://blog.nanthrax.net
>>>>>     Talend - http://www.talend.com
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Derek Hao Hu
>>>>>
>>>>> Software Engineer | Snapchat
>>>>> Snap Inc.
>>>>>
>>>>
>>>> --
>>>> Jean-Baptiste Onofré
>>>> [email protected]
>>>> http://blog.nanthrax.net
>>>> Talend - http://www.talend.com
>>>>
>>>
>>>
>
>
> --
> Derek Hao Hu
>
> Software Engineer | Snapchat
> Snap Inc.
>
