Re: Difference between setup() and activate()

Ananth G Thu, 03 Aug 2017 16:13:09 -0700

Thanks a lot Vlad, Pramod and Sanjay. This clarifies the differences for me.


Regards
Ananth

> On 4 Aug 2017, at 8:37 am, Pramod Immaneni <pra...@datatorrent.com> wrote:
> 
> Yes, activate() is called closer to the start of tuple processing as far as
> Apex is concerned. So if you are writing an input operator that does
> asynchronous processing, where you start receiving data as soon as you
> open a connection to your external source, it is better to open the
> connection in activate() to reduce latency and buffer build-up.
> 
>> On Thu, Aug 3, 2017 at 3:07 PM, Vlad Rozov <v.rozo...@gmail.com> wrote:
>> 
>> Correct, both setup() and activate() are called when an operator is
>> restored from a checkpoint. When an operator is restored from a checkpoint
>> it is considered to be a new instance/deployment of an operator with its
>> state reset to a checkpoint. In this case Apex core gives an operator a
>> chance to initialize transient fields in either setup() or activate().
>> 
>> I am not aware of any use case where the platform will go through an
>> activate/deactivate cycle without setup/teardown, but such a code path may
>> be introduced in the future (for example, it may be used to manage an input
>> operator with a high emit rate). It is better not to make any assumptions
>> about how many times activate/deactivate may be called.
>> 
>> Currently the main difference between setup() and activate() is described
>> in the javadoc for ActivationListener:
>> 
>>   An example of where one would consider implementing ActivationListener
>>   is an input operator which wants to consume a high throughput stream.
>>   Since there is typically at least a few hundreds of milliseconds between
>>   the time the setup method is called and the first window, you would want
>>   to place the code to activate the stream inside activate instead of setup.
>> 
>> 
>> My recommendation is to use setup() to initialize transient fields unless
>> you need to deal with the above case.
>> 
>> Thank you,
>> 
>> Vlad
>> 
>> 
>>> On 8/2/17 13:31, Ananth G wrote:
>>> 
>>> Hello Vlad,
>>> 
>>> Thanks for your response.
>>> 
>>>> Do you refer to restoring from a checkpoint as serialize/deserialize
>>>> cycles?
>>>
>>> Yes.
>>> 
>>>> In case of restoring from a checkpoint (deserialization) setup() is a
>>>> part of a redeployment request, AFAIK.
>>>
>>> This sounds a bit in contradiction to the response from Sanjay in the
>>> mail thread below. I glanced quickly at the apex-core code and it
>>> looks like both are being called (perhaps I am entirely wrong on this, as
>>> it was only a quick scan). I was referring to the code in
>>> StreamingContainer.java in the engine package and the method called
>>> deploy().
>>> 
>>> 
>>>> Please see ActivationListener javadoc for details when it is necessary to
>>>> use activate() vs setup().
>>>
>>> I had to raise this question in the mail after going through the
>>> javadoc. The javadoc is a bit cryptic in this scenario of
>>> serialize/deserialize. Also, the javadoc is not clear as to what is meant
>>> by activate/deactivate being called multiple times whereas setup is called
>>> once in the lifetime of the operator. If setup is called once in the
>>> lifetime of an operator per the javadoc, does that mean once in the
>>> lifetime of the JVM instance created via the constructor, or once across
>>> the deserialize cycles of the passivated operator state? If it is once
>>> across all passivated instances of the operator, then setup() would not be
>>> called multiple times and hence is not a great location for initializing
>>> transient variables? If setup() is called across deserialize cycles, then I
>>> find it more confusing as to why we need setup() and activate() methods
>>> having almost the same functionality.
>>> 
>>> Thoughts ?
>>> 
>>> 
>>> Regards,
>>> Ananth
>>> 
>>> 
>>>> On 1 Aug 2017, at 3:38 am, Vlad Rozov <v.ro...@datatorrent.com> wrote:
>>>> 
>>>> Do you refer to restoring from a checkpoint as serialize/deserialize
>>>> cycles? There are no calls to setup/teardown and/or activate/deactivate
>>>> during checkpointing/serialization. In case of restoring from a checkpoint
>>>> (deserialization) setup() is a part of a redeployment request, AFAIK. The
>>>> best answer to question 3 is "it depends". In most cases using setup() to
>>>> resolve all transient fields is as good as doing that in activate(). Please
>>>> see ActivationListener javadoc for details when it is necessary to use
>>>> activate() vs setup().
>>>> 
>>>> Thank you,
>>>> 
>>>> Vlad
>>>> 
>>>>> On 7/29/17 19:58, Sanjay Pujare wrote:
>>>>> 
>>>>> The Javadoc comment
>>>>> for com.datatorrent.api.Operator.ActivationListener<CONTEXT> (in
>>>>> https://github.com/apache/apex-core/blob/master/api/src/main/java/com/datatorrent/api/Operator.java)
>>>>> should hopefully answer your questions.
>>>>> 
>>>>> Specifically:
>>>>> 
>>>>> 1. No, setup() is called only once in the entire lifetime (
>>>>> http://apex.apache.org/docs/apex/operator_development/#setup-call)
>>>>> 
>>>>> 2. Yes. When an operator is "activated" (first time in its life or
>>>>> reactivation after a failover), activate() is called before the first
>>>>> beginWindow() is called.
>>>>> 
>>>>> 3. Yes.
>>>>> 
>>>>> 
>>>>> On Sun, Jul 30, 2017 at 12:18 AM, Ananth G <ananthg.a...@gmail.com>
>>>>> wrote:
>>>>> 
>>>>>> Hello All,
>>>>>>
>>>>>> I was looking at the documentation and could not get a clear distinction
>>>>>> of behaviours for setup() and activate() during scenarios when an
>>>>>> operator is passivated (e.g. application shutdown, repartition use cases)
>>>>>> and being brought back to life again. Could someone from the community
>>>>>> advise me on the following questions?
>>>>>>
>>>>>> 1. Is setup() called in these scenarios (serialize/deserialize cycles) as
>>>>>> well?
>>>>>>
>>>>>> 2. I am assuming activate() is called in these scenarios? The javadoc
>>>>>> for activation states that activate() can be called multiple times
>>>>>> (without explicitly stating why) and my assumption is that it is because
>>>>>> of these scenarios.
>>>>>>
>>>>>> 3. If setup() is only called once during the lifetime of an operator, is
>>>>>> it fair to assume that activate() is the best place to resolve all of the
>>>>>> transient fields of an operator?
>>>>>> 
>>>>>> 
>>>>>> Regards,
>>>>>> Ananth
>>>>>> 
>>>>> 
>>
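
To summarize the thread in code, here is a minimal sketch of the split that was recommended above: transient fields are initialized in setup(), and the connection that starts data flowing is opened in activate(). This is not Apex code. The Operator and ActivationListener interfaces below are hypothetical stand-ins so the example compiles without apex-core (the real setup() and activate() also take a context argument), and AsyncInputOperator, deploy() and run() are made-up names for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

final class LifecycleSketch {

    // Hypothetical stand-ins for com.datatorrent.api.Operator and
    // Operator.ActivationListener; the real methods also take a context.
    interface Operator {
        void setup();
        void teardown();
    }

    interface ActivationListener {
        void activate();
        void deactivate();
    }

    /** Hypothetical input operator reading from an asynchronous source. */
    static class AsyncInputOperator implements Operator, ActivationListener {
        private final String sourceName;        // non-transient: part of checkpointed state
        private transient List<String> buffer;  // transient: rebuilt on every deployment
        private transient boolean connected;    // transient: stands in for a live connection
        private final List<String> events;      // demo-only log of lifecycle calls

        AsyncInputOperator(String sourceName, List<String> events) {
            this.sourceName = sourceName;
            this.events = events;
        }

        @Override
        public void setup() {
            // Cheap transient state goes here.
            buffer = new ArrayList<>();
            events.add("setup");
        }

        @Override
        public void activate() {
            // Data starts arriving once the connection is open, so open it
            // as late as possible to limit buffer build-up.
            connected = true;
            events.add("activate: connect " + sourceName);
        }

        @Override
        public void deactivate() {
            connected = false;
            events.add("deactivate: disconnect");
        }

        @Override
        public void teardown() {
            buffer = null;
            events.add("teardown");
        }
    }

    /** Assumed call order for one deployment, as described in the thread. */
    static void deploy(AsyncInputOperator op) {
        op.setup();
        op.activate();
        // ... beginWindow/endWindow processing would happen here ...
        op.deactivate();
        op.teardown();
    }

    static List<String> run() {
        List<String> events = new ArrayList<>();
        deploy(new AsyncInputOperator("source-a", events));
        // Restore from a checkpoint is treated as a new deployment, so
        // both setup() and activate() run again on the restored instance.
        deploy(new AsyncInputOperator("source-a", events));
        return events;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The second deploy() call models the restore-from-checkpoint case: the non-transient sourceName would come back from the checkpoint, while buffer and connected have to be rebuilt, which is why neither method should assume it runs only once.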
