Thanks a lot Vlad, Pramod and Sanjay. This clarifies the differences for me.

Regards,
Ananth

> On 4 Aug 2017, at 8:37 am, Pramod Immaneni <pra...@datatorrent.com> wrote:
>
> Yes, activate() is called closer to the start of tuple processing as far as
> Apex is concerned, so if you are doing things like writing an input operator
> that does asynchronous processing, where you will start receiving data as
> soon as you open a connection to your external source, it is better to do it
> in activate() to reduce latency and buffer build-up.
>
>> On Thu, Aug 3, 2017 at 3:07 PM, Vlad Rozov <v.rozo...@gmail.com> wrote:
>>
>> Correct, both setup() and activate() are called when an operator is
>> restored from a checkpoint. When an operator is restored from a checkpoint,
>> it is considered to be a new instance/deployment of the operator with its
>> state reset to the checkpoint. In this case Apex core gives the operator a
>> chance to initialize transient fields in either setup() or activate().
>>
>> I am not aware of any use case where the platform will go through an
>> activate/deactivate cycle without setup/teardown, but such a code path may
>> be introduced in the future (for example, it may be used to manage an input
>> operator with a high emit rate). It is better not to make any assumptions
>> about how many times activate/deactivate may be called.
>>
>> Currently the main difference between setup() and activate() is described
>> in the javadoc for ActivationListener:
>>
>> * An example of where one would consider implementing ActivationListener is an
>> * input operator which wants to consume a high throughput stream. Since there is
>> * typically at least a few hundreds of milliseconds between the time the setup method
>> * is called and the first window, you would want to place the code to activate the
>> * stream inside activate instead of setup.
>>
>> My recommendation is to use setup() to initialize transient fields unless
>> you need to deal with the above case.
>>
>> Thank you,
>>
>> Vlad
>>
>>> On 8/2/17 13:31, Ananth G wrote:
>>>
>>> Hello Vlad,
>>>
>>> Thanks for your response.
>>>
>>>> Do you refer to restoring from a checkpoint as serialize/deserialize
>>>> cycles?
>>>
>>> Yes.
>>>
>>>> In case of restoring from a checkpoint (deserialization) setup() is a
>>>> part of a redeployment request, AFAIK.
>>>
>>> This sounds a bit in contradiction to the response from Sanjay in the
>>> mail thread below. I tried to quickly glance at the apex-core code and it
>>> looks like both are being called (perhaps I am entirely wrong on this, as
>>> it was only a quick scan). I was referring to the code in
>>> StreamingContainer.java in the engine package and the method called
>>> deploy().
>>>
>>>> Please see ActivationListener javadoc for details when it is necessary to
>>>> use activate() vs setup().
>>>
>>> I had to raise this question in the mail after going through the
>>> javadoc. The javadoc is a bit cryptic in this scenario of
>>> serialize/deserialize. Also, the javadoc is not clear as to what is meant
>>> by activate/deactivate being called multiple times whereas setup is called
>>> once in the lifetime of the operator. If setup is called once in the
>>> lifetime of an operator per the javadoc, does that mean once in the
>>> lifetime of the JVM instantiating it via the constructor, or once across
>>> the deserialize cycles of the passivated operator state? If it is once
>>> across all passivated instances of the operator, then setup() would not be
>>> called multiple times and hence is not a great location for transient
>>> variables? If setup() is called across deserialize cycles, then I find it
>>> more confusing as to why we need setup() and activate() methods having
>>> almost the same functionality.
>>>
>>> Thoughts?
>>>
>>> Regards,
>>> Ananth
>>>
>>>> On 1 Aug 2017, at 3:38 am, Vlad Rozov <v.ro...@datatorrent.com> wrote:
>>>>
>>>> Do you refer to restoring from a checkpoint as serialize/deserialize
>>>> cycles?
>>>> There are no calls to setup/teardown and/or activate/deactivate
>>>> during checkpointing/serialization. In case of restoring from a checkpoint
>>>> (deserialization) setup() is a part of a redeployment request, AFAIK. The
>>>> best answer to question 3 is "it depends". In most cases using setup() to
>>>> resolve all transient fields is as good as doing that in activate(). Please
>>>> see ActivationListener javadoc for details when it is necessary to use
>>>> activate() vs setup().
>>>>
>>>> Thank you,
>>>>
>>>> Vlad
>>>>
>>>>> On 7/29/17 19:58, Sanjay Pujare wrote:
>>>>>
>>>>> The Javadoc comment for
>>>>> com.datatorrent.api.Operator.ActivationListener<CONTEXT> (in
>>>>> https://github.com/apache/apex-core/blob/master/api/src/main/java/com/datatorrent/api/Operator.java)
>>>>> should hopefully answer your questions.
>>>>>
>>>>> Specifically:
>>>>>
>>>>> 1. No, setup() is called only once in the entire lifetime
>>>>> (http://apex.apache.org/docs/apex/operator_development/#setup-call)
>>>>>
>>>>> 2. Yes. When an operator is "activated" - first time in its life or
>>>>> reactivation after a failover - activate() is called before the first
>>>>> beginWindow() is called.
>>>>>
>>>>> 3. Yes.
>>>>>
>>>>> On Sun, Jul 30, 2017 at 12:18 AM, Ananth G <ananthg.a...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hello All,
>>>>>>
>>>>>> I was looking at the documentation and could not get a clear distinction
>>>>>> between the behaviours of setup() and activate() in scenarios where an
>>>>>> operator is passivated (e.g. application shutdown, repartition use cases)
>>>>>> and brought back to life again. Could someone from the community advise
>>>>>> me on the following questions?
>>>>>>
>>>>>> 1. Is setup() called in these scenarios (serialize/deserialize cycles)
>>>>>> as well?
>>>>>>
>>>>>> 2. I am assuming activate() is called in these scenarios?
>>>>>> - The javadoc for activation states that activate() can be called
>>>>>> multiple times (without explicitly stating why) and my assumption is
>>>>>> that it is because of these scenarios.
>>>>>>
>>>>>> 3. If setup() is only called once during the lifetime of an operator,
>>>>>> is it fair to assume that activate() is the best place to resolve all of
>>>>>> the transient fields of an operator?
>>>>>>
>>>>>> Regards,
>>>>>> Ananth
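
The distinction discussed in this thread can be illustrated with a small, self-contained Java sketch. It does not use the Apex API: SketchComponent, SketchActivationListener, SketchInputOperator and LifecycleSketch are hypothetical names that only mirror the callback names from the thread, and plain Java serialization is an assumed stand-in for the platform's checkpoint/restore. It shows that transient fields are empty on a restored instance, that setup() is a reasonable place to rebuild them, and that work which starts the flow of data is deferred to activate().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the lifecycle callbacks discussed in the thread.
// These are NOT the Apex interfaces; they only mirror the method names.
interface SketchComponent {
    void setup();
    void teardown();
}

interface SketchActivationListener {
    void activate();
    void deactivate();
}

class SketchInputOperator implements SketchComponent, SketchActivationListener, Serializable {
    private static final long serialVersionUID = 1L;

    // Checkpointed state: survives a serialize/deserialize cycle.
    long tuplesEmitted;

    // Transient state: null/false after deserialization, rebuilt by the callbacks.
    transient List<String> buffer;
    transient boolean connected;

    @Override
    public void setup() {
        // Passive initialization of transient fields.
        buffer = new ArrayList<>();
    }

    @Override
    public void activate() {
        // Simulates opening the external source as late as possible, so data
        // does not pile up between setup() and the first window.
        connected = true;
    }

    @Override
    public void deactivate() {
        connected = false;
    }

    @Override
    public void teardown() {
        buffer = null;
    }

    void emitTuples() {
        // Only produces data once the simulated connection is open.
        if (connected) {
            buffer.add("tuple-" + tuplesEmitted);
            tuplesEmitted++;
        }
    }
}

class LifecycleSketch {
    // Assumption: Java serialization stands in for checkpoint/restore.
    static SketchInputOperator restoreFromCheckpoint(SketchInputOperator op) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(op);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SketchInputOperator) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("restore failed", e);
        }
    }

    public static void main(String[] args) {
        SketchInputOperator first = new SketchInputOperator();
        first.setup();
        first.activate();
        first.emitTuples();

        // The restored object is a new instance: its transient fields are unset,
        // so both callbacks are invoked again before it processes data.
        SketchInputOperator restored = restoreFromCheckpoint(first);
        restored.setup();
        restored.activate();
        restored.emitTuples();
        System.out.println("tuples emitted so far: " + restored.tuplesEmitted);
    }
}
```

In this sketch either callback could rebuild the buffer, which matches the advice above that setup() is usually sufficient for transient fields. Only the simulated connection is placed in activate(), since that is the step where starting early would let data accumulate before the first window.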