The operator driver code tries to guess when the operator is doing no work and simply returning. The heuristic is the one Ashwin mentioned: for input operators, emitTuples generates no events; for all other operators, the input tuple queue is empty (since all of their processing happens in response to consuming tuples). When the driver detects this condition it has two options: keep checking for work in a tight loop, or sleep for a few milliseconds and then check again. We go with the second option because it results in better resource utilization.
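To make the idle check concrete, here is a minimal sketch of that loop for a non-input operator. It is not the platform's actual driver code: the class, its methods, and the String tuple type are hypothetical, and spinMillis only plays the role of the SPIN_MILLIS attribute.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical, simplified stand-in for the operator driver; not the platform's real code. */
final class IdleDriverSketch {
    private final Queue<String> inputQueue = new ArrayDeque<>();
    private final long spinMillis;   // plays the role of SPIN_MILLIS
    private int processed = 0;
    private int idleSleeps = 0;

    IdleDriverSketch(long spinMillis) {
        this.spinMillis = spinMillis;
    }

    void offer(String tuple) {
        inputQueue.add(tuple);
    }

    /** One pass of the loop: consume a tuple if one is queued, otherwise sleep. */
    void runOnce() {
        String tuple = inputQueue.poll();
        if (tuple != null) {
            processed++;             // the operator would process the tuple here
            return;
        }
        idleSleeps++;                // empty queue: guess that the operator is idle
        try {
            Thread.sleep(spinMillis); // back off instead of checking again immediately
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    int processed() { return processed; }
    int idleSleeps() { return idleSleeps; }
}
```

Calling runOnce() with one queued tuple processes it; calling it again with an empty queue takes the sleep branch.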
For some use cases, however, the user code may want to perform auxiliary operations instead of sleeping, which is why the IdleTimeHandler interface was introduced. If the operator implements IdleTimeHandler, the driver invokes it instead of sleeping for a few milliseconds. Because of the nature of the processing that happens in IdleTimeHandler, the operator driver cannot tell whether the handler actually did meaningful work. So if the handler does nothing, the driver ends up checking for work in a tight loop. That is usually unintentional, so I recommend that if you implement IdleTimeHandler but find you have nothing to do when it is invoked, you sleep for a few milliseconds. The recommended sleep time is already controlled by the OperatorContext.SPIN_MILLIS attribute.

Sandeep's response with the code snippet is spot on for implementing IdleTimeHandler.

--
Chetan

On Wed, Sep 2, 2015 at 11:58 AM, Ashwin Chandra Putta < [email protected]> wrote:

> Invocation of handleIdleTime() is not guaranteed. What is guaranteed is
> that when there are no tuples coming in and no tuples emitted, then
> handleIdleTime is called.
>
> A little more detail, the way it is designed is as follows:
>
> For input operator, when the operator observes that no tuples are emitted
> in the emitTuples call, handleIdleTime is called.
>
> For any other operator, when the operator observes that there are no
> incoming tuples and no tuples are emitted, handleIdleTime is called.
>
> Also, handleIdleTime implementation should only contain the logic to do
> some work if needed. The platform already sleeps if there is no work, so
> there is no need to sleep in the implementation and let the platform handle
> it.
>
> Regards,
> Ashwin.
>
> As I understand, I can get my task done earlier if I have that in
> handleIdleTime() rather than waiting for endWindow().
>
> But can I depend solely on handleIdleTime()?
> Is invocation of handleIdleTime() guaranteed in the operator per window cycle?
>
> On Wed, Sep 2, 2015 at 7:46 AM, Pramod Immaneni <[email protected]>
> wrote:
>
> > The time you spend in handleIdleTime could still be less than a window
> > interval. If you move your processing to end window, since end window is
> > called when end window is received from upstream you would delay the
> > results being sent to downstream.
> >
> > On Wed, Sep 2, 2015 at 1:26 AM, Bhupesh Chawda <[email protected]>
> > wrote:
> >
> > > Hi All,
> > >
> > > I understand that handleIdleTime() is called when the operator is idling
> > > and is intended for auxiliary processing. Also, if the operator does not
> > > have anything to do, it must block for some time to avoid busy loop.
> > > What happens if my processing within handleIdleTime() exceeds the amount of
> > > time it would have blocked otherwise? In that case does it make a
> > > difference whether the processing is done in handleIdleTime() or in
> > > endWindow() call?
> > >
> > > To clarify the question, is this the right approach:
> > >
> > > handleIdleTime()
> > > {
> > >   do some work W;
> > >   t = time to do work W;
> > >   sleep(SPIN_MILLIS - t);
> > > }
> > >
> > > What is the right approach if t > SPIN_MILLIS?
> > >
> > > Thanks.
> > >
> > > --
> > > Regards,
> > > Bhupesh Chawda
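
Coming back to the pseudocode in the original question, here is a minimal sketch of the recommendation at the top of this message: do the auxiliary work when there is some, and sleep only when there is none. This is an illustration under assumptions, not Sandeep's snippet and not platform code. IdleTimeHandlerSketch is a local stand-in for the platform's IdleTimeHandler interface so the example compiles on its own, AuxWorkOperator and its work queue are hypothetical, and spinMillis is assumed to have been read from OperatorContext.SPIN_MILLIS elsewhere.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Local stand-in for the platform's IdleTimeHandler interface (an assumption of this sketch). */
interface IdleTimeHandlerSketch {
    void handleIdleTime();
}

/** Hypothetical operator fragment: runs queued auxiliary work, sleeps only when there is none. */
final class AuxWorkOperator implements IdleTimeHandlerSketch {
    private final Queue<Runnable> pendingWork = new ArrayDeque<>();
    private final long spinMillis;   // assumed to come from OperatorContext.SPIN_MILLIS
    private int sleeps = 0;

    AuxWorkOperator(long spinMillis) {
        this.spinMillis = spinMillis;
    }

    void addWork(Runnable task) {
        pendingWork.add(task);
    }

    @Override
    public void handleIdleTime() {
        Runnable task = pendingWork.poll();
        if (task != null) {
            task.run();               // real work was done, so return without sleeping
            return;
        }
        sleeps++;
        try {
            Thread.sleep(spinMillis); // nothing to do: back off instead of busy-looping
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    int sleeps() { return sleeps; }
}
```

With this shape the handler never computes SPIN_MILLIS - t, so the t > SPIN_MILLIS case does not need special handling: a call that did work returns immediately, and only a call that found nothing to do sleeps.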
