It looks like option 1 is preferred by the community. But let me elaborate why I brought up the option of piggy backing BEGIN and END_WINDOW
Option 2 implicitly enforces that the operations related to the custom control tuple be done at the streaming window boundary. For most operations, it makes sense to have that enforcement. Option 1 opens the door to the possibility of sending and handling control tuples within a window, thus imposing a challenge of ensuring idempotency. In fact, allowing that would make idempotency extremely difficult to achieve. David On Fri, Jun 24, 2016 at 4:38 PM, Vlad Rozov <[email protected]> wrote: > +1 for option 1. > > Thank you, > > Vlad > > > On 6/24/16 14:35, Bright Chen wrote: > >> +1 >> It also can help to Shutdown the application gracefully. >> Bright >> >> On Jun 24, 2016, at 1:35 PM, Siyuan Hua <[email protected]> wrote: >>> >>> +1 >>> >>> I think it's good to have custom control tuple and I prefer the 1 option. >>> >>> Also I think we should think about couple different callbacks, that could >>> be operator level(triggered when an operator receives an control tuple) >>> or >>> dag level(triggered when control tuple flow over the whole dag) >>> >>> Regards, >>> Siyuan >>> >>> >>> >>> >>> On Fri, Jun 24, 2016 at 12:42 PM, David Yan <[email protected]> >>> wrote: >>> >>> My initial thinking is that the custom control tuples, just like the >>>> existing control tuples, will only be generated from the input operators >>>> and will be propagated downstream to all operators in the DAG. So the >>>> NxM >>>> partitioning scenario works just like how other control tuples work, >>>> i.e. >>>> the callback will not be called unless all ports have received the >>>> control >>>> tuple for a particular window. This creates a little bit of complication >>>> with multiple input operators though. >>>> >>>> David >>>> >>>> >>>> On Fri, Jun 24, 2016 at 12:03 PM, Tushar Gosavi <[email protected] >>>> > >>>> wrote: >>>> >>>> +1 for the feature >>>>> >>>>> I am in favor of option 1, but we may need an helper method to avoid >>>>> compiler error on typed port, as calling port.emit(controlTuple) will >>>>> be an error if type of control tuple and port does not match. or new >>>>> method in outputPort object , emitControlTuple(ControlTuple). >>>>> >>>>> Can you give example of piggy backing tuple with current BEGIN_WINDOW >>>>> and END_WINDOW control tuples? >>>>> >>>>> In case of NxM partitioning, each downstream operator will receive N >>>>> control tuples. will it call user handler N times for each downstream >>>>> operator or just once. >>>>> >>>>> Regards, >>>>> - Tushar. >>>>> >>>>> >>>>> >>>>> On Fri, Jun 24, 2016 at 11:52 PM, David Yan <[email protected]> >>>>> >>>> wrote: >>>> >>>>> Hi all, >>>>>> >>>>>> I would like to propose a new feature to the Apex core engine -- the >>>>>> support of custom control tuples. Currently, we have control tuples >>>>>> >>>>> such >>>> >>>>> as >>>>> >>>>>> BEGIN_WINDOW, END_WINDOW, CHECKPOINT, and so on, but we don't have the >>>>>> support for applications to insert their own control tuples. The way >>>>>> currently to get around this is to use data tuples and have a separate >>>>>> >>>>> port >>>>> >>>>>> for such tuples that sends tuples to all partitions of the downstream >>>>>> operators, which is not exactly developer friendly. >>>>>> >>>>>> We have already seen a number of use cases that can use this feature: >>>>>> >>>>>> 1) Batch support: We need to tell all operators of the physical DAG >>>>>> >>>>> when >>>> >>>>> a >>>>> >>>>>> batch starts and ends, so the operators can do whatever that is needed >>>>>> >>>>> upon >>>>> >>>>>> the start or the end of a batch. >>>>>> >>>>>> 2) Watermark: To support the concepts of event time windowing, the >>>>>> watermark control tuple is needed to tell which windows should be >>>>>> considered late. >>>>>> >>>>>> 3) Changing operator properties: We do have the support of changing >>>>>> operator properties on the fly, but with a custom control tuple, the >>>>>> command to change operator properties can be window aligned for all >>>>>> partitions and also across the DAG. >>>>>> >>>>>> 4) Recording tuples: Like changing operator properties, we do have >>>>>> this >>>>>> support now but only at the individual physical operator level, and >>>>>> >>>>> without >>>>> >>>>>> control of which window to record tuples for. With a custom control >>>>>> >>>>> tuple, >>>>> >>>>>> because a control tuple must belong to a window, all operators in the >>>>>> >>>>> DAG >>>> >>>>> can start (and stop) recording for the same windows. >>>>>> >>>>>> I can think of two options to achieve this: >>>>>> >>>>>> 1) new custom control tuple type that takes user's serializable >>>>>> object. >>>>>> >>>>>> 2) piggy back the current BEGIN_WINDOW and END_WINDOW control tuples. >>>>>> >>>>>> Please provide your feedback. Thank you. >>>>>> >>>>>> David >>>>>> >>>>> >
