Overall, I am strongly +1 on this ask. In the pre-open-source days we discussed this and kept it on the table for the future. I believe the time has come.

I would prefer option 1, with the custom control tuple sent between the previous window's END_WINDOW and the new window's BEGIN_WINDOW control tuples. Piggybacking on BEGIN_WINDOW or END_WINDOW may restrict us in the future.

Thanks,
Amol

On Fri, Jun 24, 2016 at 11:22 AM, David Yan <[email protected]> wrote:

> Hi all,
>
> I would like to propose a new feature to the Apex core engine -- the
> support of custom control tuples. Currently, we have control tuples such as
> BEGIN_WINDOW, END_WINDOW, CHECKPOINT, and so on, but we don't have the
> support for applications to insert their own control tuples. The way
> currently to get around this is to use data tuples and have a separate port
> for such tuples that sends tuples to all partitions of the downstream
> operators, which is not exactly developer friendly.
>
> We have already seen a number of use cases that can use this feature:
>
> 1) Batch support: We need to tell all operators of the physical DAG when a
> batch starts and ends, so the operators can do whatever that is needed upon
> the start or the end of a batch.
>
> 2) Watermark: To support the concepts of event time windowing, the
> watermark control tuple is needed to tell which windows should be
> considered late.
>
> 3) Changing operator properties: We do have the support of changing
> operator properties on the fly, but with a custom control tuple, the
> command to change operator properties can be window aligned for all
> partitions and also across the DAG.
>
> 4) Recording tuples: Like changing operator properties, we do have this
> support now but only at the individual physical operator level, and without
> control of which window to record tuples for. With a custom control tuple,
> because a control tuple must belong to a window, all operators in the DAG
> can start (and stop) recording for the same windows.
>
> I can think of two options to achieve this:
>
> 1) new custom control tuple type that takes user's serializable object.
>
> 2) piggy back the current BEGIN_WINDOW and END_WINDOW control tuples.
>
> Please provide your feedback. Thank you.
>
> David
>
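
To make the ordering in option 1 concrete, here is a rough sketch of a stream in which a custom control tuple is delivered after one window's END_WINDOW and before the next window's BEGIN_WINDOW. It is only an illustration under stated assumptions: `ControlTupleSketch`, `Emitter`, `sendControl` and `CUSTOM_CONTROL` are hypothetical names, not Apex APIs, and tagging the control tuple with the id of the window that just ended is my assumption rather than anything decided in this thread.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types are Apex APIs.
final class ControlTupleSketch {
    enum Kind { BEGIN_WINDOW, DATA, END_WINDOW, CUSTOM_CONTROL }

    static final class Tuple {
        final Kind kind;
        final long windowId;
        final Object payload; // null for BEGIN_WINDOW and END_WINDOW

        Tuple(Kind kind, long windowId, Object payload) {
            this.kind = kind;
            this.windowId = windowId;
            this.payload = payload;
        }

        @Override
        public String toString() {
            String base = kind + "(" + windowId + ")";
            return payload == null ? base : base + ":" + payload;
        }
    }

    // Builds an output stream; custom control payloads are queued and
    // released only at the window boundary.
    static final class Emitter {
        private final List<Tuple> out = new ArrayList<>();
        private final List<Serializable> pending = new ArrayList<>();
        private long windowId = 0;

        void beginWindow() {
            windowId++;
            out.add(new Tuple(Kind.BEGIN_WINDOW, windowId, null));
        }

        void emit(Object data) {
            out.add(new Tuple(Kind.DATA, windowId, data));
        }

        // Queue a user-supplied serializable object; it is never
        // interleaved with the data tuples of the current window.
        void sendControl(Serializable payload) {
            pending.add(payload);
        }

        void endWindow() {
            out.add(new Tuple(Kind.END_WINDOW, windowId, null));
            // Assumption: queued control tuples go out right after END_WINDOW,
            // tagged with the id of the window that just ended.
            for (Serializable p : pending) {
                out.add(new Tuple(Kind.CUSTOM_CONTROL, windowId, p));
            }
            pending.clear();
        }

        List<String> render() {
            List<String> lines = new ArrayList<>();
            for (Tuple t : out) {
                lines.add(t.toString());
            }
            return lines;
        }
    }

    public static void main(String[] args) {
        Emitter e = new Emitter();
        e.beginWindow();
        e.emit("a");
        e.sendControl("batch-end"); // requested mid-window
        e.emit("b");
        e.endWindow();              // control tuple is released here
        e.beginWindow();
        e.endWindow();
        e.render().forEach(System.out::println);
    }
}
```

Because the control tuple travels as its own tuple type rather than as a field on BEGIN_WINDOW or END_WINDOW, the existing window tuples stay unchanged in this sketch, which is the flexibility argued for above.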
