Hi all,

I would like to propose a new feature to the Apex core engine -- the
support of custom control tuples. Currently, we have control tuples such as
BEGIN_WINDOW, END_WINDOW, CHECKPOINT, and so on, but we don't have the
support for applications to insert their own control tuples. The current
workaround is to use data tuples and have a separate port for them that
sends the tuples to all partitions of the downstream operators, which is
not exactly developer friendly.

We have already seen a number of use cases that can use this feature:

1) Batch support: We need to tell all operators of the physical DAG when a
batch starts and ends, so the operators can do whatever is needed at the
start or the end of a batch.

2) Watermark: To support event-time windowing, a watermark control tuple
is needed to tell operators which windows should be considered late.

3) Changing operator properties: We already support changing operator
properties on the fly, but with a custom control tuple, the command to
change operator properties can be window aligned for all partitions and
also across the DAG.

4) Recording tuples: As with changing operator properties, we support this
now, but only at the individual physical operator level and without
control over which windows tuples are recorded for. With a custom control
tuple, because a control tuple must belong to a window, all operators in
the DAG can start (and stop) recording for the same windows.
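
To make use case 2 a little more concrete, here is a minimal sketch of how
an operator might react to a watermark control tuple. The class and method
names are hypothetical and not part of Apex, and it assumes that a
watermark is a timestamp in milliseconds and that an event-time window
counts as late once the watermark has reached the end of that window:

```java
// Hypothetical helper for illustration only; not an existing Apex class.
final class WatermarkTracker {
    // Latest watermark seen so far; starts below every possible timestamp.
    private long watermarkMillis = Long.MIN_VALUE;

    // Called when a watermark control tuple arrives.
    // Assumption: watermarks never move backwards, so older values are ignored.
    void onWatermark(long timestampMillis) {
        watermarkMillis = Math.max(watermarkMillis, timestampMillis);
    }

    // Assumption: an event-time window [start, end) is late once the
    // watermark has reached its end timestamp.
    boolean isLate(long windowEndMillis) {
        return windowEndMillis <= watermarkMillis;
    }
}
```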

I can think of two options to achieve this:

1) A new custom control tuple type that carries a user-supplied
serializable object.

2) Piggybacking on the existing BEGIN_WINDOW and END_WINDOW control tuples.
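
For option 1, something along these lines is what I have in mind. This is
only a rough sketch: CustomControlTuple is a hypothetical name, not an
existing Apex class, and plain Java serialization is used here purely for
illustration; the engine may well use a different codec.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical envelope for option 1; not an existing Apex class.
final class CustomControlTuple implements Serializable {
    private static final long serialVersionUID = 1L;

    private final long windowId;        // window the control tuple belongs to
    private final Serializable payload; // user-defined content

    CustomControlTuple(long windowId, Serializable payload) {
        this.windowId = windowId;
        this.payload = payload;
    }

    long getWindowId() { return windowId; }

    Serializable getPayload() { return payload; }

    // Serializes the tuple with standard Java serialization (illustrative choice).
    byte[] toBytes() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(this);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Restores a tuple previously produced by toBytes().
    static CustomControlTuple fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (CustomControlTuple) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

An input operator could then emit, say, new CustomControlTuple(windowId,
"BATCH_START"), and every downstream partition would receive the same
payload within the same window.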

Please provide your feedback. Thank you.

David
