Re: [DISCUSSION] Custom Control Tuples

Vlad Rozov Fri, 24 Jun 2016 16:38:32 -0700

+1 for option 1.

Thank you,


Vlad

On 6/24/16 14:35, Bright Chen wrote:

+1
It also can help to Shutdown the application gracefully.
Bright

On Jun 24, 2016, at 1:35 PM, Siyuan Hua <[email protected]> wrote:

+1

I think it's good to have custom control tuple and I prefer the 1 option.

Also I think we should think about couple different callbacks, that could
be operator level(triggered when an operator receives an control tuple) or
dag level(triggered when control tuple flow over the whole dag)

Regards,
Siyuan




On Fri, Jun 24, 2016 at 12:42 PM, David Yan <[email protected]> wrote:

My initial thinking is that the custom control tuples, just like the
existing control tuples, will only be generated from the input operators
and will be propagated downstream to all operators in the DAG. So the NxM
partitioning scenario works just like how other control tuples work, i.e.
the callback will not be called unless all ports have received the control
tuple for a particular window. This creates a little bit of complication
with multiple input operators though.

David


On Fri, Jun 24, 2016 at 12:03 PM, Tushar Gosavi <[email protected]>
wrote:

+1 for the feature

I am in favor of option 1, but we may need an helper method to avoid
compiler error on typed port, as calling port.emit(controlTuple) will
be an error if type of control tuple and port does not match. or new
method in outputPort object , emitControlTuple(ControlTuple).

Can you give example of piggy backing tuple with current BEGIN_WINDOW
and END_WINDOW control tuples?

In case of NxM partitioning, each downstream operator will receive N
control tuples. will it call user handler N times for each downstream
operator or just once.

Regards,
- Tushar.



On Fri, Jun 24, 2016 at 11:52 PM, David Yan <[email protected]>

wrote:

Hi all,

I would like to propose a new feature to the Apex core engine -- the
support of custom control tuples. Currently, we have control tuples

such

as

BEGIN_WINDOW, END_WINDOW, CHECKPOINT, and so on, but we don't have the
support for applications to insert their own control tuples. The way
currently to get around this is to use data tuples and have a separate

port

for such tuples that sends tuples to all partitions of the downstream
operators, which is not exactly developer friendly.

We have already seen a number of use cases that can use this feature:

1) Batch support: We need to tell all operators of the physical DAG

when

batch starts and ends, so the operators can do whatever that is needed

upon

the start or the end of a batch.

2) Watermark: To support the concepts of event time windowing, the
watermark control tuple is needed to tell which windows should be
considered late.

3) Changing operator properties: We do have the support of changing
operator properties on the fly, but with a custom control tuple, the
command to change operator properties can be window aligned for all
partitions and also across the DAG.

4) Recording tuples: Like changing operator properties, we do have this
support now but only at the individual physical operator level, and

without

control of which window to record tuples for. With a custom control

tuple,

because a control tuple must belong to a window, all operators in the

DAG

can start (and stop) recording for the same windows.

I can think of two options to achieve this:

1) new custom control tuple type that takes user's serializable object.

2) piggy back the current BEGIN_WINDOW and END_WINDOW control tuples.

Please provide your feedback. Thank you.

David

Re: [DISCUSSION] Custom Control Tuples

Reply via email to