David,

Your suggestion makes sense. It will also help with idempotency.
I'll also rename the interface to WatermarkGenerator.

I have 2 more questions:
1. AbstractWindowedOperator currently has fixedWatermark. Should that be
moved out of AbstractWindowedOperator and provided as an implementation
of WatermarkGenerator?

2. Should we add a setter method to this interface to pass the DataStorage
and retractionStorage objects to the implementation? This could be useful
for any advanced implementation of watermark generation.
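
For question 1, a rough sketch of how a fixed watermark could sit behind the proposed two-method interface. This is not Malhar code: TimestampedTuple, Watermark and FixedLagWatermarkGenerator are hypothetical stand-ins for Tuple.WindowedTuple, ControlTuple.Watermark and the moved-out implementation. It also assumes that "fixed" means trailing the highest tuple timestamp seen by a fixed number of milliseconds, which may differ from how AbstractWindowedOperator computes it today.

```java
// Hypothetical stand-in for Tuple.WindowedTuple; the real type has a richer API.
final class TimestampedTuple<T> {
    final long timestamp;
    final T value;

    TimestampedTuple(long timestamp, T value) {
        this.timestamp = timestamp;
        this.value = value;
    }
}

// Hypothetical stand-in for ControlTuple.Watermark.
final class Watermark {
    final long timestamp;

    Watermark(long timestamp) {
        this.timestamp = timestamp;
    }
}

// The two-method shape proposed in this thread.
interface WatermarkGenerator<T> {
    // Called for each input tuple; only updates the generator's state.
    void processTupleForWatermark(TimestampedTuple<T> tuple);

    // Called at endWindow; returns null if no watermark is available.
    Watermark getWatermarkTuple();
}

// Assumed semantics: watermark = highest tuple timestamp seen - fixed lag.
final class FixedLagWatermarkGenerator<T> implements WatermarkGenerator<T> {
    private final long lagMillis;
    private boolean seenTuple = false;
    private long maxTimestamp;
    private boolean emitted = false;
    private long lastEmitted;

    FixedLagWatermarkGenerator(long lagMillis) {
        this.lagMillis = lagMillis;
    }

    @Override
    public void processTupleForWatermark(TimestampedTuple<T> tuple) {
        if (!seenTuple || tuple.timestamp > maxTimestamp) {
            maxTimestamp = tuple.timestamp;
            seenTuple = true;
        }
    }

    @Override
    public Watermark getWatermarkTuple() {
        if (!seenTuple) {
            return null; // nothing observed yet
        }
        long candidate = maxTimestamp - lagMillis;
        if (emitted && candidate <= lastEmitted) {
            return null; // watermark has not advanced since the last call
        }
        emitted = true;
        lastEmitted = candidate;
        return new Watermark(candidate);
    }
}

class FixedLagDemo {
    public static void main(String[] args) {
        WatermarkGenerator<String> gen = new FixedLagWatermarkGenerator<>(100L);
        gen.processTupleForWatermark(new TimestampedTuple<>(1000L, "a"));
        gen.processTupleForWatermark(new TimestampedTuple<>(1500L, "b"));
        Watermark w = gen.getWatermarkTuple(); // what endWindow would call
        System.out.println(w == null ? "no watermark" : "watermark at " + w.timestamp);
    }
}
```

Because getWatermarkTuple is called once per streaming window and depends only on the tuples seen so far, replaying the same windows should yield the same watermarks, which is the idempotency benefit mentioned above.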

-Chinmay.



On Tue, Nov 29, 2016 at 2:19 AM, David Yan <[email protected]> wrote:

> +1 for the feature.
>
> But having multiple watermark tuples within one streaming window is not
> useful, because WindowedOperator currently processes watermarks only at
> endWindow. So how about:
>
> void processTupleForWatermark(Tuple.WindowedTuple<InputT> tuple);
> // to be called for each input tuple, updates the state of the impl of
> // WatermarkGenerator interface.
>
> ControlTuple.Watermark getWatermarkTuple();
> // to be called at endWindow. returns null if no watermark is available
>
> This is actually somewhat related to the separate debate on this list about
> control tuples being delivered only at streaming window boundary.
>
> David
>
> On Thu, Nov 24, 2016 at 2:52 AM, Chinmay Kolhatkar <[email protected]>
> wrote:
>
> > Dear Community,
> >
> > I'm working on adding support for heuristic watermarks in
> > WindowedOperator.
> > A heuristic watermark gives users of WindowedOperator a way to logically
> > determine whether the watermark condition is met by inspecting the
> > tuples received.
> > This can replace, or work alongside, the control tuple
> > received on the control port.
> >
> > Here is the approach I'm considering:
> >
> > 1. A new interface, let's say "HeuristicWatermark", will be added, which
> > extends Component<Context.OperatorContext>.
> > It extends Component so that it can follow a
> > lifecycle.
> >
> > 2. This interface contains a single method, something like this:
> >
> > ControlTuple.Watermark processTupleForWatermark(
> >     Tuple.WindowedTuple<InputT> input);
> >
> > 3. An object of this type can optionally be set on AbstractWindowedOperator
> > as a plugin which identifies whether the watermark condition has been
> > reached.
> >
> > 4. If heuristicWatermark is set, processTupleForWatermark will be called
> > for every received tuple, and the method returns the Watermark object
> > if the watermark condition is met, or null otherwise.
> >
> > 5. If the return value of this method is non-null, the processWatermark
> > method will be called, which sets the nextWatermark value. The rest of the
> > watermark processing then continues to happen in endWindow.
> >
> >
> > Please share your opinion on the above approach.
> >
> > Thanks,
> > Chinmay.
> >
>
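
The flow in steps 1 to 5 of the original proposal quoted above can be sketched roughly as follows. This is only an illustration and not Malhar code: WatermarkTuple, HeuristicWatermark (simplified to take a timestamp and a value, without the Component lifecycle) and WindowedOperatorSketch are hypothetical stand-ins, and the "EOF" marker heuristic in main is an invented example of a watermark condition.

```java
// Hypothetical stand-in for ControlTuple.Watermark.
final class WatermarkTuple {
    final long timestamp;

    WatermarkTuple(long timestamp) {
        this.timestamp = timestamp;
    }
}

// Step 2: a single method that inspects each tuple and returns a watermark
// when the condition is met, or null otherwise.
interface HeuristicWatermark<T> {
    WatermarkTuple processTupleForWatermark(long tupleTimestamp, T value);
}

// Minimal stand-in for the operator side of steps 3 to 5.
final class WindowedOperatorSketch<T> {
    private HeuristicWatermark<T> heuristicWatermark; // optional plugin (step 3)
    private long nextWatermark = -1L;
    private long currentWatermark = -1L;

    void setHeuristicWatermark(HeuristicWatermark<T> heuristicWatermark) {
        this.heuristicWatermark = heuristicWatermark;
    }

    // Step 4: consult the plugin for every received tuple, if one is set.
    void processTuple(long tupleTimestamp, T value) {
        if (heuristicWatermark != null) {
            WatermarkTuple w = heuristicWatermark.processTupleForWatermark(tupleTimestamp, value);
            if (w != null) {
                processWatermark(w); // step 5
            }
        }
    }

    // Step 5: only records the watermark; the real work is deferred.
    void processWatermark(WatermarkTuple w) {
        nextWatermark = w.timestamp;
    }

    // The rest of the watermark processing happens at the window boundary.
    void endWindow() {
        if (nextWatermark > currentWatermark) {
            currentWatermark = nextWatermark;
        }
    }

    long getCurrentWatermark() {
        return currentWatermark;
    }
}

class HeuristicWatermarkDemo {
    public static void main(String[] args) {
        WindowedOperatorSketch<String> op = new WindowedOperatorSketch<>();
        // Invented heuristic: an "EOF" marker value signals the watermark.
        op.setHeuristicWatermark((ts, v) -> "EOF".equals(v) ? new WatermarkTuple(ts) : null);
        op.processTuple(10L, "a");
        op.processTuple(30L, "EOF");
        op.endWindow();
        System.out.println("current watermark: " + op.getCurrentWatermark());
    }
}
```

Since only the last non-null watermark in a streaming window survives until endWindow, any earlier ones in the same window are overwritten, which is the point raised in the reply about multiple watermark tuples per window.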
