Chinmay: My response is inline:
On Mon, Nov 28, 2016 at 9:47 PM, Chinmay Kolhatkar <[email protected]> wrote: > David, > > Your suggestion make sense. This will also help in the idempotency. > I'll also change the name of interface to WatermarkGenerator. > > I had 2 more questions here: > 1. AbstractWindowedOperator currently has fixedWatermark right now. Should > that be moved out of AbstractWindowedOperator and provided that as an > implementation of WatermarkGenerator? > Yes, this makes sense. > > 2. Should we have a setter method to this interface to provide object of > DataStorage, retractionStorage to the implementation? This can be useful > for any advanced implementation of watermark generation. > > The storage objects can be passed as part of a context object when constructing the WatermarkGenerator object. In the future, we can add other stuff to the context. > -Chinmay. > > > > On Tue, Nov 29, 2016 at 2:19 AM, David Yan <[email protected]> wrote: > > > +1 for the feature. > > > > But since having multiple watermark tuples within one streaming window is > > not useful because WindowedOperator currently only processes watermark > only > > at endWindow, how about: > > > > void processTupleForWatermark(Tuple.WindowedTuple<InputT> tuple); > > // to be called for each input tuple, updates the state of the impl of > > WatermarkGenerator interface. > > > > ControlTuple.Watermark getWatermarkTuple(); > > // to be called at endWindow. return null if watermark is not available > > > > This is actually somewhat related to the separate debate on this list > about > > control tuples being delivered only at streaming window boundary. > > > > David > > > > On Thu, Nov 24, 2016 at 2:52 AM, Chinmay Kolhatkar <[email protected]> > > wrote: > > > > > Dear Community, > > > > > > I'm working on adding support for heuristic watermark in Windowed > > Operator. > > > Heuristic watermark give users of WindowedOperator a way to logically > > > determine whether watermark condition is met or not by inspecting the > > > tuples received. > > > This can act as a replacement for or way to work along with Control > Tuple > > > received on control port. > > > > > > Here is the approach I'm considering: > > > > > > 1. A new interface lets say "HeuristicWatermark" will be added which > > > extends Component<Context.OperatorContext> > > > The reason why its extended with Component is then it can follow a > > > lifecycle. > > > > > > 2. This method contains a single method something like this: > > > > > > ControlTuple.Watermark processTupleForWatermark( > > > Tuple.WindowedTuple<InputT> > > > input); > > > > > > 3. Object of this type can optionally be set to > AbstractWindowedOperator > > as > > > a plugin which identified whether watermark condition has reached. > > > > > > 4. If heuristicWatermark is set, processTupleForWatermark will be > called > > > for every received tuple and the method can return the Watermark object > > if > > > watermark condition is met OR return null if not so. > > > > > > 5. If return value of this method is non-null, then processWatermark > > method > > > will be called which sets the nextWatermark value. And then rest of the > > > watermark processing can continue to happen in endWindow. > > > > > > > > > Please share your opinion on above approach. > > > > > > Thanks, > > > Chinmay. > > > > > >
