Is #5 same as https://malhar.atlassian.net/browse/APEX-78

It would be good as we can leverage that in other operators also.

On Wed, Nov 4, 2015 at 10:41 AM, David Yan <da...@datatorrent.com> wrote:

> Thanks Pramod and Sandesh for offering to help.
>
> Here's my rough plan:
>
> 1. Add DelayOperator interface.  The DelayOperator will have a method
> firstWindow(long windowId).  Implementations of this interface is supposed
> to emit tuples for the first window of the execution of the operator
> (either the first window of the execution of the application or the first
> window after recovery).
>
> 2. Add SimpleDelayOperator that implements DelayOperator.  It has one input
> port and one output port.  It simply passes the tuples from the input port
> to the output port and does not do anything for firstWindow() call.
>
> 3. Engine (e.g. DAG validation) to support loops in case of DelayOperator.
>
> 4. Implement the +1 delay in the engine for input ports that are connected
> to output ports of DelayOperator.
>
> 5. Add capability in the engine to let the operator know when is the next
> checkpoint window.
>
> 6. Add DefaultDelayOperator that extends SimpleDelayOperator.  It writes
> the tuples in the window before each checkpoint to a DFS-backed WAL (using
> item 5 above), and it overrides firstWindow() to read from the WAL and
> emits the tuples at recovery.
>
> 7. Add +N delay capability (in addition to +1) in the DefaultDelayOperator.
>
> Let me know whether this plan sounds good to you.
>
> I'm done with 1, 2, and 3.  4 is in progress.
> I think the bulk of the work is 5 and 6 and we can discuss how we can
> divide the work.
> 7 is a nice-to-have and does not have to be done unless there is a demand.
>
> David
>
>
> On Wed, Nov 4, 2015 at 10:11 AM, Pramod Immaneni <pra...@datatorrent.com>
> wrote:
>
> > I would like to help. I might be able to pick up some of the smaller
> tasks.
> >
> > On Wed, Nov 4, 2015 at 10:05 AM, David Yan <da...@datatorrent.com>
> wrote:
> >
> > > Thank you for all your feedback.  Looks like option #2 wins.
> > >
> > > I will be working on this in November and please let me know if you'd
> > like
> > > to join the effort!
> > >
> > > On Wed, Nov 4, 2015 at 9:58 AM, Thomas Weise <tho...@datatorrent.com>
> > > wrote:
> > >
> > > > Agreed, there is no ambiguity.
> > > >
> > > > #2 will also allow the user to tune locality as there are no implicit
> > > > streams as opposed to the unifier like approach.
> > > >
> > > > On Wed, Nov 4, 2015 at 9:54 AM, David Yan <da...@datatorrent.com>
> > wrote:
> > > >
> > > > > On Tue, Nov 3, 2015 at 9:57 PM, Thomas Weise <
> tho...@datatorrent.com
> > >
> > > > > wrote:
> > > > >
> > > > > >
> > > > > > #2 will address that. But if an operator with the delay interface
> > has
> > > > > > multiple input ports, on which port will the engine perform the
> > > delay?
> > > > > > Maybe we will need to validate that a delay operator can only
> have
> > a
> > > > > single
> > > > > > input port?
> > > > > >
> > > > >
> > > > > My understanding is that the engine performs the +1 delay on the
> > input
> > > > > ports of operators that are connected to output ports of delay
> > > operators.
> > > > > So whether or not the delay operator has multiple input ports
> should
> > > not
> > > > > matter.
> > > > >
> > > >
> > >
> >
>

Reply via email to