Huge +1. I think this is a wonderful idea that, if implemented, will make
major improvements in many areas of Beam:
- As you said, eliminating a class of data loss bugs and highly unintuitive
behaviors or errors (e.g. with CoGroupByKey)
- As per the thread "Callbacks/other functions run after a PDone/output
transform", your document also gives a conceptual framework under which to
think about the interaction of sinks and triggers, which enables sensible
semantics for "do X after Y is done"
- Most importantly, it just makes sense :) Current triggers unintuitively
treat triggering as something inherent to a PCollection rather than to a
processing step, even though triggers are clearly an operational device.
Btw, do you think windowing should also be separate from a PCollection?
I'm not sure whether it should be considered operational or not.

Also, I think anyone reading this document really ought to at least skim
http://s.apache.org/beam-streams-tables (linked from it) and
internalize the idea of "PCollections as changelogs, aggregations as tables
on which the changelog acts". It probably would be good to rewrite our
documentation with this in mind: even with my experience on the Beam team,
this simple idea made it much easier for me to think clearly about all the
concepts.
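
To make the "changelog acting on a table" picture concrete, here is a toy
sketch in plain Python. It is not Beam API: the function name, the per-key
sum, and the sample records are all made up for illustration. It only shows
each record of a stream being applied as an update to a keyed table.

```python
from collections import defaultdict


def apply_changelog(changelog):
    """Fold a stream of (key, value) records into a per-key sum table.

    Hypothetical helper, not part of Beam. Returns the final table and a
    snapshot of the table taken after each record is applied.
    """
    table = defaultdict(int)
    snapshots = []
    for key, value in changelog:
        table[key] += value            # the record acts on the table
        snapshots.append(dict(table))  # what a reader would see at this point
    return dict(table), snapshots


# Illustrative records only.
changelog = [("a", 1), ("b", 2), ("a", 3)]
table, snapshots = apply_changelog(changelog)
```

In this picture, a trigger would roughly correspond to deciding which of
those snapshots actually get written out downstream.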

I'm very excited about both of these ideas. I think they rival in
importance the idea of batch/streaming unification and will end up being a
fundamental part of the future of the Beam model.

On Thu, Nov 30, 2017 at 8:52 PM Jean-Baptiste Onofré <[email protected]>
wrote:

> Hi Kenn,
>
> very interesting idea. It sounds more usable and "logic".
>
> Regards
> JB
>
> On 11/30/2017 09:06 PM, Kenneth Knowles wrote:
> > Hi all,
> >
> > Triggers are one of the more novel aspects of Beam's support for
> unbounded data.
> > They are also one of the most challenging aspects of the model.
> >
> > Ben & I have been working on a major new idea for how triggers could
> work in the
> > Beam model. We think it will make triggers much more usable, create new
> > opportunities for no-knobs execution/optimization, and improve
> compatibility
> > with DSLs like SQL. (also eliminate a whole class of bugs)
> >
> > Triggering is for sinks!
> >
> > https://s.apache.org/beam-sink-triggers
> >
> > Please take a look at this "1"-pager and give feedback.
> >
> > Kenn
>
> --
> Jean-Baptiste Onofré
> [email protected]
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
