Huge +1. I think this is a wonderful idea that, if implemented, will make major improvements in many areas of Beam:

- As you said, eliminating a class of data loss bugs and highly unintuitive behaviors or errors (eg CoGbk)
- As per the thread "Callbacks/other functions run after a PDone/output transform", your document also gives a conceptual framework under which to think about the interaction of sinks and triggers, which enables sensible semantics for "do X after Y is done"
- Most importantly, it just makes sense :) unlike current triggers, which unintuitively say that triggering is something inherent to a PCollection rather than to a processing step (triggers are clearly an operational device).

Btw do you think windowing also should be separate from a PCollection? I'm not sure whether it should be considered operational or not.

Also, I think anyone reading this document really ought to at least skim http://s.apache.org/beam-streams-tables (linked from it) and internalize the idea of "PCollections as changelogs, aggregations as tables on which the changelog acts". It would probably be good to rewrite our documentation with this in mind: even with my experience on the Beam team, this simple idea made it much easier for me to think clearly about all the concepts.

I'm very excited about both of these ideas. I think they rival in importance the idea of batch/streaming unification and will end up being a fundamental part of the future of the Beam model.

On Thu, Nov 30, 2017 at 8:52 PM Jean-Baptiste Onofré <[email protected]> wrote:

> Hi Kenn,
>
> very interesting idea. It sounds more usable and "logic".
>
> Regards
> JB
>
> On 11/30/2017 09:06 PM, Kenneth Knowles wrote:
> > Hi all,
> >
> > Triggers are one of the more novel aspects of Beam's support for unbounded data.
> > They are also one of the most challenging aspects of the model.
> >
> > Ben & I have been working on a major new idea for how triggers could work in the
> > Beam model. We think it will make triggers much more usable, create new
> > opportunities for no-knobs execution/optimization, and improve compatibility
> > with DSLs like SQL. (also eliminate a whole class of bugs)
> >
> > Triggering is for sinks!
> >
> > https://s.apache.org/beam-sink-triggers
> >
> > Please take a look at this "1"-pager and give feedback.
> >
> > Kenn
>
> --
> Jean-Baptiste Onofré
> [email protected]
> http://blog.nanthrax.net
> Talend - http://www.talend.com
