[jira] [Commented] (BEAM-101) Data-driven triggers

Kenneth Knowles (JIRA) Thu, 01 Jun 2017 10:49:02 -0700

    [ 
https://issues.apache.org/jira/browse/BEAM-101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033379#comment-16033379
 ]


Kenneth Knowles commented on BEAM-101:
--------------------------------------

I believe you could address the use case in a couple of ways:

1. A {{DoFn}} that uses state and timers to implement this behavior. You can do 
essentially any custom triggering with this. The only issue is that your runner 
needs to support it.
2. The {{CombineFn}} approach does not work as described. You cannot apply it 
right at the GBK because the element type may not match. You cannot apply it 
right at the {{Window.into}} because one element may lead to many output 
elements, and there is not really a good story for propagating metadata in 
that case. A {{CombineFn<Instant, AccumT, Boolean>}} over timestamps could 
work, though.
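
To illustrate option 1, here is a minimal plain-Java model of the 
state-and-timers pattern. It is only a sketch: {{SumOrDeadlineTrigger}} and 
its methods are hypothetical names, not Beam API, and the firing rule (running 
sum reaches a threshold, or a deadline passes) is an assumed example. In a real 
pipeline the buffer and sum would live in the DoFn's per-key-and-window state 
and the deadline would be a timer, both provided by the runner.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of the pattern only; none of these names are Beam API.
final class SumOrDeadlineTrigger {
    private final long sumThreshold;    // fire once the running sum reaches this value
    private final long maxDelayMillis;  // or once this long has passed since the first element
    private final List<Long> buffer = new ArrayList<>(); // "state": buffered elements
    private long sum = 0;                                // "state": running sum
    private long deadlineMillis = Long.MAX_VALUE;        // "timer": unset while the buffer is empty

    SumOrDeadlineTrigger(long sumThreshold, long maxDelayMillis) {
        this.sumThreshold = sumThreshold;
        this.maxDelayMillis = maxDelayMillis;
    }

    /** Handles one element (assumed non-negative); returns the emitted pane, or an empty list. */
    List<Long> onElement(long value, long nowMillis) {
        if (buffer.isEmpty()) {
            // The first element of a pane sets the timer.
            deadlineMillis = nowMillis + maxDelayMillis;
        }
        buffer.add(value);
        sum += value;
        if (sum >= sumThreshold) {
            return flush(); // data-driven firing
        }
        return new ArrayList<>();
    }

    /** Advances time; fires if the timer deadline has been reached. */
    List<Long> onTime(long nowMillis) {
        if (nowMillis >= deadlineMillis) {
            return flush(); // timer-driven firing
        }
        return new ArrayList<>();
    }

    /** Emits the buffered elements and resets both the state and the timer. */
    private List<Long> flush() {
        List<Long> pane = new ArrayList<>(buffer);
        buffer.clear();
        sum = 0;
        deadlineMillis = Long.MAX_VALUE;
        return pane;
    }
}
```

The model keeps everything in one object per key and window, which is roughly 
the scope that per-key state and timers have in a stateful {{DoFn}}.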

The other trouble is that including a {{CombineFn}} in a trigger is not as 
portable; it needs a different execution strategy that calls a UDF, possibly 
over the Fn API. Today, triggers are just syntax, so any runner can execute 
them without calling user code. The most coherent approach I know of (which is 
not really fleshed out) is to do the combine on a PCollection explicitly and 
then have a trigger that just references that PCollection. The runner then 
just needs to be able to decode a bool, not run a {{CombineFn}}.
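
As a rough sketch of the combine step that would produce such a bool, the 
following plain-Java code mirrors the shape {{CombineFn<Instant, AccumT, 
Boolean>}}. The interface {{SimpleCombineFn}} is a hypothetical stand-in that 
copies the four core method names of Beam's {{CombineFn}}, 
{{LatestTimestampReached}} and its cutoff rule are assumed for illustration, 
and {{java.time.Instant}} is used only to keep the example free of external 
dependencies. The accumulator is the latest timestamp seen, so it only moves 
up under max, and the output is a monotonic predicate on it, in line with the 
lattice requirement in the issue description.

```java
import java.time.Instant;

// Hypothetical stand-in that mirrors the core methods of a Beam CombineFn.
interface SimpleCombineFn<InputT, AccumT, OutputT> {
    AccumT createAccumulator();
    AccumT addInput(AccumT accumulator, InputT input);
    AccumT mergeAccumulators(Iterable<AccumT> accumulators);
    OutputT extractOutput(AccumT accumulator);
}

// Accumulator: latest timestamp seen so far (max is a lattice join).
// Output: true once that timestamp has reached the cutoff; once true,
// adding or merging more input can never turn it back to false.
final class LatestTimestampReached implements SimpleCombineFn<Instant, Instant, Boolean> {
    private final Instant cutoff; // assumed, user-chosen threshold

    LatestTimestampReached(Instant cutoff) {
        this.cutoff = cutoff;
    }

    @Override
    public Instant createAccumulator() {
        return Instant.MIN; // bottom of the lattice: nothing seen yet
    }

    @Override
    public Instant addInput(Instant accumulator, Instant input) {
        return input.isAfter(accumulator) ? input : accumulator;
    }

    @Override
    public Instant mergeAccumulators(Iterable<Instant> accumulators) {
        Instant merged = createAccumulator();
        for (Instant accumulator : accumulators) {
            merged = addInput(merged, accumulator);
        }
        return merged;
    }

    @Override
    public Boolean extractOutput(Instant accumulator) {
        return !accumulator.isBefore(cutoff); // "are we high enough up the lattice?"
    }
}
```

Under the approach above, only the resulting Boolean would need to reach the 
trigger; how a trigger would reference it is left open here, as it is in the 
comment.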

In my mind, a data-driven trigger means a trigger that is aware of the details of 
the data type. A timestamp-driven trigger would not really be data-driven in 
this way. But until we have some clear design for custom triggers, we could 
definitely consider adding new syntax to triggers for particular common uses. 
If the existing solutions don't work for you, please open a JIRA for your 
specific use case.

> Data-driven triggers
> --------------------
>
>                 Key: BEAM-101
>                 URL: https://issues.apache.org/jira/browse/BEAM-101
>             Project: Beam
>          Issue Type: New Feature
>          Components: beam-model
>            Reporter: Robert Bradshaw
>
> For some applications, it's useful to declare a pane/window to be emitted (or 
> finished) based on its contents. The simplest of these is the AfterCount 
> trigger, but more sophisticated predicates could be constructed.
> The requirements for consistent trigger firing are essentially that the state 
> of the trigger form a lattice and that the "should fire?" question is a 
> monotonic predicate on the lattice. Basically it asks "are we high enough up 
> the lattice?"
> Because the element types may change between the application of Windowing and 
> the actuation of the trigger, one idea is to extract the relevant data from 
> the element at Windowing and pass it along implicitly where it can be 
> combined and inspected in a type safe way later (similar to how timestamps 
> and windows are implicitly passed with elements).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
