[
https://issues.apache.org/jira/browse/BEAM-101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033379#comment-16033379
]
Kenneth Knowles commented on BEAM-101:
--------------------------------------
I believe you could address the use case in a couple of ways:
1. A {{DoFn}} that uses state and timers to implement this behavior. You can do
essentially any custom triggering with this. The only issue is that your runner
needs to support it.
2. The approach of a {{CombineFn}} does not work as described - you cannot
apply it right at the GBK because the element type may not match. You cannot
apply it right at the {{Window.into}} because the element may lead to many
output elements and there's not really a good story around propagating metadata
in that case. You could have a {{CombineFn<Instant, AccumT, Boolean>}} and it
could work.
The other trouble is that including a {{CombineFn}} in a trigger is not as
portable; it needs a different execution strategy that calls a UDF, possibly
over the Fn API. Today, triggers are just syntax, so they can be executed. The
most coherent approach I know of (which is not really fleshed out) is to do the
combine on a PCollection explicitly and then have a trigger that just
references that PCollection. The runner then just needs to be able to decode a
bool, not run a {{CombineFn}}.
In my mind, data-driven trigger means a trigger that is aware of the details of
the data type. A timestamp-driven trigger would not really be data-driven in
this way. But until we have some clear design for custom triggers, we could
definitely consider adding new syntax to triggers for particular common uses.
If the existing solutions don't work for you, please open a JIRA for your
specific use case.
> Data-driven triggers
> --------------------
>
> Key: BEAM-101
> URL: https://issues.apache.org/jira/browse/BEAM-101
> Project: Beam
> Issue Type: New Feature
> Components: beam-model
> Reporter: Robert Bradshaw
>
> For some applications, it's useful to declare a pane/window to be emitted (or
> finished) based on its contents. The simplest of these is the AfterCount
> trigger, but more sophisticated predicates could be constructed.
> The requirements for consistent trigger firing are essentially that the state
> of the trigger form a lattice and that the "should fire?" question is a
> monotonic predicate on the lattice. Basically it asks "are we high enough up
> the lattice?"
> Because the element types may change between the application of Windowing and
> the actuation of the trigger, one idea is to extract the relevant data from
> the element at Windowing and pass it along implicitly where it can be
> combined and inspected in a type safe way later (similar to how timestamps
> and windows are implicitly passed with elements).
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)