Hi,
I have left a note regarding the proposed split of this transform into
separate batch and streaming expansions. In general, the need for such a
split makes me doubtful. It signals that either
a) the transform does something it should not, or
b) the Beam model is not complete in terms of being "unified".
The problem described in the document is that in the batch case
timers are not fired appropriately. This is actually one of the
motivations that led to the introduction of the @RequiresTimeSortedInput
annotation and, though this was raised as a question years ago, I do not
remember what arguments were used against making the sorting of inputs
by timestamp in the batch stateful DoFn a requirement of the model. That
would enable the appropriate firing of timers while preserving the batch
invariant that no late data is allowed. IIRC there are
runners that do this sorting by default (at least the sorting, not sure
about the timers, but once inputs are sorted, firing timers is simple).
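To illustrate why firing timers becomes simple once inputs are sorted,
here is a minimal sketch in plain Python. It does not use any Beam API;
the function name run_batch_sorted, the timer_delay parameter and the
"one timer per element" behaviour are all hypothetical and only serve
the illustration. It also assumes that a timer set exactly at an
element's timestamp fires before that element is processed.

```python
import heapq


def run_batch_sorted(elements, timer_delay):
    """Simulate one key of a stateful DoFn over a bounded, time-sorted input.

    elements: iterable of (timestamp, value) pairs in arbitrary order.
    timer_delay: hypothetical delay; each element sets a timer at
        timestamp + timer_delay.
    Returns the ordered log of element and timer events.
    """
    log = []
    timers = []  # min-heap of pending event-time timer timestamps

    # Sorting by timestamp means no element can arrive "late".
    for ts, value in sorted(elements, key=lambda e: e[0]):
        # The current element's timestamp acts as the watermark, so every
        # timer due at or before it fires first (assumption: <= semantics).
        while timers and timers[0] <= ts:
            log.append(("timer", heapq.heappop(timers)))
        log.append(("element", ts, value))
        heapq.heappush(timers, ts + timer_delay)

    # End of bounded input: the watermark moves to the end of time,
    # so all remaining timers fire in timestamp order.
    while timers:
        log.append(("timer", heapq.heappop(timers)))
    return log


if __name__ == "__main__":
    for event in run_batch_sorted([(5, "b"), (1, "a"), (12, "c")], timer_delay=3):
        print(event)
```

Without the sort, the timer set by the element at timestamp 1 could
only fire after all elements had been seen, which is the batch
behaviour described in the document.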
A different question is whether this particular transform should perhaps
fire by processing time rather than by event time.
Best,
Jan
On 2/21/24 03:00, Robert Burke wrote:
Thanks for the design, Damon! And thanks for collaborating with me on getting a
high-level textual description of the key implementation idea down in writing.
I think the solution is pretty elegant.
I do have concerns about how different Runners might handle
ProcessContinuations for the Bounded Input case. I know Dataflow famously has
two different execution modes under the hood, but I agree with the principle
that ProcessContinuation.Resume should largely be in line with the expected
delay, though it's by no means guaranteed AFAIK.
We should also ensure this is linked from https://s.apache.org/beam-design-docs
if not already.
Robert Burke
Beam Go Busybody
On 2024/02/20 14:00:00 Damon Douglas wrote:
Hello Everyone,
The following describes a Throttle PTransform that holds element throughput
to minimize downstream API overusage. Thank you for reading and your
valuable input.
https://s.apache.org/beam-throttle-transform
Best,
Damon