Hi,

I left a note on the proposed split between the batch and streaming expansions of this transform. In general, the need for such a split raises doubts for me. It signals that either

 a) the transform does something it should not, or

 b) the Beam model is not complete in terms of being "unified".

The problem described in the document is that in the batch case timers are not fired appropriately. This is actually one of the motivations that led to the introduction of the @RequiresTimeSortedInput annotation and, although the question was raised years ago, I do not remember what arguments were made against requiring, in the model, that inputs to a batch stateful DoFn be sorted by timestamp. That would enable timers to fire appropriately while preserving the batch invariant that no late data is allowed. IIRC there are runners that do this sorting by default (at least the sorting, not sure about the timers, but once inputs are sorted, firing timers is simple).
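
To illustrate why firing timers becomes simple once inputs are sorted, here is a minimal sketch in plain Java. It does not use any Beam API and is not how any runner implements this; the class, the method names, and the rule "each element sets a timer at timestamp + delay" are all hypothetical, chosen only for illustration. It assumes a bounded input for a single key, and it treats the timestamp of the element being processed as a lower bound on everything that follows, so any timer set strictly before that timestamp can fire first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; none of these names come from Beam.
final class SortedTimerSketch {

  /** Hypothetical element: an event timestamp plus a payload. */
  static final class Element {
    final long timestamp;
    final String value;

    Element(long timestamp, String value) {
      this.timestamp = timestamp;
      this.value = value;
    }
  }

  /**
   * Processes the bounded input of one key. Each element sets an event-time
   * timer at (timestamp + delay). Because the input is sorted first, no
   * element with a smaller timestamp can arrive later, so every timer that
   * is strictly earlier than the current element is safe to fire.
   */
  static List<String> process(List<Element> input, long delay) {
    List<Element> sorted = new ArrayList<>(input);
    sorted.sort(Comparator.comparingLong((Element e) -> e.timestamp));

    PriorityQueue<Long> timers = new PriorityQueue<>();
    List<String> log = new ArrayList<>();

    for (Element e : sorted) {
      // Fire all pending timers that precede this element.
      while (!timers.isEmpty() && timers.peek() < e.timestamp) {
        log.add("timer@" + timers.poll());
      }
      log.add("element@" + e.timestamp + ":" + e.value);
      timers.add(e.timestamp + delay);
    }
    // End of bounded input: nothing more can arrive, so flush what is left.
    while (!timers.isEmpty()) {
      log.add("timer@" + timers.poll());
    }
    return log;
  }

  public static void main(String[] args) {
    List<Element> unsorted = Arrays.asList(
        new Element(30, "c"), new Element(10, "a"), new Element(12, "b"));
    for (String line : process(unsorted, 5)) {
      System.out.println(line);
    }
  }
}
```

Without the sort, the element at 30 would be seen first and the timers set by the elements at 10 and 12 could only fire "late" relative to it, which is the kind of misfiring the document describes for the batch case.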

A different question is whether this particular transform should perhaps fire on processing time rather than event time.

Best,
 Jan

On 2/21/24 03:00, Robert Burke wrote:
Thanks for the design Damon! And thanks for collaborating with me on getting a 
high-level textual description of the key implementation idea down in writing. 
I think the solution is pretty elegant.

I do have concerns about how different Runners might handle 
ProcessContinuations for the Bounded Input case. I know Dataflow famously has 
two different execution modes under the hood, but I agree with the principle 
that ProcessContinuation.Resume should largely be in line with the expected 
delay, though it's by no means guaranteed AFAIK.

We should also ensure this is linked from https://s.apache.org/beam-design-docs 
if not already.

Robert Burke
Beam Go Busybody

On 2024/02/20 14:00:00 Damon Douglas wrote:
Hello Everyone,

The following describes a Throttle PTransform that holds element throughput
to minimize downstream API overusage. Thank you for reading and your
valuable input.

https://s.apache.org/beam-throttle-transform

Best,

Damon
