Thank you, Robert. This is really helpful context and confirmation.

> You mention having an "aggressive" watermark--could you clarify what you
> mean by this?

I'm using KafkaIO with a custom watermark that follows my message event
timestamps as they arrive, instead of lagging behind the stream to allow
for even slightly out-of-order events. That is what I mean by
"aggressive". I also have a custom window that closes precisely on certain
event markers in the stream, and I need these windows to fire as fast
as possible. I was hoping for sub-second latency here, but I suspected that
was not the focus of most runners, as you say, so I think I can live
without it.
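
To make "aggressive" concrete, here is a toy sketch in plain Python. It is
not Beam or KafkaIO code, and the names (WatermarkTracker, allowed_lag_ms,
first_firing_index) and timestamps are made up for illustration. It only
models the idea that the watermark is the newest event timestamp minus
some lag, and that a window can fire once the watermark reaches its end.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class WatermarkTracker:
    """Toy watermark model; hypothetical names, not the Beam or KafkaIO API."""

    allowed_lag_ms: int = 0      # 0 means "aggressive": track the newest event
    max_event_ts_ms: int = -1    # newest event timestamp observed so far

    def observe(self, event_ts_ms: int) -> None:
        # Keep the maximum so the watermark never moves backwards.
        self.max_event_ts_ms = max(self.max_event_ts_ms, event_ts_ms)

    def watermark(self) -> int:
        # Watermark = newest event timestamp minus the configured lag.
        return self.max_event_ts_ms - self.allowed_lag_ms


def first_firing_index(event_ts_ms: Iterable[int], window_end_ms: int,
                       tracker: WatermarkTracker) -> Optional[int]:
    """Index of the first event after which the watermark reaches window_end_ms."""
    for i, ts in enumerate(event_ts_ms):
        tracker.observe(ts)
        if tracker.watermark() >= window_end_ms:
            return i
    return None


events = [100, 200, 300, 400, 500]  # made-up event timestamps in ms
aggressive = first_firing_index(events, 300, WatermarkTracker(allowed_lag_ms=0))
lagging = first_firing_index(events, 300, WatermarkTracker(allowed_lag_ms=150))
```

With these made-up timestamps, the zero-lag tracker lets a window ending at
300 fire as soon as the event stamped 300 arrives, while a 150 ms lag holds
it until the event stamped 500.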

> Certainly all-in-process runners would likely take the cake here

Besides DirectRunner, which as I understand it is primarily for testing,
what other in-process runners are there?

On Fri, Dec 13, 2019 at 1:45 PM Robert Bradshaw <[email protected]> wrote:

> In general, sub-second latencies are difficult because one must wait
> for the watermark to catch up before actually firing. This would
> require the oldest item in flight across all machines to be almost
> exactly the same timestamp as the newest. Furthermore most sources
> cannot provide sub-second watermarks. You mention having an
> "aggressive" watermark--could you clarify what you mean by this? There
> are also (generally small) latencies involved with persisting state
> and then firing timers, and windowing is built on the same mechanisms.
>
> I am not aware of latency benchmarks for various runners--in my
> experience most people are interested in high throughput at
> O(second-minute) latency. There's nothing in the Beam model that
> prevents sub-second latencies, but I don't think this has been pushed
> very far on most runners. Gathering such data would certainly be
> interesting. (Certainly all-in-process runners would likely take the
> cake here, but it'd be interesting to see how it degrades as one adds
> more machines.)
>
> On Fri, Dec 13, 2019 at 10:58 AM Aaron Dixon <[email protected]> wrote:
> >
> > I've been building pipelines and benchmarking Beam jobs in Dataflow.
> >
> > Without windowing, latencies look pretty good (reliably sub-second) from
> > ingest to sink.
> >
> > Once I introduce windowing, even with an aggressive watermark, it seems
> > to take at least a second (often multiple seconds) to see a window fire.
> >
> > The same appears true for setting event-time timers in the State API;
> > there seems to be a delay between setting a timer at the current event
> > time and the timer callback firing.
> >
> > Are there any good/canonical latency benchmarks/reports for different
> > runners?
> >
> > My next move may be to evaluate Flink in terms of these latencies, but
> > I'm curious whether I should be trying to get sub-second latencies out
> > of Beam (windowing) at all.
> >
>
