In general, sub-second latencies are difficult because one must wait
for the watermark to catch up before actually firing. This would
require the oldest item in flight across all machines to be almost
exactly the same timestamp as the newest. Furthermore most sources
cannot provide sub-second watermarks. You mention having an
"agressive" watermark--could you clarify what you mean by this? There
are also (generally small) latencies involved with persisting state
and then firing timers, and windowing is built on the same mechanisms.

I am not aware of latency benchmarks for various runners--in my
experience most people are interested in high throughput at
O(second-minute) latency. There's nothing in the Beam model that
prevents sub-second latencies, but I don't think this has been pushed
very far on most runners. Gathering such data would certainly be
interesting. (Certainly all-in-process runners would likely take the
cake here, but it'd be interesting to see how it degrades as one adds
more machines.)

On Fri, Dec 13, 2019 at 10:58 AM Aaron Dixon <[email protected]> wrote:
>
> I've been building pipelines and benchmarking Beam jobs in Dataflow.
>
> Without windowing, latencies look pretty good (reliably sub-second) from 
> ingest to sink.
>
> Once I introduce windowing even with aggressive watermark it seems it takes 
> at least a second (often multiple seconds) to see a window fire.
>
> Same appears true for setting events timers in the State API; there seems a 
> delay between setting a timer at current event time and the timer callback 
> fires.
>
> Are there any good/canonical latency benchmarks/reports for different runners?
>
> My next move may be to evaluate Flink in terms of these latencies, but 
> curious if I should be trying to get sub-second latencies out of Beam 
> (windowing) at all?
>

Reply via email to