Hi Leon!

I agree with Aljoscha that the term "microbatches" is confusing in that context. Flink's network layer is "buffer"-oriented rather than "record"-oriented. Buffering is a best-effort attempt to gather some elements in cases where they arrive fast enough that doing so would not add much latency anyway.

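
To make the buffer idea a bit more concrete, here is a small toy model in Python. It is not Flink code and does not reproduce Flink's actual flushing logic; the names `simulate_buffering`, `capacity`, and `timeout` are made up for illustration. It assumes a buffer is shipped either when it is full or when a timeout has passed since its first record arrived, and it measures how long each record waits before being shipped.

```python
from typing import List, Tuple

Flush = Tuple[int, List[int]]  # (time the buffer was shipped, arrival times inside it)


def simulate_buffering(arrivals: List[int], capacity: int, timeout: int) -> List[Flush]:
    """Toy model: ship a buffer when it is full or when `timeout` ms have
    passed since its first record arrived, whichever happens first.

    `arrivals` are record arrival times in ms, in ascending order.
    """
    flushed: List[Flush] = []
    buffer: List[int] = []
    for t in arrivals:
        # A pending buffer whose deadline passed before this arrival is
        # shipped at its deadline, not at the arrival time.
        if buffer and t >= buffer[0] + timeout:
            flushed.append((buffer[0] + timeout, buffer))
            buffer = []
        buffer.append(t)
        # A full buffer is shipped immediately.
        if len(buffer) == capacity:
            flushed.append((t, buffer))
            buffer = []
    # Whatever is left over is shipped once its deadline expires.
    if buffer:
        flushed.append((buffer[0] + timeout, buffer))
    return flushed


def max_added_latency(flushed: List[Flush]) -> int:
    """Longest time any record waited in a buffer before being shipped."""
    return max(flush_time - t for flush_time, records in flushed for t in records)


if __name__ == "__main__":
    fast = simulate_buffering([0, 1, 2, 3], capacity=4, timeout=50)
    slow = simulate_buffering([0, 100, 200], capacity=4, timeout=50)
    single = simulate_buffering([0, 100, 200], capacity=1, timeout=50)
    print(max_added_latency(fast), max_added_latency(slow), max_added_latency(single))
```

In this model, records that arrive quickly fill the buffer before the timeout and wait only a few milliseconds, slow records wait at most the timeout, and a capacity of 1 adds no waiting at all.
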
Concerning the latency: Chaining has a positive effect on latency. Some of the benchmarks show that Flink needs to communicate less with external systems (like Redis), which is another source of reduced latency. For very simple programs that have no external communication and no chaining, I would expect Flink and Storm not to differ much in latency.

Greetings,
Stephan

On Wed, May 11, 2016 at 9:24 AM, Aljoscha Krettek <[email protected]> wrote:

> Hi,
> latency for Flink and Storm is pretty similar. The only reason I could
> see for Flink having the slight upper hand there is the fact that Storm
> tracks the progress of every tuple throughout the topology and requires
> ACKs that have to go back to the sinks.
>
> As for throughput, you are right that Flink sends elements in batches. The
> size of these batches can be controlled, and even be reduced to 1, which
> yields the best latency. These batches are not visible anywhere in the
> model, so calling them micro batches is problematic, since that already
> refers to a very different concept in Spark Streaming.
>
> Cheers,
> Aljoscha
>
> On Mon, 9 May 2016 at 11:06 <[email protected]> wrote:
>
>> Hello Flink team,
>>
>> I am currently playing around with Storm and Flink in the context of a
>> smart home. The primary functional requirement is to quickly react to
>> certain properties in stream tuples.
>>
>> I was looking at some benchmarks from the two systems, and generally
>> Flink has the upper hand, in both throughput and latency. I do not really
>> understand how Flink achieves better latency than Storm, which is driven
>> by one-at-a-time tuples.
>>
>> From what I understood in the documentation, Flink performs micro
>> batching when transferring data across the network to downstream operators
>> located on other nodes. Perhaps this achieves a better average latency.
>>
>> Surely the bigger factor, however, is that Flink can completely bypass
>> internal operator queues with operator chaining, which Storm cannot do.
>>
>> Kind regards
>> Leon
>> <https://tutanota.com>
