Re: Key factors for Flink's performance

Stephan Ewen Wed, 11 May 2016 03:00:35 -0700

Hi Leon!

I agree with Aljoscha that the term "microbatches" is confusing in that
context. Flink's network layer is "buffer oriented" rather than "record
oriented". Buffering is a best-effort attempt to gather some elements in
cases where they arrive fast enough that this would not add much latency
anyway.
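
To make the buffering point concrete, here is a toy model in plain Python
(not Flink code; the function name, the capacity and the timeout values are
made up for illustration). A buffer is shipped either when it is full or
when a timeout has passed since its first record arrived, and the function
returns how long each record waited:

```python
def simulate_buffering(arrival_times, buffer_capacity, timeout):
    """Toy model: return the extra latency each record spends in a buffer.

    A buffer is shipped as soon as it holds `buffer_capacity` records, or
    once `timeout` time units have passed since its first record arrived.
    All times are integers in the same (arbitrary) unit, e.g. milliseconds.
    """
    delays = []
    pending = []  # arrival times of the records in the current, unsent buffer
    for t in arrival_times:
        # The timeout of the current buffer fired before this record arrived.
        if pending and t - pending[0] >= timeout:
            flush_time = pending[0] + timeout
            delays.extend(flush_time - a for a in pending)
            pending = []
        pending.append(t)
        if len(pending) == buffer_capacity:
            # Buffer is full: ship it right away, at the current time.
            delays.extend(t - a for a in pending)
            pending = []
    if pending:
        # End of input: the leftover records leave when the timeout fires.
        flush_time = pending[0] + timeout
        delays.extend(flush_time - a for a in pending)
    return delays


# Records arriving every 1 ms fill the buffer long before the timeout.
fast = simulate_buffering([0, 1, 2, 3], buffer_capacity=4, timeout=100)
# Records arriving once per second are shipped by the timeout instead.
slow = simulate_buffering([0, 1000, 2000], buffer_capacity=4, timeout=100)
```

In the fast case no record waits longer than the time it takes to fill the
buffer; in the slow case the wait is capped by the timeout. With a capacity
of 1, every record is shipped immediately. In Flink itself, the related knob
is `setBufferTimeout` on the `StreamExecutionEnvironment`.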


Concerning the latency: Chaining has a positive effect on latency. Some of
the benchmarks show how Flink needs to communicate less with external
systems (like Redis), which is another source of lower latency.
For very simple programs that have no external communication and no
chaining, I would not expect Flink and Storm to differ much in latency.
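
As a rough illustration of why chaining helps, here is a toy comparison in
plain Python (again not Flink code; the two operators `parse` and `double`
and both runner functions are hypothetical). In the chained variant the
operators hand records over with a direct call in one thread; in the
unchained variant each operator has its own thread and every record crosses
a queue:

```python
import queue
import threading


def parse(record):
    """First toy operator: turn a string into an int."""
    return int(record)


def double(value):
    """Second toy operator: double the value."""
    return value * 2


def run_chained(records):
    """Chained: both operators run in one thread, handing over via a call."""
    return [double(parse(r)) for r in records]


_END = object()  # marker that tells the downstream thread to stop


def run_unchained(records):
    """Not chained: one thread per operator, connected by a queue."""
    handover = queue.Queue()
    results = []

    def upstream():
        for r in records:
            handover.put(parse(r))
        handover.put(_END)

    def downstream():
        while True:
            item = handover.get()
            if item is _END:
                break
            results.append(double(item))

    threads = [
        threading.Thread(target=upstream),
        threading.Thread(target=downstream),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Both variants compute the same result; the unchained one simply pays for a
queue handover and a thread switch per record. This sketch leaves out
serialization, which a real system also has to do at such a boundary.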

Greetings,
Stephan


On Wed, May 11, 2016 at 9:24 AM, Aljoscha Krettek <[email protected]>
wrote:

> Hi,
> latencies for Flink and Storm are pretty similar. The only reason I could
> see for Flink having the slight upper hand there is the fact that Storm
> tracks the progress of every tuple throughout the topology and requires
> ACKs that have to go back to the sources (spouts).
>
> As for throughput, you are right that Flink sends elements in batches. The
> size of these batches can be controlled, and even reduced to 1, which yields
> the best latency. These batches are not visible anywhere in the model,
> though, so calling them micro-batches is problematic, since that term
> already refers to a very different concept in Spark Streaming.
>
> Cheers,
> Aljoscha
>
> On Mon, 9 May 2016 at 11:06 <[email protected]> wrote:
>
>> Hello Flink team,
>>
>> I am currently playing around with Storm and Flink in the context of a
>> smart home. The primary functional requirement is to quickly react to
>> certain properties in stream tuples.
>>
>> I was looking at some benchmarks from the two systems, and generally
>> Flink has the upper hand, in both throughput and latency. I do not really
>> understand how Flink achieves better latency than Storm, which is driven by
>> one-at-a-time tuples.
>>
>> From what I understood in the documentation, Flink performs micro
>> batching when transferring data across the network to downstream operators
>> located on other nodes. Perhaps this achieves a better average latency.
>>
>> Surely the bigger factor, however, is that Flink can completely bypass
>> internal operator queues with operator chaining, which Storm cannot do.
>>
>> Kind regards
>> Leon
>> <https://tutanota.com>
>>
>
