I have new findings and, as a result, some relative improvements. I am
testing as we speak.

Setup: 4 Beam server nodes on Azure A11, plus 2 Kafka nodes with the same
config.

I had to keep state somewhere, so I went with Redis. It turned out to be a
major bottleneck, since the Beam nodes were constantly going across the
network to update the repository. So I replaced Redis with Java
ConcurrentHashMaps. Much faster.

Then Kafka ran out of disk space and the replication manager complained, so I
clustered the two Kafka nodes, hoping they would share space.

As of this second, as I type this email, it is sustaining, but only 1/2 of the
201401969 tuples have been processed after 3.5 hours. According to the Linear
Road benchmarking expectations, if your system is working well, all 201401969
tuples must be done in 3.5 hrs max. So this means there is still room for
tuning the Flink nodes. I have already shared more details about my config
with you all.

It ran perfectly yesterday with almost 1/10th of this load: perfect real-time
send/processed streaming behavior.

If that's the case and I cannot get better performance with FlinkRunner, my
next stop is SparkRunner and a repeat of the whole thing, for a final
benchmarking of the two under the Beam APIs, which was the initial intent
anyway.

If you have suggestions for improvements in the above case, I am all ears and
would greatly appreciate them.

Cheers,
Amir-
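
For reference, the Redis-to-ConcurrentHashMap swap described above can be
sketched roughly as follows. This is a minimal illustration, not the actual
benchmark code: the class name LocalStateStore, the vehicle-id key, and the
report counter are all hypothetical. Note that such a map lives inside a single
JVM, so it only stands in for a shared store if all tuples for a given key are
routed to the same worker, and it is not persisted across restarts.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-process state store replacing a remote Redis lookup.
public class LocalStateStore {
    // Assumed state shape: vehicle id -> number of position reports seen.
    private final ConcurrentMap<Integer, Long> reportsPerVehicle =
            new ConcurrentHashMap<>();

    // Atomically increments the counter for one key; no network round trip.
    public long recordReport(int vehicleId) {
        return reportsPerVehicle.merge(vehicleId, 1L, Long::sum);
    }

    // Returns the current count, or 0 if the key has never been seen.
    public long reportsFor(int vehicleId) {
        return reportsPerVehicle.getOrDefault(vehicleId, 0L);
    }

    public static void main(String[] args) throws InterruptedException {
        LocalStateStore store = new LocalStateStore();
        // Two threads update the same key concurrently.
        Runnable worker = () -> {
            for (int i = 0; i < 1000; i++) {
                store.recordReport(42);
            }
        };
        Thread t1 = new Thread(worker);
        Thread t2 = new Thread(worker);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        // merge() is atomic per key, so no increments are lost.
        System.out.println(store.reportsFor(42));
    }
}
```

The merge() call is what makes the update safe without explicit locking; a
plain get-then-put sequence on the same map could lose increments under
concurrent access.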
From: "Chawla,Sumit" <[email protected]>
To: [email protected]; amir bahmanyari <[email protected]>
Sent: Sunday, September 18, 2016 2:07 PM
Subject: Re: Performance and Latency Chart for Flink
Has anyone else run this kind of benchmark? Would love to hear more
people's experience and details about those benchmarks.
Regards
Sumit Chawla
On Sun, Sep 18, 2016 at 2:01 PM, Chawla,Sumit <[email protected]>
wrote:
> Hi Amir
>
> Would it be possible for you to share the numbers? Also share if possible
> your configuration details.
>
> Regards
> Sumit Chawla
>
>
> On Fri, Sep 16, 2016 at 12:18 PM, amir bahmanyari <
> [email protected]> wrote:
>
>> Hi Fabian,
>>
>> FYI: this is a report on other engines on which we did the same type of
>> benchmarking. It also explains what Linear Road benchmarking is. Thanks
>> for your help.
>> http://www.slideshare.net/RedisLabs/walmart-ibm-revisit-the-linear-road-benchmark
>> https://github.com/IBMStreams/benchmarks
>> https://www.datatorrent.com/blog/blog-implementing-linear-road-benchmark-in-apex/
>>
>>
>> From: Fabian Hueske <[email protected]>
>> To: "[email protected]" <[email protected]>
>> Sent: Friday, September 16, 2016 12:31 AM
>> Subject: Re: Performance and Latency Chart for Flink
>>
>> Hi,
>>
>> I am not aware of periodic performance runs for the Flink releases.
>> I know a few benchmarks which have been published at different points in
>> time like [1], [2], and [3] (you'll probably find more).
>>
>> In general, fair benchmarks that compare different systems (if there is
>> such a thing) are very difficult, and the results often depend on the use
>> case.
>> IMO the best option is to run your own benchmarks, if you have a concrete
>> use case.
>>
>> Best, Fabian
>>
>> [1] 08/2015:
>> http://data-artisans.com/high-throughput-low-latency-and-exactly-once-stream-processing-with-apache-flink/
>> [2] 12/2015:
>> https://yahooeng.tumblr.com/post/135321837876/benchmarking-streaming-computation-engines-at
>> [3] 02/2016:
>> http://data-artisans.com/extending-the-yahoo-streaming-benchmark/
>>
>>
>> 2016-09-16 5:54 GMT+02:00 Chawla,Sumit <[email protected]>:
>>
>> > Hi
>> >
>> > Is there any performance run that is done for each Flink release? Or are
>> > you aware of any third-party evaluation of performance metrics for Flink?
>> > I am interested in seeing how performance has improved from release to
>> > release, and performance vs. other competitors.
>> >
>> > Regards
>> > Sumit Chawla
>> >