Re: Performance and Latency Chart for Flink

2016-09-19 Thread amir bahmanyari
Hi Greg,
Setting "taskmanager.memory.preallocate" to true caused "Association
with remote system [akka.tcp://flink@" "has failed" "[Disassociated]" on all
TMs, so I changed it back to false. I then increased the network buffers to
1 GB and started to get TM slot exceptions.
So I am increasing that value incrementally. I have it set at 8192 now (twice
as much as the previous 4096). Thanks

  From: Greg Hogan <c...@greghogan.com>
 To: dev@flink.apache.org; amir bahmanyari <amirto...@yahoo.com> 
 Sent: Monday, September 19, 2016 1:28 PM
 Subject: Re: Performance and Latency Chart for Flink
   
My thought would be to compare the data rate and buffer sizes, which gives a
refresh interval. For example, if you are transmitting 1 GB/s on 128 MiB of
network buffers then the refresh interval is at most about 1/8 second. There is the
same consideration with spill files if the system does not have sufficient
free memory for a large number of readahead buffers. Another set of buffers
are the kernel socket buffers and you can increase from the Linux default 4
MiB by changing "taskmanager.net.sendReceiveBufferSize" (documentation is
in progress; see org.apache.flink.runtime.io.network.netty.NettyConfig).

Your nodes have 100+ GB of memory so a conservative assignment might be a
gigabyte of network buffers. Then add the following to the conf, restart
the cluster, start jconsole on a TaskManager, connect to the TaskManager
process, and on the MBeans tab look under org.apache.flink.metrics for
Network.AvailableMemorySegments.

metrics.reporters: my_jmx_reporter
metrics.reporter.my_jmx_reporter.class: org.apache.flink.metrics.jmx.JMXReporter
metrics.reporter.my_jmx_reporter.port: 9020-9040


On Mon, Sep 19, 2016 at 3:54 PM, amir bahmanyari <
amirto...@yahoo.com.invalid> wrote:

> Thanks Greg. "Your setting of 4096 is only 128 MiB."... Correct, because I
> followed that formula :-))) I can bump it up to twice as much, as the
> example does, to for instance 300 MiB. Is this reasonable? What do you
> suggest as a reasonable range? Thanks Greg
>
>      From: Greg Hogan <c...@greghogan.com>
>  To: dev@flink.apache.org; amir bahmanyari <amirto...@yahoo.com>
>  Sent: Monday, September 19, 2016 12:43 PM
>  Subject: Re: Performance and Latency Chart for Flink
>
> You will need to add the configuration parameters to your flink-conf.yaml.
> I believe the intent is that all configuration parameters should be listed
> at
>
> https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#full-reference
>
> My understanding is that the Flink buffers are currently copied to Netty
> buffers, although I don't understand the stated memory doubling.
>
>
> On Mon, Sep 19, 2016 at 3:08 PM, amir bahmanyari <
> amirto...@yahoo.com.invalid> wrote:
>
> > Hi Greg, in the same Flink config link below, there are parameters that
> > don't even exist in flink-conf.yaml. Are they defined somewhere else? I
> > grepped for the following and none existed in any of the files under the
> > conf folder: "taskmanager.memory.fraction", "taskmanager.memory.off-heap",
> > "taskmanager.memory.segment-size" and many more.
> > Also, isn't the example calculating the network buffers wrong? Based on
> > the example, roughly 5000 buffers x 32 KiB = 160000 KiB should be
> > allocated. 160000 KiB divided by 1024 = 156.25 MiB. Why is the example
> > saying "the system would allocate roughly 300 MiBytes for network
> > buffers."? That's roughly twice as much. Am I missing something here? I
> > still need your help to set the accurate number for my
> >    - taskmanager.network.numberOfBuffers = 4096.
> >
> > Thanks for your response Greg.
> > Amir-
> >
> >  From: amir bahmanyari <amirto...@yahoo.com>
> >  To: "dev@flink.apache.org" <dev@flink.apache.org>
> >  Sent: Monday, September 19, 2016 10:34 AM
> >  Subject: Re: Performance and Latency Chart for Flink
> >
> > Hi Greg, I used this guideline to calculate
> > "taskmanager.network.numberOfBuffers": Apache Flink 1.2-SNAPSHOT
> > Documentation: Configuration
> >
> > 4096 = (16x16)x4x4, where 16 is the number of tasks per TM, 4 is the
> > number of TMs, and the last 4 is the constant in the formula. What would
> > you set it to? Once I have that number, I will set
> > "taskmanager.memory.preallocate" to true and give it another shot.
> > Thanks Greg
> >
> >      From: Greg Hogan <c...@

Re: Performance and Latency Chart for Flink

2016-09-19 Thread Greg Hogan
My thought would be to compare the data rate and buffer sizes, which gives a
refresh interval. For example, if you are transmitting 1 GB/s on 128 MiB of
network buffers then the refresh interval is at most about 1/8 second. There is the
same consideration with spill files if the system does not have sufficient
free memory for a large number of readahead buffers. Another set of buffers
are the kernel socket buffers and you can increase from the Linux default 4
MiB by changing "taskmanager.net.sendReceiveBufferSize" (documentation is
in progress; see org.apache.flink.runtime.io.network.netty.NettyConfig).
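
The refresh-interval estimate above can be checked with a few lines of arithmetic. This is only a sketch: the function name is made up for illustration, and "1 GB/s" is read as 1 GiB/s so that the result comes out at exactly 1/8 second.

```python
# Illustrative sketch; refresh_interval_seconds is a hypothetical helper,
# and the data rate is assumed to be 1 GiB/s (binary reading of "1 GB/s").
MIB = 1024 ** 2
GIB = 1024 ** 3


def refresh_interval_seconds(buffer_bytes: int, bytes_per_second: float) -> float:
    """Time to cycle once through the whole buffer pool at the given data rate."""
    return buffer_bytes / bytes_per_second


network_buffers = 128 * MIB   # 4096 buffers of 32 KiB each
data_rate = 1 * GIB           # bytes per second (assumed)

interval = refresh_interval_seconds(network_buffers, data_rate)
print(f"{interval:.3f} s")
```

A larger pool lengthens this interval proportionally, which is the motivation for raising the buffer count on nodes with plenty of memory.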

Your nodes have 100+ GB of memory so a conservative assignment might be a
gigabyte of network buffers. Then add the following to the conf, restart
the cluster, start jconsole on a TaskManager, connect to the TaskManager
process, and on the MBeans tab look under org.apache.flink.metrics for
Network.AvailableMemorySegments.

metrics.reporters: my_jmx_reporter
metrics.reporter.my_jmx_reporter.class: org.apache.flink.metrics.jmx.JMXReporter
metrics.reporter.my_jmx_reporter.port: 9020-9040
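
For the gigabyte of network buffers suggested above, the matching buffer count can be estimated as follows. This is a sketch that assumes 32 KiB per buffer (the size implied by "4096 is only 128 MiB"); the helper name is hypothetical.

```python
# Illustrative sketch; buffers_for is a hypothetical helper and the 32 KiB
# buffer size is an assumption taken from "4096 buffers = 128 MiB".
KIB = 1024
GIB = 1024 ** 3
SEGMENT_SIZE = 32 * KIB


def buffers_for(target_bytes: int, segment_size: int = SEGMENT_SIZE) -> int:
    """Number of whole buffers that fit into the target amount of memory."""
    return target_bytes // segment_size


one_gib_buffers = buffers_for(1 * GIB)
print(one_gib_buffers)  # candidate value for taskmanager.network.numberOfBuffers
```

Under these assumptions one gigabyte corresponds to 32768 buffers, eight times the 4096 currently configured.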


On Mon, Sep 19, 2016 at 3:54 PM, amir bahmanyari <
amirto...@yahoo.com.invalid> wrote:

> Thanks Greg. "Your setting of 4096 is only 128 MiB."... Correct, because I
> followed that formula :-))) I can bump it up to twice as much, as the
> example does, to for instance 300 MiB. Is this reasonable? What do you
> suggest as a reasonable range? Thanks Greg
>
>   From: Greg Hogan <c...@greghogan.com>
>  To: dev@flink.apache.org; amir bahmanyari <amirto...@yahoo.com>
>  Sent: Monday, September 19, 2016 12:43 PM
>  Subject: Re: Performance and Latency Chart for Flink
>
> You will need to add the configuration parameters to your flink-conf.yaml.
> I believe the intent is that all configuration parameters should be listed
> at
>
> https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#full-reference
>
> My understanding is that the Flink buffers are currently copied to Netty
> buffers, although I don't understand the stated memory doubling.
>
>
> On Mon, Sep 19, 2016 at 3:08 PM, amir bahmanyari <
> amirto...@yahoo.com.invalid> wrote:
>
> > Hi Greg, in the same Flink config link below, there are parameters that
> > don't even exist in flink-conf.yaml. Are they defined somewhere else? I
> > grepped for the following and none existed in any of the files under the
> > conf folder: "taskmanager.memory.fraction", "taskmanager.memory.off-heap",
> > "taskmanager.memory.segment-size" and many more.
> > Also, isn't the example calculating the network buffers wrong? Based on
> > the example, roughly 5000 buffers x 32 KiB = 160000 KiB should be
> > allocated. 160000 KiB divided by 1024 = 156.25 MiB. Why is the example
> > saying "the system would allocate roughly 300 MiBytes for network
> > buffers."? That's roughly twice as much. Am I missing something here? I
> > still need your help to set the accurate number for my
> >    - taskmanager.network.numberOfBuffers = 4096.
> >
> > Thanks for your response Greg.
> > Amir-
> >
> >  From: amir bahmanyari <amirto...@yahoo.com>
> >  To: "dev@flink.apache.org" <dev@flink.apache.org>
> >  Sent: Monday, September 19, 2016 10:34 AM
> >  Subject: Re: Performance and Latency Chart for Flink
> >
> > Hi Greg, I used this guideline to calculate
> > "taskmanager.network.numberOfBuffers": Apache Flink 1.2-SNAPSHOT
> > Documentation: Configuration
> >
> > 4096 = (16x16)x4x4, where 16 is the number of tasks per TM, 4 is the
> > number of TMs, and the last 4 is the constant in the formula. What would
> > you set it to? Once I have that number, I will set
> > "taskmanager.memory.preallocate" to true and give it another shot.
> > Thanks Greg
> >
> >  From: Greg Hogan <c...@greghogan.com>
> >  To: dev@flink.apache.org; amir bahmanyari <amirto...@yahoo.com>
> >  Sent: Monday, September 19, 2016 8:29 AM
> >  Subject: Re: Performance and Latency Chart for Flink
> >
> > Hi Amir,
> >
> > You may see improved performance setting "taskmanager.memory.preallocate:
> > true" in order to use off-heap memory.
> >
> > Also, your number of buffers looks quite low and you may want to increase
> > "taskmanager.network.numberOfBuffers". Your setting of 4096 is only 128
> > MiB.
> >
> > As th

Re: Performance and Latency Chart for Flink

2016-09-19 Thread Greg Hogan
Excellent!

On Mon, Sep 19, 2016 at 3:43 PM, Chesnay Schepler <ches...@apache.org>
wrote:

> It is normal that you don't see it in the WebInterface.
>
> FLINK-4389 was only about exposing metrics *to* the WebInterface, not
> exposing them *from* it.
>
> Essentially, a metric travels from TaskManager -> WebInterface -> User.
> FLINK-4389 was about the first arrow, which is a prerequisite step for the
> second one.
>
> Regards,
> Chesnay
>
>
> On 19.09.2016 21:35, Greg Hogan wrote:
>
>> The nightly snapshots now include "[FLINK-4389] Expose metrics to
>> WebFrontend":
>> https://flink.apache.org/contribute-code.html#snapshots-nightly-builds
>>
>> For 1.2 we have metrics for "AvailableMemorySegments" and
>> "TotalMemorySegments":
>>
>> https://ci.apache.org/projects/flink/flink-docs-master/monitoring/metrics.html#list-of-all-variables
>>
>> However, when I download the snapshot and start a cluster with the default
>> configuration I am not seeing a value for this metric in the web UI.
>>
>> An alternative is to configure the JMX reporter in flink-conf.yaml:
>>
>> metrics.reporters: jmx_reporter
>> metrics.reporter.jmx_reporter.class: org.apache.flink.metrics.jmx.JMXReporter
>> metrics.reporter.jmx_reporter.port: 9020
>>
>> You can then monitor the system for the number of used memory segments.
>> Let
>> us know what you discover!
>>
>> On Mon, Sep 19, 2016 at 1:34 PM, amir bahmanyari <
>> amirto...@yahoo.com.invalid> wrote:
>>
>>> Hi Greg, I used this guideline to calculate
>>> "taskmanager.network.numberOfBuffers": Apache Flink 1.2-SNAPSHOT
>>> Documentation: Configuration
>>>
>>> 4096 = (16x16)x4x4, where 16 is the number of tasks per TM, 4 is the
>>> number of TMs, and the last 4 is the constant in the formula. What would
>>> you set it to? Once I have that number, I will set
>>> "taskmanager.memory.preallocate" to true and give it another shot.
>>> Thanks Greg
>>>
>>>   From: Greg Hogan <c...@greghogan.com>
>>>   To: dev@flink.apache.org; amir bahmanyari <amirto...@yahoo.com>
>>>   Sent: Monday, September 19, 2016 8:29 AM
>>>   Subject: Re: Performance and Latency Chart for Flink
>>>
>>> Hi Amir,
>>>
>>> You may see improved performance setting "taskmanager.memory.preallocate:
>>> true" in order to use off-heap memory.
>>>
>>> Also, your number of buffers looks quite low and you may want to increase
>>> "taskmanager.network.numberOfBuffers". Your setting of 4096 is only 128
>>> MiB.
>>>
>>> As this is only a benchmark, are you able to post the code to github to
>>> solicit feedback?
>>>
>>> Greg
>>>
>>> On Sun, Sep 18, 2016 at 9:00 PM, amir bahmanyari <
>>> amirto...@yahoo.com.invalid> wrote:
>>>
>>>> I have new findings and, subsequently, relative improvements. I am
>>>> testing as we speak: 4 Beam server nodes (Azure A11) and 2 Kafka nodes
>>>> with the same config. I had to keep state somewhere, so I went with
>>>> Redis. I found it to be a major bottleneck, as the Beam nodes constantly
>>>> go across the network to update its repository. So I replaced Redis
>>>> with Java ConcurrentHashMaps. Much faster.
>>>> Then Kafka ran out of disk space and the replication manager complained,
>>>> so I clustered the two Kafka nodes hoping they would share space. As of
>>>> the second I am typing this email, it is sustaining, but only 1/2 of the
>>>> 201401969 tuples have been processed after 3.5 hours. According to the
>>>> Linear Road benchmarking expectations, if your system is working well,
>>>> all 201401969 tuples must be done in 3.5 hrs max. So this means there is
>>>> still room for tuning the Flink nodes. I have already shared with you
>>>> all more details about my config. It run perfe

Re: Performance and Latency Chart for Flink

2016-09-19 Thread Greg Hogan
You will need to add the configuration parameters to your flink-conf.yaml.
I believe the intent is that all configuration parameters should be listed
at

https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#full-reference

My understanding is that the Flink buffers are currently copied to Netty
buffers, although I don't understand the stated memory doubling.


On Mon, Sep 19, 2016 at 3:08 PM, amir bahmanyari <
amirto...@yahoo.com.invalid> wrote:

> Hi Greg, in the same Flink config link below, there are parameters that
> don't even exist in flink-conf.yaml. Are they defined somewhere else? I
> grepped for the following and none existed in any of the files under the
> conf folder: "taskmanager.memory.fraction", "taskmanager.memory.off-heap",
> "taskmanager.memory.segment-size" and many more.
> Also, isn't the example calculating the network buffers wrong? Based on
> the example, roughly 5000 buffers x 32 KiB = 160000 KiB should be
> allocated. 160000 KiB divided by 1024 = 156.25 MiB. Why is the example
> saying "the system would allocate roughly 300 MiBytes for network
> buffers."? That's roughly twice as much. Am I missing something here? I
> still need your help to set the accurate number for my
>    - taskmanager.network.numberOfBuffers = 4096.
>
> Thanks for your response Greg.
> Amir-
>
>  From: amir bahmanyari <amirto...@yahoo.com>
>  To: "dev@flink.apache.org" <dev@flink.apache.org>
>  Sent: Monday, September 19, 2016 10:34 AM
>  Subject: Re: Performance and Latency Chart for Flink
>
> Hi Greg, I used this guideline to calculate
> "taskmanager.network.numberOfBuffers": Apache Flink 1.2-SNAPSHOT
> Documentation: Configuration
>
> 4096 = (16x16)x4x4, where 16 is the number of tasks per TM, 4 is the
> number of TMs, and the last 4 is the constant in the formula. What would
> you set it to? Once I have that number, I will set
> "taskmanager.memory.preallocate" to true and give it another shot.
> Thanks Greg
>
>   From: Greg Hogan <c...@greghogan.com>
>  To: dev@flink.apache.org; amir bahmanyari <amirto...@yahoo.com>
>  Sent: Monday, September 19, 2016 8:29 AM
>  Subject: Re: Performance and Latency Chart for Flink
>
> Hi Amir,
>
> You may see improved performance setting "taskmanager.memory.preallocate:
> true" in order to use off-heap memory.
>
> Also, your number of buffers looks quite low and you may want to increase
> "taskmanager.network.numberOfBuffers". Your setting of 4096 is only 128
> MiB.
>
> As this is only a benchmark, are you able to post the code to github to
> solicit feedback?
>
> Greg
>
> On Sun, Sep 18, 2016 at 9:00 PM, amir bahmanyari <
> amirto...@yahoo.com.invalid> wrote:
>
> > I have new findings and, subsequently, relative improvements. I am
> > testing as we speak: 4 Beam server nodes (Azure A11) and 2 Kafka nodes
> > with the same config. I had to keep state somewhere, so I went with
> > Redis. I found it to be a major bottleneck, as the Beam nodes constantly
> > go across the network to update its repository. So I replaced Redis
> > with Java ConcurrentHashMaps. Much faster.
> > Then Kafka ran out of disk space and the replication manager complained,
> > so I clustered the two Kafka nodes hoping they would share space. As of
> > the second I am typing this email, it is sustaining, but only 1/2 of the
> > 201401969 tuples have been processed after 3.5 hours. According to the
> > Linear Road benchmarking expectations, if your system is working well,
> > all 201401969 tuples must be done in 3.5 hrs max. So this means there is
> > still room for tuning the Flink nodes. I have already shared with you
> > all more details about my config. It ran perfectly yesterday with almost
> > 1/10th of this load: perfect real-time send/processed streaming
> > behavior. If that's the case and I cannot get better performance with
> > FlinkRunner, my next stop is SparkRunner and a repeat of the whole thing
> > for the final benchmarking of the two under Beam APIs, which was the
> > initial intent anyway. If you have suggestions for improvements in the
> > above case, I am all ears and would greatly appreciate them.
> > Cheers, Amir-
> >
> >  From: "Chawla,Sumit" <sumitkcha...@gmail.com>
> >  To: dev@flink.apache.org; amir bahmanyari <amirto...@yahoo.com>
> >  Sent: Sunday, September 18, 2016 2:07 PM
> >  Subject: Re: Performance and Latency Chart for Flink
> >
> > Has anyone else run th

Re: Performance and Latency Chart for Flink

2016-09-19 Thread Chesnay Schepler

It is normal that you don't see it in the WebInterface.

FLINK-4389 was only about exposing metrics *to* the WebInterface, not 
exposing them *from* it.


Essentially, a metric travels from TaskManager -> WebInterface -> User. 
FLINK-4389 was about the first arrow, which is a prerequisite step for 
the second one.


Regards,
Chesnay

On 19.09.2016 21:35, Greg Hogan wrote:

The nightly snapshots now include "[FLINK-4389] Expose metrics to
WebFrontend":
   https://flink.apache.org/contribute-code.html#snapshots-nightly-builds

For 1.2 we have metrics for "AvailableMemorySegments" and
"TotalMemorySegments":

https://ci.apache.org/projects/flink/flink-docs-master/monitoring/metrics.html#list-of-all-variables

However, when I download the snapshot and start a cluster with the default
configuration I am not seeing a value for this metric in the web UI.

An alternative is to configure the JMX reporter in flink-conf.yaml:

metrics.reporters: jmx_reporter
metrics.reporter.jmx_reporter.class: org.apache.flink.metrics.jmx.JMXReporter
metrics.reporter.jmx_reporter.port: 9020

You can then monitor the system for the number of used memory segments. Let
us know what you discover!
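
Once the two metrics have been read from JMX, the number of used segments follows by subtraction. The sketch below only does that arithmetic; the helper names and the sample readings (4096 total, 1024 available) are hypothetical and not taken from the thread.

```python
# Illustrative sketch; the helpers and sample readings are assumptions.
def used_segments(total: int, available: int) -> int:
    """Used segments = TotalMemorySegments - AvailableMemorySegments."""
    if not 0 <= available <= total:
        raise ValueError("available must lie between 0 and total")
    return total - available


def utilization(total: int, available: int) -> float:
    """Fraction of the network buffer pool currently in use."""
    return used_segments(total, available) / total


# Hypothetical readings copied by hand from jconsole.
total_segments, available_segments = 4096, 1024
print(used_segments(total_segments, available_segments))
print(f"{utilization(total_segments, available_segments):.0%}")
```

If the available count sits near zero under load, the pool is probably too small for the job.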

On Mon, Sep 19, 2016 at 1:34 PM, amir bahmanyari <
amirto...@yahoo.com.invalid> wrote:


Hi Greg, I used this guideline to calculate
"taskmanager.network.numberOfBuffers": Apache Flink 1.2-SNAPSHOT
Documentation: Configuration

4096 = (16x16)x4x4, where 16 is the number of tasks per TM, 4 is the
number of TMs, and the last 4 is the constant in the formula. What would
you set it to? Once I have that number, I will set
"taskmanager.memory.preallocate" to true and give it another shot.
Thanks Greg

   From: Greg Hogan <c...@greghogan.com>
  To: dev@flink.apache.org; amir bahmanyari <amirto...@yahoo.com>
  Sent: Monday, September 19, 2016 8:29 AM
  Subject: Re: Performance and Latency Chart for Flink

Hi Amir,

You may see improved performance setting "taskmanager.memory.preallocate:
true" in order to use off-heap memory.

Also, your number of buffers looks quite low and you may want to increase
"taskmanager.network.numberOfBuffers". Your setting of 4096 is only 128
MiB.

As this is only a benchmark, are you able to post the code to github to
solicit feedback?

Greg

On Sun, Sep 18, 2016 at 9:00 PM, amir bahmanyari <
amirto...@yahoo.com.invalid> wrote:


I have new findings and, subsequently, relative improvements. I am
testing as we speak: 4 Beam server nodes (Azure A11) and 2 Kafka nodes
with the same config. I had to keep state somewhere, so I went with
Redis. I found it to be a major bottleneck, as the Beam nodes constantly
go across the network to update its repository. So I replaced Redis
with Java ConcurrentHashMaps. Much faster.
Then Kafka ran out of disk space and the replication manager complained,
so I clustered the two Kafka nodes hoping they would share space. As of
the second I am typing this email, it is sustaining, but only 1/2 of the
201401969 tuples have been processed after 3.5 hours. According to the
Linear Road benchmarking expectations, if your system is working well,
all 201401969 tuples must be done in 3.5 hrs max. So this means there is
still room for tuning the Flink nodes. I have already shared with you
all more details about my config. It ran perfectly yesterday with almost
1/10th of this load: perfect real-time send/processed streaming
behavior. If that's the case and I cannot get better performance with
FlinkRunner, my next stop is SparkRunner and a repeat of the whole thing
for the final benchmarking of the two under Beam APIs, which was the
initial intent anyway. If you have suggestions for improvements in the
above case, I am all ears and would greatly appreciate them.
Cheers, Amir-

  From: "Chawla,Sumit" <sumitkcha...@gmail.com>
  To: dev@flink.apache.org; amir bahmanyari <amirto...@yahoo.com>
  Sent: Sunday, September 18, 2016 2:07 PM
  Subject: Re: Performance and Latency Chart for Flink

Has anyone else run these kinds of benchmarks?  Would love to hear more
people's experience and details about those benchmarks.

Regards
Sumit Chawla


On Sun, Sep 18, 2016 at 2:01 PM, Chawla,Sumit <sumitkcha...@gmail.com>
wrote:


Hi Amir

Would it be possible for you to share the numbers? Also share, if possible,
your configuration details.

Regards
Sumit Chawla


On Fri, Sep 16, 2016 at 12:18 PM, amir bahmanyari <
amirto...@yahoo.com.invalid> wrote:


Hi Fabian, FYI: this is a report on other engines on which we did the same
type of benchmarking. It also explains what Linear Road benchmarking is.
Thanks for your help.
http://www.slideshare.net/RedisLabs/walmart-ibm-revisit-the-linear-road-benchmark
https://github.com/IBMStreams/benchmarks
https://www.datatorrent.com/blog/blog-implementing-linear-road-benchmark-in-apex/

Re: Performance and Latency Chart for Flink

2016-09-19 Thread amir bahmanyari
Hi Greg, in the same Flink config link below, there are parameters that don't
even exist in flink-conf.yaml. Are they defined somewhere else? I grepped for
the following and none existed in any of the files under the conf folder:
"taskmanager.memory.fraction", "taskmanager.memory.off-heap",
"taskmanager.memory.segment-size" and many more.
Also, isn't the example calculating the network buffers wrong? Based on the
example, roughly 5000 buffers x 32 KiB = 160000 KiB should be allocated.
160000 KiB divided by 1024 = 156.25 MiB. Why is the example saying "the system
would allocate roughly 300 MiBytes for network buffers."? That's roughly twice
as much. Am I missing something here? I still need your help to set the
accurate number for my
   - taskmanager.network.numberOfBuffers = 4096.
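
The arithmetic in this question can be reproduced as below. It is only a sketch with a hypothetical helper name, and it does not account for the documentation's "roughly 300 MiBytes" figure, which remains unexplained in this thread.

```python
# Illustrative sketch; pool_size_mib is a hypothetical helper, and the
# 32 KiB buffer size is the one used in the documentation example.
KIB_PER_MIB = 1024


def pool_size_mib(num_buffers: int, buffer_kib: int = 32) -> float:
    """Total size of the network buffer pool in MiB."""
    return num_buffers * buffer_kib / KIB_PER_MIB


doc_example = pool_size_mib(5000)      # 160000 KiB expressed in MiB
current_setting = pool_size_mib(4096)  # the 4096 buffers configured here

print(doc_example, current_setting)
```

The same helper gives 128 MiB for the current setting of 4096 buffers, matching the figure quoted elsewhere in the thread.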

Thanks for your response Greg.
Amir-

  From: amir bahmanyari <amirto...@yahoo.com>
 To: "dev@flink.apache.org" <dev@flink.apache.org>
 Sent: Monday, September 19, 2016 10:34 AM
 Subject: Re: Performance and Latency Chart for Flink

Hi Greg, I used this guideline to calculate
"taskmanager.network.numberOfBuffers": Apache Flink 1.2-SNAPSHOT
Documentation: Configuration

4096 = (16x16)x4x4, where 16 is the number of tasks per TM, 4 is the
number of TMs, and the last 4 is the constant in the formula. What would
you set it to? Once I have that number, I will set
"taskmanager.memory.preallocate" to true and give it another shot.
Thanks Greg

  From: Greg Hogan <c...@greghogan.com>
 To: dev@flink.apache.org; amir bahmanyari <amirto...@yahoo.com> 
 Sent: Monday, September 19, 2016 8:29 AM
 Subject: Re: Performance and Latency Chart for Flink
  
Hi Amir,

You may see improved performance setting "taskmanager.memory.preallocate:
true" in order to use off-heap memory.

Also, your number of buffers looks quite low and you may want to increase
"taskmanager.network.numberOfBuffers". Your setting of 4096 is only 128 MiB.

As this is only a benchmark, are you able to post the code to github to
solicit feedback?

Greg

On Sun, Sep 18, 2016 at 9:00 PM, amir bahmanyari <
amirto...@yahoo.com.invalid> wrote:

> I have new findings and, subsequently, relative improvements. I am
> testing as we speak: 4 Beam server nodes (Azure A11) and 2 Kafka nodes
> with the same config. I had to keep state somewhere, so I went with
> Redis. I found it to be a major bottleneck, as the Beam nodes constantly
> go across the network to update its repository. So I replaced Redis
> with Java ConcurrentHashMaps. Much faster.
> Then Kafka ran out of disk space and the replication manager complained,
> so I clustered the two Kafka nodes hoping they would share space. As of
> the second I am typing this email, it is sustaining, but only 1/2 of the
> 201401969 tuples have been processed after 3.5 hours. According to the
> Linear Road benchmarking expectations, if your system is working well,
> all 201401969 tuples must be done in 3.5 hrs max. So this means there is
> still room for tuning the Flink nodes. I have already shared with you
> all more details about my config. It ran perfectly yesterday with almost
> 1/10th of this load: perfect real-time send/processed streaming
> behavior. If that's the case and I cannot get better performance with
> FlinkRunner, my next stop is SparkRunner and a repeat of the whole thing
> for the final benchmarking of the two under Beam APIs, which was the
> initial intent anyway. If you have suggestions for improvements in the
> above case, I am all ears and would greatly appreciate them.
> Cheers, Amir-
>
>      From: "Chawla,Sumit" <sumitkcha...@gmail.com>
>  To: dev@flink.apache.org; amir bahmanyari <amirto...@yahoo.com>
>  Sent: Sunday, September 18, 2016 2:07 PM
>  Subject: Re: Performance and Latency Chart for Flink
>
> Has anyone else run these kinds of benchmarks?  Would love to hear more
> people's experience and details about those benchmarks.
>
> Regards
> Sumit Chawla
>
>
> On Sun, Sep 18, 2016 at 2:01 PM, Chawla,Sumit <sumitkcha...@gmail.com>
> wrote:
>
> > Hi Amir
> >
> > Would it be possible for you to share the numbers? Also share if possible
> > your configuration details.
> >
> > Regards
> > Sumit Chawla
> >
> >
> > On Fri, Sep 16, 2016 at 12:18 PM, amir bahmanyari <
> > amirto...@yahoo.com.invalid> wrote:
> >
> >> Hi Fabian, FYI: this is a report on other engines on which we did the same
> >> type of benchmarking. It also explains what Linear Road benchmarking is.
> >> Thanks for your help.
> >> http://www.slideshare.net/RedisLabs/walmart-ibm-revisit-the-linear-road-benchmark
> >> https://github.com/IBMStrea

Re: Performance and Latency Chart for Flink

2016-09-19 Thread amir bahmanyari
Hi Greg, I used this guideline to calculate
"taskmanager.network.numberOfBuffers": Apache Flink 1.2-SNAPSHOT
Documentation: Configuration

4096 = (16x16)x4x4, where 16 is the number of tasks per TM, 4 is the
number of TMs, and the last 4 is the constant in the formula. What would
you set it to? Once I have that number, I will set
"taskmanager.memory.preallocate" to true and give it another shot.
Thanks Greg
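
The guideline formula used here, (slots per TM)^2 x (number of TMs) x 4, can be written out as a small sketch. The function name is hypothetical; the inputs are the values given in this message.

```python
# Illustrative sketch of the guideline formula; recommended_buffers is a
# hypothetical name, not a Flink API.
def recommended_buffers(slots_per_tm: int, num_tms: int) -> int:
    """Guideline value: slots-per-TM squared, times the TM count, times 4."""
    return slots_per_tm ** 2 * num_tms * 4


buffers = recommended_buffers(slots_per_tm=16, num_tms=4)
print(buffers)
```

With 16 slots per TaskManager and 4 TaskManagers this reproduces the 4096 already configured, so the formula acts as a starting point rather than a ceiling.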

  From: Greg Hogan <c...@greghogan.com>
 To: dev@flink.apache.org; amir bahmanyari <amirto...@yahoo.com> 
 Sent: Monday, September 19, 2016 8:29 AM
 Subject: Re: Performance and Latency Chart for Flink
   
Hi Amir,

You may see improved performance setting "taskmanager.memory.preallocate:
true" in order to use off-heap memory.

Also, your number of buffers looks quite low and you may want to increase
"taskmanager.network.numberOfBuffers". Your setting of 4096 is only 128 MiB.

As this is only a benchmark, are you able to post the code to github to
solicit feedback?

Greg

On Sun, Sep 18, 2016 at 9:00 PM, amir bahmanyari <
amirto...@yahoo.com.invalid> wrote:

> I have new findings and, subsequently, relative improvements. I am
> testing as we speak: 4 Beam server nodes (Azure A11) and 2 Kafka nodes
> with the same config. I had to keep state somewhere, so I went with
> Redis. I found it to be a major bottleneck, as the Beam nodes constantly
> go across the network to update its repository. So I replaced Redis
> with Java ConcurrentHashMaps. Much faster.
> Then Kafka ran out of disk space and the replication manager complained,
> so I clustered the two Kafka nodes hoping they would share space. As of
> the second I am typing this email, it is sustaining, but only 1/2 of the
> 201401969 tuples have been processed after 3.5 hours. According to the
> Linear Road benchmarking expectations, if your system is working well,
> all 201401969 tuples must be done in 3.5 hrs max. So this means there is
> still room for tuning the Flink nodes. I have already shared with you
> all more details about my config. It ran perfectly yesterday with almost
> 1/10th of this load: perfect real-time send/processed streaming
> behavior. If that's the case and I cannot get better performance with
> FlinkRunner, my next stop is SparkRunner and a repeat of the whole thing
> for the final benchmarking of the two under Beam APIs, which was the
> initial intent anyway. If you have suggestions for improvements in the
> above case, I am all ears and would greatly appreciate them.
> Cheers, Amir-
>
>      From: "Chawla,Sumit" <sumitkcha...@gmail.com>
>  To: dev@flink.apache.org; amir bahmanyari <amirto...@yahoo.com>
>  Sent: Sunday, September 18, 2016 2:07 PM
>  Subject: Re: Performance and Latency Chart for Flink
>
> Has anyone else run these kinds of benchmarks?  Would love to hear more
> people's experience and details about those benchmarks.
>
> Regards
> Sumit Chawla
>
>
> On Sun, Sep 18, 2016 at 2:01 PM, Chawla,Sumit <sumitkcha...@gmail.com>
> wrote:
>
> > Hi Amir
> >
> > Would it be possible for you to share the numbers? Also share if possible
> > your configuration details.
> >
> > Regards
> > Sumit Chawla
> >
> >
> > On Fri, Sep 16, 2016 at 12:18 PM, amir bahmanyari <
> > amirto...@yahoo.com.invalid> wrote:
> >
> >> Hi Fabian, FYI: this is a report on other engines on which we did the same
> >> type of benchmarking. It also explains what Linear Road benchmarking is.
> >> Thanks for your help.
> >> http://www.slideshare.net/RedisLabs/walmart-ibm-revisit-the-linear-road-benchmark
> >> https://github.com/IBMStreams/benchmarks
> >> https://www.datatorrent.com/blog/blog-implementing-linear-road-benchmark-in-apex/
> >>
> >>
> >>      From: Fabian Hueske <fhue...@gmail.com>
> >>  To: "dev@flink.apache.org" <dev@flink.apache.org>
> >>  Sent: Friday, September 16, 2016 12:31 AM
> >>  Subject: Re: Performance and Latency Chart for Flink
> >>
> >> Hi,
> >>
> >> I am not aware of periodic performance runs for the Flink releases.
> >> I know a few benchmarks which have been published at different points in
> >> time like [1], [2], and [3] (you'll probably find more).
> >>
> >> In general, fair benchmarks that compare different systems (if there is
> >> such thing) are very difficult and the results often depend on the use
> >> case.
> >> IMO the best option is to run your own benchmarks, if you have a
> concrete
> >> use case.
> >>
> >> Best, F

Re: Performance and Latency Chart for Flink

2016-09-19 Thread Greg Hogan
Hi Amir,

You may see improved performance setting "taskmanager.memory.preallocate:
true" in order to use off-heap memory.

Also, your number of buffers looks quite low and you may want to increase
"taskmanager.network.numberOfBuffers". Your setting of 4096 is only 128 MiB.

As this is only a benchmark, are you able to post the code to github to
solicit feedback?

Greg

On Sun, Sep 18, 2016 at 9:00 PM, amir bahmanyari <
amirto...@yahoo.com.invalid> wrote:

> I have new findings and, subsequently, relative improvements. I am
> testing as we speak: 4 Beam server nodes (Azure A11) and 2 Kafka nodes
> with the same config. I had to keep state somewhere, so I went with
> Redis. I found it to be a major bottleneck, as the Beam nodes constantly
> go across the network to update its repository. So I replaced Redis
> with Java ConcurrentHashMaps. Much faster.
> Then Kafka ran out of disk space and the replication manager complained,
> so I clustered the two Kafka nodes hoping they would share space. As of
> the second I am typing this email, it is sustaining, but only 1/2 of the
> 201401969 tuples have been processed after 3.5 hours. According to the
> Linear Road benchmarking expectations, if your system is working well,
> all 201401969 tuples must be done in 3.5 hrs max. So this means there is
> still room for tuning the Flink nodes. I have already shared with you
> all more details about my config. It ran perfectly yesterday with almost
> 1/10th of this load: perfect real-time send/processed streaming
> behavior. If that's the case and I cannot get better performance with
> FlinkRunner, my next stop is SparkRunner and a repeat of the whole thing
> for the final benchmarking of the two under Beam APIs, which was the
> initial intent anyway. If you have suggestions for improvements in the
> above case, I am all ears and would greatly appreciate them.
> Cheers, Amir-
>
>   From: "Chawla,Sumit" <sumitkcha...@gmail.com>
>  To: dev@flink.apache.org; amir bahmanyari <amirto...@yahoo.com>
>  Sent: Sunday, September 18, 2016 2:07 PM
>  Subject: Re: Performance and Latency Chart for Flink
>
> Has anyone else run these kinds of benchmarks?  Would love to hear more
> people's experiences and details about those benchmarks.
>
> Regards
> Sumit Chawla
>
>
> On Sun, Sep 18, 2016 at 2:01 PM, Chawla,Sumit <sumitkcha...@gmail.com>
> wrote:
>
> > Hi Amir
> >
> > Would it be possible for you to share the numbers? Also share if possible
> > your configuration details.
> >
> > Regards
> > Sumit Chawla
> >
> >
> > On Fri, Sep 16, 2016 at 12:18 PM, amir bahmanyari <
> > amirto...@yahoo.com.invalid> wrote:
> >
> > >> Hi Fabian,
> > >> FYI, this is a report on other engines for which we did the same type
> > >> of benchmarking. It also explains what Linear Road benchmarking is.
> > >> Thanks for your help.
> > >> http://www.slideshare.net/RedisLabs/walmart-ibm-revisit-the-linear-road-benchmark
> > >> https://github.com/IBMStreams/benchmarks
> > >> https://www.datatorrent.com/blog/blog-implementing-linear-road-benchmark-in-apex/
> >>
> >>
> >>  From: Fabian Hueske <fhue...@gmail.com>
> >>  To: "dev@flink.apache.org" <dev@flink.apache.org>
> >>  Sent: Friday, September 16, 2016 12:31 AM
> >>  Subject: Re: Performance and Latency Chart for Flink
> >>
> >> Hi,
> >>
> >> I am not aware of periodic performance runs for the Flink releases.
> >> I know a few benchmarks which have been published at different points in
> >> time like [1], [2], and [3] (you'll probably find more).
> >>
> > >> In general, fair benchmarks that compare different systems (if there is
> > >> such a thing) are very difficult and the results often depend on the use
> > >> case.
> > >> IMO the best option is to run your own benchmarks, if you have a
> > >> concrete use case.
> >>
> >> Best, Fabian
> >>
> >> [1] 08/2015:
> >> http://data-artisans.com/high-throughput-low-latency-and-exa
> >> ctly-once-stream-processing-with-apache-flink/
> >> [2] 12/2015:
> >> https://yahooeng.tumblr.com/post/135321837876/benchmarking-
> >> streaming-computation-engines-at
> >> [3] 02/2016:
> >> http://data-artisans.com/extending-the-yahoo-streaming-benchmark/
> >>
> >>
> >> 2016-09-16 5:54 GMT+02:00 Chawla,Sumit <sumitkcha...@gmail.com>:
> >>
> >> > Hi
> >> >
> > >> > Is there any performance run that is done for each Flink release? Or
> > >> > are you aware of any third party evaluation of performance metrics
> > >> > for Flink?
> > >> > I am interested in seeing how performance has improved from release to
> > >> > release, and performance vs other competitors.
> >> >
> >> > Regards
> >> > Sumit Chawla
> >> >
> >>
> >>
> >>
> >>
> >
> >
>
>
>
>


Re: Performance and Latency Chart for Flink

2016-09-18 Thread amir bahmanyari
I have new findings and, subsequently, some relative improvements. I am testing
as we speak: 4 Beam server nodes (Azure A11) and 2 Kafka nodes with the same
config. I had to keep state somewhere, so I went with Redis. I found it to be a
major bottleneck, as the Beam nodes were constantly going across the network to
update its repository. So I replaced Redis with Java ConcurrentHashMaps. Much
faster. Then Kafka ran out of disk space and the replication manager
complained, so I clustered the two Kafka nodes hoping to share space.
As I type this email the run is sustaining, but only 1/2 of the 201401969
tuples have been processed after 3.5 hours. According to the Linear Road
benchmarking expectations, if your system is working well, all 201401969 tuples
must be done in 3.5 hrs max. So this means there is still room for tuning the
Flink nodes. I have already shared more details about my config with you all.
It ran perfectly yesterday with almost 1/10th of this load: perfect real-time
send/processed streaming behavior.
If that is the case and I cannot get better performance with FlinkRunner, my
next stop is SparkRunner, repeating the whole thing for a final benchmark of
the two under the Beam APIs, which was the initial intent anyway. If you have
suggestions for improvements in the above case, I am all ears and would greatly
appreciate them.
Cheers,
Amir
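
The 3.5 hour target implies a sustained processing rate. A rough sketch of that
arithmetic, assuming the tuples are spread evenly over the run (the function
name is made up for illustration):

```python
TOTAL_TUPLES = 201401969  # tuple count quoted in this message
TARGET_HOURS = 3.5        # Linear Road time limit quoted in this message


def tuples_per_second(tuples, hours):
    """Average rate needed to process `tuples` within `hours`."""
    return tuples / (hours * 3600)


if __name__ == "__main__":
    required = tuples_per_second(TOTAL_TUPLES, TARGET_HOURS)
    # Only about half of the tuples were processed in the same 3.5 hours.
    observed = tuples_per_second(TOTAL_TUPLES / 2, TARGET_HOURS)
    print(f"required: {required:.0f} tuples/s")
    print(f"observed: {observed:.0f} tuples/s")
```

Under that even-load assumption the cluster needs to sustain about twice its
current rate to meet the target.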

  From: "Chawla,Sumit" <sumitkcha...@gmail.com>
 To: dev@flink.apache.org; amir bahmanyari <amirto...@yahoo.com> 
 Sent: Sunday, September 18, 2016 2:07 PM
 Subject: Re: Performance and Latency Chart for Flink
   
Has anyone else run these kinds of benchmarks?  Would love to hear more
people's experiences and details about those benchmarks.

Regards
Sumit Chawla


On Sun, Sep 18, 2016 at 2:01 PM, Chawla,Sumit <sumitkcha...@gmail.com>
wrote:

> Hi Amir
>
> Would it be possible for you to share the numbers? Also share if possible
> your configuration details.
>
> Regards
> Sumit Chawla
>
>
> On Fri, Sep 16, 2016 at 12:18 PM, amir bahmanyari <
> amirto...@yahoo.com.invalid> wrote:
>
>> Hi Fabian,
>> FYI, this is a report on other engines for which we did the same type of
>> benchmarking. It also explains what Linear Road benchmarking is. Thanks
>> for your help.
>> http://www.slideshare.net/RedisLabs/walmart-ibm-revisit-the-linear-road-benchmark
>> https://github.com/IBMStreams/benchmarks
>> https://www.datatorrent.com/blog/blog-implementing-linear-road-benchmark-in-apex/
>>
>>
>>      From: Fabian Hueske <fhue...@gmail.com>
>>  To: "dev@flink.apache.org" <dev@flink.apache.org>
>>  Sent: Friday, September 16, 2016 12:31 AM
>>  Subject: Re: Performance and Latency Chart for Flink
>>
>> Hi,
>>
>> I am not aware of periodic performance runs for the Flink releases.
>> I know a few benchmarks which have been published at different points in
>> time like [1], [2], and [3] (you'll probably find more).
>>
>> In general, fair benchmarks that compare different systems (if there is
>> such a thing) are very difficult and the results often depend on the use
>> case.
>> IMO the best option is to run your own benchmarks, if you have a concrete
>> use case.
>>
>> Best, Fabian
>>
>> [1] 08/2015:
>> http://data-artisans.com/high-throughput-low-latency-and-exactly-once-stream-processing-with-apache-flink/
>> [2] 12/2015:
>> https://yahooeng.tumblr.com/post/135321837876/benchmarking-streaming-computation-engines-at
>> [3] 02/2016:
>> http://data-artisans.com/extending-the-yahoo-streaming-benchmark/
>>
>>
>> 2016-09-16 5:54 GMT+02:00 Chawla,Sumit <sumitkcha...@gmail.com>:
>>
>> > Hi
>> >
>> > Is there any performance run that is done for each Flink release? Or are
>> > you aware of any third party evaluation of performance metrics for
>> > Flink?
>> > I am interested in seeing how performance has improved from release to
>> > release, and performance vs other competitors.
>> >
>> > Regards
>> > Sumit Chawla
>> >
>>
>>
>>
>>
>
>


   

Re: Performance and Latency Chart for Flink

2016-09-18 Thread Chawla,Sumit
Has anyone else run these kinds of benchmarks?  Would love to hear more
people's experiences and details about those benchmarks.

Regards
Sumit Chawla


On Sun, Sep 18, 2016 at 2:01 PM, Chawla,Sumit <sumitkcha...@gmail.com>
wrote:

> Hi Amir
>
> Would it be possible for you to share the numbers? Also share if possible
> your configuration details.
>
> Regards
> Sumit Chawla
>
>
> On Fri, Sep 16, 2016 at 12:18 PM, amir bahmanyari <
> amirto...@yahoo.com.invalid> wrote:
>
>> Hi Fabian,
>> FYI, this is a report on other engines for which we did the same type of
>> benchmarking. It also explains what Linear Road benchmarking is. Thanks
>> for your help.
>> http://www.slideshare.net/RedisLabs/walmart-ibm-revisit-the-linear-road-benchmark
>> https://github.com/IBMStreams/benchmarks
>> https://www.datatorrent.com/blog/blog-implementing-linear-road-benchmark-in-apex/
>>
>>
>>   From: Fabian Hueske <fhue...@gmail.com>
>>  To: "dev@flink.apache.org" <dev@flink.apache.org>
>>  Sent: Friday, September 16, 2016 12:31 AM
>>  Subject: Re: Performance and Latency Chart for Flink
>>
>> Hi,
>>
>> I am not aware of periodic performance runs for the Flink releases.
>> I know a few benchmarks which have been published at different points in
>> time like [1], [2], and [3] (you'll probably find more).
>>
>> In general, fair benchmarks that compare different systems (if there is
>> such a thing) are very difficult and the results often depend on the use
>> case.
>> IMO the best option is to run your own benchmarks, if you have a concrete
>> use case.
>>
>> Best, Fabian
>>
>> [1] 08/2015:
>> http://data-artisans.com/high-throughput-low-latency-and-exactly-once-stream-processing-with-apache-flink/
>> [2] 12/2015:
>> https://yahooeng.tumblr.com/post/135321837876/benchmarking-streaming-computation-engines-at
>> [3] 02/2016:
>> http://data-artisans.com/extending-the-yahoo-streaming-benchmark/
>>
>>
>> 2016-09-16 5:54 GMT+02:00 Chawla,Sumit <sumitkcha...@gmail.com>:
>>
>> > Hi
>> >
>> > Is there any performance run that is done for each Flink release? Or are
>> > you aware of any third party evaluation of performance metrics for
>> > Flink?
>> > I am interested in seeing how performance has improved from release to
>> > release, and performance vs other competitors.
>> >
>> > Regards
>> > Sumit Chawla
>> >
>>
>>
>>
>>
>
>


Re: Performance and Latency Chart for Flink

2016-09-18 Thread Chawla,Sumit
Hi Amir

Would it be possible for you to share the numbers? Also share if possible
your configuration details.

Regards
Sumit Chawla


On Fri, Sep 16, 2016 at 12:18 PM, amir bahmanyari <
amirto...@yahoo.com.invalid> wrote:

> Hi Fabian,
> FYI, this is a report on other engines for which we did the same type of
> benchmarking. It also explains what Linear Road benchmarking is. Thanks for
> your help.
> http://www.slideshare.net/RedisLabs/walmart-ibm-revisit-the-linear-road-benchmark
> https://github.com/IBMStreams/benchmarks
> https://www.datatorrent.com/blog/blog-implementing-linear-road-benchmark-in-apex/
>
>
>   From: Fabian Hueske <fhue...@gmail.com>
>  To: "dev@flink.apache.org" <dev@flink.apache.org>
>  Sent: Friday, September 16, 2016 12:31 AM
>  Subject: Re: Performance and Latency Chart for Flink
>
> Hi,
>
> I am not aware of periodic performance runs for the Flink releases.
> I know a few benchmarks which have been published at different points in
> time like [1], [2], and [3] (you'll probably find more).
>
> In general, fair benchmarks that compare different systems (if there is
> such a thing) are very difficult and the results often depend on the use
> case.
> IMO the best option is to run your own benchmarks, if you have a concrete
> use case.
>
> Best, Fabian
>
> [1] 08/2015:
> http://data-artisans.com/high-throughput-low-latency-and-exactly-once-stream-processing-with-apache-flink/
> [2] 12/2015:
> https://yahooeng.tumblr.com/post/135321837876/benchmarking-streaming-computation-engines-at
> [3] 02/2016:
> http://data-artisans.com/extending-the-yahoo-streaming-benchmark/
>
>
> 2016-09-16 5:54 GMT+02:00 Chawla,Sumit <sumitkcha...@gmail.com>:
>
> > Hi
> >
> > Is there any performance run that is done for each Flink release? Or are
> > you aware of any third party evaluation of performance metrics for Flink?
> > I am interested in seeing how performance has improved from release to
> > release, and performance vs other competitors.
> >
> > Regards
> > Sumit Chawla
> >
>
>
>
>


Re: Performance and Latency Chart for Flink

2016-09-16 Thread amir bahmanyari
Hi Fabian,
FYI, this is a report on other engines for which we did the same type of
benchmarking. It also explains what Linear Road benchmarking is. Thanks for
your help.
http://www.slideshare.net/RedisLabs/walmart-ibm-revisit-the-linear-road-benchmark
https://github.com/IBMStreams/benchmarks 
https://www.datatorrent.com/blog/blog-implementing-linear-road-benchmark-in-apex/


  From: Fabian Hueske <fhue...@gmail.com>
 To: "dev@flink.apache.org" <dev@flink.apache.org> 
 Sent: Friday, September 16, 2016 12:31 AM
 Subject: Re: Performance and Latency Chart for Flink
   
Hi,

I am not aware of periodic performance runs for the Flink releases.
I know a few benchmarks which have been published at different points in
time like [1], [2], and [3] (you'll probably find more).

In general, fair benchmarks that compare different systems (if there is
such a thing) are very difficult and the results often depend on the use case.
IMO the best option is to run your own benchmarks, if you have a concrete
use case.

Best, Fabian

[1] 08/2015:
http://data-artisans.com/high-throughput-low-latency-and-exactly-once-stream-processing-with-apache-flink/
[2] 12/2015:
https://yahooeng.tumblr.com/post/135321837876/benchmarking-streaming-computation-engines-at
[3] 02/2016:
http://data-artisans.com/extending-the-yahoo-streaming-benchmark/


2016-09-16 5:54 GMT+02:00 Chawla,Sumit <sumitkcha...@gmail.com>:

> Hi
>
> Is there any performance run that is done for each Flink release? Or are
> you aware of any third party evaluation of performance metrics for Flink?
> I am interested in seeing how performance has improved from release to
> release, and performance vs other competitors.
>
> Regards
> Sumit Chawla
>


   

Re: Performance and Latency Chart for Flink

2016-09-16 Thread Timo Walther

Hi Amir,

it would be great if you could link to the details of your benchmark 
environment if you make such claims. Compared to which IBM system? 
Characteristics of your machines? Configuration of the software? 
Implementation code? etc.


In general the Beam Runner also adds some overhead compared to native 
Flink jobs.  There are many factors that could affect results. I don't 
know the Linear Road Benchmark but 150 times sounds unrealistic.


Timo


Am 16/09/16 um 10:02 schrieb amir bahmanyari:

FYI, we, at a well known IT department, have been actively measuring Beam Flink
Runner performance using MIT's Linear Road to stress the Flink Cluster servers.
The results thus far do not even come close to the previous streaming engines
we have benchmarked. Our optimistic assumption when we started was that Beam
runners (Flink, for instance) would leave Storm & IBM in the smoke. Wrong. What
IBM managed to perform is 150 times better than Flink. Needless to mention
Storm and Hortonworks. As an example, IBM handled 150 expressways in 3.5 hours.
In the identical topology, everything fixed, Beam Flink Runner in a Flink
Cluster handled 10 expressways in 17 hours at its best so far.
I have followed every single performance tuning recommendation that is out
there & none improved it even a bit. It works fine with 1 expressway. Sorry,
but those are our findings so far, unless we are doing something wrong. I
posted all details to this forum but never got any solid response that would
make a difference in our observations. Therefore, we assume what we are seeing
is the reality, which we have to report to our superiors. Please prove us
wrong. We still have some time.
Thanks,
Amir
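
For comparison, the two results quoted above can be reduced to expressways
handled per hour. A rough sketch, assuming the work scales linearly with the
number of expressways and with run time (an assumption this thread does not
confirm; the function name is illustrative):

```python
def expressways_per_hour(expressways, hours):
    """Average number of expressways handled per hour of run time."""
    return expressways / hours


if __name__ == "__main__":
    ibm = expressways_per_hour(150, 3.5)   # figures quoted for IBM
    flink = expressways_per_hour(10, 17)   # figures quoted for Beam Flink Runner
    print(f"IBM:   {ibm:.2f} expressways/hour")
    print(f"Flink: {flink:.2f} expressways/hour")
    print(f"ratio: {ibm / flink:.1f}x")
```

Under that linear assumption the two quoted data points give a ratio of about
73x, not 150x.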

   From: Fabian Hueske <fhue...@gmail.com>
  To: "dev@flink.apache.org" <dev@flink.apache.org>
  Sent: Friday, September 16, 2016 12:31 AM
  Subject: Re: Performance and Latency Chart for Flink

Hi,


I am not aware of periodic performance runs for the Flink releases.
I know a few benchmarks which have been published at different points in
time like [1], [2], and [3] (you'll probably find more).

In general, fair benchmarks that compare different systems (if there is
such a thing) are very difficult and the results often depend on the use case.
IMO the best option is to run your own benchmarks, if you have a concrete
use case.

Best, Fabian

[1] 08/2015:
http://data-artisans.com/high-throughput-low-latency-and-exactly-once-stream-processing-with-apache-flink/
[2] 12/2015:
https://yahooeng.tumblr.com/post/135321837876/benchmarking-streaming-computation-engines-at
[3] 02/2016:
http://data-artisans.com/extending-the-yahoo-streaming-benchmark/


2016-09-16 5:54 GMT+02:00 Chawla,Sumit <sumitkcha...@gmail.com>:


Hi

Is there any performance run that is done for each Flink release? Or are
you aware of any third party evaluation of performance metrics for Flink?
I am interested in seeing how performance has improved from release to
release, and performance vs other competitors.

Regards
Sumit Chawla







--
Freundliche Grüße / Kind Regards

Timo Walther

Follow me: @twalthr
https://www.linkedin.com/in/twalthr



Re: Performance and Latency Chart for Flink

2016-09-16 Thread Fabian Hueske
Hi,

I am not aware of periodic performance runs for the Flink releases.
I know a few benchmarks which have been published at different points in
time like [1], [2], and [3] (you'll probably find more).

In general, fair benchmarks that compare different systems (if there is
such a thing) are very difficult and the results often depend on the use case.
IMO the best option is to run your own benchmarks, if you have a concrete
use case.

Best, Fabian

[1] 08/2015:
http://data-artisans.com/high-throughput-low-latency-and-exactly-once-stream-processing-with-apache-flink/
[2] 12/2015:
https://yahooeng.tumblr.com/post/135321837876/benchmarking-streaming-computation-engines-at
[3] 02/2016:
http://data-artisans.com/extending-the-yahoo-streaming-benchmark/


2016-09-16 5:54 GMT+02:00 Chawla,Sumit <sumitkcha...@gmail.com>:

> Hi
>
> Is there any performance run that is done for each Flink release? Or are
> you aware of any third party evaluation of performance metrics for Flink?
> I am interested in seeing how performance has improved from release to
> release, and performance vs other competitors.
>
> Regards
> Sumit Chawla
>