One network setting, the buffer timeout, is described here:

https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/datastream_api.html#controlling-latency
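
For reference, the setting on that page is the buffer timeout, which
controls how long a network buffer may wait before it is flushed even if it
is not full. Below is a minimal sketch, assuming the Scala DataStream API
from Flink 1.2; the source and operators are placeholders, not the job from
the original message. According to the linked docs, -1 removes the timeout
(buffers are flushed only when full, which favors throughput), while a value
of 0 should be avoided.

```scala
import org.apache.flink.streaming.api.scala._

object BufferTimeoutSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Job-wide default: flush network buffers only when they are full.
    // Per the linked docs, -1 removes the timeout and favors throughput.
    env.setBufferTimeout(-1)

    env
      .generateSequence(1, 1000)   // placeholder source, not the real job
      .map(_ + 1)                  // placeholder operator
      .setBufferTimeout(50)        // per-operator override in ms (value is an assumption)
      .print()

    env.execute("buffer-timeout-sketch")
  }
}
```

Whether a longer timeout actually helps depends on how quickly the buffers
fill at your message rate, so it is worth measuring a few values.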


From: Dmitry Golubets <dgolub...@gmail.com>
Date: Friday, February 17, 2017 at 6:43 AM
To: <user@flink.apache.org>
Subject: Performance tuning

Hi,

Unfortunately, my streaming job cannot benefit much from parallelization.
So I'm looking for things I can tune in Flink to make it process a sequential 
stream faster.

Our current engine, based on Akka Streams (non-distributed, of course), 
handles 20k msg/sec.
After porting to Flink, I'm getting 14k so far.

My observations are as follows:

  *   If I chain operations together, they all execute in sequence, so the 
time to process one data item is basically the sum across all my stream 
operators, which is not good.
  *   If I split chains, they execute asynchronously to each other, but there 
is serialization and network overhead.
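
For illustration, splitting chains as described in the second point might
look like the sketch below. It assumes the Scala DataStream API and uses
placeholder operators rather than the real job; startNewChain() and
disableChaining() are the per-operator controls, and
env.disableOperatorChaining() turns chaining off for the whole job.

```scala
import org.apache.flink.streaming.api.scala._

object ChainingSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setParallelism(1) // sequential stream, as in the scenario above

    env
      .generateSequence(1, 1000)   // placeholder source
      .map(_ * 2)                  // placeholder stage A
      // This map starts a new chain: it is not chained to stage A,
      // but operators after it may still be chained to it.
      .map(_ + 1).startNewChain()
      // This filter is never chained to its neighbors, so records
      // reach it through a serialized hand-over.
      .filter(_ % 3 == 0).disableChaining()
      .print()

    env.execute("chaining-sketch")
  }
}
```

Each chain boundary adds one serialization step, so splitting only around
the most expensive operators may keep most of the benefit at lower cost.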

The second approach gives me better results, given that I have a server with 
more than enough memory and cores to do all the side work for serialization. 
But I want to reduce this serialization/data transfer overhead to a minimum.

So what I have now:

// cos it's Scala we don't need unnecessary serialization
environment.getConfig.enableObjectReuse()
// it works faster with it, I'm not sure why
environment.getConfig.disableAutoTypeRegistration()
// custom Message Pack serialization for all message types,
// gives about 50% boost
environment.addDefaultKryoSerializer(..)

But that's it, I don't know what else to do.
I didn't find any interesting network/buffer settings in the docs.

Best regards,
Dmitry
