Hi,

Unfortunately, my streaming job cannot benefit much from parallelization.
So I'm looking for things I can tune in Flink to make it process a
sequential stream faster.

Our current engine, based on Akka Streams (non-distributed, of course),
handles 20k msg/sec.
Ported to Flink, I'm getting 14k so far.

My observations are as follows:

   - if I chain operations together, they all execute in sequence, so I
   basically sum up the time required to process one data item across all my
   stream operators, which is not good
   - if I split chains, they execute asynchronously to each other, but
   there is serialization and network overhead
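
To make the two layouts concrete, here is a minimal sketch of how chain
boundaries can be controlled from the Scala DataStream API. The source and
the three map stages are placeholders and not the real job; only
startNewChain() and disableChaining() are the point.

```scala
import org.apache.flink.streaming.api.scala._

object ChainingSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setParallelism(1) // sequential stream, as in the job described above

    // Placeholder input; the real job reads messages from elsewhere.
    val input: DataStream[Long] = env.fromElements(1L, 2L, 3L)

    // Stage A: left with default chaining, so it runs in the source's task.
    val stageA = input.map(x => x + 1)

    // Stage B: starts a new chain, so A and B run in separate tasks and
    // records crossing the boundary are serialized and handed over.
    val stageB = stageA.map(x => x * 2).startNewChain()

    // Stage C: never chained to its neighbours in either direction.
    val stageC = stageB.map(x => x - 1).disableChaining()

    stageC.print()
    env.execute("chaining-sketch")
  }
}
```

Calling env.disableOperatorChaining() instead would split every operator
into its own task, which is the extreme form of the second layout.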

The second approach gives me better results, considering that I have a
server with more than enough memory and cores to do all the side work for
serialization. But I want to reduce this serialization/data transfer
overhead to a minimum.

So what I have now:

// because it's Scala, we don't need the unnecessary serialization
environment.getConfig.enableObjectReuse()
// it works faster with this, I'm not sure why
environment.getConfig.disableAutoTypeRegistration()
// custom Message Pack serialization for all message types,
// gives about a 50% boost
environment.addDefaultKryoSerializer(..)
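
For context, a minimal sketch of what such a registration can look like.
Everything named here is an assumption for illustration: Tick is a
hypothetical message type, TickSerializer is a hypothetical serializer, and
it writes fields with Kryo's own primitives where the real job would call a
Message Pack encoder.

```scala
import com.esotericsoftware.kryo.{Kryo, Serializer}
import com.esotericsoftware.kryo.io.{Input, Output}
import org.apache.flink.streaming.api.scala._

// Hypothetical message type. It is deliberately a plain class: Scala case
// classes get Flink's own case-class serializer and would bypass Kryo.
final class Tick(val id: Long, val symbol: String, val price: Double)

// Hypothetical serializer; a stand-in for the Message Pack based one.
class TickSerializer extends Serializer[Tick] with Serializable {
  override def write(kryo: Kryo, output: Output, tick: Tick): Unit = {
    output.writeLong(tick.id)
    output.writeString(tick.symbol)
    output.writeDouble(tick.price)
  }

  // Fields are read back in the same order they were written.
  override def read(kryo: Kryo, input: Input, tpe: Class[Tick]): Tick =
    new Tick(input.readLong(), input.readString(), input.readDouble())
}

object SerializerSetupSketch {
  def main(args: Array[String]): Unit = {
    val environment = StreamExecutionEnvironment.getExecutionEnvironment
    environment.getConfig.enableObjectReuse()
    environment.getConfig.disableAutoTypeRegistration()
    // Kryo will use TickSerializer whenever it has to serialize a Tick.
    environment.addDefaultKryoSerializer(classOf[Tick], classOf[TickSerializer])
  }
}
```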

But that's it, I don't know what else to do.
I didn't find any interesting network/buffer settings in the docs.

Best regards,
Dmitry
