Teodor Spæren created BEAM-11146:
------------------------------------
Summary: Add option to disable copying between operators in the Flink runner
Key: BEAM-11146
URL: https://issues.apache.org/jira/browse/BEAM-11146
Project: Beam
Issue Type: Improvement
Components: runner-flink
Reporter: Teodor Spæren
To satisfy Flink's
[TypeSerializer|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/typeutils/TypeSerializer.java]
contract, the runner provides
[CoderTypeSerializer|https://github.com/apache/beam/blob/master/runners/flink/1.8/src/main/java/org/apache/beam/runners/flink/translation/types/CoderTypeSerializer.java#L84].
Its {{copy}} function is implemented by first serializing and then
deserializing each element. This deep copy has to be done between operators,
and it can become a bottleneck.
The {{copy}} functions need to be implemented because Flink guarantees that
elements will be deep copied between operators. Beam instead makes it the
user's responsibility not to mutate elements, so the copy is not strictly
necessary.
The aim of this improvement is to introduce an option on the Flink Runner that
eliminates this overhead by having {{copy}} simply return the value it is given.
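To make the proposal concrete, here is a minimal sketch of the two behaviours. It is not the CoderTypeSerializer code: the class name {{CopySketch}} and the flag {{skipCopy}} are hypothetical, and plain Java serialization stands in for the Beam coder so that the example runs without Flink or Beam on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;

// Hypothetical stand-in for a serializer's copy(); not the actual runner code.
final class CopySketch {
    // Hypothetical flag; the real option name is not decided in this issue.
    private final boolean skipCopy;

    CopySketch(boolean skipCopy) {
        this.skipCopy = skipCopy;
    }

    @SuppressWarnings("unchecked")
    <T extends Serializable> T copy(T element) {
        if (skipCopy) {
            // Proposed behaviour: hand back the same instance, no copy at all.
            return element;
        }
        try {
            // Current behaviour: deep copy by serializing, then deserializing.
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(element);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Could not copy element", e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> element = new ArrayList<>(Arrays.asList("a", "b"));
        ArrayList<String> deep = new CopySketch(false).copy(element);
        ArrayList<String> same = new CopySketch(true).copy(element);
        // The deep copy is an equal but distinct object.
        System.out.println(deep != element && deep.equals(element));
        // With the flag set, the very same instance comes back.
        System.out.println(same == element);
    }
}
```

With the flag set, a downstream operator that mutates the element would also change what the upstream operator still holds, so the option is only safe for pipelines that respect Beam's rule against mutating elements.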
[Here is the mailing list
discussion|https://lists.apache.org/thread.html/r24129dba98782e1cf4d18ec738ab9714dceb05ac23f13adfac5baad1%40%3Cdev.beam.apache.org%3E]