[
https://issues.apache.org/jira/browse/BEAM-11146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Anonymous updated BEAM-11146:
-----------------------------
Status: Triage Needed (was: Resolved)
> Add option to disable copying between Flink runner
> ---------------------------------------------------
>
> Key: BEAM-11146
> URL: https://issues.apache.org/jira/browse/BEAM-11146
> Project: Beam
> Issue Type: Improvement
> Components: runner-flink
> Reporter: Teodor Spæren
> Assignee: Teodor Spæren
> Priority: P2
> Labels: performance
> Fix For: 2.26.0
>
> Time Spent: 2.5h
> Remaining Estimate: 0h
>
> To implement Flink's
> [TypeSerializer|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/typeutils/TypeSerializer.java]
> the runner provides
> [CoderTypeSerializer|https://github.com/apache/beam/blob/master/runners/flink/1.8/src/main/java/org/apache/beam/runners/flink/translation/types/CoderTypeSerializer.java#L84].
> The {{copy}} function is implemented by first serializing and then
> deserializing each element. This means that a deep copy is made between
> each pair of operators, which can become a bottleneck.
> The {{copy}} function has to be implemented because Flink guarantees
> that elements are deep copied between operators. In Beam, not mutating
> elements is the user's responsibility, so this deep copy is not strictly
> necessary.
> The aim of this improvement is to introduce an option on the Flink Runner
> that eliminates this overhead by simply returning the value.
> [Here is the mailing list
> discussion|https://lists.apache.org/thread.html/r24129dba98782e1cf4d18ec738ab9714dceb05ac23f13adfac5baad1%40%3Cdev.beam.apache.org%3E]
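The trade-off described above can be illustrated with a minimal, self-contained sketch. It is not the code of CoderTypeSerializer: the class name CopySketch and the flag skipDeepCopy are hypothetical, and plain Java serialization stands in for a Beam coder. It only shows the difference between copying by a serialize/deserialize round trip and returning the element unchanged.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

// Hypothetical stand-in for a coder-backed serializer; not Beam's actual class.
final class CopySketch<T extends Serializable> {

    // Hypothetical option: when true, copy() returns the element itself.
    private final boolean skipDeepCopy;

    CopySketch(boolean skipDeepCopy) {
        this.skipDeepCopy = skipDeepCopy;
    }

    T copy(T element) {
        if (skipDeepCopy) {
            // Fast path: no copy at all. Safe only if nobody mutates the element.
            return element;
        }
        try {
            // Slow path: deep copy by serializing and then deserializing.
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
                out.writeObject(element);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copied = (T) in.readObject();
                return copied;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Could not copy element", e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> element = new ArrayList<>();
        element.add("a");

        ArrayList<String> deep = new CopySketch<ArrayList<String>>(false).copy(element);
        ArrayList<String> same = new CopySketch<ArrayList<String>>(true).copy(element);

        System.out.println("deep copy is a new instance: " + (deep != element));
        System.out.println("deep copy has equal content: " + deep.equals(element));
        System.out.println("fast path returns same instance: " + (same == element));
    }
}
```

With the fast path enabled, a mutation made by one operator would be visible to any other holder of the same reference, which is why such an option only makes sense when elements are treated as immutable.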