[ 
https://issues.apache.org/jira/browse/BEAM-11146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anonymous updated BEAM-11146:
-----------------------------
    Status: Triage Needed  (was: Resolved)

> Add option to disable copying between Flink runner 
> ---------------------------------------------------
>
>                 Key: BEAM-11146
>                 URL: https://issues.apache.org/jira/browse/BEAM-11146
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-flink
>            Reporter: Teodor Spæren
>            Assignee: Teodor Spæren
>            Priority: P2
>              Labels: performance
>             Fix For: 2.26.0
>
>          Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> In order to implement Flink 
> [TypeSerializer|https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/api/common/typeutils/TypeSerializer.java]
>  the runner implements 
> [CoderTypeSerializer|https://github.com/apache/beam/blob/master/runners/flink/1.8/src/main/java/org/apache/beam/runners/flink/translation/types/CoderTypeSerializer.java#L84].
>  Its {{copy}} function works by first serializing and then deserializing 
> each element. This means that a deep copy is performed between each pair of 
> operators, which can become a bottleneck.
> The {{copy}} functions need to be implemented because Flink 
> guarantees that elements will be deep copied between each operator. In Beam, 
> this is the user's responsibility, so the deep copy is not strictly necessary.
> The aim of this improvement is to introduce an option on the Flink runner 
> that eliminates this overhead by simply returning the value.
> [Here is the mailing list 
> discussion|https://lists.apache.org/thread.html/r24129dba98782e1cf4d18ec738ab9714dceb05ac23f13adfac5baad1%40%3Cdev.beam.apache.org%3E]
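
The difference between the two copy strategies described above can be sketched in plain Java. This is only an illustration under stated assumptions: {{CopySketch}} and the {{skipDeepCopy}} flag are hypothetical names, Java serialization stands in for the Beam coder, and none of this is the actual CoderTypeSerializer code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a serializer with a copy() method.
// It does not use any Beam or Flink classes.
final class CopySketch {
    // Hypothetical flag: when true, copy() returns the element unchanged.
    private final boolean skipDeepCopy;

    CopySketch(boolean skipDeepCopy) {
        this.skipDeepCopy = skipDeepCopy;
    }

    @SuppressWarnings("unchecked")
    <T extends Serializable> T copy(T value) {
        if (skipDeepCopy) {
            // Proposed behaviour: no copy, the caller must not mutate the element.
            return value;
        }
        // Deep copy by a serialize/deserialize round trip.
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Could not copy element", e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> element = new ArrayList<>(List.of("a", "b"));
        ArrayList<String> deep = new CopySketch(false).copy(element);
        ArrayList<String> same = new CopySketch(true).copy(element);
        System.out.println("deep copy is a new object: " + (deep != element));
        System.out.println("skipped copy is the same object: " + (same == element));
    }
}
```

With the flag off, every call pays for a full encode and decode of the element. With it on, the call is a plain reference return, which is only safe if no operator mutates its input or output elements.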



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
