[ 
https://issues.apache.org/jira/browse/BEAM-7206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-7206:
-------------------------------------
    Status: Open  (was: Triage Needed)

> Coder copy overhead
> -------------------
>
>                 Key: BEAM-7206
>                 URL: https://issues.apache.org/jira/browse/BEAM-7206
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-flink, sdk-java-core
>            Reporter: Jozef Vilcek
>            Priority: Major
>
> More context can be found in discussion here:
> [http://mail-archives.apache.org/mod_mbox/beam-dev/201904.mbox/%3CCAOUjMkyKV8npYJfS_PF3Gzo=vwomb2frzute81zsrxnm13t...@mail.gmail.com%3E]
> I am not sure how much is this runner dependent, but each operator's user 
> function receives a copy of data element for isolation. Beam coders does copy 
> by serializing to bytes and then deserialize back. This seems to impact 
> performance and grows with job complexity.
> On a simple test pipeline described in discussion thread above, I noticed 
> almost 2x speedup when CoderUtils.copy() just returned the object. 
> Native Flink job does copy too, but via Kryo, which seems to be doing deep 
> copy more effectively, on object level.
> What can be done in Beam to reduce this overhead?
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to