[
https://issues.apache.org/jira/browse/BEAM-7206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Maximilian Michels updated BEAM-7206:
-------------------------------------
Status: Open (was: Triage Needed)
> Coder copy overhead
> -------------------
>
> Key: BEAM-7206
> URL: https://issues.apache.org/jira/browse/BEAM-7206
> Project: Beam
> Issue Type: Improvement
> Components: runner-flink, sdk-java-core
> Reporter: Jozef Vilcek
> Priority: Major
>
> More context can be found in discussion here:
> [http://mail-archives.apache.org/mod_mbox/beam-dev/201904.mbox/%3CCAOUjMkyKV8npYJfS_PF3Gzo=vwomb2frzute81zsrxnm13t...@mail.gmail.com%3E]
> I am not sure how runner-dependent this is, but each operator's user
> function receives a copy of the data element for isolation. Beam coders copy
> by serializing to bytes and then deserializing back. This seems to impact
> performance, and the overhead grows with job complexity.
> On the simple test pipeline described in the discussion thread above, I
> noticed an almost 2x speedup when CoderUtils.copy() just returned the object.
> A native Flink job copies too, but via Kryo, which seems to do the deep
> copy more efficiently, at the object level.
> What can be done in Beam to reduce this overhead?
>
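To make the two copy strategies in the description concrete, here is a minimal
sketch. It does not use Beam or Flink APIs: plain java.io serialization stands
in for a coder's encode/decode round trip, and the class, method, and element
names (CopyOverheadSketch, copyViaBytes, copyByReference, Event) are
hypothetical, chosen only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CopyOverheadSketch {

  // Hypothetical element type flowing between operators.
  static final class Event implements Serializable {
    private static final long serialVersionUID = 1L;
    final String key;
    final List<Integer> values;

    Event(String key, List<Integer> values) {
      this.key = key;
      this.values = values;
    }
  }

  // Copy by a full round trip through bytes: serialize, then deserialize.
  // Assumption: java.io serialization stands in for a coder here.
  static <T extends Serializable> T copyViaBytes(T value) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(value);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        @SuppressWarnings("unchecked")
        T copy = (T) in.readObject();
        return copy;
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("Round-trip copy failed", e);
    }
  }

  // The shortcut from the experiment: hand the same object to the next
  // operator. No isolation, but also no encode/decode work.
  static <T> T copyByReference(T value) {
    return value;
  }

  public static void main(String[] args) {
    Event original = new Event("k", new ArrayList<>(Arrays.asList(1, 2, 3)));

    Event isolated = copyViaBytes(original);
    Event shared = copyByReference(original);

    // The byte round trip yields a distinct object graph; the shortcut does not.
    System.out.println("round trip is a new object: " + (isolated != original));
    System.out.println("shortcut is the same object: " + (shared == original));
  }
}
```

The round trip pays for encoding and decoding every element at every operator
boundary, which is where the overhead would accumulate as a job grows. The
shortcut avoids that cost but gives up isolation, so it is only safe if user
functions do not mutate their inputs.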
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)