[ https://issues.apache.org/jira/browse/BEAM-831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15999300#comment-15999300 ]
Thomas Weise commented on BEAM-831:
-----------------------------------
[~kenn] I verified that the PR fuses the ParDo transforms and, in the wordcount
example, reduces the number of containers from 14 to 7. Further optimizations
are possible, such as fusing Read with a downstream ParDo or fusing GBK with
combine; these can be taken up later.
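
As a rough illustration of why chaining cuts the container count, the sketch
below models a linear pipeline and groups adjacent ParDo operators into one
container. It is a toy model in plain Python, not the Apex runner's
translation code: the `Operator` class, the `fuse_pardo_chains` helper, and
the example pipeline are all hypothetical, and the pipeline is not the actual
wordcount plan.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Operator:
    """Hypothetical stand-in for one operator in a linear execution plan."""
    name: str
    kind: str  # e.g. "Read", "ParDo", "GroupByKey"


def fuse_pardo_chains(operators: List[Operator]) -> List[List[Operator]]:
    """Group operators into containers, fusing runs of adjacent ParDos.

    Every non-ParDo operator keeps its own container, which mirrors the
    comment above: Read and GBK fusion are left for later.
    """
    containers: List[List[Operator]] = []
    for op in operators:
        previous_is_pardo = bool(containers) and containers[-1][-1].kind == "ParDo"
        if op.kind == "ParDo" and previous_is_pardo:
            # Extend the current ParDo chain instead of opening a new container.
            containers[-1].append(op)
        else:
            containers.append([op])
    return containers


# Made-up linear pipeline used only for illustration.
pipeline = [
    Operator("read", "Read"),
    Operator("extract", "ParDo"),
    Operator("filter", "ParDo"),
    Operator("pair", "ParDo"),
    Operator("group", "GroupByKey"),
    Operator("sum", "ParDo"),
    Operator("format", "ParDo"),
]

containers = fuse_pardo_chains(pipeline)
print(len(pipeline), "operators ->", len(containers), "containers")
```

In this made-up plan the seven operators end up in four containers; fusing
Read or GBK with their neighbours would reduce the count further.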
> ParDo Chaining
> --------------
>
> Key: BEAM-831
> URL: https://issues.apache.org/jira/browse/BEAM-831
> Project: Beam
> Issue Type: Improvement
> Components: runner-apex
> Reporter: Thomas Weise
> Assignee: Chinmay Kolhatkar
> Fix For: First stable release
>
>
> The Apex runner currently creates a plan that places each operator in a
> separate container (a separate process when running on a YARN cluster).
> Often the ParDo operators can be collocated in the same thread or container.
> Use Apex affinity/stream locality attributes to produce a more efficient
> execution plan.