[
https://issues.apache.org/jira/browse/BEAM-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ankur Goenka updated BEAM-3189:
-------------------------------
Description:
The Beam Python SDK is a couple of orders of magnitude slower than the Java SDK
when it comes to stream processing.
There are two related issues:
# Given a single core, we currently do not fully utilize it because
the single thread spends a lot of time on I/O. This is a limitation
of our implementation rather than a limitation of Python.
# Given a machine with multiple cores, a single Python process can only utilize
one core.
In this task we will only address issue 1. Issue 2 is left as a future optimization.
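To illustrate issue 1, here is a minimal sketch (not the actual SDK harness code) of how blocking I/O can be overlapped with processing inside one Python process. The names read_element, process, run_serial and run_overlapped are hypothetical, and time.sleep stands in for a blocking read on the data channel. It relies on the fact that blocking I/O releases the GIL, so reader threads can wait while the main thread keeps working.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def read_element(i, io_delay=0.01):
    """Hypothetical stand-in for a blocking read from the data channel."""
    time.sleep(io_delay)  # a blocking call like this releases the GIL
    return i


def process(x):
    """Hypothetical stand-in for the per-element user code."""
    return x * x


def run_serial(n):
    # One thread alternates between waiting on I/O and processing,
    # so the core sits idle during every read.
    return [process(read_element(i)) for i in range(n)]


def run_overlapped(n, readers=4):
    # Reader threads block on I/O while the main thread processes
    # elements that have already arrived. pool.map preserves input order.
    results = []
    with ThreadPoolExecutor(max_workers=readers) as pool:
        for x in pool.map(read_element, range(n)):
            results.append(process(x))
    return results


if __name__ == "__main__":
    for fn in (run_serial, run_overlapped):
        start = time.perf_counter()
        fn(40)
        print(fn.__name__, round(time.perf_counter() - start, 3), "s")
```

Both functions return the same results; only the time spent waiting on I/O differs. This addresses issue 1 only: the processing itself still runs on a single core.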
was:
Python post commits are failing because the runner harness is not compatible
with the sdk harness.
We need a new runner harness compatible with:
https://github.com/apache/beam/commit/80c6f4ec0c2a3cc3a441289a9cc8ff53cb70f863
> Python Fnapi - Worker speedup
> -----------------------------
>
> Key: BEAM-3189
> URL: https://issues.apache.org/jira/browse/BEAM-3189
> Project: Beam
> Issue Type: Improvement
> Components: sdk-py-harness
> Affects Versions: 2.3.0
> Reporter: Ankur Goenka
> Assignee: Ankur Goenka
> Priority: Minor
> Labels: performance, portability
>
> The Beam Python SDK is a couple of orders of magnitude slower than the Java SDK
> when it comes to stream processing.
> There are two related issues:
> # Given a single core, we currently do not fully utilize it because
> the single thread spends a lot of time on I/O. This is a limitation
> of our implementation rather than a limitation of Python.
> # Given a machine with multiple cores, a single Python process can only
> utilize one core.
> In this task we will only address issue 1. Issue 2 is left as a future optimization.