Pablo Estrada created BEAM-1442:
-----------------------------------
Summary: Performance improvement of the Python DirectRunner
Key: BEAM-1442
URL: https://issues.apache.org/jira/browse/BEAM-1442
Project: Beam
Issue Type: Improvement
Components: sdk-py
Reporter: Pablo Estrada
Assignee: Ahmet Altay
The DirectRunners for Python and Java are intended to act as policy enforcers
and correctness checkers for Beam pipelines, but some users also run data
processing tasks on them.
Currently, the Python DirectRunner performs poorly, although some work has
already gone into improving it. There are more opportunities for improvement.
Skills for this project:
* Python
* Cython (nice to have)
* Working through the Beam getting started materials (nice to have)
To start investigating this problem, it is advisable to run a simple pipeline
and study the `Pipeline.run` and `DirectRunner.run` methods. Ask questions
directly on JIRA.
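
One possible way to begin is to run a small pipeline under a profiler and see
where the time goes. The sketch below is only an illustration, not part of the
issue: `profile_callable` and `run_simple_pipeline` are hypothetical helper
names, the pipeline shape and element count are arbitrary, and it assumes the
`apache-beam` package is installed for the pipeline part.

```python
import cProfile
import io
import pstats


def profile_callable(fn, top=15, sort_key="cumulative"):
    """Run fn() under cProfile; return (fn's result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(fn)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so expensive call trees appear first.
    stats.sort_stats(sort_key).print_stats(top)
    return result, buffer.getvalue()


def run_simple_pipeline(num_elements=10000):
    """Hypothetical example: a tiny pipeline run on the DirectRunner."""
    # Imported lazily so the profiling helper works without Beam installed.
    import apache_beam as beam

    pipeline = beam.Pipeline(runner="DirectRunner")
    _ = (
        pipeline
        | "Create" >> beam.Create(range(num_elements))
        | "KeyByMod" >> beam.Map(lambda x: (x % 10, 1))
        | "SumPerKey" >> beam.CombinePerKey(sum)
    )
    result = pipeline.run()
    result.wait_until_finish()
    return result


# Example usage (assumes `pip install apache-beam`):
#     _, report = profile_callable(run_simple_pipeline)
#     print(report)
```

Sorting by cumulative time should make it easier to see which calls beneath
`Pipeline.run` dominate; the hot spots found this way are only a starting
point for reading the runner code.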