Charles Chen created BEAM-3537:
----------------------------------
Summary: Remove DirectRunner-specific internal PValue cache, allow
more general eager in-process pipeline execution
Key: BEAM-3537
URL: https://issues.apache.org/jira/browse/BEAM-3537
Project: Beam
Issue Type: Bug
Components: sdk-py-core
Affects Versions: 2.2.0
Reporter: Charles Chen
Assignee: Charles Chen
Currently, the Python SDK supports an eager execution mode. For example, a
list can be directly passed into a PTransform to obtain its result:
{{result = [1, 2, 3] | MyPTransform()}}
To support this use case, the Python DirectRunner has an option to cache its
intermediate results in a PValueCache. The above line, when run, implicitly
creates an ephemeral pipeline and runs it with the DirectRunner. This caching,
however, adds considerable complexity to the DirectRunner and does not
generalize to other in-process Python runners (such as the in-process Python
FnApiRunner).
To improve this, we should remove this DirectRunner-specific implementation and
add functionality that allows all in-process Python runners to be run in eager
mode.
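As a rough illustration of the proposed direction, the sketch below models
eager mode with a toy, standard-library-only stand-in. None of these classes
are Beam APIs: ToyRunner, ToyPipeline, ToyPTransform and the runner_factory
attribute are hypothetical names chosen for this example. The point is that
the eager path only needs a runner that can run an ephemeral pipeline and
return its output, so no runner-specific cache is involved.

```python
# Toy model only: these classes are hypothetical and are NOT Apache Beam APIs.


class ToyRunner:
    """Stand-in for any in-process runner exposing run(pipeline)."""

    def run(self, pipeline):
        # Apply each transform in order and return the materialized output.
        values = list(pipeline.source)
        for transform in pipeline.transforms:
            values = transform.expand(values)
        return values


class ReversingRunner(ToyRunner):
    """A second toy runner, used to show that the runner is swappable."""

    def run(self, pipeline):
        return list(reversed(super().run(pipeline)))


class ToyPipeline:
    """Ephemeral pipeline: a source plus an ordered list of transforms."""

    def __init__(self, source, runner):
        self.source = source
        self.transforms = []
        self.runner = runner

    def run(self):
        return self.runner.run(self)


class ToyPTransform:
    # Assumption for this sketch: eager mode picks its runner from this hook.
    runner_factory = ToyRunner

    def expand(self, values):
        raise NotImplementedError

    def __ror__(self, source):
        # Called for `iterable | transform` because list defines no `|`
        # operator. Build a throwaway pipeline, run it, return the result.
        pipeline = ToyPipeline(source, self.runner_factory())
        pipeline.transforms.append(self)
        return pipeline.run()


class Double(ToyPTransform):
    def expand(self, values):
        return [2 * v for v in values]


class DoubleReversed(Double):
    # Same transform, different in-process runner; no runner-specific cache.
    runner_factory = ReversingRunner


result = [1, 2, 3] | Double()
other_result = [1, 2, 3] | DoubleReversed()
print(result, other_result)
```

In this toy model the transform never inspects the runner's internals, which
is the property the issue asks for: any in-process runner could be plugged in
for eager execution.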