Chamikara Madhusanka Jayalath created BEAM-10012:
----------------------------------------------------
Summary: Update Python SDK to construct Dataflow job requests from
Beam runner API protos
Key: BEAM-10012
URL: https://issues.apache.org/jira/browse/BEAM-10012
Project: Beam
Issue Type: New Feature
Components: sdk-py-core
Reporter: Chamikara Madhusanka Jayalath
Currently, portable runners are expected to do the following when constructing a
runner-specific job:
SDK-specific job graph -> Beam runner API proto -> Runner-specific job request
Portable Spark and Flink follow this model.
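As a rough illustration of this two-step model, here is a minimal sketch in plain Python. It does not use the real Beam or runner APIs: SdkTransform, PipelineProto, to_runner_api and to_job_request are hypothetical stand-ins, and the URNs and job request layout are placeholders chosen only for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SdkTransform:
    """Hypothetical stand-in for a node in an SDK-specific job graph."""
    name: str
    urn: str
    inputs: List[str] = field(default_factory=list)


@dataclass
class PipelineProto:
    """Hypothetical stand-in for the Beam runner API pipeline proto."""
    transforms: Dict[str, dict]


def to_runner_api(graph: List[SdkTransform]) -> PipelineProto:
    """Step 1: SDK-specific job graph -> language-neutral pipeline proto."""
    return PipelineProto(
        transforms={
            t.name: {"urn": t.urn, "inputs": list(t.inputs)} for t in graph
        }
    )


def to_job_request(proto: PipelineProto, job_name: str) -> dict:
    """Step 2: pipeline proto -> runner-specific job request.

    Only the proto is consulted here; the SDK graph is no longer needed.
    """
    steps = [
        {"name": tid, "kind": spec["urn"], "inputs": spec["inputs"]}
        for tid, spec in proto.transforms.items()
    ]
    return {"job_name": job_name, "steps": steps}


# Placeholder graph: a read followed by a per-key combine.
graph = [
    SdkTransform("Read", "beam:transform:read:v1"),
    SdkTransform("Count", "beam:transform:combine_per_key:v1", inputs=["Read"]),
]
request = to_job_request(to_runner_api(graph), job_name="example-job")
```

The point of the sketch is that the second step depends only on the proto, so any SDK that can produce the proto can target the runner.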
Dataflow instead does the following:
SDK-specific job graph -> Runner-specific job request
Beam runner API proto -> Upload to GCS -> Download at workers
We should update Dataflow to follow the former path, which all portable
runners are expected to follow.
This will simplify the job construction logic for cross-language transforms
on Dataflow.
We can probably start by implementing this in the Python SDK only, and only for
the portions of a pipeline obtained by expanding external transforms.
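One way to picture this incremental start is the hybrid sketch below. It is only an assumption about how such a split could look, not the actual Dataflow runner code: build_steps, step_from_sdk_graph, step_from_proto, the "source" field and the external URN are all hypothetical names invented for the example.

```python
from typing import Dict, List, Set


def step_from_sdk_graph(name: str) -> dict:
    """Existing path (hypothetical): build a step from the SDK-specific graph."""
    return {"name": name, "source": "sdk_graph"}


def step_from_proto(name: str, proto_transforms: Dict[str, dict]) -> dict:
    """New path (hypothetical): build a step from the runner API proto entry."""
    return {
        "name": name,
        "source": "runner_api_proto",
        "urn": proto_transforms[name]["urn"],
    }


def build_steps(
    transform_names: List[str],
    proto_transforms: Dict[str, dict],
    external: Set[str],
) -> List[dict]:
    """Use the proto only for transforms that came from an external expansion."""
    steps = []
    for name in transform_names:
        if name in external:
            steps.append(step_from_proto(name, proto_transforms))
        else:
            steps.append(step_from_sdk_graph(name))
    return steps


# Placeholder pipeline: "ExternalRead" was produced by an expansion service.
proto_transforms = {
    "ExternalRead": {"urn": "example:external:read:v1"},
    "Format": {"urn": "beam:transform:pardo:v1"},
}
steps = build_steps(
    ["ExternalRead", "Format"], proto_transforms, external={"ExternalRead"}
)
```

Under this assumption, the set of transforms routed through the proto can grow over time until the SDK-graph path is no longer needed.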
cc: [~lcwik] [~robertwb]