GitHub user kennknowles opened a pull request:
https://github.com/apache/beam/pull/3924
[BEAM-2576] Split Beam portability framework into construction, job, and
execution modules
Follow this checklist to help us incorporate your contribution quickly and
easily:
- [x] Make sure there is a [JIRA
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the
change (usually before you start working on it). Trivial changes like typos do
not require a JIRA issue. Your pull request should address just this issue,
without pulling in other changes.
- [x] Each commit in the pull request should have a meaningful subject
line and body.
- [x] Format the pull request title like `[BEAM-XXX] Fixes bug in
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA
issue.
- [x] Write a pull request description that is detailed enough to
understand what the pull request does, how, and why.
- [x] Run `mvn clean verify` to make sure basic checks pass. A more
thorough check will be performed on your pull request automatically.
- [x] If this contribution is large, please file an Apache [Individual
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
---
The Beam portability framework has a lot of simple pieces that can be
modularized more usefully than "Runner API" and "Fn API".
Here are some semantic categories that should remain stable, and make sense
to depend on separately:
- Modeling of a Pipeline and pieces of it
- Standardized transform payloads
  - primitives
  - non-primitives
  - execution-time only transforms
- Standardized UDFs
- Job submission and management (Job API)
- Artifact upload and retrieval (Artifact API) (could separate upload and retrieval)
- Model of bundle processing instructions (ProcessBundleDescriptor)
- Standardized execution-only instruction payloads (like data plane read instructions)
- Setting up the SDK harness (Provision API)
- Control plane for bundle processing (FnControl)
- Multi-purpose data streaming (FnData)
- Logging
- ... more later, maybe ...
This PR is a first move in this direction, organizing the components
according to when in a pipeline's lifecycle they are used.
- **Construction**: building a pipeline and specifying transforms.
Basically just data structures, not APIs. Code doing this work should only
depend on this module.
- **Job**: submission and management of a pipeline, including artifacts,
etc. Construction code should not depend on this module.
- **Execution**: SDK harness management and UDF invocation. Only runner
internals and SDK harnesses should depend on this module.
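The dependency rules in the three bullets above can be sketched as a small lookup table. This is only an illustration: the consumer names (`pipeline_construction`, `job_client`, `sdk_harness`, `runner_internals`) and the helper `is_allowed` are hypothetical, and the entries marked as assumptions are not stated in this description.

```python
# Hypothetical sketch of which kinds of code may depend on which model module.
# Only the rules marked "from the description" come from the PR text.
ALLOWED_MODULES = {
    # From the description: construction code depends only on construction.
    "pipeline_construction": {"construction"},
    # Assumption: a job client needs the pipeline model plus the job module.
    "job_client": {"construction", "job"},
    # From the description: SDK harnesses may depend on execution.
    # Assumption: they also need the construction data structures.
    "sdk_harness": {"construction", "execution"},
    # From the description: runner internals may depend on execution.
    # Assumption: a runner also uses the construction and job modules.
    "runner_internals": {"construction", "job", "execution"},
}


def is_allowed(consumer, module):
    """Return True if `consumer` code may depend on the given model module."""
    return module in ALLOWED_MODULES.get(consumer, set())


if __name__ == "__main__":
    for consumer in sorted(ALLOWED_MODULES):
        print(consumer, "->", sorted(ALLOWED_MODULES[consumer]))
```

A check like this could back a build-time lint, but nothing in this PR adds such a check.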
This PR moves the existing protos into the appropriate Maven modules but
does not split them yet.
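As a quick reference, the moves can be summarized as a mapping from file name to destination module. The file names and destinations are taken from the commit subjects in this pull request; the helper `files_by_module` is a hypothetical convenience, not part of the change.

```python
from collections import defaultdict

# File name -> destination module, as named in the commit subjects of this PR.
FILE_TO_MODULE = {
    "beam_runner_api.proto": "model/construction",
    "standard_window_fns.proto": "model/construction",
    "beam_artifact_api.proto": "model/job",
    "beam_job_api.proto": "model/job",
    "beam_provision_api.proto": "model/execution",
    "beam_fn_api.proto": "model/execution",
    "standard_coders.yaml": "model/execution",
}


def files_by_module(mapping):
    """Group file names by destination module, sorted by file name."""
    grouped = defaultdict(list)
    for name, module in sorted(mapping.items()):
        grouped[module].append(name)
    return dict(grouped)


if __name__ == "__main__":
    for module, names in files_by_module(FILE_TO_MODULE).items():
        print(module + ": " + ", ".join(names))
```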
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/kennknowles/beam beam-model
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/beam/pull/3924.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #3924
----
commit 4ff34d419d8551c874294d9cb2a20ae69ae80530
Author: Kenneth Knowles <[email protected]>
Date: 2017-09-29T19:09:35Z
Add model/construction to Maven structure and gen_protos.py
commit da9184d7de59c43c08f29cd2072eac917b012911
Author: Kenneth Knowles <[email protected]>
Date: 2017-09-29T17:46:22Z
Move beam_runner_api.proto to model/construction
commit d3368c0f761d41baa0d607991b6c3471853c44f9
Author: Kenneth Knowles <[email protected]>
Date: 2017-09-29T17:47:49Z
Add model/job to Maven structure and gen_protos.py
commit 856be477e07ea1c1acbebc43ca3ba2b04bf77ce7
Author: Kenneth Knowles <[email protected]>
Date: 2017-09-29T18:02:14Z
Move standard_window_fns.proto to model/construction
commit 118faaba61ed2724f27f6b77770725fcb2f062c9
Author: Kenneth Knowles <[email protected]>
Date: 2017-09-29T18:04:22Z
Move beam_artifact_api.proto to model/job
commit a72ecb869d969e8f1dc7fef7b0ec254a5bb13b15
Author: Kenneth Knowles <[email protected]>
Date: 2017-09-29T18:05:35Z
Move beam_job_api.proto to model/job
commit 1989720e4b4b316697b33600abd4e2bdb7c46d2b
Author: Kenneth Knowles <[email protected]>
Date: 2017-09-29T18:07:30Z
Remove empty runner-api module from Maven structure and gen_protos.py
commit 208a6f48bfc5d8b4196fb74f7d5a54abc665565f
Author: Kenneth Knowles <[email protected]>
Date: 2017-09-29T17:48:11Z
Add model/execution to Maven structure and gen_protos.py
commit d7c24d16db6c14be2211eed4e296e504fe396b92
Author: Kenneth Knowles <[email protected]>
Date: 2017-09-29T18:25:40Z
Move beam_provision_api.proto to model/execution
commit 986b1a0ef02a25c3d22d217f22139b72ebde2f48
Author: Kenneth Knowles <[email protected]>
Date: 2017-09-29T19:04:42Z
Move beam_fn_api.proto to model/execution
commit a5114b0d9f1daeec673f23dfe459c8bc98b652c9
Author: Kenneth Knowles <[email protected]>
Date: 2017-09-29T19:06:43Z
Move standard_coders.yaml to model/execution
commit 9ac1d0a5fb521db4de99dec1d2746e82996c3e2c
Author: Kenneth Knowles <[email protected]>
Date: 2017-09-29T19:07:21Z
Remove empty fn-api and sdks/common from Maven structure and gen_protos.py
----
---