GitHub user kennknowles opened a pull request:
https://github.com/apache/incubator-beam/pull/262
[BEAM-115] Make in-process GroupByKey consistent with Beam model
Be sure to do all of the following to help us incorporate your contribution
quickly and easily:
- [x] Make sure the PR title is formatted like:
`[BEAM-<Jira issue #>] Description of pull request`
- [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
Travis-CI on your fork and ensure the whole test matrix passes).
- [x] Replace `<Jira issue #>` in the title with the actual Jira issue
number, if there is one.
- [x] If this contribution is large, please file an Apache
[Individual Contributor License
Agreement](https://www.apache.org/licenses/icla.txt).
---
The commits in this PR stand individually but have strong dependencies. Each
builds towards making the `InProcessPipelineRunner` correspond to the
intended runner API / Beam model.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/kennknowles/incubator-beam InProcessGroupByKey
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-beam/pull/262.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #262
----
commit 6f3eeb4fada9fa72763980f26af8949141dbbe51
Author: Kenneth Knowles <[email protected]>
Date: 2016-04-28T22:50:32Z
Add WindowMatchers.isWindowedValue(<value matcher>)
commit 34ec15d6a5923287f4db0db63083c37b87c030b7
Author: Kenneth Knowles <[email protected]>
Date: 2016-04-28T22:51:40Z
Add accessors for sub-coders of KeyedWorkItemCoder
commit 91643088f4032898cf67973b032d86a528eca199
Author: Kenneth Knowles <[email protected]>
Date: 2016-04-28T23:12:21Z
Encapsulate cloning behavior of in-process ParDo evaluator
This will make way for using the evaluator in contexts where cloning
is not appropriate, such as evaluating GroupAlsoByWindow.
commit 753787ff0eb10c524f336e9af837ed442f005121
Author: Kenneth Knowles <[email protected]>
Date: 2016-04-28T23:13:24Z
Make in-process GroupByKey respect future Beam model
This introduces top-level classes:
- InProcessGroupByKey, which expands like GroupByKeyViaGroupByKeyOnly
but with different intermediate PCollection types.
- InProcessGroupByKeyOnly, which outputs KeyedWorkItem<K, V>. This existed
already under a different name.
- InProcessGroupAlsoByWindow, which is evaluated directly and
accepts input elements of type KeyedWorkItem<K, V>.
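The two-stage expansion described above can be illustrated with a minimal, library-free sketch. It does not use the Beam SDK: `Windowed`, `WorkItem`, `element`, `groupByKeyOnly` and `groupAlsoByWindow` are hypothetical stand-ins invented for this illustration, and windows are simplified to a single `long` start timestamp. The intent is only to show the shape of the data flow, where the first step groups by key alone and emits one work item per key, and the second step consumes a work item and splits its elements by window.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class GroupByKeySketch {
    // Hypothetical stand-in for a windowed element: a value plus its window.
    static final class Windowed<V> {
        final V value;
        final long window; // assumption: a window is identified by its start timestamp
        Windowed(V value, long window) { this.value = value; this.window = window; }
    }

    // Hypothetical stand-in for KeyedWorkItem<K, V>: one key and its buffered elements.
    static final class WorkItem<K, V> {
        final K key;
        final List<Windowed<V>> elements = new ArrayList<>();
        WorkItem(K key) { this.key = key; }
    }

    // Convenience helper for building keyed, windowed input elements.
    static <K, V> Map.Entry<K, Windowed<V>> element(K key, V value, long window) {
        return new AbstractMap.SimpleEntry<>(key, new Windowed<>(value, window));
    }

    // Step 1 (in the spirit of InProcessGroupByKeyOnly): bucket by key, ignoring windows.
    static <K, V> List<WorkItem<K, V>> groupByKeyOnly(List<Map.Entry<K, Windowed<V>>> input) {
        Map<K, WorkItem<K, V>> byKey = new LinkedHashMap<>();
        for (Map.Entry<K, Windowed<V>> e : input) {
            byKey.computeIfAbsent(e.getKey(), k -> new WorkItem<K, V>(k))
                 .elements.add(e.getValue());
        }
        return new ArrayList<>(byKey.values());
    }

    // Step 2 (in the spirit of InProcessGroupAlsoByWindow): split one key's elements by window.
    static <K, V> Map<Long, List<V>> groupAlsoByWindow(WorkItem<K, V> item) {
        Map<Long, List<V>> byWindow = new TreeMap<>();
        for (Windowed<V> w : item.elements) {
            byWindow.computeIfAbsent(w.window, k -> new ArrayList<V>()).add(w.value);
        }
        return byWindow;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Windowed<Integer>>> input = new ArrayList<>();
        input.add(element("a", 1, 0L));
        input.add(element("b", 2, 0L));
        input.add(element("a", 3, 10L));
        input.add(element("a", 4, 0L));
        // Print each key with its values grouped per window.
        for (WorkItem<String, Integer> item : groupByKeyOnly(input)) {
            System.out.println(item.key + " -> " + groupAlsoByWindow(item));
        }
    }
}
```

The sketch ignores triggers, timers, watermarks and coders, all of which the real transforms have to handle. It is meant only to show why the intermediate collection carries per-key work items rather than already-grouped values.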
----