[
https://issues.apache.org/jira/browse/BEAM-79?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15279595#comment-15279595
]
ASF GitHub Bot commented on BEAM-79:
------------------------------------
GitHub user manuzhang opened a pull request:
https://github.com/apache/incubator-beam/pull/323
[BEAM-79] add Gearpump runner
Be sure to do all of the following to help us incorporate your contribution
quickly and easily:
- [ ] Make sure the PR title is formatted like:
`[BEAM-<Jira issue #>] Description of pull request`
- [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
Travis-CI on your fork and ensure the whole test matrix passes).
- [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
number, if there is one.
- [ ] If this contribution is large, please file an Apache
[Individual Contributor License
Agreement](https://www.apache.org/licenses/icla.txt).
---
This PR adds a Gearpump runner to Beam, meeting the phase 1 goals in the
[design
document](https://docs.google.com/document/d/1nw64QUWVfT8L7FUprPGLEeNjSBpDMkn1otfLt2rHM5g/edit).
The Gearpump runner supports the following functionality:
* Transformations: ParDo, GroupByKey, Flatten
* Windows: using Beam's window logic and code
* side outputs
* serialization/deserialization: using Gearpump's Kryo serializer
* sources: Beam's UnboundedSource
* message delivery guarantee: at-most-once
* tests: integration test for various translators
Here's a snapshot of running the following Beam example on a Gearpump cluster:
```java
PCollection<KV<String, Long>> wordCounts =
    p.apply(Read.from(new UnboundedTextSource()).named("WordStream"))
     .apply(ParDo.of(new ExtractWordsFn()))
     .apply(Window.<String>into(FixedWindows.of(Duration.standardSeconds(10))))
     .apply(Count.<String>perElement());
wordCounts.apply(ParDo.of(new FormatAsStringFn()));
```
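
For readers unfamiliar with what this pipeline computes, the following is a minimal plain-Java sketch of the idea: assign each word to a 10-second fixed window and count occurrences per window. It uses no Beam or Gearpump APIs. The class and method names (`FixedWindowCountSketch`, `windowStart`, `countPerWindow`) are hypothetical, the timestamps are made up for illustration, and it assumes windows aligned to multiples of the window size with no offset. It is not how the runner itself is implemented.

```java
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only: none of these names come from Beam or Gearpump.
class FixedWindowCountSketch {

    // Start (in ms) of the fixed window of the given size that contains the timestamp.
    // Assumes windows are aligned to multiples of sizeMillis (no offset).
    static long windowStart(long timestampMillis, long sizeMillis) {
        return timestampMillis - Math.floorMod(timestampMillis, sizeMillis);
    }

    // Counts occurrences of each word per window.
    // The result is keyed by window start, then by word.
    static Map<Long, Map<String, Long>> countPerWindow(
            String[] words, long[] timestampsMillis, long sizeMillis) {
        Map<Long, Map<String, Long>> counts = new TreeMap<>();
        for (int i = 0; i < words.length; i++) {
            long start = windowStart(timestampsMillis[i], sizeMillis);
            counts.computeIfAbsent(start, k -> new TreeMap<>())
                  .merge(words[i], 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical input: four words with event timestamps in milliseconds.
        String[] words = {"beam", "gearpump", "beam", "beam"};
        long[] timestamps = {1_000L, 4_000L, 9_999L, 10_000L};
        // prints {0={beam=2, gearpump=1}, 10000={beam=1}}
        System.out.println(countPerWindow(words, timestamps, 10_000L));
    }
}
```

The last word falls exactly on the 10-second boundary, so it lands in the second window; each window is treated as the half-open interval [start, start + size).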

Note that the Gearpump runner is still at an early stage and lacks
capabilities such as triggers, side inputs, and aggregators. However, I'd like
the community to get a feel for what Gearpump is like and whether Beam and
Gearpump fit well together, and to gather ideas for improvements.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/manuzhang/incubator-beam gearpump_runner
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-beam/pull/323.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #323
----
commit 73e5978f599bcf32ed8c2f1d54b6bd3bd8350092
Author: manuzhang
Date: 2016-03-15T08:15:16Z
[BEAM-79] add Gearpump runner
----
> Gearpump runner
> ---------------
>
> Key: BEAM-79
> URL: https://issues.apache.org/jira/browse/BEAM-79
> Project: Beam
> Issue Type: New Feature
> Components: runner-ideas
> Reporter: Tyler Akidau
> Assignee: Manu Zhang
>
> Intel is submitting Gearpump (http://www.gearpump.io) to ASF
> (https://wiki.apache.org/incubator/GearpumpProposal). Appears to be a mix of
> low-level primitives a la MillWheel, with some higher level primitives like
> non-merging windowing mixed in. Seems like it would make a nice Beam runner.