[jira] [Commented] (BEAM-165) Add Hadoop MapReduce runner
[ https://issues.apache.org/jira/browse/BEAM-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16157414#comment-16157414 ]

ASF GitHub Bot commented on BEAM-165:
-------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/beam/pull/3705

> Add Hadoop MapReduce runner
> ---------------------------
>
> Key: BEAM-165
> URL: https://issues.apache.org/jira/browse/BEAM-165
> Project: Beam
> Issue Type: New Feature
> Components: runner-ideas, runner-mapreduce
> Reporter: Jean-Baptiste Onofré
> Assignee: Pei He
>
> I think a MapReduce runner could be a good addition to Beam. It would allow
> users to smoothly "migrate" from MapReduce to Spark or Flink.
> Of course, the MapReduce runner will run in batch mode (not stream).

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (BEAM-165) Add Hadoop MapReduce runner
[ https://issues.apache.org/jira/browse/BEAM-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16147188#comment-16147188 ]

ASF GitHub Bot commented on BEAM-165:
-------------------------------------

Github user peihe closed the pull request at:

    https://github.com/apache/beam/pull/3705
[jira] [Commented] (BEAM-165) Add Hadoop MapReduce runner
[ https://issues.apache.org/jira/browse/BEAM-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16119313#comment-16119313 ]

ASF GitHub Bot commented on BEAM-165:
-------------------------------------

GitHub user peihe opened a pull request:

    https://github.com/apache/beam/pull/3705

    [BEAM-165] Initial implementation of the MapReduce runner.

    Follow this checklist to help us incorporate your contribution quickly and easily:

    - [ ] Make sure there is a [JIRA issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the change (usually before you start working on it). Trivial changes like typos do not require a JIRA issue. Your pull request should address just this issue, without pulling in other changes.
    - [ ] Each commit in the pull request should have a meaningful subject line and body.
    - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue.
    - [ ] Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
    - [ ] Run `mvn clean verify` to make sure basic checks pass. A more thorough check will be performed on your pull request automatically.
    - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

    ---

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/peihe/incubator-beam mr-runner

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/3705.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3705

commit 9fffd554f1e5fd6465989bb3568dfb6f2d854eeb
Author: Pei He
Date: 2017-07-06T02:22:27Z

    Initial commit for MapReduceRunner.

commit 3bacc3e6099718bbcb672ab738ad607204fa8487
Author: Pei He
Date: 2017-07-11T02:45:11Z

    MapReduceRunner: add Graph and its visitors.

commit b62238545c1ba95e9857710d91609431cd0a2f93
Author: Pei He
Date: 2017-07-13T06:09:10Z

    MapReduceRunner: add unit tests for GraphConverter and GraphPlanner.

commit 64548dc949d0251949efdd02df68eed6032a64f4
Author: Pei He
Date: 2017-07-21T05:46:36Z

    mr-runner: support BoundedSource with BeamInputFormat.

commit 3070fded4bc0dde8f08b63e53f94342d21d4bc53
Author: Pei He
Date: 2017-07-24T12:15:37Z

    mr-runner: add JobPrototype and translate it to a MR job.

commit 0e16c52463278c6c4f9db61253c6b8287c4718ff
Author: Pei He
Date: 2017-07-25T13:44:34Z

    mr-runner: add ParDoOperation and support ParDos chaining.

commit 72a50aa508726e34110475448e9bb52381711faf
Author: Pei He
Date: 2017-07-26T13:19:30Z

    mr-runner: add BeamReducer and support GroupByKey.

commit 1b449b0981ae2bb2e1b397113b48eec1df53a4b1
Author: Pei He
Date: 2017-07-27T07:01:22Z

    core-java: InMemoryTimerInternals expose getTimers() for timer firings in mr-runner.

commit 6d152a623550446b06bde91ad0c54df1f7e5c60b
Author: Pei He
Date: 2017-07-27T02:52:32Z

    mr-runner: support reduce side ParDos and WordCount.

commit 1ef0dec520ee301328007f99419c25b7a7b5b46f
Author: Pei He
Date: 2017-07-27T07:05:06Z

    mr-runner: add JarClassInstanceFactory to run ValidatesRunner tests.

commit 02c77375cc114a210f99079cf3efec3d2426941e
Author: Pei He
Date: 2017-07-28T08:31:41Z

    mr-runner: refactors and creates Graph data structures to handle general Beam pipelines.

commit bb3349e10c0cfacd81b610880ddfec030fedf34d
Author: Pei He
Date: 2017-08-02T11:19:14Z

    mr-runner: support graph visualization with dotfiles.

commit 0fd2f15847e1f9bdd42f4388f6de6e566f9b64ef
Author: Pei He
Date: 2017-08-02T13:59:21Z

    mr-runner: hack to get around that ViewAsXXX.expand() return wrong output PValue.

commit 5079322c2e2a092a85b9740d04a7ca9bd887460e
Author: Pei He
Date: 2017-08-08T03:30:29Z

    mr-runner: support PCollections materialization with multiple MR jobs.

commit ad4cd2d5ea2af795bba86319d6447e7f8c415bf2
Author: Pei He
Date: 2017-08-08T07:49:04Z

    mr-runner: support multiple SourceOperations by composing and partitioning.

commit de2859e1092bfc3fdd036c3becf9e79fbb8fc8fa
Author: Pei He
Date: 2017-08-08T09:38:58Z

    mr-runner: support side inputs by reading in all views contents.

commit 69ee0f92bf170f0628d788d5dabeb339e7f1ad0c
Author: Pei He
Date: 2017-08-08T14:07:12Z

    mr-runner: setup file paths for read and write sides of materialization.

> Add Hadoop MapReduce runner
>
[jira] [Commented] (BEAM-165) Add Hadoop MapReduce runner
[ https://issues.apache.org/jira/browse/BEAM-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955434#comment-15955434 ]

Mark Lester commented on BEAM-165:
----------------------------------

Are you going to push to the same branch? Really excited to try this runner out.

> Add Hadoop MapReduce runner
> ---------------------------
>
> Key: BEAM-165
> URL: https://issues.apache.org/jira/browse/BEAM-165
> Project: Beam
> Issue Type: Wish
> Components: runner-ideas
> Reporter: Jean-Baptiste Onofré
> Assignee: Jean-Baptiste Onofré
>
> I think a MapReduce runner could be a good addition to Beam. It would allow
> users to smoothly "migrate" from MapReduce to Spark or Flink.
> Of course, the MapReduce runner will run in batch mode (not stream).

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (BEAM-165) Add Hadoop MapReduce runner
[ https://issues.apache.org/jira/browse/BEAM-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15938504#comment-15938504 ]

Jean-Baptiste Onofré commented on BEAM-165:
-------------------------------------------

As I already said, I made a bunch of changes on my local branch but have not pushed them yet. I will push and let you know if you want to take a look.
[jira] [Commented] (BEAM-165) Add Hadoop MapReduce runner
[ https://issues.apache.org/jira/browse/BEAM-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15938490#comment-15938490 ]

Michael Hogue commented on BEAM-165:
------------------------------------

[~eljefe6aa] is spot on. For heavy MR users looking to transition to a higher-level API, having an MR runner would make that transition much smoother.