[jira] [Commented] (BEAM-165) Add Hadoop MapReduce runner

2017-09-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16157414#comment-16157414
 ] 

ASF GitHub Bot commented on BEAM-165:
-------------------------------------

GitHub user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3705


> Add Hadoop MapReduce runner
> ---------------------------
>
> Key: BEAM-165
> URL: https://issues.apache.org/jira/browse/BEAM-165
> Project: Beam
> Issue Type: New Feature
> Components: runner-ideas, runner-mapreduce
> Reporter: Jean-Baptiste Onofré
> Assignee: Pei He
>
> I think a MapReduce runner could be a good addition to Beam. It would allow 
> users to smoothly "migrate" from MapReduce to Spark or Flink.
> Of course, the MapReduce runner will run in batch mode only (no streaming).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-165) Add Hadoop MapReduce runner

2017-08-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16147188#comment-16147188
 ] 

ASF GitHub Bot commented on BEAM-165:
-------------------------------------

GitHub user peihe closed the pull request at:

https://github.com/apache/beam/pull/3705




[jira] [Commented] (BEAM-165) Add Hadoop MapReduce runner

2017-08-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16119313#comment-16119313
 ] 

ASF GitHub Bot commented on BEAM-165:
-------------------------------------

GitHub user peihe opened a pull request:

https://github.com/apache/beam/pull/3705

[BEAM-165] Initial implementation of the MapReduce runner.


---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/peihe/incubator-beam mr-runner

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3705.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3705


commit 9fffd554f1e5fd6465989bb3568dfb6f2d854eeb
Author: Pei He 
Date:   2017-07-06T02:22:27Z

Initial commit for MapReduceRunner.

commit 3bacc3e6099718bbcb672ab738ad607204fa8487
Author: Pei He 
Date:   2017-07-11T02:45:11Z

MapReduceRunner: add Graph and its visitors.

commit b62238545c1ba95e9857710d91609431cd0a2f93
Author: Pei He 
Date:   2017-07-13T06:09:10Z

MapReduceRunner: add unit tests for GraphConverter and GraphPlanner.

commit 64548dc949d0251949efdd02df68eed6032a64f4
Author: Pei He 
Date:   2017-07-21T05:46:36Z

mr-runner: support BoundedSource with BeamInputFormat.

commit 3070fded4bc0dde8f08b63e53f94342d21d4bc53
Author: Pei He 
Date:   2017-07-24T12:15:37Z

mr-runner: add JobPrototype and translate it to a MR job.

commit 0e16c52463278c6c4f9db61253c6b8287c4718ff
Author: Pei He 
Date:   2017-07-25T13:44:34Z

mr-runner: add ParDoOperation and support ParDos chaining.

commit 72a50aa508726e34110475448e9bb52381711faf
Author: Pei He 
Date:   2017-07-26T13:19:30Z

mr-runner: add BeamReducer and support GroupByKey.

commit 1b449b0981ae2bb2e1b397113b48eec1df53a4b1
Author: Pei He 
Date:   2017-07-27T07:01:22Z

core-java: InMemoryTimerInternals exposes getTimers() for timer firings in 
mr-runner.

commit 6d152a623550446b06bde91ad0c54df1f7e5c60b
Author: Pei He 
Date:   2017-07-27T02:52:32Z

mr-runner: support reduce side ParDos and WordCount.

commit 1ef0dec520ee301328007f99419c25b7a7b5b46f
Author: Pei He 
Date:   2017-07-27T07:05:06Z

mr-runner: add JarClassInstanceFactory to run ValidatesRunner tests.

commit 02c77375cc114a210f99079cf3efec3d2426941e
Author: Pei He 
Date:   2017-07-28T08:31:41Z

mr-runner: refactors and creates Graph data structures to handle general 
Beam pipelines.

commit bb3349e10c0cfacd81b610880ddfec030fedf34d
Author: Pei He 
Date:   2017-08-02T11:19:14Z

mr-runner: support graph visualization with dotfiles.

commit 0fd2f15847e1f9bdd42f4388f6de6e566f9b64ef
Author: Pei He 
Date:   2017-08-02T13:59:21Z

mr-runner: hack to work around ViewAsXXX.expand() returning the wrong output 
PValue.

commit 5079322c2e2a092a85b9740d04a7ca9bd887460e
Author: Pei He 
Date:   2017-08-08T03:30:29Z

mr-runner: support PCollections materialization with multiple MR jobs.

commit ad4cd2d5ea2af795bba86319d6447e7f8c415bf2
Author: Pei He 
Date:   2017-08-08T07:49:04Z

mr-runner: support multiple SourceOperations by composing and partitioning.

commit de2859e1092bfc3fdd036c3becf9e79fbb8fc8fa
Author: Pei He 
Date:   2017-08-08T09:38:58Z

mr-runner: support side inputs by reading in all views contents.

commit 69ee0f92bf170f0628d788d5dabeb339e7f1ad0c
Author: Pei He 
Date:   2017-08-08T14:07:12Z

mr-runner: set up file paths for read and write sides of materialization.





[jira] [Commented] (BEAM-165) Add Hadoop MapReduce runner

2017-04-04 Thread Mark Lester (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15955434#comment-15955434
 ] 

Mark Lester commented on BEAM-165:
----------------------------------

Are you going to push to the same branch? Really excited to try this runner out.

> Add Hadoop MapReduce runner
> ---------------------------
>
> Key: BEAM-165
> URL: https://issues.apache.org/jira/browse/BEAM-165
> Project: Beam
> Issue Type: Wish
> Components: runner-ideas
> Reporter: Jean-Baptiste Onofré
> Assignee: Jean-Baptiste Onofré
>
> I think a MapReduce runner could be a good addition to Beam. It would allow 
> users to smoothly "migrate" from MapReduce to Spark or Flink.
> Of course, the MapReduce runner will run in batch mode only (no streaming).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-165) Add Hadoop MapReduce runner

2017-03-23 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15938504#comment-15938504
 ] 

Jean-Baptiste Onofré commented on BEAM-165:
-------------------------------------------

As I already said, I made a bunch of changes on my local branch but have not 
pushed them yet. I will push them and let you know, in case you want to take a 
look.



[jira] [Commented] (BEAM-165) Add Hadoop MapReduce runner

2017-03-23 Thread Michael Hogue (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15938490#comment-15938490
 ] 

Michael Hogue commented on BEAM-165:
------------------------------------

[~eljefe6aa] is spot on. For heavy MR users looking to transition to a 
higher-level API, having an MR runner would make that transition much smoother.
