[jira] [Commented] (BEAM-310) Exercise splitIntoBundles/generateInitialSplits in the Direct Runner

2016-11-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15658789#comment-15658789
 ] 

ASF GitHub Bot commented on BEAM-310:
-------------------------------------

GitHub user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1339


> Exercise splitIntoBundles/generateInitialSplits in the Direct Runner
> --------------------------------------------------------------------
>
>          Key: BEAM-310
>          URL: https://issues.apache.org/jira/browse/BEAM-310
>      Project: Beam
>   Issue Type: Improvement
>   Components: runner-direct
>     Reporter: Thomas Groh
>     Assignee: Thomas Groh
>      Fix For: 0.3.0-incubating
>
>
> BoundedSource#splitIntoBundles and UnboundedSource#generateInitialSplits are
> the methods by which sources can be read in parallel. Exercising these
> methods in the Direct Runner allows reads (and all downstream transforms) to
> be executed in parallel both before and after a GroupByKey.
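
As a rough illustration of the initial splitting the issue describes, the sketch below divides a bounded range into sub-ranges that could each be read independently. It does not use the Beam SDK: `RangeSource` and its one-argument `splitIntoBundles` are hypothetical stand-ins, and the real `BoundedSource#splitIntoBundles` has a different signature.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a bounded source over the half-open range [start, end). */
final class RangeSource {
  final long start;
  final long end;

  RangeSource(long start, long end) {
    if (start > end) {
      throw new IllegalArgumentException("start must not exceed end");
    }
    this.start = start;
    this.end = end;
  }

  /**
   * Splits this range into consecutive sub-ranges of at most desiredBundleSize
   * elements each. Together the sub-ranges cover the original range exactly once,
   * so they can be read in parallel without duplicating or dropping elements.
   */
  List<RangeSource> splitIntoBundles(long desiredBundleSize) {
    if (desiredBundleSize <= 0) {
      throw new IllegalArgumentException("desiredBundleSize must be positive");
    }
    List<RangeSource> bundles = new ArrayList<>();
    long cursor = start;
    while (cursor < end) {
      long length = Math.min(desiredBundleSize, end - cursor);
      bundles.add(new RangeSource(cursor, cursor + length));
      cursor += length;
    }
    return bundles;
  }
}
```

For example, `new RangeSource(0, 10).splitIntoBundles(4)` yields the sub-ranges [0, 4), [4, 8) and [8, 10).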



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-310) Exercise splitIntoBundles/generateInitialSplits in the Direct Runner

2016-11-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15655261#comment-15655261
 ] 

ASF GitHub Bot commented on BEAM-310:
-------------------------------------

GitHub user tgroh opened a pull request:

https://github.com/apache/incubator-beam/pull/1339

[BEAM-310] Actually Split Root Transforms

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt).

---

Permit the ExecutorServiceParallelExecutor to control its own
ExecutorService by passing only a TargetParallelism parameter. Split
roots into the greater of 3 or the target parallelism.
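
The two ideas in this description can be sketched as follows. This is a minimal illustration, not the code from the pull request: `ParallelExecutorSketch`, `create`, and `rootSplitCount` are hypothetical names, and the use of a fixed-size thread pool is an assumption. Only the rule "the greater of 3 or the target parallelism" comes from the text above.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical executor that owns its thread pool, sized by a target parallelism. */
final class ParallelExecutorSketch implements AutoCloseable {
  /** Lower bound on the number of root splits, taken from the description. */
  static final int MIN_ROOT_SPLITS = 3;

  private final int targetParallelism;
  private final ExecutorService executorService;

  private ParallelExecutorSketch(int targetParallelism) {
    this.targetParallelism = targetParallelism;
    // The executor builds its own pool rather than receiving one from the caller.
    // A fixed-size pool is an assumption made for this sketch.
    this.executorService = Executors.newFixedThreadPool(targetParallelism);
  }

  /** Callers pass only the target parallelism, never an ExecutorService. */
  static ParallelExecutorSketch create(int targetParallelism) {
    if (targetParallelism < 1) {
      throw new IllegalArgumentException("targetParallelism must be at least 1");
    }
    return new ParallelExecutorSketch(targetParallelism);
  }

  /** Number of pieces each root is split into: the greater of 3 or the target parallelism. */
  int rootSplitCount() {
    return Math.max(MIN_ROOT_SPLITS, targetParallelism);
  }

  @Override
  public void close() {
    executorService.shutdown();
  }
}
```

With this rule, a target parallelism of 1 still produces 3 root splits, so the splitting path is exercised even in a single-threaded run, while a target parallelism of 8 produces 8.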

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/incubator-beam actually_split

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1339.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1339


commit b18b876de91bfc01c82cf10bf53eb27a5aef3b09
Author: Thomas Groh 
Date:   2016-11-10T21:47:40Z

Actually Split Root Transforms

Permit the ExecutorServiceParallelExecutor to control its own
ExecutorService by passing only a TargetParallelism parameter. Split
roots into the greater of 3 or the target parallelism.






[jira] [Commented] (BEAM-310) Exercise splitIntoBundles/generateInitialSplits in the Direct Runner

2016-10-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15576473#comment-15576473
 ] 

ASF GitHub Bot commented on BEAM-310:
-------------------------------------

GitHub user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/1063




[jira] [Commented] (BEAM-310) Exercise splitIntoBundles/generateInitialSplits in the Direct Runner

2016-10-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15553453#comment-15553453
 ] 

ASF GitHub Bot commented on BEAM-310:
-------------------------------------

GitHub user tgroh opened a pull request:

https://github.com/apache/incubator-beam/pull/1063

[BEAM-310] Perform initial splitting in the DirectRunner

This allows sources to be read from in parallel and generates initial
splits.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/incubator-beam initial_splits

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/1063.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1063


commit 891e7fc9af4fc7540269e3dab2941f8c64d4ec84
Author: Thomas Groh 
Date:   2016-10-05T23:11:21Z

Perform initial splitting in the DirectRunner

This allows sources to be read from in parallel and generates initial
splits.






[jira] [Commented] (BEAM-310) Exercise splitIntoBundles/generateInitialSplits in the Direct Runner

2016-09-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15536420#comment-15536420
 ] 

ASF GitHub Bot commented on BEAM-310:
-------------------------------------

GitHub user tgroh closed the pull request at:

https://github.com/apache/incubator-beam/pull/996




[jira] [Commented] (BEAM-310) Exercise splitIntoBundles/generateInitialSplits in the Direct Runner

2016-09-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15534390#comment-15534390
 ] 

ASF GitHub Bot commented on BEAM-310:
-------------------------------------

GitHub user tgroh closed the pull request at:

https://github.com/apache/incubator-beam/pull/1019




[jira] [Commented] (BEAM-310) Exercise splitIntoBundles/generateInitialSplits in the Direct Runner

2016-09-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15517886#comment-15517886
 ] 

ASF GitHub Bot commented on BEAM-310:
-------------------------------------

GitHub user tgroh opened a pull request:

https://github.com/apache/incubator-beam/pull/996

[BEAM-310] Add RootTransformEvaluatorFactory, Use for Reads

This is an extension of TransformEvaluatorFactory that applies to
transforms that can be the root transform of a Pipeline. They produce
bundles which provide an impulse to the PTransforms that are at the root
of the Pipeline.

Add an ImpulseBundle implementation to represent this impulse as a
bundle that is not part of a PCollection.
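
One way to picture the impulse idea is sketched below. Every name here (`SketchBundle`, `impulse`, `RootImpulses`) is hypothetical and does not reproduce the classes in this pull request. In particular, modelling the impulse as an empty bundle with no PCollection is an assumption made for illustration.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical bundle: a batch of elements, optionally belonging to a PCollection. */
final class SketchBundle<T> {
  final List<T> elements;
  final Optional<String> pcollectionName;

  private SketchBundle(List<T> elements, Optional<String> pcollectionName) {
    this.elements = elements;
    this.pcollectionName = pcollectionName;
  }

  /** An ordinary bundle holding elements of the named PCollection. */
  static <T> SketchBundle<T> of(String pcollectionName, List<T> elements) {
    return new SketchBundle<>(elements, Optional.of(pcollectionName));
  }

  /** An impulse: no elements and no PCollection. It only signals a root to start. */
  static <T> SketchBundle<T> impulse() {
    return new SketchBundle<>(Collections.<T>emptyList(), Optional.<String>empty());
  }

  boolean isImpulse() {
    return !pcollectionName.isPresent();
  }
}

/** Hypothetical helper that produces one impulse bundle per root transform. */
final class RootImpulses {
  static Map<String, SketchBundle<Void>> forRoots(List<String> rootTransformNames) {
    Map<String, SketchBundle<Void>> impulses = new LinkedHashMap<>();
    for (String root : rootTransformNames) {
      impulses.put(root, SketchBundle.<Void>impulse());
    }
    return impulses;
  }
}
```

The point of the sketch is that a root transform can then be scheduled like any other transform: it receives a bundle as input, and only the bundle's lack of a PCollection marks it as the start of the pipeline.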

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/incubator-beam root_initial_splits

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/996.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #996


commit 8670bf7995639a95a3f2c3205b063f9c64ff18b9
Author: Thomas Groh 
Date:   2016-09-23T15:52:47Z

Add RootTransformEvaluatorFactory, Use for Reads

This is an extension of TransformEvaluatorFactory that applies to
transforms that can be the root transform of a Pipeline. They produce
bundles which provide an impulse to the PTransforms that are at the root
of the Pipeline.

Add an ImpulseBundle implementation to represent this impulse as a
bundle that is not part of a PCollection.






[jira] [Commented] (BEAM-310) Exercise splitIntoBundles/generateInitialSplits in the Direct Runner

2016-06-28 Thread Daniel Halperin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15353334#comment-15353334
 ] 

Daniel Halperin commented on BEAM-310:
--------------------------------------

Note that we should also exercise splitAtFraction.
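
For context, splitAtFraction differs from initial splitting in that it divides a source while it is already being read. The sketch below shows the general idea on a plain offset range. `RangeReaderSketch` is a hypothetical class that does not use the Beam SDK, and its refusal rules are simplified assumptions rather than the SDK's exact contract.

```java
/** Hypothetical reader over the half-open offset range [start, end). */
final class RangeReaderSketch {
  private final long start;
  private long end;   // exclusive; shrinks when a split succeeds
  private long next;  // next offset that advance() will return

  RangeReaderSketch(long start, long end) {
    this.start = start;
    this.end = end;
    this.next = start;
  }

  long start() { return start; }

  long end() { return end; }

  /** Returns the next offset, or -1 once the range is exhausted. */
  long advance() {
    return next < end ? next++ : -1;
  }

  /**
   * Tries to split at the given fraction of the current range. On success this
   * reader keeps [start, splitPos) and a new reader for the residual
   * [splitPos, end) is returned. Returns null when the split point has already
   * been consumed or would leave the residual empty.
   */
  RangeReaderSketch splitAtFraction(double fraction) {
    if (fraction <= 0.0 || fraction >= 1.0) {
      return null;
    }
    long splitPos = start + (long) (fraction * (end - start));
    if (splitPos < next || splitPos >= end) {
      return null;
    }
    RangeReaderSketch residual = new RangeReaderSketch(splitPos, end);
    end = splitPos;
    return residual;
  }
}
```

A runner exercising this path would call `splitAtFraction` while a read is in progress and hand the residual to another worker. A null result means the read simply continues unsplit.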
