Jenkins build is still unstable: beam_PostCommit_RunnableOnService_GoogleCloudDataflow » Apache Beam :: SDKs :: Java :: Core #66

2016-04-08 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-151) Create Dataflow Runner Package

2016-04-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233285#comment-15233285
 ] 

ASF GitHub Bot commented on BEAM-151:
-

GitHub user lukecwik opened a pull request:

https://github.com/apache/incubator-beam/pull/158

[BEAM-151] Move maven archetypes build order to be after runners

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace "" in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---
sdks/java/maven-archtypes has several dependencies on the 
DataflowPipelineRunner. Until these are refactored out or a released artifact 
exists, we need to modify the build order.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lukecwik/incubator-beam dataflow-runner-5

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/158.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #158


commit ba4cdc5f3f753d6a3fd26f6b6d4e2f9c3d8f7e70
Author: Luke Cwik 
Date:   2016-04-09T01:37:46Z

[BEAM-151] Move maven archetypes build order to be after runners

Currently sdks/java/maven-archetypes has several dependencies
on the DataflowPipelineRunner. This is mainly throw the utility
class called DataflowExampleUtils.

This is a temporary workaround till examples get updated
to use only beam SDK.




> Create Dataflow Runner Package
> --
>
> Key: BEAM-151
> URL: https://issues.apache.org/jira/browse/BEAM-151
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>
> Move Dataflow runner out of SDK core and into new Dataflow runner maven 
> module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request: [BEAM-151] Move maven archetypes buil...

2016-04-08 Thread lukecwik
GitHub user lukecwik opened a pull request:

https://github.com/apache/incubator-beam/pull/158

[BEAM-151] Move maven archetypes build order to be after runners

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace "" in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---
sdks/java/maven-archtypes has several dependencies on the 
DataflowPipelineRunner. Until these are refactored out or a released artifact 
exists, we need to modify the build order.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lukecwik/incubator-beam dataflow-runner-5

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/158.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #158


commit ba4cdc5f3f753d6a3fd26f6b6d4e2f9c3d8f7e70
Author: Luke Cwik 
Date:   2016-04-09T01:37:46Z

[BEAM-151] Move maven archetypes build order to be after runners

Currently sdks/java/maven-archetypes has several dependencies
on the DataflowPipelineRunner. This is mainly throw the utility
class called DataflowExampleUtils.

This is a temporary workaround till examples get updated
to use only beam SDK.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Jenkins build is still unstable: beam_PostCommit_RunnableOnService_GoogleCloudDataflow #65

2016-04-08 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: beam_PostCommit_RunnableOnService_GoogleCloudDataflow » Apache Beam :: SDKs :: Java :: Core #65

2016-04-08 Thread Apache Jenkins Server
See 




[4/4] incubator-beam git commit: Closes #154

2016-04-08 Thread dhalperi
Closes #154


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/6ab4
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/6ab4
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/6ab4

Branch: refs/heads/master
Commit: 6ab4cb9bf19cc6b62ac379dab4322150e637
Parents: 2ca5474 208c0db
Author: Dan Halperin 
Authored: Fri Apr 8 18:29:29 2016 -0700
Committer: Dan Halperin 
Committed: Fri Apr 8 18:29:29 2016 -0700

--
 .../sdk/testing/TestDataflowPipelineRunner.java | 83 +--
 .../dataflow/sdk/testing/TestPipeline.java  | 48 +--
 .../sdk/runners/PipelineRunnerTest.java |  2 -
 .../cloud/dataflow/sdk/testing/PAssertTest.java | 50 ---
 .../testing/TestDataflowPipelineRunnerTest.java | 87 
 5 files changed, 168 insertions(+), 102 deletions(-)
--




[3/4] incubator-beam git commit: [BEAM-151] Remove dependence on DataflowPipelineRunner in PAssert/TestPipeline

2016-04-08 Thread dhalperi
[BEAM-151] Remove dependence on DataflowPipelineRunner in PAssert/TestPipeline

Added the ability for TestDataflowPipelineRunner to throw an AssertionError
when the pipeline fails using the first job error message.

This allows for moving Dataflow to a new package.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/c726cf13
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/c726cf13
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/c726cf13

Branch: refs/heads/master
Commit: c726cf1376d17489a4834780e0f4a04b6edbc603
Parents: 2ca5474
Author: Luke Cwik 
Authored: Fri Apr 8 15:07:52 2016 -0700
Committer: Dan Halperin 
Committed: Fri Apr 8 18:29:28 2016 -0700

--
 .../sdk/testing/TestDataflowPipelineRunner.java | 74 +---
 .../dataflow/sdk/testing/TestPipeline.java  | 15 +---
 .../cloud/dataflow/sdk/testing/PAssertTest.java | 46 
 3 files changed, 65 insertions(+), 70 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/c726cf13/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/testing/TestDataflowPipelineRunner.java
--
diff --git 
a/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/testing/TestDataflowPipelineRunner.java
 
b/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/testing/TestDataflowPipelineRunner.java
index 8f7374f..4bd9484 100644
--- 
a/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/testing/TestDataflowPipelineRunner.java
+++ 
b/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/testing/TestDataflowPipelineRunner.java
@@ -42,6 +42,7 @@ import java.io.IOException;
 import java.math.BigDecimal;
 import java.util.List;
 import java.util.concurrent.Callable;
+import java.util.concurrent.ExecutionException;
 import java.util.concurrent.Future;
 import java.util.concurrent.TimeUnit;
 
@@ -81,8 +82,6 @@ public class TestDataflowPipelineRunner extends 
PipelineRunner result;
+
   if (options.isStreaming()) {
 Future resultFuture = 
options.getExecutorService().submit(
 new Callable() {
@@ -114,24 +117,7 @@ public class TestDataflowPipelineRunner extends 
PipelineRunner messages) {
-  messageHandler.process(messages);
-  for (JobMessage message : messages) {
-if (message.getMessageImportance() != null
-&& 
message.getMessageImportance().equals("JOB_MESSAGE_ERROR")) {
-  LOG.info("Dataflow job {} threw exception, cancelling. 
Exception was: {}",
-  job.getJobId(), message.getMessageText());
-  try {
-job.cancel();
-  } catch (Exception e) {
-throw Throwables.propagate(e);
-  }
-}
-  }
-}
-  });
+State finalState = job.waitToFinish(10L, TimeUnit.MINUTES, 
messageHandler);
 if (finalState == null || finalState == State.RUNNING) {
   LOG.info("Dataflow job {} took longer than 10 minutes to complete, 
cancelling.",
   job.getJobId());
@@ -146,11 +132,18 @@ public class TestDataflowPipelineRunner extends 
PipelineRunner messages) {
+  messageHandler.process(messages);
+  for (JobMessage message : messages) {
+if (message.getMessageImportance() != null
+&& message.getMessageImportance().equals("JOB_MESSAGE_ERROR")) {
+  errorMessage = message.getMessageText();
+  LOG.info("Dataflow job {} threw exception, cancelling. Exception 
was: {}",
+  job.getJobId(), errorMessage);
+  try {
+job.cancel();
+  } catch (Exception e) {
+throw Throwables.propagate(e);
+  }
+
+}
+  }
+}
+
+private String getErrorMessage() {
+  return errorMessage;
+}
+  }
 }

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/c726cf13/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/testing/TestPipeline.java
--
diff --git 
a/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/testing/TestPipeline.java
 
b/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/testing/TestPipeline.java
index 98d4823..9eab4b1 100644
--- 
a/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/testing/TestPipeline.java
+++ 
b/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/testing/TestPipeline.java
@@ -24,7 +24,6 @@ import com.google.cloud.dataflow.sdk.options.GcpOptions;
 import com.google.cloud.dataflow.sdk.options.PipelineOptions;
 import 

[jira] [Commented] (BEAM-151) Create Dataflow Runner Package

2016-04-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233272#comment-15233272
 ] 

ASF GitHub Bot commented on BEAM-151:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/154


> Create Dataflow Runner Package
> --
>
> Key: BEAM-151
> URL: https://issues.apache.org/jira/browse/BEAM-151
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>
> Move Dataflow runner out of SDK core and into new Dataflow runner maven 
> module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request: [BEAM-151] Remove DataflowPipelineRun...

2016-04-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/154


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/4] incubator-beam git commit: [BEAM-151] Remove unused reference on Dataflow client from test

2016-04-08 Thread dhalperi
Repository: incubator-beam
Updated Branches:
  refs/heads/master 2ca54742b -> 6ab4c


[BEAM-151] Remove unused reference on Dataflow client from test


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/208c0db2
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/208c0db2
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/208c0db2

Branch: refs/heads/master
Commit: 208c0db2ef10584efba4ad6e09c80433027d2a4d
Parents: e87e46b
Author: Luke Cwik 
Authored: Fri Apr 8 17:08:29 2016 -0700
Committer: Dan Halperin 
Committed: Fri Apr 8 18:29:28 2016 -0700

--
 .../com/google/cloud/dataflow/sdk/runners/PipelineRunnerTest.java  | 2 --
 1 file changed, 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/208c0db2/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/runners/PipelineRunnerTest.java
--
diff --git 
a/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/runners/PipelineRunnerTest.java
 
b/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/runners/PipelineRunnerTest.java
index c702bc1..d072e62 100644
--- 
a/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/runners/PipelineRunnerTest.java
+++ 
b/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/runners/PipelineRunnerTest.java
@@ -19,7 +19,6 @@ package com.google.cloud.dataflow.sdk.runners;
 
 import static org.junit.Assert.assertTrue;
 
-import com.google.api.services.dataflow.Dataflow;
 import com.google.cloud.dataflow.sdk.options.ApplicationNameOptions;
 import com.google.cloud.dataflow.sdk.options.DirectPipelineOptions;
 import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;
@@ -40,7 +39,6 @@ import org.mockito.MockitoAnnotations;
 @RunWith(JUnit4.class)
 public class PipelineRunnerTest {
 
-  @Mock private Dataflow mockDataflow;
   @Mock private GcsUtil mockGcsUtil;
 
   @Before



[1/2] incubator-beam git commit: [BEAM-186] Fix pubsub injector for streaming examples

2016-04-08 Thread bchambers
Repository: incubator-beam
Updated Branches:
  refs/heads/master a32a26208 -> 2ca54742b


[BEAM-186] Fix pubsub injector for streaming examples


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/1ff8948c
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/1ff8948c
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/1ff8948c

Branch: refs/heads/master
Commit: 1ff8948cbee07a6f9248685cc79369679e8b6a8b
Parents: a32a262
Author: Henning Rohde 
Authored: Fri Apr 8 17:06:18 2016 -0700
Committer: bchambers 
Committed: Fri Apr 8 18:12:12 2016 -0700

--
 .../cloud/dataflow/examples/common/DataflowExampleUtils.java   | 2 ++
 .../src/main/java/common/DataflowExampleUtils.java | 2 ++
 2 files changed, 4 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/1ff8948c/examples/java/src/main/java/com/google/cloud/dataflow/examples/common/DataflowExampleUtils.java
--
diff --git 
a/examples/java/src/main/java/com/google/cloud/dataflow/examples/common/DataflowExampleUtils.java
 
b/examples/java/src/main/java/com/google/cloud/dataflow/examples/common/DataflowExampleUtils.java
index 1b2a57c..5b98170 100644
--- 
a/examples/java/src/main/java/com/google/cloud/dataflow/examples/common/DataflowExampleUtils.java
+++ 
b/examples/java/src/main/java/com/google/cloud/dataflow/examples/common/DataflowExampleUtils.java
@@ -359,6 +359,8 @@ public class DataflowExampleUtils {
   
copiedOptions.setServiceAccountKeyfile(options.getServiceAccountKeyfile());
 }
 copiedOptions.setStreaming(false);
+copiedOptions.setWorkerHarnessContainerImage(
+DataflowPipelineRunner.BATCH_WORKER_HARNESS_CONTAINER_IMAGE);
 
copiedOptions.setNumWorkers(options.as(DataflowExampleOptions.class).getInjectorNumWorkers());
 copiedOptions.setJobName(options.getJobName() + "-injector");
 Pipeline injectorPipeline = Pipeline.create(copiedOptions);

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/1ff8948c/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/src/main/java/common/DataflowExampleUtils.java
--
diff --git 
a/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/src/main/java/common/DataflowExampleUtils.java
 
b/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/src/main/java/common/DataflowExampleUtils.java
index 443a396..5042c2e 100644
--- 
a/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/src/main/java/common/DataflowExampleUtils.java
+++ 
b/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/src/main/java/common/DataflowExampleUtils.java
@@ -275,6 +275,8 @@ public class DataflowExampleUtils {
   public void runInjectorPipeline(String inputFile, String topic) {
 DataflowPipelineOptions copiedOptions = 
options.cloneAs(DataflowPipelineOptions.class);
 copiedOptions.setStreaming(false);
+copiedOptions.setWorkerHarnessContainerImage(
+DataflowPipelineRunner.BATCH_WORKER_HARNESS_CONTAINER_IMAGE);
 copiedOptions.setNumWorkers(
 options.as(ExamplePubsubTopicOptions.class).getInjectorNumWorkers());
 copiedOptions.setJobName(options.getJobName() + "-injector");



[jira] [Commented] (BEAM-186) The streaming WindowedWordCount example is broken on Dataflow

2016-04-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233262#comment-15233262
 ] 

ASF GitHub Bot commented on BEAM-186:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/156


> The streaming WindowedWordCount example is broken on Dataflow
> -
>
> Key: BEAM-186
> URL: https://issues.apache.org/jira/browse/BEAM-186
> Project: Beam
>  Issue Type: Bug
>  Components: examples-java
>Reporter: Henning Korsholm Rohde
>Assignee: Henning Korsholm Rohde
>Priority: Minor
>
> The streaming version of WindowedWordCount example is broken on Dataflow. 
> Currently, the injector cannot start up and emits the error 
> "java.lang.RuntimeException: DataflowWorkerHarness should not be used for 
> non-streaming pipelines". This in turn means that no messages are generated 
> for the streaming pipeline to process.
> The problem is that DataflowExampleUtils uses a batch workflow to inject 
> Google PubSub messages. This workflow reuses the real options and modifies 
> them to be batch, setting isStreaming to false etc. However, with versioned 
> containers, the option for workerHarnessContainerImage must now also be 
> changed to use the batch image.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request: [BEAM-186] Fix pubsub injector for st...

2016-04-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/156


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] incubator-beam git commit: This closes #156

2016-04-08 Thread bchambers
This closes #156


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/2ca54742
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/2ca54742
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/2ca54742

Branch: refs/heads/master
Commit: 2ca54742b1509abdf03fcee76f476a3c5df4a055
Parents: a32a262 1ff8948
Author: bchambers 
Authored: Fri Apr 8 18:12:17 2016 -0700
Committer: bchambers 
Committed: Fri Apr 8 18:12:17 2016 -0700

--
 .../cloud/dataflow/examples/common/DataflowExampleUtils.java   | 2 ++
 .../src/main/java/common/DataflowExampleUtils.java | 2 ++
 2 files changed, 4 insertions(+)
--




Jenkins build is still unstable: beam_PostCommit_RunnableOnService_GoogleCloudDataflow » Apache Beam :: SDKs :: Java :: Core #63

2016-04-08 Thread Apache Jenkins Server
See 




Jenkins build is unstable: beam_PostCommit_RunnableOnService_GoogleCloudDataflow » Apache Beam :: SDKs :: Java :: Core #64

2016-04-08 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: beam_PostCommit_RunnableOnService_GoogleCloudDataflow #62

2016-04-08 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-186) The streaming WindowedWordCount example is broken on Dataflow

2016-04-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233186#comment-15233186
 ] 

ASF GitHub Bot commented on BEAM-186:
-

GitHub user herohde opened a pull request:

https://github.com/apache/incubator-beam/pull/156

[BEAM-186] Fix pubsub injector for streaming examples




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/herohde/incubator-beam master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/156.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #156


commit 2ea17a4f1959838ef955abbe27b4b914f4169006
Author: Henning Rohde 
Date:   2016-04-09T00:06:18Z

[BEAM-186] Fix pubsub injector for streaming examples




> The streaming WindowedWordCount example is broken on Dataflow
> -
>
> Key: BEAM-186
> URL: https://issues.apache.org/jira/browse/BEAM-186
> Project: Beam
>  Issue Type: Bug
>  Components: examples-java
>Reporter: Henning Korsholm Rohde
>Assignee: Henning Korsholm Rohde
>Priority: Minor
>
> The streaming version of WindowedWordCount example is broken on Dataflow. 
> Currently, the injector cannot start up and emits the error 
> "java.lang.RuntimeException: DataflowWorkerHarness should not be used for 
> non-streaming pipelines". This in turn means that no messages are generated 
> for the streaming pipeline to process.
> The problem is that DataflowExampleUtils uses a batch workflow to inject 
> Google PubSub messages. This workflow reuses the real options and modifies 
> them to be batch, setting isStreaming to false etc. However, with versioned 
> containers, the option for workerHarnessContainerImage must now also be 
> changed to use the batch image.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request: [BEAM-186] Fix pubsub injector for st...

2016-04-08 Thread herohde
GitHub user herohde opened a pull request:

https://github.com/apache/incubator-beam/pull/156

[BEAM-186] Fix pubsub injector for streaming examples




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/herohde/incubator-beam master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/156.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #156


commit 2ea17a4f1959838ef955abbe27b4b914f4169006
Author: Henning Rohde 
Date:   2016-04-09T00:06:18Z

[BEAM-186] Fix pubsub injector for streaming examples




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-115) Beam Runner API

2016-04-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15233182#comment-15233182
 ] 

ASF GitHub Bot commented on BEAM-115:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/147


> Beam Runner API
> ---
>
> Key: BEAM-115
> URL: https://issues.apache.org/jira/browse/BEAM-115
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>
> The PipelineRunner API from the SDK is not ideal for the Beam technical 
> vision.
> It has technical limitations:
>  - The user's DAG (even including library expansions) is never explicitly 
> represented, so it cannot be analyzed except incrementally, and cannot 
> necessarily be reconstructed (for example, to display it!).
>  - The flattened DAG of just primitive transforms isn't well-suited for 
> display or transform override.
>  - The TransformHierarchy isn't well-suited for optimizations.
>  - The user must realistically pre-commit to a runner, and its configuration 
> (batch vs streaming) prior to graph construction, since the runner will be 
> modifying the graph as it is built.
>  - It is fairly language- and SDK-specific.
> It has usability issues (these are not from intuition, but derived from 
> actual cases of failure to use according to the design)
>  - The interleaving of apply() methods in PTransform/Pipeline/PipelineRunner 
> is confusing.
>  - The TransformHierarchy, accessible only via visitor traversals, is 
> cumbersome.
>  - The staging of construction-time vs run-time is not always obvious.
> These are just examples. This ticket tracks designing, coming to consensus, 
> and building an API that more simply and directly supports the technical 
> vision.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-186) The streaming WindowedWordCount example is broken on Dataflow

2016-04-08 Thread Henning Korsholm Rohde (JIRA)
Henning Korsholm Rohde created BEAM-186:
---

 Summary: The streaming WindowedWordCount example is broken on 
Dataflow
 Key: BEAM-186
 URL: https://issues.apache.org/jira/browse/BEAM-186
 Project: Beam
  Issue Type: Bug
  Components: examples-java
Reporter: Henning Korsholm Rohde
Assignee: Frances Perry
Priority: Minor


The streaming version of WindowedWordCount example is broken on Dataflow. 
Currently, the injector cannot start up and emits the error 
"java.lang.RuntimeException: DataflowWorkerHarness should not be used for 
non-streaming pipelines". This in turn means that no messages are generated for 
the streaming pipeline to process.

The problem is that DataflowExampleUtils uses a batch workflow to inject Google 
PubSub messages. This workflow reuses the real options and modifies them to be 
batch, setting isStreaming to false etc. However, with versioned containers, 
the option for workerHarnessContainerImage must now also be changed to use the 
batch image.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[2/2] incubator-beam git commit: This closes #147

2016-04-08 Thread kenn
This closes #147


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/a32a2620
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/a32a2620
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/a32a2620

Branch: refs/heads/master
Commit: a32a26208854a0875b10026d8f659ac275e37560
Parents: a43f9b8 42969cb
Author: Kenneth Knowles 
Authored: Fri Apr 8 17:04:03 2016 -0700
Committer: Kenneth Knowles 
Committed: Fri Apr 8 17:04:03 2016 -0700

--
 .../sdk/runners/DirectPipelineRunner.java   | 47 
 .../inprocess/WindowEvaluatorFactory.java   | 13 +++---
 .../sdk/transforms/windowing/Window.java| 21 ++---
 .../cloud/dataflow/sdk/util/AssignWindows.java  | 46 +++
 4 files changed, 104 insertions(+), 23 deletions(-)
--




[1/2] incubator-beam git commit: Move expansion of Window.Bound into DirectPipelineRunner

2016-04-08 Thread kenn
Repository: incubator-beam
Updated Branches:
  refs/heads/master a43f9b820 -> a32a26208


Move expansion of Window.Bound into DirectPipelineRunner

In the Beam model, windowing is a primitive concept. The expansion provided
by the SDK is not implementable except via access to privileged methods
not intended for Beam pipeline authors.

This change is a precursor to eliminating these privileged entirely.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/42969cb6
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/42969cb6
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/42969cb6

Branch: refs/heads/master
Commit: 42969cb6744c41debe575857fb7d093ce527
Parents: 5f24cef
Author: Kenneth Knowles 
Authored: Thu Apr 7 17:34:19 2016 -0700
Committer: Kenneth Knowles 
Committed: Fri Apr 8 15:03:00 2016 -0700

--
 .../sdk/runners/DirectPipelineRunner.java   | 47 
 .../inprocess/WindowEvaluatorFactory.java   | 13 +++---
 .../sdk/transforms/windowing/Window.java| 21 ++---
 .../cloud/dataflow/sdk/util/AssignWindows.java  | 46 +++
 4 files changed, 104 insertions(+), 23 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/42969cb6/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/DirectPipelineRunner.java
--
diff --git 
a/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/DirectPipelineRunner.java
 
b/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/DirectPipelineRunner.java
index 35e392b..57e6116 100644
--- 
a/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/DirectPipelineRunner.java
+++ 
b/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/DirectPipelineRunner.java
@@ -47,7 +47,10 @@ import com.google.cloud.dataflow.sdk.transforms.ParDo;
 import com.google.cloud.dataflow.sdk.transforms.Partition;
 import com.google.cloud.dataflow.sdk.transforms.Partition.PartitionFn;
 import com.google.cloud.dataflow.sdk.transforms.windowing.BoundedWindow;
+import com.google.cloud.dataflow.sdk.transforms.windowing.Window;
+import com.google.cloud.dataflow.sdk.transforms.windowing.WindowFn;
 import com.google.cloud.dataflow.sdk.util.AppliedCombineFn;
+import com.google.cloud.dataflow.sdk.util.AssignWindows;
 import com.google.cloud.dataflow.sdk.util.GroupByKeyViaGroupByKeyOnly;
 import 
com.google.cloud.dataflow.sdk.util.GroupByKeyViaGroupByKeyOnly.GroupByKeyOnly;
 import com.google.cloud.dataflow.sdk.util.IOChannelUtils;
@@ -57,6 +60,7 @@ import 
com.google.cloud.dataflow.sdk.util.PerKeyCombineFnRunners;
 import com.google.cloud.dataflow.sdk.util.SerializableUtils;
 import com.google.cloud.dataflow.sdk.util.TestCredential;
 import com.google.cloud.dataflow.sdk.util.WindowedValue;
+import com.google.cloud.dataflow.sdk.util.WindowingStrategy;
 import com.google.cloud.dataflow.sdk.util.common.Counter;
 import com.google.cloud.dataflow.sdk.util.common.CounterSet;
 import com.google.cloud.dataflow.sdk.values.KV;
@@ -255,6 +259,9 @@ public class DirectPipelineRunner
 } else if (transform instanceof GroupByKey) {
   return (OutputT)
   ((PCollection) input).apply(new 
GroupByKeyViaGroupByKeyOnly((GroupByKey) transform));
+} else if (transform instanceof Window.Bound) {
+  return (OutputT)
+  ((PCollection) input).apply(new 
AssignWindowsAndSetStrategy((Window.Bound) transform));
 } else {
   return super.apply(transform, input);
 }
@@ -400,6 +407,46 @@ public class DirectPipelineRunner
 }
   }
 
+  private static class AssignWindowsAndSetStrategy
+  extends PTransform {
+
+private final Window.Bound wrapped;
+
+public AssignWindowsAndSetStrategy(Window.Bound wrapped) {
+  this.wrapped = wrapped;
+}
+
+@Override
+public PCollection apply(PCollection input) {
+  WindowingStrategy outputStrategy =
+  wrapped.getOutputStrategyInternal(input.getWindowingStrategy());
+
+  WindowFn windowFn =
+  (WindowFn) outputStrategy.getWindowFn();
+
+  // If the Window.Bound transform only changed parts other than the 
WindowFn, then
+  // we skip AssignWindows even though it should be harmless in a perfect 
world.
+  // The world is not perfect, and a GBK may have set it to InvalidWindows 
to forcibly
+  // crash if another GBK is performed without explicitly setting the 
WindowFn. So we skip
+  // AssignWindows in this case.
+  if (wrapped.getWindowFn() == null) {
+return input.apply("Identity", ParDo.of(new IdentityFn()))
+ 

[GitHub] incubator-beam pull request: [BEAM-115] Move expansion of Window.B...

2016-04-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/147


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (BEAM-185) XmlSink output file pattern missing "." in extension

2016-04-08 Thread Scott Wegner (JIRA)
Scott Wegner created BEAM-185:
-

 Summary: XmlSink output file pattern missing "." in extension
 Key: BEAM-185
 URL: https://issues.apache.org/jira/browse/BEAM-185
 Project: Beam
  Issue Type: Bug
Reporter: Scott Wegner
Priority: Minor


The XmlSink takes as input a filename prefix and adds the shard name and 
extension automatically. However, it is missing the "." when adding the 
extension.

For an XmlSink configured as:

{{XmlSink.write().toFilenamePrefix("foobar");}}

the fileNamePattern is {{foobar-S-of-Nxml}}. Instead, it should be 
{{foobar-S-of-N.xml}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request: FileBasedSource: throw IOException in...

2016-04-08 Thread dhalperi
GitHub user dhalperi opened a pull request:

https://github.com/apache/incubator-beam/pull/155

FileBasedSource: throw IOException instead of Exception where possible

- Replace "throws Exception" with "throws IOException" from APIs
  that already only throw IOException.
- Refactor one function to catch, handle, and rethrow
  (Interrupted|Execution)Exception.

This makes it easier to write a new FileBasedSource, because the
FilebasedReader functions are only registered to throw IOException.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/incubator-beam filebasedsource

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/155.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #155


commit 8c3c94b6e8809d0ea7e276e9a0e29799182817df
Author: Dan Halperin 
Date:   2016-04-08T23:09:21Z

FileBasedSource: throw IOException instead of Exception where possible

- Replace "throws Exception" with "throws IOException" from APIs
  that already only throw IOException.
- Refactor one function to catch, handle, and rethrow
  (Interrupted|Execution)Exception.

This makes it easier to write a new FileBasedSource, because the
FilebasedReader functions are only registered to throw IOException.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-beam pull request: [BEAM-151] Remove DataflowPipelineRun...

2016-04-08 Thread lukecwik
GitHub user lukecwik opened a pull request:

https://github.com/apache/incubator-beam/pull/154

[BEAM-151] Remove DataflowPipelineRunner refs in TestPipeline/PAssert

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace "" in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lukecwik/incubator-beam dataflow-runner-4

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/154.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #154


commit e403691b01e8deabd424bade0127e873004627f9
Author: Luke Cwik 
Date:   2016-04-08T22:07:52Z

[BEAM-151] Remove dependence on DataflowPipelineRunner in 
PAssert/TestPipeline

Added the ability for TestDataflowPipelineRunner to throw an AssertionError
when the pipeline fails using the first job error message.

This allows for moving Dataflow to a new package.

commit d84b921176e926cd71c540285f07fd37f921b0ec
Author: Luke Cwik 
Date:   2016-04-08T22:49:53Z

[BEAM-151] Expand javadoc in TestPipeline explaining usage

Update TestDataflowPipelineRunner to capture all the failure messages.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (BEAM-184) Using Merging Windows and/or Triggers without a downstream aggregation should fail

2016-04-08 Thread Ben Chambers (JIRA)
Ben Chambers created BEAM-184:
-

 Summary: Using Merging Windows and/or Triggers without a 
downstream aggregation should fail
 Key: BEAM-184
 URL: https://issues.apache.org/jira/browse/BEAM-184
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-core
Reporter: Ben Chambers
Assignee: Davor Bonaci
Priority: Minor


Both merging windows (such as sessions) and triggering only actually happen at 
an aggregation (GroupByKey). We should produce errors in any of these cases:

1. Merging window used without a downstream GroupByKey
2. Triggers used without a downstream GroupuByKey
3. Window inspected after inserting a merging window and before the GroupByKey



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request: [BEAM-22] Enforce non-null preconditi...

2016-04-08 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/incubator-beam/pull/153

[BEAM-22] Enforce non-null preconditions

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace "" in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

The InProcessPipelineRunner expects some result values to be non-null.
Enforce those early.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/incubator-beam ippr_cnn

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/153.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #153






---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Jenkins build is still unstable: beam_RunnableOnService_GoogleCloudDataflow » Apache Beam :: SDKs :: Java :: Core #61

2016-04-08 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: beam_RunnableOnService_GoogleCloudDataflow #61

2016-04-08 Thread Apache Jenkins Server
See 




Jenkins build is unstable: beam_RunnableOnService_GoogleCloudDataflow #60

2016-04-08 Thread Apache Jenkins Server
See 




[1/3] incubator-beam git commit: [BEAM-151] Update Dataflow worker container label

2016-04-08 Thread lcwik
Repository: incubator-beam
Updated Branches:
  refs/heads/master 5d78420bf -> a43f9b820


[BEAM-151] Update Dataflow worker container label

This new worker container label includes the change required
to decouple the worker from depending on the deleted code
in PipelineOptionsFactory.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/e74194e7
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/e74194e7
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/e74194e7

Branch: refs/heads/master
Commit: e74194e71fb94b915eb7080c6037127caffbbc05
Parents: 02ee745
Author: Luke Cwik 
Authored: Fri Apr 8 10:14:12 2016 -0700
Committer: Luke Cwik 
Committed: Fri Apr 8 14:27:36 2016 -0700

--
 .../cloud/dataflow/sdk/runners/DataflowPipelineRunner.java   | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/e74194e7/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/DataflowPipelineRunner.java
--
diff --git 
a/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/DataflowPipelineRunner.java
 
b/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/DataflowPipelineRunner.java
index 50ca36f..a154848 100644
--- 
a/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/DataflowPipelineRunner.java
+++ 
b/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/DataflowPipelineRunner.java
@@ -212,9 +212,9 @@ public class DataflowPipelineRunner extends 
PipelineRunner
   // Default Docker container images that execute Dataflow worker harness, 
residing in Google
   // Container Registry, separately for Batch and Streaming.
   public static final String BATCH_WORKER_HARNESS_CONTAINER_IMAGE
-  = "dataflow.gcr.io/v1beta3/java-batch:1.5.0";
+  = "dataflow.gcr.io/v1beta3/java-batch:1.5.1";
   public static final String STREAMING_WORKER_HARNESS_CONTAINER_IMAGE
-  = "dataflow.gcr.io/v1beta3/java-streaming:1.5.0";
+  = "dataflow.gcr.io/v1beta3/java-streaming:1.5.1";
 
   // The limit of CreateJob request size.
   private static final int CREATE_JOB_REQUEST_LIMIT_BYTES = 10 * 1024 * 1024;



[GitHub] incubator-beam pull request: [BEAM-151] Clean up reference to Data...

2016-04-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/78


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/3] incubator-beam git commit: [BEAM-151] Clean up reference to DataflowWorkerHarnessOptions

2016-04-08 Thread lcwik
[BEAM-151] Clean up reference to DataflowWorkerHarnessOptions

This prevents moving DataflowWorkerHarnessOptions to Dataflow Runner maven
module.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/02ee7457
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/02ee7457
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/02ee7457

Branch: refs/heads/master
Commit: 02ee74579171a8e47909b23425dcd906f22786cd
Parents: 5d78420
Author: Luke Cwik 
Authored: Fri Mar 25 15:28:43 2016 -0700
Committer: Luke Cwik 
Committed: Fri Apr 8 14:27:36 2016 -0700

--
 .../sdk/options/PipelineOptionsFactory.java | 48 
 .../sdk/options/PipelineOptionsFactoryTest.java | 17 ---
 2 files changed, 65 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/02ee7457/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/options/PipelineOptionsFactory.java
--
diff --git 
a/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/options/PipelineOptionsFactory.java
 
b/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/options/PipelineOptionsFactory.java
index 988d346..dac7726 100644
--- 
a/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/options/PipelineOptionsFactory.java
+++ 
b/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/options/PipelineOptionsFactory.java
@@ -831,54 +831,6 @@ public class PipelineOptionsFactory {
   }
 
   /**
-   * Creates a set of Dataflow worker harness options based of a set of known 
system
-   * properties. This is meant to only be used from the Dataflow worker 
harness as a method to
-   * bootstrap the worker harness.
-   *
-   * For internal use only.
-   *
-   * @return A {@link DataflowWorkerHarnessOptions} object configured for the
-   * Dataflow worker harness.
-   */
-  public static DataflowWorkerHarnessOptions 
createFromSystemPropertiesInternal()
-  throws IOException {
-return createFromSystemProperties();
-  }
-
-  /**
-   * Creates a set of {@link DataflowWorkerHarnessOptions} based of a set of 
known system
-   * properties. This is meant to only be used from the Dataflow worker 
harness as a method to
-   * bootstrap the worker harness.
-   *
-   * @return A {@link DataflowWorkerHarnessOptions} object configured for the
-   * Dataflow worker harness.
-   * @deprecated for internal use only
-   */
-  @Deprecated
-  public static DataflowWorkerHarnessOptions createFromSystemProperties() 
throws IOException {
-ObjectMapper objectMapper = new ObjectMapper();
-DataflowWorkerHarnessOptions options;
-if (System.getProperties().containsKey("sdk_pipeline_options")) {
-  String serializedOptions = System.getProperty("sdk_pipeline_options");
-  LOG.info("Worker harness starting with: " + serializedOptions);
-  options = objectMapper.readValue(serializedOptions, 
PipelineOptions.class)
-  .as(DataflowWorkerHarnessOptions.class);
-} else {
-  options = PipelineOptionsFactory.as(DataflowWorkerHarnessOptions.class);
-}
-
-// These values will not be known at job submission time and must be 
provided.
-if (System.getProperties().containsKey("worker_id")) {
-  options.setWorkerId(System.getProperty("worker_id"));
-}
-if (System.getProperties().containsKey("job_id")) {
-  options.setJobId(System.getProperty("job_id"));
-}
-
-return options;
-  }
-
-  /**
* This method is meant to emulate the behavior of {@link 
Introspector#getBeanInfo(Class, int)}
* to construct the list of {@link PropertyDescriptor}.
*

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/02ee7457/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/options/PipelineOptionsFactoryTest.java
--
diff --git 
a/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/options/PipelineOptionsFactoryTest.java
 
b/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/options/PipelineOptionsFactoryTest.java
index d8ba8e3..6ba1e00 100644
--- 
a/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/options/PipelineOptionsFactoryTest.java
+++ 
b/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/options/PipelineOptionsFactoryTest.java
@@ -78,23 +78,6 @@ public class PipelineOptionsFactoryTest {
   }
 
   @Test
-  public void testCreationFromSystemProperties() throws Exception {
-System.getProperties().putAll(ImmutableMap
-.builder()
-.put("worker_id", "test_worker_id")
-.put("job_id", "test_job_id")
-// Set a non-default value for testing
-

[jira] [Commented] (BEAM-22) DirectPipelineRunner: support for unbounded collections

2016-04-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-22?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232974#comment-15232974
 ] 

ASF GitHub Bot commented on BEAM-22:


Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/143


> DirectPipelineRunner: support for unbounded collections
> ---
>
> Key: BEAM-22
> URL: https://issues.apache.org/jira/browse/BEAM-22
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-direct
>Reporter: Davor Bonaci
>Assignee: Thomas Groh
>
> DirectPipelineRunner currently runs over bounded PCollections only, and 
> implements only a portion of the Beam Model.
> We should improve it to faithfully implement the full Beam Model, such as add 
> ability to run over unbounded PCollections, and better resemble execution 
> model in a distributed system.
> This further enables features such as a testing source which may simulate 
> late data and test triggers in the pipeline. Finally, we may want to expose 
> an option to select between "debug" (single threaded), "chaos monkey" (test 
> as many model requirements as possible), and "performance" (multi-threaded).
> more testing (chaos monkey) 
> Once this is done, we should update this StackOverflow question:
> http://stackoverflow.com/questions/35350113/testing-triggers-with-processing-time/35401426#35401426



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request: [BEAM-22] Use a weakValues LoadingCac...

2016-04-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/143


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[3/3] incubator-beam git commit: [BEAM-22] This closes #143

2016-04-08 Thread lcwik
[BEAM-22] This closes #143


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/5d78420b
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/5d78420b
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/5d78420b

Branch: refs/heads/master
Commit: 5d78420bf31ca7e289fc956dc5365d920afd2439
Parents: 529bcdf 6f52637
Author: Luke Cwik 
Authored: Fri Apr 8 14:21:30 2016 -0700
Committer: Luke Cwik 
Committed: Fri Apr 8 14:21:30 2016 -0700

--
 .../ExecutorServiceParallelExecutor.java| 40 ++--
 1 file changed, 28 insertions(+), 12 deletions(-)
--




[1/3] incubator-beam git commit: Use a weakValues LoadingCache for serial TransformExecutorServices

2016-04-08 Thread lcwik
Repository: incubator-beam
Updated Branches:
  refs/heads/master 529bcdf56 -> 5d78420bf


Use a weakValues LoadingCache for serial TransformExecutorServices

This allows the garbage collector to clean up references to
TransformExecutorServices which are not currently in use.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/6f526374
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/6f526374
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/6f526374

Branch: refs/heads/master
Commit: 6f526374fbc743a0d22e37ad0f746f0a695785dd
Parents: ad58e26
Author: Thomas Groh 
Authored: Tue Mar 29 10:56:10 2016 -0700
Committer: Luke Cwik 
Committed: Fri Apr 8 14:20:20 2016 -0700

--
 .../ExecutorServiceParallelExecutor.java| 36 ++--
 1 file changed, 25 insertions(+), 11 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/6f526374/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/ExecutorServiceParallelExecutor.java
--
diff --git 
a/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/ExecutorServiceParallelExecutor.java
 
b/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/ExecutorServiceParallelExecutor.java
index 4d45e8f..c770735 100644
--- 
a/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/ExecutorServiceParallelExecutor.java
+++ 
b/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/ExecutorServiceParallelExecutor.java
@@ -31,6 +31,9 @@ import com.google.cloud.dataflow.sdk.values.PCollection;
 import com.google.cloud.dataflow.sdk.values.PValue;
 import com.google.common.base.MoreObjects;
 import com.google.common.base.Optional;
+import com.google.common.cache.CacheBuilder;
+import com.google.common.cache.CacheLoader;
+import com.google.common.cache.LoadingCache;
 import com.google.common.collect.ImmutableList;
 
 import org.joda.time.Instant;
@@ -69,7 +72,7 @@ final class ExecutorServiceParallelExecutor implements 
InProcessExecutor {
 
   private final InProcessEvaluationContext evaluationContext;
 
-  private final ConcurrentMap 
currentEvaluations;
+  private final LoadingCache 
executorServices;
   private final ConcurrentMap 
scheduledExecutors;
 
   private final Queue allUpdates;
@@ -107,8 +110,12 @@ final class ExecutorServiceParallelExecutor implements 
InProcessExecutor {
 this.transformEnforcements = transformEnforcements;
 this.evaluationContext = context;
 
-currentEvaluations = new ConcurrentHashMap<>();
 scheduledExecutors = new ConcurrentHashMap<>();
+// Weak Values allows TransformExecutorServices that are no longer in use 
to be reclaimed.
+// Executing TransformExecutorServices have a strong reference to their 
TransformExecutorService
+// which stops the TransformExecutorServices from being prematurely 
garbage collected
+executorServices =
+
CacheBuilder.newBuilder().weakValues().build(serialTransformExecutorServiceCacheLoader());
 
 this.allUpdates = new ConcurrentLinkedQueue<>();
 this.visibleUpdates = new ArrayBlockingQueue<>(20);
@@ -118,6 +125,16 @@ final class ExecutorServiceParallelExecutor implements 
InProcessExecutor {
 defaultCompletionCallback = new DefaultCompletionCallback();
   }
 
+  private CacheLoader
+  serialTransformExecutorServiceCacheLoader() {
+return new CacheLoader() {
+  @Override
+  public TransformExecutorService load(StepAndKey stepAndKey) throws 
Exception {
+return TransformExecutorServices.serial(executorService, 
scheduledExecutors);
+  }
+};
+  }
+
   @Override
   public void start(Collection roots) {
 rootNodes = ImmutableList.copyOf(roots);
@@ -142,7 +159,12 @@ final class ExecutorServiceParallelExecutor implements 
InProcessExecutor {
 if (bundle != null && isKeyed(bundle.getPCollection())) {
   final StepAndKey stepAndKey =
   StepAndKey.of(transform, bundle == null ? null : bundle.getKey());
-  transformExecutor = getSerialExecutorService(stepAndKey);
+  // This executor will remain reachable until it has executed all 
scheduled transforms.
+  // The TransformExecutors keep a strong reference to the Executor, the 
ExecutorService keeps
+  // a reference to the scheduled TransformExecutor callable. Follow-up 
TransformExecutors
+  // (scheduled due to the completion of another TransformExecutor) are 
provided to 

[5/5] incubator-beam git commit: This closes #146

2016-04-08 Thread davor
This closes #146


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/529bcdf5
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/529bcdf5
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/529bcdf5

Branch: refs/heads/master
Commit: 529bcdf5688139ae76bcbcaa1c0dbbf44c3008bc
Parents: 5f24cef d73ceab
Author: Davor Bonaci 
Authored: Fri Apr 8 13:55:07 2016 -0700
Committer: Davor Bonaci 
Committed: Fri Apr 8 13:55:07 2016 -0700

--
 .../contrib/joinlibrary/InnerJoinTest.java  |  10 +-
 .../contrib/joinlibrary/OuterLeftJoinTest.java  |  10 +-
 .../contrib/joinlibrary/OuterRightJoinTest.java |  10 +-
 examples/java/pom.xml   |   2 +-
 .../dataflow/examples/DebuggingWordCount.java   |  12 +-
 .../cloud/dataflow/examples/WordCountTest.java  |   4 +-
 .../examples/complete/AutoCompleteTest.java |   8 +-
 .../dataflow/examples/complete/TfIdfTest.java   |   4 +-
 .../complete/TopWikipediaSessionsTest.java  |   4 +-
 .../examples/cookbook/DeDupExampleTest.java |   6 +-
 .../examples/cookbook/JoinExamplesTest.java |   4 +-
 .../examples/cookbook/TriggerExampleTest.java   |   4 +-
 .../examples/complete/game/GameStatsTest.java   |   4 +-
 .../complete/game/HourlyTeamScoreTest.java  |   4 +-
 .../examples/complete/game/UserScoreTest.java   |   8 +-
 .../beam/runners/flink/FlinkTestPipeline.java   |   6 +-
 .../streaming/StreamingTransformTranslator.java |   4 +-
 .../apache/beam/runners/spark/DeDupTest.java|   4 +-
 .../beam/runners/spark/SimpleWordCountTest.java |   4 +-
 .../apache/beam/runners/spark/TfIdfTest.java|   4 +-
 .../spark/translation/DoFnOutputTest.java   |   4 +-
 .../translation/MultiOutputWordCountTest.java   |   4 +-
 .../spark/translation/SerializationTest.java|   4 +-
 .../translation/WindowedWordCountTest.java  |   8 +-
 .../streaming/FlattenStreamingTest.java |   8 +-
 .../streaming/KafkaStreamingTest.java   |   8 +-
 .../streaming/SimpleStreamingWordCountTest.java |   8 +-
 .../utils/DataflowAssertStreaming.java  |  42 -
 .../streaming/utils/PAssertStreaming.java   |  42 +
 .../dataflow/sdk/testing/DataflowAssert.java| 826 ---
 .../cloud/dataflow/sdk/testing/PAssert.java | 825 ++
 .../sdk/testing/SerializableMatcher.java|   2 +-
 .../dataflow/sdk/testing/SourceTestUtils.java   |   2 +-
 .../sdk/testing/TestDataflowPipelineRunner.java |   8 +-
 .../dataflow/sdk/testing/TestPipeline.java  |   6 +-
 .../google/cloud/dataflow/sdk/PipelineTest.java |  10 +-
 .../dataflow/sdk/coders/AvroCoderTest.java  |   4 +-
 .../sdk/coders/SerializableCoderTest.java   |   4 +-
 .../cloud/dataflow/sdk/io/AvroIOTest.java   |   6 +-
 .../io/BoundedReadFromUnboundedSourceTest.java  |   4 +-
 .../dataflow/sdk/io/CompressedSourceTest.java   |  10 +-
 .../dataflow/sdk/io/CountingInputTest.java  |  12 +-
 .../dataflow/sdk/io/CountingSourceTest.java |  14 +-
 .../dataflow/sdk/io/FileBasedSourceTest.java|   6 +-
 .../cloud/dataflow/sdk/io/TextIOTest.java   |   8 +-
 .../cloud/dataflow/sdk/io/XmlSourceTest.java|   8 +-
 .../sdk/io/bigtable/BigtableIOTest.java |   8 +-
 .../sdk/runners/dataflow/CustomSourcesTest.java |   6 +-
 .../runners/inprocess/InProcessCreateTest.java  |   8 +-
 .../inprocess/InProcessPipelineRunnerTest.java  |   4 +-
 .../sdk/testing/DataflowAssertTest.java | 327 
 .../cloud/dataflow/sdk/testing/PAssertTest.java | 331 
 .../testing/TestDataflowPipelineRunnerTest.java |  24 +-
 .../transforms/ApproximateQuantilesTest.java|  10 +-
 .../sdk/transforms/ApproximateUniqueTest.java   |  12 +-
 .../dataflow/sdk/transforms/CombineFnsTest.java |  12 +-
 .../dataflow/sdk/transforms/CombineTest.java|  66 +-
 .../dataflow/sdk/transforms/CountTest.java  |  10 +-
 .../dataflow/sdk/transforms/CreateTest.java |  18 +-
 .../dataflow/sdk/transforms/FilterTest.java |  18 +-
 .../sdk/transforms/FlatMapElementsTest.java |   4 +-
 .../dataflow/sdk/transforms/FlattenTest.java|  22 +-
 .../dataflow/sdk/transforms/GroupByKeyTest.java |   8 +-
 .../cloud/dataflow/sdk/transforms/KeysTest.java |   6 +-
 .../dataflow/sdk/transforms/KvSwapTest.java |   6 +-
 .../sdk/transforms/MapElementsTest.java |   6 +-
 .../dataflow/sdk/transforms/ParDoTest.java  |  66 +-
 .../dataflow/sdk/transforms/PartitionTest.java  |  14 +-
 .../sdk/transforms/RemoveDuplicatesTest.java|   8 +-
 .../dataflow/sdk/transforms/SampleTest.java |  14 +-
 .../cloud/dataflow/sdk/transforms/TopTest.java  |  32 +-
 .../dataflow/sdk/transforms/ValuesTest.java |   6 +-
 .../cloud/dataflow/sdk/transforms/ViewTest.java |  72 +-
 .../dataflow/sdk/transforms/WithKeysTest.java   |   8 +-
 

[2/5] incubator-beam git commit: Rename DataflowAssert to PAssert

2016-04-08 Thread davor
http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/d73ceab8/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/io/bigtable/BigtableIOTest.java
--
diff --git 
a/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/io/bigtable/BigtableIOTest.java
 
b/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/io/bigtable/BigtableIOTest.java
index b8ac243..7cb77c7 100644
--- 
a/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/io/bigtable/BigtableIOTest.java
+++ 
b/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/io/bigtable/BigtableIOTest.java
@@ -42,8 +42,8 @@ import com.google.cloud.dataflow.sdk.coders.Coder;
 import com.google.cloud.dataflow.sdk.io.bigtable.BigtableIO.BigtableSource;
 import com.google.cloud.dataflow.sdk.io.range.ByteKey;
 import com.google.cloud.dataflow.sdk.io.range.ByteKeyRange;
-import com.google.cloud.dataflow.sdk.testing.DataflowAssert;
 import com.google.cloud.dataflow.sdk.testing.ExpectedLogs;
+import com.google.cloud.dataflow.sdk.testing.PAssert;
 import com.google.cloud.dataflow.sdk.testing.TestPipeline;
 import com.google.cloud.dataflow.sdk.transforms.Create;
 import com.google.cloud.dataflow.sdk.values.KV;
@@ -218,7 +218,7 @@ public class BigtableIOTest {
 
 TestPipeline p = TestPipeline.create();
 PCollection rows = p.apply(defaultRead.withTableId(table));
-DataflowAssert.that(rows).empty();
+PAssert.that(rows).empty();
 
 p.run();
 logged.verifyInfo(String.format("Closing reader after reading 0 
records."));
@@ -233,7 +233,7 @@ public class BigtableIOTest {
 
 TestPipeline p = TestPipeline.create();
 PCollection rows = p.apply(defaultRead.withTableId(table));
-DataflowAssert.that(rows).containsInAnyOrder(testRows);
+PAssert.that(rows).containsInAnyOrder(testRows);
 
 p.run();
 logged.verifyInfo(String.format("Closing reader after reading %d 
records.", numRows));
@@ -278,7 +278,7 @@ public class BigtableIOTest {
 
 TestPipeline p = TestPipeline.create();
 PCollection rows = 
p.apply(defaultRead.withTableId(table).withRowFilter(filter));
-DataflowAssert.that(rows).containsInAnyOrder(filteredRows);
+PAssert.that(rows).containsInAnyOrder(filteredRows);
 
 p.run();
   }

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/d73ceab8/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/runners/dataflow/CustomSourcesTest.java
--
diff --git 
a/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/runners/dataflow/CustomSourcesTest.java
 
b/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/runners/dataflow/CustomSourcesTest.java
index fb045cd..20d2bc8 100644
--- 
a/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/runners/dataflow/CustomSourcesTest.java
+++ 
b/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/runners/dataflow/CustomSourcesTest.java
@@ -33,8 +33,8 @@ import com.google.cloud.dataflow.sdk.io.Read;
 import com.google.cloud.dataflow.sdk.options.DataflowPipelineOptions;
 import com.google.cloud.dataflow.sdk.options.PipelineOptions;
 import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;
-import com.google.cloud.dataflow.sdk.testing.DataflowAssert;
 import com.google.cloud.dataflow.sdk.testing.ExpectedLogs;
+import com.google.cloud.dataflow.sdk.testing.PAssert;
 import com.google.cloud.dataflow.sdk.testing.TestPipeline;
 import com.google.cloud.dataflow.sdk.transforms.Sample;
 import com.google.cloud.dataflow.sdk.transforms.Sum;
@@ -213,7 +213,7 @@ public class CustomSourcesTest {
 .apply(Sum.integersGlobally())
 .apply(Sample.any(1));
 
-DataflowAssert.thatSingleton(sum).isEqualTo(145);
+PAssert.thatSingleton(sum).isEqualTo(145);
 p.run();
   }
 
@@ -225,7 +225,7 @@ public class CustomSourcesTest {
  .apply(Window.into(FixedWindows.of(Duration.millis(3
  .apply(Sum.integersGlobally().withoutDefaults());
 // Should group into [10 11] [12 13 14] [15 16 17] [18 19].
-DataflowAssert.that(sums).containsInAnyOrder(21, 37, 39, 48);
+PAssert.that(sums).containsInAnyOrder(21, 37, 39, 48);
 p.run();
   }
 

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/d73ceab8/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessCreateTest.java
--
diff --git 
a/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessCreateTest.java
 
b/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessCreateTest.java
index 1956b20..8bb7e51 100644
--- 
a/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessCreateTest.java
+++ 
b/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessCreateTest.java
@@ -35,7 +35,7 @@ 

[1/5] incubator-beam git commit: Rename DataflowAssert to PAssert

2016-04-08 Thread davor
Repository: incubator-beam
Updated Branches:
  refs/heads/master 5f24cef20 -> 529bcdf56


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/d73ceab8/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/transforms/PartitionTest.java
--
diff --git 
a/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/transforms/PartitionTest.java
 
b/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/transforms/PartitionTest.java
index 31447fc..53642f1 100644
--- 
a/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/transforms/PartitionTest.java
+++ 
b/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/transforms/PartitionTest.java
@@ -21,7 +21,7 @@ import static org.junit.Assert.assertEquals;
 import static org.junit.Assert.assertTrue;
 
 import com.google.cloud.dataflow.sdk.Pipeline;
-import com.google.cloud.dataflow.sdk.testing.DataflowAssert;
+import com.google.cloud.dataflow.sdk.testing.PAssert;
 import com.google.cloud.dataflow.sdk.testing.RunnableOnService;
 import com.google.cloud.dataflow.sdk.testing.TestPipeline;
 import com.google.cloud.dataflow.sdk.transforms.Partition.PartitionFn;
@@ -70,8 +70,8 @@ public class PartitionTest implements Serializable {
 .apply(Create.of(591, 11789, 1257, 24578, 24799, 307))
 .apply(Partition.of(2, new ModFn()));
 assertTrue(outputs.size() == 2);
-DataflowAssert.that(outputs.get(0)).containsInAnyOrder(24578);
-DataflowAssert.that(outputs.get(1)).containsInAnyOrder(591, 11789, 1257,
+PAssert.that(outputs.get(0)).containsInAnyOrder(24578);
+PAssert.that(outputs.get(1)).containsInAnyOrder(591, 11789, 1257,
 24799, 307);
 pipeline.run();
   }
@@ -84,9 +84,9 @@ public class PartitionTest implements Serializable {
 .apply(Create.of(1, 2, 4, 5))
 .apply(Partition.of(3, new ModFn()));
 assertTrue(outputs.size() == 3);
-DataflowAssert.that(outputs.get(0)).empty();
-DataflowAssert.that(outputs.get(1)).containsInAnyOrder(1, 4);
-DataflowAssert.that(outputs.get(2)).containsInAnyOrder(2, 5);
+PAssert.that(outputs.get(0)).empty();
+PAssert.that(outputs.get(1)).containsInAnyOrder(1, 4);
+PAssert.that(outputs.get(2)).containsInAnyOrder(2, 5);
 pipeline.run();
   }
 
@@ -130,7 +130,7 @@ public class PartitionTest implements Serializable {
 assertTrue(outputs.size() == 2);
 
 PCollection output = 
outputs.apply(Flatten.pCollections());
-DataflowAssert.that(output).containsInAnyOrder(2, 4, 5, 7, 8, 10, 11);
+PAssert.that(output).containsInAnyOrder(2, 4, 5, 7, 8, 10, 11);
 pipeline.run();
   }
 

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/d73ceab8/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/transforms/RemoveDuplicatesTest.java
--
diff --git 
a/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/transforms/RemoveDuplicatesTest.java
 
b/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/transforms/RemoveDuplicatesTest.java
index 6508738..2ae4a1f 100644
--- 
a/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/transforms/RemoveDuplicatesTest.java
+++ 
b/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/transforms/RemoveDuplicatesTest.java
@@ -22,7 +22,7 @@ import static org.junit.Assert.assertTrue;
 
 import com.google.cloud.dataflow.sdk.Pipeline;
 import com.google.cloud.dataflow.sdk.coders.StringUtf8Coder;
-import com.google.cloud.dataflow.sdk.testing.DataflowAssert;
+import com.google.cloud.dataflow.sdk.testing.PAssert;
 import com.google.cloud.dataflow.sdk.testing.RunnableOnService;
 import com.google.cloud.dataflow.sdk.testing.TestPipeline;
 import com.google.cloud.dataflow.sdk.values.KV;
@@ -64,7 +64,7 @@ public class RemoveDuplicatesTest {
 PCollection output =
 input.apply(RemoveDuplicates.create());
 
-DataflowAssert.that(output)
+PAssert.that(output)
 .containsInAnyOrder("k1", "k5", "k2", "k3");
 p.run();
   }
@@ -83,7 +83,7 @@ public class RemoveDuplicatesTest {
 PCollection output =
 input.apply(RemoveDuplicates.create());
 
-DataflowAssert.that(output).empty();
+PAssert.that(output).empty();
 p.run();
   }
 
@@ -125,7 +125,7 @@ public class RemoveDuplicatesTest {
 input.apply(RemoveDuplicates.withRepresentativeValueFn(new Keys()));
 
 
-DataflowAssert.that(output).satisfies(new Checker());
+PAssert.that(output).satisfies(new Checker());
 
 p.run();
   }

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/d73ceab8/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/transforms/SampleTest.java
--
diff --git 
a/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/transforms/SampleTest.java
 
b/sdks/java/core/src/test/java/com/google/cloud/dataflow/sdk/transforms/SampleTest.java

[GitHub] incubator-beam pull request: [BEAM-155] Rename DataflowAssert to P...

2016-04-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/146


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[4/5] incubator-beam git commit: Rename DataflowAssert to PAssert

2016-04-08 Thread davor
Rename DataflowAssert to PAssert


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/d73ceab8
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/d73ceab8
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/d73ceab8

Branch: refs/heads/master
Commit: d73ceab8bc0a3f8479882bd689015fd7f869758b
Parents: 5f24cef
Author: Thomas Groh 
Authored: Thu Apr 7 15:53:15 2016 -0700
Committer: Davor Bonaci 
Committed: Fri Apr 8 13:55:01 2016 -0700

--
 .../contrib/joinlibrary/InnerJoinTest.java  |  10 +-
 .../contrib/joinlibrary/OuterLeftJoinTest.java  |  10 +-
 .../contrib/joinlibrary/OuterRightJoinTest.java |  10 +-
 examples/java/pom.xml   |   2 +-
 .../dataflow/examples/DebuggingWordCount.java   |  12 +-
 .../cloud/dataflow/examples/WordCountTest.java  |   4 +-
 .../examples/complete/AutoCompleteTest.java |   8 +-
 .../dataflow/examples/complete/TfIdfTest.java   |   4 +-
 .../complete/TopWikipediaSessionsTest.java  |   4 +-
 .../examples/cookbook/DeDupExampleTest.java |   6 +-
 .../examples/cookbook/JoinExamplesTest.java |   4 +-
 .../examples/cookbook/TriggerExampleTest.java   |   4 +-
 .../examples/complete/game/GameStatsTest.java   |   4 +-
 .../complete/game/HourlyTeamScoreTest.java  |   4 +-
 .../examples/complete/game/UserScoreTest.java   |   8 +-
 .../beam/runners/flink/FlinkTestPipeline.java   |   6 +-
 .../streaming/StreamingTransformTranslator.java |   4 +-
 .../apache/beam/runners/spark/DeDupTest.java|   4 +-
 .../beam/runners/spark/SimpleWordCountTest.java |   4 +-
 .../apache/beam/runners/spark/TfIdfTest.java|   4 +-
 .../spark/translation/DoFnOutputTest.java   |   4 +-
 .../translation/MultiOutputWordCountTest.java   |   4 +-
 .../spark/translation/SerializationTest.java|   4 +-
 .../translation/WindowedWordCountTest.java  |   8 +-
 .../streaming/FlattenStreamingTest.java |   8 +-
 .../streaming/KafkaStreamingTest.java   |   8 +-
 .../streaming/SimpleStreamingWordCountTest.java |   8 +-
 .../utils/DataflowAssertStreaming.java  |  42 -
 .../streaming/utils/PAssertStreaming.java   |  42 +
 .../dataflow/sdk/testing/DataflowAssert.java| 826 ---
 .../cloud/dataflow/sdk/testing/PAssert.java | 825 ++
 .../sdk/testing/SerializableMatcher.java|   2 +-
 .../dataflow/sdk/testing/SourceTestUtils.java   |   2 +-
 .../sdk/testing/TestDataflowPipelineRunner.java |   8 +-
 .../dataflow/sdk/testing/TestPipeline.java  |   6 +-
 .../google/cloud/dataflow/sdk/PipelineTest.java |  10 +-
 .../dataflow/sdk/coders/AvroCoderTest.java  |   4 +-
 .../sdk/coders/SerializableCoderTest.java   |   4 +-
 .../cloud/dataflow/sdk/io/AvroIOTest.java   |   6 +-
 .../io/BoundedReadFromUnboundedSourceTest.java  |   4 +-
 .../dataflow/sdk/io/CompressedSourceTest.java   |  10 +-
 .../dataflow/sdk/io/CountingInputTest.java  |  12 +-
 .../dataflow/sdk/io/CountingSourceTest.java |  14 +-
 .../dataflow/sdk/io/FileBasedSourceTest.java|   6 +-
 .../cloud/dataflow/sdk/io/TextIOTest.java   |   8 +-
 .../cloud/dataflow/sdk/io/XmlSourceTest.java|   8 +-
 .../sdk/io/bigtable/BigtableIOTest.java |   8 +-
 .../sdk/runners/dataflow/CustomSourcesTest.java |   6 +-
 .../runners/inprocess/InProcessCreateTest.java  |   8 +-
 .../inprocess/InProcessPipelineRunnerTest.java  |   4 +-
 .../sdk/testing/DataflowAssertTest.java | 327 
 .../cloud/dataflow/sdk/testing/PAssertTest.java | 331 
 .../testing/TestDataflowPipelineRunnerTest.java |  24 +-
 .../transforms/ApproximateQuantilesTest.java|  10 +-
 .../sdk/transforms/ApproximateUniqueTest.java   |  12 +-
 .../dataflow/sdk/transforms/CombineFnsTest.java |  12 +-
 .../dataflow/sdk/transforms/CombineTest.java|  66 +-
 .../dataflow/sdk/transforms/CountTest.java  |  10 +-
 .../dataflow/sdk/transforms/CreateTest.java |  18 +-
 .../dataflow/sdk/transforms/FilterTest.java |  18 +-
 .../sdk/transforms/FlatMapElementsTest.java |   4 +-
 .../dataflow/sdk/transforms/FlattenTest.java|  22 +-
 .../dataflow/sdk/transforms/GroupByKeyTest.java |   8 +-
 .../cloud/dataflow/sdk/transforms/KeysTest.java |   6 +-
 .../dataflow/sdk/transforms/KvSwapTest.java |   6 +-
 .../sdk/transforms/MapElementsTest.java |   6 +-
 .../dataflow/sdk/transforms/ParDoTest.java  |  66 +-
 .../dataflow/sdk/transforms/PartitionTest.java  |  14 +-
 .../sdk/transforms/RemoveDuplicatesTest.java|   8 +-
 .../dataflow/sdk/transforms/SampleTest.java |  14 +-
 .../cloud/dataflow/sdk/transforms/TopTest.java  |  32 +-
 .../dataflow/sdk/transforms/ValuesTest.java |   6 +-
 .../cloud/dataflow/sdk/transforms/ViewTest.java |  72 +-
 .../dataflow/sdk/transforms/WithKeysTest.java   |   8 +-
 

[3/5] incubator-beam git commit: Rename DataflowAssert to PAssert

2016-04-08 Thread davor
http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/d73ceab8/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/testing/DataflowAssert.java
--
diff --git 
a/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/testing/DataflowAssert.java
 
b/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/testing/DataflowAssert.java
deleted file mode 100644
index 8f4c066..000
--- 
a/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/testing/DataflowAssert.java
+++ /dev/null
@@ -1,826 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package com.google.cloud.dataflow.sdk.testing;
-
-import static org.hamcrest.Matchers.containsInAnyOrder;
-import static org.hamcrest.Matchers.equalTo;
-import static org.hamcrest.Matchers.not;
-import static org.junit.Assert.assertThat;
-
-import com.google.cloud.dataflow.sdk.Pipeline;
-import com.google.cloud.dataflow.sdk.coders.Coder;
-import com.google.cloud.dataflow.sdk.coders.CoderException;
-import com.google.cloud.dataflow.sdk.coders.IterableCoder;
-import com.google.cloud.dataflow.sdk.coders.KvCoder;
-import com.google.cloud.dataflow.sdk.coders.MapCoder;
-import com.google.cloud.dataflow.sdk.coders.VoidCoder;
-import com.google.cloud.dataflow.sdk.options.StreamingOptions;
-import com.google.cloud.dataflow.sdk.runners.PipelineRunner;
-import com.google.cloud.dataflow.sdk.transforms.Aggregator;
-import com.google.cloud.dataflow.sdk.transforms.Create;
-import com.google.cloud.dataflow.sdk.transforms.DoFn;
-import com.google.cloud.dataflow.sdk.transforms.PTransform;
-import com.google.cloud.dataflow.sdk.transforms.ParDo;
-import com.google.cloud.dataflow.sdk.transforms.SerializableFunction;
-import com.google.cloud.dataflow.sdk.transforms.Sum;
-import com.google.cloud.dataflow.sdk.transforms.View;
-import com.google.cloud.dataflow.sdk.transforms.windowing.GlobalWindows;
-import com.google.cloud.dataflow.sdk.transforms.windowing.Window;
-import com.google.cloud.dataflow.sdk.util.CoderUtils;
-import com.google.cloud.dataflow.sdk.values.KV;
-import com.google.cloud.dataflow.sdk.values.PBegin;
-import com.google.cloud.dataflow.sdk.values.PCollection;
-import com.google.cloud.dataflow.sdk.values.PCollectionView;
-import com.google.cloud.dataflow.sdk.values.PDone;
-import com.google.common.base.Optional;
-import com.google.common.collect.Iterables;
-import com.google.common.collect.Lists;
-
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
-
-import java.io.Serializable;
-import java.util.Arrays;
-import java.util.Collection;
-import java.util.Collections;
-import java.util.List;
-import java.util.Map;
-import java.util.NoSuchElementException;
-
-/**
- * An assertion on the contents of a {@link PCollection}
- * incorporated into the pipeline.  Such an assertion
- * can be checked no matter what kind of {@link PipelineRunner} is
- * used.
- *
- * Note that the {@code DataflowAssert} call must precede the call
- * to {@link Pipeline#run}.
- *
- * Examples of use:
- * {@code
- * Pipeline p = TestPipeline.create();
- * ...
- * PCollection output =
- *  input
- *  .apply(ParDo.of(new TestDoFn()));
- * DataflowAssert.that(output)
- * .containsInAnyOrder("out1", "out2", "out3");
- * ...
- * PCollection ints = ...
- * PCollection sum =
- * ints
- * .apply(Combine.globally(new SumInts()));
- * DataflowAssert.that(sum)
- * .is(42);
- * ...
- * p.run();
- * }
- *
- * JUnit and Hamcrest must be linked in by any code that uses 
DataflowAssert.
- */
-public class DataflowAssert {
-
-  private static final Logger LOG = 
LoggerFactory.getLogger(DataflowAssert.class);
-
-  static final String SUCCESS_COUNTER = "DataflowAssertSuccess";
-  static final String FAILURE_COUNTER = "DataflowAssertFailure";
-
-  private static int assertCount = 0;
-
-  // Do not instantiate.
-  private DataflowAssert() {}
-
-  /**
-   * Constructs an {@link IterableAssert} for the elements of the provided
-   * {@link PCollection}.
-   */
-  public static  IterableAssert that(PCollection actual) {
-return new IterableAssert<>(
-new CreateActual(actual, View.asIterable()),
- 

[GitHub] incubator-beam pull request: Exclude more IDE files from Spark run...

2016-04-08 Thread kennknowles
GitHub user kennknowles opened a pull request:

https://github.com/apache/incubator-beam/pull/152

Exclude more IDE files from Spark runner RAT check

It also seems that some of the prior rules were overly broad, excluding 
`*.filename` when they only should be excluding dotfiles.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/incubator-beam rat-ignore

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/152.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #152


commit 97af4df35218e24a7c1570036178d8b37ae7ecf4
Author: Kenneth Knowles 
Date:   2016-04-08T20:39:22Z

Exclude more IDE files from Spark runner RAT check




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] incubator-beam-site git commit: [BEAM-172] Link fixes

2016-04-08 Thread jamesmalone
[BEAM-172] Link fixes

This closes #11


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/repo
Commit: 
http://git-wip-us.apache.org/repos/asf/incubator-beam-site/commit/0ae64c65
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/tree/0ae64c65
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/diff/0ae64c65

Branch: refs/heads/asf-site
Commit: 0ae64c654c9e0d119123353ee01ea870d856fea8
Parents: 38d8256 4d374e8
Author: James Malone 
Authored: Fri Apr 8 11:56:47 2016 -0700
Committer: James Malone 
Committed: Fri Apr 8 11:56:47 2016 -0700

--
 content/beam/capability/2016/03/17/capability-matrix.html| 6 +-
 .../beam/capability/2016/04/03/presentation-materials.html   | 1 +
 content/beam/python/sdk/2016/02/25/beam-has-a-logo0.html | 6 +-
 content/beam/update/website/2016/02/22/beam-has-a-logo.html  | 6 +-
 content/blog/index.html  | 1 +
 content/capability-matrix/index.html | 8 ++--
 content/contribution-guide/index.html| 6 +-
 content/feed.xml | 4 ++--
 content/getting_started/index.html   | 6 +-
 content/index.html   | 1 +
 content/issue_tracking/index.html| 6 +-
 content/mailing_lists/index.html | 6 +-
 content/presentation-materials/index.html| 3 ++-
 content/privacy_policy/index.html| 6 +-
 content/source_repository/index.html | 6 +-
 content/team/index.html  | 6 +-
 16 files changed, 63 insertions(+), 15 deletions(-)
--




[1/2] incubator-beam-site git commit: Fix for missing links

2016-04-08 Thread jamesmalone
Repository: incubator-beam-site
Updated Branches:
  refs/heads/asf-site 38d8256d8 -> 0ae64c654


Fix for missing links


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/repo
Commit: 
http://git-wip-us.apache.org/repos/asf/incubator-beam-site/commit/4d374e87
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/tree/4d374e87
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/diff/4d374e87

Branch: refs/heads/asf-site
Commit: 4d374e875f2f203b276aba2a36b35301d32fb96f
Parents: 38d8256
Author: James Malone 
Authored: Fri Apr 8 10:43:31 2016 -0700
Committer: James Malone 
Committed: Fri Apr 8 10:43:31 2016 -0700

--
 content/beam/capability/2016/03/17/capability-matrix.html| 6 +-
 .../beam/capability/2016/04/03/presentation-materials.html   | 1 +
 content/beam/python/sdk/2016/02/25/beam-has-a-logo0.html | 6 +-
 content/beam/update/website/2016/02/22/beam-has-a-logo.html  | 6 +-
 content/blog/index.html  | 1 +
 content/capability-matrix/index.html | 8 ++--
 content/contribution-guide/index.html| 6 +-
 content/feed.xml | 4 ++--
 content/getting_started/index.html   | 6 +-
 content/index.html   | 1 +
 content/issue_tracking/index.html| 6 +-
 content/mailing_lists/index.html | 6 +-
 content/presentation-materials/index.html| 3 ++-
 content/privacy_policy/index.html| 6 +-
 content/source_repository/index.html | 6 +-
 content/team/index.html  | 6 +-
 16 files changed, 63 insertions(+), 15 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/4d374e87/content/beam/capability/2016/03/17/capability-matrix.html
--
diff --git a/content/beam/capability/2016/03/17/capability-matrix.html 
b/content/beam/capability/2016/03/17/capability-matrix.html
index be15c84..6fb26e2 100644
--- a/content/beam/capability/2016/03/17/capability-matrix.html
+++ b/content/beam/capability/2016/03/17/capability-matrix.html
@@ -44,7 +44,10 @@
   Documentation 
   
 Getting Started
-   Capability Matrix
+Presentation 
Materials
+
+Technical Documentation
+Capability Matrix
 https://goo.gl/ps8twC;>Technical Docs
 https://goo.gl/nk5OM0;>Technical Vision
   
@@ -57,6 +60,7 @@
 https://goo.gl/ps8twC;>Technical Docs
 https://goo.gl/nk5OM0;>Technical Vision
 Apache Beam Team
+Public Meetings
 
 Contribute
 Contribution Guide

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/4d374e87/content/beam/capability/2016/04/03/presentation-materials.html
--
diff --git a/content/beam/capability/2016/04/03/presentation-materials.html 
b/content/beam/capability/2016/04/03/presentation-materials.html
index 96f6dac..493cf55 100644
--- a/content/beam/capability/2016/04/03/presentation-materials.html
+++ b/content/beam/capability/2016/04/03/presentation-materials.html
@@ -60,6 +60,7 @@
 https://goo.gl/ps8twC;>Technical Docs
 https://goo.gl/nk5OM0;>Technical Vision
 Apache Beam Team
+Public Meetings
 
 Contribute
 Contribution Guide

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/4d374e87/content/beam/python/sdk/2016/02/25/beam-has-a-logo0.html
--
diff --git a/content/beam/python/sdk/2016/02/25/beam-has-a-logo0.html 
b/content/beam/python/sdk/2016/02/25/beam-has-a-logo0.html
index 2f833d7..e4066e6 100644
--- a/content/beam/python/sdk/2016/02/25/beam-has-a-logo0.html
+++ b/content/beam/python/sdk/2016/02/25/beam-has-a-logo0.html
@@ -44,7 +44,10 @@
   Documentation 
   
 Getting Started
-   Capability Matrix
+Presentation 
Materials
+
+Technical Documentation
+Capability Matrix
 https://goo.gl/ps8twC;>Technical Docs
 https://goo.gl/nk5OM0;>Technical Vision
   
@@ -57,6 +60,7 @@
 https://goo.gl/ps8twC;>Technical Docs
 https://goo.gl/nk5OM0;>Technical Vision
 Apache Beam Team
+Public Meetings
 
 Contribute
 Contribution 

[GitHub] incubator-beam pull request: [BEAM-155] Make PassThroughWindowFn a...

2016-04-08 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/incubator-beam/pull/150

[BEAM-155] Make PassThroughWindowFn and NeverTrigger available

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace "" in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

This will be used as part of the new PAssert

PassThroughWindowFn remains package-private to restrict usage.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/incubator-beam pass_through_window_fn

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/150.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #150


commit 2d8189ff2948e8b294e253f087d741e4d122658b
Author: Thomas Groh 
Date:   2016-03-31T21:34:06Z

Make PassThroughWindowFn and NeverTrigger available

This will be used as part of the new PAssert

PassThroughWindowFn remains package-private to restrict usage.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-155) Support asserting the contents of windows and panes in PAssert

2016-04-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232713#comment-15232713
 ] 

ASF GitHub Bot commented on BEAM-155:
-

GitHub user tgroh opened a pull request:

https://github.com/apache/incubator-beam/pull/150

[BEAM-155] Make PassThroughWindowFn and NeverTrigger available

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace "" in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

This will be used as part of the new PAssert

PassThroughWindowFn remains package-private to restrict usage.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/incubator-beam pass_through_window_fn

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/150.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #150


commit 2d8189ff2948e8b294e253f087d741e4d122658b
Author: Thomas Groh 
Date:   2016-03-31T21:34:06Z

Make PassThroughWindowFn and NeverTrigger available

This will be used as part of the new PAssert

PassThroughWindowFn remains package-private to restrict usage.




> Support asserting the contents of windows and panes in PAssert
> --
>
> Key: BEAM-155
> URL: https://issues.apache.org/jira/browse/BEAM-155
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Thomas Groh
>Assignee: Thomas Groh
>
> This consists of reifying the output windows and panes, and running asserts 
> per-window about the contents of panes.
> This includes aggregated matching and final pane matching, e.g.
> PAssert.that(output).byOnTimePane().hasOutputElements(foo, bar);
> // For discarding mode - could have emitted (say) [spam, eggs], [spam], [], 
> [sausage], []
> PAssert.that(output).byFinalPane().hasOutputElements(spam, eggs, sausage, 
> spam);
> // For accumulating mode without late data
> PAssert.that(output).finalPane().containsInAnyOrder(spam, eggs, sausage, 
> spam);
> // For accumulating mode with late data
> PAssert.that(output).finalPane().containsInAnyOrder(foo, 
> bar).mayAlsoContain(baz, rab);
> See also: 
> https://docs.google.com/document/d/1fZUUbG2LxBtqCVabQshldXIhkMcXepsbv2vuuny8Ix4/edit#



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request: [BEAM-22] Add WindowIntoEvaluatorFact...

2016-04-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/149


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-22) DirectPipelineRunner: support for unbounded collections

2016-04-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-22?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232676#comment-15232676
 ] 

ASF GitHub Bot commented on BEAM-22:


Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/149


> DirectPipelineRunner: support for unbounded collections
> ---
>
> Key: BEAM-22
> URL: https://issues.apache.org/jira/browse/BEAM-22
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-direct
>Reporter: Davor Bonaci
>Assignee: Thomas Groh
>
> DirectPipelineRunner currently runs over bounded PCollections only, and 
> implements only a portion of the Beam Model.
> We should improve it to faithfully implement the full Beam Model, such as add 
> ability to run over unbounded PCollections, and better resemble execution 
> model in a distributed system.
> This further enables features such as a testing source which may simulate 
> late data and test triggers in the pipeline. Finally, we may want to expose 
> an option to select between "debug" (single threaded), "chaos monkey" (test 
> as many model requirements as possible), and "performance" (multi-threaded).
> more testing (chaos monkey) 
> Once this is done, we should update this StackOverflow question:
> http://stackoverflow.com/questions/35350113/testing-triggers-with-processing-time/35401426#35401426



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[2/2] incubator-beam git commit: This merges #149

2016-04-08 Thread kenn
This merges #149


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/5f24cef2
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/5f24cef2
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/5f24cef2

Branch: refs/heads/master
Commit: 5f24cef20d481c83b3d79a2adf73595f3a34a586
Parents: 6348a1f 64a8fb7
Author: Kenneth Knowles 
Authored: Fri Apr 8 11:17:40 2016 -0700
Committer: Kenneth Knowles 
Committed: Fri Apr 8 11:17:40 2016 -0700

--
 .../PassthroughTransformEvaluator.java  |  49 
 .../inprocess/TransformEvaluatorRegistry.java   |   2 +
 .../inprocess/WindowEvaluatorFactory.java   | 130 +++
 .../inprocess/WindowEvaluatorFactoryTest.java   | 221 +++
 4 files changed, 402 insertions(+)
--




Build failed in Jenkins: beam_RunnableOnService_GoogleCloudDataflow #55

2016-04-08 Thread Apache Jenkins Server
See 


Changes:

[bchambers] Additional APIs for registering DisplayData

--
Started by an SCM change
[EnvInject] - Loading node environment variables.
Building remotely on beam2 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/incubator-beam.git # 
 > timeout=10
Fetching upstream changes from https://github.com/apache/incubator-beam.git
 > git --version # timeout=10
 > git -c core.askpass=true fetch --tags --progress 
 > https://github.com/apache/incubator-beam.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git rev-parse refs/remotes/origin/master^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/master^{commit} # timeout=10
Checking out Revision 245cffa5e96663707f7d38c6f2b5e529573e7cae 
(refs/remotes/origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 245cffa5e96663707f7d38c6f2b5e529573e7cae
 > git rev-list d827e1b9614a8872a85c7d31e7605900acc6c59a # timeout=10
Parsing POMs
Established TCP socket on 54365
maven32-agent.jar already up to date
maven32-interceptor.jar already up to date
maven3-interceptor-commons.jar already up to date
[beam_RunnableOnService_GoogleCloudDataflow] $ 
/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_66/bin/java -Xmx2g 
-Xms256m -XX:MaxPermSize=512m -cp 
/home/jenkins/jenkins-slave/maven32-agent.jar:/home/jenkins/jenkins-slave/tools/hudson.tasks.Maven_MavenInstallation/maven-3.3.3/boot/plexus-classworlds-2.5.2.jar:/home/jenkins/jenkins-slave/tools/hudson.tasks.Maven_MavenInstallation/maven-3.3.3/conf/logging
 jenkins.maven3.agent.Maven32Main 
/home/jenkins/jenkins-slave/tools/hudson.tasks.Maven_MavenInstallation/maven-3.3.3
 /home/jenkins/jenkins-slave/slave.jar 
/home/jenkins/jenkins-slave/maven32-interceptor.jar 
/home/jenkins/jenkins-slave/maven3-interceptor-commons.jar 54365
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; 
support was removed in 8.0
<===[JENKINS REMOTING CAPACITY]===>   channel started
Executing Maven:  -B -f 

 
-Dmaven.repo.local=
 -B -e clean verify -pl .,sdks/java/core -P DataflowPipelineTests 
-DdataflowOptions=[ "--project=apache-beam-testing", 
"--stagingLocation=gs://staging-for-runnable-on-service-jenkins-tests/" ]
[INFO] Error stacktraces are turned on.
[INFO] Scanning for projects...
[INFO] 
[INFO] Reactor Build Order:
[INFO] 
[INFO] Apache Beam :: Parent
[INFO] Apache Beam :: SDKs :: Java :: Core
[INFO] 
[INFO] 
[INFO] Building Apache Beam :: Parent 0.1.0-incubating-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ parent ---
[INFO] Deleting 

[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ parent ---
[INFO] 
[INFO] --- maven-site-plugin:3.4:attach-descriptor (attach-descriptor) @ parent 
---
[INFO] 
[INFO] 
[INFO] Building Apache Beam :: SDKs :: Java :: Core 0.1.0-incubating-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ java-sdk-all ---
[INFO] Deleting 

[INFO] 
[INFO] --- jacoco-maven-plugin:0.7.5.201505241946:prepare-agent (default) @ 
java-sdk-all ---
[INFO] argLine set to 
-javaagent:
[INFO] 
[INFO] --- avro-maven-plugin:1.7.7:schema (schemas) @ java-sdk-all ---
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ java-sdk-all 
---
[INFO] 
[INFO] --- maven-resources-plugin:2.7:resources (default-resources) @ 
java-sdk-all ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO] Copying 3 resources
[INFO] 

Build failed in Jenkins: beam_RunnableOnService_GoogleCloudDataflow » Apache Beam :: SDKs :: Java :: Core #55

2016-04-08 Thread Apache Jenkins Server
See 


Changes:

[bchambers] Additional APIs for registering DisplayData

--
[INFO] 
[INFO] 
[INFO] Building Apache Beam :: SDKs :: Java :: Core 0.1.0-incubating-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ java-sdk-all ---
[INFO] Deleting 

[INFO] 
[INFO] --- jacoco-maven-plugin:0.7.5.201505241946:prepare-agent (default) @ 
java-sdk-all ---
[INFO] argLine set to 
-javaagent:/home/jenkins/jenkins-slave/workspace/beam_RunnableOnService_GoogleCloudDataflow/.repository/org/jacoco/org.jacoco.agent/0.7.5.201505241946/org.jacoco.agent-0.7.5.201505241946-runtime.jar=destfile=
[INFO] 
[INFO] --- avro-maven-plugin:1.7.7:schema (schemas) @ java-sdk-all ---
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ java-sdk-all 
---
[INFO] 
[INFO] --- maven-resources-plugin:2.7:resources (default-resources) @ 
java-sdk-all ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ java-sdk-all 
---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 467 source files to 

[WARNING] 
:
 Some input files use or override a deprecated API.
[WARNING] 
:
 Recompile with -Xlint:deprecation for details.
[WARNING] 
:
 Some input files use unchecked or unsafe operations.
[WARNING] 
:
 Recompile with -Xlint:unchecked for details.
[INFO] 
[INFO] >>> maven-source-plugin:2.4:jar (attach-sources) > generate-sources @ 
java-sdk-all >>>
[INFO] 
[INFO] --- jacoco-maven-plugin:0.7.5.201505241946:prepare-agent (default) @ 
java-sdk-all ---
[INFO] argLine set to 
-javaagent:/home/jenkins/jenkins-slave/workspace/beam_RunnableOnService_GoogleCloudDataflow/.repository/org/jacoco/org.jacoco.agent/0.7.5.201505241946/org.jacoco.agent-0.7.5.201505241946-runtime.jar=destfile=
[INFO] 
[INFO] --- avro-maven-plugin:1.7.7:schema (schemas) @ java-sdk-all ---
[WARNING] Failed to getClass for org.apache.maven.plugin.source.SourceJarMojo
[INFO] 
[INFO] <<< maven-source-plugin:2.4:jar (attach-sources) < generate-sources @ 
java-sdk-all <<<
[INFO] 
[INFO] --- maven-source-plugin:2.4:jar (attach-sources) @ java-sdk-all ---
[INFO] Building jar: 

[INFO] 
[INFO] --- build-helper-maven-plugin:1.9.1:add-test-source (add-test-source) @ 
java-sdk-all ---
[INFO] Test Source directory: 

 added.
[INFO] 
[INFO] --- maven-resources-plugin:2.7:testResources (default-testResources) @ 
java-sdk-all ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 

[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
java-sdk-all ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 271 source files to 

[WARNING] 

[GitHub] incubator-beam pull request: [BEAM-22] Add WindowIntoEvaluatorFact...

2016-04-08 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/incubator-beam/pull/149

[BEAM-22] Add WindowIntoEvaluatorFactory

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace "" in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

This is a TransformEvaluator-based implementation of the Window#into
primitve, as opposed to the DoFn ProcessContext internals-based
implementation.

Part of the new Runner API, in which Window.Bound is a primitive.
This implementation does not require access to internals within DoFn.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/incubator-beam ippr_window_primitive

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/149.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #149


commit 3f5762579389d150a89f9e7f871f024daa504430
Author: Thomas Groh 
Date:   2016-04-08T16:29:31Z

Add WindowIntoEvaluatorFactory

This is a TransformEvaluator-based implementation of the Window#into
primitve, as opposed to the DoFn ProcessContext internals-based
implementation.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Jenkins build is unstable: beam_RunnableOnService_GoogleCloudDataflow » Apache Beam :: SDKs :: Java :: Core #56

2016-04-08 Thread Apache Jenkins Server
See 




Re: [jira] [Created] (BEAM-178) stdout vs logging in DataflowPipelineRunner?

2016-04-08 Thread Lukasz Cwik
I think this was because we wanted to make sure the user would get at least
the fact that they submitted a pipeline even if they haven't setup logging
correctly.

On Wed, Apr 6, 2016 at 2:19 PM, Daniel Halperin (JIRA) 
wrote:

> Daniel Halperin created BEAM-178:
> 
>
>  Summary: stdout vs logging in DataflowPipelineRunner?
>  Key: BEAM-178
>  URL: https://issues.apache.org/jira/browse/BEAM-178
>  Project: Beam
>   Issue Type: Bug
>   Components: runner-dataflow
> Reporter: Daniel Halperin
> Assignee: Davor Bonaci
> Priority: Minor
>
>
> We seem to thoroughly intermingle logging and println. Is this deliberate?
>
> e.g.,
>
> {code}
> LOG.info("To access the Dataflow monitoring console, please navigate
> to {}",
> MonitoringUtil.getJobMonitoringPageURL(options.getProject(),
> jobResult.getId()));
> System.out.println("Submitted job: " + jobResult.getId());
> {code}
>
> Original genesis for this was noticing a println in a backport Cl:
> https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/DataflowPipelineRunner.java#L451
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
>


[jira] [Commented] (BEAM-22) DirectPipelineRunner: support for unbounded collections

2016-04-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-22?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232529#comment-15232529
 ] 

ASF GitHub Bot commented on BEAM-22:


GitHub user tgroh opened a pull request:

https://github.com/apache/incubator-beam/pull/148

[BEAM-22] Add ShardControlledWrite override

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace "" in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

This is used for TextIO and AvroIO, which provide withNumOutputShards
methods to control the number of output files. Apply this override in
the InProcessPipelineRunner.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/incubator-beam ippr_avro_text_sharding

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/148.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #148


commit d0697ecc61da1cc9ddc783d46fa9e42068ae1925
Author: Thomas Groh 
Date:   2016-03-29T17:06:02Z

Add ShardControlledWrite override

This is used for TextIO and AvroIO, which provide withNumOutputShards
methods to control the number of output files. Apply this override in
the InProcessPipelineRunner.




> DirectPipelineRunner: support for unbounded collections
> ---
>
> Key: BEAM-22
> URL: https://issues.apache.org/jira/browse/BEAM-22
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-direct
>Reporter: Davor Bonaci
>Assignee: Thomas Groh
>
> DirectPipelineRunner currently runs over bounded PCollections only, and 
> implements only a portion of the Beam Model.
> We should improve it to faithfully implement the full Beam Model, such as add 
> ability to run over unbounded PCollections, and better resemble execution 
> model in a distributed system.
> This further enables features such as a testing source which may simulate 
> late data and test triggers in the pipeline. Finally, we may want to expose 
> an option to select between "debug" (single threaded), "chaos monkey" (test 
> as many model requirements as possible), and "performance" (multi-threaded).
> more testing (chaos monkey) 
> Once this is done, we should update this StackOverflow question:
> http://stackoverflow.com/questions/35350113/testing-triggers-with-processing-time/35401426#35401426



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-22) DirectPipelineRunner: support for unbounded collections

2016-04-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-22?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15232523#comment-15232523
 ] 

ASF GitHub Bot commented on BEAM-22:


Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/131


> DirectPipelineRunner: support for unbounded collections
> ---
>
> Key: BEAM-22
> URL: https://issues.apache.org/jira/browse/BEAM-22
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-direct
>Reporter: Davor Bonaci
>Assignee: Thomas Groh
>
> DirectPipelineRunner currently runs over bounded PCollections only, and 
> implements only a portion of the Beam Model.
> We should improve it to faithfully implement the full Beam Model, such as add 
> ability to run over unbounded PCollections, and better resemble execution 
> model in a distributed system.
> This further enables features such as a testing source which may simulate 
> late data and test triggers in the pipeline. Finally, we may want to expose 
> an option to select between "debug" (single threaded), "chaos monkey" (test 
> as many model requirements as possible), and "performance" (multi-threaded).
> more testing (chaos monkey) 
> Once this is done, we should update this StackOverflow question:
> http://stackoverflow.com/questions/35350113/testing-triggers-with-processing-time/35401426#35401426



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[2/2] incubator-beam git commit: This closes #131

2016-04-08 Thread kenn
This closes #131


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/6348a1fe
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/6348a1fe
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/6348a1fe

Branch: refs/heads/master
Commit: 6348a1fe24683d531e053d5be1e56e38635d2957
Parents: 245cffa d1486c6
Author: Kenneth Knowles 
Authored: Fri Apr 8 10:28:32 2016 -0700
Committer: Kenneth Knowles 
Committed: Fri Apr 8 10:28:32 2016 -0700

--
 .../inprocess/GroupByKeyEvaluatorFactory.java   | 21 +-
 .../sdk/runners/inprocess/InProcessCreate.java  | 18 
 .../inprocess/InProcessPipelineRunner.java  | 43 +++-
 .../inprocess/PTransformOverrideFactory.java| 33 +++
 .../runners/inprocess/ViewEvaluatorFactory.java | 28 +++--
 5 files changed, 110 insertions(+), 33 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/6348a1fe/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessCreate.java
--

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/6348a1fe/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessPipelineRunner.java
--
diff --cc 
sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessPipelineRunner.java
index f5b7f3c,9ce4430..fa93994
--- 
a/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessPipelineRunner.java
+++ 
b/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessPipelineRunner.java
@@@ -33,9 -35,7 +35,8 @@@ import com.google.cloud.dataflow.sdk.tr
  import com.google.cloud.dataflow.sdk.transforms.Create;
  import com.google.cloud.dataflow.sdk.transforms.GroupByKey;
  import com.google.cloud.dataflow.sdk.transforms.PTransform;
 +import com.google.cloud.dataflow.sdk.transforms.ParDo;
  import com.google.cloud.dataflow.sdk.transforms.View.CreatePCollectionView;
- import com.google.cloud.dataflow.sdk.util.InstanceBuilder;
  import com.google.cloud.dataflow.sdk.util.MapAggregatorValues;
  import com.google.cloud.dataflow.sdk.util.TimerInternals.TimerData;
  import com.google.cloud.dataflow.sdk.util.UserCodeException;



[GitHub] incubator-beam pull request: [BEAM-22] Add PTransformOverride clas...

2016-04-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/131


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] incubator-beam git commit: Add PTransformOverrideFactory

2016-04-08 Thread kenn
Repository: incubator-beam
Updated Branches:
  refs/heads/master 245cffa5e -> 6348a1fe2


Add PTransformOverrideFactory

This constructs an appropriate custom PTransform override for the runner
it is meant to be used in, which is applied in place of the original
PTransform.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/d1486c66
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/d1486c66
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/d1486c66

Branch: refs/heads/master
Commit: d1486c66619891903ed4c0c57c354af06ffbd23c
Parents: 1aee017
Author: Thomas Groh 
Authored: Tue Mar 29 10:03:25 2016 -0700
Committer: Kenneth Knowles 
Committed: Fri Apr 8 10:27:50 2016 -0700

--
 .../inprocess/GroupByKeyEvaluatorFactory.java   | 21 +-
 .../sdk/runners/inprocess/InProcessCreate.java  | 18 +
 .../inprocess/InProcessPipelineRunner.java  | 41 +++-
 .../inprocess/PTransformOverrideFactory.java| 33 
 .../runners/inprocess/ViewEvaluatorFactory.java | 28 +++--
 5 files changed, 109 insertions(+), 32 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/d1486c66/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/GroupByKeyEvaluatorFactory.java
--
diff --git 
a/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/GroupByKeyEvaluatorFactory.java
 
b/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/GroupByKeyEvaluatorFactory.java
index 0518eec..4b478ad 100644
--- 
a/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/GroupByKeyEvaluatorFactory.java
+++ 
b/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/GroupByKeyEvaluatorFactory.java
@@ -42,6 +42,8 @@ import com.google.cloud.dataflow.sdk.util.WindowedValue;
 import com.google.cloud.dataflow.sdk.util.WindowingStrategy;
 import com.google.cloud.dataflow.sdk.values.KV;
 import com.google.cloud.dataflow.sdk.values.PCollection;
+import com.google.cloud.dataflow.sdk.values.PInput;
+import com.google.cloud.dataflow.sdk.values.POutput;
 import com.google.common.annotations.VisibleForTesting;
 
 import java.util.ArrayList;
@@ -179,10 +181,27 @@ class GroupByKeyEvaluatorFactory implements 
TransformEvaluatorFactory {
   }
 
   /**
+   * A {@link PTransformOverrideFactory} for {@link GroupByKey} PTransforms.
+   */
+  public static final class InProcessGroupByKeyOverrideFactory
+  implements PTransformOverrideFactory {
+@Override
+public  PTransform override(
+PTransform transform) {
+  if (transform instanceof GroupByKey) {
+@SuppressWarnings({"rawtypes", "unchecked"})
+PTransform override = new 
InProcessGroupByKey((GroupByKey) transform);
+return override;
+  }
+  return transform;
+}
+  }
+
+  /**
* An in-memory implementation of the {@link GroupByKey} primitive as a 
composite
* {@link PTransform}.
*/
-  public static final class InProcessGroupByKey
+  private static final class InProcessGroupByKey
   extends ForwardingPTransform>, PCollection>> {
 private final GroupByKey original;
 

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/d1486c66/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessCreate.java
--
diff --git 
a/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessCreate.java
 
b/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessCreate.java
index efda8fc..00587ea 100644
--- 
a/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessCreate.java
+++ 
b/sdks/java/core/src/main/java/com/google/cloud/dataflow/sdk/runners/inprocess/InProcessCreate.java
@@ -29,6 +29,7 @@ import com.google.cloud.dataflow.sdk.transforms.PTransform;
 import com.google.cloud.dataflow.sdk.util.CoderUtils;
 import com.google.cloud.dataflow.sdk.values.PCollection;
 import com.google.cloud.dataflow.sdk.values.PInput;
+import com.google.cloud.dataflow.sdk.values.POutput;
 import com.google.common.annotations.VisibleForTesting;
 import com.google.common.base.Optional;
 import com.google.common.collect.ImmutableList;
@@ -53,6 +54,23 @@ import javax.annotation.Nullable;
 class InProcessCreate extends ForwardingPTransform {
   private final Create.Values original;
 
+  /**
+   * A {@link PTransformOverrideFactory} for {@link 

[4/4] incubator-beam-site git commit: [BEAM-172] Public meetings page

2016-04-08 Thread jamesmalone
[BEAM-172] Public meetings page

This closes #10


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/repo
Commit: 
http://git-wip-us.apache.org/repos/asf/incubator-beam-site/commit/38d8256d
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/tree/38d8256d
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/diff/38d8256d

Branch: refs/heads/asf-site
Commit: 38d8256d8cf8886ecf80937a26e4dd8c5b29606b
Parents: ece3515 4001a86
Author: James Malone 
Authored: Fri Apr 8 10:07:05 2016 -0700
Committer: James Malone 
Committed: Fri Apr 8 10:07:05 2016 -0700

--
 _data/meetings.yml |  21 
 _includes/header.html  |   1 +
 _pages/public-meetings.md  |  55 ++
 content/public-meetings/index.html | 181 
 4 files changed, 258 insertions(+)
--




[1/4] incubator-beam-site git commit: Addition of public meetings page

2016-04-08 Thread jamesmalone
Repository: incubator-beam-site
Updated Branches:
  refs/heads/asf-site ece351589 -> 38d8256d8


Addition of public meetings page


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/repo
Commit: 
http://git-wip-us.apache.org/repos/asf/incubator-beam-site/commit/970a4fd3
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/tree/970a4fd3
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam-site/diff/970a4fd3

Branch: refs/heads/asf-site
Commit: 970a4fd3a82bc1ced2c800c3c9c3f569f62f19c9
Parents: ece3515
Author: James Malone 
Authored: Tue Apr 5 18:22:24 2016 -0700
Committer: James Malone 
Committed: Tue Apr 5 18:22:24 2016 -0700

--
 _data/meetings.yml |  21 
 _includes/header.html  |   1 +
 _pages/public-meetings.md  |  55 ++
 content/public-meetings/index.html | 181 
 4 files changed, 258 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/970a4fd3/_data/meetings.yml
--
diff --git a/_data/meetings.yml b/_data/meetings.yml
new file mode 100644
index 000..f72e4c8
--- /dev/null
+++ b/_data/meetings.yml
@@ -0,0 +1,21 @@
+events:
+- date: 2016/04/01
+  time: "9:30 - 16:00 Pacific"
+  location: PayPalSan Jose, CA, USA
+  type: Dev/PPMC Meeting
+  materials:
+- title: Presentation - PPMC Deep Dive
+  link: 
"https://docs.google.com/presentation/d/1uTb7dx4-Y2OM_B0_3XF_whwAL2FlDTTuq2QzP9sJ4Mg/edit?usp=sharing;
+
+- title: Notes - PPMC Deep Dive
+  link: 
"https://docs.google.com/document/d/1SXSLj7FMIgKqj43nTcczFpJzqASeUMUCpbyklk2fBkg/edit?usp=sharing;
+  notes:
+
+- date: 2016/04/25
+  time: "8:30 - 16:00 Pacific"
+  location: Hilton Financial Districthttps://goo.gl/maps/L4cAuK8xdm52;>750 Kearny StSan Francisco, CA 
94108
+  type: Technical Deep Dive
+  materials:
+  notes: This is a proposed meeting. Space may be be limited.Attendance 
preference will be given to PPMC members.
+
+last_updated: 2016/04/05

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/970a4fd3/_includes/header.html
--
diff --git a/_includes/header.html b/_includes/header.html
index 6311f39..23a9ebe 100644
--- a/_includes/header.html
+++ b/_includes/header.html
@@ -27,6 +27,7 @@
 https://goo.gl/ps8twC;>Technical Docs
 https://goo.gl/nk5OM0;>Technical Vision
 Apache Beam Team
+Public 
Meetings
 
 Contribute
 Contribution 
Guide

http://git-wip-us.apache.org/repos/asf/incubator-beam-site/blob/970a4fd3/_pages/public-meetings.md
--
diff --git a/_pages/public-meetings.md b/_pages/public-meetings.md
new file mode 100644
index 000..2fc6447
--- /dev/null
+++ b/_pages/public-meetings.md
@@ -0,0 +1,55 @@
+---
+layout: page
+title: "Apache Beam Public Meetings"
+permalink: /public-meetings/
+---
+Apache Beam is a shared effort within the open source community. To grow and 
develop that effort, it helps to schedule and hold public meetings, including:
+
+* Formal meetings for the Apache Beam community for project planning and 
discussion
+* Informal "Meetup" events to discuss, share the vision and usefulness of 
Apache Beam
+
+The goal of these meetings will vary, though they will typically focus on 
technical discussions, community matters, and decision-making. These meetings 
have been held or are scheduled by the Apache Beam community.
+
+
+  
+  Date & Time
+  Location
+  Type
+  Meeting materials
+  Notes
+  
+  {% for meeting in site.data.meetings.events %}
+
+  {{ meeting.date }}{{ meeting.time }}
+  {{ meeting.location }}
+  {{ meeting.type }}
+  
+{% for material in meeting.materials %}
+{{ material.title }}
+{% endfor %}
+  
+  {{ meeting.notes }}
+
+  {% endfor %}
+
+*This list was last updated on {{ site.data.meetings.last_updated }}.*
+
+All Apache Beam meetings are open to the public and we encourage anyone to 
attend. From time to time space in our event location may be limited, so 
preference will be given to PPMC members and others on a first-come, 
first-serve basis.
+
+## I want to give a public talk about Apache Beam
+To get started, we recommend you review the Apache Beam [presentation 
materials]({{ site.baseurl }}/presentation-materials/) page to review the 
content the Apache Beam community has already created. These materials will 
possibly save you time and energy as you create content for your event.
+
+Once you have scheduled your event, we welcome you to announce it on the 

[GitHub] incubator-beam pull request: [BEAM-121] Additional APIs for regist...

2016-04-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/145


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---