[GitHub] incubator-beam pull request: Send travis-ci emails to their defaul...

2016-03-03 Thread kennknowles
GitHub user kennknowles opened a pull request:

https://github.com/apache/incubator-beam/pull/19

Send travis-ci emails to their default recipients



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/incubator-beam travis-emails

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/19.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #19


commit 6779cde60e2d25559b0e4cb10cd4184f2ce9a840
Author: Kenneth Knowles 
Date:   2016-03-04T03:17:07Z

Send travis-ci emails to their default recipients




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (BEAM-87) Allow heap dumps on OOM to be disabled (default)

2016-03-03 Thread Davor Bonaci (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-87?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davor Bonaci updated BEAM-87:
-
Assignee: Mark Shields  (was: Frances Perry)

> Allow heap dumps on OOM to be disabled (default)
> 
>
> Key: BEAM-87
> URL: https://issues.apache.org/jira/browse/BEAM-87
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Mark Shields
>Assignee: Mark Shields
>
> Runners may wish to save a heap dump on OOM (the Google runner does so).
> However, since heap dumps can be large, some runners may struggle to save
> them (the Google runner saves heap dumps on the VM's boot disk, which is
> typically only 20GB). We should allow users to opt in to heap dumps, at which
> point they can also ensure their environment is provisioned to accept them
> (e.g., for the Google runner, increase the boot disk size).
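For reference, the same opt-in exists at the JVM level via the HotSpot diagnostic bean; a minimal sketch of what a runner-level option would toggle (the class name, helper name, and dump path below are illustrative, not the proposed Beam option):

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class HeapDumpOptIn {
  /**
   * Opts the current JVM in to writing a heap dump on OutOfMemoryError.
   * Both flags are "manageable" HotSpot options, so they can be flipped at
   * runtime; returns the previous value of HeapDumpOnOutOfMemoryError.
   */
  public static String enableHeapDumpOnOom(String dumpPath) {
    HotSpotDiagnosticMXBean bean =
        ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
    String before = bean.getVMOption("HeapDumpOnOutOfMemoryError").getValue();
    bean.setVMOption("HeapDumpOnOutOfMemoryError", "true");
    bean.setVMOption("HeapDumpPath", dumpPath);
    return before;
  }

  public static void main(String[] args) {
    System.out.println("was: " + enableHeapDumpOnOom("/tmp/heapdumps"));
  }
}
```

A runner exposing this as an off-by-default pipeline option would do the equivalent before running user code, after the user has confirmed the dump location has enough space.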



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (BEAM-87) Allow heap dumps on OOM to be disabled (default)

2016-03-03 Thread Davor Bonaci (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-87?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davor Bonaci updated BEAM-87:
-
Component/s: (was: runner-core)
 sdk-java-core

> Allow heap dumps on OOM to be disabled (default)
> 
>
> Key: BEAM-87
> URL: https://issues.apache.org/jira/browse/BEAM-87
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Mark Shields
>Assignee: Frances Perry
>
> Runners may wish to save a heap dump on OOM (the Google runner does so).
> However, since heap dumps can be large, some runners may struggle to save
> them (the Google runner saves heap dumps on the VM's boot disk, which is
> typically only 20GB). We should allow users to opt in to heap dumps, at which
> point they can also ensure their environment is provisioned to accept them
> (e.g., for the Google runner, increase the boot disk size).





[jira] [Updated] (BEAM-89) DataflowPipelineJob should have an API that prints messages but doesn't wait for completion

2016-03-03 Thread Davor Bonaci (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-89?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davor Bonaci updated BEAM-89:
-
Assignee: (was: Davor Bonaci)

> DataflowPipelineJob should have an API that prints messages but doesn't wait 
> for completion
> ---
>
> Key: BEAM-89
> URL: https://issues.apache.org/jira/browse/BEAM-89
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Eugene Kirpichov
>Priority: Minor
>
> DataflowPipelineJob has a method waitToFinish() that takes a handler for
> printing the job's output messages, AND waits for the job to finish, printing
> messages along the way using that handler.
> However, there are cases where a caller would like to poll for the job's
> messages and print them, while keeping the job under the caller's control
> rather than having to wait for it to complete.
> E.g., one can imagine wanting to "wait until a certain Aggregator in the job
> reaches a certain value, and then cancel the job, printing messages along the
> way". This is not possible with the current API without copying the code of
> waitToFinish().





[jira] [Commented] (BEAM-89) DataflowPipelineJob should have an API that prints messages but doesn't wait for completion

2016-03-03 Thread Davor Bonaci (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-89?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15179215#comment-15179215
 ] 

Davor Bonaci commented on BEAM-89:
--

Yup; waitToFinish() should do just that -- wait until the job finishes. Getting
messages should be orthogonal to this.

This is at the intersection of the core SDK and the Dataflow runner. These
should be abstractions available for any job, with specific implementations in
each runner. Thus, I'm changing the component.
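A sketch of the decoupling being suggested (PipelineJob, getMessages, and pollOnce are illustrative names, not the SDK API): message polling becomes a primitive the caller drives, and a waitToFinish() would just be one loop built on top of it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class JobMessagePoller {
  /** Minimal stand-in for a running job that accumulates log messages. */
  public interface PipelineJob {
    /** Returns messages with index >= fromIndex; the caller tracks the cursor. */
    List<String> getMessages(int fromIndex);
    boolean isDone();
  }

  /**
   * Polls once, appends any new messages to the sink, and returns the new
   * cursor. A waitToFinish() built on this would loop until isDone(), while
   * other callers can interleave their own logic (e.g. cancel when an
   * aggregator reaches a value).
   */
  public static int pollOnce(PipelineJob job, int cursor, List<String> sink) {
    List<String> fresh = job.getMessages(cursor);
    sink.addAll(fresh);
    return cursor + fresh.size();
  }

  public static void main(String[] args) {
    List<String> all = Arrays.asList("started", "running", "done");
    PipelineJob job = new PipelineJob() {
      public List<String> getMessages(int from) { return all.subList(from, all.size()); }
      public boolean isDone() { return true; }
    };
    List<String> printed = new ArrayList<>();
    int cursor = pollOnce(job, 0, printed);
    cursor = pollOnce(job, cursor, printed);  // second poll sees nothing new
    System.out.println(printed.size() + " " + cursor);  // prints "3 3"
  }
}
```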

> DataflowPipelineJob should have an API that prints messages but doesn't wait 
> for completion
> ---
>
> Key: BEAM-89
> URL: https://issues.apache.org/jira/browse/BEAM-89
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Eugene Kirpichov
>Assignee: Davor Bonaci
>Priority: Minor
>
> DataflowPipelineJob has a method waitToFinish() that takes a handler for
> printing the job's output messages, AND waits for the job to finish, printing
> messages along the way using that handler.
> However, there are cases where a caller would like to poll for the job's
> messages and print them, while keeping the job under the caller's control
> rather than having to wait for it to complete.
> E.g., one can imagine wanting to "wait until a certain Aggregator in the job
> reaches a certain value, and then cancel the job, printing messages along the
> way". This is not possible with the current API without copying the code of
> waitToFinish().





[jira] [Updated] (BEAM-88) DataflowPipelineOptions.tempLocation doesn't really default to stagingLocation

2016-03-03 Thread Davor Bonaci (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-88?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davor Bonaci updated BEAM-88:
-
Assignee: (was: Davor Bonaci)

> DataflowPipelineOptions.tempLocation doesn't really default to stagingLocation
> --
>
> Key: BEAM-88
> URL: https://issues.apache.org/jira/browse/BEAM-88
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Eugene Kirpichov
>Priority: Trivial
>
> The documentation of DataflowPipelineOptions.tempLocation says it "defaults
> to using stagingLocation."
> However, calling .getTempLocation() when only --stagingLocation is specified
> on the command line returns null.
> The "defaulting" is actually done in DataflowPipelineRunner.fromOptions():
> {code}
> if (Strings.isNullOrEmpty(dataflowOptions.getTempLocation())) {
>   dataflowOptions.setTempLocation(dataflowOptions.getStagingLocation());
> }
> {code}
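The fallback itself is a one-liner; a self-contained restatement of the behavior described above, with plain strings standing in for the real options object (this is not the SDK code):

```java
public class TempLocationDefaulting {
  /**
   * Mirrors the fromOptions() fallback: a null or empty tempLocation
   * inherits stagingLocation. Until that method runs, getTempLocation()
   * by itself still reports null, which is the gap the issue describes.
   */
  public static String effectiveTempLocation(String tempLocation, String stagingLocation) {
    return (tempLocation == null || tempLocation.isEmpty()) ? stagingLocation : tempLocation;
  }

  public static void main(String[] args) {
    // Only --stagingLocation given: the effective temp location falls back.
    System.out.println(effectiveTempLocation(null, "gs://bucket/staging"));
  }
}
```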





Jenkins build is back to normal : beam #10

2016-03-03 Thread Apache Jenkins Server
See 



Jenkins build is back to normal : beam » Google Cloud Dataflow Java SDK - Parent #10

2016-03-03 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-94) CountingInput should be able to limit the rate at which output is produced

2016-03-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-94?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178821#comment-15178821
 ] 

ASF GitHub Bot commented on BEAM-94:


GitHub user tgroh opened a pull request:

https://github.com/apache/incubator-beam/pull/17

[BEAM-94] Add UnboundedCountingInput#withPeriod

The period between elements controls the rate at which
UnboundedCountingInput will output elements. This is an aggregate rate
across all instances of the source, and thus elements will not
necessarily be output "smoothly", or within the first period. The
aggregate rate, however, will be approximately equal to the provided
rate.

Add package-private CountingSource.createUnbounded() to expose the
UnboundedCountingSource type. Make UnboundedCountingSource
package-private.
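One way to read the "aggregate rate" caveat above: if the unbounded source is split N ways, each split only has to emit one element every N periods for the combined output to approximate the requested rate. A toy illustration of that arithmetic (a hypothetical helper, not the PR's implementation):

```java
public class AggregateRate {
  /**
   * Given the desired period between elements across ALL splits and the
   * number of splits, returns the period each individual split should use.
   * N splits each emitting once per (N * period) together approximate one
   * element per period overall.
   */
  public static long perSplitPeriodMillis(long aggregatePeriodMillis, int numSplits) {
    if (numSplits <= 0 || aggregatePeriodMillis <= 0) {
      throw new IllegalArgumentException("numSplits and period must be positive");
    }
    return aggregatePeriodMillis * numSplits;
  }

  public static void main(String[] args) {
    // 10 ms between elements overall, 4 splits -> each split emits every 40 ms.
    System.out.println(perSplitPeriodMillis(10, 4));  // prints 40
  }
}
```

This also explains why output need not be "smooth": the splits are not synchronized, so elements can arrive in bursts even though the long-run rate matches.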

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/incubator-beam 
rate_limited_counting_source

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/17.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #17


commit 1db69553ac3fef93b56f4d52d02c30f170ca0698
Author: Thomas Groh 
Date:   2016-02-29T21:48:09Z

Add UnboundedCountingInput#withPeriod

The period between elements controls the rate at which
UnboundedCountingInput will output elements. This is an aggregate rate
across all instances of the source, and thus elements will not
necessarily be output "smoothly", or within the first period. The
aggregate rate, however, will be approximately equal to the provided
rate.

Add package-private CountingSource.createUnbounded() to expose the
UnboundedCountingSource type. Make UnboundedCountingSource
package-private.




> CountingInput should be able to limit the rate at which output is produced
> --
>
> Key: BEAM-94
> URL: https://issues.apache.org/jira/browse/BEAM-94
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Thomas Groh
>Assignee: Thomas Groh
>






[2/2] incubator-beam git commit: This closes #14

2016-03-03 Thread kenn
This closes #14


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/05285702
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/05285702
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/05285702

Branch: refs/heads/master
Commit: 052857023354cea57086ed0ccc043412833d1fc5
Parents: 5a7bd80 320a75b
Author: Kenneth Knowles 
Authored: Thu Mar 3 15:19:52 2016 -0800
Committer: Kenneth Knowles 
Committed: Thu Mar 3 15:19:52 2016 -0800

--
 .../runners/dataflow/TestCountingSource.java| 31 +---
 1 file changed, 27 insertions(+), 4 deletions(-)
--




[GitHub] incubator-beam pull request: [BEAM-90] TestCountingSource can thro...

2016-03-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/14




[1/2] incubator-beam git commit: [BEAM-90] TestCountingSource can throw on checkpointing

2016-03-03 Thread kenn
Repository: incubator-beam
Updated Branches:
  refs/heads/master 5a7bd8083 -> 052857023


[BEAM-90] TestCountingSource can throw on checkpointing


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/320a75b1
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/320a75b1
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/320a75b1

Branch: refs/heads/master
Commit: 320a75b1da3bbdb9dc5a30c6d0f6811163bddb85
Parents: d4dcaaa
Author: Mark Shields 
Authored: Wed Mar 2 20:49:53 2016 -0800
Committer: Kenneth Knowles 
Committed: Thu Mar 3 15:18:44 2016 -0800

--
 .../runners/dataflow/TestCountingSource.java| 31 +---
 1 file changed, 27 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/320a75b1/sdk/src/test/java/com/google/cloud/dataflow/sdk/runners/dataflow/TestCountingSource.java
--
diff --git 
a/sdk/src/test/java/com/google/cloud/dataflow/sdk/runners/dataflow/TestCountingSource.java
 
b/sdk/src/test/java/com/google/cloud/dataflow/sdk/runners/dataflow/TestCountingSource.java
index 181ddca..d0863a4 100644
--- 
a/sdk/src/test/java/com/google/cloud/dataflow/sdk/runners/dataflow/TestCountingSource.java
+++ 
b/sdk/src/test/java/com/google/cloud/dataflow/sdk/runners/dataflow/TestCountingSource.java
@@ -27,6 +27,8 @@ import com.google.cloud.dataflow.sdk.options.PipelineOptions;
 import com.google.cloud.dataflow.sdk.values.KV;
 
 import org.joda.time.Instant;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 import java.io.IOException;
 import java.util.ArrayList;
@@ -46,31 +48,47 @@ import javax.annotation.Nullable;
  */
 public class TestCountingSource
     extends UnboundedSource, TestCountingSource.CounterMark> {
+  private static final Logger LOG = LoggerFactory.getLogger(TestCountingSource.class);
+
   private static List finalizeTracker;
   private final int numMessagesPerShard;
   private final int shardNumber;
   private final boolean dedup;
+  private final boolean throwOnFirstSnapshot;
+
+  /**
+   * We allow an exception to be thrown from getCheckpointMark
+   * at most once. This must be static since the entire TestCountingSource
+   * instance may be re-serialized when the pipeline recovers and retries.
+   */
+  private static boolean thrown = false;
 
   public static void setFinalizeTracker(List finalizeTracker) {
 TestCountingSource.finalizeTracker = finalizeTracker;
   }
 
   public TestCountingSource(int numMessagesPerShard) {
-this(numMessagesPerShard, 0, false);
+this(numMessagesPerShard, 0, false, false);
   }
 
   public TestCountingSource withDedup() {
-    return new TestCountingSource(numMessagesPerShard, shardNumber, true);
+    return new TestCountingSource(numMessagesPerShard, shardNumber, true, throwOnFirstSnapshot);
   }
 
   private TestCountingSource withShardNumber(int shardNumber) {
-    return new TestCountingSource(numMessagesPerShard, shardNumber, dedup);
+    return new TestCountingSource(numMessagesPerShard, shardNumber, dedup, throwOnFirstSnapshot);
   }
 
-  private TestCountingSource(int numMessagesPerShard, int shardNumber, boolean dedup) {
+  public TestCountingSource withThrowOnFirstSnapshot(boolean throwOnFirstSnapshot) {
+    return new TestCountingSource(numMessagesPerShard, shardNumber, dedup, throwOnFirstSnapshot);
+  }
+
+  private TestCountingSource(
+      int numMessagesPerShard, int shardNumber, boolean dedup, boolean throwOnFirstSnapshot) {
 this.numMessagesPerShard = numMessagesPerShard;
 this.shardNumber = shardNumber;
 this.dedup = dedup;
+this.throwOnFirstSnapshot = throwOnFirstSnapshot;
   }
 
   public int getShardNumber() {
@@ -187,6 +205,11 @@ public class TestCountingSource
 
 @Override
 public CheckpointMark getCheckpointMark() {
+      if (throwOnFirstSnapshot && !thrown) {
+        thrown = true;
+        LOG.error("Throwing exception while checkpointing counter");
+        throw new RuntimeException("failed during checkpoint");
+      }
   return new CounterMark(current);
 }
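The interesting detail in the diff above is that the "already thrown" bit is static: instance state would be lost when the source is re-serialized on retry, so an instance field could let the injected exception fire more than once. A stripped-down version of the pattern (class and method names here are illustrative, not the test code itself):

```java
public class ThrowOnce {
  // Static so the "already thrown" bit survives re-serialization of the
  // source instance; a fresh instance after retry will not throw again.
  private static boolean thrown = false;
  private final boolean throwOnFirstSnapshot;

  public ThrowOnce(boolean throwOnFirstSnapshot) {
    this.throwOnFirstSnapshot = throwOnFirstSnapshot;
  }

  /** Fails the first checkpoint only; later calls succeed, even on new instances. */
  public String checkpoint() {
    if (throwOnFirstSnapshot && !thrown) {
      thrown = true;
      throw new RuntimeException("failed during checkpoint");
    }
    return "ok";
  }

  /** Test hook: forget that the exception was already thrown. */
  static void reset() {
    thrown = false;
  }

  public static void main(String[] args) {
    ThrowOnce first = new ThrowOnce(true);
    try {
      first.checkpoint();
    } catch (RuntimeException expected) {
      // Simulate the runner retrying with a freshly deserialized instance:
      // the static flag is already set, so this time checkpointing succeeds.
      System.out.println(new ThrowOnce(true).checkpoint());  // prints "ok"
    }
  }
}
```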
 



[jira] [Commented] (BEAM-86) CountingSource should expose PTransform<PInput, PCollection> rather than sources directly

2016-03-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-86?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178804#comment-15178804
 ] 

ASF GitHub Bot commented on BEAM-86:


Github user tgroh closed the pull request at:

https://github.com/apache/incubator-beam/pull/10


> CountingSource should expose PTransform<PInput, PCollection> rather
> than sources directly
> ---
>
> Key: BEAM-86
> URL: https://issues.apache.org/jira/browse/BEAM-86
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Daniel Halperin
>Assignee: Thomas Groh
>
> We can make the Source exposure package-private for tests.





Build failed in Jenkins: beam #8

2016-03-03 Thread Apache Jenkins Server
See 

Changes:

[markshields] [BEAM-87] Support opt-in for heap dumps

[markshields] Clarrify heap file path.

[klk] Fix typo

[dhalperi] Add CountingInput as a PTransform

--
[...truncated 1781 lines...]
jdk1.8.0_66/jre/lib/amd64/libsaproc.so
jdk1.8.0_66/jre/lib/amd64/libdecora_sse.so
jdk1.8.0_66/jre/lib/amd64/libj2pcsc.so
jdk1.8.0_66/jre/lib/amd64/libjfxwebkit.so
jdk1.8.0_66/jre/lib/amd64/libfontmanager.so
jdk1.8.0_66/jre/lib/amd64/libjsoundalsa.so
jdk1.8.0_66/jre/lib/amd64/libbci.so
jdk1.8.0_66/jre/lib/amd64/libjdwp.so
jdk1.8.0_66/jre/lib/amd64/libjsound.so
jdk1.8.0_66/jre/lib/amd64/libjaas_unix.so
jdk1.8.0_66/jre/lib/amd64/libavplugin-53.so
jdk1.8.0_66/jre/lib/amd64/libattach.so
jdk1.8.0_66/jre/lib/amd64/libresource.so
jdk1.8.0_66/jre/lib/amd64/libjava.so
jdk1.8.0_66/jre/lib/amd64/libjfr.so
jdk1.8.0_66/jre/lib/amd64/libawt.so
jdk1.8.0_66/jre/lib/amd64/libjawt.so
jdk1.8.0_66/jre/lib/amd64/libverify.so
jdk1.8.0_66/jre/lib/amd64/libzip.so
jdk1.8.0_66/jre/lib/amd64/libjavafx_iio.so
jdk1.8.0_66/jre/lib/amd64/libjava_crw_demo.so
jdk1.8.0_66/jre/lib/amd64/libjfxmedia.so
jdk1.8.0_66/jre/lib/amd64/libnet.so
jdk1.8.0_66/jre/lib/amd64/libjavafx_font.so
jdk1.8.0_66/jre/lib/amd64/libprism_common.so
jdk1.8.0_66/jre/lib/amd64/libnio.so
jdk1.8.0_66/jre/lib/amd64/libprism_es2.so
jdk1.8.0_66/jre/lib/amd64/libinstrument.so
jdk1.8.0_66/jre/lib/amd64/libkcms.so
jdk1.8.0_66/jre/lib/amd64/libawt_xawt.so
jdk1.8.0_66/jre/lib/amd64/libmanagement.so
jdk1.8.0_66/jre/lib/amd64/libunpack.so
jdk1.8.0_66/jre/lib/amd64/libgstreamer-lite.so
jdk1.8.0_66/jre/lib/amd64/libawt_headless.so
jdk1.8.0_66/jre/lib/amd64/libsplashscreen.so
jdk1.8.0_66/jre/lib/fontconfig.properties.src
jdk1.8.0_66/jre/lib/psfont.properties.ja
jdk1.8.0_66/jre/lib/fontconfig.Turbo.properties.src
jdk1.8.0_66/jre/lib/jce.jar
jdk1.8.0_66/jre/lib/flavormap.properties
jdk1.8.0_66/jre/lib/jfxswt.jar
jdk1.8.0_66/jre/lib/fontconfig.SuSE.10.properties.src
jdk1.8.0_66/jre/lib/fontconfig.SuSE.11.bfc
jdk1.8.0_66/jre/COPYRIGHT
jdk1.8.0_66/jre/THIRDPARTYLICENSEREADME-JAVAFX.txt
jdk1.8.0_66/jre/Welcome.html
jdk1.8.0_66/jre/README
jdk1.8.0_66/README.html
Building remotely on jenkins-ubuntu-1404-4gb-5d8 (jenkins-cloud-4GB cloud-slave 
Ubuntu ubuntu) in workspace 
Cloning the remote Git repository
Cloning repository https://git-wip-us.apache.org/repos/asf/incubator-beam.git
 > git init  # timeout=10
Fetching upstream changes from 
https://git-wip-us.apache.org/repos/asf/incubator-beam.git
 > git --version # timeout=10
 > git -c core.askpass=true fetch --tags --progress 
 > https://git-wip-us.apache.org/repos/asf/incubator-beam.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git config remote.origin.url 
 > https://git-wip-us.apache.org/repos/asf/incubator-beam.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # 
 > timeout=10
 > git config remote.origin.url 
 > https://git-wip-us.apache.org/repos/asf/incubator-beam.git # timeout=10
Fetching upstream changes from 
https://git-wip-us.apache.org/repos/asf/incubator-beam.git
 > git -c core.askpass=true fetch --tags --progress 
 > https://git-wip-us.apache.org/repos/asf/incubator-beam.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git rev-parse refs/remotes/origin/master^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/master^{commit} # timeout=10
Checking out Revision 5a7bd80832d72ed8a287d4aab1f1f9cfa6d18c8a 
(refs/remotes/origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 5a7bd80832d72ed8a287d4aab1f1f9cfa6d18c8a
 > git rev-list 5e825041153d84c0e0e737a478d99c6b2839b096 # timeout=10
Parsing POMs
Downloaded artifact 
http://repo.maven.apache.org/maven2/com/google/google/5/google-5.pom
Established TCP socket on 46182
maven32-agent.jar already up to date
maven32-interceptor.jar already up to date
maven3-interceptor-commons.jar already up to date
[beam] $ /jenkins/tools/hudson.model.JDK/jdk1.8.0_66/bin/java -Xmx2g -Xms256m 
-XX:MaxPermSize=512m -cp 
/jenkins/maven32-agent.jar:/jenkins/tools/hudson.tasks.Maven_MavenInstallation/maven-3.3.3/boot/plexus-classworlds-2.5.2.jar:/jenkins/tools/hudson.tasks.Maven_MavenInstallation/maven-3.3.3/conf/logging
 jenkins.maven3.agent.Maven32Main 
/jenkins/tools/hudson.tasks.Maven_MavenInstallation/maven-3.3.3 
/jenkins/slave.jar /jenkins/maven32-interceptor.jar 
/jenkins/maven3-interceptor-commons.jar 46182
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; 
support was removed in 8.0
<===[JENKINS REMOTING CAPACITY]===>   channel started
Executing Maven:  -B -f  
-Dmaven.repo.local=/jenkins/maven-repositories/0 -B -e clean deploy
[INFO] Error stacktraces are turned on.
[INFO] Scanning for projects...
[INFO] Downloading: 

Build failed in Jenkins: beam » Google Cloud Dataflow Java SDK - Parent #8

2016-03-03 Thread Apache Jenkins Server
See 


--
Established TCP socket on 46182
maven32-agent.jar already up to date
maven32-interceptor.jar already up to date
maven3-interceptor-commons.jar already up to date
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; 
support was removed in 8.0
<===[JENKINS REMOTING CAPACITY]===>   channel started
Executing Maven:  -B -f 

 -Dmaven.repo.local=/jenkins/maven-repositories/0 -B -e clean deploy
[INFO] Error stacktraces are turned on.
[INFO] Scanning for projects...
[INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/archetype/archetype-packaging/2.4/archetype-packaging-2.4.pom
[INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/maven/archetype/archetype-packaging/2.4/archetype-packaging-2.4.pom
 (2 KB at 3.7 KB/sec)
[INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/archetype/maven-archetype/2.4/maven-archetype-2.4.pom
[INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/maven/archetype/maven-archetype/2.4/maven-archetype-2.4.pom
 (13 KB at 758.3 KB/sec)
[INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/archetype/archetype-packaging/2.4/archetype-packaging-2.4.jar
[INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/maven/archetype/archetype-packaging/2.4/archetype-packaging-2.4.jar
 (8 KB at 490.5 KB/sec)
[INFO] 
[INFO] Reactor Build Order:
[INFO] 
[INFO] Google Cloud Dataflow Java SDK - Parent
[INFO] Google Cloud Dataflow Java SDK - All
[INFO] Google Cloud Dataflow Java Examples - All
[INFO] Google Cloud Dataflow Java SDK - Starter Archetype
[INFO] Google Cloud Dataflow Java SDK - Examples Archetype
[INFO] 
[INFO] 
[INFO] Building Google Cloud Dataflow Java SDK - Parent 1.5.0-SNAPSHOT
[INFO] 
[INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-install-plugin/2.4/maven-install-plugin-2.4.pom
[INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-install-plugin/2.4/maven-install-plugin-2.4.pom
 (7 KB at 366.6 KB/sec)
[INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-install-plugin/2.4/maven-install-plugin-2.4.jar
[INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-install-plugin/2.4/maven-install-plugin-2.4.jar
 (27 KB at 1645.0 KB/sec)
[INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-deploy-plugin/2.7/maven-deploy-plugin-2.7.pom
[INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-deploy-plugin/2.7/maven-deploy-plugin-2.7.pom
 (6 KB at 365.4 KB/sec)
[INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-deploy-plugin/2.7/maven-deploy-plugin-2.7.jar
[INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-deploy-plugin/2.7/maven-deploy-plugin-2.7.jar
 (27 KB at 1871.6 KB/sec)
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ 
google-cloud-dataflow-java-sdk-parent ---
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ 
google-cloud-dataflow-java-sdk-parent ---
[INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-utils/3.0.5/plexus-utils-3.0.5.pom
[INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-utils/3.0.5/plexus-utils-3.0.5.pom
 (3 KB at 204.4 KB/sec)
[INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/maven-settings/2.0.6/maven-settings-2.0.6.jar
[INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/maven-profile/2.0.6/maven-profile-2.0.6.jar
[INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/maven-plugin-registry/2.0.6/maven-plugin-registry-2.0.6.jar
[INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/maven-artifact-manager/2.0.6/maven-artifact-manager-2.0.6.jar
[INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/maven-repository-metadata/2.0.6/maven-repository-metadata-2.0.6.jar
[INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/maven/maven-plugin-registry/2.0.6/maven-plugin-registry-2.0.6.jar
 (29 KB at 763.4 KB/sec)
[INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-utils/3.0.5/plexus-utils-3.0.5.jar
[INFO] Downloaded: 

[GitHub] incubator-beam pull request: [BEAM-86] Add CountingInput as a PTra...

2016-03-03 Thread tgroh
Github user tgroh closed the pull request at:

https://github.com/apache/incubator-beam/pull/10




[jira] [Created] (BEAM-94) CountingInput should be able to limit the rate at which output is produced

2016-03-03 Thread Thomas Groh (JIRA)
Thomas Groh created BEAM-94:
---

 Summary: CountingInput should be able to limit the rate at which 
output is produced
 Key: BEAM-94
 URL: https://issues.apache.org/jira/browse/BEAM-94
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-core
Reporter: Thomas Groh
Assignee: Davor Bonaci








incubator-beam git commit: Add CountingInput as a PTransform

2016-03-03 Thread dhalperi
Repository: incubator-beam
Updated Branches:
  refs/heads/master 365029863 -> 5a7bd8083


Add CountingInput as a PTransform

This transform produces an unbounded PCollection containing longs based
on a CountingSource.

Deprecate methods producing a Source in CountingSource.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/5a7bd808
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/5a7bd808
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/5a7bd808

Branch: refs/heads/master
Commit: 5a7bd80832d72ed8a287d4aab1f1f9cfa6d18c8a
Parents: 3650298
Author: Thomas Groh 
Authored: Wed Mar 2 14:03:28 2016 -0800
Committer: Dan Halperin 
Committed: Thu Mar 3 14:33:01 2016 -0800

--
 .../cloud/dataflow/sdk/io/CountingInput.java| 191 +++
 .../cloud/dataflow/sdk/io/CountingSource.java   |  31 ++-
 .../dataflow/sdk/io/CountingInputTest.java  | 122 
 3 files changed, 334 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/5a7bd808/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/CountingInput.java
--
diff --git 
a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/CountingInput.java 
b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/CountingInput.java
new file mode 100644
index 000..07609ba
--- /dev/null
+++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/CountingInput.java
@@ -0,0 +1,191 @@
+/*
+ * Copyright (C) 2016 Google Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License"); you may not
+ * use this file except in compliance with the License. You may obtain a copy 
of
+ * the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations 
under
+ * the License.
+ */
+
+package com.google.cloud.dataflow.sdk.io;
+
+import static com.google.common.base.Preconditions.checkArgument;
+import static com.google.common.base.Preconditions.checkNotNull;
+
+import com.google.cloud.dataflow.sdk.io.CountingSource.NowTimestampFn;
+import com.google.cloud.dataflow.sdk.io.Read.Unbounded;
+import com.google.cloud.dataflow.sdk.transforms.PTransform;
+import com.google.cloud.dataflow.sdk.transforms.SerializableFunction;
+import com.google.cloud.dataflow.sdk.values.PBegin;
+import com.google.cloud.dataflow.sdk.values.PCollection;
+import com.google.cloud.dataflow.sdk.values.PCollection.IsBounded;
+import com.google.common.base.Optional;
+
+import org.joda.time.Duration;
+import org.joda.time.Instant;
+
+/**
+ * A {@link PTransform} that produces longs. When used to produce a
+ * {@link IsBounded#BOUNDED bounded} {@link PCollection}, {@link CountingInput} starts at {@code 0}
+ * and counts up to a specified maximum. When used to produce an
+ * {@link IsBounded#UNBOUNDED unbounded} {@link PCollection}, it counts up to {@link Long#MAX_VALUE}
+ * and then never produces more output. (In practice, this limit should never be reached.)
+ *
+ * The bounded {@link CountingInput} is implemented based on {@link OffsetBasedSource} and
+ * {@link OffsetBasedSource.OffsetBasedReader}, so it performs efficient initial splitting and it
+ * supports dynamic work rebalancing.
+ *
+ * To produce a bounded {@code PCollection}, use {@link CountingInput#upTo(long)}:
+ *
+ * {@code
+ * Pipeline p = ...
+ * PTransform producer = CountingInput.upTo(1000);
+ * PCollection bounded = p.apply(producer);
+ * }
+ *
+ * To produce an unbounded {@code PCollection}, use {@link CountingInput#unbounded()},
+ * calling {@link UnboundedCountingInput#withTimestampFn(SerializableFunction)} to provide values
+ * with timestamps other than {@link Instant#now}.
+ *
+ * {@code
+ * Pipeline p = ...
+ *
+ * // To create an unbounded producer that uses processing time as the element timestamp.
+ * PCollection unbounded = p.apply(CountingInput.unbounded());
+ * // Or, to create an unbounded source that uses a provided function to set the element timestamp.
+ * PCollection unboundedWithTimestamps =
+ *     p.apply(CountingInput.unbounded().withTimestampFn(someFn));
+ * }
+ */
+public class CountingInput {
+  /**
+   * Creates a {@link BoundedCountingInput} that will produce the specified number of elements,
+   * from {@code 0} to {@code numElements - 1}.
+   */
+  public static BoundedCountingInput upTo(long numElements) {
+    checkArgument(numElements > 0, "numElements (%s) must be

[GitHub] incubator-beam pull request: [BEAM-87] Support opt-in for heap dum...

2016-03-03 Thread mshields822
Github user mshields822 closed the pull request at:

https://github.com/apache/incubator-beam/pull/8




[jira] [Commented] (BEAM-87) Allow heap dumps on OOM to be disabled (default)

2016-03-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-87?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178730#comment-15178730
 ] 

ASF GitHub Bot commented on BEAM-87:


Github user mshields822 closed the pull request at:

https://github.com/apache/incubator-beam/pull/8


> Allow heap dumps on OOM to be disabled (default)
> 
>
> Key: BEAM-87
> URL: https://issues.apache.org/jira/browse/BEAM-87
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Mark Shields
>Assignee: Frances Perry
>
> Runners may wish to save a heap dump on OOM. (The Google runner does so). 
> However since heap dumps can be large some runners may struggle to save them. 
> (The Google runner saves heap dumps on the vm's 'boot' disk, which is 
> typically only 20GB). We should allow users to opt-in to heap dumps, at which 
> point they can also ensure their environment is provisioned to accept them. 
> (Eg for the Google runner, increase the boot disk size.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[1/4] incubator-beam git commit: [BEAM-87] Support opt-in for heap dumps

2016-03-03 Thread kenn
Repository: incubator-beam
Updated Branches:
  refs/heads/master 5e8250411 -> 365029863


[BEAM-87] Support opt-in for heap dumps


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/5cd05891
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/5cd05891
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/5cd05891

Branch: refs/heads/master
Commit: 5cd0589158707e93609e05b0e31e45970c993a13
Parents: d4dcaaa
Author: Mark Shields 
Authored: Wed Mar 2 13:17:02 2016 -0800
Committer: Mark Shields 
Committed: Wed Mar 2 14:39:20 2016 -0800

--
 .../sdk/options/DataflowPipelineDebugOptions.java  | 13 +
 1 file changed, 13 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/5cd05891/sdk/src/main/java/com/google/cloud/dataflow/sdk/options/DataflowPipelineDebugOptions.java
--
diff --git 
a/sdk/src/main/java/com/google/cloud/dataflow/sdk/options/DataflowPipelineDebugOptions.java
 
b/sdk/src/main/java/com/google/cloud/dataflow/sdk/options/DataflowPipelineDebugOptions.java
index e94b56d..17728c9 100644
--- 
a/sdk/src/main/java/com/google/cloud/dataflow/sdk/options/DataflowPipelineDebugOptions.java
+++ 
b/sdk/src/main/java/com/google/cloud/dataflow/sdk/options/DataflowPipelineDebugOptions.java
@@ -209,6 +209,19 @@ public interface DataflowPipelineDebugOptions extends 
PipelineOptions {
   void setNumberOfWorkerHarnessThreads(int value);
 
   /**
+   * If {@literal true}, save a heap dump before killing a thread or process 
which is GC
+   * thrashing or out of memory.
+   *
+   * 
+   * CAUTION: Heap dumps can be of comparable size to the default boot disk. 
Consider increasing
+   * the boot disk size before setting this flag to true.
+   */
+  @Description("If {@literal true}, save a heap dump before killing a thread 
or process "
+  + "which is GC thrashing or out of memory.")
+  boolean getDumpHeapOnOOM();
+  void setDumpHeapOnOOM(boolean dumpHeapBeforeExit);
+
+  /**
* Creates a {@link PathValidator} object using the class specified in
* {@link #getPathValidatorClass()}.
*/
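The option above only records the user's choice; one way a JVM process can honor such an opt-in is through the JDK's HotSpot diagnostic MXBean, which can flip heap-dump-on-OOM behavior at runtime. A hedged, self-contained sketch (standard JDK machinery, not the Dataflow worker's actual code; the dumpHeapOnOOM parameter name simply mirrors the option above):

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class HeapDumpOptIn {
    // Enable the JVM's built-in heap-dump-on-OOM behavior only when the user opted in.
    // Returns the resulting value of the HeapDumpOnOutOfMemoryError flag.
    static String configure(boolean dumpHeapOnOOM) {
        HotSpotDiagnosticMXBean diag =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        if (dumpHeapOnOOM) {
            diag.setVMOption("HeapDumpOnOutOfMemoryError", "true");
            // Heap dumps can be as large as the heap itself; point them at a
            // location with enough free space (the concern raised in BEAM-87).
            diag.setVMOption("HeapDumpPath", System.getProperty("java.io.tmpdir"));
        }
        return diag.getVMOption("HeapDumpOnOutOfMemoryError").getValue();
    }

    public static void main(String[] args) {
        System.out.println(configure(true));
    }
}
```

Both flags are "manageable" HotSpot options, so they accept runtime writes; keeping the default off matches the opt-in behavior the JIRA asks for.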



[4/4] incubator-beam git commit: Closes #8

2016-03-03 Thread kenn
Closes #8


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/36502986
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/36502986
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/36502986

Branch: refs/heads/master
Commit: 365029863712926472a50f891f133e8b197f355a
Parents: 5e82504 2eda3fc
Author: Kenneth Knowles 
Authored: Thu Mar 3 14:20:13 2016 -0800
Committer: Kenneth Knowles 
Committed: Thu Mar 3 14:20:13 2016 -0800

--
 .../sdk/options/DataflowPipelineDebugOptions.java | 14 ++
 1 file changed, 14 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/36502986/sdk/src/main/java/com/google/cloud/dataflow/sdk/options/DataflowPipelineDebugOptions.java
--



[2/4] incubator-beam git commit: Clarify heap file path.

2016-03-03 Thread kenn
Clarify heap file path.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/616f898e
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/616f898e
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/616f898e

Branch: refs/heads/master
Commit: 616f898ec62e13eeccf0d6b970c5ce3495028070
Parents: 5cd0589
Author: Mark Shields 
Authored: Thu Mar 3 10:53:25 2016 -0800
Committer: Mark Shields 
Committed: Thu Mar 3 10:53:25 2016 -0800

--
 .../cloud/dataflow/sdk/options/DataflowPipelineDebugOptions.java  | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/616f898e/sdk/src/main/java/com/google/cloud/dataflow/sdk/options/DataflowPipelineDebugOptions.java
--
diff --git 
a/sdk/src/main/java/com/google/cloud/dataflow/sdk/options/DataflowPipelineDebugOptions.java
 
b/sdk/src/main/java/com/google/cloud/dataflow/sdk/options/DataflowPipelineDebugOptions.java
index 17728c9..05d23be 100644
--- 
a/sdk/src/main/java/com/google/cloud/dataflow/sdk/options/DataflowPipelineDebugOptions.java
+++ 
b/sdk/src/main/java/com/google/cloud/dataflow/sdk/options/DataflowPipelineDebugOptions.java
@@ -210,7 +210,8 @@ public interface DataflowPipelineDebugOptions extends 
PipelineOptions {
 
   /**
* If {@literal true}, save a heap dump before killing a thread or process 
which is GC
-   * thrashing or out of memory.
+   * thrashing or out of memory. The location of the heap file will either be 
echoed back
+   * to the user, or the user will be given the opportunity to download the 
heap file.
*
* 
* CAUTION: Heap dumps can be of comparable size to the default boot disk. 
Consider increasing



Build failed in Jenkins: beam » Google Cloud Dataflow Java SDK - Parent #7

2016-03-03 Thread Apache Jenkins Server
See 


--
Established TCP socket on 43584
maven32-agent.jar already up to date
maven32-interceptor.jar already up to date
maven3-interceptor-commons.jar already up to date
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; 
support was removed in 8.0
<===[JENKINS REMOTING CAPACITY]===>   channel started
Executing Maven:  -B -f 

 -Dmaven.repo.local=/home/jenkins/jenkins-slave/maven-repositories/0 -B -e 
clean deploy
[INFO] Error stacktraces are turned on.
[INFO] Scanning for projects...
[INFO] 
[INFO] Reactor Build Order:
[INFO] 
[INFO] Google Cloud Dataflow Java SDK - Parent
[INFO] Google Cloud Dataflow Java SDK - All
[INFO] Google Cloud Dataflow Java Examples - All
[INFO] Google Cloud Dataflow Java SDK - Starter Archetype
[INFO] Google Cloud Dataflow Java SDK - Examples Archetype
[INFO] 
[INFO] 
[INFO] Building Google Cloud Dataflow Java SDK - Parent 1.5.0-SNAPSHOT
[INFO] 
[INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-deploy-plugin/2.7/maven-deploy-plugin-2.7.pom
[INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-deploy-plugin/2.7/maven-deploy-plugin-2.7.pom
 (6 KB at 11.3 KB/sec)
[INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-deploy-plugin/2.7/maven-deploy-plugin-2.7.jar
[INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-deploy-plugin/2.7/maven-deploy-plugin-2.7.jar
 (27 KB at 459.7 KB/sec)
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ 
google-cloud-dataflow-java-sdk-parent ---
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ 
google-cloud-dataflow-java-sdk-parent ---
[INFO] Installing 

 to 
/home/jenkins/jenkins-slave/maven-repositories/0/com/google/cloud/dataflow/google-cloud-dataflow-java-sdk-parent/1.5.0-SNAPSHOT/google-cloud-dataflow-java-sdk-parent-1.5.0-SNAPSHOT.pom
[INFO] 
[INFO] --- maven-deploy-plugin:2.7:deploy (default-deploy) @ 
google-cloud-dataflow-java-sdk-parent ---
[INFO] Downloading: 
http://oss.sonatype.org/content/repositories/google-snapshots/com/google/cloud/dataflow/google-cloud-dataflow-java-sdk-parent/1.5.0-SNAPSHOT/maven-metadata.xml
[INFO] Uploading: 
http://oss.sonatype.org/content/repositories/google-snapshots/com/google/cloud/dataflow/google-cloud-dataflow-java-sdk-parent/1.5.0-SNAPSHOT/google-cloud-dataflow-java-sdk-parent-1.5.0-20160303.221751-1.pom


Build failed in Jenkins: beam #7

2016-03-03 Thread Apache Jenkins Server
See 

Changes:

[peihe0] [BEAM-80] Swap the order of timers and elements sent to ReduceFnRunner

--
Started by an SCM change
[EnvInject] - Loading node environment variables.
Building remotely on ubuntu-us1 (Ubuntu ubuntu ubuntu-us) in workspace 

Cloning the remote Git repository
Cloning repository https://git-wip-us.apache.org/repos/asf/incubator-beam.git
 > git init  # timeout=10
Fetching upstream changes from 
https://git-wip-us.apache.org/repos/asf/incubator-beam.git
 > git --version # timeout=10
 > git -c core.askpass=true fetch --tags --progress 
 > https://git-wip-us.apache.org/repos/asf/incubator-beam.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git config remote.origin.url 
 > https://git-wip-us.apache.org/repos/asf/incubator-beam.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # 
 > timeout=10
 > git config remote.origin.url 
 > https://git-wip-us.apache.org/repos/asf/incubator-beam.git # timeout=10
Fetching upstream changes from 
https://git-wip-us.apache.org/repos/asf/incubator-beam.git
 > git -c core.askpass=true fetch --tags --progress 
 > https://git-wip-us.apache.org/repos/asf/incubator-beam.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git rev-parse refs/remotes/origin/master^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/master^{commit} # timeout=10
Checking out Revision 5e825041153d84c0e0e737a478d99c6b2839b096 
(refs/remotes/origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 5e825041153d84c0e0e737a478d99c6b2839b096
 > git rev-list 7582212f58deb8628dfabfb26b63c9391ee3209a # timeout=10
Parsing POMs
Established TCP socket on 43584
maven32-agent.jar already up to date
maven32-interceptor.jar already up to date
maven3-interceptor-commons.jar already up to date
[beam] $ 
/home/jenkins/jenkins-slave/tools/hudson.model.JDK/jdk1.8.0_66/bin/java -Xmx2g 
-Xms256m -XX:MaxPermSize=512m -cp 
/home/jenkins/jenkins-slave/maven32-agent.jar:/home/jenkins/jenkins-slave/tools/hudson.tasks.Maven_MavenInstallation/maven-3.3.3/boot/plexus-classworlds-2.5.2.jar:/home/jenkins/jenkins-slave/tools/hudson.tasks.Maven_MavenInstallation/maven-3.3.3/conf/logging
 jenkins.maven3.agent.Maven32Main 
/home/jenkins/jenkins-slave/tools/hudson.tasks.Maven_MavenInstallation/maven-3.3.3
 /home/jenkins/jenkins-slave/slave.jar 
/home/jenkins/jenkins-slave/maven32-interceptor.jar 
/home/jenkins/jenkins-slave/maven3-interceptor-commons.jar 43584
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; 
support was removed in 8.0
<===[JENKINS REMOTING CAPACITY]===>   channel started
Executing Maven:  -B -f  
-Dmaven.repo.local=/home/jenkins/jenkins-slave/maven-repositories/0 -B -e clean 
deploy
[INFO] Error stacktraces are turned on.
[INFO] Scanning for projects...
[INFO] 
[INFO] Reactor Build Order:
[INFO] 
[INFO] Google Cloud Dataflow Java SDK - Parent
[INFO] Google Cloud Dataflow Java SDK - All
[INFO] Google Cloud Dataflow Java Examples - All
[INFO] Google Cloud Dataflow Java SDK - Starter Archetype
[INFO] Google Cloud Dataflow Java SDK - Examples Archetype
[INFO] 
[INFO] 
[INFO] Building Google Cloud Dataflow Java SDK - Parent 1.5.0-SNAPSHOT
[INFO] 
[INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-deploy-plugin/2.7/maven-deploy-plugin-2.7.pom
[INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-deploy-plugin/2.7/maven-deploy-plugin-2.7.pom
 (6 KB at 11.3 KB/sec)
[INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-deploy-plugin/2.7/maven-deploy-plugin-2.7.jar
[INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/maven/plugins/maven-deploy-plugin/2.7/maven-deploy-plugin-2.7.jar
 (27 KB at 459.7 KB/sec)
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ 
google-cloud-dataflow-java-sdk-parent ---
[INFO] 
[INFO] --- maven-install-plugin:2.4:install (default-install) @ 
google-cloud-dataflow-java-sdk-parent ---
[INFO] Installing  to 
/home/jenkins/jenkins-slave/maven-repositories/0/com/google/cloud/dataflow/google-cloud-dataflow-java-sdk-parent/1.5.0-SNAPSHOT/google-cloud-dataflow-java-sdk-parent-1.5.0-SNAPSHOT.pom
[INFO] 
[INFO] --- maven-deploy-plugin:2.7:deploy (default-deploy) @ 
google-cloud-dataflow-java-sdk-parent ---
[INFO] Downloading: 

[jira] [Commented] (BEAM-80) Support combiner lifting for (Keyed)CombineWithContext

2016-03-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-80?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178700#comment-15178700
 ] 

ASF GitHub Bot commented on BEAM-80:


Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/9


> Support combiner lifting for (Keyed)CombineWithContext
> --
>
> Key: BEAM-80
> URL: https://issues.apache.org/jira/browse/BEAM-80
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-dataflow
>Reporter: Pei He
>Assignee: Pei He
>
> This is a missing feature of combine with context.
> Combiner lifting is currently disabled for (Keyed)CombineWithContext with a 
> passing through ParDo.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[1/2] incubator-beam git commit: [BEAM-80] Swap the order of timers and elements sent to ReduceFnRunner

2016-03-03 Thread kenn
Repository: incubator-beam
Updated Branches:
  refs/heads/master 7582212f5 -> 5e8250411


[BEAM-80] Swap the order of timers and elements sent to ReduceFnRunner


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/a359409c
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/a359409c
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/a359409c

Branch: refs/heads/master
Commit: a359409c256109b50f236761e03a52cfd1a32340
Parents: 211e76a
Author: Pei He 
Authored: Wed Mar 2 13:36:23 2016 -0800
Committer: Pei He 
Committed: Wed Mar 2 14:48:05 2016 -0800

--
 .../cloud/dataflow/sdk/util/GroupAlsoByWindowViaWindowSetDoFn.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/a359409c/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/GroupAlsoByWindowViaWindowSetDoFn.java
--
diff --git 
a/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/GroupAlsoByWindowViaWindowSetDoFn.java
 
b/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/GroupAlsoByWindowViaWindowSetDoFn.java
index f6246d1..89a4fcb 100644
--- 
a/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/GroupAlsoByWindowViaWindowSetDoFn.java
+++ 
b/sdk/src/main/java/com/google/cloud/dataflow/sdk/util/GroupAlsoByWindowViaWindowSetDoFn.java
@@ -80,10 +80,10 @@ public class GroupAlsoByWindowViaWindowSetDoFn<
 reduceFn,
 c.getPipelineOptions());
 
+reduceFnRunner.processElements(element.elementsIterable());
 for (TimerData timer : element.timersIterable()) {
   reduceFnRunner.onTimer(timer);
 }
-reduceFnRunner.processElements(element.elementsIterable());
 reduceFnRunner.persist();
   }
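The one-line swap above delivers a bundle's elements to the ReduceFnRunner before its timers fire. A toy, self-contained illustration of why that order matters (purely illustrative — this is not ReduceFnRunner, just the shape of the bug):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class TimerOrderSketch {
    // Simulate one bundle carrying both elements and an end-of-window timer.
    // If the timer fires first, the pane is emitted before the bundle's own
    // elements reach the window's state, so those elements miss the pane.
    static List<String> run(boolean elementsFirst) {
        List<String> pane = new ArrayList<>();    // window state the timer will emit
        List<String> output = new ArrayList<>();
        Deque<String> elements = new ArrayDeque<>(List.of("e1", "e2"));
        Runnable processElements = () -> pane.addAll(elements);
        Runnable onTimer = () -> output.add("pane=" + pane);
        if (elementsFirst) { processElements.run(); onTimer.run(); }
        else { onTimer.run(); processElements.run(); }
        return output;
    }

    public static void main(String[] args) {
        System.out.println(run(true));   // pane sees the bundle's elements
        System.out.println(run(false));  // pane fires empty
    }
}
```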
 



[2/2] incubator-beam git commit: Closes #9

2016-03-03 Thread kenn
Closes #9


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/5e825041
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/5e825041
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/5e825041

Branch: refs/heads/master
Commit: 5e825041153d84c0e0e737a478d99c6b2839b096
Parents: 7582212 a359409
Author: Kenneth Knowles 
Authored: Thu Mar 3 14:06:16 2016 -0800
Committer: Kenneth Knowles 
Committed: Thu Mar 3 14:06:16 2016 -0800

--
 .../cloud/dataflow/sdk/util/GroupAlsoByWindowViaWindowSetDoFn.java | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--




[GitHub] incubator-beam pull request: [BEAM-80] Change ReduceFnRunner to pr...

2016-03-03 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/9




[GitHub] incubator-beam pull request: [BEAM-93] Add subnetwork support and ...

2016-03-03 Thread sammcveety
GitHub user sammcveety opened a pull request:

https://github.com/apache/incubator-beam/pull/16

[BEAM-93] Add subnetwork support and increment Dataflow API dependency

Add subnetwork to the Dataflow runner through a new option.  This support 
was added in the newest API release, so increment the API dependency as well.

Still need to run tests, but otherwise, so far so good?  @davorbonaci 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sammcveety/incubator-beam master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/16.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16








[jira] [Commented] (BEAM-93) Support Compute Engine Subnetworks

2016-03-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178378#comment-15178378
 ] 

ASF GitHub Bot commented on BEAM-93:


GitHub user sammcveety opened a pull request:

https://github.com/apache/incubator-beam/pull/16

[BEAM-93] Add subnetwork support and increment Dataflow API dependency

Add subnetwork to the Dataflow runner through a new option.  This support 
was added in the newest API release, so increment the API dependency as well.

Still need to run tests, but otherwise, so far so good?  @davorbonaci 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sammcveety/incubator-beam master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/16.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16






> Support Compute Engine Subnetworks
> --
>
> Key: BEAM-93
> URL: https://issues.apache.org/jira/browse/BEAM-93
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Sam McVeety
>Assignee: Davor Bonaci
>Priority: Minor
>
> The Dataflow runner for Beam should support Compute Engine subnetworks for 
> the workers: https://cloud.google.com/compute/docs/subnetworks



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-93) Support Compute Engine Subnetworks

2016-03-03 Thread Sam McVeety (JIRA)
Sam McVeety created BEAM-93:
---

 Summary: Support Compute Engine Subnetworks
 Key: BEAM-93
 URL: https://issues.apache.org/jira/browse/BEAM-93
 Project: Beam
  Issue Type: Improvement
  Components: runner-dataflow
Reporter: Sam McVeety
Assignee: Davor Bonaci
Priority: Minor


The Dataflow runner for Beam should support Compute Engine subnetworks for the 
workers: https://cloud.google.com/compute/docs/subnetworks



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request: Register debuggee prior to job submis...

2016-03-03 Thread bjchambers
GitHub user bjchambers opened a pull request:

https://github.com/apache/incubator-beam/pull/15

Register debuggee prior to job submission

@davorbonaci 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bjchambers/incubator-beam cdbg

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/15.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #15


commit d1bf08463e091d6900dd81bcfde4948f9f9a09b9
Author: bchambers 
Date:   2016-03-03T01:38:49Z

Register debuggee prior to job submission






[jira] [Updated] (BEAM-92) Data-dependent sinks

2016-03-03 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-92?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin updated BEAM-92:

Component/s: beam-model

> Data-dependent sinks
> 
>
> Key: BEAM-92
> URL: https://issues.apache.org/jira/browse/BEAM-92
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Eugene Kirpichov
>
> Current sink API writes all data to a single destination, but there are many 
> use cases where different pieces of data need to be routed to different 
> destinations where the set of destinations is data-dependent (so can't be 
> implemented with a Partition transform).
> One internally discussed proposal was an API of the form:
> {code}
> PCollection<?> PCollection<T>.apply(
> Write.using(DoFn<T, SinkT> where,
> MapFn<SinkT, WriteOperation<T>> how)
> {code}
> so an item T gets written to a destination (or multiple destinations) 
> determined by "where"; and the writing strategy is determined by "how" that 
> produces a WriteOperation (current API - global init/write/global finalize 
> hooks) for any given destination.
> This API also has other benefits:
> * allows the SinkT to be computed dynamically (in "where"), rather than 
> specified at pipeline construction time
> * removes the necessity for a Sink class entirely
> * is sequenceable w.r.t. downstream transforms (you can stick transforms onto 
> the returned PCollection, while the current Write.to() returns a PDone)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-92) Data-dependent sinks

2016-03-03 Thread Daniel Halperin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-92?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178173#comment-15178173
 ] 

Daniel Halperin commented on BEAM-92:
-

Abstractly, this is very easy to accomplish when the Sink is replaced by a 
DoFn. You can generate a KV<Sink, T> with a ParDo(element -> 
KV.of(where(element), element)). Then a GroupByKey will give you a 
KV<Sink, Iterable<T>>, and then all the elements can be written to the sink.

However, of course, given that a single worker is processing all the elements 
for a given destination, this will essentially be single-threaded writing.

This seems much more complicated when taking into account fault tolerance, 
finalizers, etc. Lots of design to do here.
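In plain Java, the ParDo-plus-GroupByKey shape described above amounts to grouping by a data-dependent key: compute a destination per element ("where"), then collect each destination's elements so one write operation ("how") can handle them together. A minimal sketch — the prefix-based routing predicate here is purely hypothetical:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class RouteSketch {
    // "where": a data-dependent destination per record;
    // the grouping plays the role of GroupByKey on that destination.
    static Map<String, List<String>> route(List<String> records) {
        return records.stream()
            .collect(Collectors.groupingBy(r -> r.startsWith("ERR") ? "errors" : "main"));
    }

    public static void main(String[] args) {
        System.out.println(route(List.of("ERR boom", "ok-1", "ok-2")));
    }
}
```

As the comment notes, everything bound for one destination funnels through a single group, which is why the writes per destination end up single-threaded.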

> Data-dependent sinks
> 
>
> Key: BEAM-92
> URL: https://issues.apache.org/jira/browse/BEAM-92
> Project: Beam
>  Issue Type: New Feature
>Reporter: Eugene Kirpichov
>
> Current sink API writes all data to a single destination, but there are many 
> use cases where different pieces of data need to be routed to different 
> destinations where the set of destinations is data-dependent (so can't be 
> implemented with a Partition transform).
> One internally discussed proposal was an API of the form:
> {code}
> PCollection<?> PCollection<T>.apply(
> Write.using(DoFn<T, SinkT> where,
> MapFn<SinkT, WriteOperation<T>> how)
> {code}
> so an item T gets written to a destination (or multiple destinations) 
> determined by "where"; and the writing strategy is determined by "how" that 
> produces a WriteOperation (current API - global init/write/global finalize 
> hooks) for any given destination.
> This API also has other benefits:
> * allows the SinkT to be computed dynamically (in "where"), rather than 
> specified at pipeline construction time
> * removes the necessity for a Sink class entirely
> * is sequenceable w.r.t. downstream transforms (you can stick transforms onto 
> the returned PCollection, while the current Write.to() returns a PDone)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-92) Data-dependent sinks

2016-03-03 Thread Eugene Kirpichov (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-92?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15178147#comment-15178147
 ] 

Eugene Kirpichov commented on BEAM-92:
--

Note: it's not obvious how this API generalizes to streaming sinks. However we 
don't have unbounded sinks in the current API yet either.

> Data-dependent sinks
> 
>
> Key: BEAM-92
> URL: https://issues.apache.org/jira/browse/BEAM-92
> Project: Beam
>  Issue Type: New Feature
>Reporter: Eugene Kirpichov
>
> Current sink API writes all data to a single destination, but there are many 
> use cases where different pieces of data need to be routed to different 
> destinations where the set of destinations is data-dependent (so can't be 
> implemented with a Partition transform).
> One internally discussed proposal was an API of the form:
> {code}
> PCollection<?> PCollection<T>.apply(
> Write.using(DoFn<T, SinkT> where,
> MapFn<SinkT, WriteOperation<T>> how)
> {code}
> so an item T gets written to a destination (or multiple destinations) 
> determined by "where"; and the writing strategy is determined by "how" that 
> produces a WriteOperation (current API - global init/write/global finalize 
> hooks) for any given destination.
> This API also has other benefits:
> * allows the SinkT to be computed dynamically (in "where"), rather than 
> specified at pipeline construction time
> * removes the necessity for a Sink class entirely
> * is sequenceable w.r.t. downstream transforms (you can stick transforms onto 
> the returned PCollection, while the current Write.to() returns a PDone)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request: [BEAM-90] TestCountingSource can thro...

2016-03-03 Thread mshields822
GitHub user mshields822 opened a pull request:

https://github.com/apache/incubator-beam/pull/14

[BEAM-90] TestCountingSource can throw on checkpointing



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mshields822/incubator-beam 
worker-cache-invalidate

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/14.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14


commit f2b7c60b52b92685388c60055eea194f04afe334
Author: Mark Shields 
Date:   2016-03-03T04:49:53Z

[BEAM-90] TestCountingSource can throw on checkpointing






[jira] [Created] (BEAM-92) Data-dependent sinks

2016-03-03 Thread Eugene Kirpichov (JIRA)
Eugene Kirpichov created BEAM-92:


 Summary: Data-dependent sinks
 Key: BEAM-92
 URL: https://issues.apache.org/jira/browse/BEAM-92
 Project: Beam
  Issue Type: New Feature
Reporter: Eugene Kirpichov


Current sink API writes all data to a single destination, but there are many 
use cases where different pieces of data need to be routed to different 
destinations where the set of destinations is data-dependent (so can't be 
implemented with a Partition transform).

One internally discussed proposal was an API of the form:
{code}
PCollection<?> PCollection<T>.apply(
    Write.using(DoFn<T, SinkT> where,
                MapFn<SinkT, WriteOperation<T>> how)
{code}

so an item T gets written to a destination (or multiple destinations) 
determined by "where"; and the writing strategy is determined by "how" that 
produces a WriteOperation (current API - global init/write/global finalize 
hooks) for any given destination.

This API also has other benefits:
* allows the SinkT to be computed dynamically (in "where"), rather than 
specified at pipeline construction time
* removes the necessity for a Sink class entirely
* is sequenceable w.r.t. downstream transforms (you can stick transforms onto 
the returned PCollection, while the current Write.to() returns a PDone)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)