[GitHub] incubator-beam pull request: [BEAM-22] Return a map of CommittedBu...

2016-04-28 Thread tgroh
Github user tgroh closed the pull request at:

https://github.com/apache/incubator-beam/pull/249


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-beam pull request: [BEAM-22] Use CommittedResult in InMe...

2016-04-28 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/incubator-beam/pull/265

[BEAM-22] Use CommittedResult in InMemoryWatermarkManager

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

This enable unprocessed elements to be handled in the Watermark manager
after they are added to the CommittedResult structure.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/incubator-beam 
ippr_committed_result_imwm

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/265.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #265


commit 39eb869fad368b7e9a07bd55f82d87369dd8f956
Author: Thomas Groh 
Date:   2016-04-28T20:42:36Z

Use CommittedResult in InMemoryWatermarkManager

This enable unprocessed elements to be handled in the Watermark manager
after they are added to the CommittedResult structure.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Jenkins build is back to stable : beam_PostCommit_MavenVerify » Apache Beam :: Examples :: Java All #302

2016-04-28 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PostCommit_MavenVerify #303

2016-04-28 Thread Apache Jenkins Server
See 

Changes:

[tgroh] Add CommittedResult

--
[...truncated 821 lines...]
[INFO] Including com.google.http-client:google-http-client-protobuf:jar:1.21.0 
in the shaded jar.
[INFO] Including com.google.oauth-client:google-oauth-client-java6:jar:1.21.0 
in the shaded jar.
[INFO] Including com.google.oauth-client:google-oauth-client:jar:1.21.0 in the 
shaded jar.
[INFO] Including 
com.google.apis:google-api-services-datastore-protobuf:jar:v1beta2-rev1-4.0.0 
in the shaded jar.
[INFO] Including com.google.cloud.bigdataoss:gcsio:jar:1.4.3 in the shaded jar.
[INFO] Including com.google.api-client:google-api-client-java6:jar:1.20.0 in 
the shaded jar.
[INFO] Including com.google.api-client:google-api-client-jackson2:jar:1.20.0 in 
the shaded jar.
[INFO] Including com.google.cloud.bigdataoss:util:jar:1.4.3 in the shaded jar.
[INFO] Excluding com.google.guava:guava:jar:19.0 from the shaded jar.
[INFO] Including com.google.protobuf:protobuf-java:jar:3.0.0-beta-1 in the 
shaded jar.
[INFO] Including com.google.code.findbugs:jsr305:jar:3.0.1 in the shaded jar.
[INFO] Including com.fasterxml.jackson.core:jackson-core:jar:2.7.0 in the 
shaded jar.
[INFO] Including com.fasterxml.jackson.core:jackson-annotations:jar:2.7.0 in 
the shaded jar.
[INFO] Including com.fasterxml.jackson.core:jackson-databind:jar:2.7.0 in the 
shaded jar.
[INFO] Including org.slf4j:slf4j-api:jar:1.7.14 in the shaded jar.
[INFO] Including org.apache.avro:avro:jar:1.7.7 in the shaded jar.
[INFO] Including org.codehaus.jackson:jackson-core-asl:jar:1.9.13 in the shaded 
jar.
[INFO] Including org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13 in the 
shaded jar.
[INFO] Including com.thoughtworks.paranamer:paranamer:jar:2.3 in the shaded jar.
[INFO] Including org.xerial.snappy:snappy-java:jar:1.1.2.1 in the shaded jar.
[INFO] Including org.apache.commons:commons-compress:jar:1.9 in the shaded jar.
[INFO] Including joda-time:joda-time:jar:2.4 in the shaded jar.
[INFO] Including org.codehaus.woodstox:stax2-api:jar:3.1.4 in the shaded jar.
[INFO] Including org.codehaus.woodstox:woodstox-core-asl:jar:4.4.1 in the 
shaded jar.
[INFO] Including org.tukaani:xz:jar:1.5 in the shaded jar.
[INFO] Including com.google.auto.service:auto-service:jar:1.0-rc2 in the shaded 
jar.
[INFO] Including com.google.auto:auto-common:jar:0.3 in the shaded jar.
[WARNING] grpc-netty-0.12.0.jar, grpc-all-0.12.0.jar define 80 overlapping 
classes: 
[WARNING]   - io.grpc.netty.AbstractNettyHandler
[WARNING]   - io.grpc.netty.NettyClientTransport
[WARNING]   - io.grpc.netty.NettyClientStream$1
[WARNING]   - io.grpc.netty.SendResponseHeadersCommand
[WARNING]   - io.grpc.netty.NettyServer$2
[WARNING]   - io.grpc.netty.NettyClientHandler$4
[WARNING]   - io.grpc.netty.NettyClientTransport$3
[WARNING]   - io.grpc.netty.ProtocolNegotiators$1$1
[WARNING]   - io.grpc.netty.NettyClientHandler$FrameListener
[WARNING]   - io.grpc.netty.ProtocolNegotiators$1
[WARNING]   - 70 more...
[WARNING] grpc-protobuf-nano-0.12.0.jar, grpc-all-0.12.0.jar define 4 
overlapping classes: 
[WARNING]   - io.grpc.protobuf.nano.NanoProtoInputStream
[WARNING]   - io.grpc.protobuf.nano.NanoUtils
[WARNING]   - io.grpc.protobuf.nano.NanoUtils$1
[WARNING]   - io.grpc.protobuf.nano.MessageNanoFactory
[WARNING] grpc-all-0.12.0.jar, grpc-stub-0.12.0.jar define 29 overlapping 
classes: 
[WARNING]   - io.grpc.stub.ServerCalls$UnaryRequestMethod
[WARNING]   - io.grpc.stub.StreamObserver
[WARNING]   - io.grpc.stub.ClientCalls$CallToStreamObserverAdapter
[WARNING]   - io.grpc.stub.ServerCalls$EmptyServerCallListener
[WARNING]   - io.grpc.stub.MetadataUtils$2$1
[WARNING]   - io.grpc.stub.MetadataUtils$1$1
[WARNING]   - io.grpc.stub.MetadataUtils$2$1$1
[WARNING]   - io.grpc.stub.MetadataUtils
[WARNING]   - io.grpc.stub.ServerCalls$1
[WARNING]   - io.grpc.stub.ServerCalls$ClientStreamingMethod
[WARNING]   - 19 more...
[WARNING] grpc-protobuf-0.12.0.jar, grpc-all-0.12.0.jar define 4 overlapping 
classes: 
[WARNING]   - io.grpc.protobuf.ProtoUtils$2
[WARNING]   - io.grpc.protobuf.ProtoUtils
[WARNING]   - io.grpc.protobuf.ProtoUtils$1
[WARNING]   - io.grpc.protobuf.ProtoInputStream
[WARNING] grpc-okhttp-0.12.0.jar, grpc-all-0.12.0.jar define 76 overlapping 
classes: 
[WARNING]   - io.grpc.okhttp.OkHttpSettingsUtil
[WARNING]   - io.grpc.okhttp.NegotiationType
[WARNING]   - io.grpc.okhttp.AsyncFrameWriter$12
[WARNING]   - io.grpc.okhttp.OkHttpTlsUpgrader
[WARNING]   - io.grpc.okhttp.AsyncFrameWriter$WriteRunnable
[WARNING]   - io.grpc.okhttp.Utils
[WARNING]   - io.grpc.okhttp.OkHttpProtocolNegotiator$AndroidNegotiator
[WARNING]   - io.grpc.okhttp.OkHttpChannelBuilder$2
[WARNING]   - io.grpc.okhttp.internal.framed.Huffman$Node
[WARNING]   - io.grpc.okhttp.AsyncFrameWriter$7
[WARNING]   - 66 more...
[WARNING] grpc-auth-0.12.0.jar, grpc-all-0.12.0.jar define 2 overlapping 
classes: 
[WARNING]   - 

Build failed in Jenkins: beam_PostCommit_MavenVerify » Apache Beam :: SDKs :: Java :: Core #303

2016-04-28 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Add CommittedResult

--
[...truncated 552 lines...]

Tests run: 2769, Failures: 0, Errors: 0, Skipped: 3

[JENKINS] Recording test results
[INFO] 
[INFO] --- jacoco-maven-plugin:0.7.5.201505241946:report (report) @ 
java-sdk-all ---
[INFO] Analyzed bundle 'Apache Beam :: SDKs :: Java :: Core' with 1085 classes
[INFO] 
[INFO] --- maven-jar-plugin:2.5:jar (default-jar) @ java-sdk-all ---
[INFO] Building jar: 

[INFO] 
[INFO] --- maven-site-plugin:3.4:attach-descriptor (attach-descriptor) @ 
java-sdk-all ---
[INFO] 
[INFO] --- maven-jar-plugin:2.5:test-jar (default-test-jar) @ java-sdk-all ---
[INFO] Building jar: 

[INFO] 
[INFO] --- maven-javadoc-plugin:2.10.3:jar (default) @ java-sdk-all ---
[INFO] 
1 warning
[WARNING] Javadoc Warnings
[WARNING] 
:34:
 warning - Tag @link: can't find withAllowedLateness() in 
org.apache.beam.sdk.transforms.windowing.Window
[INFO] Building jar: 

[INFO] 
[INFO] --- maven-shade-plugin:2.4.1:shade (bundle-and-repackage) @ java-sdk-all 
---
[INFO] Excluding io.grpc:grpc-all:jar:0.12.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-core:jar:0.12.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-okhttp:jar:0.12.0 from the shaded jar.
[INFO] Excluding com.squareup.okio:okio:jar:1.6.0 from the shaded jar.
[INFO] Excluding com.squareup.okhttp:okhttp:jar:2.5.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-protobuf-nano:jar:0.12.0 from the shaded jar.
[INFO] Excluding com.google.protobuf.nano:protobuf-javanano:jar:3.0.0-alpha-4 
from the shaded jar.
[INFO] Excluding io.grpc:grpc-netty:jar:0.12.0 from the shaded jar.
[INFO] Excluding io.netty:netty-codec-http2:jar:4.1.0.Beta8 from the shaded jar.
[INFO] Excluding io.netty:netty-codec-http:jar:4.1.0.Beta8 from the shaded jar.
[INFO] Excluding com.twitter:hpack:jar:0.10.1 from the shaded jar.
[INFO] Excluding io.grpc:grpc-protobuf:jar:0.12.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-stub:jar:0.12.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-auth:jar:0.12.0 from the shaded jar.
[INFO] Excluding com.google.auth:google-auth-library-oauth2-http:jar:0.3.1 from 
the shaded jar.
[INFO] Excluding com.google.auth:google-auth-library-credentials:jar:0.3.1 from 
the shaded jar.
[INFO] Excluding io.netty:netty-handler:jar:4.1.0.Beta8 from the shaded jar.
[INFO] Excluding io.netty:netty-buffer:jar:4.1.0.Beta8 from the shaded jar.
[INFO] Excluding io.netty:netty-common:jar:4.1.0.Beta8 from the shaded jar.
[INFO] Excluding io.netty:netty-transport:jar:4.1.0.Beta8 from the shaded jar.
[INFO] Excluding io.netty:netty-resolver:jar:4.1.0.Beta8 from the shaded jar.
[INFO] Excluding io.netty:netty-codec:jar:4.1.0.Beta8 from the shaded jar.
[INFO] Excluding com.google.api.grpc:grpc-pubsub-v1:jar:0.0.2 from the shaded 
jar.
[INFO] Excluding com.google.api.grpc:grpc-core-proto:jar:0.0.3 from the shaded 
jar.
[INFO] Excluding com.google.api-client:google-api-client:jar:1.21.0 from the 
shaded jar.
[INFO] Excluding 
com.google.apis:google-api-services-bigquery:jar:v2-rev248-1.21.0 from the 
shaded jar.
[INFO] Excluding com.google.apis:google-api-services-pubsub:jar:v1-rev7-1.21.0 
from the shaded jar.
[INFO] Excluding 
com.google.apis:google-api-services-storage:jar:v1-rev53-1.21.0 from the shaded 
jar.
[INFO] Excluding com.google.http-client:google-http-client:jar:1.21.0 from the 
shaded jar.
[INFO] Excluding org.apache.httpcomponents:httpclient:jar:4.0.1 from the shaded 
jar.
[INFO] Excluding org.apache.httpcomponents:httpcore:jar:4.0.1 from the shaded 
jar.
[INFO] Excluding commons-logging:commons-logging:jar:1.1.1 from the shaded jar.
[INFO] Excluding commons-codec:commons-codec:jar:1.3 from the shaded jar.
[INFO] Excluding com.google.http-client:google-http-client-jackson:jar:1.21.0 
from the shaded jar.
[INFO] Excluding com.google.http-client:google-http-client-jackson2:jar:1.21.0 
from the shaded jar.
[INFO] Excluding com.google.http-client:google-http-client-protobuf:jar:1.21.0 
from the shaded jar.
[INFO] Excluding com.google.oauth-client:google-oauth-client-java6:jar:1.21.0 
from the shaded jar.
[INFO] Excluding com.google.oauth-client:google-oauth-client:jar:1.21.0 from 
the shaded jar.
[INFO] Excluding 

[jira] [Commented] (BEAM-22) DirectPipelineRunner: support for unbounded collections

2016-04-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-22?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263344#comment-15263344
 ] 

ASF GitHub Bot commented on BEAM-22:


Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/260


> DirectPipelineRunner: support for unbounded collections
> ---
>
> Key: BEAM-22
> URL: https://issues.apache.org/jira/browse/BEAM-22
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-direct
>Reporter: Davor Bonaci
>Assignee: Thomas Groh
>
> DirectPipelineRunner currently runs over bounded PCollections only, and 
> implements only a portion of the Beam Model.
> We should improve it to faithfully implement the full Beam Model, such as add 
> ability to run over unbounded PCollections, and better resemble execution 
> model in a distributed system.
> This further enables features such as a testing source which may simulate 
> late data and test triggers in the pipeline. Finally, we may want to expose 
> an option to select between "debug" (single threaded), "chaos monkey" (test 
> as many model requirements as possible), and "performance" (multi-threaded).
> more testing (chaos monkey) 
> Once this is done, we should update this StackOverflow question:
> http://stackoverflow.com/questions/35350113/testing-triggers-with-processing-time/35401426#35401426



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[2/2] incubator-beam git commit: This closes #260

2016-04-28 Thread kenn
This closes #260


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/4d5303d8
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/4d5303d8
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/4d5303d8

Branch: refs/heads/master
Commit: 4d5303d8a8cc175cecf3686347e7d7465f902a9e
Parents: baae901 2618aa6
Author: Kenneth Knowles 
Authored: Thu Apr 28 17:50:38 2016 -0700
Committer: Kenneth Knowles 
Committed: Thu Apr 28 17:50:38 2016 -0700

--
 .../sdk/runners/inprocess/CommittedResult.java  | 47 
 .../runners/inprocess/CompletionCallback.java   |  2 +-
 .../ExecutorServiceParallelExecutor.java| 16 ++--
 .../inprocess/InProcessEvaluationContext.java   |  4 +-
 .../runners/inprocess/TransformExecutor.java|  4 +-
 .../runners/inprocess/CommittedResultTest.java  | 78 
 .../InProcessEvaluationContextTest.java |  4 +-
 .../inprocess/TransformExecutorTest.java|  4 +-
 8 files changed, 142 insertions(+), 17 deletions(-)
--




[1/2] incubator-beam git commit: Add CommittedResult

2016-04-28 Thread kenn
Repository: incubator-beam
Updated Branches:
  refs/heads/master baae9013c -> 4d5303d8a


Add CommittedResult

Return as the output to InProcessEvaluationContext#handleResult(). This
allows a richer return type to improve possible behaviors when a result
is returned.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/2618aa68
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/2618aa68
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/2618aa68

Branch: refs/heads/master
Commit: 2618aa68ecad95f77a6a42f8eddfe63e511edf23
Parents: a9387fc
Author: Thomas Groh 
Authored: Thu Apr 28 12:22:47 2016 -0700
Committer: Thomas Groh 
Committed: Thu Apr 28 13:18:26 2016 -0700

--
 .../sdk/runners/inprocess/CommittedResult.java  | 47 
 .../runners/inprocess/CompletionCallback.java   |  2 +-
 .../ExecutorServiceParallelExecutor.java| 16 ++--
 .../inprocess/InProcessEvaluationContext.java   |  4 +-
 .../runners/inprocess/TransformExecutor.java|  4 +-
 .../runners/inprocess/CommittedResultTest.java  | 78 
 .../InProcessEvaluationContextTest.java |  4 +-
 .../inprocess/TransformExecutorTest.java|  4 +-
 8 files changed, 142 insertions(+), 17 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/2618aa68/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/CommittedResult.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/CommittedResult.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/CommittedResult.java
new file mode 100644
index 000..3ad0ae6
--- /dev/null
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/CommittedResult.java
@@ -0,0 +1,47 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied
+ *
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.sdk.runners.inprocess;
+
+import 
org.apache.beam.sdk.runners.inprocess.InProcessPipelineRunner.CommittedBundle;
+import org.apache.beam.sdk.transforms.AppliedPTransform;
+
+import com.google.auto.value.AutoValue;
+
+/**
+ * A {@link InProcessTransformResult} that has been committed.
+ */
+@AutoValue
+abstract class CommittedResult {
+  /**
+   * Returns the {@link AppliedPTransform} that produced this result.
+   */
+  public abstract AppliedPTransform getTransform();
+
+  /**
+   * Returns the outputs produced by the transform.
+   */
+  public abstract Iterable> getOutputs();
+
+  public static CommittedResult create(
+  InProcessTransformResult original, Iterable> outputs) {
+return new AutoValue_CommittedResult(original.getTransform(),
+outputs);
+  }
+}

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/2618aa68/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/CompletionCallback.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/CompletionCallback.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/CompletionCallback.java
index 90c488e..30a2b92 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/CompletionCallback.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/CompletionCallback.java
@@ -26,7 +26,7 @@ interface CompletionCallback {
   /**
* Handle a successful result, returning the committed outputs of the result.
*/
-  Iterable> handleResult(
+  CommittedResult handleResult(
   CommittedBundle inputBundle, InProcessTransformResult result);
 
   /**

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/2618aa68/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/ExecutorServiceParallelExecutor.java
--
diff --git 

[GitHub] incubator-beam pull request: [BEAM-22] Stop cloning coders in Enfo...

2016-04-28 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/261


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-22) DirectPipelineRunner: support for unbounded collections

2016-04-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-22?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263342#comment-15263342
 ] 

ASF GitHub Bot commented on BEAM-22:


Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/261


> DirectPipelineRunner: support for unbounded collections
> ---
>
> Key: BEAM-22
> URL: https://issues.apache.org/jira/browse/BEAM-22
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-direct
>Reporter: Davor Bonaci
>Assignee: Thomas Groh
>
> DirectPipelineRunner currently runs over bounded PCollections only, and 
> implements only a portion of the Beam Model.
> We should improve it to faithfully implement the full Beam Model, such as add 
> ability to run over unbounded PCollections, and better resemble execution 
> model in a distributed system.
> This further enables features such as a testing source which may simulate 
> late data and test triggers in the pipeline. Finally, we may want to expose 
> an option to select between "debug" (single threaded), "chaos monkey" (test 
> as many model requirements as possible), and "performance" (multi-threaded).
> more testing (chaos monkey) 
> Once this is done, we should update this StackOverflow question:
> http://stackoverflow.com/questions/35350113/testing-triggers-with-processing-time/35401426#35401426



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[2/2] incubator-beam git commit: This closes #261

2016-04-28 Thread kenn
This closes #261


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/baae9013
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/baae9013
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/baae9013

Branch: refs/heads/master
Commit: baae9013c44d7587619c9d5c6908ce8abbc0f627
Parents: 3b36a65 2a44214
Author: Kenneth Knowles 
Authored: Thu Apr 28 17:47:39 2016 -0700
Committer: Kenneth Knowles 
Committed: Thu Apr 28 17:47:39 2016 -0700

--
 .../sdk/runners/inprocess/EncodabilityEnforcementFactory.java | 3 +--
 .../sdk/runners/inprocess/ImmutabilityCheckingBundleFactory.java  | 3 +--
 .../sdk/runners/inprocess/ImmutabilityEnforcementFactory.java | 3 +--
 3 files changed, 3 insertions(+), 6 deletions(-)
--




[1/2] incubator-beam git commit: Stop cloning coders in the InProcessRunner

2016-04-28 Thread kenn
Repository: incubator-beam
Updated Branches:
  refs/heads/master 3b36a65b2 -> baae9013c


Stop cloning coders in the InProcessRunner

This is excessively slow and also not useful, as coders are required to
be thread-safe


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/2a44214e
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/2a44214e
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/2a44214e

Branch: refs/heads/master
Commit: 2a44214e4aef5a5d377378b9a0ac3ff3360acefc
Parents: a9387fc
Author: Thomas Groh 
Authored: Thu Apr 28 15:27:03 2016 -0700
Committer: Thomas Groh 
Committed: Thu Apr 28 15:53:43 2016 -0700

--
 .../sdk/runners/inprocess/EncodabilityEnforcementFactory.java | 3 +--
 .../sdk/runners/inprocess/ImmutabilityCheckingBundleFactory.java  | 3 +--
 .../sdk/runners/inprocess/ImmutabilityEnforcementFactory.java | 3 +--
 3 files changed, 3 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/2a44214e/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/EncodabilityEnforcementFactory.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/EncodabilityEnforcementFactory.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/EncodabilityEnforcementFactory.java
index 02a36cf..d234d4f 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/EncodabilityEnforcementFactory.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/EncodabilityEnforcementFactory.java
@@ -23,7 +23,6 @@ import org.apache.beam.sdk.coders.Coder;
 import 
org.apache.beam.sdk.runners.inprocess.InProcessPipelineRunner.CommittedBundle;
 import org.apache.beam.sdk.transforms.AppliedPTransform;
 import org.apache.beam.sdk.util.CoderUtils;
-import org.apache.beam.sdk.util.SerializableUtils;
 import org.apache.beam.sdk.util.UserCodeException;
 import org.apache.beam.sdk.util.WindowedValue;
 import org.apache.beam.sdk.values.PCollection;
@@ -47,7 +46,7 @@ class EncodabilityEnforcementFactory implements 
ModelEnforcementFactory {
 private Coder coder;
 
 public EncodabilityEnforcement(CommittedBundle input) {
-  coder = SerializableUtils.clone(input.getPCollection().getCoder());
+  coder = input.getPCollection().getCoder();
 }
 
 @Override

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/2a44214e/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/ImmutabilityCheckingBundleFactory.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/ImmutabilityCheckingBundleFactory.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/ImmutabilityCheckingBundleFactory.java
index 0852269..04ece1c 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/ImmutabilityCheckingBundleFactory.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/ImmutabilityCheckingBundleFactory.java
@@ -27,7 +27,6 @@ import org.apache.beam.sdk.transforms.DoFn;
 import org.apache.beam.sdk.util.IllegalMutationException;
 import org.apache.beam.sdk.util.MutationDetector;
 import org.apache.beam.sdk.util.MutationDetectors;
-import org.apache.beam.sdk.util.SerializableUtils;
 import org.apache.beam.sdk.util.UserCodeException;
 import org.apache.beam.sdk.util.WindowedValue;
 import org.apache.beam.sdk.values.PCollection;
@@ -87,7 +86,7 @@ class ImmutabilityCheckingBundleFactory implements 
BundleFactory {
 public ImmutabilityEnforcingBundle(UncommittedBundle underlying) {
   this.underlying = underlying;
   mutationDetectors = HashMultimap.create();
-  coder = SerializableUtils.clone(getPCollection().getCoder());
+  coder = getPCollection().getCoder();
 }
 
 @Override

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/2a44214e/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/ImmutabilityEnforcementFactory.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/ImmutabilityEnforcementFactory.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/ImmutabilityEnforcementFactory.java
index 028870a..2f21032 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/ImmutabilityEnforcementFactory.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/runners/inprocess/ImmutabilityEnforcementFactory.java
@@ -24,7 +24,6 @@ import 

[jira] [Commented] (BEAM-22) DirectPipelineRunner: support for unbounded collections

2016-04-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-22?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263336#comment-15263336
 ] 

ASF GitHub Bot commented on BEAM-22:


GitHub user tgroh opened a pull request:

https://github.com/apache/incubator-beam/pull/264

[BEAM-22] Correct sequence of pending element updates in 
InMemoryWatermarkManager

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

Adding additional pending elements/timers (and thus holds) always comes
before removing existing holds, so the watermark is never observed under
more stringent restrictions than currently exist.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/incubator-beam ippr_wm_pending_sequence

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/264.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #264


commit 1340e826631e83835e4d00c29a339b15931b7a94
Author: Thomas Groh 
Date:   2016-04-29T00:38:50Z

Correct sequence of pending element updates in InMemoryWatermarkManager

Adding additional pending elements/timers (and thus holds) always comes
before removing existing holds, so the watermark is never observed under
more stringent restrictions than currently exist.




> DirectPipelineRunner: support for unbounded collections
> ---
>
> Key: BEAM-22
> URL: https://issues.apache.org/jira/browse/BEAM-22
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-direct
>Reporter: Davor Bonaci
>Assignee: Thomas Groh
>
> DirectPipelineRunner currently runs over bounded PCollections only, and 
> implements only a portion of the Beam Model.
> We should improve it to faithfully implement the full Beam Model, such as add 
> ability to run over unbounded PCollections, and better resemble execution 
> model in a distributed system.
> This further enables features such as a testing source which may simulate 
> late data and test triggers in the pipeline. Finally, we may want to expose 
> an option to select between "debug" (single threaded), "chaos monkey" (test 
> as many model requirements as possible), and "performance" (multi-threaded).
> more testing (chaos monkey) 
> Once this is done, we should update this StackOverflow question:
> http://stackoverflow.com/questions/35350113/testing-triggers-with-processing-time/35401426#35401426



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request: [BEAM-22] Correct sequence of pending...

2016-04-28 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/incubator-beam/pull/264

[BEAM-22] Correct sequence of pending element updates in 
InMemoryWatermarkManager

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

Adding additional pending elements/timers (and thus holds) always comes
before removing existing holds, so the watermark is never observed under
more stringent restrictions than currently exist.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/incubator-beam ippr_wm_pending_sequence

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/264.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #264


commit 1340e826631e83835e4d00c29a339b15931b7a94
Author: Thomas Groh 
Date:   2016-04-29T00:38:50Z

Correct sequence of pending element updates in InMemoryWatermarkManager

Adding additional pending elements/timers (and thus holds) always comes
before removing existing holds, so the watermark is never observed under
more stringent restrictions than currently exist.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Jenkins build became unstable: beam_PostCommit_MavenVerify » Apache Beam :: Examples :: Java All #301

2016-04-28 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PostCommit_MavenVerify » Apache Beam :: Runners :: Spark #301

2016-04-28 Thread Apache Jenkins Server
See 


--
[...truncated 1173 lines...]
[INFO] Including com.google.api-client:google-api-client:jar:1.21.0 in the 
shaded jar.
[INFO] Including 
com.google.apis:google-api-services-bigquery:jar:v2-rev248-1.21.0 in the shaded 
jar.
[INFO] Including com.google.apis:google-api-services-pubsub:jar:v1-rev7-1.21.0 
in the shaded jar.
[INFO] Including 
com.google.apis:google-api-services-storage:jar:v1-rev53-1.21.0 in the shaded 
jar.
[INFO] Including com.google.http-client:google-http-client:jar:1.21.0 in the 
shaded jar.
[INFO] Including org.apache.httpcomponents:httpclient:jar:4.0.1 in the shaded 
jar.
[INFO] Including org.apache.httpcomponents:httpcore:jar:4.0.1 in the shaded jar.
[INFO] Including commons-logging:commons-logging:jar:1.1.1 in the shaded jar.
[INFO] Including com.google.http-client:google-http-client-jackson:jar:1.21.0 
in the shaded jar.
[INFO] Including com.google.http-client:google-http-client-jackson2:jar:1.21.0 
in the shaded jar.
[INFO] Including com.google.http-client:google-http-client-protobuf:jar:1.21.0 
in the shaded jar.
[INFO] Including com.google.oauth-client:google-oauth-client-java6:jar:1.21.0 
in the shaded jar.
[INFO] Including com.google.oauth-client:google-oauth-client:jar:1.21.0 in the 
shaded jar.
[INFO] Including 
com.google.apis:google-api-services-datastore-protobuf:jar:v1beta2-rev1-4.0.0 
in the shaded jar.
[INFO] Including com.google.cloud.bigdataoss:gcsio:jar:1.4.3 in the shaded jar.
[INFO] Including com.google.api-client:google-api-client-java6:jar:1.20.0 in 
the shaded jar.
[INFO] Including com.google.api-client:google-api-client-jackson2:jar:1.20.0 in 
the shaded jar.
[INFO] Including com.google.cloud.bigdataoss:util:jar:1.4.3 in the shaded jar.
[INFO] Including com.google.protobuf:protobuf-java:jar:3.0.0-beta-1 in the 
shaded jar.
[INFO] Including com.fasterxml.jackson.core:jackson-core:jar:2.7.0 in the 
shaded jar.
[INFO] Including com.fasterxml.jackson.core:jackson-annotations:jar:2.7.0 in 
the shaded jar.
[INFO] Including org.apache.avro:avro:jar:1.7.7 in the shaded jar.
[INFO] Including org.apache.commons:commons-compress:jar:1.9 in the shaded jar.
[INFO] Including joda-time:joda-time:jar:2.4 in the shaded jar.
[INFO] Including 
org.apache.beam:java-examples-all:jar:0.1.0-incubating-SNAPSHOT in the shaded 
jar.
[INFO] Including 
org.apache.beam:google-cloud-dataflow-java-runner:jar:0.1.0-incubating-SNAPSHOT 
in the shaded jar.
[INFO] Including 
com.google.apis:google-api-services-clouddebugger:jar:v2-rev6-1.21.0 in the 
shaded jar.
[INFO] Including 
com.google.apis:google-api-services-dataflow:jar:v1b3-rev22-1.21.0 in the 
shaded jar.
[INFO] Including javax.servlet:javax.servlet-api:jar:3.1.0 in the shaded jar.
[INFO] Including org.apache.avro:avro-mapred:jar:hadoop2:1.7.7 in the shaded 
jar.
[INFO] Including org.apache.avro:avro-ipc:jar:1.7.7 in the shaded jar.
[INFO] Including org.mortbay.jetty:jetty:jar:6.1.26 in the shaded jar.
[INFO] Including org.mortbay.jetty:jetty-util:jar:6.1.26 in the shaded jar.
[INFO] Including org.apache.velocity:velocity:jar:1.7 in the shaded jar.
[INFO] Including commons-collections:commons-collections:jar:3.2.1 in the 
shaded jar.
[INFO] Including org.apache.avro:avro-ipc:jar:tests:1.7.7 in the shaded jar.
[INFO] Including org.codehaus.jackson:jackson-core-asl:jar:1.9.13 in the shaded 
jar.
[INFO] Including org.codehaus.jackson:jackson-mapper-asl:jar:1.9.13 in the 
shaded jar.
[WARNING] google-cloud-dataflow-java-runner-0.1.0-incubating-SNAPSHOT.jar, 
java-sdk-all-0.1.0-incubating-SNAPSHOT.jar define 1717 overlapping classes: 
[WARNING]   - 
org.apache.beam.sdk.repackaged.com.google.common.collect.TreeRangeSet$ComplementRangesByLowerBound$2
[WARNING]   - 
org.apache.beam.sdk.repackaged.com.google.common.collect.WellBehavedMap$EntrySet$1$1
[WARNING]   - 
org.apache.beam.sdk.repackaged.com.google.common.util.concurrent.CycleDetectingLockFactory$Policies$1
[WARNING]   - org.apache.beam.sdk.repackaged.com.google.common.collect.Maps$6
[WARNING]   - 
org.apache.beam.sdk.repackaged.com.google.common.primitives.UnsignedBytes$LexicographicalComparatorHolder$UnsafeComparator$1
[WARNING]   - org.apache.beam.sdk.repackaged.com.google.common.collect.Range$1
[WARNING]   - 
org.apache.beam.sdk.repackaged.com.google.common.collect.Collections2$OrderedPermutationCollection
[WARNING]   - org.apache.beam.sdk.repackaged.com.google.common.base.Splitter$2
[WARNING]   - 
org.apache.beam.sdk.repackaged.com.google.common.base.Equivalence$Identity
[WARNING]   - org.apache.beam.sdk.repackaged.com.google.common.collect.Lists$1
[WARNING]   - 1707 more...
[WARNING] grpc-all-0.12.0.jar, grpc-netty-0.12.0.jar define 80 overlapping 
classes: 
[WARNING]   - io.grpc.netty.AbstractNettyHandler
[WARNING]   - io.grpc.netty.NettyClientTransport
[WARNING]   - io.grpc.netty.NettyClientStream$1
[WARNING]   - 

Build failed in Jenkins: beam_PostCommit_MavenVerify #301

2016-04-28 Thread Apache Jenkins Server
See 

--
[...truncated 7126 lines...]
[INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-archiver/1.0-alpha-7/plexus-archiver-1.0-alpha-7.jar
[INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/httpcomponents/httpclient/4.0.2/httpclient-4.0.2.jar
 (287 KB at 851.3 KB/sec)
[INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-utils/3.0.8/plexus-utils-3.0.8.jar
[INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/maven/maven-plugin-descriptor/2.0/maven-plugin-descriptor-2.0.jar
 (33 KB at 91.6 KB/sec)
[INFO] Downloaded: 
https://repo.maven.apache.org/maven2/classworlds/classworlds/1.1-alpha-2/classworlds-1.1-alpha-2.jar
 (37 KB at 99.8 KB/sec)
[INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/maven/maven-monitor/2.0/maven-monitor-2.0.jar
 (8 KB at 20.1 KB/sec)
[INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-utils/3.0.8/plexus-utils-3.0.8.jar
 (227 KB at 573.1 KB/sec)
[INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/codehaus/plexus/plexus-archiver/1.0-alpha-7/plexus-archiver-1.0-alpha-7.jar
 (139 KB at 345.9 KB/sec)
[INFO] 51 implicit excludes (use -debug for more details).
[INFO] Exclude: .travis.yml
[INFO] Exclude: **/*.conf
[INFO] Exclude: **/*.iml
[INFO] Exclude: **/*.md
[INFO] Exclude: **/*.txt
[INFO] Exclude: **/.project
[INFO] Exclude: **/.checkstyle
[INFO] Exclude: **/.classpath
[INFO] Exclude: **/.settings/
[INFO] Exclude: **/gen/**
[INFO] Exclude: **/resources/**
[INFO] Exclude: **/target/**
[INFO] Exclude: **/dependency-reduced-pom.xml
[INFO] Exclude: false
[INFO] 63 resources included (use -debug for more details)
[INFO] Rat check: Summary of files. Unapproved: 0 unknown: 0 generated: 0 
approved: 63 licence.
[INFO] 
[INFO] --- maven-install-plugin:2.5.2:install (default-install) @ spark-runner 
---
[INFO] Installing 

 to 

[INFO] Installing 

 to 

[WARNING] Cannot calculate digest of mojo class, because mojo wasn't loaded 
from a jar, but from: 

[INFO] 
[INFO] 
[INFO] Skipping Apache Beam :: Parent
[INFO] This project has been banned from the build due to previous failures.
[INFO] 
[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Beam :: Parent .. SUCCESS [ 15.644 s]
[INFO] Apache Beam :: SDKs  SUCCESS [  0.314 s]
[INFO] Apache Beam :: SDKs :: Java  SUCCESS [  0.312 s]
[INFO] Apache Beam :: SDKs :: Java :: Build Tools . SUCCESS [  3.887 s]
[INFO] Apache Beam :: SDKs :: Java :: Core  SUCCESS [02:39 min]
[INFO] Apache Beam :: SDKs :: Java :: IO .. SUCCESS [  0.348 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: Google Cloud Platform SUCCESS [  
5.380 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: HDFS .. SUCCESS [ 14.144 s]
[INFO] Apache Beam :: SDKs :: Java :: IO :: Kafka . SUCCESS [  6.640 s]
[INFO] Apache Beam :: SDKs :: Java :: Extensions .. SUCCESS [  0.377 s]
[INFO] Apache Beam :: SDKs :: Java :: Extensions :: Join library SUCCESS [  
4.960 s]
[INFO] Apache Beam :: SDKs :: Java :: Tests ... SUCCESS [  5.228 s]
[INFO] Apache Beam :: Runners . SUCCESS [  0.353 s]
[INFO] Apache Beam :: Runners :: Google Cloud Dataflow  SUCCESS [ 22.311 s]
[INFO] Apache Beam :: Examples :: Java All  SUCCESS [01:15 min]
[INFO] Apache Beam :: Runners :: Flink  SUCCESS [  1.518 s]
[INFO] Apache Beam :: Runners :: Flink :: Core  SUCCESS [ 38.580 s]
[INFO] Apache Beam :: Runners :: Flink :: Examples  SUCCESS [  2.384 s]
[INFO] Apache Beam :: Runners :: Spark  FAILURE [02:29 min]
[INFO] Apache Beam :: SDKs :: Java :: Maven Archetypes  SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: Maven Archetypes :: 

[GitHub] incubator-beam pull request: Encapsulate cloning behavior of in-pr...

2016-04-28 Thread kennknowles
GitHub user kennknowles opened a pull request:

https://github.com/apache/incubator-beam/pull/263

Encapsulate cloning behavior of in-process ParDo evaluator

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

This will make way for using the evaluator in contexts where cloning
is not appropriate, such as the evaluator for `GroupAlsoByWindow`

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/incubator-beam ParDo-Supplier

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/263.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #263


commit 8629b00a1670e1de4b96ba06fab7bb5347c81d9c
Author: Kenneth Knowles 
Date:   2016-04-28T23:12:21Z

Encapsulate cloning behavior of in-process ParDo evaluator

This will make way for using the evluator in contexts where cloning
is not appropriate, such as evaluator GroupAlsoByWindow




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-beam pull request: Create build-tools module with consol...

2016-04-28 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/246


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] incubator-beam git commit: Consolidate checkstyle configuration in new 'build-tools' module

2016-04-28 Thread lcwik
Repository: incubator-beam
Updated Branches:
  refs/heads/master e2d5c691e -> 3b36a65b2


Consolidate checkstyle configuration in new 'build-tools' module


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/e6b42c49
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/e6b42c49
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/e6b42c49

Branch: refs/heads/master
Commit: e6b42c4928659c80b58a716bdb32d867ed34f025
Parents: e2d5c69
Author: Scott Wegner 
Authored: Tue Apr 26 10:49:26 2016 -0700
Committer: Scott Wegner 
Committed: Thu Apr 28 16:22:26 2016 -0700

--
 examples/java/pom.xml   |  22 -
 examples/java8/pom.xml  |  22 -
 .../complete/game/utils/WriteToBigQuery.java|   2 +-
 .../game/utils/WriteWindowedToBigQuery.java |   2 +-
 pom.xml |  33 ++
 runners/google-cloud-dataflow-java/pom.xml  |  23 -
 runners/spark/build-resources/header-file.txt   |  17 -
 runners/spark/pom.xml   |  12 +-
 sdks/java/build-tools/pom.xml   |  19 +
 .../src/main/resources/beam/checkstyle.xml  | 398 ++
 .../src/main/resources/beam/header-file.txt |  17 +
 sdks/java/checkstyle.xml| 417 ---
 sdks/java/core/pom.xml  |  23 +-
 .../apache/beam/sdk/io/CountingInputTest.java   |   1 -
 sdks/java/extensions/join-library/pom.xml   |  15 -
 sdks/java/io/kafka/pom.xml  |  20 -
 .../apache/beam/sdk/io/kafka/KafkaIOTest.java   |   1 -
 sdks/java/java8tests/pom.xml|  22 -
 sdks/java/pom.xml   |   1 +
 19 files changed, 473 insertions(+), 594 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/e6b42c49/examples/java/pom.xml
--
diff --git a/examples/java/pom.xml b/examples/java/pom.xml
index 9626bf3..342986f 100644
--- a/examples/java/pom.xml
+++ b/examples/java/pom.xml
@@ -54,28 +54,6 @@
   
 org.apache.maven.plugins
 maven-checkstyle-plugin
-2.12
-
-  
-com.puppycrawl.tools
-checkstyle
-6.6
-  
-
-
-  ../../sdks/java/checkstyle.xml
-  true
-  true
-  true
-  false
-
-
-  
-
-  check
-
-  
-
   
 
   

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/e6b42c49/examples/java8/pom.xml
--
diff --git a/examples/java8/pom.xml b/examples/java8/pom.xml
index fe10dbc..95af76a 100644
--- a/examples/java8/pom.xml
+++ b/examples/java8/pom.xml
@@ -65,28 +65,6 @@
   
 org.apache.maven.plugins
 maven-checkstyle-plugin
-2.12
-
-  
-com.puppycrawl.tools
-checkstyle
-6.6
-  
-
-
-  ../../sdks/java/checkstyle.xml
-  true
-  true
-  true
-  false
-
-
-  
-
-  check
-
-  
-
   
 
   

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/e6b42c49/examples/java8/src/main/java/org/apache/beam/examples/complete/game/utils/WriteToBigQuery.java
--
diff --git 
a/examples/java8/src/main/java/org/apache/beam/examples/complete/game/utils/WriteToBigQuery.java
 
b/examples/java8/src/main/java/org/apache/beam/examples/complete/game/utils/WriteToBigQuery.java
index f0b3afa..5897e44 100644
--- 
a/examples/java8/src/main/java/org/apache/beam/examples/complete/game/utils/WriteToBigQuery.java
+++ 
b/examples/java8/src/main/java/org/apache/beam/examples/complete/game/utils/WriteToBigQuery.java
@@ -1,4 +1,4 @@
-  /*
+/*
  * Licensed to the Apache Software Foundation (ASF) under one
  * or more contributor license agreements.  See the NOTICE file
  * distributed with this work for additional information

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/e6b42c49/examples/java8/src/main/java/org/apache/beam/examples/complete/game/utils/WriteWindowedToBigQuery.java
--
diff --git 
a/examples/java8/src/main/java/org/apache/beam/examples/complete/game/utils/WriteWindowedToBigQuery.java
 
b/examples/java8/src/main/java/org/apache/beam/examples/complete/game/utils/WriteWindowedToBigQuery.java
index 540c366..27697db 100644
--- 

[2/2] incubator-beam git commit: This closes #246

2016-04-28 Thread lcwik
This closes #246


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/3b36a65b
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/3b36a65b
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/3b36a65b

Branch: refs/heads/master
Commit: 3b36a65b2887bc814556e6126f969c8f533a7521
Parents: e2d5c69 e6b42c4
Author: Luke Cwik 
Authored: Thu Apr 28 16:39:43 2016 -0700
Committer: Luke Cwik 
Committed: Thu Apr 28 16:39:43 2016 -0700

--
 examples/java/pom.xml   |  22 -
 examples/java8/pom.xml  |  22 -
 .../complete/game/utils/WriteToBigQuery.java|   2 +-
 .../game/utils/WriteWindowedToBigQuery.java |   2 +-
 pom.xml |  33 ++
 runners/google-cloud-dataflow-java/pom.xml  |  23 -
 runners/spark/build-resources/header-file.txt   |  17 -
 runners/spark/pom.xml   |  12 +-
 sdks/java/build-tools/pom.xml   |  19 +
 .../src/main/resources/beam/checkstyle.xml  | 398 ++
 .../src/main/resources/beam/header-file.txt |  17 +
 sdks/java/checkstyle.xml| 417 ---
 sdks/java/core/pom.xml  |  23 +-
 .../apache/beam/sdk/io/CountingInputTest.java   |   1 -
 sdks/java/extensions/join-library/pom.xml   |  15 -
 sdks/java/io/kafka/pom.xml  |  20 -
 .../apache/beam/sdk/io/kafka/KafkaIOTest.java   |   1 -
 sdks/java/java8tests/pom.xml|  22 -
 sdks/java/pom.xml   |   1 +
 19 files changed, 473 insertions(+), 594 deletions(-)
--




[GitHub] incubator-beam pull request: [BEAM-115] Make in-process GroupByKey...

2016-04-28 Thread kennknowles
Github user kennknowles closed the pull request at:

https://github.com/apache/incubator-beam/pull/262


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-115) Beam Runner API

2016-04-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263240#comment-15263240
 ] 

ASF GitHub Bot commented on BEAM-115:
-

GitHub user kennknowles opened a pull request:

https://github.com/apache/incubator-beam/pull/262

[BEAM-115] Make in-process GroupByKey consistent with Beam model

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---
The commits in this PR stand individual but have strong dependencies. They 
each build towards making the `InProcessPipelineRunner` correspond to the 
intended runner API / Beam model.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/incubator-beam InProcessGroupByKey

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/262.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #262


commit 6f3eeb4fada9fa72763980f26af8949141dbbe51
Author: Kenneth Knowles 
Date:   2016-04-28T22:50:32Z

Add WindowMatchers.isWindowedValue()

commit 34ec15d6a5923287f4db0db63083c37b87c030b7
Author: Kenneth Knowles 
Date:   2016-04-28T22:51:40Z

Add accessors for sub-coders of KeyedWorkItemCoder

commit 91643088f4032898cf67973b032d86a528eca199
Author: Kenneth Knowles 
Date:   2016-04-28T23:12:21Z

Encapsulate cloning behavior of in-process ParDo evaluator

This will make way for using the evluator in contexts where cloning
is not appropriate, such as evaluator GroupAlsoByWindow

commit 753787ff0eb10c524f336e9af837ed442f005121
Author: Kenneth Knowles 
Date:   2016-04-28T23:13:24Z

Make in-process GroupByKey respect future Beam model

This introduces top-level classes:

 - InProcessGroupByKey, which expands like GroupByKeyViaGroupByKeyOnly
   but with different intermediate PCollection types.
 - InProcessGroupByKeyOnly, which outputs KeyedWorkItem. This existed
   already under a different name.
 - InProcessGroupAlsoByWindow, which is evaluated directly and
   accepts input elements of type KeyedWorkItem.




> Beam Runner API
> ---
>
> Key: BEAM-115
> URL: https://issues.apache.org/jira/browse/BEAM-115
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>
> The PipelineRunner API from the SDK is not ideal for the Beam technical 
> vision.
> It has technical limitations:
>  - The user's DAG (even including library expansions) is never explicitly 
> represented, so it cannot be analyzed except incrementally, and cannot 
> necessarily be reconstructed (for example, to display it!).
>  - The flattened DAG of just primitive transforms isn't well-suited for 
> display or transform override.
>  - The TransformHierarchy isn't well-suited for optimizations.
>  - The user must realistically pre-commit to a runner, and its configuration 
> (batch vs streaming) prior to graph construction, since the runner will be 
> modifying the graph as it is built.
>  - It is fairly language- and SDK-specific.
> It has usability issues (these are not from intuition, but derived from 
> actual cases of failure to use according to the design)
>  - The interleaving of apply() methods in PTransform/Pipeline/PipelineRunner 
> is confusing.
>  - The TransformHierarchy, accessible only via visitor traversals, is 
> cumbersome.
>  - The staging of construction-time vs run-time is not always obvious.
> These are just examples. This ticket tracks designing, coming to consensus, 
> and building an API that more simply and directly supports the technical 
> vision.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request: [BEAM-115] Make in-process GroupByKey...

2016-04-28 Thread kennknowles
GitHub user kennknowles opened a pull request:

https://github.com/apache/incubator-beam/pull/262

[BEAM-115] Make in-process GroupByKey consistent with Beam model

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---
The commits in this PR stand individual but have strong dependencies. They 
each build towards making the `InProcessPipelineRunner` correspond to the 
intended runner API / Beam model.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/incubator-beam InProcessGroupByKey

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/262.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #262


commit 6f3eeb4fada9fa72763980f26af8949141dbbe51
Author: Kenneth Knowles 
Date:   2016-04-28T22:50:32Z

Add WindowMatchers.isWindowedValue()

commit 34ec15d6a5923287f4db0db63083c37b87c030b7
Author: Kenneth Knowles 
Date:   2016-04-28T22:51:40Z

Add accessors for sub-coders of KeyedWorkItemCoder

commit 91643088f4032898cf67973b032d86a528eca199
Author: Kenneth Knowles 
Date:   2016-04-28T23:12:21Z

Encapsulate cloning behavior of in-process ParDo evaluator

This will make way for using the evluator in contexts where cloning
is not appropriate, such as evaluator GroupAlsoByWindow

commit 753787ff0eb10c524f336e9af837ed442f005121
Author: Kenneth Knowles 
Date:   2016-04-28T23:13:24Z

Make in-process GroupByKey respect future Beam model

This introduces top-level classes:

 - InProcessGroupByKey, which expands like GroupByKeyViaGroupByKeyOnly
   but with different intermediate PCollection types.
 - InProcessGroupByKeyOnly, which outputs KeyedWorkItem. This existed
   already under a different name.
 - InProcessGroupAlsoByWindow, which is evaluated directly and
   accepts input elements of type KeyedWorkItem.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-115) Beam Runner API

2016-04-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263214#comment-15263214
 ] 

ASF GitHub Bot commented on BEAM-115:
-

Github user kennknowles closed the pull request at:

https://github.com/apache/incubator-beam/pull/255


> Beam Runner API
> ---
>
> Key: BEAM-115
> URL: https://issues.apache.org/jira/browse/BEAM-115
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>
> The PipelineRunner API from the SDK is not ideal for the Beam technical 
> vision.
> It has technical limitations:
>  - The user's DAG (even including library expansions) is never explicitly 
> represented, so it cannot be analyzed except incrementally, and cannot 
> necessarily be reconstructed (for example, to display it!).
>  - The flattened DAG of just primitive transforms isn't well-suited for 
> display or transform override.
>  - The TransformHierarchy isn't well-suited for optimizations.
>  - The user must realistically pre-commit to a runner, and its configuration 
> (batch vs streaming) prior to graph construction, since the runner will be 
> modifying the graph as it is built.
>  - It is fairly language- and SDK-specific.
> It has usability issues (these are not from intuition, but derived from 
> actual cases of failure to use according to the design)
>  - The interleaving of apply() methods in PTransform/Pipeline/PipelineRunner 
> is confusing.
>  - The TransformHierarchy, accessible only via visitor traversals, is 
> cumbersome.
>  - The staging of construction-time vs run-time is not always obvious.
> These are just examples. This ticket tracks designing, coming to consensus, 
> and building an API that more simply and directly supports the technical 
> vision.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request: [BEAM-115] Rename GroupByKeyEvaluator...

2016-04-28 Thread kennknowles
Github user kennknowles closed the pull request at:

https://github.com/apache/incubator-beam/pull/255


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-22) DirectPipelineRunner: support for unbounded collections

2016-04-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-22?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263163#comment-15263163
 ] 

ASF GitHub Bot commented on BEAM-22:


GitHub user tgroh opened a pull request:

https://github.com/apache/incubator-beam/pull/261

[BEAM-22] Stop cloning coders in Enforcements

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

This is excessively slow and also not useful, as coders are required to
be thread-safe

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/incubator-beam ippr_stop_cloning_coders

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/261.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #261


commit 2f2584c2d82d8739e27b3d99125bd5ee2a3444fc
Author: Thomas Groh 
Date:   2016-04-28T22:27:03Z

Stop cloning coders in Enforcements

This is excessively slow and also not useful, as coders are required to
be thread-safe




> DirectPipelineRunner: support for unbounded collections
> ---
>
> Key: BEAM-22
> URL: https://issues.apache.org/jira/browse/BEAM-22
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-direct
>Reporter: Davor Bonaci
>Assignee: Thomas Groh
>
> DirectPipelineRunner currently runs over bounded PCollections only, and 
> implements only a portion of the Beam Model.
> We should improve it to faithfully implement the full Beam Model, such as add 
> ability to run over unbounded PCollections, and better resemble execution 
> model in a distributed system.
> This further enables features such as a testing source which may simulate 
> late data and test triggers in the pipeline. Finally, we may want to expose 
> an option to select between "debug" (single threaded), "chaos monkey" (test 
> as many model requirements as possible), and "performance" (multi-threaded).
> more testing (chaos monkey) 
> Once this is done, we should update this StackOverflow question:
> http://stackoverflow.com/questions/35350113/testing-triggers-with-processing-time/35401426#35401426



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request: [BEAM-22] Stop cloning coders in Enfo...

2016-04-28 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/incubator-beam/pull/261

[BEAM-22] Stop cloning coders in Enforcements

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

This is excessively slow and also not useful, as coders are required to
be thread-safe

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/incubator-beam ippr_stop_cloning_coders

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/261.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #261


commit 2f2584c2d82d8739e27b3d99125bd5ee2a3444fc
Author: Thomas Groh 
Date:   2016-04-28T22:27:03Z

Stop cloning coders in Enforcements

This is excessively slow and also not useful, as coders are required to
be thread-safe




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-31) When triggers are changed via pipeline update, stale finished triggers data applied

2016-04-28 Thread Mark Shields (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-31?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263083#comment-15263083
 ] 

Mark Shields commented on BEAM-31:
--

A customer hit this issue and it lead to catastrophic data loss: the old 
pipeline was killed by --update, the new pipeline came up but with inconsistent 
interpretation of trigger state, eventually the new pipeline threw. Even 
reverting the trigger code and --updating would not  work because the pipeline 
was stuck in work item retries and had partially proceeded with corrupted 
trigger state. 

> When triggers are changed via pipeline update, stale finished triggers data 
> applied
> ---
>
> Key: BEAM-31
> URL: https://issues.apache.org/jira/browse/BEAM-31
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Kenneth Knowles
>  Labels: Triggers, Update
>
> TriggerRunner tracks which trigger subexpresions are finished. When the 
> trigger expession is updated via pipeline update, this data is applied 
> arbitrarily to the new trigger expression.
> Implementation note: trigger subexpressions are identified by number, and 
> their finished state stored in a bit set.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-122) GlobalWindow and allowedLateness can cause inconsistent timer interpretation

2016-04-28 Thread Mark Shields (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263075#comment-15263075
 ] 

Mark Shields commented on BEAM-122:
---

It turned out to really want to live with pull/139. Thx for reminding me to 
cross ref.

> GlobalWindow and allowedLateness can cause inconsistent timer interpretation 
> -
>
> Key: BEAM-122
> URL: https://issues.apache.org/jira/browse/BEAM-122
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Mark Shields
>Assignee: Mark Shields
>
> In ReduceFnRunner we have code such as
>window.getMaxTimestamp().plus(windowingStrategy.getAllowedLateness())
> If window is global then maxTimestamp will be 
> BoundedWindow.TIMESTAMP_MAX_VALUE.
> Meanwhile, timestamps beyond BoundedWindow.TIMESTAMP_MAX_VALUE will be 
> clipped in most runners.
> This could cause the time of an expected timer (eg for garbage collection) to 
> not match the actual time of a fired timer.
> We should either make non-zero allowedLateness on the Global window illegal 
> (probably obnoxious) or explicitly clip it to zero.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (BEAM-237) Should use allowedLateness from downstream computations in ReduceFnRunner

2016-04-28 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-237:
-
Assignee: (was: Frances Perry)

> Should use allowedLateness from downstream computations in ReduceFnRunner 
> --
>
> Key: BEAM-237
> URL: https://issues.apache.org/jira/browse/BEAM-237
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Mark Shields
>
> Much of the reasoning about holds, late data and final panes in 
> ReduceFnRunner assume the current getAllowedLateness is an upper bound of the 
> getAllowedLateness of all downstream computations.
> There is currently no test that this is indeed the case.
> It may be much simpler (for us and our users) to have a global allowed 
> lateness setting.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-239) Suggest Changing RemoveDuplicates to Distinct

2016-04-28 Thread Jesse Anderson (JIRA)
Jesse Anderson created BEAM-239:
---

 Summary: Suggest Changing RemoveDuplicates to Distinct
 Key: BEAM-239
 URL: https://issues.apache.org/jira/browse/BEAM-239
 Project: Beam
  Issue Type: Wish
  Components: sdk-java-core
Reporter: Jesse Anderson
Assignee: Davor Bonaci
Priority: Minor


I had a really tough time finding this transform in the docs. I suggest 
changing this class' name to Distinct instead of RemoveDuplicates. At the very 
least, the JavaDoc for RemoveDuplicates should have the word distinct in it to 
make this more findable/searchable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request: Change Counter.getName() to return Co...

2016-04-28 Thread peihe
Github user peihe closed the pull request at:

https://github.com/apache/incubator-beam/pull/253


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (BEAM-238) Add link URLs for sources / sinks

2016-04-28 Thread Scott Wegner (JIRA)
Scott Wegner created BEAM-238:
-

 Summary: Add link URLs for sources / sinks
 Key: BEAM-238
 URL: https://issues.apache.org/jira/browse/BEAM-238
 Project: Beam
  Issue Type: Bug
Reporter: Scott Wegner


Where applicable, annotate sources/sink display data with link urls.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-22) DirectPipelineRunner: support for unbounded collections

2016-04-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-22?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262824#comment-15262824
 ] 

ASF GitHub Bot commented on BEAM-22:


GitHub user tgroh opened a pull request:

https://github.com/apache/incubator-beam/pull/260

[BEAM-22] Add CommittedResult

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

Return as the output to InProcessEvaluationContext#handleResult(). This
allows a richer return type to improve possible behaviors when a result
is returned.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/incubator-beam 
ippr_committed_result_as_handle_result

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/260.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #260


commit 13da09d9f1b4a6498cfde8067d31e96f5817ca74
Author: Thomas Groh 
Date:   2016-04-28T19:22:47Z

Add CommittedResult

Return as the output to InProcessEvaluationContext#handleResult(). This
allows a richer return type to improve possible behaviors when a result
is returned.




> DirectPipelineRunner: support for unbounded collections
> ---
>
> Key: BEAM-22
> URL: https://issues.apache.org/jira/browse/BEAM-22
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-direct
>Reporter: Davor Bonaci
>Assignee: Thomas Groh
>
> DirectPipelineRunner currently runs over bounded PCollections only, and 
> implements only a portion of the Beam Model.
> We should improve it to faithfully implement the full Beam Model, such as add 
> ability to run over unbounded PCollections, and better resemble execution 
> model in a distributed system.
> This further enables features such as a testing source which may simulate 
> late data and test triggers in the pipeline. Finally, we may want to expose 
> an option to select between "debug" (single threaded), "chaos monkey" (test 
> as many model requirements as possible), and "performance" (multi-threaded).
> more testing (chaos monkey) 
> Once this is done, we should update this StackOverflow question:
> http://stackoverflow.com/questions/35350113/testing-triggers-with-processing-time/35401426#35401426



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-122) GlobalWindow and allowedLateness can cause inconsistent timer interpretation

2016-04-28 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262731#comment-15262731
 ] 

Kenneth Knowles commented on BEAM-122:
--

You have done this, yes? Or is it in a PR I just saw that isn't in yet?

> GlobalWindow and allowedLateness can cause inconsistent timer interpretation 
> -
>
> Key: BEAM-122
> URL: https://issues.apache.org/jira/browse/BEAM-122
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Mark Shields
>Assignee: Mark Shields
>
> In ReduceFnRunner we have code such as
>window.getMaxTimestamp().plus(windowingStrategy.getAllowedLateness())
> If window is global then maxTimestamp will be 
> BoundedWindow.TIMESTAMP_MAX_VALUE.
> Meanwhile, timestamps beyond BoundedWindow.TIMESTAMP_MAX_VALUE will be 
> clipped in most runners.
> This could cause the time of an expected timer (eg for garbage collection) to 
> not match the actual time of a fired timer.
> We should either make non-zero allowedLateness on the Global window illegal 
> (probably obnoxious) or explicitly clip it to zero.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (BEAM-237) Should use allowedLateness from downstream computations in ReduceFnRunner

2016-04-28 Thread Mark Shields (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Shields updated BEAM-237:
--
Description: 
Much of the reasoning about holds, late data and final panes in ReduceFnRunner 
assume the current getAllowedLateness is an upper bound of the 
getAllowedLateness of all downstream computations.

There is currently no test that this is indeed the case.

It may be much simpler (for us and our users) to have a global allowed lateness 
setting.

  was:
Much of the reasoning about holds, late data and final panes in ReduceFnRunner 
assuming the current getAllowedLateness is an upper bound of the 
getAllowedLateness of all downstream computations.

There is currently no test that this is indeed the case.

It may be much simpler (for us and our users) to have a global allowed lateness 
setting.


> Should use allowedLateness from downstream computations in ReduceFnRunner 
> --
>
> Key: BEAM-237
> URL: https://issues.apache.org/jira/browse/BEAM-237
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Mark Shields
>Assignee: Frances Perry
>
> Much of the reasoning about holds, late data and final panes in 
> ReduceFnRunner assume the current getAllowedLateness is an upper bound of the 
> getAllowedLateness of all downstream computations.
> There is currently no test that this is indeed the case.
> It may be much simpler (for us and our users) to have a global allowed 
> lateness setting.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-237) Should use allowedLateness from downstream computations in ReduceFnRunner

2016-04-28 Thread Mark Shields (JIRA)
Mark Shields created BEAM-237:
-

 Summary: Should use allowedLateness from downstream computations 
in ReduceFnRunner 
 Key: BEAM-237
 URL: https://issues.apache.org/jira/browse/BEAM-237
 Project: Beam
  Issue Type: Bug
  Components: runner-core
Reporter: Mark Shields
Assignee: Frances Perry


Much of the reasoning about holds, late data and final panes in ReduceFnRunner 
assuming the current getAllowedLateness is an upper bound of the 
getAllowedLateness of all downstream computations.

There is currently no test that this is indeed the case.

It may be much simpler (for us and our users) to have a global allowed lateness 
setting.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-231) Remove ClassForDisplay infrastructure class.

2016-04-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262394#comment-15262394
 ] 

ASF GitHub Bot commented on BEAM-231:
-

GitHub user swegner opened a pull request:

https://github.com/apache/incubator-beam/pull/259

[BEAM-231] Remove ClassForDisplay helper type

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/swegner/incubator-beam classfordisplay

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/259.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #259


commit d1513d512d7a82208c6cfb361996a21235348662
Author: Scott Wegner 
Date:   2016-04-27T20:08:49Z

Refactor Combine display data to not use ClassForDisplay

commit 39c8e5fd3aea8f03884b38301520f3b5dba1f72b
Author: Scott Wegner 
Date:   2016-04-27T20:15:33Z

Refactor DoFnReflector.SimpleDoFnAdapter to not use display data namespace 
override

commit d6f2a453803f25a8670d76ef46bfb39956f3a82b
Author: Scott Wegner 
Date:   2016-04-27T20:25:12Z

Remove ClassForDisplay helper type

commit 16e8d5a2b1901f0342e0bcc1b4d8736a0c8e337b
Author: Scott Wegner 
Date:   2016-04-27T22:00:03Z

Add test case for display data multi-level namespace overrides




> Remove ClassForDisplay infrastructure class.
> 
>
> Key: BEAM-231
> URL: https://issues.apache.org/jira/browse/BEAM-231
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Scott Wegner
>
> See discussion here: 
> https://github.com/apache/incubator-beam/pull/247#discussion-diff-61184975
> This class should no longer be needed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request: [BEAM-231] Remove ClassForDisplay hel...

2016-04-28 Thread swegner
GitHub user swegner opened a pull request:

https://github.com/apache/incubator-beam/pull/259

[BEAM-231] Remove ClassForDisplay helper type

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/swegner/incubator-beam classfordisplay

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/259.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #259


commit d1513d512d7a82208c6cfb361996a21235348662
Author: Scott Wegner 
Date:   2016-04-27T20:08:49Z

Refactor Combine display data to not use ClassForDisplay

commit 39c8e5fd3aea8f03884b38301520f3b5dba1f72b
Author: Scott Wegner 
Date:   2016-04-27T20:15:33Z

Refactor DoFnReflector.SimpleDoFnAdapter to not use display data namespace 
override

commit d6f2a453803f25a8670d76ef46bfb39956f3a82b
Author: Scott Wegner 
Date:   2016-04-27T20:25:12Z

Remove ClassForDisplay helper type

commit 16e8d5a2b1901f0342e0bcc1b4d8736a0c8e337b
Author: Scott Wegner 
Date:   2016-04-27T22:00:03Z

Add test case for display data multi-level namespace overrides




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (BEAM-236) Implement Windowing in batch execution

2016-04-28 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-236:

Assignee: (was: Maximilian Michels)

> Implement Windowing in batch execution
> --
>
> Key: BEAM-236
> URL: https://issues.apache.org/jira/browse/BEAM-236
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>
> Windows need to be handled correctly in the batched execution of the Flink 
> Runner.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-236) Implement Windowing in batch execution

2016-04-28 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15262122#comment-15262122
 ] 

Aljoscha Krettek commented on BEAM-236:
---

By the way, don't forget to correctly handle windows for side inputs. :-)

> Implement Windowing in batch execution
> --
>
> Key: BEAM-236
> URL: https://issues.apache.org/jira/browse/BEAM-236
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>
> Windows need to be handled correctly in the batched execution of the Flink 
> Runner.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-236) Implement Windowing in batch execution

2016-04-28 Thread Maximilian Michels (JIRA)
Maximilian Michels created BEAM-236:
---

 Summary: Implement Windowing in batch execution
 Key: BEAM-236
 URL: https://issues.apache.org/jira/browse/BEAM-236
 Project: Beam
  Issue Type: Bug
  Components: runner-flink
Reporter: Maximilian Michels
Assignee: Maximilian Michels


Windows need to be handled correctly in the batched execution of the Flink 
Runner.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-235) Remove streaming flag

2016-04-28 Thread Maximilian Michels (JIRA)
Maximilian Michels created BEAM-235:
---

 Summary: Remove streaming flag
 Key: BEAM-235
 URL: https://issues.apache.org/jira/browse/BEAM-235
 Project: Beam
  Issue Type: Improvement
  Components: runner-flink
Reporter: Maximilian Michels


The streaming flag of the Flink Runner should be removed in favor of an 
automated decision during translation to execute using either batch or 
streaming API. For example, if the user specified any unbounded sources, the 
program could be executed using the Flink streaming API. Otherwise, batch could 
be chosen.

To do that, we need to get the batch execution on par with the streaming 
execution. The batch part misses windowing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)