Jenkins build is still unstable: beam_PostCommit_RunnableOnService_GoogleCloudDataflow #717

2016-07-08 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: beam_PostCommit_RunnableOnService_GoogleCloudDataflow #716

2016-07-08 Thread Apache Jenkins Server
See 




[jira] [Closed] (BEAM-389) DelegateCoder needs equals and hashCode

2016-07-08 Thread Pei He (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pei He closed BEAM-389.
---
   Resolution: Fixed
Fix Version/s: 0.2.0-incubating

> DelegateCoder needs equals and hashCode
> ---
>
> Key: BEAM-389
> URL: https://issues.apache.org/jira/browse/BEAM-389
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Pei He
>Assignee: Pei He
>Priority: Minor
> Fix For: 0.2.0-incubating
>
>
> Currently, DelegateCoder inherit equals() and hashCode() from StandardCoder.
> And, it makes DelegateCoder.of(VarIntCoder) equal to 
> DelegateCoder.of(BigEndianIntegerCoder).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-389) DelegateCoder needs equals and hashCode

2016-07-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15368815#comment-15368815
 ] 

ASF GitHub Bot commented on BEAM-389:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/559


> DelegateCoder needs equals and hashCode
> ---
>
> Key: BEAM-389
> URL: https://issues.apache.org/jira/browse/BEAM-389
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Pei He
>Assignee: Pei He
>Priority: Minor
>
> Currently, DelegateCoder inherit equals() and hashCode() from StandardCoder.
> And, it makes DelegateCoder.of(VarIntCoder) equal to 
> DelegateCoder.of(BigEndianIntegerCoder).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[1/2] incubator-beam git commit: Closes #559

2016-07-08 Thread dhalperi
Repository: incubator-beam
Updated Branches:
  refs/heads/master 90abca193 -> 9d7002545


Closes #559


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/9d700254
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/9d700254
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/9d700254

Branch: refs/heads/master
Commit: 9d700254530dea9a15772e744439ed80d6c25c49
Parents: 90abca1 4d6a102
Author: Dan Halperin 
Authored: Fri Jul 8 17:32:28 2016 -0700
Committer: Dan Halperin 
Committed: Fri Jul 8 17:32:28 2016 -0700

--
 .../flink/streaming/StateSerializationTest.java | 20 
 .../apache/beam/sdk/coders/DelegateCoder.java   | 26 +-
 .../beam/sdk/coders/StringDelegateCoder.java| 51 +++-
 .../beam/sdk/coders/DelegateCoderTest.java  | 43 +
 4 files changed, 136 insertions(+), 4 deletions(-)
--




[GitHub] incubator-beam pull request #559: [BEAM-389] Provide equals and hashCode in ...

2016-07-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/559


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] incubator-beam git commit: Provide equals and hashCode in DelegateCoder

2016-07-08 Thread dhalperi
Provide equals and hashCode in DelegateCoder


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/4d6a1020
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/4d6a1020
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/4d6a1020

Branch: refs/heads/master
Commit: 4d6a10203df3ba6d3923efc2c8776576de5b0d38
Parents: 90abca1
Author: Pei He 
Authored: Wed Jun 29 14:21:42 2016 -0700
Committer: Dan Halperin 
Committed: Fri Jul 8 17:32:28 2016 -0700

--
 .../flink/streaming/StateSerializationTest.java | 20 
 .../apache/beam/sdk/coders/DelegateCoder.java   | 26 +-
 .../beam/sdk/coders/StringDelegateCoder.java| 51 +++-
 .../beam/sdk/coders/DelegateCoderTest.java  | 43 +
 4 files changed, 136 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/4d6a1020/runners/flink/runner/src/test/java/org/apache/beam/runners/flink/streaming/StateSerializationTest.java
--
diff --git 
a/runners/flink/runner/src/test/java/org/apache/beam/runners/flink/streaming/StateSerializationTest.java
 
b/runners/flink/runner/src/test/java/org/apache/beam/runners/flink/streaming/StateSerializationTest.java
index 44f4ecb..6635d32 100644
--- 
a/runners/flink/runner/src/test/java/org/apache/beam/runners/flink/streaming/StateSerializationTest.java
+++ 
b/runners/flink/runner/src/test/java/org/apache/beam/runners/flink/streaming/StateSerializationTest.java
@@ -99,6 +99,16 @@ public class StateSerializationTest {
   public Integer apply(int[] accumulator) {
 return accumulator[0];
   }
+
+  @Override
+  public boolean equals(Object o) {
+return o != null && this.getClass() == o.getClass();
+  }
+
+ @Override
+  public int hashCode() {
+return this.getClass().hashCode();
+  }
 },
 new DelegateCoder.CodingFunction() {
   @Override
@@ -107,6 +117,16 @@ public class StateSerializationTest {
 a[0] = value;
 return a;
   }
+
+  @Override
+  public boolean equals(Object o) {
+return o != null && this.getClass() == o.getClass();
+  }
+
+  @Override
+  public int hashCode() {
+return this.getClass().hashCode();
+  }
 });
 
   private static final StateTag STRING_VALUE_ADDR =

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/4d6a1020/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/DelegateCoder.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/DelegateCoder.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/DelegateCoder.java
index 905178b..385c149 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/DelegateCoder.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/DelegateCoder.java
@@ -17,6 +17,8 @@
  */
 package org.apache.beam.sdk.coders;
 
+import com.google.common.base.MoreObjects;
+import com.google.common.base.Objects;
 import com.google.common.collect.Lists;
 
 import java.io.IOException;
@@ -42,7 +44,7 @@ import java.util.List;
  * @param  The type of objects coded by this Coder.
  * @param  The type of objects a {@code T} will be converted to 
for coding.
  */
-public class DelegateCoder extends CustomCoder {
+public final class DelegateCoder extends CustomCoder {
   /**
* A {@link DelegateCoder.CodingFunction CodingFunctionInputT, 
OutputT} is a serializable
* function from {@code InputT} to {@code OutputT} that may throw any {@link 
Exception}.
@@ -101,8 +103,28 @@ public class DelegateCoder extends 
CustomCoder {
   }
 
   @Override
+  public boolean equals(Object o) {
+if (o == null || this.getClass() != o.getClass()) {
+  return false;
+}
+DelegateCoder that = (DelegateCoder) o;
+return Objects.equal(this.coder, that.coder)
+&& Objects.equal(this.toFn, that.toFn)
+&& Objects.equal(this.fromFn, that.fromFn);
+  }
+
+  @Override
+  public int hashCode() {
+return Objects.hashCode(this.coder, this.toFn, this.fromFn);
+  }
+
+  @Override
   public String toString() {
-return "DelegateCoder(" + coder + ")";
+return MoreObjects.toStringHelper(getClass())
+.add("coder", coder)
+.add("toFn", toFn)
+.add("fromFn", fromFn)
+.toString();
   }
 
   /**

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/4d6a1020/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/StringDelegateCoder.java

[GitHub] incubator-beam pull request #590: [BEAM-426] Update bigtable-client-core to ...

2016-07-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/590


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-beam pull request #618: Changed JUnitMatchers to Matchers

2016-07-08 Thread ianzhou1
GitHub user ianzhou1 opened a pull request:

https://github.com/apache/incubator-beam/pull/618

Changed JUnitMatchers to Matchers



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ianzhou1/incubator-beam JUnitMatchers

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/618.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #618


commit 1614a33ccc7fbc966d912b149e48036c1f342349
Author: Ian Zhou 
Date:   2016-07-09T00:28:31Z

Changed JUnitMatchers to Matchers




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-372) CoderProperties: Test that the coder doesn't consume more bytes than it produces

2016-07-08 Thread Chandni Singh (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15368748#comment-15368748
 ] 

Chandni Singh commented on BEAM-372:


Sorry haven't started working on it yet. If you will like to take it up, please 
do so. I'll pick up another started ticket later.

> CoderProperties: Test that the coder doesn't consume more bytes than it 
> produces
> 
>
> Key: BEAM-372
> URL: https://issues.apache.org/jira/browse/BEAM-372
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Ben Chambers
>Assignee: Chandni Singh
>Priority: Minor
>  Labels: beginner, newbie, starter
>
> Add a test to CoderProperties that does the following:
> 1. Encode a value using the Coder in the nested context
> 2. Decode the value using the Coder in the nested context
> 3. Verify that the input stream wasn't requested to consume any extra bytes
> (This could possibly just be an enhancement to the existing round-trip 
> encode/decode test)
> When this fails it can lead to very difficult to debug situations in a coder 
> wrapped around the problematic coder. This would be an easy test that would 
> clearly fail *for the coder which was problematic*.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-391) Exceptions in gcsio upload thread causes pipeline to stall

2016-07-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15368723#comment-15368723
 ] 

ASF GitHub Bot commented on BEAM-391:
-

GitHub user aaltay opened a pull request:

https://github.com/apache/incubator-beam/pull/617

[BEAM-391] Handle HttpError in GCS upload thread

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

Handle HttpError in GCS upload thread, break connection to the main thread 
and propagate the exception.

Retry in auth _refresh() to guard against temporary errors in the
metadata service.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aaltay/incubator-beam is391

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/617.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #617


commit e4ccb45a9ba74e29c36c77aef4c0c4bc137c9af8
Author: Ahmet Altay 
Date:   2016-07-08T23:28:07Z

Handle HttpError in GCS upload thread, break connection to the main
thread and propagate the exception.

Retry in auth _refresh() to guard against temporary errors in the
metadata service.




> Exceptions in gcsio upload thread causes pipeline to stall
> --
>
> Key: BEAM-391
> URL: https://issues.apache.org/jira/browse/BEAM-391
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Ahmet Altay
>
> gcsio got stuck with invalid bucket name
> GcsBufferedWriter._start_upload (gcsio.py) raises an exception if the bucket 
> does not exist. This causes upload thread to silenty fail. It logs exception 
> to the log but this does not stop the pipeline or closes the receiving end of 
> the multiprocessing.Pipe(). Later a call in to write() blocks at 
> self.conn.send_bytes(). Note that send may block if the buffer is full.
> Upload thread should have a finally clause to close the socket connection. Or 
> better propagating the exception to its parent. This is true for other types 
> of exceptions also.
> Another small issue in the GcsBufferedWriter.close(). It does not self 
> self.close to True.
> reproduction: python -m apache_beam.examples.wordcount --output 
> gs://no-such-thing/
> Prints the exception but goes on forever. Ctrl + C breaks the main thread 
> shows where it got stuck.
> Similarly reproducible on the service.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #617: [BEAM-391] Handle HttpError in GCS upload ...

2016-07-08 Thread aaltay
GitHub user aaltay opened a pull request:

https://github.com/apache/incubator-beam/pull/617

[BEAM-391] Handle HttpError in GCS upload thread

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

Handle HttpError in GCS upload thread, break connection to the main thread 
and propagate the exception.

Retry in auth _refresh() to guard against temporary errors in the
metadata service.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aaltay/incubator-beam is391

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/617.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #617


commit e4ccb45a9ba74e29c36c77aef4c0c4bc137c9af8
Author: Ahmet Altay 
Date:   2016-07-08T23:28:07Z

Handle HttpError in GCS upload thread, break connection to the main
thread and propagate the exception.

Retry in auth _refresh() to guard against temporary errors in the
metadata service.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-372) CoderProperties: Test that the coder doesn't consume more bytes than it produces

2016-07-08 Thread Daniel Halperin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15368718#comment-15368718
 ] 

Daniel Halperin commented on BEAM-372:
--

[~csingh] have you had any progress or blockers? Should I unassign?

> CoderProperties: Test that the coder doesn't consume more bytes than it 
> produces
> 
>
> Key: BEAM-372
> URL: https://issues.apache.org/jira/browse/BEAM-372
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Ben Chambers
>Assignee: Chandni Singh
>Priority: Minor
>  Labels: beginner, newbie, starter
>
> Add a test to CoderProperties that does the following:
> 1. Encode a value using the Coder in the nested context
> 2. Decode the value using the Coder in the nested context
> 3. Verify that the input stream wasn't requested to consume any extra bytes
> (This could possibly just be an enhancement to the existing round-trip 
> encode/decode test)
> When this fails it can lead to very difficult to debug situations in a coder 
> wrapped around the problematic coder. This would be an easy test that would 
> clearly fail *for the coder which was problematic*.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-372) CoderProperties: Test that the coder doesn't consume more bytes than it produces

2016-07-08 Thread Daniel Halperin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15368716#comment-15368716
 ] 

Daniel Halperin commented on BEAM-372:
--

Another idea: put elements in a list and use a list coder to encode/decode.

> CoderProperties: Test that the coder doesn't consume more bytes than it 
> produces
> 
>
> Key: BEAM-372
> URL: https://issues.apache.org/jira/browse/BEAM-372
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Ben Chambers
>Assignee: Chandni Singh
>Priority: Minor
>  Labels: beginner, newbie, starter
>
> Add a test to CoderProperties that does the following:
> 1. Encode a value using the Coder in the nested context
> 2. Decode the value using the Coder in the nested context
> 3. Verify that the input stream wasn't requested to consume any extra bytes
> (This could possibly just be an enhancement to the existing round-trip 
> encode/decode test)
> When this fails it can lead to very difficult to debug situations in a coder 
> wrapped around the problematic coder. This would be an easy test that would 
> clearly fail *for the coder which was problematic*.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-360) Add a framework for creating Python-SDK sources for new file types

2016-07-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15368706#comment-15368706
 ] 

ASF GitHub Bot commented on BEAM-360:
-

GitHub user chamikaramj reopened a pull request:

https://github.com/apache/incubator-beam/pull/599

[BEAM-360] Some updates related to dynamic work rebalancing of custom 
sources.

Adds a class 'iobase.BoundedSourceSplit' to represent dynamic work 
rebalancing results of custom sources.

Updates Dataflow runner specific code (apiclient.py) to support dynamic 
work rebalancing custom sources.

Updates 'OffsetRangeTracker' so that the result of 
'position_at_fraction()'' is a 'long' instead of a 'float'.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chamikaramj/incubator-beam custom_sources_dwr

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/599.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #599


commit 19a41ccf5bcf00192e3646258eae0cbce85da23b
Author: Chamikara Jayalath 
Date:   2016-07-07T03:25:04Z

Adds a class 'iobase.BoundedSourceSplit' to represent dynamic work 
rebalancing result of custom sources.

Updates Dataflow runner specific code (apiclient.py) to support dynamic 
work rebalancing custom sources.

Updates 'OffsetRangeTracker' so that the result of 
'position_at_fraction()'' is a 'long' instead of a 'float'.

commit 4415989ef0dfd656643e6e8575b6e2090b4437b5
Author: Chamikara Jayalath 
Date:   2016-07-07T03:34:21Z

Adds more comments.

commit 6aa697465e88f827a3121a1de8bad1b810d904da
Author: Chamikara Jayalath 
Date:   2016-07-07T04:41:20Z

Some updates related to dynamic work rebalancing custom sources.

Adds a class 'iobase.BoundedSourceSplit' to represent dynamic work 
rebalancing result of custom sources.

Updates Dataflow runner specific code (apiclient.py) to support dynamic 
work rebalancing custom sources.

Updates 'OffsetRangeTracker' so that the result of 
'position_at_fraction()'' is a 'long' instead of a 'float'.

commit 1e01b1f5cd70e5b39cd064577110898c623e524a
Author: Chamikara Jayalath 
Date:   2016-07-08T19:01:42Z

Reverting some updates.

commit 171df1ecedd51c7c72db309d526dfa9badf1
Author: Chamikara Jayalath 
Date:   2016-07-08T22:34:52Z

Adds a method 'fileio.ChannelFactory.size_in_bytes()'' that can be used to 
determine the size of a single file.

Updates 'filebasedsource' to use this method when determining size of files.




> Add a framework for creating Python-SDK sources for new file types
> --
>
> Key: BEAM-360
> URL: https://issues.apache.org/jira/browse/BEAM-360
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>
> We already have a framework for creating new sources for Beam Python SDK - 
> https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/io/iobase.py#L326
> It would be great if we can add a framework on top of this that encapsulates 
> logic common to sources that are based on files. This framework can include 
> following features that are common to sources based on files.
> (1) glob expansion
> (2) support for new file-systems
> (3) dynamic work rebalancing based on byte offsets
> (4) support for reading compressed files.
> Java SDK has a similar framework and it's available at - 
> https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSource.java



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #599: [BEAM-360] Some updates related to dynamic...

2016-07-08 Thread chamikaramj
GitHub user chamikaramj reopened a pull request:

https://github.com/apache/incubator-beam/pull/599

[BEAM-360] Some updates related to dynamic work rebalancing of custom 
sources.

Adds a class 'iobase.BoundedSourceSplit' to represent dynamic work 
rebalancing results of custom sources.

Updates Dataflow runner specific code (apiclient.py) to support dynamic 
work rebalancing custom sources.

Updates 'OffsetRangeTracker' so that the result of 
'position_at_fraction()'' is a 'long' instead of a 'float'.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chamikaramj/incubator-beam custom_sources_dwr

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/599.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #599


commit 19a41ccf5bcf00192e3646258eae0cbce85da23b
Author: Chamikara Jayalath 
Date:   2016-07-07T03:25:04Z

Adds a class 'iobase.BoundedSourceSplit' to represent dynamic work 
rebalancing result of custom sources.

Updates Dataflow runner specific code (apiclient.py) to support dynamic 
work rebalancing custom sources.

Updates 'OffsetRangeTracker' so that the result of 
'position_at_fraction()'' is a 'long' instead of a 'float'.

commit 4415989ef0dfd656643e6e8575b6e2090b4437b5
Author: Chamikara Jayalath 
Date:   2016-07-07T03:34:21Z

Adds more comments.

commit 6aa697465e88f827a3121a1de8bad1b810d904da
Author: Chamikara Jayalath 
Date:   2016-07-07T04:41:20Z

Some updates related to dynamic work rebalancing custom sources.

Adds a class 'iobase.BoundedSourceSplit' to represent dynamic work 
rebalancing result of custom sources.

Updates Dataflow runner specific code (apiclient.py) to support dynamic 
work rebalancing custom sources.

Updates 'OffsetRangeTracker' so that the result of 
'position_at_fraction()'' is a 'long' instead of a 'float'.

commit 1e01b1f5cd70e5b39cd064577110898c623e524a
Author: Chamikara Jayalath 
Date:   2016-07-08T19:01:42Z

Reverting some updates.

commit 171df1ecedd51c7c72db309d526dfa9badf1
Author: Chamikara Jayalath 
Date:   2016-07-08T22:34:52Z

Adds a method 'fileio.ChannelFactory.size_in_bytes()'' that can be used to 
determine the size of a single file.

Updates 'filebasedsource' to use this method when determining size of files.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-360) Add a framework for creating Python-SDK sources for new file types

2016-07-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15368612#comment-15368612
 ] 

ASF GitHub Bot commented on BEAM-360:
-

Github user chamikaramj closed the pull request at:

https://github.com/apache/incubator-beam/pull/599


> Add a framework for creating Python-SDK sources for new file types
> --
>
> Key: BEAM-360
> URL: https://issues.apache.org/jira/browse/BEAM-360
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>
> We already have a framework for creating new sources for Beam Python SDK - 
> https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/io/iobase.py#L326
> It would be great if we can add a framework on top of this that encapsulates 
> logic common to sources that are based on files. This framework can include 
> following features that are common to sources based on files.
> (1) glob expansion
> (2) support for new file-systems
> (3) dynamic work rebalancing based on byte offsets
> (4) support for reading compressed files.
> Java SDK has a similar framework and it's available at - 
> https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSource.java



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Jenkins build is still unstable: beam_PostCommit_RunnableOnService_GoogleCloudDataflow #715

2016-07-08 Thread Apache Jenkins Server
See 




[GitHub] incubator-beam pull request #615: Making the dataflow temp_location argument...

2016-07-08 Thread zoyahav
GitHub user zoyahav opened a pull request:

https://github.com/apache/incubator-beam/pull/615

Making the dataflow temp_location argument optional

Making the dataflow temp_location argument optional in the python SDK to 
match the java SDK.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zoyahav/incubator-beam python-sdk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/615.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #615


commit 8c165db297fd6749cf2d18fa085a56111a070443
Author: Zohar Yahav 
Date:   2016-07-08T21:03:19Z

Migrated the changes to make temp_location optional.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-433) Make Beam examples runners agnostic

2016-07-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15368491#comment-15368491
 ] 

ASF GitHub Bot commented on BEAM-433:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/612


> Make Beam examples runners agnostic
> ---
>
> Key: BEAM-433
> URL: https://issues.apache.org/jira/browse/BEAM-433
> Project: Beam
>  Issue Type: Improvement
>Reporter: Pei He
>Assignee: Pei He
>
> Beam examples are ported from Dataflow, and they heavily reference to 
> Dataflow classes.
> There are following cleanup tasks:
> 1. Remove Dataflow streaming and batch injector setup (Done).
> 2. Remove references to DataflowPipelineOptions.
> 3. Move cancel() from DataflowPipelineJob to PipelineResult.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #612: [BEAM-433] Remove isUnbounded option, and ...

2016-07-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/612


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] incubator-beam git commit: Closes #612

2016-07-08 Thread dhalperi
Closes #612


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/d9632b7f
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/d9632b7f
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/d9632b7f

Branch: refs/heads/master
Commit: d9632b7f978759f65ab505e1b2a6cacac4c420c7
Parents: 994febe 626240a
Author: Dan Halperin 
Authored: Fri Jul 8 14:15:04 2016 -0700
Committer: Dan Halperin 
Committed: Fri Jul 8 14:15:04 2016 -0700

--
 .../apache/beam/examples/WindowedWordCount.java |  9 +++
 .../beam/examples/common/ExampleUtils.java  | 26 +++-
 .../beam/examples/complete/AutoComplete.java|  6 ++---
 .../examples/complete/StreamingWordExtract.java |  6 ++---
 .../examples/complete/TrafficMaxLaneFlow.java   | 13 +++---
 .../beam/examples/complete/TrafficRoutes.java   | 13 +++---
 .../beam/examples/cookbook/TriggerExample.java  |  8 +++---
 7 files changed, 22 insertions(+), 59 deletions(-)
--




Jenkins build became unstable: beam_PostCommit_RunnableOnService_GoogleCloudDataflow #714

2016-07-08 Thread Apache Jenkins Server
See 




[GitHub] incubator-beam pull request #614: Add jacoco coverage

2016-07-08 Thread swegner
Github user swegner closed the pull request at:

https://github.com/apache/incubator-beam/pull/614


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-391) Exceptions in gcsio upload thread causes pipeline to stall

2016-07-08 Thread Ahmet Altay (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15368449#comment-15368449
 ] 

Ahmet Altay commented on BEAM-391:
--

Another type of Exception that result in the same behavior:

Exception in thread Thread-10:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner 
self.run()
File "/usr/lib/python2.7/threading.py", line 763, in run 
self.__target(*self.__args, **self.__kwargs)
File "/usr/local/lib/python2.7/dist-packages/apache_beam/utils/retry.py", line 
160, in wrapper return fun(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/gcsio.py", line 
563, in _start_upload self.client.objects.Insert(self.insert_request, 
upload=self.upload)
File 
"/usr/local/lib/python2.7/dist-packages/apache_beam/internal/clients/storage/storage_v1_client.py",
 line 970, in Insertdownload=download)
File "/usr/local/lib/python2.7/dist-packages/apitools/base/py/base_api.py", 
line 687, in _RunMethodhttp_request, client=self.client)
File "/usr/local/lib/python2.7/dist-packages/apitools/base/py/transfer.py", 
line 838, in InitializeUploadretries=self.num_retries)
File "/usr/local/lib/python2.7/dist-packages/apitools/base/py/http_wrapper.py", 
line 351, in MakeRequestmax_retry_wait, total_wait_sec))
File "/usr/local/lib/python2.7/dist-packages/apitools/base/py/http_wrapper.py", 
line 341, in MakeRequestcheck_response_func=check_response_func)
File "/usr/local/lib/python2.7/dist-packages/apitools/base/py/http_wrapper.py", 
line 391, in _MakeRequestNoRetry redirections=redirections, 
connection_type=connection_type)
File "/usr/local/lib/python2.7/dist-packages/oauth2client/client.py", line 616, 
in new_request self._refresh(request_orig)
File "/usr/local/lib/python2.7/dist-packages/apache_beam/internal/auth.py", 
line 90, in _refresh token_data = json.loads(urllib2.urlopen(req).read())
File "/usr/lib/python2.7/urllib2.py", line 154, in urlopen return 
opener.open(url, data, timeout)
File "/usr/lib/python2.7/urllib2.py", line 431, in open response = 
self._open(req, data)
File "/usr/lib/python2.7/urllib2.py", line 449, in _open '_open', req)
File "/usr/lib/python2.7/urllib2.py", line 409, in _call_chain result = 
func(*args)
File "/usr/lib/python2.7/urllib2.py", line 1227, in http_open return 
self.do_open(httplib.HTTPConnection, req)
File "/usr/lib/python2.7/urllib2.py", line 1197, in do_open raise URLError(err)

Error is coming from auth.py _refresh(). That may require retries based on the 
type of error.

> Exceptions in gcsio upload thread causes pipeline to stall
> --
>
> Key: BEAM-391
> URL: https://issues.apache.org/jira/browse/BEAM-391
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Ahmet Altay
>
> gcsio got stuck with invalid bucket name
> GcsBufferedWriter._start_upload (gcsio.py) raises an exception if the bucket 
> does not exist. This causes upload thread to silenty fail. It logs exception 
> to the log but this does not stop the pipeline or closes the receiving end of 
> the multiprocessing.Pipe(). Later a call in to write() blocks at 
> self.conn.send_bytes(). Note that send may block if the buffer is full.
> Upload thread should have a finally clause to close the socket connection. Or 
> better propagating the exception to its parent. This is true for other types 
> of exceptions also.
> Another small issue in the GcsBufferedWriter.close(). It does not self 
> self.close to True.
> reproduction: python -m apache_beam.examples.wordcount --output 
> gs://no-such-thing/
> Prints the exception but goes on forever. Ctrl + C breaks the main thread 
> shows where it got stuck.
> Similarly reproducible on the service.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #614: Add jacoco coverage

2016-07-08 Thread swegner
GitHub user swegner opened a pull request:

https://github.com/apache/incubator-beam/pull/614

Add jacoco coverage

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/swegner/incubator-beam coveralls-test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/614.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #614


commit 5537f71bd27971c85319884fe06dfc63994409f4
Author: Scott Wegner 
Date:   2016-07-08T21:02:10Z

Add jacoco coverage




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (BEAM-433) Make Beam examples runners agnostic

2016-07-08 Thread Pei He (JIRA)
Pei He created BEAM-433:
---

 Summary: Make Beam examples runners agnostic
 Key: BEAM-433
 URL: https://issues.apache.org/jira/browse/BEAM-433
 Project: Beam
  Issue Type: Improvement
Reporter: Pei He
Assignee: Pei He


Beam examples are ported from Dataflow, and they heavily reference to Dataflow 
classes.

There are following cleanup tasks:
1. Remove Dataflow streaming and batch injector setup (Done).
2. Remove references to DataflowPipelineOptions.
3. Move cancel() from DataflowPipelineJob to PipelineResult.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #607: Mark primitive display data tests Runnable...

2016-07-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/607


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] incubator-beam git commit: Mark primitive display data tests RunnableOnService

2016-07-08 Thread dhalperi
Repository: incubator-beam
Updated Branches:
  refs/heads/master 744b0474e -> 994febef4


Mark primitive display data tests RunnableOnService


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/9e0b1b42
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/9e0b1b42
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/9e0b1b42

Branch: refs/heads/master
Commit: 9e0b1b423ef722cdfbec0bde89d85faa760bf322
Parents: 744b047
Author: Scott Wegner 
Authored: Thu Jul 7 13:49:30 2016 -0700
Committer: Dan Halperin 
Committed: Fri Jul 8 12:51:39 2016 -0700

--
 runners/google-cloud-dataflow-java/pom.xml  | 10 ---
 .../runners/dataflow/io/DataflowAvroIOTest.java | 69 --
 .../dataflow/io/DataflowBigQueryIOTest.java | 94 
 .../dataflow/io/DataflowDatastoreIOTest.java| 66 --
 .../dataflow/io/DataflowPubsubIOTest.java   | 63 -
 .../runners/dataflow/io/DataflowTextIOTest.java | 76 
 .../transforms/DataflowCombineTest.java | 58 
 .../DataflowDisplayDataEvaluator.java   | 72 ---
 .../transforms/DataflowMapElementsTest.java | 55 
 .../org/apache/beam/sdk/transforms/Combine.java |  5 ++
 .../java/org/apache/beam/sdk/io/AvroIOTest.java | 34 +++
 .../org/apache/beam/sdk/io/BigQueryIOTest.java  | 75 +++-
 .../org/apache/beam/sdk/io/PubsubIOTest.java| 30 +++
 .../java/org/apache/beam/sdk/io/TextIOTest.java | 32 +++
 .../beam/sdk/io/datastore/V1Beta3Test.java  | 33 +++
 .../apache/beam/sdk/transforms/CombineTest.java | 19 
 .../beam/sdk/transforms/MapElementsTest.java| 22 +
 .../display/DisplayDataEvaluator.java   | 12 ++-
 18 files changed, 258 insertions(+), 567 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/9e0b1b42/runners/google-cloud-dataflow-java/pom.xml
--
diff --git a/runners/google-cloud-dataflow-java/pom.xml 
b/runners/google-cloud-dataflow-java/pom.xml
index 76e5f80..9cd1fb4 100644
--- a/runners/google-cloud-dataflow-java/pom.xml
+++ b/runners/google-cloud-dataflow-java/pom.xml
@@ -290,11 +290,6 @@
 
 
 
-  org.apache.avro
-  avro
-
-
-
   com.google.api-client
   google-api-client
 
@@ -331,11 +326,6 @@
 
 
 
-  com.google.apis
-  google-api-services-bigquery
-
-
-
   com.google.cloud.bigdataoss
   util
 

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/9e0b1b42/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/io/DataflowAvroIOTest.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/io/DataflowAvroIOTest.java
 
b/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/io/DataflowAvroIOTest.java
deleted file mode 100644
index 006daa9..000
--- 
a/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/io/DataflowAvroIOTest.java
+++ /dev/null
@@ -1,69 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.runners.dataflow.io;
-
-import static 
org.apache.beam.sdk.transforms.display.DisplayDataMatchers.hasDisplayItem;
-
-import static org.hamcrest.Matchers.hasItem;
-import static org.junit.Assert.assertThat;
-
-import org.apache.beam.runners.dataflow.DataflowRunner;
-import 
org.apache.beam.runners.dataflow.transforms.DataflowDisplayDataEvaluator;
-import org.apache.beam.sdk.io.AvroIO;
-import org.apache.beam.sdk.transforms.display.DisplayData;
-import org.apache.beam.sdk.transforms.display.DisplayDataEvaluator;
-
-import org.apache.avro.Schema;
-import org.junit.Test;
-import org.junit.runner.RunWith;
-import org.junit.runners.JUnit4;
-
-import java.util.Set;
-
-/**
- * {@link 

[GitHub] incubator-beam pull request #612: Remove isUnbounded option, and the corresp...

2016-07-08 Thread peihe
GitHub user peihe opened a pull request:

https://github.com/apache/incubator-beam/pull/612

Remove isUnbounded option, and the corresponding ExampleUtils constructor



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/peihe/incubator-beam rm-unbounded-options

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/612.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #612


commit 72f3574de23b8c4ce16f0ffb6cfafc370c8e2ca1
Author: Pei He 
Date:   2016-07-08T19:46:27Z

Remove isUnbounded option, and the corresponding ExampleUtils constructor




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-124) Testing -- End to End WordCount Batch and Streaming Tests

2016-07-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15368153#comment-15368153
 ] 

ASF GitHub Bot commented on BEAM-124:
-

GitHub user markflyhigh opened a pull request:

https://github.com/apache/incubator-beam/pull/611

[BEAM-124] WordCountIT: Add outputs verification

Add reviewers: @dhalperi @lukecwik 

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/markflyhigh/incubator-beam 
wordcountit-improvement

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/611.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #611


commit 1d36caebd9270841f34688c48bccb21eeb94395c
Author: Mark Liu 
Date:   2016-07-08T17:50:26Z

[BEAM-124] WordCountIT: Add outputs verification




> Testing -- End to End WordCount Batch and Streaming Tests
> -
>
> Key: BEAM-124
> URL: https://issues.apache.org/jira/browse/BEAM-124
> Project: Beam
>  Issue Type: New Feature
>  Components: testing
>Reporter: Steve Wheeler
>Assignee: Jason Kuster
>
> Set up testing infrastructure so that an end to end test for WordCount (both 
> batch and streaming) will be run periodically. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #611: [BEAM-124] WordCountIT: Add outputs verifi...

2016-07-08 Thread markflyhigh
GitHub user markflyhigh opened a pull request:

https://github.com/apache/incubator-beam/pull/611

[BEAM-124] WordCountIT: Add outputs verification

Add reviewers: @dhalperi @lukecwik 

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/markflyhigh/incubator-beam 
wordcountit-improvement

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/611.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #611


commit 1d36caebd9270841f34688c48bccb21eeb94395c
Author: Mark Liu 
Date:   2016-07-08T17:50:26Z

[BEAM-124] WordCountIT: Add outputs verification




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] incubator-beam git commit: Modified BigtableIO to support streaming

2016-07-08 Thread dhalperi
Modified BigtableIO to support streaming


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/12f15993
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/12f15993
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/12f15993

Branch: refs/heads/master
Commit: 12f159934ec7965c3974cda319681103a817778b
Parents: a7312be
Author: Ian Zhou 
Authored: Fri Jul 1 16:14:53 2016 -0700
Committer: Dan Halperin 
Committed: Fri Jul 8 10:43:32 2016 -0700

--
 .../beam/sdk/io/gcp/bigtable/BigtableIO.java| 260 +++
 1 file changed, 92 insertions(+), 168 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/12f15993/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java
--
diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java
index 0c485bf..4bab45e 100644
--- 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java
+++ 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigtable/BigtableIO.java
@@ -23,18 +23,17 @@ import static 
com.google.common.base.Preconditions.checkState;
 
 import org.apache.beam.sdk.annotations.Experimental;
 import org.apache.beam.sdk.coders.Coder;
-import org.apache.beam.sdk.coders.VarLongCoder;
 import org.apache.beam.sdk.coders.protobuf.ProtoCoder;
 import org.apache.beam.sdk.io.BoundedSource;
 import org.apache.beam.sdk.io.BoundedSource.BoundedReader;
-import org.apache.beam.sdk.io.Sink.WriteOperation;
-import org.apache.beam.sdk.io.Sink.Writer;
 import org.apache.beam.sdk.io.range.ByteKey;
 import org.apache.beam.sdk.io.range.ByteKeyRange;
 import org.apache.beam.sdk.io.range.ByteKeyRangeTracker;
 import org.apache.beam.sdk.options.PipelineOptions;
 import org.apache.beam.sdk.runners.PipelineRunner;
+import org.apache.beam.sdk.transforms.DoFn;
 import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
 import org.apache.beam.sdk.transforms.display.DisplayData;
 import org.apache.beam.sdk.util.ReleaseInfo;
 import org.apache.beam.sdk.values.KV;
@@ -453,8 +452,8 @@ public class BigtableIO {
 
 @Override
 public PDone apply(PCollection> input) {
-  Sink sink = new Sink(tableId, getBigtableService());
-  return input.apply(org.apache.beam.sdk.io.Write.to(sink));
+  input.apply(ParDo.of(new BigtableWriterFn(tableId, 
getBigtableService(;
+  return PDone.in(input.getPipeline());
 }
 
 @Override
@@ -514,6 +513,94 @@ public class BigtableIO {
   }
   return new BigtableServiceImpl(options);
 }
+
+private class BigtableWriterFn extends DoFn, Void> {
+
+  public BigtableWriterFn(String tableId, BigtableService bigtableService) 
{
+this.tableId = checkNotNull(tableId, "tableId");
+this.bigtableService = checkNotNull(bigtableService, 
"bigtableService");
+this.failures = new ConcurrentLinkedQueue<>();
+  }
+
+  @Override
+  public void startBundle(Context c) throws Exception {
+bigtableWriter = bigtableService.openForWriting(tableId);
+recordsWritten = 0;
+  }
+
+  @Override
+  public void processElement(ProcessContext c) throws Exception {
+checkForFailures();
+Futures.addCallback(
+bigtableWriter.writeRecord(c.element()), new 
WriteExceptionCallback(c.element()));
+++recordsWritten;
+  }
+
+  @Override
+  public void finishBundle(Context c) throws Exception {
+bigtableWriter.close();
+bigtableWriter = null;
+checkForFailures();
+logger.info("Wrote {} records", recordsWritten);
+  }
+
+  @Override
+  public void populateDisplayData(DisplayData.Builder builder) {
+Write.this.populateDisplayData(builder);
+  }
+
+  
///
+  private final String tableId;
+  private final BigtableService bigtableService;
+  private BigtableService.Writer bigtableWriter;
+  private long recordsWritten;
+  private final ConcurrentLinkedQueue failures;
+
+  /**
+   * If any write has asynchronously failed, fail the bundle with a useful 
error.
+   */
+  private void checkForFailures() throws IOException {
+// Note that this function is never called by multiple threads and is 
the only place that
+// we remove from 

[jira] [Commented] (BEAM-46) Unbounded sink for Google Cloud Bigtable

2016-07-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-46?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15368047#comment-15368047
 ] 

ASF GitHub Bot commented on BEAM-46:


Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/596


> Unbounded sink for Google Cloud Bigtable
> 
>
> Key: BEAM-46
> URL: https://issues.apache.org/jira/browse/BEAM-46
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-gcp
>Reporter: Daniel Halperin
>Assignee: Ian Zhou
>
> Google Cloud Bigtable is currently in Beta. 
> https://cloud.google.com/bigtable/ A bounded sink is included in the initial 
> code for Beam, and uses asynchronous row mutations (with bounded memory) for 
> maximum throughput.
> The unbounded sink code is in principle not too different. The key areas of 
> focus are better connection management, thread management, and fault 
> tolerance (e.g., so connections are not leaked if bundles fail) in the 
> unbounded case in which there are hundreds of active threads and very small 
> bundles.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[1/2] incubator-beam git commit: Closes #596

2016-07-08 Thread dhalperi
Repository: incubator-beam
Updated Branches:
  refs/heads/master a7312bee3 -> 744b0474e


Closes #596


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/744b0474
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/744b0474
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/744b0474

Branch: refs/heads/master
Commit: 744b0474edb94cefbb9486664231004db8e00a72
Parents: a7312be 12f1599
Author: Dan Halperin 
Authored: Fri Jul 8 10:43:32 2016 -0700
Committer: Dan Halperin 
Committed: Fri Jul 8 10:43:32 2016 -0700

--
 .../beam/sdk/io/gcp/bigtable/BigtableIO.java| 260 +++
 1 file changed, 92 insertions(+), 168 deletions(-)
--




[jira] [Updated] (BEAM-432) BigQueryIO javadoc is incorrect

2016-07-08 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin updated BEAM-432:
-
Labels: newbie starter  (was: )

> BigQueryIO javadoc is incorrect
> ---
>
> Key: BEAM-432
> URL: https://issues.apache.org/jira/browse/BEAM-432
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-gcp
>Affects Versions: 0.1.0-incubating, 0.2.0-incubating
>Reporter: Daniel Halperin
>Priority: Trivial
>  Labels: newbie, starter
>
> BigQueryIO has a broken code snippet in its Javadoc:
> The table name in fromQuery() should be within square brackets i.e.
> .fromQuery("SELECT year, mean_temp FROM samples.weather_stations")
> should be:
> .fromQuery("SELECT year, mean_temp FROM [samples.weather_stations]")



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[1/2] incubator-beam git commit: Exclude AppleJavaExtensions from findbugs plugin deps

2016-07-08 Thread dhalperi
Repository: incubator-beam
Updated Branches:
  refs/heads/master a2b118afe -> a7312bee3


Exclude AppleJavaExtensions from findbugs plugin deps

This is for a GUI component of findbugs, and is not
a real runtime dependency of our build. While it is not
distributed with any of our artifacts, it is still
one more needless thing to pull down at build time,
and one more license to be aware of.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/ddf5cc27
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/ddf5cc27
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/ddf5cc27

Branch: refs/heads/master
Commit: ddf5cc27e7b4cf7df653e916a39cef2dea1b67bd
Parents: a2b118a
Author: Kenneth Knowles 
Authored: Fri Jul 8 09:05:29 2016 -0700
Committer: Dan Halperin 
Committed: Fri Jul 8 10:35:04 2016 -0700

--
 pom.xml | 15 +++
 1 file changed, 15 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/ddf5cc27/pom.xml
--
diff --git a/pom.xml b/pom.xml
index 14a9c67..72d73fb 100644
--- a/pom.xml
+++ b/pom.xml
@@ -948,10 +948,25 @@
   beam-sdks-java-build-tools
   ${project.version}
 
+
+
+
+  com.google.code.findbugs
+  findbugs
+  ${findbugs.version}
+  
+
+  com.apple
+  AppleJavaExtensions
+
+  
+
   
+
   
 beam/findbugs-filter.xml
   
+
   
 
   test



[GitHub] incubator-beam pull request #610: Exclude AppleJavaExtensions from findbugs ...

2016-07-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/610


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] incubator-beam git commit: Closes #610

2016-07-08 Thread dhalperi
Closes #610


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/a7312bee
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/a7312bee
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/a7312bee

Branch: refs/heads/master
Commit: a7312bee35e41071712b92ae5b5b6b117e538d33
Parents: a2b118a ddf5cc2
Author: Dan Halperin 
Authored: Fri Jul 8 10:35:05 2016 -0700
Committer: Dan Halperin 
Committed: Fri Jul 8 10:35:05 2016 -0700

--
 pom.xml | 15 +++
 1 file changed, 15 insertions(+)
--




[GitHub] incubator-beam pull request #574: Fix, suppress, or file bugs for FindBugs i...

2016-07-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/574


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] incubator-beam git commit: This closes #574

2016-07-08 Thread kenn
This closes #574


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/a2b118af
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/a2b118af
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/a2b118af

Branch: refs/heads/master
Commit: a2b118afee886c80c3152af11c641ac5870abfaa
Parents: 5f0288b 3875071
Author: Kenneth Knowles 
Authored: Fri Jul 8 10:01:03 2016 -0700
Committer: Kenneth Knowles 
Committed: Fri Jul 8 10:01:03 2016 -0700

--
 .../apache/beam/sdk/util/PaneInfoTracker.java   |   4 +
 .../apache/beam/sdk/util/SystemReduceFn.java|   4 +
 .../org/apache/beam/sdk/util/TriggerRunner.java |  10 +
 .../org/apache/beam/sdk/util/WatermarkHold.java |   4 +
 .../src/main/resources/beam/findbugs-filter.xml | 412 ++-
 .../apache/beam/sdk/coders/InstantCoder.java|  33 +-
 .../main/java/org/apache/beam/sdk/io/Write.java |   3 +-
 .../sdk/options/PipelineOptionsFactory.java |  10 +-
 .../beam/sdk/testing/CoderProperties.java   |   3 +-
 .../org/apache/beam/sdk/testing/PAssert.java|   4 +
 .../beam/sdk/testing/SerializableMatchers.java  |   4 +
 .../apache/beam/sdk/transforms/DoFnTester.java  |   2 -
 .../sdk/transforms/display/DisplayData.java |   2 +-
 .../beam/sdk/transforms/join/CoGbkResult.java   |   2 +-
 .../windowing/AfterDelayFromFirstElement.java   |   6 +
 .../sdk/transforms/windowing/AfterPane.java |   4 +
 ...AttemptAndTimeBoundedExponentialBackOff.java |   4 +
 .../sdk/util/ExposedByteArrayInputStream.java   |   3 +
 .../sdk/util/ExposedByteArrayOutputStream.java  |   4 +
 .../apache/beam/sdk/util/PCollectionViews.java  |   4 +-
 .../org/apache/beam/sdk/util/ReleaseInfo.java   |  12 +-
 .../beam/sdk/util/TriggerContextFactory.java|   5 +-
 .../beam/sdk/util/common/ReflectHelpers.java|  13 +-
 .../beam/sdk/util/state/StateMerging.java   |  18 +-
 24 files changed, 137 insertions(+), 433 deletions(-)
--




[1/2] incubator-beam git commit: Fix and file bugs for FindBugs issues

2016-07-08 Thread kenn
Repository: incubator-beam
Updated Branches:
  refs/heads/master 5f0288b44 -> a2b118afe


Fix and file bugs for FindBugs issues


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/3875071b
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/3875071b
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/3875071b

Branch: refs/heads/master
Commit: 3875071b5289bb729ad84de97b161d679d3c3292
Parents: 1963bde
Author: Scott Wegner 
Authored: Thu Jun 30 16:16:14 2016 -0700
Committer: Scott Wegner 
Committed: Thu Jul 7 09:47:44 2016 -0700

--
 .../apache/beam/sdk/util/PaneInfoTracker.java   |   4 +
 .../apache/beam/sdk/util/SystemReduceFn.java|   4 +
 .../org/apache/beam/sdk/util/TriggerRunner.java |  10 +
 .../org/apache/beam/sdk/util/WatermarkHold.java |   4 +
 .../src/main/resources/beam/findbugs-filter.xml | 412 ++-
 .../apache/beam/sdk/coders/InstantCoder.java|  33 +-
 .../main/java/org/apache/beam/sdk/io/Write.java |   3 +-
 .../sdk/options/PipelineOptionsFactory.java |  10 +-
 .../beam/sdk/testing/CoderProperties.java   |   3 +-
 .../org/apache/beam/sdk/testing/PAssert.java|   4 +
 .../beam/sdk/testing/SerializableMatchers.java  |   4 +
 .../apache/beam/sdk/transforms/DoFnTester.java  |   2 -
 .../sdk/transforms/display/DisplayData.java |   2 +-
 .../beam/sdk/transforms/join/CoGbkResult.java   |   2 +-
 .../windowing/AfterDelayFromFirstElement.java   |   6 +
 .../sdk/transforms/windowing/AfterPane.java |   4 +
 ...AttemptAndTimeBoundedExponentialBackOff.java |   4 +
 .../sdk/util/ExposedByteArrayInputStream.java   |   3 +
 .../sdk/util/ExposedByteArrayOutputStream.java  |   4 +
 .../apache/beam/sdk/util/PCollectionViews.java  |   4 +-
 .../org/apache/beam/sdk/util/ReleaseInfo.java   |  12 +-
 .../beam/sdk/util/TriggerContextFactory.java|   5 +-
 .../beam/sdk/util/common/ReflectHelpers.java|  13 +-
 .../beam/sdk/util/state/StateMerging.java   |  18 +-
 24 files changed, 137 insertions(+), 433 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/3875071b/runners/core-java/src/main/java/org/apache/beam/sdk/util/PaneInfoTracker.java
--
diff --git 
a/runners/core-java/src/main/java/org/apache/beam/sdk/util/PaneInfoTracker.java 
b/runners/core-java/src/main/java/org/apache/beam/sdk/util/PaneInfoTracker.java
index 0a47feb..812e99a 100644
--- 
a/runners/core-java/src/main/java/org/apache/beam/sdk/util/PaneInfoTracker.java
+++ 
b/runners/core-java/src/main/java/org/apache/beam/sdk/util/PaneInfoTracker.java
@@ -33,6 +33,8 @@ import com.google.common.annotations.VisibleForTesting;
 
 import org.joda.time.Instant;
 
+import edu.umd.cs.findbugs.annotations.SuppressFBWarnings;
+
 /**
  * Determine the timing and other properties of a new pane for a given 
computation, key and window.
  * Incorporates any previous pane, whether the pane has been produced because 
an
@@ -70,6 +72,8 @@ public class PaneInfoTracker {
 
 return new ReadableState() {
   @Override
+  @SuppressFBWarnings(value = "RV_RETURN_VALUE_IGNORED_NO_SIDE_EFFECT",
+  justification = "prefetch side effect")
   public ReadableState readLater() {
 previousPaneFuture.readLater();
 return this;

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/3875071b/runners/core-java/src/main/java/org/apache/beam/sdk/util/SystemReduceFn.java
--
diff --git 
a/runners/core-java/src/main/java/org/apache/beam/sdk/util/SystemReduceFn.java 
b/runners/core-java/src/main/java/org/apache/beam/sdk/util/SystemReduceFn.java
index 254..f7dca94 100644
--- 
a/runners/core-java/src/main/java/org/apache/beam/sdk/util/SystemReduceFn.java
+++ 
b/runners/core-java/src/main/java/org/apache/beam/sdk/util/SystemReduceFn.java
@@ -34,6 +34,8 @@ import org.apache.beam.sdk.util.state.StateMerging;
 import org.apache.beam.sdk.util.state.StateTag;
 import org.apache.beam.sdk.util.state.StateTags;
 
+import edu.umd.cs.findbugs.annotations.SuppressFBWarnings;
+
 /**
  * {@link ReduceFn} implementing the default reduction behaviors of {@link 
GroupByKey}.
  *
@@ -114,6 +116,8 @@ public abstract class SystemReduceFn

[GitHub] incubator-beam pull request #610: Exclude AppleJavaExtensions from findbugs ...

2016-07-08 Thread kennknowles
GitHub user kennknowles opened a pull request:

https://github.com/apache/incubator-beam/pull/610

Exclude AppleJavaExtensions from findbugs plugin deps

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

This dependency is for a GUI component of findbugs, and is not a real 
runtime dependency of our build. While it is not distributed with any of our 
artifacts, it is still one more needless thing to pull down at build time, and 
one more license to be aware of.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/incubator-beam findbugs-exclusion

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/610.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #610


commit 13003154aa018d030c3d202cbf4364abb9acec07
Author: Kenneth Knowles 
Date:   2016-07-08T16:05:29Z

Exclude AppleJavaExtensions from findbugs plugin deps

This is for a GUI component of findbugs, and is not
a real runtime dependency of our build. While it is not
distributed with any of our artifacts, it is still
one more needless thing to pull down at build time,
and one more license to be aware of.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Assigned] (BEAM-431) Examples dependencies on runners are a bit much and not enough

2016-07-08 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-431:


Assignee: (was: Kenneth Knowles)

> Examples dependencies on runners are a bit much and not enough
> --
>
> Key: BEAM-431
> URL: https://issues.apache.org/jira/browse/BEAM-431
> Project: Beam
>  Issue Type: Bug
>  Components: examples-java
>Reporter: Kenneth Knowles
>
> The Java 7 examples directly depend on the Dataflow runner as a compile 
> dependency. This should just be fixed and removed.
> The Java 8 examples have optional runtime dependencies on the Dataflow and 
> Flink runners. But even optional runtime dependencies must be resolved in a 
> test scope, so it is not possible to exclude these from a hermetic testing 
> environment - quite annoying. And the Spark runner should be included as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-431) Examples dependencies on runners are a bit much and not enough

2016-07-08 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15367951#comment-15367951
 ] 

Kenneth Knowles commented on BEAM-431:
--

Specifically, it does seem that my change to add the Spark runner is not going 
to work due to provided dependencies needing to be added, and we might want 
multiple profiles for different configurations.

> Examples dependencies on runners are a bit much and not enough
> --
>
> Key: BEAM-431
> URL: https://issues.apache.org/jira/browse/BEAM-431
> Project: Beam
>  Issue Type: Bug
>  Components: examples-java
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>
> The Java 7 examples directly depend on the Dataflow runner as a compile 
> dependency. This should just be fixed and removed.
> The Java 8 examples have optional runtime dependencies on the Dataflow and 
> Flink runners. But even optional runtime dependencies must be resolved in a 
> test scope, so it is not possible to exclude these from a hermetic testing 
> environment - quite annoying. And the Spark runner should be included as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-431) Examples dependencies on runners are a bit much and not enough

2016-07-08 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15367932#comment-15367932
 ] 

Kenneth Knowles commented on BEAM-431:
--

The commit that has merged handles the extraneous dependencies and also puts 
the Spark runner on equal footing. Commentary there suggests there might be 
more to do for this ticket, so I'll leave it around.

> Examples dependencies on runners are a bit much and not enough
> --
>
> Key: BEAM-431
> URL: https://issues.apache.org/jira/browse/BEAM-431
> Project: Beam
>  Issue Type: Bug
>  Components: examples-java
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>
> The Java 7 examples directly depend on the Dataflow runner as a compile 
> dependency. This should just be fixed and removed.
> The Java 8 examples have optional runtime dependencies on the Dataflow and 
> Flink runners. But even optional runtime dependencies must be resolved in a 
> test scope, so it is not possible to exclude these from a hermetic testing 
> environment - quite annoying. And the Spark runner should be included as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-432) BigQueryIO javadoc is incorrect

2016-07-08 Thread Daniel Halperin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15367909#comment-15367909
 ] 

Daniel Halperin commented on BEAM-432:
--

This is a good starter issue.

> BigQueryIO javadoc is incorrect
> ---
>
> Key: BEAM-432
> URL: https://issues.apache.org/jira/browse/BEAM-432
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-gcp
>Affects Versions: 0.1.0-incubating, 0.2.0-incubating
>Reporter: Daniel Halperin
>Priority: Trivial
>
> BigQueryIO has a broken code snippet in its Javadoc:
> The table name in fromQuery() should be within square brackets i.e.
> .fromQuery("SELECT year, mean_temp FROM samples.weather_stations")
> should be:
> .fromQuery("SELECT year, mean_temp FROM [samples.weather_stations]")



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-432) BigQueryIO javadoc is incorrect

2016-07-08 Thread Daniel Halperin (JIRA)
Daniel Halperin created BEAM-432:


 Summary: BigQueryIO javadoc is incorrect
 Key: BEAM-432
 URL: https://issues.apache.org/jira/browse/BEAM-432
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-gcp
Affects Versions: 0.1.0-incubating, 0.2.0-incubating
Reporter: Daniel Halperin
Priority: Trivial


BigQueryIO has a broken code snippet in its Javadoc:

The table name in fromQuery() should be within square brackets i.e.

.fromQuery("SELECT year, mean_temp FROM samples.weather_stations")

should be:

.fromQuery("SELECT year, mean_temp FROM [samples.weather_stations]")



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-431) Examples dependencies on runners are a bit much and not enough

2016-07-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15367850#comment-15367850
 ] 

ASF GitHub Bot commented on BEAM-431:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/609


> Examples dependencies on runners are a bit much and not enough
> --
>
> Key: BEAM-431
> URL: https://issues.apache.org/jira/browse/BEAM-431
> Project: Beam
>  Issue Type: Bug
>  Components: examples-java
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>
> The Java 7 examples directly depend on the Dataflow runner as a compile 
> dependency. This should just be fixed and removed.
> The Java 8 examples have optional runtime dependencies on the Dataflow and 
> Flink runners. But even optional runtime dependencies must be resolved in a 
> test scope, so it is not possible to exclude these from a hermetic testing 
> environment - quite annoying. And the Spark runner should be included as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[1/2] incubator-beam git commit: Move example dependency on runners into a profile

2016-07-08 Thread kenn
Repository: incubator-beam
Updated Branches:
  refs/heads/master 74e1f83df -> 5f0288b44


Move example dependency on runners into a profile

Even an optional runtime dependency, such as from the examples
to a runner, does get pulled in for testing. This meant that
all the dependencies for all runners had to be resolvable in an
integration testing context. It is quite inconvient.

Explicitly excluding runners via flags such as `-pl !runners/spark`
does not work since it causes errors in dependency resolution.

This change adds a profile, active by default, that can be
explicitly disabled to avoid pulling in dependencies.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/c17768f1
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/c17768f1
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/c17768f1

Branch: refs/heads/master
Commit: c17768f14f5a57bcc2d83613818101f3b21e7260
Parents: 155409b
Author: Kenneth Knowles 
Authored: Thu Jul 7 19:41:45 2016 -0700
Committer: Kenneth Knowles 
Committed: Thu Jul 7 19:41:45 2016 -0700

--
 examples/java8/pom.xml | 63 +
 1 file changed, 47 insertions(+), 16 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/c17768f1/examples/java8/pom.xml
--
diff --git a/examples/java8/pom.xml b/examples/java8/pom.xml
index 61b8cb4..3cd1787 100644
--- a/examples/java8/pom.xml
+++ b/examples/java8/pom.xml
@@ -35,6 +35,53 @@
 
   jar
 
+  
+
+
+
+  include-runners
+  true
+  
+
+  org.apache.beam
+  beam-runners-direct-java
+  ${project.version}
+  runtime
+  true
+
+
+
+  org.apache.beam
+  beam-runners-flink_2.10
+  ${project.version}
+  runtime
+  true
+
+
+
+  org.apache.beam
+  beam-runners-spark
+  ${project.version}
+  runtime
+  true
+
+  
+
+  
+
   
 
   
@@ -164,21 +211,5 @@
   com.google.api-client
   google-api-client
 
-
-
-  org.apache.beam
-  beam-runners-direct-java
-  ${project.version}
-  runtime
-  true
-
-
-
-  org.apache.beam
-  beam-runners-flink_2.10
-  ${project.version}
-  runtime
-  true
-
   
 



[GitHub] incubator-beam pull request #609: [BEAM-431] Move example dependency on runn...

2016-07-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/609


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] incubator-beam git commit: This closes #609

2016-07-08 Thread kenn
This closes #609


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/5f0288b4
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/5f0288b4
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/5f0288b4

Branch: refs/heads/master
Commit: 5f0288b448102185bf2b469368e489481a8618aa
Parents: 74e1f83 c17768f
Author: Kenneth Knowles 
Authored: Fri Jul 8 08:48:20 2016 -0700
Committer: Kenneth Knowles 
Committed: Fri Jul 8 08:48:20 2016 -0700

--
 examples/java8/pom.xml | 63 +
 1 file changed, 47 insertions(+), 16 deletions(-)
--




Jenkins build is back to stable : beam_PostCommit_RunnableOnService_GoogleCloudDataflow #710

2016-07-08 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-378) Integrate Python SDK in the Maven build

2016-07-08 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15367433#comment-15367433
 ] 

Sergio Fernández commented on BEAM-378:
---

Just to summarize all the discussions in the 
[PR|https://github.com/apache/incubator-beam/pull/537] here:

* Python SDK is now integrated in the Maven build, relying on the 
{{exec-maven-plugin}} for executing the required tasks. 

* Not the whole [Maven 
Lifecycle|https://maven.apache.org/guides/introduction/introduction-to-the-lifecycle.html]
 is covered, just {{compile}}, {{test}} and {{package}}; I guess {{install}} 
and {{deploy}} would need further discussion.

* I started the PR with the idea of using the common version for the Python 
SDK. But after discussing that with [~silv...@google.com] we went back to let 
the SDK manage the version independently.

* There is a issue with Jenkins that may need some support from INFRA, because 
either {{tox}} or {{virtualenv}} binaries would be required to be installed in 
the build nodes.

> Integrate Python SDK in the Maven build
> ---
>
> Key: BEAM-378
> URL: https://issues.apache.org/jira/browse/BEAM-378
> Project: Beam
>  Issue Type: Wish
>  Components: sdk-py
>Affects Versions: 0.2.0-incubating
>Reporter: Sergio Fernández
>Assignee: Robert Bradshaw
>Priority: Minor
>
> It'd be great to have the Python SDK integrated in the Maven build.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Jenkins build became unstable: beam_PostCommit_RunnableOnService_GoogleCloudDataflow #709

2016-07-08 Thread Apache Jenkins Server
See