[jira] [Commented] (BEAM-2984) Job submission too large with embedded Beam protos

2017-09-22 Thread Reuven Lax (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16176953#comment-16176953
 ] 

Reuven Lax commented on BEAM-2984:
--

Is this a 2.2.0 blocker?

> Job submission too large with embedded Beam protos
> --
>
> Key: BEAM-2984
> URL: https://issues.apache.org/jira/browse/BEAM-2984
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Blocker
> Fix For: 2.2.0
>
>
> Empirically, naively putting context around the {{DoFnInfo}} could cause a
> blowup of 40%, which is too much and might cause jobs that were well under
> API size limits to start to fail.
> There's a certain amount of wiggle room since it is hard to control the 
> submission size anyhow, but 40% is way too much.
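
To make the size concern concrete, here is a minimal Python sketch that computes the relative growth of a submission and checks it against a size limit. The 8 MB baseline and the 10 MB limit are hypothetical values chosen for illustration. They are not figures from the issue or from the Dataflow API.

```python
def relative_blowup(baseline_bytes: int, new_bytes: int) -> float:
    """Return the fractional growth of new_bytes over baseline_bytes."""
    if baseline_bytes <= 0:
        raise ValueError("baseline_bytes must be positive")
    return (new_bytes - baseline_bytes) / baseline_bytes


def fits_under_limit(size_bytes: int, limit_bytes: int) -> bool:
    """Return True if a submission of size_bytes is within limit_bytes."""
    return size_bytes <= limit_bytes


# Hypothetical numbers: a job at 8 MB that grows by 40% against a 10 MB limit.
LIMIT = 10 * 1000 * 1000
baseline = 8 * 1000 * 1000
grown = baseline * 140 // 100  # integer arithmetic avoids float rounding

print(relative_blowup(baseline, grown))
print(fits_under_limit(baseline, LIMIT), fits_under_limit(grown, LIMIT))
```

With these assumed numbers, a job that was comfortably under the limit no longer fits after a 40% increase, which is the failure mode the description warns about.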



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-2690) Make hadoop and hive dependencies provided for HCatalogIO

2017-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16176965#comment-16176965
 ] 

ASF GitHub Bot commented on BEAM-2690:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3711


> Make hadoop and hive dependencies provided for HCatalogIO
> -
>
> Key: BEAM-2690
> URL: https://issues.apache.org/jira/browse/BEAM-2690
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Affects Versions: 2.1.0
>Reporter: Nathan Howell
>Assignee: Nathan Howell
> Fix For: 2.1.0, 2.2.0
>
>
> HCatalogIO takes compile scope dependencies on libraries that are already 
> present in the classpath on a Hadoop cluster. These should have provided 
> scope instead.





[GitHub] beam pull request #3711: [BEAM-2690] Fix type parameter in AvroIO.Write

2017-09-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3711


---


[2/2] beam git commit: This closes #3887: Revert #3859: "Send portable protos for ParDo in DataflowRunner"

2017-09-22 Thread kenn
This closes #3887: Revert #3859: "Send portable protos for ParDo in 
DataflowRunner"


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/3971c7d9
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/3971c7d9
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/3971c7d9

Branch: refs/heads/master
Commit: 3971c7d9c973051cabb7c9fa1a403b15dce11bfd
Parents: 87116cc cf665b6
Author: Kenneth Knowles 
Authored: Fri Sep 22 12:30:08 2017 -0700
Committer: Kenneth Knowles 
Committed: Fri Sep 22 12:30:08 2017 -0700

--
 runners/google-cloud-dataflow-java/pom.xml  |  17 +--
 .../dataflow/DataflowPipelineTranslator.java| 142 ---
 .../runners/dataflow/TransformTranslator.java   |   3 +-
 3 files changed, 26 insertions(+), 136 deletions(-)
--




[1/2] beam git commit: Revert "This closes #3859: [BEAM-2884] Send portable protos for ParDo in DataflowRunner"

2017-09-22 Thread kenn
Repository: beam
Updated Branches:
  refs/heads/master 87116cc74 -> 3971c7d9c


Revert "This closes #3859: [BEAM-2884] Send portable protos for ParDo in 
DataflowRunner"

This reverts commit 0d5d00d7060d6e4ee8273201e3432f14abf35f8a, reversing
changes made to 4e4d102124576aefc3f71e432dbf619792e3.

The blowup to the job submission was a bit much. We will instead wait to
implement a more robust longer-term solution that does not embed the protos
directly in the job submission.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/cf665b61
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/cf665b61
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/cf665b61

Branch: refs/heads/master
Commit: cf665b6113be7f01fcb55e80d3657079055b8f95
Parents: 66b864f
Author: Kenneth Knowles 
Authored: Fri Sep 22 11:34:28 2017 -0700
Committer: Kenneth Knowles 
Committed: Fri Sep 22 11:34:28 2017 -0700

--
 runners/google-cloud-dataflow-java/pom.xml  |  17 +--
 .../dataflow/DataflowPipelineTranslator.java| 142 ---
 .../runners/dataflow/TransformTranslator.java   |   3 +-
 3 files changed, 26 insertions(+), 136 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/cf665b61/runners/google-cloud-dataflow-java/pom.xml
--
diff --git a/runners/google-cloud-dataflow-java/pom.xml 
b/runners/google-cloud-dataflow-java/pom.xml
index 79614ae..eb490cb 100644
--- a/runners/google-cloud-dataflow-java/pom.xml
+++ b/runners/google-cloud-dataflow-java/pom.xml
@@ -33,7 +33,7 @@
   jar
 
   
-
beam-master-20170921
+
beam-master-20170918
 
1
 
6
   
@@ -181,9 +181,7 @@
 
   
 com.google.guava:guava
-com.google.protobuf:protobuf-java
 
org.apache.beam:beam-runners-core-construction-java
-
org.apache.beam:beam-sdks-common-runner-api
   
 
 
@@ -209,10 +207,6 @@
 
org.apache.beam.runners.dataflow.repackaged.com.google.common
   
   
-com.google.protobuf
-
org.apache.beam.runners.dataflow.repackaged.com.google.protobuf
-  
-  
 com.google.thirdparty
 
org.apache.beam.runners.dataflow.repackaged.com.google.thirdparty
   
@@ -220,10 +214,6 @@
 org.apache.beam.runners.core
 
org.apache.beam.runners.dataflow.repackaged.org.apache.beam.runners.core
   
-  
-org.apache.beam.sdk.common.runner
-
org.apache.beam.runners.dataflow.repackaged.org.apache.beam.sdk.common.runner
-  
 
 
   
@@ -384,11 +374,6 @@
 
 
 
-  com.google.protobuf
-  protobuf-java
-
-
-
   com.fasterxml.jackson.core
   jackson-core
 

http://git-wip-us.apache.org/repos/asf/beam/blob/cf665b61/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java
index 354781e..4f9b939 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java
@@ -47,7 +47,6 @@ import com.google.common.collect.BiMap;
 import com.google.common.collect.ImmutableBiMap;
 import com.google.common.collect.ImmutableMap;
 import com.google.common.collect.Iterables;
-import com.google.protobuf.ByteString;
 import java.io.IOException;
 import java.util.ArrayList;
 import java.util.Collections;
@@ -57,9 +56,6 @@ import java.util.List;
 import java.util.Map;
 import java.util.concurrent.atomic.AtomicLong;
 import javax.annotation.Nullable;
-import org.apache.beam.runners.core.construction.PTransformTranslation;
-import org.apache.beam.runners.core.construction.ParDoTranslation;
-import org.apache.beam.runners.core.construction.SdkComponents;
 import org.apache.beam.runners.core.construction.SplittableParDo;
 import org.apache.beam.runners.core.construction.TransformInputs;
 import org.apache.beam.runners.core.construction.WindowingStrategyTranslation;
@@ -77,7 +73,6 @@ import 

[GitHub] beam pull request #3887: [BEAM-2984] Revert "This closes #3859: Send portabl...

2017-09-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3887


---


[beam-site] 02/02: This closes #324

2017-09-22 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 91aa43ce8863f005819062589e8fa79e35416e41
Merge: ed155be eeba9b3
Author: Mergebot 
AuthorDate: Fri Sep 22 19:53:41 2017 +

This closes #324

 src/_data/capability-matrix.yml | 30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
"commits@beam.apache.org" .


[beam-site] branch mergebot updated (0eb4ff4 -> 91aa43c)

2017-09-22 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a change to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


from 0eb4ff4  This closes #320
 add ed155be  Prepare repository for deployment.
 new eeba9b3  Update Mapreduce capability matrix when/how entries
 new 91aa43c  This closes #324

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 content/documentation/io/built-in/index.html   |2 +-
 content/documentation/io/io-toc/index.html |2 +-
 .../pipelines/create-your-pipeline/index.html  |2 +-
 .../pipelines/design-your-pipeline/index.html  |4 +-
 .../pipelines/test-your-pipeline/index.html|   12 +-
 content/documentation/programming-guide/index.html | 1923 +++-
 .../documentation/sdks/python-custom-io/index.html |2 +-
 .../get-started/mobile-gaming-example/index.html   |8 +-
 content/get-started/wordcount-example/index.html   |6 +-
 src/_data/capability-matrix.yml|   30 +-
 10 files changed, 1476 insertions(+), 515 deletions(-)



Jenkins build is unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #4026

2017-09-22 Thread Apache Jenkins Server
See 




svn commit: r21900 - in /release/beam: 2.1.1/ 2.1.1/apache-beam-2.1.1-python.zip 2.1.1/apache-beam-2.1.1-python.zip.asc 2.1.1/apache-beam-2.1.1-python.zip.md5 2.1.1/apache-beam-2.1.1-python.zip.sha1 2

2017-09-22 Thread robertwb
Author: robertwb
Date: Fri Sep 22 21:46:33 2017
New Revision: 21900

Log:
Release Apache Beam 2.1.1.

Added:
release/beam/2.1.1/
release/beam/2.1.1/apache-beam-2.1.1-python.zip   (with props)
release/beam/2.1.1/apache-beam-2.1.1-python.zip.asc
release/beam/2.1.1/apache-beam-2.1.1-python.zip.md5
release/beam/2.1.1/apache-beam-2.1.1-python.zip.sha1
release/beam/2.1.1/apache-beam-2.1.1-python.zip.sha256
Modified:
release/beam/KEYS

Added: release/beam/2.1.1/apache-beam-2.1.1-python.zip
==
Binary file - no diff available.

Propchange: release/beam/2.1.1/apache-beam-2.1.1-python.zip
--
svn:mime-type = application/octet-stream

Added: release/beam/2.1.1/apache-beam-2.1.1-python.zip.asc
==
--- release/beam/2.1.1/apache-beam-2.1.1-python.zip.asc (added)
+++ release/beam/2.1.1/apache-beam-2.1.1-python.zip.asc Fri Sep 22 21:46:33 2017
@@ -0,0 +1,16 @@
+-BEGIN PGP SIGNATURE-
+
+iQIcBAABCgAGBQJZw234AAoJEP4oOH5D7B695xEQALNNVrsWDy6K3jmTm5n3fm+3
+cHaXdTLvrjUtnBa6TFV49JlE9OIukDDfCaZyXBfFpcUzN8vVr1Ogbe/OKr+cNM1j
+la7pRqSpqG63L2JcMq2R4htdWoS5LrJYF36DFSpmnEiRy1iS2O/aHKs8K6n5xpBO
+2UXXMBWZrn0ffaE+drrxpWW4umL2DqKibAJB9SMuqzcfKooDIjhZzulD3eH2KCZd
+4hZv8qxoSqPX6a+vBiVf1cZHJ87GgFsKA9bmaLs8uIrA7tw6qgu401lUEUJrCL80
+ge6e7wp+orhgzXPrIi0uXRQ6+vjg5jIKeARlKYYiZy/sfUxBx9zmul/4k28xpA4E
+c/xycyIoP8fe86sz6XR5DR1nVIMYCDkFcrE3gM5IzQo0yZkCrjUNBmLPcYUyFcJj
+yfC22keYIbql3GWkpFRncDtcJnBX/3qpkb6xgUCfPgzh1iATjql8tNMj5FHSWR/D
+tYU7Lomb0WQFOwVMWVk6pRfff9qzp8QdBd8K+D1PMXAtUlTTo3Ipu2+5QwdCvagU
+Qoj4LTpGtQb4QFMdGnO5i9IPiM6hAZOYv2EPLEGL0DF0ue6wDUeUWbrTEp6O8dJu
+kIn/3EQ1Px3f/iz8y66/OqoYj9yQsK0YKdnMXW0JKcqIiSs93OIW+iLRgEpWLAVl
+UUlixDnrQdWP9RPp3O/n
+=THtB
+-END PGP SIGNATURE-

Added: release/beam/2.1.1/apache-beam-2.1.1-python.zip.md5
==
--- release/beam/2.1.1/apache-beam-2.1.1-python.zip.md5 (added)
+++ release/beam/2.1.1/apache-beam-2.1.1-python.zip.md5 Fri Sep 22 21:46:33 2017
@@ -0,0 +1 @@
+MD5 (apache-beam-2.1.1-python.zip) = d7ebd955bb9ec7871f23af987582f881

Added: release/beam/2.1.1/apache-beam-2.1.1-python.zip.sha1
==
--- release/beam/2.1.1/apache-beam-2.1.1-python.zip.sha1 (added)
+++ release/beam/2.1.1/apache-beam-2.1.1-python.zip.sha1 Fri Sep 22 21:46:33 
2017
@@ -0,0 +1 @@
+93e903076fa2efdbe9ede9b399c43da9de3d0e6a  apache-beam-2.1.1-python.zip

Added: release/beam/2.1.1/apache-beam-2.1.1-python.zip.sha256
==
--- release/beam/2.1.1/apache-beam-2.1.1-python.zip.sha256 (added)
+++ release/beam/2.1.1/apache-beam-2.1.1-python.zip.sha256 Fri Sep 22 21:46:33 
2017
@@ -0,0 +1 @@
+e8ebbe6b20b490357b56e1cf1cbf62e8fd915d4803108ef7af541869793b1793  
apache-beam-2.1.1-python.zip
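
The published `.md5`, `.sha1` and `.sha256` files let a downloader check the release archive. A minimal Python sketch of that check follows, using only the standard library `hashlib` module. The function names are illustrative and are not part of any Beam tooling.

```python
import hashlib


def file_digest(path: str, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of the file at path, reading it in chunks."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read fixed-size chunks so large archives are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches_published(path: str, published_hex: str, algorithm: str = "sha256") -> bool:
    """Compare a computed digest with a published one, ignoring case and surrounding whitespace."""
    return file_digest(path, algorithm) == published_hex.strip().lower()
```

For example, `matches_published("apache-beam-2.1.1-python.zip", "<hex value from the .sha256 file>")` would return `True` only if the local archive hashes to the published value.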

Modified: release/beam/KEYS
==
--- release/beam/KEYS (original)
+++ release/beam/KEYS Fri Sep 22 21:46:33 2017
@@ -255,3 +255,59 @@ V3m2QtLzHBUqN+/FTzUyDASO8kO4J0OaBB6iOTlZ
 ic/pJUsMBOm0ADPVB7YO
 =5xpx
 -END PGP PUBLIC KEY BLOCK-
+pub   4096R/43EC1EBD 2017-09-21 [expires: 2021-09-21]
+uid   [ultimate] Robert Bradshaw 
+sig 3        43EC1EBD 2017-09-21  Robert Bradshaw 
+sub   4096R/B4A515D5 2017-09-21 [expires: 2021-09-21]
+sig  43EC1EBD 2017-09-21  Robert Bradshaw 
+-BEGIN PGP PUBLIC KEY BLOCK-
+
+mQINBFnDXGMBEADyuxKOkcO9hBk9xkKcLrHhw3s7D782fo9OrLMaMCInTcNJB/tp
+HMvOUuBlFa/2rn74ZPQhfFyTErlord4e60kTVPtKt34Ouj5udFxY8M8W0YsFOyoL
+dAHaNB6vwmAnOKgRm2PFwC1wMyeIzGinYWoml8OYpnwK3lf4MurizktHU4/h3tTg
++j2gDJDn3X2TObJ6t6Q19pyQ7yI8VdRl3KpbXSY5s6gEfcx7ihU2fg651Txmof8e
+mN5GjjkEcktkkEz8icHHI9vWLX4h7PSz2zwHm52Yet8ysVbBLB7HKnOStOpBebJf
+ja2EDt0AtbYpuOBSq+xcxKQQI44IXcqOcRNzThY7BJ62ZuMZ838oqsG6gFqOuhqA
+lkfPxubE61inQoU+ce8N9JSsjv5dcqmpE6Ul3StvZPNDf01G8E1tUzlzPt7UJiHM
+rrj3XI9oGfU+hRGaOwLne4IJqnU+noEUvBWnKICC35+R3/x9dtHe+qEKONpekPAV
+oG986icjW1L85r+6OVOJZPotzDgkQn4ArbOF92nnyXa8BJumF5ZNTr9mB3+L8pFt
+RgjLltxULDH3zejlBXgRoWfv2lVXigQNbRMlZnIKH5PwN4RwoskDbDA3WmzFfCPi
+rg7m1RPPZpbXiXAErq8vSsyceMVxgq7AFfkzwdgIHkCkwUcLlyAEd92eRQARAQAB
+tCVSb2JlcnQgQnJhZHNoYXcgPHJvYmVydHdiQGFwYWNoZS5vcmc+iQI9BBMBCgAn
+BQJZw1xjAhsDBQkHhh+ABQsJCAcDBRUKCQgLBRYCAwEAAh4BAheAAAoJEP4oOH5D
+7B69gCwQANqA3hO3yfulsf8ZZxFaFbypsit4TcThCgEStz/shs929rbsMah22REy
+x3ZUv62/2uxCnc3pncSnNVmMsECoL2uE5SKGOPGUvh3SFu5Huy6CpW2/xCLq0GSl
+AAWBqrmWI6yNGrQ77UfQAPPvFHft9NYyl1gGr3ypMDBR6I27vocT9FVLFD8Y8z5O
+FHLhO+UWpmFj2uIIBm3/UAnPu1ELn1EbiRvLq4nc59e/9ueXHIB232ndhdwxNDoc
+uQMoIP7X3s2iEVSMN5i/eNSAIdK9LBXzpsB6Kw+c9xVnIN1e2PU7zL/30EmlvQdr

Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #4854

2017-09-22 Thread Apache Jenkins Server
See 




Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Apex #2448

2017-09-22 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PostCommit_Python_Verify #3193

2017-09-22 Thread Apache Jenkins Server
See 


Changes:

[klk] Revert "This closes #3859: [BEAM-2884] Send portable protos for ParDo in

[chamikara] Revert "Initial set of pipeline jobs."

--
[...truncated 990.95 KB...]
  }
], 
"non_parallel_inputs": {}, 
"output_info": [
  {
"encoding": {
  "@type": "kind:windowed_value", 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}, 
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}
  ], 
  "is_pair_like": true
}, 
{
  "@type": "kind:global_window"
}
  ], 
  "is_wrapper": true
}, 
"output_name": "out", 
"user_name": 
"assert_that/Group/Map(_merge_tagged_vals_under_key).out"
  }
], 
"parallel_input": {
  "@type": "OutputReference", 
  "output_name": "out", 
  "step_name": "s11"
}, 
"serialized_fn": "", 
"user_name": "assert_that/Group/Map(_merge_tagged_vals_under_key)"
  }
}, 
{
  "kind": "ParallelDo", 
  "name": "s13", 
  "properties": {
"display_data": [
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.CallableWrapperDoFn", 
"type": "STRING", 
"value": ""
  }, 
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.ParDo", 
"shortValue": "CallableWrapperDoFn", 
"type": "STRING", 
"value": "apache_beam.transforms.core.CallableWrapperDoFn"
  }
], 
"non_parallel_inputs": {}, 
"output_info": [
  {
"encoding": {
  "@type": "kind:windowed_value", 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}, 
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}
  ], 
  "is_pair_like": true
}, 
{
  "@type": "kind:global_window"
}
  ], 
  "is_wrapper": true
}, 
"output_name": "out", 
"user_name": "assert_that/Unkey.out"
  }
], 
"parallel_input": {
  "@type": "OutputReference", 
  "output_name": "out", 
  "step_name": "s12"
}, 
"serialized_fn": "", 
"user_name": "assert_that/Unkey"
  }
}, 
{
  "kind": "ParallelDo", 
  "name": "s14", 
  "properties": {
"display_data": [
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.CallableWrapperDoFn", 
"type": "STRING", 
"value": "_equal"
  }, 
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.ParDo", 
"shortValue": "CallableWrapperDoFn", 
"type": "STRING", 
"value": "apache_beam.transforms.core.CallableWrapperDoFn"
  }
], 
"non_parallel_inputs": {}, 
"output_info": [
  {
"encoding": {
  "@type": "kind:windowed_value", 
  

[2/3] beam git commit: Add a Local FS implementation of the Artifact Staging API

2017-09-22 Thread tgroh
Add a Local FS implementation of the Artifact Staging API


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/2f178fbe
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/2f178fbe
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/2f178fbe

Branch: refs/heads/master
Commit: 2f178fbeab2846f940fb98b2518cc9aa9c24b31d
Parents: 465ecfc
Author: Thomas Groh 
Authored: Wed Sep 13 13:32:20 2017 -0700
Committer: Thomas Groh 
Committed: Fri Sep 22 15:02:43 2017 -0700

--
 runners/local-artifact-service-java/pom.xml | 116 
 .../LocalFileSystemArtifactStagerService.java   | 276 +++
 .../beam/artifact/local/package-info.java   |  22 ++
 ...ocalFileSystemArtifactStagerServiceTest.java | 274 ++
 runners/pom.xml |   1 +
 5 files changed, 689 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/2f178fbe/runners/local-artifact-service-java/pom.xml
--
diff --git a/runners/local-artifact-service-java/pom.xml 
b/runners/local-artifact-service-java/pom.xml
new file mode 100644
index 000..0215798
--- /dev/null
+++ b/runners/local-artifact-service-java/pom.xml
@@ -0,0 +1,116 @@
+
+
+http://maven.apache.org/POM/4.0.0; 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance; 
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
http://maven.apache.org/xsd/maven-4.0.0.xsd;>
+
+  4.0.0
+
+  
+org.apache.beam
+beam-runners-parent
+2.2.0-SNAPSHOT
+../pom.xml
+  
+
+  beam-local-artifact-service-java
+  Apache Beam :: Runners :: Java Local Artifact Service
+  The Beam Artifact Service exposes APIs to stage and retrieve
+artifacts in a manner independent of the underlying storage system, for use
+by the Beam portability framework. The local implementation uses the local
+File System as the underlying storage system.
+
+  jar
+
+  
+
+  
+org.apache.maven.plugins
+maven-surefire-plugin
+  
+
+  
+  
+org.jacoco
+jacoco-maven-plugin
+  
+
+  
+
+  
+
+  org.apache.beam
+  beam-sdks-common-runner-api
+
+
+
+
+
+  com.google.code.findbugs
+  jsr305
+
+
+
+  com.google.guava
+  guava
+
+
+
+  io.grpc
+  grpc-core
+
+
+
+  io.grpc
+  grpc-stub
+
+
+
+  com.google.protobuf
+  protobuf-java
+
+
+
+  org.slf4j
+  slf4j-api
+
+
+
+
+  org.hamcrest
+  hamcrest-all
+  test
+
+
+
+  org.mockito
+  mockito-all
+  test
+
+
+
+  junit
+  junit
+  test
+
+
+
+  org.slf4j
+  slf4j-jdk14
+  test
+
+  
+

http://git-wip-us.apache.org/repos/asf/beam/blob/2f178fbe/runners/local-artifact-service-java/src/main/java/org/apache/beam/artifact/local/LocalFileSystemArtifactStagerService.java
--
diff --git 
a/runners/local-artifact-service-java/src/main/java/org/apache/beam/artifact/local/LocalFileSystemArtifactStagerService.java
 
b/runners/local-artifact-service-java/src/main/java/org/apache/beam/artifact/local/LocalFileSystemArtifactStagerService.java
new file mode 100644
index 000..6b42a3b
--- /dev/null
+++ 
b/runners/local-artifact-service-java/src/main/java/org/apache/beam/artifact/local/LocalFileSystemArtifactStagerService.java
@@ -0,0 +1,276 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.artifact.local;
+
+import static com.google.common.base.Preconditions.checkState;
+
+import com.google.common.base.Throwables;
+import io.grpc.Status;
+import io.grpc.StatusException;
+import io.grpc.StatusRuntimeException;
+import io.grpc.stub.StreamObserver;
+import java.io.File;
+import java.io.FileOutputStream;
+import java.io.IOException;
+import java.io.OutputStream;
+import java.util.ArrayList;
+import 

[1/3] beam git commit: This closes #3852

2017-09-22 Thread tgroh
Repository: beam
Updated Branches:
  refs/heads/master 74a5c9e91 -> 7dacfdcda


This closes #3852


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/7dacfdcd
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/7dacfdcd
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/7dacfdcd

Branch: refs/heads/master
Commit: 7dacfdcdad173cbda55c801e0424abdbfccdd2fd
Parents: 74a5c9e 2f178fb
Author: Thomas Groh 
Authored: Fri Sep 22 15:02:43 2017 -0700
Committer: Thomas Groh 
Committed: Fri Sep 22 15:02:43 2017 -0700

--
 runners/local-artifact-service-java/pom.xml | 116 
 .../LocalFileSystemArtifactStagerService.java   | 276 +++
 .../beam/artifact/local/package-info.java   |  22 ++
 ...ocalFileSystemArtifactStagerServiceTest.java | 274 ++
 runners/pom.xml |   1 +
 .../src/main/proto/beam_artifact_api.proto  |  20 +-
 6 files changed, 703 insertions(+), 6 deletions(-)
--




[3/3] beam git commit: Artifact API Cleanup

2017-09-22 Thread tgroh
Artifact API Cleanup

Have an explicit checksum message to encapsulate a (algorithm, value)

Include the entire metadata when uploading an artifact.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/465ecfc3
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/465ecfc3
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/465ecfc3

Branch: refs/heads/master
Commit: 465ecfc39606ad5d936492f38015311e24d5641f
Parents: 74a5c9e
Author: Thomas Groh 
Authored: Tue Sep 19 11:45:22 2017 -0700
Committer: Thomas Groh 
Committed: Fri Sep 22 15:02:43 2017 -0700

--
 .../src/main/proto/beam_artifact_api.proto  | 20 ++--
 1 file changed, 14 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/465ecfc3/sdks/common/runner-api/src/main/proto/beam_artifact_api.proto
--
diff --git a/sdks/common/runner-api/src/main/proto/beam_artifact_api.proto 
b/sdks/common/runner-api/src/main/proto/beam_artifact_api.proto
index 6e39d88..f713fa7 100644
--- a/sdks/common/runner-api/src/main/proto/beam_artifact_api.proto
+++ b/sdks/common/runner-api/src/main/proto/beam_artifact_api.proto
@@ -55,20 +55,28 @@ service ArtifactRetrievalService {
 }
 
 // An artifact identifier and associated metadata.
-message Artifact {
+message ArtifactMetadata {
   // (Required) The name of the artifact.
   string name = 1;
 
   // (Optional) The Unix-like permissions of the artifact
   int32 permissions = 2;
 
-  // (Optional) The md5 checksum of the artifact.
-  string md5 = 3;
+  // (Optional) The checksum of the artifact.
+  Checksum checksum = 3;
+}
+
+message Checksum {
+  // (Required) the algorithm used to generate this checksum
+  string algorithm = 1;
+
+  // (Required) the value of this checksum
+  bytes value = 2;
 }
 
 // A collection of artifacts.
 message Manifest {
-  repeated Artifact artifact = 1;
+  repeated ArtifactMetadata artifact = 1;
 }
 
 // A request to get the manifest of a Job.
@@ -94,9 +102,9 @@ message ArtifactChunk {
 message PutArtifactRequest {
   // (Required)
   oneof content {
-// The name of the artifact. The first message in a PutArtifact call must 
contain the name
+// The Artifact metadata. The first message in a PutArtifact call must 
contain the name
 // of the artifact.
-string name = 1;
+ArtifactMetadata metadata = 1;
 
 // A chunk of the artifact. All messages after the first in a PutArtifact 
call must contain a
 // chunk.
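
The message shapes in this diff can be sketched in Python to show how a stager might consume a PutArtifact stream: the first message carries the `ArtifactMetadata`, every later message is a chunk, and the optional `Checksum` is verified once all chunks have arrived. This is a sketch under stated assumptions, not the Beam implementation. The dataclasses only mirror the proto fields, the `put_artifact` function and its list-based "stream" are hypothetical, the default permissions are an assumption, and the checksum algorithm is assumed to be a name that `hashlib` accepts.

```python
import hashlib
import os
from dataclasses import dataclass
from typing import Iterable, Optional, Union


@dataclass
class Checksum:
    algorithm: str  # assumed to be a hashlib algorithm name, e.g. "sha256"
    value: bytes    # raw digest bytes


@dataclass
class ArtifactMetadata:
    name: str
    permissions: int = 0o644  # assumed default for this sketch
    checksum: Optional[Checksum] = None


def put_artifact(requests: Iterable[Union[ArtifactMetadata, bytes]], staging_dir: str) -> str:
    """Stage one artifact: the first request is metadata, the rest are chunks."""
    it = iter(requests)
    first = next(it, None)
    if not isinstance(first, ArtifactMetadata):
        raise ValueError("first message must carry the artifact metadata")
    hasher = hashlib.new(first.checksum.algorithm) if first.checksum else None
    path = os.path.join(staging_dir, first.name)
    with open(path, "wb") as out:
        for chunk in it:
            if not isinstance(chunk, bytes):
                raise ValueError("messages after the first must be chunks")
            out.write(chunk)
            if hasher:
                hasher.update(chunk)
    # Reject and clean up the staged file if the declared checksum does not match.
    if hasher and hasher.digest() != first.checksum.value:
        os.remove(path)
        raise ValueError("checksum mismatch for " + first.name)
    os.chmod(path, first.permissions)
    return path
```

Sending the whole metadata first, rather than only the name, means the receiver knows the expected checksum before any bytes arrive and can verify the upload without a second round trip.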



[jira] [Commented] (BEAM-2885) Support job+artifact APIs locally

2017-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16177228#comment-16177228
 ] 

ASF GitHub Bot commented on BEAM-2885:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3852


> Support job+artifact APIs locally
> -
>
> Key: BEAM-2885
> URL: https://issues.apache.org/jira/browse/BEAM-2885
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-dataflow
>Reporter: Henning Rohde
>Assignee: Thomas Groh
>  Labels: portability
>
> As per https://s.apache.org/beam-job-api, use local support for 
> submission-side. 





[GitHub] beam pull request #3852: [BEAM-2885] Add a Local FS implementation of the Ar...

2017-09-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3852


---


Jenkins build is back to normal : beam_PostCommit_Python_Verify #3194

2017-09-22 Thread Apache Jenkins Server
See 




[beam-site] 01/02: Update Mapreduce capability matrix when/how entries

2017-09-22 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit eeba9b3321238f03071c32e7fe507145169552df
Author: melissa 
AuthorDate: Thu Sep 21 09:41:58 2017 -0700

Update Mapreduce capability matrix when/how entries
---
 src/_data/capability-matrix.yml | 30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/src/_data/capability-matrix.yml b/src/_data/capability-matrix.yml
index 1c1171d..b0ea35a 100644
--- a/src/_data/capability-matrix.yml
+++ b/src/_data/capability-matrix.yml
@@ -577,7 +577,7 @@ categories:
 l3: ''
   - class: mapreduce
 l1: 'No'
-l2: It is a batch-only runner, and intermediate trigger firings 
are effectively meaningless.
+l2: batch-only runner
 l3: ''
 
   - name: Event-time triggers
@@ -608,7 +608,7 @@ categories:
 l3: ''
   - class: mapreduce
 l1: 'No'
-l2: Currently watermark progress jumps from the beginning of time 
to the end of time once the input has been fully consumed, thus no additional 
triggering granularity is available.
+l2: ''
 l3: ''
 
   - name: Processing-time triggers
@@ -639,7 +639,7 @@ categories:
 l3: ''
   - class: mapreduce
 l1: 'No'
-l2: From the perspective of triggers, processing time currently 
jumps from the beginning of time to the end of time once the input has been 
fully consumed, thus no additional triggering granularity is available.
+l2: ''
 l3: ''
 
   - name: Count triggers
@@ -670,7 +670,7 @@ categories:
 l3: ''
   - class: mapreduce
 l1: 'No'
-l2: Elements are processed in the largest bundles possible, so 
count-based triggers are effectively meaningless.
+l2: ''
 l3: ''
 
   - name: '[Meta]data driven triggers'
@@ -702,7 +702,7 @@ categories:
 l3:
   - class: mapreduce
 l1: 'No'
-l2: pending model support
+l2: ''
 l3:
 
   - name: Composite triggers
@@ -732,8 +732,8 @@ categories:
 l2: ''
 l3: ''
   - class: mapreduce
-l1: 'Yes'
-l2: fully supported
+l1: 'No'
+l2: ''
 l3: ''
 
   - name: Allowed lateness
@@ -764,7 +764,7 @@ categories:
 l3: ''
   - class: mapreduce
 l1: 'No'
-l2: No data is ever late.
+l2: ''
 l3: ''
 
   - name: Timers
@@ -794,8 +794,8 @@ categories:
 l2: not implemented
 l3: ''
   - class: mapreduce
-l1: 'Partially'
-l2: not implemented
+l1: 'No'
+l2: ''
 l3: ''
 
   - description: How do refinements relate?
@@ -833,8 +833,8 @@ categories:
 l2: fully supported
 l3: ''
   - class: mapreduce
-l1: 'Yes'
-l2: fully supported
+l1: 'No'
+l2: batch-only runner
 l3: ''
 
   - name: Accumulating
@@ -864,8 +864,8 @@ categories:
 l2: ''
 l3: ''
   - class: mapreduce
-l1: 'Yes'
-l2: fully supported
+l1: 'No'
+l2: ''
 l3: ''
 
   - name: 'Accumulating & Retracting'
@@ -897,5 +897,5 @@ categories:
 l3: ''
   - class: mapreduce
 l1: 'No'
-l2: pending model support
+l2: ''
 l3: ''



[beam] Git Push Summary

2017-09-22 Thread robertwb
Repository: beam
Updated Tags:  refs/tags/v2.1.1 [created] d6e6ea9c7


[jira] [Created] (BEAM-2985) BigQuery IO write transform is broken for DirectRunner

2017-09-22 Thread Chamikara Jayalath (JIRA)
Chamikara Jayalath created BEAM-2985:


 Summary: BigQuery IO write transform is broken for DirectRunner
 Key: BEAM-2985
 URL: https://issues.apache.org/jira/browse/BEAM-2985
 Project: Beam
  Issue Type: Bug
  Components: sdk-py-core
Reporter: Chamikara Jayalath
Assignee: Chamikara Jayalath


I get the following error when trying to run BigQuery tornadoes using DirectRunner.

DataflowRunner seems to be working fine.

python -m apache_beam.examples.cookbook.bigquery_tornadoes --output 
. --project 

 Request missing required parameter projectId
 Traceback for above exception (most recent call last):
  File "apache_beam/utils/retry.py", line 175, in wrapper
return fun(*args, **kwargs)
  File "apache_beam/io/gcp/bigquery.py", line 828, in _get_table
response = self.client.tables.Get(request)
  File "apache_beam/io/gcp/internal/clients/bigquery/bigquery_v2_client.py", 
line 608, in Get
config, request, global_params=global_params)
  File 
"/Users/chamikara/testing/beam_bq_09_22_2017/env1/lib/python2.7/site-packages/apitools/base/py/base_api.py",
 line 695, in _RunMethod
download)
  File 
"/Users/chamikara/testing/beam_bq_09_22_2017/env1/lib/python2.7/site-packages/apitools/base/py/base_api.py",
 line 676, in PrepareHttpRequest
method_config, request, relative_path=url_builder.relative_path)
  File 
"/Users/chamikara/testing/beam_bq_09_22_2017/env1/lib/python2.7/site-packages/apitools/base/py/base_api.py",
 line 580, in __ConstructRelativePath
relative_path=relative_path)
  File 
"/Users/chamikara/testing/beam_bq_09_22_2017/env1/lib/python2.7/site-packages/apitools/base/py/util.py",
 line 124, in ExpandRelativePath
'Request missing required parameter %s' % param)
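
The traceback ends in URL path expansion, where a required path parameter (`projectId`) has no value. The following Python sketch reproduces that kind of failure in isolation. It is a simplified stand-in written for illustration, not the apitools implementation, and the path template is included only as an example of a templated REST path.

```python
import re
from urllib.parse import quote


def expand_relative_path(template: str, params: dict) -> str:
    """Fill {name} placeholders in template; fail on missing or empty values."""
    def substitute(match):
        name = match.group(1)
        value = params.get(name)
        if value is None or value == "":
            raise ValueError("Request missing required parameter %s" % name)
        # Percent-encode the value so it is safe inside a URL path segment.
        return quote(str(value), safe="")
    return re.sub(r"\{(\w+)\}", substitute, template)


# Example template (illustrative) with a table reference that lacks a project.
TEMPLATE = "projects/{projectId}/datasets/{datasetId}/tables/{tableId}"
try:
    expand_relative_path(TEMPLATE, {"projectId": None, "datasetId": "d", "tableId": "t"})
except ValueError as err:
    print(err)
```

In this sketch the error surfaces only when the request path is built, which matches where the reported exception is raised: far from the place where the project value should have been supplied.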





Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #3903

2017-09-22 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PostCommit_Python_Verify #3192

2017-09-22 Thread Apache Jenkins Server
See 


Changes:

[relax] Fix type parameter in AvroIO.Write

--
[...truncated 1.11 MB...]
  "is_pair_like": true
}
  ], 
  "is_stream_like": true
}
  ], 
  "is_pair_like": true
}, 
{
  "@type": "kind:global_window"
}
  ], 
  "is_wrapper": true
}, 
"output_name": "out", 
"user_name": "assert_that/Group/GroupByKey.out"
  }
], 
"parallel_input": {
  "@type": "OutputReference", 
  "output_name": "out", 
  "step_name": "s10"
}, 
"serialized_fn": 
"%0AZ%22X%0A%1Dref_Coder_GlobalWindowCoder_1%127%0A5%0A3%0A1urn%3Abeam%3Acoders%3Aurn%3Abeam%3Acoders%3Aglobal_window%3A0.1jJ%0A%25%0A%23%0A%21beam%3Awindowfn%3Aglobal_windows%3Av0.1%1A%1Dref_Coder_GlobalWindowCoder_1%22%02%3A%00",
 
"user_name": "assert_that/Group/GroupByKey"
  }
}, 
{
  "kind": "ParallelDo", 
  "name": "s12", 
  "properties": {
"display_data": [
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.CallableWrapperDoFn", 
"type": "STRING", 
"value": "_merge_tagged_vals_under_key"
  }, 
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.ParDo", 
"shortValue": "CallableWrapperDoFn", 
"type": "STRING", 
"value": "apache_beam.transforms.core.CallableWrapperDoFn"
  }
], 
"non_parallel_inputs": {}, 
"output_info": [
  {
"encoding": {
  "@type": "kind:windowed_value", 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}, 
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}
  ], 
  "is_pair_like": true
}, 
{
  "@type": "kind:global_window"
}
  ], 
  "is_wrapper": true
}, 
"output_name": "out", 
"user_name": 
"assert_that/Group/Map(_merge_tagged_vals_under_key).out"
  }
], 
"parallel_input": {
  "@type": "OutputReference", 
  "output_name": "out", 
  "step_name": "s11"
}, 
"serialized_fn": "", 
"user_name": "assert_that/Group/Map(_merge_tagged_vals_under_key)"
  }
}, 
{
  "kind": "ParallelDo", 
  "name": "s13", 
  "properties": {
"display_data": [
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.CallableWrapperDoFn", 
"type": "STRING", 
"value": ""
  }, 
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.ParDo", 
"shortValue": "CallableWrapperDoFn", 
"type": "STRING", 
"value": "apache_beam.transforms.core.CallableWrapperDoFn"
  }
], 
"non_parallel_inputs": {}, 
"output_info": [
  {
"encoding": {
  "@type": "kind:windowed_value", 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}, 
{
  "@type": 

[GitHub] beam pull request #3890: Introduces Reshuffle.viaRandomKey()

2017-09-22 Thread jkff
GitHub user jkff opened a pull request:

https://github.com/apache/beam/pull/3890

Introduces Reshuffle.viaRandomKey()

It's a commonly used pattern for breaking fusion: 
https://cloud.google.com/dataflow/service/dataflow-service-desc#fusion-optimization

viaRandomKey() only abstracts away the current commonly used pattern. It 
has the same caveats as using Reshuffle.of() directly - the semantics are 
technically not guaranteed by the Beam model, but it works in practice, and 
this is the pattern we keep recommending to users.

The naming is deliberately operational rather than semantic, to emphasize 
that we don't have the semantics figured out, and the transform promises only 
that it expands into exactly the sequence "pair with random key, reshuffle, 
drop key". The goal of this change is just to reduce copy-paste.
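
As a rough illustration of the sequence "pair with random key, reshuffle, drop key", here is a plain-Python sketch that does not use the Beam SDK at all. The function name, the num_keys parameter and the use of an in-memory group-by as a stand-in for the reshuffle step are assumptions made for this sketch; the actual transform is a Java PTransform and its behavior depends on the runner.

```python
import random
from collections import defaultdict


def via_random_key(elements, num_keys=1 << 16, rng=None):
    """Simulate 'pair with random key, reshuffle, drop key' on a list."""
    rng = rng or random.Random()

    # Step 1: pair every element with a random key.
    keyed = [(rng.randrange(num_keys), element) for element in elements]

    # Step 2: stand-in for the reshuffle. Grouping by key forces all
    # elements to be materialized, which is the point at which a runner
    # would stop fusing the upstream and downstream steps.
    groups = defaultdict(list)
    for key, element in keyed:
        groups[key].append(element)

    # Step 3: drop the key and emit the values again.
    return [element for values in groups.values() for element in values]


if __name__ == '__main__':
    out = via_random_key(list(range(10)), rng=random.Random(0))
    # The same elements come out; their order is not guaranteed.
    print(sorted(out))
```

The sketch keeps the set of elements unchanged and makes no promise about ordering, which matches the operational framing of the description above.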

See prior discussion at 
https://lists.apache.org/thread.html/ac34c9ac665a8d9f67b0254015e44c59ea65ecc1360d4014b95d3b2e@%3Cdev.beam.apache.org%3E

This change also converts several existing usages to use it, and adds 
another one in Match.

R: @bjchambers 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jkff/incubator-beam match-fusion-break

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3890.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3890


commit 0b4d801e4afb0be1463b196f419e1293265b68c1
Author: Eugene Kirpichov 
Date:   2017-09-22T22:24:36Z

Introduces Reshuffle.viaRandomKey()

It's a commonly used pattern for breaking fusion

https://cloud.google.com/dataflow/service/dataflow-service-desc#fusion-optimization

viaRandomKey() only abstracts away the current commonly used pattern.
It has the same caveats as using Reshuffle.of() directly - the semantics
are technically not guaranteed by the Beam model, but it works in
practice, and this is the pattern we keep recommending to users.

The naming is deliberately operational rather than semantic, to
emphasize that we don't have the semantics figured out, and the
transform promises only that it expands into exactly the sequence
"pair with random key, reshuffle, drop key".
The goal of this change is just to reduce copy-paste.

See prior discussion at

https://lists.apache.org/thread.html/ac34c9ac665a8d9f67b0254015e44c59ea65ecc1360d4014b95d3b2e@%3Cdev.beam.apache.org%3E

This change also converts several existing usages to use it, and adds
another one in Match.




---


Jenkins build is back to stable : beam_PostCommit_Java_MavenInstall #4858

2017-09-22 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #3904

2017-09-22 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Apex #2449

2017-09-22 Thread Apache Jenkins Server
See 


Changes:

[chamikara] Revert "Initial set of pipeline jobs."

--
[...truncated 479.44 KB...]
2017-09-22T21:07:27.379 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/slf4j/slf4j-log4j12/1.6.1/slf4j-log4j12-1.6.1.pom
 (2 KB at 62.1 KB/sec)
2017-09-22T21:07:27.382 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/slf4j/slf4j-parent/1.6.1/slf4j-parent-1.6.1.pom
2017-09-22T21:07:27.408 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/slf4j/slf4j-parent/1.6.1/slf4j-parent-1.6.1.pom
 (10 KB at 337.2 KB/sec)
2017-09-22T21:07:27.411 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/log4j/log4j/1.2.16/log4j-1.2.16.pom
2017-09-22T21:07:27.439 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/log4j/log4j/1.2.16/log4j-1.2.16.pom (20 KB 
at 709.4 KB/sec)
2017-09-22T21:07:27.442 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/jline/jline/0.9.94/jline-0.9.94.pom
2017-09-22T21:07:27.468 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/jline/jline/0.9.94/jline-0.9.94.pom (7 KB 
at 238.8 KB/sec)
2017-09-22T21:07:27.470 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/io/netty/netty/3.7.0.Final/netty-3.7.0.Final.pom
2017-09-22T21:07:27.499 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/io/netty/netty/3.7.0.Final/netty-3.7.0.Final.pom
 (26 KB at 913.9 KB/sec)
2017-09-22T21:07:27.504 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/curator/curator-framework/2.7.1/curator-framework-2.7.1.pom
2017-09-22T21:07:27.530 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/curator/curator-framework/2.7.1/curator-framework-2.7.1.pom
 (3 KB at 81.3 KB/sec)
2017-09-22T21:07:27.531 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/curator/apache-curator/2.7.1/apache-curator-2.7.1.pom
2017-09-22T21:07:27.562 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/curator/apache-curator/2.7.1/apache-curator-2.7.1.pom
 (32 KB at 1002.6 KB/sec)
2017-09-22T21:07:27.567 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/curator/curator-client/2.7.1/curator-client-2.7.1.pom
2017-09-22T21:07:27.594 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/curator/curator-client/2.7.1/curator-client-2.7.1.pom
 (3 KB at 81.7 KB/sec)
2017-09-22T21:07:27.598 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/com/jcraft/jsch/0.1.42/jsch-0.1.42.pom
2017-09-22T21:07:27.625 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/com/jcraft/jsch/0.1.42/jsch-0.1.42.pom 
(967 B at 35.0 KB/sec)
2017-09-22T21:07:27.626 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/curator/curator-recipes/2.7.1/curator-recipes-2.7.1.pom
2017-09-22T21:07:27.654 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/curator/curator-recipes/2.7.1/curator-recipes-2.7.1.pom
 (3 KB at 82.5 KB/sec)
2017-09-22T21:07:27.657 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/htrace/htrace-core/3.1.0-incubating/htrace-core-3.1.0-incubating.pom
2017-09-22T21:07:27.685 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/htrace/htrace-core/3.1.0-incubating/htrace-core-3.1.0-incubating.pom
 (4 KB at 142.6 KB/sec)
2017-09-22T21:07:27.686 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/htrace/htrace/3.1.0-incubating/htrace-3.1.0-incubating.pom
2017-09-22T21:07:27.713 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/htrace/htrace/3.1.0-incubating/htrace-3.1.0-incubating.pom
 (12 KB at 420.2 KB/sec)
2017-09-22T21:07:27.715 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/apache/12/apache-12.pom
2017-09-22T21:07:27.743 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/apache/12/apache-12.pom (16 KB 
at 541.3 KB/sec)
2017-09-22T21:07:27.746 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/com/datatorrent/netlet/1.2.1/netlet-1.2.1.pom
2017-09-22T21:07:27.776 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/com/datatorrent/netlet/1.2.1/netlet-1.2.1.pom
 (19 KB at 608.8 KB/sec)
2017-09-22T21:07:27.781 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/slf4j/slf4j-log4j12/1.7.5/slf4j-log4j12-1.7.5.pom
2017-09-22T21:07:27.810 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/slf4j/slf4j-log4j12/1.7.5/slf4j-log4j12-1.7.5.pom
 (2 KB at 53.8 KB/sec)
2017-09-22T21:07:27.812 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/com/esotericsoftware/kryo/kryo/2.24.0/kryo-2.24.0.pom
2017-09-22T21:07:27.840 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/com/esotericsoftware/kryo/kryo/2.24.0/kryo-2.24.0.pom
 (7 KB at 216.9 KB/sec)
2017-09-22T21:07:27.842 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/objenesis/objenesis/2.1/objenesis-2.1.pom

Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #3905

2017-09-22 Thread Apache Jenkins Server
See 




[jira] [Updated] (BEAM-2377) Cross compile flink runner to scala 2.11

2017-09-22 Thread Reuven Lax (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reuven Lax updated BEAM-2377:
-
Fix Version/s: (was: 2.2.0)
   2.3.0

> Cross compile flink runner to scala 2.11
> 
>
> Key: BEAM-2377
> URL: https://issues.apache.org/jira/browse/BEAM-2377
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Ole Langbehn
>Assignee: Aljoscha Krettek
> Fix For: 2.3.0
>
>
> The flink runner is compiled for flink built against scala 2.10. flink cross 
> compiles its scala artifacts against 2.10 and 2.11.
> In order to make it possible to use beam with the flink runner in scala 2.11 
> projects, it would be nice if you could publish the flink runner for 2.11 
> next to 2.10.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (BEAM-2984) Job submission too large with embedded Beam protos

2017-09-22 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-2984:
--
Fix Version/s: (was: Not applicable)
   2.2.0

> Job submission too large with embedded Beam protos
> --
>
> Key: BEAM-2984
> URL: https://issues.apache.org/jira/browse/BEAM-2984
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Blocker
> Fix For: 2.2.0
>
>
> Empirically, naively putting context around the {{DoFnInfo}} could cause a 
> blowup of 40%, which is too much and might cause jobs that were well under 
> API size limits to start to fail.
> There's a certain amount of wiggle room since it is hard to control the 
> submission size anyhow, but 40% is way too much.
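
To make the size concern concrete, here is a small arithmetic sketch in Python. The 10,000,000-byte limit and the job sizes are hypothetical numbers chosen for illustration; the issue does not state the real API limit. Only the 40% figure comes from the description above.

```python
def grown_size(size_bytes, growth_percent):
    """Size after growing by growth_percent, in whole bytes (integer math)."""
    return size_bytes * (100 + growth_percent) // 100


def still_fits(size_bytes, limit_bytes, growth_percent):
    """True if the submission is still within the limit after the growth."""
    return grown_size(size_bytes, growth_percent) <= limit_bytes


def max_original_size(limit_bytes, growth_percent):
    """Largest original size that still fits after the growth."""
    return limit_bytes * 100 // (100 + growth_percent)


if __name__ == '__main__':
    limit = 10_000_000  # hypothetical limit, not taken from the issue
    print(grown_size(8_000_000, 40))         # size of an 8 MB job after 40% growth
    print(still_fits(8_000_000, limit, 40))  # does it still fit?
    print(max_original_size(limit, 40))      # largest job that would still fit
```

Under these assumed numbers, a job that used 80% of the limit before the change no longer fits afterwards, and only jobs below roughly 71% of the limit remain safe.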



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (BEAM-2984) Job submission too large with embedded Beam protos

2017-09-22 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles resolved BEAM-2984.
---
   Resolution: Fixed
Fix Version/s: (was: 2.2.0)
   Not applicable

> Job submission too large with embedded Beam protos
> --
>
> Key: BEAM-2984
> URL: https://issues.apache.org/jira/browse/BEAM-2984
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Blocker
> Fix For: Not applicable
>
>
> Empirically, naively putting context around the {{DoFnInfo}} could cause a 
> blowup of 40%, which is too much and might cause jobs that were well under 
> API size limits to start to fail.
> There's a certain amount of wiggle room since it is hard to control the 
> submission size anyhow, but 40% is way too much.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-2984) Job submission too large with embedded Beam protos

2017-09-22 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16177004#comment-16177004
 ] 

Kenneth Knowles commented on BEAM-2984:
---

Yes, definitely blocked 2.2.0. Resolved now.

> Job submission too large with embedded Beam protos
> --
>
> Key: BEAM-2984
> URL: https://issues.apache.org/jira/browse/BEAM-2984
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Blocker
> Fix For: 2.2.0
>
>
> Empirically, naively putting context around the {{DoFnInfo}} could cause a 
> blowup of 40%, which is too much and might cause jobs that were well under 
> API size limits to start to fail.
> There's a certain amount of wiggle room since it is hard to control the 
> submission size anyhow, but 40% is way too much.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-2981) Unable to deserialize ProtoCoder in Dataflow, serialVersionUID mismatch

2017-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16177135#comment-16177135
 ] 

ASF GitHub Bot commented on BEAM-2981:
--

GitHub user kennknowles opened a pull request:

https://github.com/apache/beam/pull/3889

[BEAM-2981] Update Dataflow worker version to attain ProtoCoder 
serialVersionUID fix

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [x] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [x] Each commit in the pull request should have a meaningful subject 
line and body.
 - [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [x] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [x] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [x] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/beam proto-worker-version

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3889.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3889


commit 85901d3f5d5e648862155c144f9158ec21a874b2
Author: Kenneth Knowles <k...@google.com>
Date:   2017-09-22T19:01:13Z

Update Dataflow worker to 20170922-01




> Unable to deserialize ProtoCoder in Dataflow, serialVersionUID mismatch
> ---
>
> Key: BEAM-2981
> URL: https://issues.apache.org/jira/browse/BEAM-2981
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Blocker
> Fix For: 2.2.0
>
>
> Jenkins failure: 
> https://builds.apache.org/job/beam_Release_NightlySnapshot/540/org.apache.beam$beam-runners-google-cloud-dataflow-java/#showFailuresLink
> {code}
> Caused by: java.io.InvalidClassException: 
> org.apache.beam.sdk.extensions.protobuf.ProtoCoder; local class incompatible: 
> stream classdesc serialVersionUID = -5043999806040629525, local class 
> serialVersionUID = 5772992315255168068
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[beam-site] branch mergebot updated (8c78c9a -> 0eb4ff4)

2017-09-22 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a change to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


 discard 8c78c9a  This closes #324
 discard d9e10e6  Update Mapreduce capability matrix when/how entries
 new 4964e52  [BEAM-2039] Number programming guide chapters
 new c5d8da6  Updates with review feedback
 new 0eb4ff4  This closes #320

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (8c78c9a)
\
 N -- N -- N   refs/heads/mergebot (0eb4ff4)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 src/_data/capability-matrix.yml|   30 +-
 src/documentation/io/built-in.md   |2 +-
 src/documentation/io/io-toc.md |2 +-
 .../pipelines/create-your-pipeline.md  |4 +-
 .../pipelines/design-your-pipeline.md  |4 +-
 src/documentation/pipelines/test-your-pipeline.md  |   12 +-
 src/documentation/programming-guide.md | 1902 ++--
 src/documentation/sdks/python-custom-io.md |2 +-
 src/get-started/mobile-gaming-example.md   |8 +-
 src/get-started/wordcount-example.md   |6 +-
 10 files changed, 1396 insertions(+), 576 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
['"commits@beam.apache.org" '].


[beam-site] 02/03: Updates with review feedback

2017-09-22 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit c5d8da69530aaeae85654c334868abeb7b032b13
Author: melissa 
AuthorDate: Fri Sep 22 10:12:00 2017 -0700

Updates with review feedback
---
 src/documentation/programming-guide.md | 39 +-
 src/get-started/wordcount-example.md   |  4 ++--
 2 files changed, 21 insertions(+), 22 deletions(-)

diff --git a/src/documentation/programming-guide.md 
b/src/documentation/programming-guide.md
index 7d263e4..ebe8a8e 100644
--- a/src/documentation/programming-guide.md
+++ b/src/documentation/programming-guide.md
@@ -942,7 +942,7 @@ pc = ...
 %}```
 
 If you are combining a `PCollection` of key-value pairs, [per-key
-combining](#combining-values-in-a-key-grouped-collection) is often enough. If
+combining](#combining-values-in-a-keyed-pcollection) is often enough. If
 you need the combining strategy to change based on the key (for example, MIN 
for
 some users and MAX for other users), you can define a `KeyedCombineFn` to 
access
 the key within the combining strategy.
@@ -1007,10 +1007,10 @@ applying `Combine`:
   the result of your pipeline's `Combine` is to be used as a side input later 
in
   the pipeline.
 
-# 4.2.4.6. Combining values in a key-grouped collection
+# 4.2.4.6. Combining values in a keyed PCollection
 
-After creating a key-grouped collection (for example, by using a `GroupByKey`
-transform) a common pattern is to combine the collection of values associated
+After creating a keyed PCollection (for example, by using a `GroupByKey`
+transform), a common pattern is to combine the collection of values associated
 with each key into a single, merged value. Drawing on the previous example from
 `GroupByKey`, a key-grouped `PCollection` called `groupedWords` looks like 
this:
 ```
@@ -1434,7 +1434,7 @@ reference pages for a list of transforms:
   * [Pre-written Beam transforms for Java]({{ site.baseurl 
}}/documentation/sdks/javadoc/{{ site.release_latest 
}}/index.html?org/apache/beam/sdk/transforms/package-summary.html)
   * [Pre-written Beam transforms for Python]({{ site.baseurl 
}}/documentation/sdks/pydoc/{{ site.release_latest 
}}/apache_beam.transforms.html)
 
- 4.6.1. Composite transform example
+ 4.6.1. An example composite transform
 
 The `CountWords` transform in the [WordCount example program]({{ site.baseurl 
}}/get-started/wordcount-example/)
 is an example of a composite transform. `CountWords` is a `PTransform` subclass
@@ -1544,10 +1544,10 @@ transforms to be nested within the structure of your 
pipeline.
 
  4.6.3. PTransform Style Guide
 
-When you create a new `PTransform`, be sure to read the [PTransform Style
-Guide]({{ site.baseurl }}/contribute/ptransform-style-guide/). The guide
-contains additional helpful information such as style guidelines, logging and
-testing guidance, and language-specific considerations.
+The [PTransform Style Guide]({{ site.baseurl 
}}/contribute/ptransform-style-guide/)
+contains additional information not included here, such as style guidelines,
+logging and testing guidance, and language-specific considerations.  The guide
+is a useful starting point when you want to write new composite PTransforms.
 
 ## 5. Pipeline I/O
 
@@ -2040,7 +2040,7 @@ for that `PCollection`.  The `GroupByKey` transform 
groups the elements of the
 subsequent `ParDo` transform gets applied multiple times per key, once for each
 window.
 
-### 7.2. Beam windowing functions
+### 7.2. Provided windowing functions
 
 You can define different kinds of windows to divide the elements of your
 `PCollection`. Beam provides several windowing functions, including:
@@ -2051,11 +2051,14 @@ You can define different kinds of windows to divide the 
elements of your
 *  Single Global Window
 *  Calendar-based Windows (not supported by the Beam SDK for Python)
 
+You can also define your own `WindowFn` if you have a more complex need.
+
 Note that each element can logically belong to more than one window, depending
 on the windowing function you use. Sliding time windowing, for example, creates
 overlapping windows wherein a single element can be assigned to multiple
 windows.
 
+
  7.2.1. Fixed time windows
 
 The simplest form of windowing is using **fixed time windows**: given a
@@ -2109,15 +2112,15 @@ the start of a new window.
 **Figure:** Session windows, with a minimum gap duration. Note how each data 
key
 has different windows, according to its data distribution.
 
- 7.2.4. Single global window
+ 7.2.4. The single global window
 
-By default, all data in a `PCollection` is assigned to a single global window,
+By default, all data in a `PCollection` is assigned to the single global 
window,
 and late data is discarded. If your data set is of a fixed size, you can use 
the
 global window default for your `PCollection`.

[beam-site] branch asf-site updated (f478921 -> ed155be)

2017-09-22 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a change to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


from f478921  Prepare repository for deployment.
 add 4964e52  [BEAM-2039] Number programming guide chapters
 add c5d8da6  Updates with review feedback
 add 0eb4ff4  This closes #320
 new ed155be  Prepare repository for deployment.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 content/documentation/io/built-in/index.html   |2 +-
 content/documentation/io/io-toc/index.html |2 +-
 .../pipelines/create-your-pipeline/index.html  |2 +-
 .../pipelines/design-your-pipeline/index.html  |4 +-
 .../pipelines/test-your-pipeline/index.html|   12 +-
 content/documentation/programming-guide/index.html | 1923 +++-
 .../documentation/sdks/python-custom-io/index.html |2 +-
 .../get-started/mobile-gaming-example/index.html   |8 +-
 content/get-started/wordcount-example/index.html   |6 +-
 src/documentation/io/built-in.md   |2 +-
 src/documentation/io/io-toc.md |2 +-
 .../pipelines/create-your-pipeline.md  |4 +-
 .../pipelines/design-your-pipeline.md  |4 +-
 src/documentation/pipelines/test-your-pipeline.md  |   12 +-
 src/documentation/programming-guide.md | 1902 +--
 src/documentation/sdks/python-custom-io.md |2 +-
 src/get-started/mobile-gaming-example.md   |8 +-
 src/get-started/wordcount-example.md   |6 +-
 18 files changed, 2842 insertions(+), 1061 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
['"commits@beam.apache.org" '].


[jira] [Commented] (BEAM-2298) Java WordCount doesn't work in Window OS for glob expressions or file: prefixed paths

2017-09-22 Thread Reuven Lax (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16176954#comment-16176954
 ] 

Reuven Lax commented on BEAM-2298:
--

I am resolving this for now. Please reopen if you believe the issue still 
exists.

> Java WordCount doesn't work in Window OS for glob expressions or file: 
> prefixed paths
> -
>
> Key: BEAM-2298
> URL: https://issues.apache.org/jira/browse/BEAM-2298
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Pei He
>Assignee: Flavio Fiszman
> Fix For: 2.2.0
>
>
> I am not able to build beam repo in Windows OS, so I copied the jar file from 
> my Mac.
> WordCount failed with the following cmd:
> java -cp beam-examples-java-2.0.0-jar-with-dependencies.jar
>  org.apache.beam.examples.WordCount --inputFile=input.txt --output=counts
> May 15, 2017 6:09:48 AM org.apache.beam.sdk.io.FileBasedSource getEstimatedSizeBytes
> INFO: Filepattern input.txt matched 1 files with total size 0
> May 15, 2017 6:09:48 AM org.apache.beam.sdk.io.FileBasedSource expandFilePattern
> INFO: Matched 1 files for pattern input.txt
> May 15, 2017 6:09:48 AM org.apache.beam.sdk.io.FileBasedSource split
> INFO: Splitting filepattern input.txt into bundles of size 0 took 0 ms and produced 1 files and 0 bundles
> May 15, 2017 6:09:48 AM org.apache.beam.sdk.io.WriteFiles$2 processElement
> INFO: Finalizing write operation TextWriteOperation{tempDirectory=C:\Users\Pei\Desktop\.temp-beam-2017-05-135_13-09-48-1\, windowedWrites=false}.
> May 15, 2017 6:09:48 AM org.apache.beam.sdk.io.WriteFiles$2 processElement
> INFO: Creating 1 empty output shards in addition to 0 written for a total of 1.
> Exception in thread "main" org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.lang.IllegalStateException: Unable to find registrar for c
> at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:322)
> at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:292)
> at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:200)
> at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:63)
> at org.apache.beam.sdk.Pipeline.run(Pipeline.java:295)
> at org.apache.beam.sdk.Pipeline.run(Pipeline.java:281)
> at org.apache.beam.examples.WordCount.main(WordCount.java:184)
> Caused by: java.lang.IllegalStateException: Unable to find registrar for c
> at org.apache.beam.sdk.io.FileSystems.getFileSystemInternal(FileSystems.java:447)
> at org.apache.beam.sdk.io.FileSystems.match(FileSystems.java:111)
> at org.apache.beam.sdk.io.FileSystems.matchResources(FileSystems.java:174)
> at org.apache.beam.sdk.io.FileSystems.filterMissingFiles(FileSystems.java:367)
> at org.apache.beam.sdk.io.FileSystems.copy(FileSystems.java:251)
> at org.apache.beam.sdk.io.FileBasedSink$WriteOperation.copyToOutputFiles(FileBasedSink.java:641)
> at org.apache.beam.sdk.io.FileBasedSink$WriteOperation.finalize(FileBasedSink.java:529)
> at org.apache.beam.sdk.io.WriteFiles$2.processElement(WriteFiles.java:592)
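
The message "Unable to find registrar for c" together with a temp directory under C:\Users suggests that the drive letter was read as a URI scheme. The Python sketch below illustrates that failure mode and one possible guard. The regular expression, both function names and the fallback to "file" are assumptions made for this illustration; this is not the Beam implementation and not necessarily how the issue was fixed.

```python
import re

# Assumed scheme pattern: a letter, then letters, digits, '+', '-' or '.',
# followed by a colon. A Windows drive letter such as "C:" also matches it.
SCHEME_PATTERN = re.compile(r'(?P<scheme>[a-zA-Z][-a-zA-Z0-9+.]*):.*')

# A single letter followed by ':' and a slash or backslash looks like a drive.
WINDOWS_DRIVE_PATTERN = re.compile(r'^[a-zA-Z]:[\\/]')


def naive_scheme(spec):
    """Return the scheme of spec, defaulting to 'file' when none is found."""
    match = SCHEME_PATTERN.match(spec)
    return match.group('scheme').lower() if match else 'file'


def drive_aware_scheme(spec):
    """Like naive_scheme, but treat Windows drive paths as local files."""
    if WINDOWS_DRIVE_PATTERN.match(spec):
        return 'file'
    return naive_scheme(spec)


if __name__ == '__main__':
    temp_dir = r'C:\Users\Pei\Desktop\.temp-beam'
    print(naive_scheme(temp_dir))        # drive letter mistaken for a scheme
    print(drive_aware_scheme(temp_dir))  # treated as a local path
    print(drive_aware_scheme('gs://bucket/output'))
```

A relative path such as input.txt has no colon and falls through to the local default in both functions, which would explain why reading the input succeeded while finalizing the output under an absolute C:\ path failed.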



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (BEAM-2298) Java WordCount doesn't work in Window OS for glob expressions or file: prefixed paths

2017-09-22 Thread Reuven Lax (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reuven Lax resolved BEAM-2298.
--
Resolution: Fixed

> Java WordCount doesn't work in Window OS for glob expressions or file: 
> prefixed paths
> -
>
> Key: BEAM-2298
> URL: https://issues.apache.org/jira/browse/BEAM-2298
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Pei He
>Assignee: Flavio Fiszman
> Fix For: 2.2.0
>
>
> I am not able to build beam repo in Windows OS, so I copied the jar file from 
> my Mac.
> WordCount failed with the following cmd:
> java -cp beam-examples-java-2.0.0-jar-with-dependencies.jar
>  org.apache.beam.examples.WordCount --inputFile=input.txt --output=counts
> May 15, 2017 6:09:48 AM org.apache.beam.sdk.io.FileBasedSource getEstimatedSizeBytes
> INFO: Filepattern input.txt matched 1 files with total size 0
> May 15, 2017 6:09:48 AM org.apache.beam.sdk.io.FileBasedSource expandFilePattern
> INFO: Matched 1 files for pattern input.txt
> May 15, 2017 6:09:48 AM org.apache.beam.sdk.io.FileBasedSource split
> INFO: Splitting filepattern input.txt into bundles of size 0 took 0 ms and produced 1 files and 0 bundles
> May 15, 2017 6:09:48 AM org.apache.beam.sdk.io.WriteFiles$2 processElement
> INFO: Finalizing write operation TextWriteOperation{tempDirectory=C:\Users\Pei\Desktop\.temp-beam-2017-05-135_13-09-48-1\, windowedWrites=false}.
> May 15, 2017 6:09:48 AM org.apache.beam.sdk.io.WriteFiles$2 processElement
> INFO: Creating 1 empty output shards in addition to 0 written for a total of 1.
> Exception in thread "main" org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.lang.IllegalStateException: Unable to find registrar for c
> at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:322)
> at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:292)
> at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:200)
> at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:63)
> at org.apache.beam.sdk.Pipeline.run(Pipeline.java:295)
> at org.apache.beam.sdk.Pipeline.run(Pipeline.java:281)
> at org.apache.beam.examples.WordCount.main(WordCount.java:184)
> Caused by: java.lang.IllegalStateException: Unable to find registrar for c
> at org.apache.beam.sdk.io.FileSystems.getFileSystemInternal(FileSystems.java:447)
> at org.apache.beam.sdk.io.FileSystems.match(FileSystems.java:111)
> at org.apache.beam.sdk.io.FileSystems.matchResources(FileSystems.java:174)
> at org.apache.beam.sdk.io.FileSystems.filterMissingFiles(FileSystems.java:367)
> at org.apache.beam.sdk.io.FileSystems.copy(FileSystems.java:251)
> at org.apache.beam.sdk.io.FileBasedSink$WriteOperation.copyToOutputFiles(FileBasedSink.java:641)
> at org.apache.beam.sdk.io.FileBasedSink$WriteOperation.finalize(FileBasedSink.java:529)
> at org.apache.beam.sdk.io.WriteFiles$2.processElement(WriteFiles.java:592)
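
One plausible reading of the "Unable to find registrar for c" error above, offered as an assumption rather than a confirmed diagnosis: if the filesystem scheme is taken to be whatever precedes the first colon, a Windows path such as C:\Users\Pei\Desktop\... yields the scheme "c". The sketch below uses a hypothetical pattern and hypothetical helpers (SCHEME_PATTERN, parse_scheme, parse_scheme_windows_aware), not Beam's actual code, to show the effect and one possible guard.

```python
import re

# Hypothetical URI-scheme pattern; an assumption for illustration,
# not copied from Beam's FileSystems class.
SCHEME_PATTERN = re.compile(r"^(?P<scheme>[a-zA-Z][-a-zA-Z0-9+.]*):.*$")


def parse_scheme(spec):
    """Naive parse: anything before the first ':' is treated as the scheme."""
    match = SCHEME_PATTERN.match(spec)
    if match is None:
        return "file"  # no scheme present, assume a local path
    return match.group("scheme").lower()


def parse_scheme_windows_aware(spec):
    """Same parse, but a drive letter plus ':' and a slash is a local path."""
    if re.match(r"^[a-zA-Z]:[\\/]", spec):
        return "file"
    return parse_scheme(spec)


print(parse_scheme(r"C:\Users\Pei\Desktop\counts"))                # c
print(parse_scheme_windows_aware(r"C:\Users\Pei\Desktop\counts"))  # file
```

With the naive parse, a lookup keyed by scheme would search for a filesystem registered under "c" and find none, which would be consistent with the exception text.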



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[1/2] beam git commit: Fix type parameter in AvroIO.Write

2017-09-22 Thread reuvenlax
Repository: beam
Updated Branches:
  refs/heads/master 66b864f2b -> 87116cc74


Fix type parameter in AvroIO.Write

Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/3bd7ddbf
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/3bd7ddbf
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/3bd7ddbf

Branch: refs/heads/master
Commit: 3bd7ddbf7a836091092e8116e6637b97e306cbc4
Parents: 66b864f
Author: Neville Li 
Authored: Wed Aug 9 17:35:21 2017 -0400
Committer: Reuven Lax 
Committed: Fri Sep 22 12:03:36 2017 -0700

--
 .../core/src/main/java/org/apache/beam/sdk/io/AvroIO.java | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/3bd7ddbf/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
index 108054f..e05ffb5 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
@@ -1238,12 +1238,12 @@ public class AvroIO {
 }
 
 /** See {@link TypedWrite#to(DynamicAvroDestinations)}. */
-public Write to(DynamicAvroDestinations dynamicDestinations) {
+public Write to(DynamicAvroDestinations dynamicDestinations) {
   return new Write<>(inner.to(dynamicDestinations).withFormatFunction(null));
 }
 
 /** See {@link TypedWrite#withSchema}. */
-public Write withSchema(Schema schema) {
+public Write withSchema(Schema schema) {
   return new Write<>(inner.withSchema(schema));
 }
 /** See {@link TypedWrite#withTempDirectory(ValueProvider)}. */
@@ -1278,8 +1278,8 @@ public class AvroIO {
 }
 
 /** See {@link TypedWrite#withWindowedWrites}. */
-public Write withWindowedWrites() {
-  return new Write(inner.withWindowedWrites());
+public Write withWindowedWrites() {
+  return new Write<>(inner.withWindowedWrites());
 }
 
 /** See {@link TypedWrite#withCodec}. */
@@ -1302,7 +1302,7 @@ public class AvroIO {
 }
 
 /** See {@link TypedWrite#withMetadata} . */
-public Write withMetadata(Map metadata) {
+public Write withMetadata(Map metadata) {
   return new Write<>(inner.withMetadata(metadata));
 }
 



[2/2] beam git commit: This closes #3711

2017-09-22 Thread reuvenlax
This closes #3711


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/87116cc7
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/87116cc7
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/87116cc7

Branch: refs/heads/master
Commit: 87116cc74f6a84d976fc47ada1710d9c5a78b7cc
Parents: 66b864f 3bd7ddb
Author: Reuven Lax 
Authored: Fri Sep 22 12:17:44 2017 -0700
Committer: Reuven Lax 
Committed: Fri Sep 22 12:17:44 2017 -0700

--
 .../core/src/main/java/org/apache/beam/sdk/io/AvroIO.java | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)
--




Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #3902

2017-09-22 Thread Apache Jenkins Server
See 




[1/2] beam git commit: Revert "Initial set of pipeline jobs."

2017-09-22 Thread chamikara
Repository: beam
Updated Branches:
  refs/heads/master 3971c7d9c -> 74a5c9e91


Revert "Initial set of pipeline jobs."

This reverts commit 4f7e0d65c514f022c0675dec50853ac3c7554be7.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/5a66ce93
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/5a66ce93
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/5a66ce93

Branch: refs/heads/master
Commit: 5a66ce93ee085d8b388100d47092de0a12264a7d
Parents: 3971c7d
Author: Jason Kuster 
Authored: Fri Sep 22 11:02:51 2017 -0700
Committer: chamik...@google.com 
Committed: Fri Sep 22 12:47:18 2017 -0700

--
 .test-infra/jenkins/PreCommit_Pipeline.groovy   |  89 -
 .../jenkins/common_job_properties.groovy| 185 +--
 .test-infra/jenkins/job_beam_Java_Build.groovy  |  82 
 .../jenkins/job_beam_Java_CodeHealth.groovy |  39 
 .../job_beam_Java_IntegrationTest.groovy|  63 ---
 .../jenkins/job_beam_Java_UnitTest.groovy   |  49 -
 .../jenkins/job_beam_PreCommit_Pipeline.groovy  |  81 
 .../jenkins/job_beam_Python_UnitTest.groovy |  40 
 8 files changed, 47 insertions(+), 581 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/5a66ce93/.test-infra/jenkins/PreCommit_Pipeline.groovy
--
diff --git a/.test-infra/jenkins/PreCommit_Pipeline.groovy b/.test-infra/jenkins/PreCommit_Pipeline.groovy
deleted file mode 100644
index 20eaa56..000
--- a/.test-infra/jenkins/PreCommit_Pipeline.groovy
+++ /dev/null
@@ -1,89 +0,0 @@
-#!groovy
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-import hudson.model.Result
-
-int NO_BUILD = -1
-
-// These are args for the GitHub Pull Request Builder (ghprb) Plugin. Providing these arguments is
-// necessary due to a bug in the ghprb plugin where environment variables are not correctly passed
-// to jobs downstream of a Pipeline job.
-// Tracked by https://github.com/jenkinsci/ghprb-plugin/issues/572.
-List ghprbArgs = [
-string(name: 'ghprbGhRepository', value: "${ghprbGhRepository}"),
-string(name: 'ghprbActualCommit', value: "${ghprbActualCommit}"),
-string(name: 'ghprbPullId', value: "${ghprbPullId}")
-]
-
-// This argument is the commit at which to build.
-List commitArg = [string(name: 'commit', value: "origin/pr/${ghprbPullId}/head")]
-
-int javaBuildNum = NO_BUILD
-
-// This (and the below) define "Stages" of a pipeline. These stages run serially, and inside can
-// have "parallel" blocks which execute several work steps concurrently. This work is limited to
-// simple operations -- more complicated operations need to be performed on an actual node. In this
-// case we are using the pipeline to trigger downstream builds.
-stage('Build') {
-parallel (
-java: {
-def javaBuild = build job: 'beam_Java_Build', parameters: commitArg + ghprbArgs
-if(javaBuild.getResult() == Result.SUCCESS.toString()) {
-javaBuildNum = javaBuild.getNumber()
-}
-},
-python_unit: { // Python doesn't have a build phase, so we include this here.
-build job: 'beam_Python_UnitTest', parameters: commitArg + ghprbArgs
-}
-)
-}
-
-// This argument is provided to downstream jobs so they know from which build to pull artifacts.
-javaBuildArg = [string(name: 'buildNum', value: "${javaBuildNum}")]
-javaUnitPassed = false
-
-stage('Unit Test / Code Health') {
-parallel (
-java_unit: {
-if(javaBuildNum != NO_BUILD) {
-def javaTest = build job: 'beam_Java_UnitTest', parameters: javaBuildArg + ghprbArgs
-if(javaTest.getResult() == Result.SUCCESS.toString()) {
-javaUnitPassed = true
-}
-}
-},
-java_codehealth: {
-if(javaBuildNum != NO_BUILD) {
-build job: 'beam_Java_CodeHealth', parameters: 

[2/2] beam git commit: This closes #3884

2017-09-22 Thread chamikara
This closes #3884


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/74a5c9e9
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/74a5c9e9
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/74a5c9e9

Branch: refs/heads/master
Commit: 74a5c9e910cae59493ac918daeb1d88aa4426a12
Parents: 3971c7d 5a66ce9
Author: chamik...@google.com 
Authored: Fri Sep 22 12:47:26 2017 -0700
Committer: chamik...@google.com 
Committed: Fri Sep 22 12:47:26 2017 -0700

--
 .test-infra/jenkins/PreCommit_Pipeline.groovy   |  89 -
 .../jenkins/common_job_properties.groovy| 185 +--
 .test-infra/jenkins/job_beam_Java_Build.groovy  |  82 
 .../jenkins/job_beam_Java_CodeHealth.groovy |  39 
 .../job_beam_Java_IntegrationTest.groovy|  63 ---
 .../jenkins/job_beam_Java_UnitTest.groovy   |  49 -
 .../jenkins/job_beam_PreCommit_Pipeline.groovy  |  81 
 .../jenkins/job_beam_Python_UnitTest.groovy |  40 
 8 files changed, 47 insertions(+), 581 deletions(-)
--




[GitHub] beam pull request #3884: Revert "Initial set of pipeline jobs."

2017-09-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3884


---


[GitHub] beam pull request #3889: [BEAM-2981] Update Dataflow worker version to attai...

2017-09-22 Thread kennknowles
GitHub user kennknowles opened a pull request:

https://github.com/apache/beam/pull/3889

[BEAM-2981] Update Dataflow worker version to attain ProtoCoder 
serialVersionUID fix

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [x] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [x] Each commit in the pull request should have a meaningful subject 
line and body.
 - [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [x] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [x] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [x] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/beam proto-worker-version

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3889.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3889


commit 85901d3f5d5e648862155c144f9158ec21a874b2
Author: Kenneth Knowles <k...@google.com>
Date:   2017-09-22T19:01:13Z

Update Dataflow worker to 20170922-01




---


[beam-site] 03/03: This closes #320

2017-09-22 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 0eb4ff496c320d372cde575ba4708faaadde0b19
Merge: f478921 c5d8da6
Author: Mergebot 
AuthorDate: Fri Sep 22 19:05:36 2017 +

This closes #320

 src/documentation/io/built-in.md   |2 +-
 src/documentation/io/io-toc.md |2 +-
 .../pipelines/create-your-pipeline.md  |4 +-
 .../pipelines/design-your-pipeline.md  |4 +-
 src/documentation/pipelines/test-your-pipeline.md  |   12 +-
 src/documentation/programming-guide.md | 1902 ++--
 src/documentation/sdks/python-custom-io.md |2 +-
 src/get-started/mobile-gaming-example.md   |8 +-
 src/get-started/wordcount-example.md   |6 +-
 9 files changed, 1381 insertions(+), 561 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
"commits@beam.apache.org" .


[jira] [Commented] (BEAM-2984) Job submission too large with embedded Beam protos

2017-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16176988#comment-16176988
 ] 

ASF GitHub Bot commented on BEAM-2984:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3887


> Job submission too large with embedded Beam protos
> --
>
> Key: BEAM-2984
> URL: https://issues.apache.org/jira/browse/BEAM-2984
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Blocker
> Fix For: 2.2.0
>
>
> Empirically, naively putting context around the {{DoFnInfo}} could cause a 
> blowup of 40%, which is too much and might cause jobs that were well under 
> API size limits to start to fail.
> There's a certain amount of wiggle room since it is hard to control the 
> submission size anyhow, but 40% is way too much.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #4027

2017-09-22 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : beam_PostCommit_Java_ValidatesRunner_Apex #2450

2017-09-22 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PostCommit_Python_Verify #3190

2017-09-22 Thread Apache Jenkins Server
See 


--
[...truncated 44.47 KB...]
Collecting mock<3.0.0,>=1.0.1 (from apache-beam==2.2.0.dev0)
  Using cached mock-2.0.0-py2.py3-none-any.whl
Collecting oauth2client<4.0.0,>=2.0.1 (from apache-beam==2.2.0.dev0)
Collecting protobuf<=3.3.0,>=3.2.0 (from apache-beam==2.2.0.dev0)
  Using cached protobuf-3.3.0-cp27-cp27mu-manylinux1_x86_64.whl
Collecting pyyaml<4.0.0,>=3.12 (from apache-beam==2.2.0.dev0)
Collecting six<1.11,>=1.9 (from apache-beam==2.2.0.dev0)
  Using cached six-1.10.0-py2.py3-none-any.whl
Collecting typing<3.7.0,>=3.6.0 (from apache-beam==2.2.0.dev0)
  Using cached typing-3.6.2-py2-none-any.whl
Requirement already satisfied: futures>=2.2.0 in 
./target/.tox/py27cython/lib/python2.7/site-packages (from 
grpcio<2.0,>=1.0->apache-beam==2.2.0.dev0)
Requirement already satisfied: enum34>=1.0.4 in 
./target/.tox/py27cython/lib/python2.7/site-packages (from 
grpcio<2.0,>=1.0->apache-beam==2.2.0.dev0)
Collecting pbr>=0.11 (from mock<3.0.0,>=1.0.1->apache-beam==2.2.0.dev0)
  Using cached pbr-3.1.1-py2.py3-none-any.whl
Collecting funcsigs>=1; python_version < "3.3" (from 
mock<3.0.0,>=1.0.1->apache-beam==2.2.0.dev0)
  Using cached funcsigs-1.0.2-py2.py3-none-any.whl
Collecting pyasn1>=0.1.7 (from 
oauth2client<4.0.0,>=2.0.1->apache-beam==2.2.0.dev0)
  Using cached pyasn1-0.3.6-py2.py3-none-any.whl
Collecting pyasn1-modules>=0.0.5 (from 
oauth2client<4.0.0,>=2.0.1->apache-beam==2.2.0.dev0)
  Using cached pyasn1_modules-0.1.4-py2.py3-none-any.whl
Collecting rsa>=3.1.4 (from oauth2client<4.0.0,>=2.0.1->apache-beam==2.2.0.dev0)
  Using cached rsa-3.4.2-py2.py3-none-any.whl
Requirement already satisfied: setuptools in 
./target/.tox/py27cython/lib/python2.7/site-packages (from 
protobuf<=3.3.0,>=3.2.0->apache-beam==2.2.0.dev0)
Building wheels for collected packages: apache-beam
  Running setup.py bdist_wheel for apache-beam: started
  Running setup.py bdist_wheel for apache-beam: finished with status 'error'
  Complete output from command 

 -u -c "import setuptools, 
tokenize;__file__='/tmp/pip-N4R1sa-build/setup.py';f=getattr(tokenize, 'open', 
open)(__file__);code=f.read().replace('\r\n', 
'\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d 
/tmp/tmpOFv7Ggpip-wheel- --python-tag cp27:
  
:351:
 UserWarning: Normalizing '2.2.0.dev' to '2.2.0.dev0'
normalized_version,
  running bdist_wheel
  running build
  running build_py
  Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-N4R1sa-build/setup.py", line 203, in <module>
  'test': generate_protos_first(test),
File "/usr/lib/python2.7/distutils/core.py", line 151, in setup
  dist.run_commands()
File "/usr/lib/python2.7/distutils/dist.py", line 953, in run_commands
  self.run_command(cmd)
File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
  cmd_obj.run()
File 
"
 line 204, in run
  self.run_command('build')
File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
  self.distribution.run_command(command)
File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
  cmd_obj.run()
File "/usr/lib/python2.7/distutils/command/build.py", line 128, in run
  self.run_command(cmd_name)
File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
  self.distribution.run_command(command)
File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
  cmd_obj.run()
File "/tmp/pip-N4R1sa-build/setup.py", line 143, in run
  gen_protos.generate_proto_files()
File "gen_protos.py", line 65, in generate_proto_files
  'Not in apache git tree; unable to find proto definitions.')
  RuntimeError: Not in apache git tree; unable to find proto definitions.
  
  
  Failed building wheel for apache-beam
  Running setup.py clean for apache-beam
Failed to build apache-beam
Installing collected packages: avro, crcmod, dill, httplib2, six, pbr, 
funcsigs, mock, pyasn1, pyasn1-modules, rsa, oauth2client, protobuf, pyyaml, 
typing, apache-beam
  Found existing installation: six 1.11.0
Uninstalling six-1.11.0:
  Successfully uninstalled six-1.11.0
  Found existing installation: protobuf 3.4.0
Uninstalling protobuf-3.4.0:
  Successfully uninstalled protobuf-3.4.0
  Running setup.py install for apache-beam: started
Running setup.py install for apache-beam: finished with status 'error'
Complete output from 

[jira] [Commented] (BEAM-2980) BagState.isEmpty needs a tighter spec

2017-09-22 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16176119#comment-16176119
 ] 

Aljoscha Krettek commented on BEAM-2980:


I think this is a more specific version of BEAM-2975.

> BagState.isEmpty needs a tighter spec
> -
>
> Key: BEAM-2980
> URL: https://issues.apache.org/jira/browse/BEAM-2980
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>
> Consider the following:
> {code}
> BagState myBag = // empty
> ReadableState isMyBagEmpty = myBag.isEmpty();
> myBag.add(bizzle);
> boolean empty = isMyBagEmpty.read();
> {code}
> Should {{empty}} be true or false? We need a consistent answer, across all 
> kinds of state, when snapshots are required.
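
The two candidate answers can be made concrete with a small stand-in, written in plain Python rather than against Beam's Java state API. ToyBag, is_empty_snapshot, and is_empty_lazy are hypothetical names introduced only for illustration; the sketch does not claim which behavior Beam specifies.

```python
class ToyBag:
    """Minimal stand-in for a bag-like state cell (hypothetical, not Beam API)."""

    def __init__(self):
        self._items = []

    def add(self, item):
        self._items.append(item)

    def is_empty_snapshot(self):
        # Captures the answer when is_empty is called; later adds are ignored.
        answer = not self._items
        return lambda: answer

    def is_empty_lazy(self):
        # Defers the check until the handle is read; later adds are observed.
        return lambda: not self._items


snapshot_bag = ToyBag()
read_snapshot = snapshot_bag.is_empty_snapshot()
snapshot_bag.add("bizzle")

lazy_bag = ToyBag()
read_lazy = lazy_bag.is_empty_lazy()
lazy_bag.add("bizzle")

print(read_snapshot())  # True: the bag was empty when the handle was created
print(read_lazy())      # False: the bag is non-empty when the handle is read
```

Under snapshot semantics the read reports the state as of the isEmpty() call; under lazy semantics it reports the state as of read(). The issue asks for one of these to be fixed by the spec.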



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Dataflow #4022

2017-09-22 Thread Apache Jenkins Server
See 


--
[...truncated 8.57 MB...]
[INFO] 2017-09-22T07:28:15.788Z: (1c2f90c43fea7e04): Executing operation 
input/Read(CreateSource)/Read(CreateSource)/Read(BoundedToUnboundedSourceAdapter)/DataflowRunner.StreamingUnboundedRead.ReadWithIds+input/Read(CreateSource)/Read(CreateSource)/Read(BoundedToUnboundedSourceAdapter)/StripIds+ParDo(SDFWithSideInput)/ParMultiDo(SDFWithSideInput)/Pair
 with initial 
restriction+ParDo(SDFWithSideInput)/ParMultiDo(SDFWithSideInput)/Split 
restriction+ParDo(SDFWithSideInput)/ParMultiDo(SDFWithSideInput)/Explode 
windows+ParDo(SDFWithSideInput)/ParMultiDo(SDFWithSideInput)/Assign unique 
key/AddKeys/Map+s19/GroupByKeyRaw/WriteStream
[INFO] 2017-09-22T07:28:15.788Z: (263117a702ff1ce6): Executing operation 
PAssert$326/GroupGlobally/GatherAllOutputs/GroupByKey/ReadStream+PAssert$326/GroupGlobally/GatherAllOutputs/GroupByKey/MergeBuckets+PAssert$326/GroupGlobally/GatherAllOutputs/Values/Values/Map+PAssert$326/GroupGlobally/RewindowActuals/Window.Assign+PAssert$326/GroupGlobally/KeyForDummy/AddKeys/Map+PAssert$326/GroupGlobally/GroupDummyAndContents/WriteStream
[INFO] 2017-09-22T07:28:15.788Z: (5436ecd8f6f9394f): Executing operation 
View.AsSingleton/Combine.GloballyAsSingletonView/View.CreatePCollectionView/Combine.globally(Concatenate)/Combine.perKey(Concatenate)/GroupByKey/ReadStream+View.AsSingleton/Combine.GloballyAsSingletonView/View.CreatePCollectionView/Combine.globally(Concatenate)/Combine.perKey(Concatenate)/Combine.GroupedValues+View.AsSingleton/Combine.GloballyAsSingletonView/View.CreatePCollectionView/Combine.globally(Concatenate)/Combine.perKey(Concatenate)/Combine.GroupedValues/Extract+View.AsSingleton/Combine.GloballyAsSingletonView/View.CreatePCollectionView/Combine.globally(Concatenate)/Values/Values/Map+View.AsSingleton/Combine.GloballyAsSingletonView/View.CreatePCollectionView/ParDo(StreamingPCollectionViewWriter)
[INFO] 2017-09-22T07:28:15.816Z: (1a905f364014439a): Executing operation 
s19/GroupByKeyRaw/ReadStream+s19/SplittableProcess+PAssert$326/GroupGlobally/Window.Into()/Window.Assign+PAssert$326/GroupGlobally/GatherAllOutputs/ParDo(ReifyTimestampsAndWindows)+PAssert$326/GroupGlobally/GatherAllOutputs/WithKeys/AddKeys/Map+PAssert$326/GroupGlobally/GatherAllOutputs/Window.Into()/Window.Assign+PAssert$326/GroupGlobally/GatherAllOutputs/GroupByKey/WriteStream
[INFO] 2017-09-22T07:28:16.072Z: (787ccb1b91400c15): Autoscaling: Resized 
worker pool from 1 to 0.
[INFO] 2017-09-22T07:28:16.074Z: (787ccb1b91400c37): Autoscaling: Would further 
reduce the number of workers but reached the minimum number allowed for the job.
[INFO] 2017-09-22T07:28:15.325Z: (8629eb55bb61): Workers have started 
successfully.
[INFO] 2017-09-22T07:28:17.637Z: (e9d1c9d1afc0efa6): Executing operation 
WindowingTest.WindowedCount/Count.PerElement/Combine.perKey(Count)/GroupByKey/Close
[INFO] 2017-09-22T07:28:17.673Z: (e9d1c9d1afc0e5de): Executing operation 
PAssert$319/GroupGlobally/GatherAllOutputs/GroupByKey/Create
[INFO] 2017-09-22T07:28:17.760Z: (6f09a451c70330d): Executing operation 
WindowingTest.WindowedCount/Count.PerElement/Combine.perKey(Count)/GroupByKey/Read+WindowingTest.WindowedCount/Count.PerElement/Combine.perKey(Count)/Combine.GroupedValues+WindowingTest.WindowedCount/FormatCounts+PAssert$319/GroupGlobally/Window.Into()/Window.Assign+PAssert$319/GroupGlobally/GatherAllOutputs/ParDo(ReifyTimestampsAndWindows)+PAssert$319/GroupGlobally/GatherAllOutputs/WithKeys/AddKeys/Map+PAssert$319/GroupGlobally/GatherAllOutputs/Window.Into()/Window.Assign+PAssert$319/GroupGlobally/GatherAllOutputs/GroupByKey/Reify+PAssert$319/GroupGlobally/GatherAllOutputs/GroupByKey/Write
[INFO] 2017-09-22T07:28:16.075Z: (ee4deeb46b6831b2): Autoscaling: Raised the 
number of workers to 0 based on the rate of progress in the currently running 
step(s).
[INFO] 2017-09-22T07:28:22.675Z: (6f09a451c70367b): Executing operation 
PAssert$319/GroupGlobally/GatherAllOutputs/GroupByKey/Close
[INFO] 2017-09-22T07:28:22.692Z: (6f09a451c703473): Executing operation 
PAssert$319/GroupGlobally/GroupDummyAndContents/Create
[INFO] 2017-09-22T07:28:22.739Z: (6f09a451c703f5f): Executing operation 
PAssert$319/GroupGlobally/Create.Values/Read(CreateSource)+PAssert$319/GroupGlobally/WindowIntoDummy/Window.Assign+PAssert$319/GroupGlobally/GroupDummyAndContents/Reify+PAssert$319/GroupGlobally/GroupDummyAndContents/Write
[INFO] 2017-09-22T07:28:22.753Z: (e9d1c9d1afc0e1f9): Executing operation 

[GitHub] beam pull request #3891: Removes codepaths for reading unsplit BigQuery sour...

2017-09-22 Thread jkff
GitHub user jkff opened a pull request:

https://github.com/apache/beam/pull/3891

Removes codepaths for reading unsplit BigQuery sources

R: @reuvenlax 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jkff/incubator-beam rm-bqio-reader

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3891.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3891


commit 2502fdeb6b87a95dd27fd0f5be6e7494b1f4c916
Author: Eugene Kirpichov 
Date:   2017-09-22T23:28:38Z

Removes codepaths for reading unsplit BigQuery sources




---


Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Dataflow #4028

2017-09-22 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-1934) Code examples for CoGroupByKey

2017-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16177436#comment-16177436
 ] 

ASF GitHub Bot commented on BEAM-1934:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3867


> Code examples for CoGroupByKey
> --
>
> Key: BEAM-1934
> URL: https://issues.apache.org/jira/browse/BEAM-1934
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Aviem Zur
>Assignee: Melissa Pashniak
>
> Add code examples for usage of {{CoGroupByKey}}.
> Also, it would probably be wise to give introductions to the components of a 
> {{CoGroupByKey}} such as {{KeyedPCollectionTuple}} and {{TupleTag}} to help 
> users understand how to use it correctly.
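
As a rough illustration of what such an example would need to convey, the sketch below mimics the shape of a CoGroupByKey result in plain Python: several keyed collections, each identified by a tag, are joined so that every key maps to one list of values per tag. The function co_group_by_key and the sample data are hypothetical; this is not the Beam API, and the tag names only loosely play the role of TupleTag.

```python
from collections import defaultdict


def co_group_by_key(tagged_inputs):
    """Group several keyed collections by key.

    tagged_inputs maps a tag name to an iterable of (key, value) pairs.
    The result maps each key to {tag: [values]}, with an empty list for
    tags that have no value for that key.
    """
    tags = list(tagged_inputs)
    grouped = defaultdict(lambda: {tag: [] for tag in tags})
    for tag, pairs in tagged_inputs.items():
        for key, value in pairs:
            grouped[key][tag].append(value)
    return dict(grouped)


# Illustrative sample data only.
emails = [("amy", "amy@example.com"), ("carl", "carl@example.com")]
phones = [("amy", "111-222-3333"), ("james", "222-333-4444")]

result = co_group_by_key({"emails": emails, "phones": phones})
for key in sorted(result):
    print(key, result[key])
```

Keys that appear in only one input still show up in the result, with an empty list under the other tag, which is the property that tends to surprise first-time users of a co-grouping join.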



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[2/2] beam git commit: This closes #3867

2017-09-22 Thread chamikara
This closes #3867


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/aa2604a3
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/aa2604a3
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/aa2604a3

Branch: refs/heads/master
Commit: aa2604a39f6f9a6f7d3273b61f224dbac358bb69
Parents: 892e6c8 13ed7ff
Author: chamik...@google.com 
Authored: Fri Sep 22 19:00:47 2017 -0700
Committer: chamik...@google.com 
Committed: Fri Sep 22 19:00:47 2017 -0700

--
 .../apache_beam/examples/snippets/snippets.py   |  8 +++
 .../examples/snippets/snippets_test.py  | 25 +---
 2 files changed, 26 insertions(+), 7 deletions(-)
--




[jira] [Commented] (BEAM-2985) BigQuery IO write transform is broken for DirectRunner

2017-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16177415#comment-16177415
 ] 

ASF GitHub Bot commented on BEAM-2985:
--

GitHub user chamikaramj opened a pull request:

https://github.com/apache/beam/pull/3892

[BEAM-2985] Updates WriteToBigQuery PTransform to get project id from 
GoogleCloud…

…Options when using DirectRunner.

WriteToBigQuery PTransform behaves differently for DirectRunner and 
DataflowRunner when it comes to determining the project that the output table 
belongs to. If a project is not specified, DataflowRunner defaults to 
GoogleCloudOptions.project while DirectRunner does not. This PR fixes this 
inconsistency by defaulting to GoogleCloudOptions.project for DirectRunner as 
well.

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chamikaramj/beam bq_direct_runner_write

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3892.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3892


commit f99db7932cab90dda2741d22b291e7f1eaad7336
Author: chamik...@google.com 
Date:   2017-09-23T00:59:50Z

Updates WriteToBigQuery PTransform to get project id from 
GoogleCloudOptions when using DirectRunner.

WriteToBigQuery PTransform behaves differently for DirectRunner and 
DataflowRunner when it comes to determining the project that the output table 
belongs to. If a project is not specified, DataflowRunner defaults to 
GoogleCloudOptions.project while DirectRunner does not. This PR fixes this 
inconsistency by defaulting to GoogleCloudOptions.project for DirectRunner as 
well.




> BigQuery IO write transform is broken for DirectRunner
> --
>
> Key: BEAM-2985
> URL: https://issues.apache.org/jira/browse/BEAM-2985
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>
> I get the following error when trying to run BigQuery tornadoes using 
> DirectRunner.
> DataflowRunner seems to be working fine.
> python -m apache_beam.examples.cookbook.bigquery_tornadoes --output 
> . --project 
>  Request missing required parameter projectId
>  Traceback for above exception (most recent call last):
>   File "apache_beam/utils/retry.py", line 175, in wrapper
> return fun(*args, **kwargs)
>   File "apache_beam/io/gcp/bigquery.py", line 828, in _get_table
> response = self.client.tables.Get(request)
>   File "apache_beam/io/gcp/internal/clients/bigquery/bigquery_v2_client.py", 
> line 608, in Get
> config, request, global_params=global_params)
>   File 
> "/Users/chamikara/testing/beam_bq_09_22_2017/env1/lib/python2.7/site-packages/apitools/base/py/base_api.py",
>  line 695, in _RunMethod
> download)
>   File 
> "/Users/chamikara/testing/beam_bq_09_22_2017/env1/lib/python2.7/site-packages/apitools/base/py/base_api.py",
>  line 676, in PrepareHttpRequest
> method_config, request, relative_path=url_builder.relative_path)
>   File 
> "/Users/chamikara/testing/beam_bq_09_22_2017/env1/lib/python2.7/site-packages/apitools/base/py/base_api.py",
>  line 580, in __ConstructRelativePath
> relative_path=relative_path)
>   File 
> "/Users/chamikara/testing/beam_bq_09_22_2017/env1/lib/python2.7/site-packages/apitools/base/py/util.py",
>  line 124, in ExpandRelativePath
> 'Request missing required parameter %s' % param)





[GitHub] beam pull request #3892: [BEAM-2985] Updates WriteToBigQuery PTransform to g...

2017-09-22 Thread chamikaramj
GitHub user chamikaramj opened a pull request:

https://github.com/apache/beam/pull/3892

[BEAM-2985] Updates WriteToBigQuery PTransform to get project id from 
GoogleCloud…

…Options when using DirectRunner.

WriteToBigQuery PTransform behaves differently for DirectRunner and 
DataflowRunner when it comes to determining the project that the output table 
belongs to. If a project is not specified, DataflowRunner defaults to 
GoogleCloudOptions.project while DirectRunner does not. This PR fixes this 
inconsistency by defaulting to GoogleCloudOptions.project for DirectRunner as 
well.

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/chamikaramj/beam bq_direct_runner_write

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3892.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3892


commit f99db7932cab90dda2741d22b291e7f1eaad7336
Author: chamik...@google.com 
Date:   2017-09-23T00:59:50Z

Updates WriteToBigQuery PTransform to get project id from 
GoogleCloudOptions when using DirectRunner.

WriteToBigQuery PTransform behaves differently for DirectRunner and 
DataflowRunner when it comes to determining the project that the output table 
belongs to. If a project is not specified, DataflowRunner defaults to 
GoogleCloudOptions.project while DirectRunner does not. This PR fixes this 
inconsistency by defaulting to GoogleCloudOptions.project for DirectRunner as 
well.




---


Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Dataflow #4030

2017-09-22 Thread Apache Jenkins Server
See 




Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #4029

2017-09-22 Thread Apache Jenkins Server
See 




[GitHub] beam pull request #3867: Included immediate results after CoGroupByKey for b...

2017-09-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3867


---


[1/2] beam git commit: Included immediate results after CoGroupByKey for better readability in docs

2017-09-22 Thread chamikara
Repository: beam
Updated Branches:
  refs/heads/master 892e6c833 -> aa2604a39


Included immediate results after CoGroupByKey for better readability in docs


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/13ed7ff9
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/13ed7ff9
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/13ed7ff9

Branch: refs/heads/master
Commit: 13ed7ff920a45293a5a4d75f4dfdb52bbbf2b799
Parents: 892e6c8
Author: David Cavazos 
Authored: Tue Sep 19 12:15:38 2017 -0700
Committer: chamik...@google.com 
Committed: Fri Sep 22 19:00:36 2017 -0700

--
 .../apache_beam/examples/snippets/snippets.py   |  8 +++
 .../examples/snippets/snippets_test.py  | 25 +---
 2 files changed, 26 insertions(+), 7 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/13ed7ff9/sdks/python/apache_beam/examples/snippets/snippets.py
--
diff --git a/sdks/python/apache_beam/examples/snippets/snippets.py 
b/sdks/python/apache_beam/examples/snippets/snippets.py
index eac87a2..0ced3f1 100644
--- a/sdks/python/apache_beam/examples/snippets/snippets.py
+++ b/sdks/python/apache_beam/examples/snippets/snippets.py
@@ -1159,15 +1159,15 @@ def model_co_group_by_key_tuple(email_list, phone_list, 
output_path):
 # For instance, if 'emails' contained ('joe', 'j...@example.com') and
 # ('joe', 'j...@gmail.com'), then 'result' will contain the element:
 # ('joe', {'emails': ['j...@example.com', 'j...@gmail.com'], 'phones': 
...})
-result = ({'emails': emails_pcoll, 'phones': phones_pcoll}
-  | beam.CoGroupByKey())
+results = ({'emails': emails_pcoll, 'phones': phones_pcoll}
+   | beam.CoGroupByKey())
 
-contact_lines = result | beam.Map(
+formatted_results = results | beam.Map(
 lambda (name, info):\
'%s; %s; %s' %\
(name, sorted(info['emails']), sorted(info['phones'])))
 # [END model_group_by_key_cogroupbykey_tuple]
-contact_lines | beam.io.WriteToText(output_path)
+formatted_results | beam.io.WriteToText(output_path)
 
 
 def model_join_using_side_inputs(

http://git-wip-us.apache.org/repos/asf/beam/blob/13ed7ff9/sdks/python/apache_beam/examples/snippets/snippets_test.py
--
diff --git a/sdks/python/apache_beam/examples/snippets/snippets_test.py 
b/sdks/python/apache_beam/examples/snippets/snippets_test.py
index a700ba5..269a241 100644
--- a/sdks/python/apache_beam/examples/snippets/snippets_test.py
+++ b/sdks/python/apache_beam/examples/snippets/snippets_test.py
@@ -711,14 +711,33 @@ class SnippetsTest(unittest.TestCase):
 result_path = self.create_temp_file()
 snippets.model_co_group_by_key_tuple(email_list, phone_list, result_path)
 # [START model_group_by_key_cogroupbykey_tuple_outputs]
-contact_lines = [
+results = [
+('amy', {
+'emails': ['a...@example.com'],
+'phones': ['111-222-', '333-444-']}),
+('carl', {
+'emails': ['c...@email.com', 'c...@example.com'],
+'phones': ['444-555-']}),
+('james', {
+'emails': [],
+'phones': ['222-333-']}),
+('julia', {
+'emails': ['ju...@example.com'],
+'phones': []}),
+]
+# [END model_group_by_key_cogroupbykey_tuple_outputs]
+# [START model_group_by_key_cogroupbykey_tuple_formatted_outputs]
+formatted_results = [
 "amy; ['a...@example.com']; ['111-222-', '333-444-']",
 "carl; ['c...@email.com', 'c...@example.com']; ['444-555-']",
 "james; []; ['222-333-']",
 "julia; ['ju...@example.com']; []",
 ]
-# [END model_group_by_key_cogroupbykey_tuple_outputs]
-self.assertEqual(contact_lines, self.get_output(result_path))
+# [END model_group_by_key_cogroupbykey_tuple_formatted_outputs]
+expected_results = ['%s; %s; %s' % (name, info['emails'], info['phones'])
+for name, info in results]
+self.assertEqual(expected_results, formatted_results)
+self.assertEqual(formatted_results, self.get_output(result_path))
 
   def test_model_use_and_query_metrics(self):
 """DebuggingWordCount example snippets."""
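
As a rough illustration of what the renamed `results` and `formatted_results` in the diff above hold, the following plain-Python sketch mimics the grouping without Beam. The `co_group_by_key` helper, the sample addresses, and the phone numbers are made up for this example; they are not part of the Beam SDK or of the snippet's test data.

```python
from collections import defaultdict


def co_group_by_key(**tagged_inputs):
    """Group several lists of (key, value) pairs by key.

    Hypothetical stand-in for the grouping step: each key maps to a dict
    holding one list of values per input tag (empty if the tag has none).
    """
    grouped = defaultdict(lambda: {tag: [] for tag in tagged_inputs})
    for tag, pairs in tagged_inputs.items():
        for key, value in pairs:
            grouped[key][tag].append(value)
    # Keys are unique, so sorting only ever compares the names.
    return sorted(grouped.items())


# Illustrative data only.
emails = [('amy', 'amy@a.example'), ('carl', 'carl@a.example'),
          ('carl', 'carl@b.example'), ('julia', 'julia@a.example')]
phones = [('amy', '555-0100'), ('amy', '555-0101'),
          ('james', '555-0102'), ('carl', '555-0103')]

# Intermediate results: (name, {'emails': [...], 'phones': [...]}).
results = co_group_by_key(emails=emails, phones=phones)

# Same formatting expression as in the snippet, one line per name.
formatted_results = [
    '%s; %s; %s' % (name, sorted(info['emails']), sorted(info['phones']))
    for name, info in results]
```

Keeping the intermediate `results` separate from `formatted_results` mirrors the intent of the commit: a reader can see the grouped structure before it is flattened into text lines.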



[GitHub] beam pull request #3889: [BEAM-2981] Update Dataflow worker version to attai...

2017-09-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3889


---


[jira] [Commented] (BEAM-2981) Unable to deserialize ProtoCoder in Dataflow, serialVersionUID mismatch

2017-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16177420#comment-16177420
 ] 

ASF GitHub Bot commented on BEAM-2981:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3889


> Unable to deserialize ProtoCoder in Dataflow, serialVersionUID mismatch
> ---
>
> Key: BEAM-2981
> URL: https://issues.apache.org/jira/browse/BEAM-2981
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Blocker
> Fix For: 2.2.0
>
>
> Jenkins failure: 
> https://builds.apache.org/job/beam_Release_NightlySnapshot/540/org.apache.beam$beam-runners-google-cloud-dataflow-java/#showFailuresLink
> {code}
> Caused by: java.io.InvalidClassException: 
> org.apache.beam.sdk.extensions.protobuf.ProtoCoder; local class incompatible: 
> stream classdesc serialVersionUID = -5043999806040629525, local class 
> serialVersionUID = 5772992315255168068
> {code}





[2/2] beam git commit: This closes #3889: [BEAM-2981] Update Dataflow worker version to attain ProtoCoder serialVersionUID fix

2017-09-22 Thread kenn
This closes #3889: [BEAM-2981] Update Dataflow worker version to attain 
ProtoCoder serialVersionUID fix

  Update Dataflow worker to 20170922-01


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/892e6c83
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/892e6c83
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/892e6c83

Branch: refs/heads/master
Commit: 892e6c833758b6c9aae3ddc9399c0cb28fac
Parents: 7dacfdc 85901d3
Author: Kenneth Knowles <k...@google.com>
Authored: Fri Sep 22 18:14:50 2017 -0700
Committer: Kenneth Knowles <k...@google.com>
Committed: Fri Sep 22 18:14:50 2017 -0700

--
 runners/google-cloud-dataflow-java/pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--




[1/2] beam git commit: Update Dataflow worker to 20170922-01

2017-09-22 Thread kenn
Repository: beam
Updated Branches:
  refs/heads/master 7dacfdcda -> 892e6c833


Update Dataflow worker to 20170922-01


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/85901d3f
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/85901d3f
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/85901d3f

Branch: refs/heads/master
Commit: 85901d3f5d5e648862155c144f9158ec21a874b2
Parents: 74a5c9e
Author: Kenneth Knowles <k...@google.com>
Authored: Fri Sep 22 12:01:13 2017 -0700
Committer: Kenneth Knowles <k...@google.com>
Committed: Fri Sep 22 14:10:11 2017 -0700

--
 runners/google-cloud-dataflow-java/pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/85901d3f/runners/google-cloud-dataflow-java/pom.xml
--
diff --git a/runners/google-cloud-dataflow-java/pom.xml 
b/runners/google-cloud-dataflow-java/pom.xml
index eb490cb..4d2c5ee 100644
--- a/runners/google-cloud-dataflow-java/pom.xml
+++ b/runners/google-cloud-dataflow-java/pom.xml
@@ -33,7 +33,7 @@
   jar
 
   
-
beam-master-20170918
+    
beam-master-20170922-01
 
1
 
6
   



[jira] [Resolved] (BEAM-2981) Unable to deserialize ProtoCoder in Dataflow, serialVersionUID mismatch

2017-09-22 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles resolved BEAM-2981.
---
Resolution: Fixed

> Unable to deserialize ProtoCoder in Dataflow, serialVersionUID mismatch
> ---
>
> Key: BEAM-2981
> URL: https://issues.apache.org/jira/browse/BEAM-2981
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Blocker
> Fix For: 2.2.0
>
>
> Jenkins failure: 
> https://builds.apache.org/job/beam_Release_NightlySnapshot/540/org.apache.beam$beam-runners-google-cloud-dataflow-java/#showFailuresLink
> {code}
> Caused by: java.io.InvalidClassException: 
> org.apache.beam.sdk.extensions.protobuf.ProtoCoder; local class incompatible: 
> stream classdesc serialVersionUID = -5043999806040629525, local class 
> serialVersionUID = 5772992315255168068
> {code}





Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #3907

2017-09-22 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #3906

2017-09-22 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2985) BigQuery IO write transform is broken for DirectRunner

2017-09-22 Thread Chamikara Jayalath (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16177405#comment-16177405
 ] 

Chamikara Jayalath commented on BEAM-2985:
--

This seems to be an inconsistency between DirectRunner and DataflowRunner 
for the recently introduced WriteToBigQuery PTransform. 

If a project ID is not specified for WriteToBigQuery PTransform, DataflowRunner 
currently defaults to GoogleCloudOptions.project while DirectRunner does not. 
WriteToBigQuery will work for DirectRunner if a project is specified (either 
explicitly or as a part of a table string). Will send out a PR to fix this 
inconsistency.
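
The fallback described above could be sketched roughly as follows. This is not the code from the PR: `resolve_project` is a hypothetical helper, the `project:dataset.table` table-string form is taken from the comment's mention of a project given "as a part of a table string", and the precedence between the table string and an explicit project is an assumption made for the example.

```python
def resolve_project(table_spec, explicit_project=None, default_project=None):
    """Pick the project for an output table (illustrative only).

    Assumed order: project embedded in the table string, then an explicitly
    passed project, then the pipeline-wide default (standing in for
    GoogleCloudOptions.project).
    """
    if ':' in table_spec:
        # 'project:dataset.table' carries its own project.
        return table_spec.split(':', 1)[0]
    if explicit_project:
        return explicit_project
    if default_project:
        return default_project
    # Without any of the three there is nothing to send as projectId.
    raise ValueError('no project specified for table %r' % table_spec)


from_table_string = resolve_project('my-project:dataset.table')
from_default = resolve_project('dataset.table',
                               default_project='options-project')
```

With a fallback like this in place, both runners would resolve the same project when the caller specifies none.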

> BigQuery IO write transform is broken for DirectRunner
> --
>
> Key: BEAM-2985
> URL: https://issues.apache.org/jira/browse/BEAM-2985
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>
> I get the following error when trying to run BigQuery tornadoes using 
> DirectRunner.
> DataflowRunner seems to be working fine.
> python -m apache_beam.examples.cookbook.bigquery_tornadoes --output 
> . --project 
>  Request missing required parameter projectId
>  Traceback for above exception (most recent call last):
>   File "apache_beam/utils/retry.py", line 175, in wrapper
> return fun(*args, **kwargs)
>   File "apache_beam/io/gcp/bigquery.py", line 828, in _get_table
> response = self.client.tables.Get(request)
>   File "apache_beam/io/gcp/internal/clients/bigquery/bigquery_v2_client.py", 
> line 608, in Get
> config, request, global_params=global_params)
>   File 
> "/Users/chamikara/testing/beam_bq_09_22_2017/env1/lib/python2.7/site-packages/apitools/base/py/base_api.py",
>  line 695, in _RunMethod
> download)
>   File 
> "/Users/chamikara/testing/beam_bq_09_22_2017/env1/lib/python2.7/site-packages/apitools/base/py/base_api.py",
>  line 676, in PrepareHttpRequest
> method_config, request, relative_path=url_builder.relative_path)
>   File 
> "/Users/chamikara/testing/beam_bq_09_22_2017/env1/lib/python2.7/site-packages/apitools/base/py/base_api.py",
>  line 580, in __ConstructRelativePath
> relative_path=relative_path)
>   File 
> "/Users/chamikara/testing/beam_bq_09_22_2017/env1/lib/python2.7/site-packages/apitools/base/py/util.py",
>  line 124, in ExpandRelativePath
> 'Request missing required parameter %s' % param)





Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #3900

2017-09-22 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Dataflow #4023

2017-09-22 Thread Apache Jenkins Server
See 


--
[...truncated 3.39 MB...]
[INFO] Closing reader after reading 0 records.
[INFO] Adjusting range start from [6b6579303030303030333030] to 
[6b6579303030303030333030] as position of first returned record
[INFO] Closing reader after reading 100 records.
[INFO] Closing reader after reading 0 records.
[INFO] Adjusting range start from [6b6579303030303030343030] to 
[6b6579303030303030343030] as position of first returned record
[INFO] Closing reader after reading 100 records.
[INFO] Closing reader after reading 0 records.
[INFO] Adjusting range start from [6b6579303030303030353030] to 
[6b6579303030303030353030] as position of first returned record
[INFO] Closing reader after reading 100 records.
[INFO] Closing reader after reading 0 records.
[INFO] Adjusting range start from [6b6579303030303030363030] to 
[6b6579303030303030363030] as position of first returned record
[INFO] Closing reader after reading 100 records.
[INFO] Closing reader after reading 0 records.
[INFO] Adjusting range start from [6b6579303030303030373030] to 
[6b6579303030303030373030] as position of first returned record
[INFO] Closing reader after reading 100 records.
[INFO] Closing reader after reading 0 records.
[INFO] Adjusting range start from [6b6579303030303030383030] to 
[6b6579303030303030383030] as position of first returned record
[INFO] Closing reader after reading 100 records.
[INFO] Closing reader after reading 0 records.
[INFO] Adjusting range start from [6b6579303030303030393030] to 
[6b6579303030303030393030] as position of first returned record
[INFO] Closing reader after reading 100 records.
[INFO] Closing reader after reading 0 records.
[INFO] Wrote 1 records
[INFO] Adjusting range start from [] to [6b6579303030303030303030] as position 
of first returned record
[INFO] Closing reader after reading 100 records.
[INFO] Enabling the use of null credentials. This should not be used in 
production.
2017-09-22T12:10:35.880 [INFO] Tests run: 27, Failures: 0, Errors: 0, Skipped: 
0, Time elapsed: 4.9 s - in org.apache.beam.sdk.io.gcp.bigtable.BigtableIOTest
2017-09-22T12:10:35.881 [INFO] Running 
org.apache.beam.sdk.io.gcp.datastore.AdaptiveThrottlerTest
2017-09-22T12:10:36.036 [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 
0, Time elapsed: 0.15 s - in 
org.apache.beam.sdk.io.gcp.datastore.AdaptiveThrottlerTest
2017-09-22T12:10:36.036 [INFO] Running 
org.apache.beam.sdk.io.gcp.datastore.DatastoreV1Test
[ERROR] Unable to update metrics on the current thread. Most likely caused by 
using metrics outside the managed work-execution thread.
[INFO] Latest stats timestamp for kind testKind is 123400
[INFO] Estimated size bytes for the query is: 1342177280
[INFO] Splitting the query into 20 splits
[WARNING] Failed to translate Gql query 'SELECT * from DummyKind LIMIT 10 LIMIT 
0': invalid query
[WARNING] User query might have a limit already set, so trying without zero 
limit
[ERROR] Error writing batch of 200 mutations to Datastore (DEADLINE_EXCEEDED): 
[INFO] Latest stats timestamp for kind testKind is 123400
[INFO] Splitting the query into 100 splits
2017-09-22T12:10:48.251 [INFO] Tests run: 46, Failures: 0, Errors: 0, Skipped: 
0, Time elapsed: 12.21 s - in 
org.apache.beam.sdk.io.gcp.datastore.DatastoreV1Test
2017-09-22T12:10:48.251 [INFO] Running 
org.apache.beam.sdk.io.gcp.spanner.SpannerIOWriteTest
2017-09-22T12:10:49.324 [INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 
0, Time elapsed: 1.068 s - in 
org.apache.beam.sdk.io.gcp.spanner.SpannerIOWriteTest
2017-09-22T12:10:49.324 [INFO] Running 
org.apache.beam.sdk.io.gcp.spanner.MutationSizeEstimatorTest
2017-09-22T12:10:49.383 [INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 
0, Time elapsed: 0.053 s - in 
org.apache.beam.sdk.io.gcp.spanner.MutationSizeEstimatorTest
2017-09-22T12:10:49.383 [INFO] Running 
org.apache.beam.sdk.io.gcp.spanner.SpannerIOReadTest
2017-09-22T12:10:50.028 [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 
0, Time elapsed: 0.642 s - in 
org.apache.beam.sdk.io.gcp.spanner.SpannerIOReadTest
2017-09-22T12:10:50.030 [INFO] 
2017-09-22T12:10:50.030 [INFO] Results:
2017-09-22T12:10:50.030 [INFO] 
2017-09-22T12:10:50.030 [INFO] Tests run: 277, Failures: 0, Errors: 0, Skipped: 0
2017-09-22T12:10:50.030 [INFO] 
[JENKINS] Recording test results
2017-09-22T12:10:50.636 [INFO] 
2017-09-22T12:10:50.636 [INFO] --- 
build-helper-maven-plugin:3.0.0:regex-properties (render-artifact-id) @ 
beam-sdks-java-io-google-cloud-platform ---
2017-09-22T12:10:50.961 [INFO] 
2017-09-22T12:10:50.961 [INFO] --- maven-jar-plugin:3.0.2:jar (default-jar) @ 
beam-sdks-java-io-google-cloud-platform ---
2017-09-22T12:10:50.998 [INFO] Building jar: 

Jenkins build is unstable: beam_PostCommit_Java_MavenInstall #4853

2017-09-22 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #3899

2017-09-22 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2980) BagState.isEmpty needs a tighter spec

2017-09-22 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16176570#comment-16176570
 ] 

Kenneth Knowles commented on BEAM-2980:
---

Yea, it actually came up because the tests added for BEAM-2975 enforce that 
{{isEmpty()}} does *not* take a snapshot.

https://github.com/apache/beam/blob/master/sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/ParDoTest.java#L1991

> BagState.isEmpty needs a tighter spec
> -
>
> Key: BEAM-2980
> URL: https://issues.apache.org/jira/browse/BEAM-2980
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>
> Consider the following:
> {code}
> BagState myBag = // empty
> ReadableState isMyBagEmpty = myBag.isEmpty();
> myBag.add(bizzle);
> boolean empty = isMyBagEmpty.read();
> {code}
> Should {{empty}} be true or false? We need a consistent answer, across all 
> kinds of state, when snapshots are required.
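
The two possible answers to the question in the issue can be made concrete with a small plain-Python model. The classes below are hypothetical and only imitate the shape of the example; they are not Beam's `BagState` or `ReadableState`. One reads the bag when `read()` is called, the other captures the answer when the handle is created.

```python
class LazyIsEmpty(object):
    """Evaluates emptiness when read() is called (no snapshot)."""

    def __init__(self, bag):
        self._bag = bag

    def read(self):
        return len(self._bag) == 0


class SnapshotIsEmpty(object):
    """Captures emptiness at creation time (snapshot)."""

    def __init__(self, bag):
        self._was_empty = len(bag) == 0

    def read(self):
        return self._was_empty


my_bag = []                          # empty, as in the issue
lazy = LazyIsEmpty(my_bag)
snapshot = SnapshotIsEmpty(my_bag)
my_bag.append('bizzle')              # add after obtaining the handles

lazy_result = lazy.read()            # sees the element added afterwards
snapshot_result = snapshot.read()    # still reports the earlier state
```

Per the comment above, the tests added for BEAM-2975 enforce the non-snapshot behaviour for `isEmpty()`, which corresponds to the lazy variant in this model.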





Build failed in Jenkins: beam_PostCommit_Python_Verify #3191

2017-09-22 Thread Apache Jenkins Server
See 


--
[...truncated 44.47 KB...]
Collecting mock<3.0.0,>=1.0.1 (from apache-beam==2.2.0.dev0)
  Using cached mock-2.0.0-py2.py3-none-any.whl
Collecting oauth2client<4.0.0,>=2.0.1 (from apache-beam==2.2.0.dev0)
Collecting protobuf<=3.3.0,>=3.2.0 (from apache-beam==2.2.0.dev0)
  Using cached protobuf-3.3.0-cp27-cp27mu-manylinux1_x86_64.whl
Collecting pyyaml<4.0.0,>=3.12 (from apache-beam==2.2.0.dev0)
Collecting six<1.11,>=1.9 (from apache-beam==2.2.0.dev0)
  Using cached six-1.10.0-py2.py3-none-any.whl
Collecting typing<3.7.0,>=3.6.0 (from apache-beam==2.2.0.dev0)
  Using cached typing-3.6.2-py2-none-any.whl
Requirement already satisfied: futures>=2.2.0 in 
./target/.tox/py27cython/lib/python2.7/site-packages (from 
grpcio<2.0,>=1.0->apache-beam==2.2.0.dev0)
Requirement already satisfied: enum34>=1.0.4 in 
./target/.tox/py27cython/lib/python2.7/site-packages (from 
grpcio<2.0,>=1.0->apache-beam==2.2.0.dev0)
Collecting funcsigs>=1; python_version < "3.3" (from 
mock<3.0.0,>=1.0.1->apache-beam==2.2.0.dev0)
  Using cached funcsigs-1.0.2-py2.py3-none-any.whl
Collecting pbr>=0.11 (from mock<3.0.0,>=1.0.1->apache-beam==2.2.0.dev0)
  Using cached pbr-3.1.1-py2.py3-none-any.whl
Collecting rsa>=3.1.4 (from oauth2client<4.0.0,>=2.0.1->apache-beam==2.2.0.dev0)
  Using cached rsa-3.4.2-py2.py3-none-any.whl
Collecting pyasn1-modules>=0.0.5 (from 
oauth2client<4.0.0,>=2.0.1->apache-beam==2.2.0.dev0)
  Using cached pyasn1_modules-0.1.4-py2.py3-none-any.whl
Collecting pyasn1>=0.1.7 (from 
oauth2client<4.0.0,>=2.0.1->apache-beam==2.2.0.dev0)
  Using cached pyasn1-0.3.6-py2.py3-none-any.whl
Requirement already satisfied: setuptools in 
./target/.tox/py27cython/lib/python2.7/site-packages (from 
protobuf<=3.3.0,>=3.2.0->apache-beam==2.2.0.dev0)
Building wheels for collected packages: apache-beam
  Running setup.py bdist_wheel for apache-beam: started
  Running setup.py bdist_wheel for apache-beam: finished with status 'error'
  Complete output from command 

 -u -c "import setuptools, 
tokenize;__file__='/tmp/pip-cGBIH4-build/setup.py';f=getattr(tokenize, 'open', 
open)(__file__);code=f.read().replace('\r\n', 
'\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d 
/tmp/tmphpkbrypip-wheel- --python-tag cp27:
  
:351:
 UserWarning: Normalizing '2.2.0.dev' to '2.2.0.dev0'
normalized_version,
  running bdist_wheel
  running build
  running build_py
  Traceback (most recent call last):
File "", line 1, in 
File "/tmp/pip-cGBIH4-build/setup.py", line 203, in 
  'test': generate_protos_first(test),
File "/usr/lib/python2.7/distutils/core.py", line 151, in setup
  dist.run_commands()
File "/usr/lib/python2.7/distutils/dist.py", line 953, in run_commands
  self.run_command(cmd)
File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
  cmd_obj.run()
File 
"
 line 204, in run
  self.run_command('build')
File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
  self.distribution.run_command(command)
File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
  cmd_obj.run()
File "/usr/lib/python2.7/distutils/command/build.py", line 128, in run
  self.run_command(cmd_name)
File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
  self.distribution.run_command(command)
File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
  cmd_obj.run()
File "/tmp/pip-cGBIH4-build/setup.py", line 143, in run
  gen_protos.generate_proto_files()
File "gen_protos.py", line 65, in generate_proto_files
  'Not in apache git tree; unable to find proto definitions.')
  RuntimeError: Not in apache git tree; unable to find proto definitions.
  
  
  Failed building wheel for apache-beam
  Running setup.py clean for apache-beam
Failed to build apache-beam
Installing collected packages: avro, crcmod, dill, httplib2, six, funcsigs, 
pbr, mock, pyasn1, rsa, pyasn1-modules, oauth2client, protobuf, pyyaml, typing, 
apache-beam
  Found existing installation: six 1.11.0
Uninstalling six-1.11.0:
  Successfully uninstalled six-1.11.0
  Found existing installation: protobuf 3.4.0
Uninstalling protobuf-3.4.0:
  Successfully uninstalled protobuf-3.4.0
  Running setup.py install for apache-beam: started
Running setup.py install for apache-beam: finished with status 'error'
Complete output from 

[jira] [Created] (BEAM-2982) PubSubIO.readMessages().fromSubscription(...) doesn't work with ValueProvider

2017-09-22 Thread Ben Chambers (JIRA)
Ben Chambers created BEAM-2982:
--

 Summary: PubSubIO.readMessages().fromSubscription(...) doesn't 
work with ValueProvider
 Key: BEAM-2982
 URL: https://issues.apache.org/jira/browse/BEAM-2982
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-gcp
Reporter: Ben Chambers
Assignee: Thomas Groh


Originally reported on Stack Overflow:

https://stackoverflow.com/questions/46360584/apache-beam-template-runtime-context-error

---

In the `PubsubUnboundedSource#expand` method we create the PubsubSource:

https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubUnboundedSource.java#L1399

Creating the PubsubSource calls `getSubscription` which attempts to get the 
value out of a value provider.

To support templatization, the PubsubSource needs to take the ValueProvider, 
and only get the subscription out at pipeline execution time.
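
A minimal sketch of the difference, in plain Python rather than the Java SDK: every class here is a hypothetical stand-in (none of them is Beam's `ValueProvider` or `PubsubSource`), and the subscription path is a made-up value. The point is only that reading the value in the constructor fails for a template, while holding the provider and reading it later does not.

```python
class RuntimeValueProvider(object):
    """Stand-in for a value that is only known at execution time."""

    def __init__(self):
        self._value = None
        self._available = False

    def set(self, value):
        self._value = value
        self._available = True

    def get(self):
        if not self._available:
            raise RuntimeError('value is not available before execution')
        return self._value


class EagerSource(object):
    """Reads the subscription at construction time, as in the bug."""

    def __init__(self, subscription_provider):
        self.subscription = subscription_provider.get()


class DeferredSource(object):
    """Keeps the provider and reads it only when the source is opened."""

    def __init__(self, subscription_provider):
        self._provider = subscription_provider

    def open(self):
        return self._provider.get()


provider = RuntimeValueProvider()

# Template construction: the value has not been supplied yet.
try:
    EagerSource(provider)
    eager_failed = False
except RuntimeError:
    eager_failed = True

source = DeferredSource(provider)                 # construction succeeds
provider.set('projects/example/subscriptions/s')  # supplied at run time
subscription = source.open()
```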






[jira] [Created] (BEAM-2981) Unable to deserialize ProtoCoder in Dataflow, serialVersionUID mismatch

2017-09-22 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-2981:
-

 Summary: Unable to deserialize ProtoCoder in Dataflow, 
serialVersionUID mismatch
 Key: BEAM-2981
 URL: https://issues.apache.org/jira/browse/BEAM-2981
 Project: Beam
  Issue Type: Bug
  Components: runner-dataflow
Reporter: Kenneth Knowles
Assignee: Kenneth Knowles
Priority: Blocker
 Fix For: 2.2.0


Jenkins failure: 
https://builds.apache.org/job/beam_Release_NightlySnapshot/540/org.apache.beam$beam-runners-google-cloud-dataflow-java/#showFailuresLink

{code}
Caused by: java.io.InvalidClassException: 
org.apache.beam.sdk.extensions.protobuf.ProtoCoder; local class incompatible: 
stream classdesc serialVersionUID = -5043999806040629525, local class 
serialVersionUID = 5772992315255168068
{code}





[GitHub] beam pull request #3884: Revert "Initial set of pipeline jobs."

2017-09-22 Thread jasonkuster
GitHub user jasonkuster opened a pull request:

https://github.com/apache/beam/pull/3884

Revert "Initial set of pipeline jobs."

This reverts commit 4f7e0d65c514f022c0675dec50853ac3c7554be7.

---
R: @chamikaramj 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jasonkuster/beam revert

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3884.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3884


commit ba3a46e7872d09d90d16419c5cb185676f78dd3f
Author: Jason Kuster 
Date:   2017-09-22T18:02:51Z

Revert "Initial set of pipeline jobs."

This reverts commit 4f7e0d65c514f022c0675dec50853ac3c7554be7.




---


[jira] [Commented] (BEAM-2981) Unable to deserialize ProtoCoder in Dataflow, serialVersionUID mismatch

2017-09-22 Thread Eugene Kirpichov (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16176837#comment-16176837
 ] 

Eugene Kirpichov commented on BEAM-2981:


This should be fixed by a recent fix in the Dataflow worker.

> Unable to deserialize ProtoCoder in Dataflow, serialVersionUID mismatch
> ---
>
> Key: BEAM-2981
> URL: https://issues.apache.org/jira/browse/BEAM-2981
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Blocker
> Fix For: 2.2.0
>
>
> Jenkins failure: 
> https://builds.apache.org/job/beam_Release_NightlySnapshot/540/org.apache.beam$beam-runners-google-cloud-dataflow-java/#showFailuresLink
> {code}
> Caused by: java.io.InvalidClassException: 
> org.apache.beam.sdk.extensions.protobuf.ProtoCoder; local class incompatible: 
> stream classdesc serialVersionUID = -5043999806040629525, local class 
> serialVersionUID = 5772992315255168068
> {code}





Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #3901

2017-09-22 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2981) Unable to deserialize ProtoCoder in Dataflow, serialVersionUID mismatch

2017-09-22 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16176863#comment-16176863
 ] 

Kenneth Knowles commented on BEAM-2981:
---

I am now in the loop on that, and I'll bump the worker version shortly.

> Unable to deserialize ProtoCoder in Dataflow, serialVersionUID mismatch
> ---
>
> Key: BEAM-2981
> URL: https://issues.apache.org/jira/browse/BEAM-2981
> Project: Beam
> Issue Type: Bug
> Components: runner-dataflow
> Reporter: Kenneth Knowles
> Assignee: Kenneth Knowles
> Priority: Blocker
> Fix For: 2.2.0
>
>
> Jenkins failure: 
> https://builds.apache.org/job/beam_Release_NightlySnapshot/540/org.apache.beam$beam-runners-google-cloud-dataflow-java/#showFailuresLink
> {code}
> Caused by: java.io.InvalidClassException: 
> org.apache.beam.sdk.extensions.protobuf.ProtoCoder; local class incompatible: 
> stream classdesc serialVersionUID = -5043999806040629525, local class 
> serialVersionUID = 5772992315255168068
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #3885: [BEAM-2876] Add preliminary provision API

2017-09-22 Thread herohde
GitHub user herohde opened a pull request:

https://github.com/apache/beam/pull/3885

[BEAM-2876] Add preliminary provision API

Add provisioning API for the portability container contract. The pipeline options
match the type used in the job submission request.

See: https://s.apache.org/beam-fn-api-container-contract


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/herohde/beam containers

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3885.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3885


commit dc340577bc51ba123aa1aa62caa86882e3813561
Author: Henning Rohde 
Date:   2017-09-22T17:41:28Z

[BEAM-2876] Add preliminary provision API




---


[jira] [Commented] (BEAM-2876) Add provision api proto

2017-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16176867#comment-16176867
 ] 

ASF GitHub Bot commented on BEAM-2876:
--

GitHub user herohde opened a pull request:

https://github.com/apache/beam/pull/3885

[BEAM-2876] Add preliminary provision API

Add provisioning API for the portability container contract. The pipeline options
match the type used in the job submission request.

See: https://s.apache.org/beam-fn-api-container-contract


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/herohde/beam containers

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3885.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3885


commit dc340577bc51ba123aa1aa62caa86882e3813561
Author: Henning Rohde 
Date:   2017-09-22T17:41:28Z

[BEAM-2876] Add preliminary provision API




> Add provision api proto
> ---
>
> Key: BEAM-2876
> URL: https://issues.apache.org/jira/browse/BEAM-2876
> Project: Beam
> Issue Type: Sub-task
> Components: beam-model
> Reporter: Henning Rohde
> Assignee: Henning Rohde
> Labels: portability
>
> As per discussion in https://s.apache.org/beam-fn-api-container-contract, we 
> need to define the provision API to allow boot code access to pipeline 
> options, in particular.
> It is proposed as a separate API instead of merging it with control or 
> artifact:
> (1) Not merging with control avoids having the boot code talk to control, 
> only to disconnect and have the SDK harness connect. The runner can't then 
> use the lifetime of the connection to be the lifetime of the SDK harness.
> (2) Not merging with artifact allows for simple, reusable artifact proxies.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (BEAM-2983) separate docs for constructing WindowFn from applying WindowFn to PCollection

2017-09-22 Thread Melissa Pashniak (JIRA)
Melissa Pashniak created BEAM-2983:
--

 Summary: separate docs for constructing WindowFn from applying 
WindowFn to PCollection
 Key: BEAM-2983
 URL: https://issues.apache.org/jira/browse/BEAM-2983
 Project: Beam
  Issue Type: Improvement
  Components: website
Reporter: Melissa Pashniak
Assignee: Melissa Pashniak
Priority: Minor


tgroh request: condense the "How to construct a window fn" parts to the 
provided windowing functions section, and have this be just how to apply a 
WindowFn to a PCollection




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (BEAM-2967) Python appears to be broken in Java Pre, Postcommit

2017-09-22 Thread Thomas Groh (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Groh closed BEAM-2967.
-
   Resolution: Fixed
Fix Version/s: Not applicable

> Python appears to be broken in Java Pre, Postcommit
> ---
>
> Key: BEAM-2967
> URL: https://issues.apache.org/jira/browse/BEAM-2967
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Reporter: Thomas Groh
> Assignee: Robert Bradshaw
> Priority: Critical
> Fix For: Not applicable
>
>
> Postcommit has been broken since 6 pm on the 17th, starting at 
> https://builds.apache.org/job/beam_PostCommit_Java_MavenInstall/4811/
> The java components appear to still build but the overall build fails during 
> the python target with
> TypeError: Error when calling the metaclass bases
> metaclass conflict: the metaclass of a derived class must be a 
> (non-strict) subclass of the metaclasses of all its bases



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (BEAM-2877) Java SDK harness container

2017-09-22 Thread Henning Rohde (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henning Rohde reassigned BEAM-2877:
---

Assignee: Henning Rohde

> Java SDK harness container
> --
>
> Key: BEAM-2877
> URL: https://issues.apache.org/jira/browse/BEAM-2877
> Project: Beam
> Issue Type: Sub-task
> Components: sdk-java-harness
> Reporter: Henning Rohde
> Assignee: Henning Rohde
> Labels: portability
>
> Add portable Java SDK harness container as per 
> https://s.apache.org/beam-fn-api-container-contract.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #3886: [BEAM-419] Fixing SE_BAD_FIELD FindBug in CombineFn...

2017-09-22 Thread youngoli
GitHub user youngoli opened a pull request:

https://github.com/apache/beam/pull/3886

[BEAM-419] Fixing SE_BAD_FIELD FindBug in CombineFnUtil

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [x] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [x] Each commit in the pull request should have a meaningful subject 
line and body.
 - [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [x] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [x] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---

FindBugs detected that the `context` member in 
`NonSerializableBoundedKeyedCombineFn` (now `NonSerializableBoundedCombineFn`) 
was not serializable although the class was. Seeing as the class has 
"`NonSerializable`" right in the name, to fix the FindBug I marked the member 
as transient to make it explicitly non-serializable.
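
A minimal sketch of what marking a member transient does, assuming nothing about the real CombineFnUtil code: Holder, Context and TransientFieldSketch are hypothetical names used only for illustration. Java serialization skips transient fields, so the enclosing class can be written out even though the field's type is not Serializable; the field is simply null after deserialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical stand-in for a context object that is not Serializable.
class Context {
  final String name;

  Context(String name) {
    this.name = name;
  }
}

// Hypothetical serializable holder; not Beam's actual class.
class Holder implements Serializable {
  private static final long serialVersionUID = 1L;
  final String label;
  // transient: skipped by Java serialization, so Context need not be Serializable.
  transient Context context;

  Holder(String label, Context context) {
    this.label = label;
    this.context = context;
  }
}

class TransientFieldSketch {
  // Serializes the holder to an in-memory buffer and reads it back.
  static Holder roundTrip(Holder in) throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(in);
    }
    try (ObjectInputStream is =
        new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      return (Holder) is.readObject();
    }
  }

  public static void main(String[] args) throws Exception {
    Holder copy = roundTrip(new Holder("sum", new Context("ctx")));
    // The label survives; the transient context comes back as null.
    System.out.println(copy.label + " " + (copy.context == null));
  }
}
```

Without the transient keyword, writeObject would fail with NotSerializableException for Context, which is the situation SE_BAD_FIELD warns about.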

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/youngoli/beam bugfix-beam419

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3886.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3886


commit 0a9f01cffd686985bd0e7a889ad3c36524e84ed5
Author: Daniel Oliveira 
Date:   2017-09-22T17:40:29Z

[BEAM-419] Making non-serializable member transient to fix FindBug

commit d0491d2c2e229eb34e202833ab6bed4d344cd18b
Author: Daniel Oliveira 
Date:   2017-09-22T18:23:30Z

[BEAM-419] Removing findbug entry for fixed bug.




---


[jira] [Commented] (BEAM-419) Non-transient non-serializable instance field in CombineFnUtil$NonSerializableBoundedKeyedCombineFn

2017-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16176885#comment-16176885
 ] 

ASF GitHub Bot commented on BEAM-419:
-

GitHub user youngoli opened a pull request:

https://github.com/apache/beam/pull/3886

[BEAM-419] Fixing SE_BAD_FIELD FindBug in CombineFnUtil

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [x] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [x] Each commit in the pull request should have a meaningful subject 
line and body.
 - [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [x] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [x] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---

FindBugs detected that the `context` member in 
`NonSerializableBoundedKeyedCombineFn` (now `NonSerializableBoundedCombineFn`) 
was not serializable although the class was. Seeing as the class has 
"`NonSerializable`" right in the name, to fix the FindBug I marked the member 
as transient to make it explicitly non-serializable.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/youngoli/beam bugfix-beam419

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3886.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3886


commit 0a9f01cffd686985bd0e7a889ad3c36524e84ed5
Author: Daniel Oliveira 
Date:   2017-09-22T17:40:29Z

[BEAM-419] Making non-serializable member transient to fix FindBug

commit d0491d2c2e229eb34e202833ab6bed4d344cd18b
Author: Daniel Oliveira 
Date:   2017-09-22T18:23:30Z

[BEAM-419] Removing findbug entry for fixed bug.




> Non-transient non-serializable instance field in 
> CombineFnUtil$NonSerializableBoundedKeyedCombineFn
> ---
>
> Key: BEAM-419
> URL: https://issues.apache.org/jira/browse/BEAM-419
> Project: Beam
> Issue Type: Bug
> Components: sdk-java-core
> Reporter: Scott Wegner
> Assignee: Daniel Oliveira
> Priority: Minor
> Labels: findbugs, newbie, starter
>
> [FindBugs 
> SE_BAD_FIELD|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml#L363]:
>  Non-transient non-serializable instance field in serializable class
> Applies to: 
> [CombineFnUtil$NonSerializableBoundedKeyedCombineFn.context|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/core/src/main/java/org/apache/beam/sdk/util/CombineFnUtil.java#L170].
> This is a good starter bug. When fixing, please remove the corresponding 
> entries from 
> [findbugs-filter.xml|https://github.com/apache/incubator-beam/blob/master/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml]
>  and verify the build passes.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #3887: [BEAM-2884] Revert "This closes #3859: Send portabl...

2017-09-22 Thread kennknowles
GitHub user kennknowles opened a pull request:

https://github.com/apache/beam/pull/3887

[BEAM-2884] Revert "This closes #3859: Send portable protos for ParDo in 
DataflowRunner"

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [x] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [x] Each commit in the pull request should have a meaningful subject 
line and body.
 - [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [x] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [x] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [x] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---

The blowup to the job submission was a bit much. We will instead wait to 
implement a more robust longer-term solution that does not embed the protos 
directly in the job submission.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/beam ParDoPayload-rollback

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3887.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3887


commit cf665b6113be7f01fcb55e80d3657079055b8f95
Author: Kenneth Knowles 
Date:   2017-09-22T18:34:28Z

Revert "This closes #3859: [BEAM-2884] Send portable protos for ParDo in 
DataflowRunner"

This reverts commit 0d5d00d7060d6e4ee8273201e3432f14abf35f8a, reversing
changes made to 4e4d102124576aefc3f71e432dbf619792e3.

The blowup to the job submission was a bit much. We will instead wait to
implement a more robust longer-term solution that does not embed the protos
directly in the job submission.




---


[jira] [Commented] (BEAM-2884) Dataflow runs portable pipelines

2017-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16176906#comment-16176906
 ] 

ASF GitHub Bot commented on BEAM-2884:
--

GitHub user kennknowles opened a pull request:

https://github.com/apache/beam/pull/3887

[BEAM-2884] Revert "This closes #3859: Send portable protos for ParDo in 
DataflowRunner"

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [x] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [x] Each commit in the pull request should have a meaningful subject 
line and body.
 - [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [x] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [x] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [x] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---

The blowup to the job submission was a bit much. We will instead wait to 
implement a more robust longer-term solution that does not embed the protos 
directly in the job submission.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/beam ParDoPayload-rollback

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3887.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3887


commit cf665b6113be7f01fcb55e80d3657079055b8f95
Author: Kenneth Knowles 
Date:   2017-09-22T18:34:28Z

Revert "This closes #3859: [BEAM-2884] Send portable protos for ParDo in 
DataflowRunner"

This reverts commit 0d5d00d7060d6e4ee8273201e3432f14abf35f8a, reversing
changes made to 4e4d102124576aefc3f71e432dbf619792e3.

The blowup to the job submission was a bit much. We will instead wait to
implement a more robust longer-term solution that does not embed the protos
directly in the job submission.




> Dataflow runs portable pipelines
> 
>
> Key: BEAM-2884
> URL: https://issues.apache.org/jira/browse/BEAM-2884
> Project: Beam
> Issue Type: Improvement
> Components: runner-dataflow
> Reporter: Henning Rohde
> Labels: portability
>
> Dataflow should run pipelines using the full portability API as currently 
> defined:
> https://s.apache.org/beam-fn-api 
> https://s.apache.org/beam-runner-api
> https://s.apache.org/beam-job-api
> https://s.apache.org/beam-fn-api-container-contract
> This issue tracks its adoption of the portability framework. New Fn API and 
> other features will be tracked separately.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (BEAM-2984) Job submission too large with embedded Beam protos

2017-09-22 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-2984:
-

 Summary: Job submission too large with embedded Beam protos
 Key: BEAM-2984
 URL: https://issues.apache.org/jira/browse/BEAM-2984
 Project: Beam
  Issue Type: Bug
  Components: runner-dataflow
Reporter: Kenneth Knowles
Assignee: Kenneth Knowles
Priority: Blocker
 Fix For: 2.2.0


Empirically, naively putting context around the {{DoFnInfo}} could cause a 
blowup of 40%, which is too much and might cause jobs that were well under API 
size limits to start to fail.

There's a certain amount of wiggle room since it is hard to control the 
submission size anyhow, but 40% is way too much.
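
To make the concern concrete, here is a small arithmetic sketch. The 10 MB limit and the 8 MB job size are assumptions chosen purely for illustration; the actual API limit is not stated in this issue, and SubmissionSizeSketch is a hypothetical name.

```java
// Hypothetical sizes for illustration only; the real API limit is not stated here.
class SubmissionSizeSketch {
  // Returns the size after growing by a whole-number percentage.
  static long grownBy(long bytes, int percent) {
    return bytes + bytes * percent / 100;
  }

  // True if a job that fit under the limit no longer fits after the growth.
  static boolean startsToFail(long bytes, int percent, long limitBytes) {
    return bytes <= limitBytes && grownBy(bytes, percent) > limitBytes;
  }

  public static void main(String[] args) {
    long limit = 10_000_000L;  // assumed limit
    long before = 8_000_000L;  // assumed job size, 20% under the limit
    System.out.println(grownBy(before, 40));
    System.out.println(startsToFail(before, 40, limit));
  }
}
```

Under these assumed numbers, a job with 20% headroom ends up over the limit after a 40% blowup, while a job at half the limit would still fit.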



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[beam-site] branch mergebot updated (7e8a86b -> 8c78c9a)

2017-09-22 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a change to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


from 7e8a86b  This closes #311
 add f478921  Prepare repository for deployment.
 new d9e10e6  Update Mapreduce capability matrix when/how entries
 new 8c78c9a  This closes #324

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 content/get-started/wordcount-example/index.html | 370 ---
 src/_data/capability-matrix.yml  |  30 +-
 2 files changed, 281 insertions(+), 119 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
['"commits@beam.apache.org" '].


[beam-site] 02/02: This closes #324

2017-09-22 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 8c78c9a15b189d9845f7715f64993f661e12798a
Merge: f478921 d9e10e6
Author: Mergebot 
AuthorDate: Fri Sep 22 18:43:33 2017 +

This closes #324

 src/_data/capability-matrix.yml | 30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
"commits@beam.apache.org" .


[beam-site] 01/02: Update Mapreduce capability matrix when/how entries

2017-09-22 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit d9e10e65880ffb3b936cfe38e6d75d3f940e1f34
Author: melissa 
AuthorDate: Thu Sep 21 09:41:58 2017 -0700

Update Mapreduce capability matrix when/how entries
---
 src/_data/capability-matrix.yml | 30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/src/_data/capability-matrix.yml b/src/_data/capability-matrix.yml
index 1c1171d..b0ea35a 100644
--- a/src/_data/capability-matrix.yml
+++ b/src/_data/capability-matrix.yml
@@ -577,7 +577,7 @@ categories:
 l3: ''
   - class: mapreduce
 l1: 'No'
-l2: It is a batch-only runner, and intermediate trigger firings are effectively meaningless.
+l2: batch-only runner
 l3: ''
 
   - name: Event-time triggers
@@ -608,7 +608,7 @@ categories:
 l3: ''
   - class: mapreduce
 l1: 'No'
-l2: Currently watermark progress jumps from the beginning of time to the end of time once the input has been fully consumed, thus no additional triggering granularity is available.
+l2: ''
 l3: ''
 
   - name: Processing-time triggers
@@ -639,7 +639,7 @@ categories:
 l3: ''
   - class: mapreduce
 l1: 'No'
-l2: From the perspective of triggers, processing time currently jumps from the beginning of time to the end of time once the input has been fully consumed, thus no additional triggering granularity is available.
+l2: ''
 l3: ''
 
   - name: Count triggers
@@ -670,7 +670,7 @@ categories:
 l3: ''
   - class: mapreduce
 l1: 'No'
-l2: Elements are processed in the largest bundles possible, so count-based triggers are effectively meaningless.
+l2: ''
 l3: ''
 
   - name: '[Meta]data driven triggers'
@@ -702,7 +702,7 @@ categories:
 l3:
   - class: mapreduce
 l1: 'No'
-l2: pending model support
+l2: ''
 l3:
 
   - name: Composite triggers
@@ -732,8 +732,8 @@ categories:
 l2: ''
 l3: ''
   - class: mapreduce
-l1: 'Yes'
-l2: fully supported
+l1: 'No'
+l2: ''
 l3: ''
 
   - name: Allowed lateness
@@ -764,7 +764,7 @@ categories:
 l3: ''
   - class: mapreduce
 l1: 'No'
-l2: No data is ever late.
+l2: ''
 l3: ''
 
   - name: Timers
@@ -794,8 +794,8 @@ categories:
 l2: not implemented
 l3: ''
   - class: mapreduce
-l1: 'Partially'
-l2: not implemented
+l1: 'No'
+l2: ''
 l3: ''
 
   - description: How do refinements relate?
@@ -833,8 +833,8 @@ categories:
 l2: fully supported
 l3: ''
   - class: mapreduce
-l1: 'Yes'
-l2: fully supported
+l1: 'No'
+l2: batch-only runner
 l3: ''
 
   - name: Accumulating
@@ -864,8 +864,8 @@ categories:
 l2: ''
 l3: ''
   - class: mapreduce
-l1: 'Yes'
-l2: fully supported
+l1: 'No'
+l2: ''
 l3: ''
 
   - name: 'Accumulating &amp; Retracting'
@@ -897,5 +897,5 @@ categories:
 l3: ''
   - class: mapreduce
 l1: 'No'
-l2: pending model support
+l2: ''
 l3: ''

-- 
To stop receiving notification emails like this one, please contact
"commits@beam.apache.org" .


Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Dataflow #4025

2017-09-22 Thread Apache Jenkins Server
See 


--
Started by user kenn
[EnvInject] - Loading node environment variables.
Building remotely on beam6 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse refs/remotes/origin/pr/3887/head^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/pr/3887/head^{commit} # timeout=10
 > git rev-parse origin/pr/3887/head^{commit} # timeout=10
ERROR: Couldn't find any revision to build. Verify the repository and branch 
configuration for this job.
Retrying after 10 seconds
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse refs/remotes/origin/pr/3887/head^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/pr/3887/head^{commit} # timeout=10
 > git rev-parse origin/pr/3887/head^{commit} # timeout=10
ERROR: Couldn't find any revision to build. Verify the repository and branch 
configuration for this job.
Retrying after 10 seconds
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse refs/remotes/origin/pr/3887/head^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/pr/3887/head^{commit} # timeout=10
 > git rev-parse origin/pr/3887/head^{commit} # timeout=10
ERROR: Couldn't find any revision to build. Verify the repository and branch 
configuration for this job.
Not sending mail to unregistered user k...@google.com
Not sending mail to unregistered user mil...@google.com
Not sending mail to unregistered user 
re...@relax-macbookpro.roam.corp.google.com


[GitHub] beam pull request #3888: [BEAM-2982] Use the SubscriptionProvider in PubsubU...

2017-09-22 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/3888

[BEAM-2982] Use the SubscriptionProvider in PubsubUnboundedSource

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---
During expansion, a ValueProvider may not be accessible. This ensures
that if the subscription is based on a value provider, it will only be
evaluated when that ValueProvider is bound, rather than at construction
time.
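
The idea can be sketched without any Beam dependency. In the sketch below, RuntimeValue, LazySource and DeferredValueSketch are hypothetical stand-ins (RuntimeValue is not Beam's ValueProvider, and LazySource is not PubsubUnboundedSource); the point is only that the source stores the provider at construction time and calls get() when it actually runs.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for a runtime-bound option; not Beam's ValueProvider.
class RuntimeValue<T> implements Supplier<T> {
  private T value;
  private boolean bound;

  // Called once the value is known, e.g. when a templated job is launched.
  void bind(T v) {
    value = v;
    bound = true;
  }

  @Override
  public T get() {
    if (!bound) {
      throw new IllegalStateException("value not available at construction time");
    }
    return value;
  }
}

// Hypothetical source that keeps the provider and reads it only when executed.
class LazySource {
  private final Supplier<String> subscription;

  LazySource(Supplier<String> subscription) {
    // Store the provider itself; do not call get() here.
    this.subscription = subscription;
  }

  String open() {
    return "reading from " + subscription.get();
  }
}

class DeferredValueSketch {
  public static void main(String[] args) {
    RuntimeValue<String> sub = new RuntimeValue<>();
    LazySource source = new LazySource(sub);  // safe: nothing is evaluated yet
    sub.bind("projects/p/subscriptions/s");   // assumed example value
    System.out.println(source.open());
  }
}
```

Had the constructor called get() eagerly, building the source before the value is bound would throw, which mirrors the construction-time failure described above.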



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam b_2982

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3888.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3888


commit 23e56d0dee8c70b33eb0364a1cde94125399dadf
Author: Thomas Groh 
Date:   2017-09-22T18:47:19Z

Use the SubscriptionProvider in PubsubUnboundedSource

During expansion, a ValueProvider may not be accessible. This ensures
that if the subscription is based on a value provider, it will only be
evaluated when that ValueProvider is bound, rather than at construction
time.




---


[jira] [Commented] (BEAM-2982) PubSubIO.readMessages().fromSubscription(...) doesn't work with ValueProvider

2017-09-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16176922#comment-16176922
 ] 

ASF GitHub Bot commented on BEAM-2982:
--

GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/3888

[BEAM-2982] Use the SubscriptionProvider in PubsubUnboundedSource

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---
During expansion, a ValueProvider may not be accessible. This ensures
that if the subscription is based on a value provider, it will only be
evaluated when that ValueProvider is bound, rather than at construction
time.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam b_2982

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3888.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3888


commit 23e56d0dee8c70b33eb0364a1cde94125399dadf
Author: Thomas Groh 
Date:   2017-09-22T18:47:19Z

Use the SubscriptionProvider in PubsubUnboundedSource

During expansion, a ValueProvider may not be accessible. This ensures
that if the subscription is based on a value provider, it will only be
evaluated when that ValueProvider is bound, rather than at construction
time.




> PubSubIO.readMessages().fromSubscription(...) doesn't work with ValueProvider
> -
>
> Key: BEAM-2982
> URL: https://issues.apache.org/jira/browse/BEAM-2982
> Project: Beam
> Issue Type: Bug
> Components: sdk-java-gcp
> Reporter: Ben Chambers
> Assignee: Thomas Groh
>
> Originally reported on Stack Overflow:
> https://stackoverflow.com/questions/46360584/apache-beam-template-runtime-context-error
> ---
> In the `PubsubUnboundedSource#expand` method we create the PubsubSource:
> https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/pubsub/PubsubUnboundedSource.java#L1399
> Creating the PubsubSource calls `getSubscription` which attempts to get the 
> value out of a value provider.
> To support templatization, the PubsubSource needs to take the ValueProvider, 
> and only get the subscription out at pipeline execution time.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)