[jira] [Commented] (BEAM-2286) Improve mobile gaming example user experience

2017-05-12 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-2286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16009175#comment-16009175
 ] 

María GH commented on BEAM-2286:


I encountered the same issues described here. In fact, item 1 is an understatement: my 
Linux machine froze completely for 10 minutes when using the default input. 
Until the default input is changed, a note should be added to the 
documentation:
https://beam.apache.org/get-started/mobile-gaming-example/

Smaller datasets run without any problems


> Improve mobile gaming example user experience
> -
>
> Key: BEAM-2286
> URL: https://issues.apache.org/jira/browse/BEAM-2286
> Project: Beam
>  Issue Type: Improvement
>  Components: examples-java, sdk-py, website
>Reporter: Vikas Kedigehalli
>Assignee: Ahmet Altay
>Priority: Minor
>
> A few things I noticed while running the mobile gaming example that could be 
> improved:
> 1. When running on the direct runner, the default input is too large (20G), so 
> it seems as though the pipeline is stuck, without any progress updates or 
> metrics. This could be improved by using a much smaller dataset by default.
> 2. Even when running on the Dataflow runner, with default worker settings and 
> autoscaling, it still takes more than 30 minutes to run. We could use a much 
> smaller dataset here too.
> Also, the documentation of these examples could be improved, in both the code 
> docstrings and the Beam quick start guide.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (BEAM-2286) Improve mobile gaming example user experience

2017-05-12 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-2286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

María GH updated BEAM-2286:
---
Environment: Linux

> Improve mobile gaming example user experience
> -
>
> Key: BEAM-2286
> URL: https://issues.apache.org/jira/browse/BEAM-2286
> Project: Beam
>  Issue Type: Improvement
>  Components: examples-java, sdk-py, website
> Environment: Linux
>Reporter: Vikas Kedigehalli
>Assignee: Ahmet Altay
>Priority: Minor
>
> A few things I noticed while running the mobile gaming example that could be 
> improved:
> 1. When running on the direct runner, the default input is too large (20G), so 
> it seems as though the pipeline is stuck, without any progress updates or 
> metrics. This could be improved by using a much smaller dataset by default.
> 2. Even when running on the Dataflow runner, with default worker settings and 
> autoscaling, it still takes more than 30 minutes to run. We could use a much 
> smaller dataset here too.
> Also, the documentation of these examples could be improved, in both the code 
> docstrings and the Beam quick start guide.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (BEAM-2286) Improve mobile gaming example user experience

2017-05-12 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-2286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16009175#comment-16009175
 ] 

María GH edited comment on BEAM-2286 at 5/13/17 6:39 AM:
-

I encountered the same issues described here. In fact, item 1 is an understatement: my 
Linux machine froze completely for 10 minutes when using the default input. 
Until the default input is changed, a note should be added to the 
documentation:
https://beam.apache.org/get-started/mobile-gaming-example/

Smaller datasets run without any problems.



was (Author: mariagh):
I encountered the same issues described here. In fact, item 1 is an understatement: my 
Linux machine froze completely for 10 minutes when using the default input. 
Until the default input is changed, a note should be added to the 
documentation:
https://beam.apache.org/get-started/mobile-gaming-example/

Smaller datasets run without any problems


> Improve mobile gaming example user experience
> -
>
> Key: BEAM-2286
> URL: https://issues.apache.org/jira/browse/BEAM-2286
> Project: Beam
>  Issue Type: Improvement
>  Components: examples-java, sdk-py, website
>Reporter: Vikas Kedigehalli
>Assignee: Ahmet Altay
>Priority: Minor
>
> A few things I noticed while running the mobile gaming example that could be 
> improved:
> 1. When running on the direct runner, the default input is too large (20G), so 
> it seems as though the pipeline is stuck, without any progress updates or 
> metrics. This could be improved by using a much smaller dataset by default.
> 2. Even when running on the Dataflow runner, with default worker settings and 
> autoscaling, it still takes more than 30 minutes to run. We could use a much 
> smaller dataset here too.
> Also, the documentation of these examples could be improved, in both the code 
> docstrings and the Beam quick start guide.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-2195) support conditional functions & operators

2017-05-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16009162#comment-16009162
 ] 

ASF GitHub Bot commented on BEAM-2195:
--

Github user xumingming closed the pull request at:

https://github.com/apache/beam/pull/3042


> support conditional functions & operators
> -
>
> Key: BEAM-2195
> URL: https://issues.apache.org/jira/browse/BEAM-2195
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: James Xu
>Assignee: James Xu
> Fix For: 2.1.0
>
>
> https://calcite.apache.org/docs/reference.html#conditional-functions-and-operators
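For context, a CASE expression is evaluated over a flat operand list of the form
[when1, then1, when2, then2, ..., elseValue]. A minimal illustrative sketch of that
evaluation order in plain Java (not the committed Beam code; names are assumed):

import java.util.List;

class CaseSketch {
  // when conditions are checked in order; the first true one selects its then
  // value, otherwise the trailing else value is returned.
  static Object evaluateCase(List<Object> operands) {
    for (int i = 0; i + 1 < operands.size(); i += 2) {
      if (Boolean.TRUE.equals(operands.get(i))) {
        return operands.get(i + 1);
      }
    }
    return operands.get(operands.size() - 1);
  }
}

The committed BeamSqlCaseExpression (see the [1/2] commit later in this digest)
additionally validates in accept() that the when operands are BOOLEAN and that each
then branch matches the expression's output type.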



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #3042: [BEAM-2195] support conditional operators

2017-05-12 Thread xumingming
Github user xumingming closed the pull request at:

https://github.com/apache/beam/pull/3042


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Resolved] (BEAM-2195) support conditional functions & operators

2017-05-12 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-2195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Baptiste Onofré resolved BEAM-2195.

   Resolution: Fixed
Fix Version/s: 2.1.0

> support conditional functions & operators
> -
>
> Key: BEAM-2195
> URL: https://issues.apache.org/jira/browse/BEAM-2195
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: James Xu
>Assignee: James Xu
> Fix For: 2.1.0
>
>
> https://calcite.apache.org/docs/reference.html#conditional-functions-and-operators



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #2809

2017-05-12 Thread Apache Jenkins Server
See 




[2/2] beam git commit: [BEAM-2195] This closes #3042

2017-05-12 Thread jbonofre
[BEAM-2195] This closes #3042


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/8bb59840
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/8bb59840
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/8bb59840

Branch: refs/heads/DSL_SQL
Commit: 8bb59840be32ef945cfb7151ab7f8368fabb19f0
Parents: 523482b 12dd804
Author: Jean-Baptiste Onofré 
Authored: Sat May 13 08:09:08 2017 +0200
Committer: Jean-Baptiste Onofré 
Committed: Sat May 13 08:09:08 2017 +0200

--
 .../dsls/sql/interpreter/BeamSQLFnExecutor.java |  3 +
 .../operator/BeamSqlCaseExpression.java | 64 +
 .../sql/interpreter/BeamSQLFnExecutorTest.java  | 11 +++
 .../operator/BeamSqlCaseExpressionTest.java | 94 
 4 files changed, 172 insertions(+)
--




[1/2] beam git commit: [BEAM-2195] Implement conditional operator (CASE)

2017-05-12 Thread jbonofre
Repository: beam
Updated Branches:
  refs/heads/DSL_SQL 523482be0 -> 8bb59840b


[BEAM-2195] Implement conditional operator (CASE)


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/12dd8046
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/12dd8046
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/12dd8046

Branch: refs/heads/DSL_SQL
Commit: 12dd804678bfaad4705c8f1d50eaf03a086f6daf
Parents: 523482b
Author: James Xu 
Authored: Wed May 10 17:19:48 2017 +0800
Committer: Jean-Baptiste Onofré 
Committed: Sat May 13 08:08:23 2017 +0200

--
 .../dsls/sql/interpreter/BeamSQLFnExecutor.java |  3 +
 .../operator/BeamSqlCaseExpression.java | 64 +
 .../sql/interpreter/BeamSQLFnExecutorTest.java  | 11 +++
 .../operator/BeamSqlCaseExpressionTest.java | 94 
 4 files changed, 172 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/12dd8046/dsls/sql/src/main/java/org/apache/beam/dsls/sql/interpreter/BeamSQLFnExecutor.java
--
diff --git 
a/dsls/sql/src/main/java/org/apache/beam/dsls/sql/interpreter/BeamSQLFnExecutor.java
 
b/dsls/sql/src/main/java/org/apache/beam/dsls/sql/interpreter/BeamSQLFnExecutor.java
index 4ae7b33..4b7af2a 100644
--- 
a/dsls/sql/src/main/java/org/apache/beam/dsls/sql/interpreter/BeamSQLFnExecutor.java
+++ 
b/dsls/sql/src/main/java/org/apache/beam/dsls/sql/interpreter/BeamSQLFnExecutor.java
@@ -22,6 +22,7 @@ import java.util.List;
 
 import org.apache.beam.dsls.sql.exception.BeamSqlUnsupportedException;
 import org.apache.beam.dsls.sql.interpreter.operator.BeamSqlAndExpression;
+import org.apache.beam.dsls.sql.interpreter.operator.BeamSqlCaseExpression;
 import org.apache.beam.dsls.sql.interpreter.operator.BeamSqlEqualExpression;
 import org.apache.beam.dsls.sql.interpreter.operator.BeamSqlExpression;
 import org.apache.beam.dsls.sql.interpreter.operator.BeamSqlInputRefExpression;
@@ -159,6 +160,8 @@ public class BeamSQLFnExecutor implements 
BeamSQLExpressionExecutor {
 case "INITCAP":
   return new BeamSqlInitCapExpression(subExps);
 
+case "CASE":
+  return new BeamSqlCaseExpression(subExps);
 
 case "IS NULL":
   return new BeamSqlIsNullExpression(subExps.get(0));

http://git-wip-us.apache.org/repos/asf/beam/blob/12dd8046/dsls/sql/src/main/java/org/apache/beam/dsls/sql/interpreter/operator/BeamSqlCaseExpression.java
--
diff --git 
a/dsls/sql/src/main/java/org/apache/beam/dsls/sql/interpreter/operator/BeamSqlCaseExpression.java
 
b/dsls/sql/src/main/java/org/apache/beam/dsls/sql/interpreter/operator/BeamSqlCaseExpression.java
new file mode 100644
index 000..d108abd
--- /dev/null
+++ 
b/dsls/sql/src/main/java/org/apache/beam/dsls/sql/interpreter/operator/BeamSqlCaseExpression.java
@@ -0,0 +1,64 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.dsls.sql.interpreter.operator;
+
+import java.util.List;
+
+import org.apache.beam.dsls.sql.schema.BeamSQLRow;
+import org.apache.calcite.sql.type.SqlTypeName;
+
+/**
+ *  {@code BeamSqlCaseExpression} represents CASE, NULLIF, COALESCE in SQL.
+ */
+public class BeamSqlCaseExpression extends BeamSqlExpression {
+  public BeamSqlCaseExpression(List operands) {
+// the return type of CASE is the type of the `else` condition
+super(operands, operands.get(operands.size() - 1).getOutputType());
+  }
+
+  @Override public boolean accept() {
+// `when`-`then` pair + `else`
+if (operands.size() % 2 != 1) {
+  return false;
+}
+
+for (int i = 0; i < operands.size() - 1; i += 2) {
+  if (opType(i) != SqlTypeName.BOOLEAN) {
+return false;
+  } else if (opType(i + 1) != outputType) {
+return false;
+  }
+}
+
+return true;
+  }
+
+  @Override public BeamSqlPrimitive evaluate(BeamSQLRow inputRecord) {
+for (int i = 0; i < operands.size() - 1; i += 2) {
+  if (op

Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #3123

2017-05-12 Thread Apache Jenkins Server
See 




[jira] [Resolved] (BEAM-2280) Examples archetypes build fail when upgrading

2017-05-12 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Baptiste Onofré resolved BEAM-2280.

   Resolution: Duplicate
Fix Version/s: Not applicable

> Examples archetypes build fail when upgrading
> -
>
> Key: BEAM-2280
> URL: https://issues.apache.org/jira/browse/BEAM-2280
> Project: Beam
>  Issue Type: Bug
>  Components: examples-java
>Reporter: Pei He
>Assignee: Pei He
>Priority: Minor
> Fix For: Not applicable
>
>
> WriteWindowedFilesDoFn.java was removed in 2.0.0, but exists in previous 
> versions.
> However, maven-archetypes/examples/generate-sources.sh only rsyncs existing 
> files, and won't remove the leftover WriteWindowedFilesDoFn.java.
> So, if users have built the module with 0.6.0, the build will fail after 
> upgrading unless WriteWindowedFilesDoFn.java is removed manually.
> LOG
> [INFO] [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.5.1:compile 
> (default-compile) on project basic: Compilation failure: Compilation failure:
> [INFO] [ERROR] 
> /Users/peihe/github/beam/sdks/java/maven-archetypes/examples-java8/target/test-classes/projects/basic/project/basic/src/main/java/it/pkg/common/WriteWindowedFilesDoFn.java:[28,32]
>  cannot find symbol
> [INFO] [ERROR] symbol:   class IOChannelFactory
> [INFO] [ERROR] location: package org.apache.beam.sdk.util
> [INFO] [ERROR] 
> /Users/peihe/github/beam/sdks/java/maven-archetypes/examples-java8/target/test-classes/projects/basic/project/basic/src/main/java/it/pkg/common/WriteWindowedFilesDoFn.java:[29,32]
>  cannot find symbol
> [INFO] [ERROR] symbol:   class IOChannelUtils
> [INFO] [ERROR] location: package org.apache.beam.sdk.util
> [INFO] [ERROR] -> [Help 1]
> [INFO] org.apache.maven.lifecycle.LifecycleExecutionException: Failed to 
> execute goal org.apache.maven.plugins:maven-compiler-plugin:3.5.1:compile 
> (default-compile) on project basic: Compilation failure



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


svn commit: r19653 - /dev/beam/2.0.0-RC4/

2017-05-12 Thread davor
Author: davor
Date: Sat May 13 04:40:17 2017
New Revision: 19653

Log:
Add Apache Beam, version 2.0.0, release candidate #4


Added:
dev/beam/2.0.0-RC4/
dev/beam/2.0.0-RC4/apache-beam-2.0.0-python.zip   (with props)
dev/beam/2.0.0-RC4/apache-beam-2.0.0-python.zip.asc
dev/beam/2.0.0-RC4/apache-beam-2.0.0-python.zip.md5
dev/beam/2.0.0-RC4/apache-beam-2.0.0-python.zip.sha1
dev/beam/2.0.0-RC4/apache-beam-2.0.0-source-release.zip   (with props)
dev/beam/2.0.0-RC4/apache-beam-2.0.0-source-release.zip.asc
dev/beam/2.0.0-RC4/apache-beam-2.0.0-source-release.zip.md5
dev/beam/2.0.0-RC4/apache-beam-2.0.0-source-release.zip.sha1

Added: dev/beam/2.0.0-RC4/apache-beam-2.0.0-python.zip
==
Binary file - no diff available.

Propchange: dev/beam/2.0.0-RC4/apache-beam-2.0.0-python.zip
--
svn:mime-type = application/octet-stream

Added: dev/beam/2.0.0-RC4/apache-beam-2.0.0-python.zip.asc
==
--- dev/beam/2.0.0-RC4/apache-beam-2.0.0-python.zip.asc (added)
+++ dev/beam/2.0.0-RC4/apache-beam-2.0.0-python.zip.asc Sat May 13 04:40:17 2017
@@ -0,0 +1,11 @@
+-BEGIN PGP SIGNATURE-
+Version: GnuPG v1
+
+iQEcBAABAgAGBQJZFo3pAAoJEMkEN+GPDTNPIL8H/Aw98I6RFpLcIEITnTh1EZeG
+ZQZR5OPWF2uZJiseYrcJKeh/V33Kphnlj0w2CH4emHDgMLTQhuAC50pFif82OVEm
+Ogq1BykjZsvZAbSN3me83TD5jqZaKfbzcQOP1Q3QHI1j92OVRhGBohgU7BUEnA4I
+ekHeWkFxpudo6HstE+o+Xfhps5Fh6Ucm1USVrN94apGAcdmgPHKxcjeV9zG0EMYG
+i5z0FmXhG13jTfWEpd3qkPqlGeHXuDCopEH0e++HOMo7EELnvQsLhsdxko7WA4sM
+6MCSMugUGc0EmKI3anxu8kfyzIERZl3CVucwIgxECVYlHCUG8TsSVJOZ/Y6d5Ak=
+=wtZK
+-END PGP SIGNATURE-

Added: dev/beam/2.0.0-RC4/apache-beam-2.0.0-python.zip.md5
==
--- dev/beam/2.0.0-RC4/apache-beam-2.0.0-python.zip.md5 (added)
+++ dev/beam/2.0.0-RC4/apache-beam-2.0.0-python.zip.md5 Sat May 13 04:40:17 2017
@@ -0,0 +1 @@
+ec2eee1939284afa3993ea4c55cea8f3  apache-beam-2.0.0-python.zip

Added: dev/beam/2.0.0-RC4/apache-beam-2.0.0-python.zip.sha1
==
--- dev/beam/2.0.0-RC4/apache-beam-2.0.0-python.zip.sha1 (added)
+++ dev/beam/2.0.0-RC4/apache-beam-2.0.0-python.zip.sha1 Sat May 13 04:40:17 
2017
@@ -0,0 +1 @@
+eae09f8bd00264ab297f93c28929fb381daad5d4  apache-beam-2.0.0-python.zip

Added: dev/beam/2.0.0-RC4/apache-beam-2.0.0-source-release.zip
==
Binary file - no diff available.

Propchange: dev/beam/2.0.0-RC4/apache-beam-2.0.0-source-release.zip
--
svn:mime-type = application/octet-stream

Added: dev/beam/2.0.0-RC4/apache-beam-2.0.0-source-release.zip.asc
==
--- dev/beam/2.0.0-RC4/apache-beam-2.0.0-source-release.zip.asc (added)
+++ dev/beam/2.0.0-RC4/apache-beam-2.0.0-source-release.zip.asc Sat May 13 
04:40:17 2017
@@ -0,0 +1,11 @@
+-BEGIN PGP SIGNATURE-
+Version: GnuPG v1
+
+iQEcBAABAgAGBQJZFo4CAAoJEMkEN+GPDTNPWYEH/0U1cAyS8tMQDHSIBKws2eHx
++HestkX3b5Skl8s6PanJXGxnnGvGRCaVkDgIyhK56OKtcMikhAPRyvbuCr8dvrog
+lme8fjHoEqPgfKdkyMJyMuva25vhDAa30aX33UkC3hU/+9Mmk+nIJZ8as+/H1zan
++vK0YTnPeivwe+csnygxKtAj2WvQgohHbXC4nm6Hrx1G1pawojVZKECrPdE4EPu3
+lv44btdkcLN1XcWi/Mt7SJhQQJUC4nCuLAMR1sTVXr6992EagjYx7lIAh3/gGnX4
+iDCXR/ZNEfdYZ5kTpZ+GSVdALxjv4/+07rsdIvkBLLUw8dCvksk6cLueUO0VyWk=
+=PjzK
+-END PGP SIGNATURE-

Added: dev/beam/2.0.0-RC4/apache-beam-2.0.0-source-release.zip.md5
==
--- dev/beam/2.0.0-RC4/apache-beam-2.0.0-source-release.zip.md5 (added)
+++ dev/beam/2.0.0-RC4/apache-beam-2.0.0-source-release.zip.md5 Sat May 13 
04:40:17 2017
@@ -0,0 +1 @@
+6931d1ad79aed5d25330da27af0c792e  apache-beam-2.0.0-source-release.zip

Added: dev/beam/2.0.0-RC4/apache-beam-2.0.0-source-release.zip.sha1
==
--- dev/beam/2.0.0-RC4/apache-beam-2.0.0-source-release.zip.sha1 (added)
+++ dev/beam/2.0.0-RC4/apache-beam-2.0.0-source-release.zip.sha1 Sat May 13 
04:40:17 2017
@@ -0,0 +1 @@
+4713e8c44021cd9ec8b06d93aa6b4d86a704c3c6  apache-beam-2.0.0-source-release.zip




[jira] [Created] (BEAM-2289) add unit tests for BeamSQLFilterFn and BeamSQLProjectFn

2017-05-12 Thread Xu Mingmin (JIRA)
Xu Mingmin created BEAM-2289:


 Summary: add unit tests for BeamSQLFilterFn and BeamSQLProjectFn
 Key: BEAM-2289
 URL: https://issues.apache.org/jira/browse/BEAM-2289
 Project: Beam
  Issue Type: Improvement
  Components: dsl-sql
Reporter: Xu Mingmin
Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-2288) support PAssert/TestPipeline for generated pipeline

2017-05-12 Thread Xu Mingmin (JIRA)
Xu Mingmin created BEAM-2288:


 Summary: support PAssert/TestPipeline for generated pipeline
 Key: BEAM-2288
 URL: https://issues.apache.org/jira/browse/BEAM-2288
 Project: Beam
  Issue Type: Improvement
  Components: dsl-sql
Reporter: Xu Mingmin
Assignee: Xu Mingmin


Support testing Beam-generated pipelines with the standard Beam 
{{PAssert}}/{{TestPipeline}}.
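As a reference for what this enables, here is a minimal sketch of the standard
TestPipeline/PAssert pattern (well-known Beam testing APIs; the Create.of(...) input
is a placeholder standing in for a SQL-generated PCollection):

import org.apache.beam.sdk.testing.PAssert;
import org.apache.beam.sdk.testing.TestPipeline;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.values.PCollection;
import org.junit.Rule;
import org.junit.Test;

public class GeneratedPipelineTest {
  // TestPipeline is used as a JUnit rule; it is transient so serialization of
  // the test class does not try to capture it.
  @Rule public final transient TestPipeline pipeline = TestPipeline.create();

  @Test
  public void testGeneratedOutput() {
    // Placeholder input; a pipeline generated from SQL would be asserted the same way.
    PCollection<String> output = pipeline.apply(Create.of("a", "b", "c"));
    PAssert.that(output).containsInAnyOrder("a", "b", "c");
    pipeline.run().waitUntilFinish();
  }
}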



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-2234) expose PCollection of each RelNode

2017-05-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16009124#comment-16009124
 ] 

ASF GitHub Bot commented on BEAM-2234:
--

GitHub user XuMingmin opened a pull request:

https://github.com/apache/beam/pull/3135

[BEAM-2234] expose PCollection of each RelNode

Change the return of `buildBeamPipeline` to `PCollection`; this helps to:
1). avoid managing the upstream-downstream relationship in `BeamPipelineCreator`, 
since the execution tree already has it;
2). make unit testing with `PAssert` easy;

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/XuMingmin/beam BEAM-2234

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3135.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3135


commit ebd263c9e29515038b1ee00dc423da44bef65362
Author: mingmxu 
Date:   2017-05-13T04:14:29Z

change return type of buildBeamPipeline to `PCollection`




> expose PCollection of each RelNode
> --
>
> Key: BEAM-2234
> URL: https://issues.apache.org/jira/browse/BEAM-2234
> Project: Beam
>  Issue Type: Task
>  Components: dsl-sql
>Reporter: Xu Mingmin
>Assignee: Xu Mingmin
>
> Change the return of {{buildBeamPipeline}} to {{PCollection}}; this helps to:
> 1). avoid managing the upstream-downstream relationship, since the execution 
> tree already has it;
> 2). make unit testing with {{PAssert}} easy;



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #3135: [BEAM-2234] expose PCollection of each RelNode

2017-05-12 Thread XuMingmin
GitHub user XuMingmin opened a pull request:

https://github.com/apache/beam/pull/3135

[BEAM-2234] expose PCollection of each RelNode

Change the return of `buildBeamPipeline` to `PCollection`; this helps to:
1). avoid managing the upstream-downstream relationship in `BeamPipelineCreator`, 
since the execution tree already has it;
2). make unit testing with `PAssert` easy;

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/XuMingmin/beam BEAM-2234

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3135.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3135


commit ebd263c9e29515038b1ee00dc423da44bef65362
Author: mingmxu 
Date:   2017-05-13T04:14:29Z

change return type of buildBeamPipeline to `PCollection`




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Flink #2808

2017-05-12 Thread Apache Jenkins Server
See 




[jira] [Resolved] (BEAM-1345) Mark @Experimental and @Internal where needed in user-facing bits of the codebase

2017-05-12 Thread Davor Bonaci (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davor Bonaci resolved BEAM-1345.

Resolution: Fixed

> Mark @Experimental and @Internal where needed in user-facing bits of the 
> codebase
> -
>
> Key: BEAM-1345
> URL: https://issues.apache.org/jira/browse/BEAM-1345
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core, sdk-java-extensions, sdk-java-gcp
>Reporter: Kenneth Knowles
>  Labels: backward-incompatible
> Fix For: 2.0.0
>
>
> A blanket JIRA to ensure that, before a stable release, we mark those pieces 
> that would be unwise to freeze yet, and consider how best to communicate this 
> to users, who may simply autocomplete those features in their IDE anyhow. 
> (Conversely, put infrastructure in place to enforce freezing of the rest.)
> Not technically "backwards incompatible" with pre-Beam Dataflow SDKs, but this 
> certainly needs to be on the burndown list for the first stable release.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Jenkins build is back to normal : beam_PostCommit_Python_Verify #2215

2017-05-12 Thread Apache Jenkins Server
See 




[jira] [Updated] (BEAM-2234) expose PCollection of each RelNode

2017-05-12 Thread Xu Mingmin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Mingmin updated BEAM-2234:
-
Summary: expose PCollection of each RelNode  (was: expose result 
PCollection)

> expose PCollection of each RelNode
> --
>
> Key: BEAM-2234
> URL: https://issues.apache.org/jira/browse/BEAM-2234
> Project: Beam
>  Issue Type: Task
>  Components: dsl-sql
>Reporter: Xu Mingmin
>Assignee: Xu Mingmin
>
> To enable unit testing with the standard {{TestPipeline}}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (BEAM-2234) expose PCollection of each RelNode

2017-05-12 Thread Xu Mingmin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Mingmin updated BEAM-2234:
-
Description: 
Change the return of {{buildBeamPipeline}} to {{PCollection}}; this helps to:
1). avoid managing the upstream-downstream relationship, since the execution tree 
already has it;
2). make unit testing with {{PAssert}} easy;

  was: To enable unit testing with the standard {{TestPipeline}}


> expose PCollection of each RelNode
> --
>
> Key: BEAM-2234
> URL: https://issues.apache.org/jira/browse/BEAM-2234
> Project: Beam
>  Issue Type: Task
>  Components: dsl-sql
>Reporter: Xu Mingmin
>Assignee: Xu Mingmin
>
> Change the return of {{buildBeamPipeline}} to {{PCollection}}; this helps to:
> 1). avoid managing the upstream-downstream relationship, since the execution 
> tree already has it;
> 2). make unit testing with {{PAssert}} easy;



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-2196) support UDF

2017-05-12 Thread Xu Mingmin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16009107#comment-16009107
 ] 

Xu Mingmin commented on BEAM-2196:
--

[~xumingming], thanks for documenting it. Overall it looks good to me, but I'd 
prefer to focus on UDF here first. I have one implementation already and could 
submit it if you haven't started.

Btw, I cannot receive JIRA notification emails, so feel free to ping me on Slack 
if there is no response.

> support UDF
> ---
>
> Key: BEAM-2196
> URL: https://issues.apache.org/jira/browse/BEAM-2196
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: James Xu
>Assignee: James Xu
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-2287) abstract aggregation holder for UDAF support

2017-05-12 Thread Xu Mingmin (JIRA)
Xu Mingmin created BEAM-2287:


 Summary: abstract aggregation holder for UDAF support
 Key: BEAM-2287
 URL: https://issues.apache.org/jira/browse/BEAM-2287
 Project: Beam
  Issue Type: Improvement
  Components: dsl-sql
Reporter: Xu Mingmin
Assignee: Xu Mingmin


Create an aggregation wrapper to accept UDAF functions.
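For illustration, a user-defined aggregation can be written against the SDK's
Combine.CombineFn contract; the wrapper would then delegate createAccumulator/
addInput/mergeAccumulators/extractOutput to it. A sketch only (the class name is
hypothetical):

import org.apache.beam.sdk.transforms.Combine;

// Hypothetical UDAF: sums the squares of its integer inputs.
public class SumOfSquaresFn extends Combine.CombineFn<Integer, Long, Long> {
  @Override public Long createAccumulator() {
    return 0L;
  }
  @Override public Long addInput(Long accumulator, Integer input) {
    return accumulator + (long) input * input;
  }
  @Override public Long mergeAccumulators(Iterable<Long> accumulators) {
    long total = 0L;
    for (Long a : accumulators) {
      total += a;
    }
    return total;
  }
  @Override public Long extractOutput(Long accumulator) {
    return accumulator;
  }
}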



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (BEAM-2008) aggregation functions support

2017-05-12 Thread Xu Mingmin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Mingmin resolved BEAM-2008.
--
   Resolution: Fixed
Fix Version/s: Not applicable

> aggregation functions support
> -
>
> Key: BEAM-2008
> URL: https://issues.apache.org/jira/browse/BEAM-2008
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Xu Mingmin
>Assignee: Xu Mingmin
> Fix For: Not applicable
>
>
> Support commonly used aggregation functions in SQL, including:
> COUNT
> SUM
> MAX
> MIN
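For reference, these SQL aggregations roughly correspond to the SDK's built-in
combiners. A minimal sketch using standard Beam transforms (the PCollection<Integer>
named scores is a hypothetical input):

import org.apache.beam.sdk.transforms.Count;
import org.apache.beam.sdk.transforms.Max;
import org.apache.beam.sdk.transforms.Min;
import org.apache.beam.sdk.transforms.Sum;
import org.apache.beam.sdk.values.PCollection;

public class AggregationSketch {
  // Applies the SDK combiners that back COUNT/SUM/MAX/MIN-style aggregations.
  public static void applyAggregations(PCollection<Integer> scores) {
    PCollection<Long> count = scores.apply("count", Count.<Integer>globally());
    PCollection<Integer> sum = scores.apply("sum", Sum.integersGlobally());
    PCollection<Integer> max = scores.apply("max", Max.integersGlobally());
    PCollection<Integer> min = scores.apply("min", Min.integersGlobally());
  }
}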



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #3122

2017-05-12 Thread Apache Jenkins Server
See 




Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #3121

2017-05-12 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : beam_PostCommit_Java_MavenInstall #3800

2017-05-12 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2008) aggregation functions support

2017-05-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16009001#comment-16009001
 ] 

ASF GitHub Bot commented on BEAM-2008:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3067


> aggregation functions support
> -
>
> Key: BEAM-2008
> URL: https://issues.apache.org/jira/browse/BEAM-2008
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Xu Mingmin
>Assignee: Xu Mingmin
>
> Support commonly used aggregation functions in SQL, including:
> COUNT
> SUM
> MAX
> MIN



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #3067: [BEAM-2008] aggregation functions support

2017-05-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3067


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] beam git commit: Support common-used aggregation functions in SQL, including: COUNT, SUM, AVG, MAX, MIN

2017-05-12 Thread dhalperi
Repository: beam
Updated Branches:
  refs/heads/DSL_SQL 6729a027d -> 523482be0


Support common-used aggregation functions in SQL, including:
  COUNT,SUM,AVG,MAX,MIN

rename BeamAggregationTransform to BeamAggregationTransforms

update comments


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/f728fbe5
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/f728fbe5
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/f728fbe5

Branch: refs/heads/DSL_SQL
Commit: f728fbe5c7153341ace046fa8b2465ef8844be1b
Parents: 6729a02
Author: mingmxu 
Authored: Wed May 10 16:38:13 2017 -0700
Committer: mingmxu 
Committed: Wed May 10 20:47:40 2017 -0700

--
 .../interpreter/operator/BeamSqlPrimitive.java  |  35 +
 .../beam/dsls/sql/rel/BeamAggregationRel.java   |  40 +-
 .../apache/beam/dsls/sql/schema/BeamSQLRow.java |   4 +
 .../beam/dsls/sql/schema/BeamSqlRowCoder.java   |   4 +-
 .../sql/transform/BeamAggregationTransform.java | 120 
 .../transform/BeamAggregationTransforms.java| 671 +++
 .../transform/BeamAggregationTransformTest.java | 436 
 .../schema/transform/BeamTransformBaseTest.java |  96 +++
 8 files changed, 1261 insertions(+), 145 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/f728fbe5/dsls/sql/src/main/java/org/apache/beam/dsls/sql/interpreter/operator/BeamSqlPrimitive.java
--
diff --git 
a/dsls/sql/src/main/java/org/apache/beam/dsls/sql/interpreter/operator/BeamSqlPrimitive.java
 
b/dsls/sql/src/main/java/org/apache/beam/dsls/sql/interpreter/operator/BeamSqlPrimitive.java
index 3309577..a5938f3 100644
--- 
a/dsls/sql/src/main/java/org/apache/beam/dsls/sql/interpreter/operator/BeamSqlPrimitive.java
+++ 
b/dsls/sql/src/main/java/org/apache/beam/dsls/sql/interpreter/operator/BeamSqlPrimitive.java
@@ -65,6 +65,41 @@ public class BeamSqlPrimitive extends BeamSqlExpression{
 return value;
   }
 
+  public long getLong() {
+return (Long) getValue();
+  }
+
+  public double getDouble() {
+return (Double) getValue();
+  }
+
+  public float getFloat() {
+return (Float) getValue();
+  }
+
+  public int getInteger() {
+return (Integer) getValue();
+  }
+
+  public short getShort() {
+return (Short) getValue();
+  }
+
+  public byte getByte() {
+return (Byte) getValue();
+  }
+  public boolean getBoolean() {
+return (Boolean) getValue();
+  }
+
+  public String getString() {
+return (String) getValue();
+  }
+
+  public Date getDate() {
+return (Date) getValue();
+  }
+
   @Override
   public boolean accept() {
 if (value == null) {

http://git-wip-us.apache.org/repos/asf/beam/blob/f728fbe5/dsls/sql/src/main/java/org/apache/beam/dsls/sql/rel/BeamAggregationRel.java
--
diff --git 
a/dsls/sql/src/main/java/org/apache/beam/dsls/sql/rel/BeamAggregationRel.java 
b/dsls/sql/src/main/java/org/apache/beam/dsls/sql/rel/BeamAggregationRel.java
index 2c7626d..ab98cc4 100644
--- 
a/dsls/sql/src/main/java/org/apache/beam/dsls/sql/rel/BeamAggregationRel.java
+++ 
b/dsls/sql/src/main/java/org/apache/beam/dsls/sql/rel/BeamAggregationRel.java
@@ -18,15 +18,13 @@
 package org.apache.beam.dsls.sql.rel;
 
 import java.util.List;
-import org.apache.beam.dsls.sql.exception.BeamSqlUnsupportedException;
 import org.apache.beam.dsls.sql.planner.BeamPipelineCreator;
 import org.apache.beam.dsls.sql.planner.BeamSQLRelUtils;
 import org.apache.beam.dsls.sql.schema.BeamSQLRecordType;
 import org.apache.beam.dsls.sql.schema.BeamSQLRow;
-import org.apache.beam.dsls.sql.transform.BeamAggregationTransform;
+import org.apache.beam.dsls.sql.transform.BeamAggregationTransforms;
 import org.apache.beam.sdk.Pipeline;
 import org.apache.beam.sdk.transforms.Combine;
-import org.apache.beam.sdk.transforms.Count;
 import org.apache.beam.sdk.transforms.GroupByKey;
 import org.apache.beam.sdk.transforms.ParDo;
 import org.apache.beam.sdk.transforms.WithKeys;
@@ -79,7 +77,7 @@ public class BeamAggregationRel extends Aggregate implements 
BeamRelNode {
 PCollection upstream = planCreator.popUpstream();
 if (windowFieldIdx != -1) {
   upstream = upstream.apply("assignEventTimestamp", WithTimestamps
-  .of(new 
BeamAggregationTransform.WindowTimestampFn(windowFieldIdx)));
+  .of(new 
BeamAggregationTransforms.WindowTimestampFn(windowFieldIdx)));
 }
 
 PCollection windowStream = upstream.apply("window",
@@ -88,32 +86,26 @@ public class BeamAggregationRel extends Aggregate 
implements BeamRelNode {
 .withAllowedLateness(allowedLatence)
 .accumulatingFiredPanes());
 
+//1. extract fields in group-by key part
 PCollection> exGroupByStream = 
windowStream.apply("exGroupBy",

[2/2] beam git commit: This closes #3067

2017-05-12 Thread dhalperi
This closes #3067


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/523482be
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/523482be
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/523482be

Branch: refs/heads/DSL_SQL
Commit: 523482be0501a7bce79087f47c7752b900178a00
Parents: 6729a02 f728fbe
Author: Dan Halperin 
Authored: Fri May 12 17:47:06 2017 -0700
Committer: Dan Halperin 
Committed: Fri May 12 17:47:06 2017 -0700

--
 .../interpreter/operator/BeamSqlPrimitive.java  |  35 +
 .../beam/dsls/sql/rel/BeamAggregationRel.java   |  40 +-
 .../apache/beam/dsls/sql/schema/BeamSQLRow.java |   4 +
 .../beam/dsls/sql/schema/BeamSqlRowCoder.java   |   4 +-
 .../sql/transform/BeamAggregationTransform.java | 120 
 .../transform/BeamAggregationTransforms.java| 671 +++
 .../transform/BeamAggregationTransformTest.java | 436 
 .../schema/transform/BeamTransformBaseTest.java |  96 +++
 8 files changed, 1261 insertions(+), 145 deletions(-)
--




Build failed in Jenkins: beam_PostCommit_Python_Verify #2214

2017-05-12 Thread Apache Jenkins Server
See 


Changes:

[dhalperi] [BEAM-2279] Fix archetype breakages

--
[...truncated 591.57 KB...]
{
  "kind": "ParallelDo", 
  "name": "s14", 
  "properties": {
"display_data": [
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.CallableWrapperDoFn", 
"type": "STRING", 
"value": "_merge_tagged_vals_under_key"
  }, 
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.ParDo", 
"shortValue": "CallableWrapperDoFn", 
"type": "STRING", 
"value": "apache_beam.transforms.core.CallableWrapperDoFn"
  }
], 
"non_parallel_inputs": {}, 
"output_info": [
  {
"encoding": {
  "@type": "kind:windowed_value", 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}, 
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}
  ], 
  "is_pair_like": true
}, 
{
  "@type": "kind:global_window"
}
  ], 
  "is_wrapper": true
}, 
"output_name": "out", 
"user_name": 
"assert_that/Group/Map(_merge_tagged_vals_under_key).out"
  }
], 
"parallel_input": {
  "@type": "OutputReference", 
  "output_name": "out", 
  "step_name": "s13"
}, 
"serialized_fn": "", 
"user_name": "assert_that/Group/Map(_merge_tagged_vals_under_key)"
  }
}, 
{
  "kind": "ParallelDo", 
  "name": "s15", 
  "properties": {
"display_data": [
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.CallableWrapperDoFn", 
"type": "STRING", 
"value": ""
  }, 
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.ParDo", 
"shortValue": "CallableWrapperDoFn", 
"type": "STRING", 
"value": "apache_beam.transforms.core.CallableWrapperDoFn"
  }
], 
"non_parallel_inputs": {}, 
"output_info": [
  {
"encoding": {
  "@type": "kind:windowed_value", 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}, 
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}
  ], 
  "is_pair_like": true
}, 
{
  "@type": "kind:global_window"
}
  ], 
  "is_wrapper": true
}, 
"output_name": "out", 
"user_name": "assert_that/Unkey.out"
  }
], 
"parallel_input": {
  "@type": "OutputReference", 
  "output_name": "out", 
  "step_name": "s14"
}, 
"serialized_fn": "", 
"user_name": "assert_that/Unkey"
  }
}, 
{
  "kind": "ParallelDo", 
  "name": "s16", 
  "properties": {
"display_data": [
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.CallableWrapp

Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Flink #2807

2017-05-12 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PostCommit_Java_MavenInstall #3799

2017-05-12 Thread Apache Jenkins Server
See 


Changes:

[dhalperi] [BEAM-2279] Fix archetype breakages

--
[...truncated 2.35 MB...]
Error occurred in starting fork, check output in log
Process Exit Code: 1
at 
org.apache.maven.plugin.surefire.booterclient.ForkStarter.awaitResultsDone(ForkStarter.java:494)
at 
org.apache.maven.plugin.surefire.booterclient.ForkStarter.runSuitesForkPerTestSet(ForkStarter.java:441)
at 
org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:292)
at 
org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:243)
at 
org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:1077)
at 
org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeAfterPreconditionsChecked(AbstractSurefireMojo.java:907)
at 
org.apache.maven.plugin.surefire.AbstractSurefireMojo.execute(AbstractSurefireMojo.java:785)
at 
org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:208)
at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
at 
org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
at 
org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
at 
org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
at 
org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
at 
org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:307)
at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:193)
at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:106)
at 
org.jvnet.hudson.maven3.launcher.Maven33Launcher.main(Maven33Launcher.java:129)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.codehaus.plexus.classworlds.launcher.Launcher.launchStandard(Launcher.java:330)
at 
org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:238)
at jenkins.maven3.agent.Maven33Main.launch(Maven33Main.java:176)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at hudson.maven.Maven3Builder.call(Maven3Builder.java:139)
at hudson.maven.Maven3Builder.call(Maven3Builder.java:70)
at hudson.remoting.UserRequest.perform(UserRequest.java:153)
at hudson.remoting.UserRequest.perform(UserRequest.java:50)
at hudson.remoting.Request$2.run(Request.java:336)
at 
hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:68)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.maven.surefire.booter.SurefireBooterForkException: The 
forked VM terminated without properly saying goodbye. VM crash or System.exit 
called?
Command was /bin/sh -c cd 

 && /usr/local/asfpackages/java/jdk1.8.0_121/jre/bin/java 
org.apache.maven.surefire.booter.ForkedBooter 

 2017-05-12T23-04-33_800-jvmRun1 surefire9007512650165010654tmp 
surefire_478618043086786122453tmp
Error occurred in starting fork, check output in log
Process Exit Code: 1
at 
org.apache.maven.plugin.surefire.booterclient.ForkStarter.fork(ForkStarter.java:679)
at 
org.apache.maven.plugin.surefire.booterclient.ForkStarter.fork(ForkStarter.java:533)
at 
org.apache.maven.plugin.surefire.booterclient.ForkStarter.access$600(ForkStarter.java:117)
at 
org.apache.maven.plugin.surefire.booterclient.ForkStarter$2.call(ForkStarter.java:429)
at 
org.apache.maven.plugin.surefire.booterclient.ForkStarter$2.call(ForkStarter.j

beam git commit: [maven-release-plugin] rollback changes from release preparation of v2.0.0-RC4

2017-05-12 Thread davor
Repository: beam
Updated Branches:
  refs/heads/release-2.0.0 234bda83a -> 443c0e4a5


[maven-release-plugin] rollback changes from release preparation of v2.0.0-RC4


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/443c0e4a
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/443c0e4a
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/443c0e4a

Branch: refs/heads/release-2.0.0
Commit: 443c0e4a5d9b8a9a2ee8144af67effe4f5292bcd
Parents: 234bda8
Author: Davor Bonaci 
Authored: Fri May 12 16:50:26 2017 -0700
Committer: Davor Bonaci 
Committed: Fri May 12 16:50:26 2017 -0700

--
 examples/java/pom.xml   | 2 +-
 examples/java8/pom.xml  | 2 +-
 examples/pom.xml| 2 +-
 pom.xml | 4 ++--
 runners/apex/pom.xml| 2 +-
 runners/core-construction-java/pom.xml  | 2 +-
 runners/core-java/pom.xml   | 2 +-
 runners/direct-java/pom.xml | 2 +-
 runners/flink/pom.xml   | 2 +-
 runners/google-cloud-dataflow-java/pom.xml  | 2 +-
 runners/pom.xml | 2 +-
 runners/spark/pom.xml   | 2 +-
 sdks/common/fn-api/pom.xml  | 2 +-
 sdks/common/pom.xml | 2 +-
 sdks/common/runner-api/pom.xml  | 2 +-
 sdks/java/build-tools/pom.xml   | 2 +-
 sdks/java/core/pom.xml  | 2 +-
 sdks/java/extensions/google-cloud-platform-core/pom.xml | 2 +-
 sdks/java/extensions/jackson/pom.xml| 2 +-
 sdks/java/extensions/join-library/pom.xml   | 2 +-
 sdks/java/extensions/pom.xml| 2 +-
 sdks/java/extensions/protobuf/pom.xml   | 2 +-
 sdks/java/extensions/sorter/pom.xml | 2 +-
 sdks/java/harness/pom.xml   | 2 +-
 sdks/java/io/common/pom.xml | 2 +-
 sdks/java/io/elasticsearch/pom.xml  | 2 +-
 sdks/java/io/google-cloud-platform/pom.xml  | 2 +-
 sdks/java/io/hadoop-common/pom.xml  | 2 +-
 sdks/java/io/hadoop-file-system/pom.xml | 2 +-
 sdks/java/io/hadoop/input-format/pom.xml| 2 +-
 sdks/java/io/hadoop/jdk1.8-tests/pom.xml| 2 +-
 sdks/java/io/hadoop/pom.xml | 2 +-
 sdks/java/io/hbase/pom.xml  | 2 +-
 sdks/java/io/jdbc/pom.xml   | 2 +-
 sdks/java/io/jms/pom.xml| 2 +-
 sdks/java/io/kafka/pom.xml  | 2 +-
 sdks/java/io/kinesis/pom.xml| 2 +-
 sdks/java/io/mongodb/pom.xml| 2 +-
 sdks/java/io/mqtt/pom.xml   | 2 +-
 sdks/java/io/pom.xml| 2 +-
 sdks/java/io/xml/pom.xml| 2 +-
 sdks/java/java8tests/pom.xml| 2 +-
 sdks/java/javadoc/pom.xml   | 2 +-
 sdks/java/maven-archetypes/examples-java8/pom.xml   | 2 +-
 sdks/java/maven-archetypes/examples/pom.xml | 2 +-
 sdks/java/maven-archetypes/pom.xml  | 2 +-
 sdks/java/maven-archetypes/starter/pom.xml  | 2 +-
 sdks/java/pom.xml   | 2 +-
 sdks/pom.xml| 2 +-
 sdks/python/pom.xml | 2 +-
 50 files changed, 51 insertions(+), 51 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/443c0e4a/examples/java/pom.xml
--
diff --git a/examples/java/pom.xml b/examples/java/pom.xml
index 64c0115..c124b7e 100644
--- a/examples/java/pom.xml
+++ b/examples/java/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.beam
 beam-examples-parent
-2.0.0
+2.0.0-SNAPSHOT
 ../pom.xml
   
 

http://git-wip-us.apache.org/repos/asf/beam/blob/443c0e4a/examples/java8/pom.xml
--
diff --git a/examples/java8/pom.xml b/examples/java8/pom.xml
index 07a8908..536e66b 100644
--- a/examples/java8/pom.xml
+++ b/examples/java8/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.beam
 beam-examples-parent
-2.0.0
+2.0.0-SNAPSHOT
 ../pom.xml
   
 

http://git-wip-us.apache.org/repos/asf/beam/blob/443c0e4a/examples/pom.xml
-

beam git commit: [maven-release-plugin] prepare release v2.0.0-RC4

2017-05-12 Thread davor
Repository: beam
Updated Branches:
  refs/heads/release-2.0.0 6931219d8 -> 234bda83a


[maven-release-plugin] prepare release v2.0.0-RC4


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/234bda83
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/234bda83
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/234bda83

Branch: refs/heads/release-2.0.0
Commit: 234bda83a62c7a8d923dbac2e21224d97f5874db
Parents: 6931219
Author: Davor Bonaci 
Authored: Fri May 12 16:50:12 2017 -0700
Committer: Davor Bonaci 
Committed: Fri May 12 16:50:12 2017 -0700

--
 examples/java/pom.xml   | 2 +-
 examples/java8/pom.xml  | 2 +-
 examples/pom.xml| 2 +-
 pom.xml | 4 ++--
 runners/apex/pom.xml| 2 +-
 runners/core-construction-java/pom.xml  | 2 +-
 runners/core-java/pom.xml   | 2 +-
 runners/direct-java/pom.xml | 2 +-
 runners/flink/pom.xml   | 2 +-
 runners/google-cloud-dataflow-java/pom.xml  | 2 +-
 runners/pom.xml | 2 +-
 runners/spark/pom.xml   | 2 +-
 sdks/common/fn-api/pom.xml  | 2 +-
 sdks/common/pom.xml | 2 +-
 sdks/common/runner-api/pom.xml  | 2 +-
 sdks/java/build-tools/pom.xml   | 2 +-
 sdks/java/core/pom.xml  | 2 +-
 sdks/java/extensions/google-cloud-platform-core/pom.xml | 2 +-
 sdks/java/extensions/jackson/pom.xml| 2 +-
 sdks/java/extensions/join-library/pom.xml   | 2 +-
 sdks/java/extensions/pom.xml| 2 +-
 sdks/java/extensions/protobuf/pom.xml   | 2 +-
 sdks/java/extensions/sorter/pom.xml | 2 +-
 sdks/java/harness/pom.xml   | 2 +-
 sdks/java/io/common/pom.xml | 2 +-
 sdks/java/io/elasticsearch/pom.xml  | 2 +-
 sdks/java/io/google-cloud-platform/pom.xml  | 2 +-
 sdks/java/io/hadoop-common/pom.xml  | 2 +-
 sdks/java/io/hadoop-file-system/pom.xml | 2 +-
 sdks/java/io/hadoop/input-format/pom.xml| 2 +-
 sdks/java/io/hadoop/jdk1.8-tests/pom.xml| 2 +-
 sdks/java/io/hadoop/pom.xml | 2 +-
 sdks/java/io/hbase/pom.xml  | 2 +-
 sdks/java/io/jdbc/pom.xml   | 2 +-
 sdks/java/io/jms/pom.xml| 2 +-
 sdks/java/io/kafka/pom.xml  | 2 +-
 sdks/java/io/kinesis/pom.xml| 2 +-
 sdks/java/io/mongodb/pom.xml| 2 +-
 sdks/java/io/mqtt/pom.xml   | 2 +-
 sdks/java/io/pom.xml| 2 +-
 sdks/java/io/xml/pom.xml| 2 +-
 sdks/java/java8tests/pom.xml| 2 +-
 sdks/java/javadoc/pom.xml   | 2 +-
 sdks/java/maven-archetypes/examples-java8/pom.xml   | 2 +-
 sdks/java/maven-archetypes/examples/pom.xml | 2 +-
 sdks/java/maven-archetypes/pom.xml  | 2 +-
 sdks/java/maven-archetypes/starter/pom.xml  | 2 +-
 sdks/java/pom.xml   | 2 +-
 sdks/pom.xml| 2 +-
 sdks/python/pom.xml | 2 +-
 50 files changed, 51 insertions(+), 51 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/234bda83/examples/java/pom.xml
--
diff --git a/examples/java/pom.xml b/examples/java/pom.xml
index c124b7e..64c0115 100644
--- a/examples/java/pom.xml
+++ b/examples/java/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.beam
 beam-examples-parent
-2.0.0-SNAPSHOT
+2.0.0
 ../pom.xml
   
 

http://git-wip-us.apache.org/repos/asf/beam/blob/234bda83/examples/java8/pom.xml
--
diff --git a/examples/java8/pom.xml b/examples/java8/pom.xml
index 536e66b..07a8908 100644
--- a/examples/java8/pom.xml
+++ b/examples/java8/pom.xml
@@ -22,7 +22,7 @@
   
 org.apache.beam
 beam-examples-parent
-2.0.0-SNAPSHOT
+2.0.0
 ../pom.xml
   
 

http://git-wip-us.apache.org/repos/asf/beam/blob/234bda83/examples/pom.xml
--

[beam] Git Push Summary

2017-05-12 Thread davor
Repository: beam
Updated Tags:  refs/tags/v2.0.0-RC4 [created] 52327b294


[GitHub] beam pull request #3134: [BEAM-1345] Remove FileSystems.setDefaultConfigInWo...

2017-05-12 Thread lukecwik
GitHub user lukecwik opened a pull request:

https://github.com/apache/beam/pull/3134

[BEAM-1345] Remove FileSystems.setDefaultConfigInWorkers since Dataflow no 
longer depends on this

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lukecwik/incubator-beam cleanup

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3134.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3134


commit eb6d85808f748b40d15e4111e503ce1d82077892
Author: Luke Cwik 
Date:   2017-05-12T23:44:10Z

[BEAM-1345] Remove FileSystems.setDefaultConfigInWorkers since Dataflow no 
longer depends on this




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-1345) Mark @Experimental and @Internal where needed in user-facing bits of the codebase

2017-05-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008921#comment-16008921
 ] 

ASF GitHub Bot commented on BEAM-1345:
--

GitHub user lukecwik opened a pull request:

https://github.com/apache/beam/pull/3134

[BEAM-1345] Remove FileSystems.setDefaultConfigInWorkers since Dataflow no 
longer depends on this

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lukecwik/incubator-beam cleanup

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3134.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3134


commit eb6d85808f748b40d15e4111e503ce1d82077892
Author: Luke Cwik 
Date:   2017-05-12T23:44:10Z

[BEAM-1345] Remove FileSystems.setDefaultConfigInWorkers since Dataflow no 
longer depends on this




> Mark @Experimental and @Internal where needed in user-facing bits of the 
> codebase
> -
>
> Key: BEAM-1345
> URL: https://issues.apache.org/jira/browse/BEAM-1345
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core, sdk-java-extensions, sdk-java-gcp
>Reporter: Kenneth Knowles
>  Labels: backward-incompatible
> Fix For: 2.0.0
>
>
> A blanket JIRA to ensure that, before a stable release, we mark those pieces 
> that would be unwise to freeze yet, and consider how best to communicate this 
> to users, who may simply autocomplete those features in their IDE anyhow. 
> (Conversely, put infrastructure in place to enforce freezing of the rest.)
> Not technically "backwards incompatible" with pre-Beam Dataflow SDKs, but this 
> certainly needs to be on the burndown list for the first stable release.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-1013) Recheck all existing programming guide code snippets for correctness

2017-05-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008920#comment-16008920
 ] 

ASF GitHub Bot commented on BEAM-1013:
--

GitHub user jkff opened a pull request:

https://github.com/apache/beam-site/pull/243

[BEAM-1013, BEAM-1353] Website fixups after PTransform Style Guide changes

R: @davorbonaci 
CC: @melap 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jkff/beam-site style-guide

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam-site/pull/243.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #243


commit 680af94ba320a601c42dad4970fcfc0c89191e04
Author: Eugene Kirpichov 
Date:   2017-05-12T23:06:09Z

Rewrites the section on Coders to not talk about them as a parsing mechanism

commit 001f20e7ad905a655923d4efa1180d9d146272b5
Author: Eugene Kirpichov 
Date:   2017-05-12T23:07:19Z

[BEAM-1353] Style Guide fixups

Fixes usages of PTransforms affected by changes as part of
https://issues.apache.org/jira/browse/BEAM-1353




> Recheck all existing programming guide code snippets for correctness
> 
>
> Key: BEAM-1013
> URL: https://issues.apache.org/jira/browse/BEAM-1013
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: Melissa Pashniak
>Assignee: Eugene Kirpichov
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam-site pull request #243: [BEAM-1013, BEAM-1353] Website fixups after PTr...

2017-05-12 Thread jkff
GitHub user jkff opened a pull request:

https://github.com/apache/beam-site/pull/243

[BEAM-1013, BEAM-1353] Website fixups after PTransform Style Guide changes

R: @davorbonaci 
CC: @melap 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jkff/beam-site style-guide

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam-site/pull/243.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #243


commit 680af94ba320a601c42dad4970fcfc0c89191e04
Author: Eugene Kirpichov 
Date:   2017-05-12T23:06:09Z

Rewrites the section on Coders to not talk about them as a parsing mechanism

commit 001f20e7ad905a655923d4efa1180d9d146272b5
Author: Eugene Kirpichov 
Date:   2017-05-12T23:07:19Z

[BEAM-1353] Style Guide fixups

Fixes usages of PTransforms affected by changes as part of
https://issues.apache.org/jira/browse/BEAM-1353




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (BEAM-2286) Improve mobile gaming example user experience

2017-05-12 Thread Ahmet Altay (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-2286:
--
Component/s: website

> Improve mobile gaming example user experience
> -
>
> Key: BEAM-2286
> URL: https://issues.apache.org/jira/browse/BEAM-2286
> Project: Beam
>  Issue Type: Improvement
>  Components: examples-java, sdk-py, website
>Reporter: Vikas Kedigehalli
>Assignee: Ahmet Altay
>Priority: Minor
>
> A few things I noticed while running the mobile gaming example that could be 
> improved:
> 1. When running on the direct runner, the default input is too large (20G), 
> so it seems as though the pipeline is stuck, without any progress updates or 
> metrics. This could be improved by using a much smaller dataset by default.
> 2. Even when running on the Dataflow runner, with default worker settings and 
> autoscaling, it still takes more than 30 minutes to run. We could use a much 
> smaller dataset here too.
> Also, the documentation of these examples could be improved, in both the code 
> docstrings and the Beam quick start guide.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (BEAM-2286) Improve mobile gaming example user experience

2017-05-12 Thread Ahmet Altay (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay updated BEAM-2286:
--
Component/s: examples-java

> Improve mobile gaming example user experience
> -
>
> Key: BEAM-2286
> URL: https://issues.apache.org/jira/browse/BEAM-2286
> Project: Beam
>  Issue Type: Improvement
>  Components: examples-java, sdk-py, website
>Reporter: Vikas Kedigehalli
>Assignee: Ahmet Altay
>Priority: Minor
>
> A few things I noticed while running the mobile gaming example that could be 
> improved:
> 1. When running on the direct runner, the default input is too large (20G), 
> so it seems as though the pipeline is stuck, without any progress updates or 
> metrics. This could be improved by using a much smaller dataset by default.
> 2. Even when running on the Dataflow runner, with default worker settings and 
> autoscaling, it still takes more than 30 minutes to run. We could use a much 
> smaller dataset here too.
> Also, the documentation of these examples could be improved, in both the code 
> docstrings and the Beam quick start guide.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-2286) Improve mobile gaming example user experience

2017-05-12 Thread Vikas Kedigehalli (JIRA)
Vikas Kedigehalli created BEAM-2286:
---

 Summary: Improve mobile gaming example user experience
 Key: BEAM-2286
 URL: https://issues.apache.org/jira/browse/BEAM-2286
 Project: Beam
  Issue Type: Improvement
  Components: sdk-py
Reporter: Vikas Kedigehalli
Assignee: Ahmet Altay
Priority: Minor


A few things I noticed while running the mobile gaming example that could be improved:

1. When running on the direct runner, the default input is too large (20G), so it 
seems as though the pipeline is stuck, without any progress updates or metrics.
This could be improved by using a much smaller dataset by default.

2. Even when running on the Dataflow runner, with default worker settings and 
autoscaling, it still takes more than 30 minutes to run. We could use a much 
smaller dataset here too.

Also, the documentation of these examples could be improved, in both the code 
docstrings and the Beam quick start guide.
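
A minimal sketch of what the smaller-default-input suggestion could look like on the Java side, assuming the Beam 2.x PipelineOptions API; the interface name and the sample path are invented for illustration, not existing artifacts:

{code:java}
import org.apache.beam.sdk.options.Default;
import org.apache.beam.sdk.options.Description;
import org.apache.beam.sdk.options.PipelineOptions;

// Hypothetical options for the gaming examples: default to a small sample
// file so the direct runner finishes quickly, and let users opt in to the
// full ~20 GB dataset explicitly via --input.
interface GameExampleOptions extends PipelineOptions {
  @Description("Path of the game events file to read from")
  @Default.String("gs://apache-beam-samples/game/gaming_data_sample.csv")
  String getInput();

  void setInput(String value);
}
{code}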



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Closed] (BEAM-2277) IllegalArgumentException when using Hadoop file system for WordCount example.

2017-05-12 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin closed BEAM-2277.
-
Resolution: Fixed

> IllegalArgumentException when using Hadoop file system for WordCount example.
> -
>
> Key: BEAM-2277
> URL: https://issues.apache.org/jira/browse/BEAM-2277
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Aviem Zur
>Assignee: Aviem Zur
>Priority: Blocker
> Fix For: 2.0.0
>
>
> IllegalArgumentException when using Hadoop file system for WordCount example.
> Occurred when running WordCount example using Spark runner on a YARN cluster.
> Command-line arguments:
> {code:none}
> --runner=SparkRunner --inputFile=hdfs:///user/myuser/kinglear.txt 
> --output=hdfs:///user/myuser/wc/wc
> {code}
> Stack trace:
> {code:none}
> java.lang.IllegalArgumentException: Expect srcResourceIds and destResourceIds 
> have the same scheme, but received file, hdfs.
>   at 
> org.apache.beam.sdk.repackaged.com.google.common.base.Preconditions.checkArgument(Preconditions.java:122)
>   at 
> org.apache.beam.sdk.io.FileSystems.validateSrcDestLists(FileSystems.java:394)
>   at org.apache.beam.sdk.io.FileSystems.copy(FileSystems.java:236)
>   at 
> org.apache.beam.sdk.io.FileBasedSink$WriteOperation.copyToOutputFiles(FileBasedSink.java:626)
>   at 
> org.apache.beam.sdk.io.FileBasedSink$WriteOperation.finalize(FileBasedSink.java:516)
>   at 
> org.apache.beam.sdk.io.WriteFiles$2.processElement(WriteFiles.java:592)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-2277) IllegalArgumentException when using Hadoop file system for WordCount example.

2017-05-12 Thread Daniel Halperin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008904#comment-16008904
 ] 

Daniel Halperin commented on BEAM-2277:
---

Using the latest version on the release-2.0.0 branch 
(6931219d8725ad90f6b18b1bf8219d5adb10ff8a), I successfully ran this using 
an archetype-generated copy of the examples:

{code}
gcloud dataproc jobs submit spark --cluster jasonkuster-test1-0 --properties 
spark.default.parallelism=200 --class org.apache.beam.examples.WordCount --jars 
./target/java-0.1.jar -- --runner=SparkRunner 
--inputFile=hdfs:///home/dhalperi-hdfs/words.txt 
--output=hdfs:///home/dhalperi-hdfs/output-
{code}

Output:

{code}
dhalperi@jasonkuster-test1-0-m:~$ hadoop fs -cat /home/dhalperi-hdfs/words.txt
17/05/12 23:27:47 INFO gcs.GoogleHadoopFileSystemBase: GHFS version: 
1.5.5-hadoop2
a
b
c
d
e
f
dhalperi@jasonkuster-test1-0-m:~$ hadoop fs -cat /home/dhalperi-hdfs/
/home/dhalperi-hdfs/output--0-of-6 
/home/dhalperi-hdfs/output--4-of-6
/home/dhalperi-hdfs/output--1-of-6 
/home/dhalperi-hdfs/output--5-of-6
/home/dhalperi-hdfs/output--2-of-6 
/home/dhalperi-hdfs/.temp-beam-2017-05-132_23-26-39-0
/home/dhalperi-hdfs/output--3-of-6 
/home/dhalperi-hdfs/words.txt
dhalperi@jasonkuster-test1-0-m:~$ hadoop fs -cat /home/dhalperi-hdfs/output*
17/05/12 23:28:05 INFO gcs.GoogleHadoopFileSystemBase: GHFS version: 
1.5.5-hadoop2
a: 1
e: 1
d: 1
b: 1
f: 1
c: 1
{code}

(Note that the double {{--}} in the output file names is my fault.)

> IllegalArgumentException when using Hadoop file system for WordCount example.
> -
>
> Key: BEAM-2277
> URL: https://issues.apache.org/jira/browse/BEAM-2277
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Aviem Zur
>Assignee: Aviem Zur
>Priority: Blocker
> Fix For: 2.0.0
>
>
> IllegalArgumentException when using Hadoop file system for WordCount example.
> Occurred when running WordCount example using Spark runner on a YARN cluster.
> Command-line arguments:
> {code:none}
> --runner=SparkRunner --inputFile=hdfs:///user/myuser/kinglear.txt 
> --output=hdfs:///user/myuser/wc/wc
> {code}
> Stack trace:
> {code:none}
> java.lang.IllegalArgumentException: Expect srcResourceIds and destResourceIds 
> have the same scheme, but received file, hdfs.
>   at 
> org.apache.beam.sdk.repackaged.com.google.common.base.Preconditions.checkArgument(Preconditions.java:122)
>   at 
> org.apache.beam.sdk.io.FileSystems.validateSrcDestLists(FileSystems.java:394)
>   at org.apache.beam.sdk.io.FileSystems.copy(FileSystems.java:236)
>   at 
> org.apache.beam.sdk.io.FileBasedSink$WriteOperation.copyToOutputFiles(FileBasedSink.java:626)
>   at 
> org.apache.beam.sdk.io.FileBasedSink$WriteOperation.finalize(FileBasedSink.java:516)
>   at 
> org.apache.beam.sdk.io.WriteFiles$2.processElement(WriteFiles.java:592)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (BEAM-908) Update wordcount walkthrough

2017-05-12 Thread Davor Bonaci (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davor Bonaci resolved BEAM-908.
---
   Resolution: Fixed
Fix Version/s: (was: 2.0.0)
   Not applicable

> Update wordcount walkthrough
> 
>
> Key: BEAM-908
> URL: https://issues.apache.org/jira/browse/BEAM-908
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Kenneth Knowles
>Assignee: Thomas Groh
> Fix For: Not applicable
>
>
> Run through https://beam.apache.org/get-started/wordcount-example/ and make 
> sure it works and reads well.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam-site pull request #241: Revise WordCount Example Walkthrough

2017-05-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam-site/pull/241


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/3] beam-site git commit: Regenerate website

2017-05-12 Thread davor
Regenerate website


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/12121557
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/12121557
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/12121557

Branch: refs/heads/asf-site
Commit: 12121557b155ec7d0aea865afa5c6f2801217d56
Parents: 9fb214f
Author: Davor Bonaci 
Authored: Fri May 12 16:17:49 2017 -0700
Committer: Davor Bonaci 
Committed: Fri May 12 16:17:49 2017 -0700

--
 .../documentation/programming-guide/index.html  |  2 +-
 .../sdks/python-custom-io/index.html|  2 +-
 .../get-started/wordcount-example/index.html| 80 +++-
 3 files changed, 47 insertions(+), 37 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam-site/blob/12121557/content/documentation/programming-guide/index.html
--
diff --git a/content/documentation/programming-guide/index.html 
b/content/documentation/programming-guide/index.html
index 1e6e9a7..3b3cb14 100644
--- a/content/documentation/programming-guide/index.html
+++ b/content/documentation/programming-guide/index.html
@@ -1939,7 +1939,7 @@ Subsequent transforms, however, are applied to the result 
of the unix_timestamp = extract_timestamp_from_log_entry(element)
 # Wrap and emit the current entry and new timestamp in 
a
 # TimestampedValue.
-yield beam.TimestampedValue(element, unix_timestamp)
+yield beam.window.TimestampedValue(element, unix_timestamp)
 
 timestamped_items = items | 'timestamp' >> beam.ParDo(AddTimestampDoFn())
 

http://git-wip-us.apache.org/repos/asf/beam-site/blob/12121557/content/documentation/sdks/python-custom-io/index.html
--
diff --git a/content/documentation/sdks/python-custom-io/index.html 
b/content/documentation/sdks/python-custom-io/index.html
index 629ef0f..fb2646f 100644
--- a/content/documentation/sdks/python-custom-io/index.html
+++ b/content/documentation/sdks/python-custom-io/index.html
@@ -464,7 +464,7 @@ numbers = p | 'ProduceNumbers' >> 
beam.io.Read(CountingSource(count))
 
 FileSink
 
-If your data source uses files, you can derive your Sink and Writer classes from the FileSink and FileSinkWriter classes, which can be found in 
the fileio.py module (https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/fileio.py). 
These classes implement code common sinks that interact with files, 
including:
+If your data source uses files, you can derive your Sink and Writer classes from the FileBasedSink and FileBasedSinkWriter classes, which can be 
found in the filebasedsink.py module (https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/filebasedsink.py). 
These classes implement code common sinks that interact with files, 
including:
 
 
   Setting file headers and footers

http://git-wip-us.apache.org/repos/asf/beam-site/blob/12121557/content/get-started/wordcount-example/index.html
--
diff --git a/content/get-started/wordcount-example/index.html 
b/content/get-started/wordcount-example/index.html
index 5cc32f3..333cfb9 100644
--- a/content/get-started/wordcount-example/index.html
+++ b/content/get-started/wordcount-example/index.html
@@ -172,6 +172,7 @@
   Dataflow Runner
   Apache Spark Runner
   Apache Flink Runner
+  Apache Apex Runner
 
   
   Testing your Pipeline via 
PAssert
@@ -228,7 +229,7 @@
 
 Creating the Pipeline
 
-The first step in creating a Beam pipeline is to create a PipelineOptions object. This object lets us 
set various options for our pipeline, such as the pipeline runner that will 
execute our pipeline and any runner-specific configuration required by the 
chosen runner. In this example we set these options programmatically, but more 
often command-line arguments are used to set PipelineOptions.
+The first step in creating a Beam pipeline is to create a PipelineOptions object. This object lets us 
set various options for our pipeline, such as the pipeline runner that will 
execute our pipeline and any runner-specific configuration required by the 
chosen runner. In this example we set these options programmatically, but more 
often command-line arguments are used to set PipelineOptions.
 
 You can specify a runner for executing your pipeline, such as the DataflowRunner or SparkRunner. If you omit specifying a runner, 
as in this example, your pipeline will be executed locally using the DirectRunner. In the next sections, we will 
specify the pipeline’s runner.
 
@@ -273,7 +274,7 @@
 
 The Minimal WordCount pipeline contains several transforms to read data 
into the pipeline, manipulate or otherwise transf

[3/3] beam-site git commit: This closes #241

2017-05-12 Thread davor
This closes #241


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/261e3977
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/261e3977
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/261e3977

Branch: refs/heads/asf-site
Commit: 261e39912fdf02e2b6282b6384f6e182e1a2
Parents: 87b0fb5 1212155
Author: Davor Bonaci 
Authored: Fri May 12 16:17:50 2017 -0700
Committer: Davor Bonaci 
Committed: Fri May 12 16:17:50 2017 -0700

--
 .../documentation/programming-guide/index.html  |  2 +-
 .../sdks/python-custom-io/index.html|  2 +-
 .../get-started/wordcount-example/index.html| 80 ++
 src/documentation/sdks/python-custom-io.md  |  2 +-
 src/get-started/wordcount-example.md| 86 +++-
 5 files changed, 95 insertions(+), 77 deletions(-)
--




[1/3] beam-site git commit: Revise WordCount Example Walkthrough

2017-05-12 Thread davor
Repository: beam-site
Updated Branches:
  refs/heads/asf-site 87b0fb5e4 -> 261e39777


Revise WordCount Example Walkthrough

Fiddle with monospace markers.

Update code snippets.

Reorder logging example and update "[Google ]Cloud Logging" ->
Stackdriver Logging


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/9fb214fe
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/9fb214fe
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/9fb214fe

Branch: refs/heads/asf-site
Commit: 9fb214fe7ee5c662248a448d06ef30e3aa8eb82e
Parents: 87b0fb5
Author: Thomas Groh 
Authored: Fri May 12 14:01:06 2017 -0700
Committer: Davor Bonaci 
Committed: Fri May 12 16:17:27 2017 -0700

--
 src/documentation/sdks/python-custom-io.md |  2 +-
 src/get-started/wordcount-example.md   | 86 ++---
 2 files changed, 48 insertions(+), 40 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam-site/blob/9fb214fe/src/documentation/sdks/python-custom-io.md
--
diff --git a/src/documentation/sdks/python-custom-io.md 
b/src/documentation/sdks/python-custom-io.md
index ee87e4e..8ce174e 100644
--- a/src/documentation/sdks/python-custom-io.md
+++ b/src/documentation/sdks/python-custom-io.md
@@ -228,7 +228,7 @@ The Beam SDK for Python contains some convenient abstract 
base classes to help y
 
  FileSink
 
-If your data source uses files, you can derive your `Sink` and `Writer` 
classes from the `FileSink` and `FileSinkWriter` classes, which can be found in 
the 
[fileio.py](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/fileio.py)
 module. These classes implement code common sinks that interact with files, 
including:
+If your data source uses files, you can derive your `Sink` and `Writer` 
classes from the `FileBasedSink` and `FileBasedSinkWriter` classes, which can 
be found in the 
[filebasedsink.py](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/filebasedsink.py)
 module. These classes implement code common sinks that interact with files, 
including:
 
 * Setting file headers and footers
 * Sequential record writing

http://git-wip-us.apache.org/repos/asf/beam-site/blob/9fb214fe/src/get-started/wordcount-example.md
--
diff --git a/src/get-started/wordcount-example.md 
b/src/get-started/wordcount-example.md
index 3572e76..19a82d7 100644
--- a/src/get-started/wordcount-example.md
+++ b/src/get-started/wordcount-example.md
@@ -47,7 +47,7 @@ The following sections explain these concepts in detail along 
with excerpts of t
 
 ### Creating the Pipeline
 
-The first step in creating a Beam pipeline is to create a `PipelineOptions 
object`. This object lets us set various options for our pipeline, such as the 
pipeline runner that will execute our pipeline and any runner-specific 
configuration required by the chosen runner. In this example we set these 
options programmatically, but more often command-line arguments are used to set 
`PipelineOptions`. 
+The first step in creating a Beam pipeline is to create a `PipelineOptions` 
object. This object lets us set various options for our pipeline, such as the 
pipeline runner that will execute our pipeline and any runner-specific 
configuration required by the chosen runner. In this example we set these 
options programmatically, but more often command-line arguments are used to set 
`PipelineOptions`. 
 
 You can specify a runner for executing your pipeline, such as the 
`DataflowRunner` or `SparkRunner`. If you omit specifying a runner, as in this 
example, your pipeline will be executed locally using the `DirectRunner`. In 
the next sections, we will specify the pipeline's runner.
 
@@ -86,14 +86,14 @@ Pipeline p = Pipeline.create(options);
 
 The Minimal WordCount pipeline contains several transforms to read data into 
the pipeline, manipulate or otherwise transform the data, and write out the 
results. Each transform represents an operation in the pipeline.
 
-Each transform takes some kind of input (data or otherwise), and produces some 
output data. The input and output data is represented by the SDK class 
`PCollection`. `PCollection` is a special class, provided by the Beam SDK, that 
you can use to represent a data set of virtually any size, including infinite 
data sets.
+Each transform takes some kind of input (data or otherwise), and produces some 
output data. The input and output data is represented by the SDK class 
`PCollection`. `PCollection` is a special class, provided by the Beam SDK, that 
you can use to represent a data set of virtually any size, including unbounded 
data sets.
 
 
 Figure 1: The pipeline data flow.
 
 The Minimal WordCount pipel

Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Flink #2806

2017-05-12 Thread Apache Jenkins Server
See 




[jira] [Closed] (BEAM-2279) Hadoop file system support should be included in examples/archetype profiles of Spark runner.

2017-05-12 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin closed BEAM-2279.
-
Resolution: Fixed

Hopefully fixed for real this time.

> Hadoop file system support should be included in examples/archetype profiles 
> of Spark runner.
> -
>
> Key: BEAM-2279
> URL: https://issues.apache.org/jira/browse/BEAM-2279
> Project: Beam
>  Issue Type: Bug
>  Components: examples-java
>Reporter: Aviem Zur
>Assignee: Aviem Zur
>  Labels: newbie, starter
> Fix For: 2.0.0
>
>
> Hadoop file system support (dependency on 
> {{beam-sdks-java-io-hadoop-file-system}}) should be included in any 
> archetype/examples profiles of Spark runner.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-2279) Hadoop file system support should be included in examples/archetype profiles of Spark runner.

2017-05-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008882#comment-16008882
 ] 

ASF GitHub Bot commented on BEAM-2279:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3133


> Hadoop file system support should be included in examples/archetype profiles 
> of Spark runner.
> -
>
> Key: BEAM-2279
> URL: https://issues.apache.org/jira/browse/BEAM-2279
> Project: Beam
>  Issue Type: Bug
>  Components: examples-java
>Reporter: Aviem Zur
>Assignee: Aviem Zur
>  Labels: newbie, starter
> Fix For: 2.0.0
>
>
> Hadoop file system support (dependency on 
> {{beam-sdks-java-io-hadoop-file-system}}) should be included in any 
> archetype/examples profiles of Spark runner.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #3133: [BEAM-2279] Cherry-pick #3132 to release-2.0.0

2017-05-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3133


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] beam git commit: [BEAM-2279] Fix archetype breakages

2017-05-12 Thread dhalperi
Repository: beam
Updated Branches:
  refs/heads/release-2.0.0 6083b47d0 -> 6931219d8


[BEAM-2279] Fix archetype breakages

* Add version for hadoop-file-system dependency.
* Add DirectRunner in test scope in java8, otherwise mvn package fails
  when another runner profile is enabled.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/af16c34f
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/af16c34f
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/af16c34f

Branch: refs/heads/release-2.0.0
Commit: af16c34ff1b89ec8364ce1d69e02fca035780e53
Parents: 6083b47
Author: Dan Halperin 
Authored: Fri May 12 15:52:51 2017 -0700
Committer: Dan Halperin 
Committed: Fri May 12 15:58:38 2017 -0700

--
 .../src/main/resources/archetype-resources/pom.xml   | 8 
 .../examples/src/main/resources/archetype-resources/pom.xml  | 1 +
 2 files changed, 9 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/af16c34f/sdks/java/maven-archetypes/examples-java8/src/main/resources/archetype-resources/pom.xml
--
diff --git 
a/sdks/java/maven-archetypes/examples-java8/src/main/resources/archetype-resources/pom.xml
 
b/sdks/java/maven-archetypes/examples-java8/src/main/resources/archetype-resources/pom.xml
index 05826ba..5f34689 100644
--- 
a/sdks/java/maven-archetypes/examples-java8/src/main/resources/archetype-resources/pom.xml
+++ 
b/sdks/java/maven-archetypes/examples-java8/src/main/resources/archetype-resources/pom.xml
@@ -218,6 +218,7 @@
 
   org.apache.beam
   beam-sdks-java-io-hadoop-file-system
+  ${beam.version}
   runtime
 
 
@@ -361,5 +362,12 @@
   1.9.5
   test
 
+
+
+  org.apache.beam
+  beam-runners-direct-java
+  ${beam.version}
+  test
+
   
 

http://git-wip-us.apache.org/repos/asf/beam/blob/af16c34f/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml
--
diff --git 
a/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml
 
b/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml
index fb6e8b1..a3d7b8f 100644
--- 
a/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml
+++ 
b/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml
@@ -218,6 +218,7 @@
 
   org.apache.beam
   beam-sdks-java-io-hadoop-file-system
+  ${beam.version}
   runtime
 
 



[2/2] beam git commit: This closes #3133

2017-05-12 Thread dhalperi
This closes #3133


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/6931219d
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/6931219d
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/6931219d

Branch: refs/heads/release-2.0.0
Commit: 6931219d8725ad90f6b18b1bf8219d5adb10ff8a
Parents: 6083b47 af16c34
Author: Dan Halperin 
Authored: Fri May 12 16:05:33 2017 -0700
Committer: Dan Halperin 
Committed: Fri May 12 16:05:33 2017 -0700

--
 .../src/main/resources/archetype-resources/pom.xml   | 8 
 .../examples/src/main/resources/archetype-resources/pom.xml  | 1 +
 2 files changed, 9 insertions(+)
--




[jira] [Commented] (BEAM-2279) Hadoop file system support should be included in examples/archetype profiles of Spark runner.

2017-05-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008871#comment-16008871
 ] 

ASF GitHub Bot commented on BEAM-2279:
--

GitHub user dhalperi opened a pull request:

https://github.com/apache/beam/pull/3133

[BEAM-2279] Cherry-pick #3132 to release-2.0.0

[BEAM-2279] Fix archetype breakages

* Add version for hadoop-file-system dependency.
* Add DirectRunner in test scope in java8, otherwise mvn package fails
  when another runner profile is enabled.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/beam cp-3132

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3133.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3133


commit af16c34ff1b89ec8364ce1d69e02fca035780e53
Author: Dan Halperin 
Date:   2017-05-12T22:52:51Z

[BEAM-2279] Fix archetype breakages

* Add version for hadoop-file-system dependency.
* Add DirectRunner in test scope in java8, otherwise mvn package fails
  when another runner profile is enabled.




> Hadoop file system support should be included in examples/archetype profiles 
> of Spark runner.
> -
>
> Key: BEAM-2279
> URL: https://issues.apache.org/jira/browse/BEAM-2279
> Project: Beam
>  Issue Type: Bug
>  Components: examples-java
>Reporter: Aviem Zur
>Assignee: Aviem Zur
>  Labels: newbie, starter
> Fix For: 2.0.0
>
>
> Hadoop file system support (dependency on 
> {{beam-sdks-java-io-hadoop-file-system}}) should be included in any 
> archetype/examples profiles of Spark runner.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #3133: [BEAM-2279] Cherry-pick #3132 to release-2.0.0

2017-05-12 Thread dhalperi
GitHub user dhalperi opened a pull request:

https://github.com/apache/beam/pull/3133

[BEAM-2279] Cherry-pick #3132 to release-2.0.0

[BEAM-2279] Fix archetype breakages

* Add version for hadoop-file-system dependency.
* Add DirectRunner in test scope in java8, otherwise mvn package fails
  when another runner profile is enabled.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/beam cp-3132

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3133.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3133


commit af16c34ff1b89ec8364ce1d69e02fca035780e53
Author: Dan Halperin 
Date:   2017-05-12T22:52:51Z

[BEAM-2279] Fix archetype breakages

* Add version for hadoop-file-system dependency.
* Add DirectRunner in test scope in java8, otherwise mvn package fails
  when another runner profile is enabled.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] beam pull request #3132: [BEAM-2279] Fix archetype breakages

2017-05-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3132


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2279) Hadoop file system support should be included in examples/archetype profiles of Spark runner.

2017-05-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008865#comment-16008865
 ] 

ASF GitHub Bot commented on BEAM-2279:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3132


> Hadoop file system support should be included in examples/archetype profiles 
> of Spark runner.
> -
>
> Key: BEAM-2279
> URL: https://issues.apache.org/jira/browse/BEAM-2279
> Project: Beam
>  Issue Type: Bug
>  Components: examples-java
>Reporter: Aviem Zur
>Assignee: Aviem Zur
>  Labels: newbie, starter
> Fix For: 2.0.0
>
>
> Hadoop file system support (dependency on 
> {{beam-sdks-java-io-hadoop-file-system}}) should be included in any 
> archetype/examples profiles of Spark runner.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[2/2] beam git commit: This closes #3132

2017-05-12 Thread dhalperi
This closes #3132


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/9a43da74
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/9a43da74
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/9a43da74

Branch: refs/heads/master
Commit: 9a43da74a9005507acdc80524edc634c8318674d
Parents: 89289e4 7bade36
Author: Dan Halperin 
Authored: Fri May 12 15:57:21 2017 -0700
Committer: Dan Halperin 
Committed: Fri May 12 15:57:21 2017 -0700

--
 .../src/main/resources/archetype-resources/pom.xml   | 8 
 .../examples/src/main/resources/archetype-resources/pom.xml  | 1 +
 2 files changed, 9 insertions(+)
--




[1/2] beam git commit: [BEAM-2279] Fix archetype breakages

2017-05-12 Thread dhalperi
Repository: beam
Updated Branches:
  refs/heads/master 89289e498 -> 9a43da74a


[BEAM-2279] Fix archetype breakages

* Add version for hadoop-file-system dependency.
* Add DirectRunner in test scope in java8, otherwise mvn package fails
  when another runner profile is enabled.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/7bade36e
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/7bade36e
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/7bade36e

Branch: refs/heads/master
Commit: 7bade36eb18a48e01a8401f97f10a70a492172c4
Parents: 89289e4
Author: Dan Halperin 
Authored: Fri May 12 15:52:51 2017 -0700
Committer: Dan Halperin 
Committed: Fri May 12 15:52:51 2017 -0700

--
 .../src/main/resources/archetype-resources/pom.xml   | 8 
 .../examples/src/main/resources/archetype-resources/pom.xml  | 1 +
 2 files changed, 9 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/7bade36e/sdks/java/maven-archetypes/examples-java8/src/main/resources/archetype-resources/pom.xml
--
diff --git 
a/sdks/java/maven-archetypes/examples-java8/src/main/resources/archetype-resources/pom.xml
 
b/sdks/java/maven-archetypes/examples-java8/src/main/resources/archetype-resources/pom.xml
index 05826ba..5f34689 100644
--- 
a/sdks/java/maven-archetypes/examples-java8/src/main/resources/archetype-resources/pom.xml
+++ 
b/sdks/java/maven-archetypes/examples-java8/src/main/resources/archetype-resources/pom.xml
@@ -218,6 +218,7 @@
 
   org.apache.beam
   beam-sdks-java-io-hadoop-file-system
+  ${beam.version}
   runtime
 
 
@@ -361,5 +362,12 @@
   1.9.5
   test
 
+
+
+  org.apache.beam
+  beam-runners-direct-java
+  ${beam.version}
+  test
+
   
 

http://git-wip-us.apache.org/repos/asf/beam/blob/7bade36e/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml
--
diff --git 
a/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml
 
b/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml
index fb6e8b1..a3d7b8f 100644
--- 
a/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml
+++ 
b/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml
@@ -218,6 +218,7 @@
 
   org.apache.beam
   beam-sdks-java-io-hadoop-file-system
+  ${beam.version}
   runtime
 
 



[jira] [Commented] (BEAM-2279) Hadoop file system support should be included in examples/archetype profiles of Spark runner.

2017-05-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008858#comment-16008858
 ] 

ASF GitHub Bot commented on BEAM-2279:
--

GitHub user dhalperi opened a pull request:

https://github.com/apache/beam/pull/3132

[BEAM-2279] Fix archetype breakages

* Add version for hadoop-file-system dependency.
* Add DirectRunner in test scope in java8, otherwise mvn package fails
  when another runner profile is enabled.

R: @davorbonaci 
CC: @aviemzur 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/beam fix-archetypes

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3132.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3132


commit 7bade36eb18a48e01a8401f97f10a70a492172c4
Author: Dan Halperin 
Date:   2017-05-12T22:52:51Z

[BEAM-2279] Fix archetype breakages

* Add version for hadoop-file-system dependency.
* Add DirectRunner in test scope in java8, otherwise mvn package fails
  when another runner profile is enabled.




> Hadoop file system support should be included in examples/archetype profiles 
> of Spark runner.
> -
>
> Key: BEAM-2279
> URL: https://issues.apache.org/jira/browse/BEAM-2279
> Project: Beam
>  Issue Type: Bug
>  Components: examples-java
>Reporter: Aviem Zur
>Assignee: Aviem Zur
>  Labels: newbie, starter
> Fix For: 2.0.0
>
>
> Hadoop file system support (dependency on 
> {{beam-sdks-java-io-hadoop-file-system}}) should be included in any 
> archetype/examples profiles of Spark runner.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #3132: [BEAM-2279] Fix archetype breakages

2017-05-12 Thread dhalperi
GitHub user dhalperi opened a pull request:

https://github.com/apache/beam/pull/3132

[BEAM-2279] Fix archetype breakages

* Add version for hadoop-file-system dependency.
* Add DirectRunner in test scope in java8, otherwise mvn package fails
  when another runner profile is enabled.

R: @davorbonaci 
CC: @aviemzur 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/beam fix-archetypes

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3132.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3132


commit 7bade36eb18a48e01a8401f97f10a70a492172c4
Author: Dan Halperin 
Date:   2017-05-12T22:52:51Z

[BEAM-2279] Fix archetype breakages

* Add version for hadoop-file-system dependency.
* Add DirectRunner in test scope in java8, otherwise mvn package fails
  when another runner profile is enabled.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Reopened] (BEAM-2279) Hadoop file system support should be included in examples/archetype profiles of Spark runner.

2017-05-12 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin reopened BEAM-2279:
---

Introduced a new breakage in the archetypes

> Hadoop file system support should be included in examples/archetype profiles 
> of Spark runner.
> -
>
> Key: BEAM-2279
> URL: https://issues.apache.org/jira/browse/BEAM-2279
> Project: Beam
>  Issue Type: Bug
>  Components: examples-java
>Reporter: Aviem Zur
>Assignee: Aviem Zur
>  Labels: newbie, starter
> Fix For: 2.0.0
>
>
> Hadoop file system support (dependency on 
> {{beam-sdks-java-io-hadoop-file-system}}) should be included in any 
> archetype/examples profiles of Spark runner.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (BEAM-2024) Python and Java SDKs differ on name of AfterCount

2017-05-12 Thread Robert Bradshaw (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008840#comment-16008840
 ] 

Robert Bradshaw edited comment on BEAM-2024 at 5/12/17 10:39 PM:
-

Note also that we have the more concise AfterWatermark(early=..., late=...) 
rather than 
AfterWatermark.pastEndOfWindow().withEarlyFirings(...).withLateFirings(...) and 
just use AfterEach(...) rather than AfterEach.inOrder(...). This is in large 
part due to Java's use of static method + builder pattern, rather than 
constructor-with-keyword-arguments style. 

I don't think we need to make Python as verbose as Java. (And other languages 
like Go may prefer brevity as well.)


was (Author: robertwb):
Note also that we have the more concise AfterWatermark(early=..., late=...) 
rather than 
AfterWatermark.pastEndOfWindow().withEarlyFirings(...).withLateFirings(...) and 
just use AfterEach(...) rather than AfterEach.inOrder(...). This is in large 
part due to Java's use of static method + builder pattern, rather than 
constructor-with-keyword-arguments style. 

> Python and Java SDKs differ on name of AfterCount
> -
>
> Key: BEAM-2024
> URL: https://issues.apache.org/jira/browse/BEAM-2024
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py
>Reporter: Wesley Tanaka
>Assignee: Ahmet Altay
>Priority: Minor
>
> * https://cloud.google.com/dataflow/model/triggers#window-accumulation-modes 
> refers to an AfterCount trigger
> * There is no AfterCount trigger in java SDK
> * There is an AfterCount trigger in python SDK
> My wild guess hypothesis is that AfterCount got renamed between Dataflow and 
> Beam, but that the python beam SDK did not also execute the same rename?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-2024) Python and Java SDKs differ on name of AfterCount

2017-05-12 Thread Robert Bradshaw (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008840#comment-16008840
 ] 

Robert Bradshaw commented on BEAM-2024:
---

Note also that we have the more concise AfterWatermark(early=..., late=...) 
rather than 
AfterWatermark.pastEndOfWindow().withEarlyFirings(...).withLateFirings(...) and 
just use AfterEach(...) rather than AfterEach.inOrder(...). This is in large 
part due to Java's use of static method + builder pattern, rather than 
constructor-with-keyword-arguments style. 
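
For comparison, a minimal Java sketch of the verbose builder-style spellings mentioned above, assuming the Beam 2.x windowing API; the window size and element count are arbitrary:

{code:java}
import org.apache.beam.sdk.transforms.windowing.AfterPane;
import org.apache.beam.sdk.transforms.windowing.AfterWatermark;
import org.apache.beam.sdk.transforms.windowing.FixedWindows;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.apache.beam.sdk.values.PCollection;
import org.joda.time.Duration;

class TriggerSpellings {
  // Roughly what Python writes as
  //   AfterWatermark(early=AfterCount(10))
  // on a one-minute fixed window, spelled with Java's static-method/builder style.
  static <T> PCollection<T> window(PCollection<T> input) {
    return input.apply(
        Window.<T>into(FixedWindows.of(Duration.standardMinutes(1)))
            .triggering(
                AfterWatermark.pastEndOfWindow()
                    .withEarlyFirings(AfterPane.elementCountAtLeast(10)))
            .withAllowedLateness(Duration.ZERO)
            .discardingFiredPanes());
  }
}
{code}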

> Python and Java SDKs differ on name of AfterCount
> -
>
> Key: BEAM-2024
> URL: https://issues.apache.org/jira/browse/BEAM-2024
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py
>Reporter: Wesley Tanaka
>Assignee: Ahmet Altay
>Priority: Minor
>
> * https://cloud.google.com/dataflow/model/triggers#window-accumulation-modes 
> refers to an AfterCount trigger
> * There is no AfterCount trigger in java SDK
> * There is an AfterCount trigger in python SDK
> My wild guess hypothesis is that AfterCount got renamed between Dataflow and 
> Beam, but that the python beam SDK did not also execute the same rename?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #3131: [BEAM-] Fix timestamps for Compressed...

2017-05-12 Thread rfevang
GitHub user rfevang opened a pull request:

https://github.com/apache/beam/pull/3131

[BEAM-] Fix timestamps for CompressedSource

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rfevang/beam fix-compressed-timestamps

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3131.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3131


commit 997b26b5b431622b0f18694b4a6520083ef6cde4
Author: Rune Fevang 
Date:   2017-05-12T20:24:32Z

Fix issue where timestamps weren't set when using CompressedSource.

commit fd615b82bb27df419b797c111648c059206ab17a
Author: Rune Fevang 
Date:   2017-05-12T22:26:45Z

Add test.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Comment Edited] (BEAM-2024) Python and Java SDKs differ on name of AfterCount

2017-05-12 Thread Robert Bradshaw (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008749#comment-16008749
 ] 

Robert Bradshaw edited comment on BEAM-2024 at 5/12/17 10:37 PM:
-

This is the more verbose AfterPane.elementCountAtLeast(...) in Java. 


was (Author: robertwb):
This is the more verbose AfterPane.elementCount(...) in Java. 

> Python and Java SDKs differ on name of AfterCount
> -
>
> Key: BEAM-2024
> URL: https://issues.apache.org/jira/browse/BEAM-2024
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py
>Reporter: Wesley Tanaka
>Assignee: Ahmet Altay
>Priority: Minor
>
> * https://cloud.google.com/dataflow/model/triggers#window-accumulation-modes 
> refers to an AfterCount trigger
> * There is no AfterCount trigger in java SDK
> * There is an AfterCount trigger in python SDK
> My wild guess hypothesis is that AfterCount got renamed between Dataflow and 
> Beam, but that the python beam SDK did not also execute the same rename?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Comment Edited] (BEAM-2283) Consider using actual URIs instead of Strings/ResourceIds in relation to FileSystems

2017-05-12 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008833#comment-16008833
 ] 

Luke Cwik edited comment on BEAM-2283 at 5/12/17 10:32 PM:
---

One proposal is to:
* read().from("string path") represents an unescaped URI with no query or 
fragment component, potentially containing glob characters and '\' to escape 
glob characters
* add read().from(URI uri) for cases where users need to specify query/fragment 
components

Conversion from string to URI would be handled through a double escaping 
mechanism to support glob expressions:
* file:/my/path* would represent file:/my/path followed by the glob expression 
'*', this would be converted to the string file:/my/path%2A and then passed to 
a URI. FileSystem implementations would need to inspect the URI for escaped 
glob expressions.
* file:/my/path\* would represent file:/my/path* (note that this is a file 
named path* and not a glob expression), this would be converted to the string 
file:/my/path#%5C%2A and then passed to a URI. FileSystem implementations would 
need to inspect the URI, notice that it is not a glob expression and treat the 
unescaped path segment as a literal.

It would be important for FileSystem implementations to work on the URI and 
components and path segments individually converting to their own internal 
representation and failing if necessary.

Glob characters *, [], and ? would be understood and used by the internals of 
Apache Beam and glob conversion from *, [], ? to internal FileSystem glob 
representations would be FileSystem dependent.

This proposal has the benefits that:
* users have the minimal amount of escaping that they need to do (only escape 
the set of glob characters when they want things named with *, [], and ?)
* file:/my/path* is a canonical representation that most users would expect to 
represent file:/my/path followed by the glob *


was (Author: lcwik):
One proposal is to:
* read().from("string path") represents an unescaped URI with no query or 
fragment component, potentially containing glob characters and '\' to escape 
glob characters
* add read().from(URI uri) for cases where users need to specify query/fragment 
components

Conversion from string to URI would be handled through a double escaping 
mechanism to support glob expressions:
* file:/my/path* would represent file:/my/path followed by the glob expression 
'*', this would be converted to the string file:/my/path%2A and then passed to 
a URI. FileSystem implementations would need to inspect the URI for escaped 
glob expressions.
* file:/my/path\* would represent file:/my/path* (note that this is a file 
named path* and not a glob expression), this would be converted to the string 
file:/my/path#%5C%2A and then passed to a URI. FileSystem implementations would 
need to inspect the URI, notice that it is not a glob expression and treat the 
unescaped path segment as a literal.

It would be important for FileSystem implementations to work on the URI and 
components and path segments individually converting to their own internal 
representation and failing if necessary.

Glob characters *, [], and ? would be understood and used by the internals of 
Apache Beam and glob conversion from *, [], ? to internal FileSystem glob 
representations would be FileSystem dependent.

This proposal has the benefits that:
* users have the minimal amount of escaping that they need to do (only escape 
the set of glob characters)
* file:/my/path* is a canonical representation that most users would expect to 
represent file:/my/path followed by the glob *

> Consider using actual URIs instead of Strings/ResourceIds in relation to 
> FileSystems
> 
>
> Key: BEAM-2283
> URL: https://issues.apache.org/jira/browse/BEAM-2283
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core, sdk-java-extensions, sdk-java-gcp, sdk-py
>Reporter: Luke Cwik
>
> We treat things like URIs because we expect them to have a scheme component 
> and to be able to resolve a parent/child but fail to treat them as URIs in 
> the internal implementation since our string versions don't go through URI 
> normalization. This brings up a few issues:
> * The cost of implementing and maintaining ResourceIds instead of having 
> users use a standard URI implementation. This would just require FileSystems 
> to be able to take a string and give back a URI (to enable them to have 
> custom implementations in case they extend the concept of URIs with scheme 
> specific extensions).
> * The myriad of bugs that will come up because of improper usage of URI like 
> strings and the assumptions associated with them (like 
> https://issues.apache.org/jira/browse/BEAM-2277)
> Note that swapping to URIs

[jira] [Commented] (BEAM-2283) Consider using actual URIs instead of Strings/ResourceIds in relation to FileSystems

2017-05-12 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008833#comment-16008833
 ] 

Luke Cwik commented on BEAM-2283:
-

One proposal is to:
* read().from("string path") represents an unescaped URI with no query or 
fragment component, potentially containing glob characters and '\' to escape 
glob characters
* add read().from(URI uri) for cases where users need to specify query/fragment 
components

Conversion from string to URI would be handled through a double escaping 
mechanism to support glob expressions:
* file:/my/path* would represent file:/my/path followed by the glob expression 
'*', this would be converted to the string file:/my/path%2A and then passed to 
a URI. FileSystem implementations would need to inspect the URI for escaped 
glob expressions.
* file:/my/path\* would represent file:/my/path* (note that this is a file 
named path* and not a glob expression), this would be converted to the string 
file:/my/path#%5C%2A and then passed to a URI. FileSystem implementations would 
need to inspect the URI, notice that it is not a glob expression and treat the 
unescaped path segment as a literal.

It would be important for FileSystem implementations to work on the URI and 
components and path segments individually converting to their own internal 
representation and failing if necessary.

Glob characters *, [], and ? would be understood and used by the internals of 
Apache Beam and glob conversion from *, [], ? to internal FileSystem glob 
representations would be FileSystem dependent.

This proposal has the benefits that:
* users have the minimal amount of escaping that they need to do (only escape 
the set of glob characters)
* file:/my/path* is a canonical representation that most users would expect to 
represent file:/my/path followed by the glob *
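
A purely illustrative sketch of the double-escaping conversion described above (not Beam code; the helper name, the treatment of '\' escapes, and the exact percent-encoding choices are assumptions):

{code:java}
import java.net.URI;
import java.net.URISyntaxException;

class GlobEscapingSketch {

  // Percent-encode unescaped glob metacharacters so they survive URI parsing,
  // and keep "\<glob char>" distinguishable as a literal character in the name.
  static URI toUri(String userPath) throws URISyntaxException {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < userPath.length(); i++) {
      char c = userPath.charAt(i);
      if (c == '\\' && i + 1 < userPath.length() && isGlob(userPath.charAt(i + 1))) {
        // "\*" means a file literally named with '*', not a glob expression.
        sb.append("%5C").append(encode(userPath.charAt(++i)));
      } else if (isGlob(c)) {
        // Unescaped glob metacharacter: encode it so the URI stays valid and
        // FileSystem implementations can later detect it as a glob.
        sb.append(encode(c));
      } else {
        sb.append(c);
      }
    }
    return new URI(sb.toString());
  }

  private static boolean isGlob(char c) {
    return c == '*' || c == '?' || c == '[' || c == ']';
  }

  private static String encode(char c) {
    return String.format("%%%02X", (int) c);
  }

  public static void main(String[] args) throws URISyntaxException {
    System.out.println(toUri("file:/my/path*"));   // file:/my/path%2A    (glob)
    System.out.println(toUri("file:/my/path\\*")); // file:/my/path%5C%2A (literal '*')
  }
}
{code}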

> Consider using actual URIs instead of Strings/ResourceIds in relation to 
> FileSystems
> 
>
> Key: BEAM-2283
> URL: https://issues.apache.org/jira/browse/BEAM-2283
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core, sdk-java-extensions, sdk-java-gcp, sdk-py
>Reporter: Luke Cwik
>
> We treat things like URIs because we expect them to have a scheme component 
> and to be able to resolve a parent/child but fail to treat them as URIs in 
> the internal implementation since our string versions don't go through URI 
> normalization. This brings up a few issues:
> * The cost of implementing and maintaining ResourceIds instead of having 
> users use a standard URI implementation. This would just require FileSystems 
> to be able to take a string and give back a URI (to enable them to have 
> custom implementations in case they extend the concept of URIs with scheme 
> specific extensions).
> * The myriad of bugs that will come up because of improper usage of URI like 
> strings and the assumptions associated with them (like 
> https://issues.apache.org/jira/browse/BEAM-2277)
> Note that swapping to URIs adds complexity because:
> * Resolving URIs with glob expressions needs to be handled carefully
> * FileSystems may need to implement a complicated type instead of ResourceId.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #3130: Cherry pick #3128

2017-05-12 Thread aaltay
Github user aaltay closed the pull request at:

https://github.com/apache/beam/pull/3130


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] beam git commit: This closes #3130

2017-05-12 Thread dhalperi
This closes #3130


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/6083b47d
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/6083b47d
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/6083b47d

Branch: refs/heads/release-2.0.0
Commit: 6083b47d0078cd425f1a474127e6780263768fae
Parents: 7b598f8 7573ec1
Author: Dan Halperin 
Authored: Fri May 12 15:25:11 2017 -0700
Committer: Dan Halperin 
Committed: Fri May 12 15:25:11 2017 -0700

--
 sdks/python/apache_beam/internal/pickler.py   | 2 ++
 sdks/python/apache_beam/internal/util.py  | 5 -
 sdks/python/apache_beam/transforms/trigger.py | 6 +-
 3 files changed, 11 insertions(+), 2 deletions(-)
--




[1/2] beam git commit: [BEAM-1345] internal comments

2017-05-12 Thread dhalperi
Repository: beam
Updated Branches:
  refs/heads/release-2.0.0 7b598f878 -> 6083b47d0


[BEAM-1345] internal comments


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/7573ec10
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/7573ec10
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/7573ec10

Branch: refs/heads/release-2.0.0
Commit: 7573ec10c31742519e98ccbc2d4d4de1f05013bb
Parents: 7b598f8
Author: Ahmet Altay 
Authored: Fri May 12 14:54:33 2017 -0700
Committer: Dan Halperin 
Committed: Fri May 12 15:24:35 2017 -0700

--
 sdks/python/apache_beam/internal/pickler.py   | 2 ++
 sdks/python/apache_beam/internal/util.py  | 5 -
 sdks/python/apache_beam/transforms/trigger.py | 6 +-
 3 files changed, 11 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/7573ec10/sdks/python/apache_beam/internal/pickler.py
--
diff --git a/sdks/python/apache_beam/internal/pickler.py 
b/sdks/python/apache_beam/internal/pickler.py
index 4305379..e049a71 100644
--- a/sdks/python/apache_beam/internal/pickler.py
+++ b/sdks/python/apache_beam/internal/pickler.py
@@ -17,6 +17,8 @@
 
 """Pickler for values, functions, and classes.
 
+For internal use only. No backwards compatibility guarantees.
+
 Pickles created by the pickling library contain non-ASCII characters, so
 we base64-encode the results so that we can put them in a JSON objects.
 The pickler is used to embed FlatMap callable objects into the workflow JSON

http://git-wip-us.apache.org/repos/asf/beam/blob/7573ec10/sdks/python/apache_beam/internal/util.py
--
diff --git a/sdks/python/apache_beam/internal/util.py 
b/sdks/python/apache_beam/internal/util.py
index df4878c..dbbeafc 100644
--- a/sdks/python/apache_beam/internal/util.py
+++ b/sdks/python/apache_beam/internal/util.py
@@ -15,7 +15,10 @@
 # limitations under the License.
 #
 
-"""Utility functions used throughout the package."""
+"""Utility functions used throughout the package.
+
+For internal use only. No backwards compatibility guarantees.
+"""
 
 import logging
 from multiprocessing.pool import ThreadPool

http://git-wip-us.apache.org/repos/asf/beam/blob/7573ec10/sdks/python/apache_beam/transforms/trigger.py
--
diff --git a/sdks/python/apache_beam/transforms/trigger.py 
b/sdks/python/apache_beam/transforms/trigger.py
index 7de2f85..4200995 100644
--- a/sdks/python/apache_beam/transforms/trigger.py
+++ b/sdks/python/apache_beam/transforms/trigger.py
@@ -37,6 +37,7 @@ from apache_beam.runners.api import beam_runner_api_pb2
 from apache_beam.utils.timestamp import MAX_TIMESTAMP
 from apache_beam.utils.timestamp import MIN_TIMESTAMP
 
+# AfterCount is experimental. No backwards compatibility guarantees.
 
 __all__ = [
 'AccumulationMode',
@@ -367,7 +368,10 @@ class AfterWatermark(TriggerFn):
 
 
 class AfterCount(TriggerFn):
-  """Fire when there are at least count elements in this window pane."""
+  """Fire when there are at least count elements in this window pane.
+
+  AfterCount is experimental. No backwards compatibility guarantees.
+  """
 
   COUNT_TAG = _CombiningValueStateTag('count', combiners.CountCombineFn())
 



Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Flink #2805

2017-05-12 Thread Apache Jenkins Server
See 




[jira] [Closed] (BEAM-2279) Hadoop file system support should be included in examples/archetype profiles of Spark runner.

2017-05-12 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin closed BEAM-2279.
-
Resolution: Fixed

Thanks Aviem!

> Hadoop file system support should be included in examples/archetype profiles 
> of Spark runner.
> -
>
> Key: BEAM-2279
> URL: https://issues.apache.org/jira/browse/BEAM-2279
> Project: Beam
>  Issue Type: Bug
>  Components: examples-java
>Reporter: Aviem Zur
>Assignee: Aviem Zur
>  Labels: newbie, starter
> Fix For: 2.0.0
>
>
> Hadoop file system support (dependency on 
> {{beam-sdks-java-io-hadoop-file-system}}) should be included in any 
> archetype/examples profiles of Spark runner.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (BEAM-2024) Python and Java SDKs differ on name of AfterCount

2017-05-12 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin updated BEAM-2024:
--
Summary: Python and Java SDKs differ on name of AfterCount  (was: Python 
and Java Dataflow refer to AfterCount, but Java Beam does not have AfterCount)

> Python and Java SDKs differ on name of AfterCount
> -
>
> Key: BEAM-2024
> URL: https://issues.apache.org/jira/browse/BEAM-2024
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py
>Reporter: Wesley Tanaka
>Assignee: Ahmet Altay
>Priority: Minor
>
> * https://cloud.google.com/dataflow/model/triggers#window-accumulation-modes 
> refers to an AfterCount trigger
> * There is no AfterCount trigger in java SDK
> * There is an AfterCount trigger in python SDK
> My wild guess is that AfterCount got renamed between Dataflow and Beam, but 
> that the Python Beam SDK did not also apply the same rename?
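
For reference, the closest Java SDK equivalent is 
AfterPane.elementCountAtLeast, while Python spells the class AfterCount. A 
minimal Java windowing sketch of the spelling difference (TriggerNaming is 
illustrative only):

{code:java}
import org.apache.beam.sdk.transforms.windowing.AfterPane;
import org.apache.beam.sdk.transforms.windowing.FixedWindows;
import org.apache.beam.sdk.transforms.windowing.Repeatedly;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.joda.time.Duration;

public class TriggerNaming {
  /** Fires a pane every 10 elements within one-minute fixed windows. */
  static <T> Window<T> every10Elements() {
    return Window.<T>into(FixedWindows.of(Duration.standardMinutes(1)))
        .triggering(Repeatedly.forever(AfterPane.elementCountAtLeast(10)))
        .withAllowedLateness(Duration.ZERO)
        .discardingFiredPanes();
  }
}
{code}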



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #3130: Cherry pick #3128

2017-05-12 Thread aaltay
GitHub user aaltay opened a pull request:

https://github.com/apache/beam/pull/3130

Cherry pick #3128

R: @dhalperi 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aaltay/beam rcla1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3130.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3130


commit 056fcd50270896fda08f084ff06bbcb4056c3011
Author: Ahmet Altay 
Date:   2017-05-12T21:54:33Z

internal comments




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2024) Python and Java Dataflow refer to AfterCount, but Java Beam does not have AfterCount

2017-05-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008796#comment-16008796
 ] 

ASF GitHub Bot commented on BEAM-2024:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3128


> Python and Java Dataflow refer to AfterCount, but Java Beam does not have 
> AfterCount
> 
>
> Key: BEAM-2024
> URL: https://issues.apache.org/jira/browse/BEAM-2024
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py
>Reporter: Wesley Tanaka
>Assignee: Ahmet Altay
>Priority: Minor
>
> * https://cloud.google.com/dataflow/model/triggers#window-accumulation-modes 
> refers to an AfterCount trigger
> * There is no AfterCount trigger in java SDK
> * There is an AfterCount trigger in python SDK
> My wild guess is that AfterCount got renamed between Dataflow and Beam, but 
> that the Python Beam SDK did not also apply the same rename?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #2804

2017-05-12 Thread Apache Jenkins Server
See 




[GitHub] beam pull request #3128: [BEAM-2024] internal comments

2017-05-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3128


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] beam git commit: internal comments

2017-05-12 Thread altay
Repository: beam
Updated Branches:
  refs/heads/master 9f81fd299 -> 89289e498


internal comments


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/f53ceecf
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/f53ceecf
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/f53ceecf

Branch: refs/heads/master
Commit: f53ceecfb18a52aeeab8f1869c189e4aa2f56d35
Parents: 9f81fd2
Author: Ahmet Altay 
Authored: Fri May 12 14:54:33 2017 -0700
Committer: Ahmet Altay 
Committed: Fri May 12 15:08:04 2017 -0700

--
 sdks/python/apache_beam/internal/pickler.py   | 2 ++
 sdks/python/apache_beam/internal/util.py  | 5 -
 sdks/python/apache_beam/transforms/trigger.py | 6 +-
 3 files changed, 11 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/f53ceecf/sdks/python/apache_beam/internal/pickler.py
--
diff --git a/sdks/python/apache_beam/internal/pickler.py 
b/sdks/python/apache_beam/internal/pickler.py
index 4305379..e049a71 100644
--- a/sdks/python/apache_beam/internal/pickler.py
+++ b/sdks/python/apache_beam/internal/pickler.py
@@ -17,6 +17,8 @@
 
 """Pickler for values, functions, and classes.
 
+For internal use only. No backwards compatibility guarantees.
+
 Pickles created by the pickling library contain non-ASCII characters, so
 we base64-encode the results so that we can put them in a JSON objects.
 The pickler is used to embed FlatMap callable objects into the workflow JSON

http://git-wip-us.apache.org/repos/asf/beam/blob/f53ceecf/sdks/python/apache_beam/internal/util.py
--
diff --git a/sdks/python/apache_beam/internal/util.py 
b/sdks/python/apache_beam/internal/util.py
index df4878c..dbbeafc 100644
--- a/sdks/python/apache_beam/internal/util.py
+++ b/sdks/python/apache_beam/internal/util.py
@@ -15,7 +15,10 @@
 # limitations under the License.
 #
 
-"""Utility functions used throughout the package."""
+"""Utility functions used throughout the package.
+
+For internal use only. No backwards compatibility guarantees.
+"""
 
 import logging
 from multiprocessing.pool import ThreadPool

http://git-wip-us.apache.org/repos/asf/beam/blob/f53ceecf/sdks/python/apache_beam/transforms/trigger.py
--
diff --git a/sdks/python/apache_beam/transforms/trigger.py 
b/sdks/python/apache_beam/transforms/trigger.py
index 7de2f85..4200995 100644
--- a/sdks/python/apache_beam/transforms/trigger.py
+++ b/sdks/python/apache_beam/transforms/trigger.py
@@ -37,6 +37,7 @@ from apache_beam.runners.api import beam_runner_api_pb2
 from apache_beam.utils.timestamp import MAX_TIMESTAMP
 from apache_beam.utils.timestamp import MIN_TIMESTAMP
 
+# AfterCount is experimental. No backwards compatibility guarantees.
 
 __all__ = [
 'AccumulationMode',
@@ -367,7 +368,10 @@ class AfterWatermark(TriggerFn):
 
 
 class AfterCount(TriggerFn):
-  """Fire when there are at least count elements in this window pane."""
+  """Fire when there are at least count elements in this window pane.
+
+  AfterCount is experimental. No backwards compatibility guarantees.
+  """
 
   COUNT_TAG = _CombiningValueStateTag('count', combiners.CountCombineFn())
 



[2/2] beam git commit: This closes #3128

2017-05-12 Thread altay
This closes #3128


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/89289e49
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/89289e49
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/89289e49

Branch: refs/heads/master
Commit: 89289e498e71d6e9fbd24702032471fa13fbd46c
Parents: 9f81fd2 f53ceec
Author: Ahmet Altay 
Authored: Fri May 12 15:08:16 2017 -0700
Committer: Ahmet Altay 
Committed: Fri May 12 15:08:16 2017 -0700

--
 sdks/python/apache_beam/internal/pickler.py   | 2 ++
 sdks/python/apache_beam/internal/util.py  | 5 -
 sdks/python/apache_beam/transforms/trigger.py | 6 +-
 3 files changed, 11 insertions(+), 2 deletions(-)
--




[jira] [Commented] (BEAM-2277) IllegalArgumentException when using Hadoop file system for WordCount example.

2017-05-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008787#comment-16008787
 ] 

ASF GitHub Bot commented on BEAM-2277:
--

Github user dhalperi closed the pull request at:

https://github.com/apache/beam/pull/3129


> IllegalArgumentException when using Hadoop file system for WordCount example.
> -
>
> Key: BEAM-2277
> URL: https://issues.apache.org/jira/browse/BEAM-2277
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Aviem Zur
>Assignee: Aviem Zur
>Priority: Blocker
> Fix For: 2.0.0
>
>
> IllegalArgumentException when using Hadoop file system for WordCount example.
> Occurred when running WordCount example using Spark runner on a YARN cluster.
> Command-line arguments:
> {code:none}
> --runner=SparkRunner --inputFile=hdfs:///user/myuser/kinglear.txt 
> --output=hdfs:///user/myuser/wc/wc
> {code}
> Stack trace:
> {code:none}
> java.lang.IllegalArgumentException: Expect srcResourceIds and destResourceIds 
> have the same scheme, but received file, hdfs.
>   at 
> org.apache.beam.sdk.repackaged.com.google.common.base.Preconditions.checkArgument(Preconditions.java:122)
>   at 
> org.apache.beam.sdk.io.FileSystems.validateSrcDestLists(FileSystems.java:394)
>   at org.apache.beam.sdk.io.FileSystems.copy(FileSystems.java:236)
>   at 
> org.apache.beam.sdk.io.FileBasedSink$WriteOperation.copyToOutputFiles(FileBasedSink.java:626)
>   at 
> org.apache.beam.sdk.io.FileBasedSink$WriteOperation.finalize(FileBasedSink.java:516)
>   at 
> org.apache.beam.sdk.io.WriteFiles$2.processElement(WriteFiles.java:592)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-2277) IllegalArgumentException when using Hadoop file system for WordCount example.

2017-05-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008786#comment-16008786
 ] 

ASF GitHub Bot commented on BEAM-2277:
--

GitHub user dhalperi opened a pull request:

https://github.com/apache/beam/pull/3129

[BEAM-2277] Cherrypick #3121 to release-2.0.0

fbb0de129d Remove '/' entirely from determining FileSystem scheme
a6a5ff7be3 [BEAM-2277] Add ResourceIdTester and test existing ResourceId 
implementations
ec956c85ef Mark FileSystem and related as Experimental
15df211c75 [BEAM-2277] HadoopFileSystem: normalize implementation
f3540d47f1 Rename FileSystems.setDefaultConfigInWorkers
3921163829 Fix shading of guava testlib

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/beam cp3121

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3129.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3129


commit 8aa4a8a9b06bf1c356fc0686b26c1ea5d9983ffb
Author: Dan Halperin 
Date:   2017-05-12T21:59:14Z

[BEAM-2277] Cherrypick #3121 to release-2.0.0

fbb0de129d Remove '/' entirely from determining FileSystem scheme
a6a5ff7be3 [BEAM-2277] Add ResourceIdTester and test existing ResourceId 
implementations
ec956c85ef Mark FileSystem and related as Experimental
15df211c75 [BEAM-2277] HadoopFileSystem: normalize implementation
f3540d47f1 Rename FileSystems.setDefaultConfigInWorkers
3921163829 Fix shading of guava testlib




> IllegalArgumentException when using Hadoop file system for WordCount example.
> -
>
> Key: BEAM-2277
> URL: https://issues.apache.org/jira/browse/BEAM-2277
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Aviem Zur
>Assignee: Aviem Zur
>Priority: Blocker
> Fix For: 2.0.0
>
>
> IllegalArgumentException when using Hadoop file system for WordCount example.
> Occurred when running WordCount example using Spark runner on a YARN cluster.
> Command-line arguments:
> {code:none}
> --runner=SparkRunner --inputFile=hdfs:///user/myuser/kinglear.txt 
> --output=hdfs:///user/myuser/wc/wc
> {code}
> Stack trace:
> {code:none}
> java.lang.IllegalArgumentException: Expect srcResourceIds and destResourceIds 
> have the same scheme, but received file, hdfs.
>   at 
> org.apache.beam.sdk.repackaged.com.google.common.base.Preconditions.checkArgument(Preconditions.java:122)
>   at 
> org.apache.beam.sdk.io.FileSystems.validateSrcDestLists(FileSystems.java:394)
>   at org.apache.beam.sdk.io.FileSystems.copy(FileSystems.java:236)
>   at 
> org.apache.beam.sdk.io.FileBasedSink$WriteOperation.copyToOutputFiles(FileBasedSink.java:626)
>   at 
> org.apache.beam.sdk.io.FileBasedSink$WriteOperation.finalize(FileBasedSink.java:516)
>   at 
> org.apache.beam.sdk.io.WriteFiles$2.processElement(WriteFiles.java:592)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #3129: [BEAM-2277] Cherrypick #3121 to release-2.0.0

2017-05-12 Thread dhalperi
Github user dhalperi closed the pull request at:

https://github.com/apache/beam/pull/3129


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] beam git commit: [BEAM-2277] Cherrypick #3121 to release-2.0.0

2017-05-12 Thread dhalperi
Repository: beam
Updated Branches:
  refs/heads/release-2.0.0 c807fec9b -> 7b598f878


[BEAM-2277] Cherrypick #3121 to release-2.0.0

fbb0de129d Remove '/' entirely from determining FileSystem scheme
a6a5ff7be3 [BEAM-2277] Add ResourceIdTester and test existing ResourceId 
implementations
ec956c85ef Mark FileSystem and related as Experimental
15df211c75 [BEAM-2277] HadoopFileSystem: normalize implementation
f3540d47f1 Rename FileSystems.setDefaultConfigInWorkers
3921163829 Fix shading of guava testlib


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/e6ee1d8b
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/e6ee1d8b
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/e6ee1d8b

Branch: refs/heads/release-2.0.0
Commit: e6ee1d8b98bcee6ee8c6a80b4af6646990e0c399
Parents: c807fec
Author: Dan Halperin 
Authored: Fri May 12 14:59:14 2017 -0700
Committer: Dan Halperin 
Committed: Fri May 12 15:03:50 2017 -0700

--
 pom.xml |   4 +
 .../utils/SerializablePipelineOptions.java  |   2 +-
 runners/direct-java/pom.xml |   4 +
 .../utils/SerializedPipelineOptions.java|   2 +-
 .../DataflowPipelineTranslatorTest.java |   2 +-
 .../runners/dataflow/DataflowRunnerTest.java|   6 +-
 .../options/DataflowPipelineOptionsTest.java|   4 +-
 .../runners/dataflow/util/PackageUtilTest.java  |   2 +-
 .../spark/translation/SparkRuntimeContext.java  |   2 +-
 sdks/java/core/pom.xml  |   4 +
 .../org/apache/beam/sdk/PipelineRunner.java |   2 +-
 .../beam/sdk/annotations/Experimental.java  |   7 +
 .../java/org/apache/beam/sdk/io/AvroIO.java |   4 +
 .../org/apache/beam/sdk/io/FileBasedSink.java   |  21 ++-
 .../java/org/apache/beam/sdk/io/FileSystem.java |   3 +
 .../apache/beam/sdk/io/FileSystemRegistrar.java |   3 +
 .../org/apache/beam/sdk/io/FileSystems.java |  21 ++-
 .../beam/sdk/io/LocalFileSystemRegistrar.java   |   3 +
 .../org/apache/beam/sdk/io/LocalResources.java  |   3 +
 .../java/org/apache/beam/sdk/io/TFRecordIO.java |   4 +
 .../java/org/apache/beam/sdk/io/TextIO.java |   4 +
 .../org/apache/beam/sdk/io/fs/ResourceId.java   |   3 +
 .../apache/beam/sdk/testing/TestPipeline.java   |   2 +-
 .../apache/beam/sdk/io/LocalResourceIdTest.java |   6 +
 .../apache/beam/sdk/io/fs/ResourceIdTester.java | 150 +++
 .../google-cloud-platform-core/pom.xml  |   6 +
 .../gcp/storage/GcsFileSystemRegistrar.java |   5 +-
 .../gcp/storage/GcsResourceIdTest.java  |   9 ++
 sdks/java/io/hadoop-file-system/pom.xml |  13 ++
 .../beam/sdk/io/hdfs/HadoopFileSystem.java  |  32 ++--
 .../sdk/io/hdfs/HadoopFileSystemOptions.java|   3 +
 .../sdk/io/hdfs/HadoopFileSystemRegistrar.java  |   3 +
 .../beam/sdk/io/hdfs/HadoopResourceId.java  |  16 +-
 .../beam/sdk/io/hdfs/HadoopFileSystemTest.java  |   5 +-
 .../beam/sdk/io/hdfs/HadoopResourceIdTest.java  |  71 +
 35 files changed, 395 insertions(+), 36 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/e6ee1d8b/pom.xml
--
diff --git a/pom.xml b/pom.xml
index 3d02096..8f96acc 100644
--- a/pom.xml
+++ b/pom.xml
@@ -1446,6 +1446,10 @@
 
   
 com.google.common
+
+  
+  com.google.common.**.testing.*
+
 
 
   
org.apache.${renderedArtifactId}.repackaged.com.google.common

http://git-wip-us.apache.org/repos/asf/beam/blob/e6ee1d8b/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/utils/SerializablePipelineOptions.java
--
diff --git 
a/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/utils/SerializablePipelineOptions.java
 
b/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/utils/SerializablePipelineOptions.java
index 02afa7a..46b04fc 100644
--- 
a/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/utils/SerializablePipelineOptions.java
+++ 
b/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/utils/SerializablePipelineOptions.java
@@ -62,7 +62,7 @@ public class SerializablePipelineOptions implements 
Externalizable {
 .as(ApexPipelineOptions.class);
 
 if (FILE_SYSTEMS_INTIIALIZED.compareAndSet(false, true)) {
-  FileSystems.setDefaultConfigInWorkers(pipelineOptions);
+  FileSystems.setDefaultPipelineOptions(pipelineOptions);
 }
   }
 

http://git-wip-us.apache.org/repos/asf/beam/blob/e6ee1d8b/runners/direct-java/pom.xml
--

[2/2] beam git commit: This closes #3129

2017-05-12 Thread dhalperi
This closes #3129


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/7b598f87
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/7b598f87
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/7b598f87

Branch: refs/heads/release-2.0.0
Commit: 7b598f8781211c165bf183139d673a407993c32f
Parents: c807fec e6ee1d8
Author: Dan Halperin 
Authored: Fri May 12 15:03:54 2017 -0700
Committer: Dan Halperin 
Committed: Fri May 12 15:03:54 2017 -0700

--
 pom.xml |   4 +
 .../utils/SerializablePipelineOptions.java  |   2 +-
 runners/direct-java/pom.xml |   4 +
 .../utils/SerializedPipelineOptions.java|   2 +-
 .../DataflowPipelineTranslatorTest.java |   2 +-
 .../runners/dataflow/DataflowRunnerTest.java|   6 +-
 .../options/DataflowPipelineOptionsTest.java|   4 +-
 .../runners/dataflow/util/PackageUtilTest.java  |   2 +-
 .../spark/translation/SparkRuntimeContext.java  |   2 +-
 sdks/java/core/pom.xml  |   4 +
 .../org/apache/beam/sdk/PipelineRunner.java |   2 +-
 .../beam/sdk/annotations/Experimental.java  |   7 +
 .../java/org/apache/beam/sdk/io/AvroIO.java |   4 +
 .../org/apache/beam/sdk/io/FileBasedSink.java   |  21 ++-
 .../java/org/apache/beam/sdk/io/FileSystem.java |   3 +
 .../apache/beam/sdk/io/FileSystemRegistrar.java |   3 +
 .../org/apache/beam/sdk/io/FileSystems.java |  21 ++-
 .../beam/sdk/io/LocalFileSystemRegistrar.java   |   3 +
 .../org/apache/beam/sdk/io/LocalResources.java  |   3 +
 .../java/org/apache/beam/sdk/io/TFRecordIO.java |   4 +
 .../java/org/apache/beam/sdk/io/TextIO.java |   4 +
 .../org/apache/beam/sdk/io/fs/ResourceId.java   |   3 +
 .../apache/beam/sdk/testing/TestPipeline.java   |   2 +-
 .../apache/beam/sdk/io/LocalResourceIdTest.java |   6 +
 .../apache/beam/sdk/io/fs/ResourceIdTester.java | 150 +++
 .../google-cloud-platform-core/pom.xml  |   6 +
 .../gcp/storage/GcsFileSystemRegistrar.java |   5 +-
 .../gcp/storage/GcsResourceIdTest.java  |   9 ++
 sdks/java/io/hadoop-file-system/pom.xml |  13 ++
 .../beam/sdk/io/hdfs/HadoopFileSystem.java  |  32 ++--
 .../sdk/io/hdfs/HadoopFileSystemOptions.java|   3 +
 .../sdk/io/hdfs/HadoopFileSystemRegistrar.java  |   3 +
 .../beam/sdk/io/hdfs/HadoopResourceId.java  |  16 +-
 .../beam/sdk/io/hdfs/HadoopFileSystemTest.java  |   5 +-
 .../beam/sdk/io/hdfs/HadoopResourceIdTest.java  |  71 +
 35 files changed, 395 insertions(+), 36 deletions(-)
--




[GitHub] beam pull request #3129: [BEAM-2277] Cherrypick #3121 to release-2.0.0

2017-05-12 Thread dhalperi
GitHub user dhalperi opened a pull request:

https://github.com/apache/beam/pull/3129

[BEAM-2277] Cherrypick #3121 to release-2.0.0

fbb0de129d Remove '/' entirely from determining FileSystem scheme
a6a5ff7be3 [BEAM-2277] Add ResourceIdTester and test existing ResourceId 
implementations
ec956c85ef Mark FileSystem and related as Experimental
15df211c75 [BEAM-2277] HadoopFileSystem: normalize implementation
f3540d47f1 Rename FileSystems.setDefaultConfigInWorkers
3921163829 Fix shading of guava testlib

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/beam cp3121

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3129.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3129


commit 8aa4a8a9b06bf1c356fc0686b26c1ea5d9983ffb
Author: Dan Halperin 
Date:   2017-05-12T21:59:14Z

[BEAM-2277] Cherrypick #3121 to release-2.0.0

fbb0de129d Remove '/' entirely from determining FileSystem scheme
a6a5ff7be3 [BEAM-2277] Add ResourceIdTester and test existing ResourceId 
implementations
ec956c85ef Mark FileSystem and related as Experimental
15df211c75 [BEAM-2277] HadoopFileSystem: normalize implementation
f3540d47f1 Rename FileSystems.setDefaultConfigInWorkers
3921163829 Fix shading of guava testlib




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2277) IllegalArgumentException when using Hadoop file system for WordCount example.

2017-05-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16008780#comment-16008780
 ] 

ASF GitHub Bot commented on BEAM-2277:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3121


> IllegalArgumentException when using Hadoop file system for WordCount example.
> -
>
> Key: BEAM-2277
> URL: https://issues.apache.org/jira/browse/BEAM-2277
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Aviem Zur
>Assignee: Aviem Zur
>Priority: Blocker
> Fix For: 2.0.0
>
>
> IllegalArgumentException when using Hadoop file system for WordCount example.
> Occurred when running WordCount example using Spark runner on a YARN cluster.
> Command-line arguments:
> {code:none}
> --runner=SparkRunner --inputFile=hdfs:///user/myuser/kinglear.txt 
> --output=hdfs:///user/myuser/wc/wc
> {code}
> Stack trace:
> {code:none}
> java.lang.IllegalArgumentException: Expect srcResourceIds and destResourceIds 
> have the same scheme, but received file, hdfs.
>   at 
> org.apache.beam.sdk.repackaged.com.google.common.base.Preconditions.checkArgument(Preconditions.java:122)
>   at 
> org.apache.beam.sdk.io.FileSystems.validateSrcDestLists(FileSystems.java:394)
>   at org.apache.beam.sdk.io.FileSystems.copy(FileSystems.java:236)
>   at 
> org.apache.beam.sdk.io.FileBasedSink$WriteOperation.copyToOutputFiles(FileBasedSink.java:626)
>   at 
> org.apache.beam.sdk.io.FileBasedSink$WriteOperation.finalize(FileBasedSink.java:516)
>   at 
> org.apache.beam.sdk.io.WriteFiles$2.processElement(WriteFiles.java:592)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #3121: [BEAM-2277] Add ResourceIdTester and test existing ...

2017-05-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3121


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[5/7] beam git commit: Mark FileSystem and related as Experimental

2017-05-12 Thread dhalperi
Mark FileSystem and related as Experimental


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/ec956c85
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/ec956c85
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/ec956c85

Branch: refs/heads/master
Commit: ec956c85efa16d00c5e218ee1257b8ee62a2013d
Parents: a6a5ff7
Author: Dan Halperin 
Authored: Fri May 12 11:38:54 2017 -0700
Committer: Dan Halperin 
Committed: Fri May 12 14:59:10 2017 -0700

--
 .../beam/sdk/annotations/Experimental.java  |  7 +++
 .../java/org/apache/beam/sdk/io/AvroIO.java |  4 
 .../org/apache/beam/sdk/io/FileBasedSink.java   | 21 +++-
 .../java/org/apache/beam/sdk/io/FileSystem.java |  3 +++
 .../apache/beam/sdk/io/FileSystemRegistrar.java |  3 +++
 .../org/apache/beam/sdk/io/FileSystems.java |  3 +++
 .../beam/sdk/io/LocalFileSystemRegistrar.java   |  3 +++
 .../org/apache/beam/sdk/io/LocalResources.java  |  3 +++
 .../java/org/apache/beam/sdk/io/TFRecordIO.java |  4 
 .../java/org/apache/beam/sdk/io/TextIO.java |  4 
 .../org/apache/beam/sdk/io/fs/ResourceId.java   |  3 +++
 .../gcp/storage/GcsFileSystemRegistrar.java |  3 +++
 .../sdk/io/hdfs/HadoopFileSystemOptions.java|  3 +++
 .../sdk/io/hdfs/HadoopFileSystemRegistrar.java  |  3 +++
 14 files changed, 66 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/ec956c85/sdks/java/core/src/main/java/org/apache/beam/sdk/annotations/Experimental.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/annotations/Experimental.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/annotations/Experimental.java
index ac02465..8224ebb 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/annotations/Experimental.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/annotations/Experimental.java
@@ -23,6 +23,7 @@ import java.lang.annotation.Retention;
 import java.lang.annotation.RetentionPolicy;
 import java.lang.annotation.Target;
 
+
 /**
  * Signifies that a public API (public class, method or field) is subject to 
incompatible changes,
  * or even removal, in a future release.
@@ -79,6 +80,12 @@ public @interface Experimental {
 /** Metrics-related experimental APIs. */
 METRICS,
 
+/**
+ * {@link org.apache.beam.sdk.io.FileSystem} and {@link 
org.apache.beam.sdk.io.fs.ResourceId}
+ * related APIs.
+ */
+FILESYSTEM,
+
 /** Experimental feature related to alternative, unnested encodings for 
coders. */
 CODER_CONTEXT,
 

http://git-wip-us.apache.org/repos/asf/beam/blob/ec956c85/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
index d13c6ff..6af0e79 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
@@ -30,6 +30,8 @@ import org.apache.avro.Schema;
 import org.apache.avro.file.CodecFactory;
 import org.apache.avro.generic.GenericRecord;
 import org.apache.avro.reflect.ReflectData;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.annotations.Experimental.Kind;
 import org.apache.beam.sdk.coders.AvroCoder;
 import org.apache.beam.sdk.coders.Coder;
 import org.apache.beam.sdk.coders.VoidCoder;
@@ -313,6 +315,7 @@ public class AvroIO {
  * a common suffix (if supplied using {@link #withSuffix(String)}). This 
default can be
  * overridden using {@link #withFilenamePolicy(FilenamePolicy)}.
  */
+@Experimental(Kind.FILESYSTEM)
 public Write to(ResourceId outputPrefix) {
   return toResource(StaticValueProvider.of(outputPrefix));
 }
@@ -333,6 +336,7 @@ public class AvroIO {
 /**
  * Like {@link #to(ResourceId)}.
  */
+@Experimental(Kind.FILESYSTEM)
 public Write toResource(ValueProvider outputPrefix) {
   return toBuilder().setFilenamePrefix(outputPrefix).build();
 }

http://git-wip-us.apache.org/repos/asf/beam/blob/ec956c85/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java
index 7f729a7..8102316 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileBasedSink.java
@@ -47,6 +47,8 @@ import java.util.Set;
 import java.util.con

[7/7] beam git commit: This closes #3121

2017-05-12 Thread dhalperi
This closes #3121


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/9f81fd29
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/9f81fd29
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/9f81fd29

Branch: refs/heads/master
Commit: 9f81fd299bd32e0d6056a7da9fa994cf74db0ed9
Parents: c2a3628 3921163
Author: Dan Halperin 
Authored: Fri May 12 14:59:14 2017 -0700
Committer: Dan Halperin 
Committed: Fri May 12 14:59:14 2017 -0700

--
 pom.xml |   4 +
 .../utils/SerializablePipelineOptions.java  |   2 +-
 runners/direct-java/pom.xml |   4 +
 .../utils/SerializedPipelineOptions.java|   2 +-
 .../DataflowPipelineTranslatorTest.java |   2 +-
 .../runners/dataflow/DataflowRunnerTest.java|   6 +-
 .../options/DataflowPipelineOptionsTest.java|   4 +-
 .../runners/dataflow/util/PackageUtilTest.java  |   2 +-
 .../spark/translation/SparkRuntimeContext.java  |   2 +-
 sdks/java/core/pom.xml  |   4 +
 .../org/apache/beam/sdk/PipelineRunner.java |   2 +-
 .../beam/sdk/annotations/Experimental.java  |   7 +
 .../java/org/apache/beam/sdk/io/AvroIO.java |   4 +
 .../org/apache/beam/sdk/io/FileBasedSink.java   |  21 ++-
 .../java/org/apache/beam/sdk/io/FileSystem.java |   3 +
 .../apache/beam/sdk/io/FileSystemRegistrar.java |   3 +
 .../org/apache/beam/sdk/io/FileSystems.java |  21 ++-
 .../beam/sdk/io/LocalFileSystemRegistrar.java   |   3 +
 .../org/apache/beam/sdk/io/LocalResources.java  |   3 +
 .../java/org/apache/beam/sdk/io/TFRecordIO.java |   4 +
 .../java/org/apache/beam/sdk/io/TextIO.java |   4 +
 .../org/apache/beam/sdk/io/fs/ResourceId.java   |   3 +
 .../apache/beam/sdk/testing/TestPipeline.java   |   2 +-
 .../apache/beam/sdk/io/LocalResourceIdTest.java |   6 +
 .../apache/beam/sdk/io/fs/ResourceIdTester.java | 150 +++
 .../google-cloud-platform-core/pom.xml  |   6 +
 .../gcp/storage/GcsFileSystemRegistrar.java |   5 +-
 .../gcp/storage/GcsResourceIdTest.java  |   9 ++
 sdks/java/io/hadoop-file-system/pom.xml |  13 ++
 .../beam/sdk/io/hdfs/HadoopFileSystem.java  |  32 ++--
 .../sdk/io/hdfs/HadoopFileSystemOptions.java|   3 +
 .../sdk/io/hdfs/HadoopFileSystemRegistrar.java  |   3 +
 .../beam/sdk/io/hdfs/HadoopResourceId.java  |  16 +-
 .../beam/sdk/io/hdfs/HadoopFileSystemTest.java  |   5 +-
 .../beam/sdk/io/hdfs/HadoopResourceIdTest.java  |  71 +
 35 files changed, 395 insertions(+), 36 deletions(-)
--




[6/7] beam git commit: [BEAM-2277] HadoopFileSystem: normalize implementation

2017-05-12 Thread dhalperi
[BEAM-2277] HadoopFileSystem: normalize implementation

* Drop empty authority always
* Resolve directories
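
A rough sketch of the "drop empty authority" step above under simple string 
handling; the real helper in HadoopFileSystem is only partially visible in the 
truncated diff below, so DropEmptyAuthorityDemo is an approximation:

{code:java}
import java.net.URI;

public class DropEmptyAuthorityDemo {
  /** Rewrites "hdfs:///path" (empty authority) as "hdfs:/path"; leaves real authorities alone. */
  static String dropEmptyAuthority(String uriString) {
    String scheme = URI.create(uriString).getScheme();
    String emptyAuthorityPrefix = scheme + ":///";
    if (uriString.startsWith(emptyAuthorityPrefix)) {
      return scheme + ":/" + uriString.substring(emptyAuthorityPrefix.length());
    }
    return uriString;
  }

  public static void main(String[] args) {
    System.out.println(dropEmptyAuthority("hdfs:///user/myuser/wc/wc")); // hdfs:/user/myuser/wc/wc
    System.out.println(dropEmptyAuthority("hdfs://namenode/user/x"));    // unchanged
  }
}
{code}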


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/15df211c
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/15df211c
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/15df211c

Branch: refs/heads/master
Commit: 15df211c758e7c8f05c3136f25bbe18e3f394321
Parents: ec956c8
Author: Dan Halperin 
Authored: Fri May 12 11:11:06 2017 -0700
Committer: Dan Halperin 
Committed: Fri May 12 14:59:10 2017 -0700

--
 .../apache/beam/sdk/io/fs/ResourceIdTester.java | 15 +
 .../beam/sdk/io/hdfs/HadoopFileSystem.java  | 32 +++-
 .../beam/sdk/io/hdfs/HadoopResourceId.java  | 16 +-
 .../beam/sdk/io/hdfs/HadoopFileSystemTest.java  |  3 +-
 .../beam/sdk/io/hdfs/HadoopResourceIdTest.java  | 22 +-
 5 files changed, 56 insertions(+), 32 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/15df211c/sdks/java/core/src/test/java/org/apache/beam/sdk/io/fs/ResourceIdTester.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/fs/ResourceIdTester.java 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/fs/ResourceIdTester.java
index fe50ada..8ceaeed 100644
--- 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/fs/ResourceIdTester.java
+++ 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/fs/ResourceIdTester.java
@@ -21,9 +21,8 @@ import static 
com.google.common.base.Preconditions.checkArgument;
 import static 
org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions.RESOLVE_DIRECTORY;
 import static 
org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions.RESOLVE_FILE;
 import static org.hamcrest.Matchers.equalTo;
-import static org.junit.Assert.assertFalse;
+import static org.hamcrest.Matchers.is;
 import static org.junit.Assert.assertThat;
-import static org.junit.Assert.assertTrue;
 import static org.junit.Assert.fail;
 
 import com.google.common.testing.EqualsTester;
@@ -66,16 +65,16 @@ public final class ResourceIdTester {
 ResourceId file2a = baseDirectory.resolve("child2", RESOLVE_FILE);
 allResourceIds.add(file1);
 allResourceIds.add(file2);
-assertFalse("Resolved file isDirectory()", file1.isDirectory());
-assertFalse("Resolved file isDirectory()", file2.isDirectory());
-assertFalse("Resolved file isDirectory()", file2a.isDirectory());
+assertThat("Resolved file isDirectory()", file1.isDirectory(), is(false));
+assertThat("Resolved file isDirectory()", file2.isDirectory(), is(false));
+assertThat("Resolved file isDirectory()", file2a.isDirectory(), is(false));
 
 ResourceId dir1 = baseDirectory.resolve("child1", RESOLVE_DIRECTORY);
 ResourceId dir2 = baseDirectory.resolve("child2", RESOLVE_DIRECTORY);
 ResourceId dir2a = baseDirectory.resolve("child2", RESOLVE_DIRECTORY);
-assertTrue("Resolved directory isDirectory()", dir1.isDirectory());
-assertTrue("Resolved directory isDirectory()", dir2.isDirectory());
-assertTrue("Resolved directory isDirectory()", dir2a.isDirectory());
+assertThat("Resolved directory isDirectory()", dir1.isDirectory(), 
is(true));
+assertThat("Resolved directory isDirectory()", dir2.isDirectory(), 
is(true));
+assertThat("Resolved directory isDirectory()", dir2a.isDirectory(), 
is(true));
 allResourceIds.add(dir1);
 allResourceIds.add(dir2);
 

http://git-wip-us.apache.org/repos/asf/beam/blob/15df211c/sdks/java/io/hadoop-file-system/src/main/java/org/apache/beam/sdk/io/hdfs/HadoopFileSystem.java
--
diff --git 
a/sdks/java/io/hadoop-file-system/src/main/java/org/apache/beam/sdk/io/hdfs/HadoopFileSystem.java
 
b/sdks/java/io/hadoop-file-system/src/main/java/org/apache/beam/sdk/io/hdfs/HadoopFileSystem.java
index 154a818..d519a8c 100644
--- 
a/sdks/java/io/hadoop-file-system/src/main/java/org/apache/beam/sdk/io/hdfs/HadoopFileSystem.java
+++ 
b/sdks/java/io/hadoop-file-system/src/main/java/org/apache/beam/sdk/io/hdfs/HadoopFileSystem.java
@@ -21,7 +21,6 @@ import com.google.common.annotations.VisibleForTesting;
 import com.google.common.collect.ImmutableList;
 import java.io.IOException;
 import java.net.URI;
-import java.net.URISyntaxException;
 import java.nio.ByteBuffer;
 import java.nio.channels.Channels;
 import java.nio.channels.ReadableByteChannel;
@@ -82,8 +81,9 @@ class HadoopFileSystem extends FileSystem {
 List metadata = new ArrayList<>();
 for (FileStatus fileStatus : fileStatuses) {
   if (fileStatus.isFile()) {
+URI uri = 
dropEmptyAuthority(fileStatus.getPath().toUri().toString());
 metadata.add(Metadata.

[4/7] beam git commit: Remove '/' entirely from determining FileSystem scheme

2017-05-12 Thread dhalperi
Remove '/' entirely from determining FileSystem scheme


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/fbb0de12
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/fbb0de12
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/fbb0de12

Branch: refs/heads/master
Commit: fbb0de129d881b6b6a1ae37fc8d75075c9d8df86
Parents: c2a3628
Author: Dan Halperin 
Authored: Fri May 12 11:13:00 2017 -0700
Committer: Dan Halperin 
Committed: Fri May 12 14:59:10 2017 -0700

--
 .../core/src/main/java/org/apache/beam/sdk/io/FileSystems.java | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/fbb0de12/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystems.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystems.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystems.java
index 08e0def..4341fab 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystems.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystems.java
@@ -66,8 +66,8 @@ import org.apache.beam.sdk.values.KV;
 public class FileSystems {
 
   public static final String DEFAULT_SCHEME = "default";
-  private static final Pattern URI_SCHEME_PATTERN = Pattern.compile(
-  "(?[a-zA-Z][-a-zA-Z0-9+.]*):/.*");
+  private static final Pattern FILE_SCHEME_PATTERN =
+  Pattern.compile("(?[a-zA-Z][-a-zA-Z0-9+.]*):.*");
 
   private static final AtomicReference> 
SCHEME_TO_FILESYSTEM =
   new AtomicReference>(
@@ -416,7 +416,7 @@ public class FileSystems {
 // from their use in the URI spec. ('*' is not reserved).
 // Here, we just need the scheme, which is so circumscribed as to be
 // very easy to extract with a regex.
-Matcher matcher = URI_SCHEME_PATTERN.matcher(spec);
+Matcher matcher = FILE_SCHEME_PATTERN.matcher(spec);
 
 if (!matcher.matches()) {
   return "file";



[2/7] beam git commit: Rename FileSystems.setDefaultConfigInWorkers

2017-05-12 Thread dhalperi
Rename FileSystems.setDefaultConfigInWorkers

And document that it's not for users.
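
The call sites in the diff below show the intended use: the runner, not user 
code, installs the options once per worker before touching any filesystem. 
Roughly, as a sketch mirroring the Apex/Flink snippets that follow:

{code:java}
import java.util.concurrent.atomic.AtomicBoolean;
import org.apache.beam.sdk.io.FileSystems;
import org.apache.beam.sdk.options.PipelineOptions;

/** Sketch of a runner-side worker bootstrap. */
public class WorkerFileSystemsInit {
  private static final AtomicBoolean FILE_SYSTEMS_INITIALIZED = new AtomicBoolean(false);

  static void initialize(PipelineOptions deserializedOptions) {
    // Register filesystem schemes (gs://, hdfs://, ...) exactly once per worker JVM.
    if (FILE_SYSTEMS_INITIALIZED.compareAndSet(false, true)) {
      FileSystems.setDefaultPipelineOptions(deserializedOptions);
    }
  }
}
{code}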


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/f3540d47
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/f3540d47
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/f3540d47

Branch: refs/heads/master
Commit: f3540d47f10c18859340a738a7e93643ee57f604
Parents: 15df211
Author: Dan Halperin 
Authored: Fri May 12 11:46:16 2017 -0700
Committer: Dan Halperin 
Committed: Fri May 12 14:59:10 2017 -0700

--
 .../translation/utils/SerializablePipelineOptions.java  |  2 +-
 .../translation/utils/SerializedPipelineOptions.java|  2 +-
 .../dataflow/DataflowPipelineTranslatorTest.java|  2 +-
 .../beam/runners/dataflow/DataflowRunnerTest.java   |  6 +++---
 .../dataflow/options/DataflowPipelineOptionsTest.java   |  4 ++--
 .../beam/runners/dataflow/util/PackageUtilTest.java |  2 +-
 .../runners/spark/translation/SparkRuntimeContext.java  |  2 +-
 .../main/java/org/apache/beam/sdk/PipelineRunner.java   |  2 +-
 .../main/java/org/apache/beam/sdk/io/FileSystems.java   | 12 +++-
 .../java/org/apache/beam/sdk/testing/TestPipeline.java  |  2 +-
 .../extensions/gcp/storage/GcsFileSystemRegistrar.java  |  2 +-
 .../sdk/extensions/gcp/storage/GcsResourceIdTest.java   |  2 +-
 .../apache/beam/sdk/io/hdfs/HadoopFileSystemTest.java   |  2 +-
 .../apache/beam/sdk/io/hdfs/HadoopResourceIdTest.java   |  2 +-
 14 files changed, 27 insertions(+), 17 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/f3540d47/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/utils/SerializablePipelineOptions.java
--
diff --git 
a/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/utils/SerializablePipelineOptions.java
 
b/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/utils/SerializablePipelineOptions.java
index 02afa7a..46b04fc 100644
--- 
a/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/utils/SerializablePipelineOptions.java
+++ 
b/runners/apex/src/main/java/org/apache/beam/runners/apex/translation/utils/SerializablePipelineOptions.java
@@ -62,7 +62,7 @@ public class SerializablePipelineOptions implements 
Externalizable {
 .as(ApexPipelineOptions.class);
 
 if (FILE_SYSTEMS_INTIIALIZED.compareAndSet(false, true)) {
-  FileSystems.setDefaultConfigInWorkers(pipelineOptions);
+  FileSystems.setDefaultPipelineOptions(pipelineOptions);
 }
   }
 

http://git-wip-us.apache.org/repos/asf/beam/blob/f3540d47/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/utils/SerializedPipelineOptions.java
--
diff --git 
a/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/utils/SerializedPipelineOptions.java
 
b/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/utils/SerializedPipelineOptions.java
index 84f3bf4..40b6dd6 100644
--- 
a/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/utils/SerializedPipelineOptions.java
+++ 
b/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/utils/SerializedPipelineOptions.java
@@ -56,7 +56,7 @@ public class SerializedPipelineOptions implements 
Serializable {
   try {
 pipelineOptions = createMapper().readValue(serializedOptions, 
PipelineOptions.class);
 
-FileSystems.setDefaultConfigInWorkers(pipelineOptions);
+FileSystems.setDefaultPipelineOptions(pipelineOptions);
   } catch (IOException e) {
 throw new RuntimeException("Couldn't deserialize the 
PipelineOptions.", e);
   }

http://git-wip-us.apache.org/repos/asf/beam/blob/f3540d47/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslatorTest.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslatorTest.java
 
b/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslatorTest.java
index 93c1e5b..87744f0 100644
--- 
a/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslatorTest.java
+++ 
b/runners/google-cloud-dataflow-java/src/test/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslatorTest.java
@@ -144,7 +144,7 @@ public class DataflowPipelineTranslatorTest implements 
Serializable {
 Pipeline p = Pipeline.create(options);
 
 // Enable the FileSystems API to know about gs:// URIs in this test.
-FileSystems.setDefaultConfigInWorkers(options);
+FileSystems.setDefau

[3/7] beam git commit: [BEAM-2277] Add ResourceIdTester and test existing ResourceId implementations

2017-05-12 Thread dhalperi
[BEAM-2277] Add ResourceIdTester and test existing ResourceId implementations

A first cut at some of the parts of the ResourceId spec.
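
Usage, as the LocalResourceIdTest change further down shows, is a single call 
from a filesystem's test suite. A sketch, assuming a trailing slash marks a 
directory spec (MyFileSystemResourceIdTest is illustrative):

{code:java}
import org.apache.beam.sdk.io.FileSystems;
import org.apache.beam.sdk.io.fs.ResourceId;
import org.apache.beam.sdk.io.fs.ResourceIdTester;
import org.junit.Test;

public class MyFileSystemResourceIdTest {
  @Test
  public void testResourceIdBattery() throws Exception {
    // matchNewResource(..., isDirectory=true) hands the battery a base directory to exercise.
    ResourceId baseDirectory = FileSystems.matchNewResource("/tmp/foo/", true /* isDirectory */);
    ResourceIdTester.runResourceIdBattery(baseDirectory);
  }
}
{code}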


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/a6a5ff7b
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/a6a5ff7b
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/a6a5ff7b

Branch: refs/heads/master
Commit: a6a5ff7be387ef295fc7f921de36a3ea77327bc1
Parents: fbb0de1
Author: Dan Halperin 
Authored: Fri May 12 09:20:34 2017 -0700
Committer: Dan Halperin 
Committed: Fri May 12 14:59:10 2017 -0700

--
 .../apache/beam/sdk/io/LocalResourceIdTest.java |   6 +
 .../apache/beam/sdk/io/fs/ResourceIdTester.java | 151 +++
 .../google-cloud-platform-core/pom.xml  |   6 +
 .../gcp/storage/GcsResourceIdTest.java  |   9 ++
 sdks/java/io/hadoop-file-system/pom.xml |  13 ++
 .../beam/sdk/io/hdfs/HadoopResourceIdTest.java  |  63 
 6 files changed, 248 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/a6a5ff7b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/LocalResourceIdTest.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/LocalResourceIdTest.java 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/LocalResourceIdTest.java
index 7ea85cf..e1ca303 100644
--- 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/LocalResourceIdTest.java
+++ 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/LocalResourceIdTest.java
@@ -31,6 +31,7 @@ import java.io.File;
 import java.nio.file.Paths;
 import org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions;
 import org.apache.beam.sdk.io.fs.ResourceId;
+import org.apache.beam.sdk.io.fs.ResourceIdTester;
 import org.apache.commons.lang3.SystemUtils;
 import org.junit.Rule;
 import org.junit.Test;
@@ -259,6 +260,11 @@ public class LocalResourceIdTest {
 "xyz.txt");
   }
 
+  @Test
+  public void testResourceIdTester() throws Exception {
+ResourceIdTester.runResourceIdBattery(toResourceIdentifier("/tmp/foo/"));
+  }
+
   private LocalResourceId toResourceIdentifier(String str) throws Exception {
 boolean isDirectory;
 if (SystemUtils.IS_OS_WINDOWS) {

http://git-wip-us.apache.org/repos/asf/beam/blob/a6a5ff7b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/fs/ResourceIdTester.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/fs/ResourceIdTester.java 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/fs/ResourceIdTester.java
new file mode 100644
index 000..fe50ada
--- /dev/null
+++ 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/fs/ResourceIdTester.java
@@ -0,0 +1,151 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.fs;
+
+import static com.google.common.base.Preconditions.checkArgument;
+import static 
org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions.RESOLVE_DIRECTORY;
+import static 
org.apache.beam.sdk.io.fs.ResolveOptions.StandardResolveOptions.RESOLVE_FILE;
+import static org.hamcrest.Matchers.equalTo;
+import static org.junit.Assert.assertFalse;
+import static org.junit.Assert.assertThat;
+import static org.junit.Assert.assertTrue;
+import static org.junit.Assert.fail;
+
+import com.google.common.testing.EqualsTester;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.beam.sdk.annotations.Experimental;
+import org.apache.beam.sdk.annotations.Experimental.Kind;
+import org.apache.beam.sdk.io.FileSystems;
+
+/**
+ * A utility to test {@link ResourceId} implementations.
+ */
+@Experimental(Kind.FILESYSTEM)
+public final class ResourceIdTester {
+  /**
+   * Enforces that the {@link ResourceId} implementation of {@code 
baseDirectory} meets the
+   * {@link ResourceId} spec.
+   */
+  public static void runResourceIdBattery(ResourceId baseDirectory) {
+checkArgument(
+baseDirectory.isDirectory(), "b

[1/7] beam git commit: Fix shading of guava testlib

2017-05-12 Thread dhalperi
Repository: beam
Updated Branches:
  refs/heads/master c2a36284b -> 9f81fd299


Fix shading of guava testlib


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/39211638
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/39211638
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/39211638

Branch: refs/heads/master
Commit: 3921163829d2a99b0524bf3fa1b85a4cde826f3a
Parents: f3540d4
Author: Dan Halperin 
Authored: Fri May 12 13:57:59 2017 -0700
Committer: Dan Halperin 
Committed: Fri May 12 14:59:10 2017 -0700

--
 pom.xml | 4 
 runners/direct-java/pom.xml | 4 
 sdks/java/core/pom.xml  | 4 
 3 files changed, 12 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/39211638/pom.xml
--
diff --git a/pom.xml b/pom.xml
index eaca6b7..a978f58 100644
--- a/pom.xml
+++ b/pom.xml
@@ -1446,6 +1446,10 @@
 
   
 com.google.common
+
+  
+  com.google.common.**.testing.*
+
 
 
   
org.apache.${renderedArtifactId}.repackaged.com.google.common

http://git-wip-us.apache.org/repos/asf/beam/blob/39211638/runners/direct-java/pom.xml
--
diff --git a/runners/direct-java/pom.xml b/runners/direct-java/pom.xml
index 0daa066..c581113 100644
--- a/runners/direct-java/pom.xml
+++ b/runners/direct-java/pom.xml
@@ -131,6 +131,10 @@
 
 
   com.google.common
+  
+
+com.google.common.**.testing.*
+  
   
 org.apache.beam.runners.direct.repackaged.com.google.common
   

http://git-wip-us.apache.org/repos/asf/beam/blob/39211638/sdks/java/core/pom.xml
--
diff --git a/sdks/java/core/pom.xml b/sdks/java/core/pom.xml
index 68d76e1..882657b 100644
--- a/sdks/java/core/pom.xml
+++ b/sdks/java/core/pom.xml
@@ -161,6 +161,10 @@
   
 
   com.google.common
+  
+
+com.google.common.**.testing.*
+  
   
   
 org.apache.beam.sdk.repackaged.com.google.common



Jenkins build is back to normal : beam_PostCommit_Python_Verify #2211

2017-05-12 Thread Apache Jenkins Server
See 



