Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Spark #1751

2017-04-21 Thread Apache Jenkins Server
See 




[jira] [Resolved] (BEAM-2044) Upgrade HBaseIO to use HBase client version 1.3.1

2017-04-21 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-2044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Baptiste Onofré resolved BEAM-2044.

   Resolution: Fixed
Fix Version/s: First stable release

> Upgrade HBaseIO to use HBase client version 1.3.1
> -
>
> Key: BEAM-2044
> URL: https://issues.apache.org/jira/browse/BEAM-2044
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Affects Versions: First stable release
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Trivial
> Fix For: First stable release
>
>
> An interesting fix on Scans on exhausted regions was added so this is worth 
> the upgrade.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-2044) Upgrade HBaseIO to use HBase client version 1.3.1

2017-04-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979769#comment-15979769
 ] 

ASF GitHub Bot commented on BEAM-2044:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2629


> Upgrade HBaseIO to use HBase client version 1.3.1
> -
>
> Key: BEAM-2044
> URL: https://issues.apache.org/jira/browse/BEAM-2044
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Affects Versions: First stable release
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Trivial
>
> An interesting fix on Scans on exhausted regions was added so this is worth 
> the upgrade.





[GitHub] beam pull request #2629: [BEAM-2044] Upgrade HBaseIO to use HBase client ver...

2017-04-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2629


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] beam git commit: [BEAM-2044] Upgrade HBaseIO to use HBase client version 1.3.1

2017-04-21 Thread jbonofre
Repository: beam
Updated Branches:
  refs/heads/master 459630ec8 -> 4a8e5d5f9


[BEAM-2044] Upgrade HBaseIO to use HBase client version 1.3.1


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/05877063
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/05877063
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/05877063

Branch: refs/heads/master
Commit: 058770633a9bf9d6a5949b57698652bae6350dd0
Parents: 459630e
Author: Ismaël Mejía 
Authored: Fri Apr 21 16:39:03 2017 +0200
Committer: Jean-Baptiste Onofré 
Committed: Sat Apr 22 06:33:32 2017 +0200

--
 sdks/java/io/hbase/pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/05877063/sdks/java/io/hbase/pom.xml
--
diff --git a/sdks/java/io/hbase/pom.xml b/sdks/java/io/hbase/pom.xml
index 1561600..3695bcb 100644
--- a/sdks/java/io/hbase/pom.xml
+++ b/sdks/java/io/hbase/pom.xml
@@ -31,7 +31,7 @@
   Library to read and write from/to HBase
 
   
-1.3.0
+1.3.1
 2.5.1
   
 



[2/2] beam git commit: [BEAM-2044] This closes #2629

2017-04-21 Thread jbonofre
[BEAM-2044] This closes #2629


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/4a8e5d5f
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/4a8e5d5f
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/4a8e5d5f

Branch: refs/heads/master
Commit: 4a8e5d5f9ef7121ad8b5f59551dde2397240d464
Parents: 459630e 0587706
Author: Jean-Baptiste Onofré 
Authored: Sat Apr 22 07:05:27 2017 +0200
Committer: Jean-Baptiste Onofré 
Committed: Sat Apr 22 07:05:27 2017 +0200

--
 sdks/java/io/hbase/pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--




[jira] [Created] (BEAM-2053) Add HDFS file provider to wordcount examples

2017-04-21 Thread Thomas Weise (JIRA)
Thomas Weise created BEAM-2053:
--

 Summary: Add HDFS file provider to wordcount examples
 Key: BEAM-2053
 URL: https://issues.apache.org/jira/browse/BEAM-2053
 Project: Beam
  Issue Type: Task
  Components: runner-apex
Reporter: Thomas Weise
Priority: Minor


Instructions for running the example on YARN refer to HDFS, but HDFS is 
currently not supported in the example project:

https://beam.apache.org/documentation/runners/apex/

Using local files is sufficient in a sandbox, but for a multi-node cluster we 
need a distributed FS.





[jira] [Resolved] (BEAM-825) Fill in the documentation/runners/apex portion of the website

2017-04-21 Thread Thomas Weise (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Weise resolved BEAM-825.
---
   Resolution: Fixed
Fix Version/s: First stable release

> Fill in the documentation/runners/apex portion of the website
> -
>
> Key: BEAM-825
> URL: https://issues.apache.org/jira/browse/BEAM-825
> Project: Beam
>  Issue Type: Task
>  Components: runner-apex, website
>Reporter: Thomas Weise
>Assignee: Sandeep Deshmukh
> Fix For: First stable release
>
>
> As per 
> https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit.
> Should be a landing page for Apex-specific information.





Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Spark #1750

2017-04-21 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2052) Windowed file sinks should support dynamic sharding

2017-04-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979710#comment-15979710
 ] 

ASF GitHub Bot commented on BEAM-2052:
--

GitHub user reuvenlax opened a pull request:

https://github.com/apache/beam/pull/2647

[BEAM-2052] Allow dynamic sharding in windowed file sinks

We now allow windowed FileBasedSinks to support dynamic sharding. This 
requires encoding the window and pane in the FileResult object, and delaying 
calling into the FilenamePolicy until the finalize step when we know how many 
shards there are. It also requires us to ensure that elements from different 
windows are written to different temporary files in the WriteBundles step 
(since at that point, the bundle might contain elements from several windows).
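
The finalize-time naming described above can be sketched roughly as follows. This is a minimal illustration, not the code from the pull request: the class `DynamicShardingSketch`, its simplified `FileResult` (window only, no pane), and the `filenameFor` naming scheme are all hypothetical stand-ins, and no Beam types are used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DynamicShardingSketch {
    /** Hypothetical write result: a temporary file plus the window it belongs to. */
    static final class FileResult {
        final String tempFile;
        final String window;

        FileResult(String tempFile, String window) {
            this.tempFile = tempFile;
            this.window = window;
        }
    }

    /** Hypothetical filename policy; it needs the total shard count, so it runs late. */
    static String filenameFor(String window, int shard, int numShards) {
        return String.format("out-%s-%05d-of-%05d", window, shard, numShards);
    }

    /** Finalize step: group results by window, then name each file once the count is known. */
    static Map<String, String> finalizeWrites(List<FileResult> results) {
        Map<String, List<FileResult>> byWindow = new LinkedHashMap<>();
        for (FileResult r : results) {
            byWindow.computeIfAbsent(r.window, w -> new ArrayList<>()).add(r);
        }
        Map<String, String> renames = new LinkedHashMap<>();
        for (Map.Entry<String, List<FileResult>> e : byWindow.entrySet()) {
            List<FileResult> shards = e.getValue();
            for (int i = 0; i < shards.size(); i++) {
                // Temp file -> final name, using the per-window shard count.
                renames.put(shards.get(i).tempFile, filenameFor(e.getKey(), i, shards.size()));
            }
        }
        return renames;
    }

    public static void main(String[] args) {
        // One temp file per (bundle, window) pair, so windows never share a file.
        List<FileResult> results = Arrays.asList(
            new FileResult("tmp-a", "w1"),
            new FileResult("tmp-b", "w2"),
            new FileResult("tmp-c", "w1"));
        finalizeWrites(results).forEach((tmp, name) -> System.out.println(tmp + " -> " + name));
    }
}
```

Because the number of shards per window is only known after all bundles have reported, the sketch defers every naming decision to `finalizeWrites`, which mirrors the reason given above for delaying the call into the filename policy.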

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/reuvenlax/incubator-beam streaming_gcs_output

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2647.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2647


commit 57a52f02b4077344b421676ab548f9bfbd256e73
Author: Reuven Lax 
Date:   2017-04-05T19:13:44Z

Start the process of getting rid of Sink.

commit eb512cc5f343451ca83d981d957884aae62f8a55
Author: Reuven Lax 
Date:   2017-04-05T22:06:14Z

Remove Sink class, and rename Write to WriteFiles.

commit 430f51a89e56cf606ac1d4b3de873b439177c939
Author: Reuven Lax 
Date:   2017-04-06T00:22:15Z

Get rid of Sink and initialize. We keep standin versions of the old Write 
and Sink transforms around as a stopgap solution for HDFSFileSink.

commit 176223954e73e42d4e40d47985083ec907ce02b5
Author: Reuven Lax 
Date:   2017-04-19T17:05:09Z

Fix Javadoc issues.

commit cfad16ceec00f7336cbbca8d1f45edcde4619f84
Author: Reuven Lax 
Date:   2017-04-19T18:01:14Z

Fix javadoc

commit 6b587c5079a24b4ab747c71af8b525bf0f615fea
Author: Reuven Lax 
Date:   2017-04-19T22:18:44Z

Deleting unneeded test.

commit 0dedaf39356170d61ee7b5777bcdda760e78c685
Author: Reuven Lax 
Date:   2017-04-21T17:23:12Z

Foo

commit dfbbe3c2e86963c911681e05149a172881b6466e
Author: Reuven Lax 
Date:   2017-04-21T19:23:31Z

Finish making windowed writes work dynamically.

commit 4bf76edc3ec50857a5f277a5391d5d61eaa2236d
Author: Reuven Lax 
Date:   2017-04-22T02:00:04Z

Finish implementing dynamic-sharding for windowed file outputs, and add an 
integration test.




> Windowed file sinks should support dynamic sharding
> ---
>
> Key: BEAM-2052
> URL: https://issues.apache.org/jira/browse/BEAM-2052
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Assignee: Davor Bonaci
>
> Currently windowed file sinks (WriteFiles and FileBasedSink) require 
> withNumShards to be set explicitly. We should remove this requirement, and 
> allow dynamic output.





[GitHub] beam pull request #2647: [BEAM-2052] Allow dynamic sharding in windowed file...

2017-04-21 Thread reuvenlax
GitHub user reuvenlax opened a pull request:

https://github.com/apache/beam/pull/2647

[BEAM-2052] Allow dynamic sharding in windowed file sinks

We now allow windowed FileBasedSinks to support dynamic sharding. This 
requires encoding the window and pane in the FileResult object, and delaying 
calling into the FilenamePolicy until the finalize step when we know how many 
shards there are. It also requires us to ensure that elements from different 
windows are written to different temporary files in the WriteBundles step 
(since at that point, the bundle might contain elements from several windows).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/reuvenlax/incubator-beam streaming_gcs_output

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2647.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2647


commit 57a52f02b4077344b421676ab548f9bfbd256e73
Author: Reuven Lax 
Date:   2017-04-05T19:13:44Z

Start the process of getting rid of Sink.

commit eb512cc5f343451ca83d981d957884aae62f8a55
Author: Reuven Lax 
Date:   2017-04-05T22:06:14Z

Remove Sink class, and rename Write to WriteFiles.

commit 430f51a89e56cf606ac1d4b3de873b439177c939
Author: Reuven Lax 
Date:   2017-04-06T00:22:15Z

Get rid of Sink and initialize. We keep standin versions of the old Write 
and Sink transforms around as a stopgap solution for HDFSFileSink.

commit 176223954e73e42d4e40d47985083ec907ce02b5
Author: Reuven Lax 
Date:   2017-04-19T17:05:09Z

Fix Javadoc issues.

commit cfad16ceec00f7336cbbca8d1f45edcde4619f84
Author: Reuven Lax 
Date:   2017-04-19T18:01:14Z

Fix javadoc

commit 6b587c5079a24b4ab747c71af8b525bf0f615fea
Author: Reuven Lax 
Date:   2017-04-19T22:18:44Z

Deleting unneeded test.

commit 0dedaf39356170d61ee7b5777bcdda760e78c685
Author: Reuven Lax 
Date:   2017-04-21T17:23:12Z

Foo

commit dfbbe3c2e86963c911681e05149a172881b6466e
Author: Reuven Lax 
Date:   2017-04-21T19:23:31Z

Finish making windowed writes work dynamically.

commit 4bf76edc3ec50857a5f277a5391d5d61eaa2236d
Author: Reuven Lax 
Date:   2017-04-22T02:00:04Z

Finish implementing dynamic-sharding for windowed file outputs, and add an 
integration test.






[GitHub] beam pull request #2646: Merge pull request #1 from apache/master

2017-04-21 Thread china-lee
Github user china-lee closed the pull request at:

https://github.com/apache/beam/pull/2646




[jira] [Created] (BEAM-2052) Windowed file sinks should support dynamic sharding

2017-04-21 Thread Reuven Lax (JIRA)
Reuven Lax created BEAM-2052:


 Summary: Windowed file sinks should support dynamic sharding
 Key: BEAM-2052
 URL: https://issues.apache.org/jira/browse/BEAM-2052
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-core
Reporter: Reuven Lax
Assignee: Davor Bonaci


Currently windowed file sinks (WriteFiles and FileBasedSink) require 
withNumShards to be set explicitly. We should remove this requirement, and 
allow dynamic output.





[GitHub] beam pull request #2646: Merge pull request #1 from apache/master

2017-04-21 Thread china-lee
GitHub user china-lee opened a pull request:

https://github.com/apache/beam/pull/2646

Merge pull request #1 from apache/master

update from origin

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/china-lee/Apache-Beam master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2646.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2646


commit 7d675738dfe3754423c17066e93d5447baab8b93
Author: china-lee 
Date:   2017-04-12T13:53:11Z

Merge pull request #1 from apache/master

update from origin






[jira] [Commented] (BEAM-2051) Reduce scope of the PCollectionView interface

2017-04-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979692#comment-15979692
 ] 

ASF GitHub Bot commented on BEAM-2051:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2641


> Reduce scope of the PCollectionView interface
> -
>
> Key: BEAM-2051
> URL: https://issues.apache.org/jira/browse/BEAM-2051
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Thomas Groh
>Assignee: Thomas Groh
>
> Users should only ever use a PCollectionView class as a token to access a 
> view. A Runner can cast down to a more expressive type if required.





[2/2] beam git commit: Make SimplePCollectionView Visible

2017-04-21 Thread tgroh
Make SimplePCollectionView Visible

View will be replaced as a marker interface. Runners can expect to
always receive a subclass of SimplePCollectionView, and cast to it when
methods are required.
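
The token-plus-downcast pattern described in this commit message might look roughly like the sketch below. It is an illustration only: `ViewToken`, `SimpleView`, `getTag`, and `describe` are hypothetical names chosen for the example and are not the Beam classes or their methods.

```java
public class ViewTokenSketch {
    /** Hypothetical marker interface: user code only holds it as an opaque token. */
    interface ViewToken<T> {}

    /** Hypothetical concrete view that carries the details a runner needs. */
    static class SimpleView<T> implements ViewToken<T> {
        private final String tag;

        SimpleView(String tag) {
            this.tag = tag;
        }

        String getTag() {
            return tag;
        }
    }

    /** Runner-side code: cast down to the concrete type only when its methods are needed. */
    static String describe(ViewToken<?> view) {
        if (!(view instanceof SimpleView)) {
            throw new IllegalArgumentException("Unsupported view type: " + view.getClass());
        }
        SimpleView<?> simple = (SimpleView<?>) view;
        return "side-input:" + simple.getTag();
    }

    public static void main(String[] args) {
        ViewToken<String> token = new SimpleView<>("words");
        System.out.println(describe(token));
    }
}
```

Keeping the interface empty means user code cannot depend on runner-facing methods, while the explicit `instanceof` check gives a clear error if a runner ever receives an unexpected implementation.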


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/e8f28dd4
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/e8f28dd4
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/e8f28dd4

Branch: refs/heads/master
Commit: e8f28dd477c2404ebfd26d08b87340a7eea6525c
Parents: 781c155
Author: Thomas Groh 
Authored: Fri Apr 21 09:46:28 2017 -0700
Committer: Thomas Groh 
Committed: Fri Apr 21 19:01:01 2017 -0700

--
 .../src/main/java/org/apache/beam/sdk/util/PCollectionViews.java | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/e8f28dd4/sdks/java/core/src/main/java/org/apache/beam/sdk/util/PCollectionViews.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/util/PCollectionViews.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/util/PCollectionViews.java
index 14b36fd..f2052ac 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/util/PCollectionViews.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/util/PCollectionViews.java
@@ -341,8 +341,10 @@ public class PCollectionViews {
   /**
* A class for {@link PCollectionView} implementations, with additional type 
parameters
* that are not visible at pipeline assembly time when the view is used as a 
side input.
+   *
+   * For internal use only.
*/
-  private static class SimplePCollectionView
+  public static class SimplePCollectionView
   extends PValueBase
   implements PCollectionView {
 /** The {@link PCollection} this view was originally created from. */



[GitHub] beam pull request #2641: [BEAM-2051] Make SimplePCollectionView Visible

2017-04-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2641




[1/2] beam git commit: This closes #2641

2017-04-21 Thread tgroh
Repository: beam
Updated Branches:
  refs/heads/master 781c15522 -> 459630ec8


This closes #2641


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/459630ec
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/459630ec
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/459630ec

Branch: refs/heads/master
Commit: 459630ec8754794a53c9848f871b907f1f4a10e0
Parents: 781c155 e8f28dd
Author: Thomas Groh 
Authored: Fri Apr 21 19:01:01 2017 -0700
Committer: Thomas Groh 
Committed: Fri Apr 21 19:01:01 2017 -0700

--
 .../src/main/java/org/apache/beam/sdk/util/PCollectionViews.java | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
--




[GitHub] beam pull request #2645: Fix dataflow staging path to be unique

2017-04-21 Thread sb2nov
GitHub user sb2nov opened a pull request:

https://github.com/apache/beam/pull/2645

Fix dataflow staging path to be unique

R: @aaltay PTAL

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sb2nov/beam BEAM-fix-dataflow-staging-path

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2645.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2645


commit 2aee975c5b49bdca1a2cbe1465531e1a12086592
Author: Sourabh Bajaj 
Date:   2017-04-22T01:24:31Z

Fix dataflow staging path to be unique






Jenkins build is back to normal : beam_PostCommit_Java_MavenInstall #3422

2017-04-21 Thread Apache Jenkins Server
See 




Jenkins build is back to stable : beam_PostCommit_Java_MavenInstall #3420

2017-04-21 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PostCommit_Java_MavenInstall #3421

2017-04-21 Thread Apache Jenkins Server
See 


Changes:

[lcwik] [BEAM-1871] Move Xml IO and related classes to new sdks/java/io/xml

--
[...truncated 2.41 MB...]
2017-04-22T00:32:35.155 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/commons-configuration/commons-configuration/1.6/commons-configuration-1.6.jar
2017-04-22T00:32:35.271 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/commons-configuration/commons-configuration/1.6/commons-configuration-1.6.jar
 (292 KB at 91.4 KB/sec)
2017-04-22T00:32:35.271 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/commons-digester/commons-digester/1.8/commons-digester-1.8.jar
2017-04-22T00:32:35.412 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/commons-digester/commons-digester/1.8/commons-digester-1.8.jar
 (141 KB at 42.1 KB/sec)
2017-04-22T00:32:35.412 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/commons-beanutils/commons-beanutils/1.7.0/commons-beanutils-1.7.0.jar
2017-04-22T00:32:35.524 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/commons-beanutils/commons-beanutils/1.7.0/commons-beanutils-1.7.0.jar
 (185 KB at 53.5 KB/sec)
2017-04-22T00:32:35.524 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/commons-beanutils/commons-beanutils-core/1.8.0/commons-beanutils-core-1.8.0.jar
2017-04-22T00:32:35.547 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/spark/spark-network-common_2.10/1.6.3/spark-network-common_2.10-1.6.3.jar
 (2298 KB at 662.3 KB/sec)
2017-04-22T00:32:35.547 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-auth/2.2.0/hadoop-auth-2.2.0.jar
2017-04-22T00:32:35.581 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-auth/2.2.0/hadoop-auth-2.2.0.jar
 (49 KB at 13.9 KB/sec)
2017-04-22T00:32:35.581 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-mapreduce-client-core/2.2.0/hadoop-mapreduce-client-core-2.2.0.jar
2017-04-22T00:32:35.604 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-common/2.2.0/hadoop-common-2.2.0.jar
 (2672 KB at 757.6 KB/sec)
2017-04-22T00:32:35.604 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/hadoop/hadoop-yarn-common/2.2.0/hadoop-yarn-common-2.2.0.jar
2017-04-22T00:32:35.648 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/commons-beanutils/commons-beanutils-core/1.8.0/commons-beanutils-core-1.8.0.jar
 (202 KB at 56.4 KB/sec)
2017-04-22T00:32:35.648 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/aopalliance/aopalliance/1.0/aopalliance-1.0.jar
2017-04-22T00:32:35.674 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/aopalliance/aopalliance/1.0/aopalliance-1.0.jar
 (5 KB at 1.2 KB/sec)
2017-04-22T00:32:35.674 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/objenesis/objenesis/1.2/objenesis-1.2.jar
2017-04-22T00:32:35.708 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/avro/avro/1.8.1/avro-1.8.1.jar
2017-04-22T00:32:35.932 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/httpcomponents/httpclient/4.0.1/httpclient-4.0.1.jar
2017-04-22T00:32:35.943 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/httpcomponents/httpcore/4.0.1/httpcore-4.0.1.jar
2017-04-22T00:32:36.030 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/httpcomponents/httpclient/4.0.1/httpclient-4.0.1.jar
 (285 KB at 71.9 KB/sec)
2017-04-22T00:32:36.030 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/commons/commons-compress/1.9/commons-compress-1.9.jar
2017-04-22T00:32:36.056 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/httpcomponents/httpcore/4.0.1/httpcore-4.0.1.jar
 (169 KB at 42.5 KB/sec)
2017-04-22T00:32:36.056 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/commons-lang/commons-lang/2.6/commons-lang-2.6.jar
2017-04-22T00:32:36.096 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/zookeeper/zookeeper/3.4.6/zookeeper-3.4.6.jar
2017-04-22T00:32:36.144 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/scala-lang/scala-library/2.10.5/scala-library-2.10.5.jar
2017-04-22T00:32:36.153 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/mockito/mockito-all/1.9.5/mockito-all-1.9.5.jar
2017-04-22T00:32:36.232 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/commons-lang/commons-lang/2.6/commons-lang-2.6.jar
 (278 KB at 66.8 KB/sec)
2017-04-22T00:32:36.634 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/mockito/mockito-all/1.9.5/mockito-all-1.9.5.jar
 (1545 KB at 339.0 KB/sec)
2017-04-22T00:32:37.402 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/scala-lang/scala-library/2.10.5/scala-library-2.10.5.jar
 (6964 KB at 1308.2 KB/sec)
2017-04-22T00:32:37.439 [INFO] Downloading: 

Jenkins build became unstable: beam_PostCommit_Java_MavenInstall #3419

2017-04-21 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-115) Beam Runner API

2017-04-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979640#comment-15979640
 ] 

ASF GitHub Bot commented on BEAM-115:
-

GitHub user robertwb opened a pull request:

https://github.com/apache/beam/pull/2644

[BEAM-115] Fn API support for Python


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/robertwb/incubator-beam fn-api

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2644.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2644


commit 4f5c5241949473d81b18cb034640189ea98c49df
Author: Robert Bradshaw 
Date:   2017-04-21T16:59:52Z

Add instructions to regenerate Python proto wrappers.

commit f4f94db5bb3b64c015fd95de64bafd34770aaa21
Author: Robert Bradshaw 
Date:   2017-04-21T17:00:20Z

Generate python proto wrappers for runner and fn API.

commit 6f0e487749b777c5606a69865b7e47134d878a58
Author: Robert Bradshaw 
Date:   2017-04-21T18:04:03Z

Ignore generated files for linter.

commit aa1425c86f5def43ff9032b8666352b747d20440
Author: Robert Bradshaw 
Date:   2017-04-21T18:09:03Z

Ignore generated files in rat plugin.

commit 91428b06f8d4ced0bd3965801920cd94ee8298c6
Author: Robert Bradshaw 
Date:   2017-04-21T19:15:02Z

Add apache licence to generated files.

commit 27f718e89d1829b5b217f263d4b4deb8832c947d
Author: Robert Bradshaw 
Date:   2017-04-20T20:59:25Z

Add fn api runner.

commit f987622d524fa4245089f82fbd150c0cf9ae380d
Author: Robert Bradshaw 
Date:   2017-04-20T21:20:29Z

Restore __init__.py

commit 45ccb571aed5e51ef893428c2a5350ba01864aac
Author: Robert Bradshaw 
Date:   2017-04-20T20:44:33Z

Add runner core files.

commit c6310ad8236578c05432244ced40fe572c2d71a7
Author: Robert Bradshaw 
Date:   2017-04-20T21:23:30Z

move files around

commit 0f9de53c7007e200e5f04c9a30349313b2bdff39
Author: Robert Bradshaw 
Date:   2017-04-20T21:27:14Z

more moving

commit 2f5c518bff028878b7d18a30141beea734cb36a0
Author: Robert Bradshaw 
Date:   2017-04-20T21:43:28Z

Rename runners.

commit 85b2172c1d8d6a5797015c15cd5a301de4a252c6
Author: Robert Bradshaw 
Date:   2017-04-20T21:53:26Z

cythonization works

commit 9f1177db48fa0230fab463cb7a5d784b96c9e084
Author: Robert Bradshaw 
Date:   2017-04-20T22:03:52Z

test module import renames

commit a0e8d792bbfac4550e4399b29c9004f67b82cbd0
Author: Robert Bradshaw 
Date:   2017-04-20T22:06:30Z

remove google3s

commit dfc0b2f3ca04594a2167eaadc7882ca257cdca5f
Author: Robert Bradshaw 
Date:   2017-04-20T22:07:37Z

move sdk_harness to sdk_worker

commit 8b08216c54e65fd0eb5a6c9588405cb4301267b2
Author: Robert Bradshaw 
Date:   2017-04-20T22:09:33Z

Remove unused end_time from statesampler.

commit 011dec575a4066eb7e50c0bff2bad78a9cadd449
Author: Robert Bradshaw 
Date:   2017-04-21T16:59:52Z

Add instructions to regenerate Python proto wrappers.

commit 4a74cdd528c6e9800c66152f9498592f51b59248
Author: Robert Bradshaw 
Date:   2017-04-21T17:00:20Z

Generate python proto wrappers for runner and fn API.

commit 5dcd25d144f784a5b9bd6bd668c2ceb8bc2680f3
Author: Robert Bradshaw 
Date:   2017-04-21T18:04:03Z

Ignore generated files for linter.

commit 5697ecdca1c14ce926f0be63399df77609c12aaf
Author: Robert Bradshaw 
Date:   2017-04-21T18:09:03Z

Ignore generated files in rat plugin.

commit 57a16aa810f5e4ef14c5f2ced901c3bb6fbc7b1a
Author: Robert Bradshaw 
Date:   2017-04-21T19:15:02Z

Add apache licence to generated files.

commit a8456be3830869e5b51e24970655819bc832bd79
Author: Robert Bradshaw 
Date:   2017-04-21T19:36:27Z

implement portpicker, fix more imports

commit 

[jira] [Closed] (BEAM-1745) Unintended unboxing of potential null pointer in AutoValue_ElasticsearchIO_Write

2017-04-21 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu closed BEAM-1745.

   Resolution: Not A Problem
Fix Version/s: Not applicable

> Unintended unboxing of potential null pointer in 
> AutoValue_ElasticsearchIO_Write
> 
>
> Key: BEAM-1745
> URL: https://issues.apache.org/jira/browse/BEAM-1745
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
> Fix For: Not applicable
>
>
> {code}
>   if (maxBatchSize == null) {
> missing += " maxBatchSize";
>   }
> ...
>   return new AutoValue_ElasticsearchIO_Write(
>   this.connectionConfiguration,
>   this.maxBatchSize,
>   this.maxBatchSizeBytes);
> {code}
> If maxBatchSize is null, it would be unboxed at the time 
> AutoValue_ElasticsearchIO_Write is constructed.
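
The concern can be illustrated with a small, self-contained sketch. The classes below are hypothetical stand-ins, not the generated ElasticsearchIO code, and the check in `build()` is only an assumption about what the elided `...` in the snippet might contain (generated AutoValue builders typically throw `IllegalStateException` when required properties are missing).

```java
public class UnboxingSketch {
    /** Hypothetical value class with a primitive field, standing in for the generated class. */
    static final class Write {
        final long maxBatchSize;

        Write(long maxBatchSize) {
            this.maxBatchSize = maxBatchSize;
        }
    }

    /** Hypothetical builder that holds the value boxed until build time. */
    static final class Builder {
        private Long maxBatchSize; // stays null until a setter is called

        Builder setMaxBatchSize(long value) {
            this.maxBatchSize = value;
            return this;
        }

        /** Unchecked: a null field is auto-unboxed here and throws NullPointerException. */
        Write buildUnchecked() {
            return new Write(this.maxBatchSize);
        }

        /** Checked: missing properties are reported before any unboxing takes place. */
        Write build() {
            String missing = "";
            if (maxBatchSize == null) {
                missing += " maxBatchSize";
            }
            if (!missing.isEmpty()) {
                throw new IllegalStateException("Missing required properties:" + missing);
            }
            return new Write(this.maxBatchSize);
        }
    }

    public static void main(String[] args) {
        try {
            new Builder().buildUnchecked();
        } catch (NullPointerException e) {
            System.out.println("unchecked build: NullPointerException on unboxing");
        }
        try {
            new Builder().build();
        } catch (IllegalStateException e) {
            System.out.println("checked build: " + e.getMessage());
        }
        System.out.println("set value: " + new Builder().setMaxBatchSize(1000L).build().maxBatchSize);
    }
}
```

If a guard like the one in `build()` sits between the null check and the constructor call, the constructor is never reached with a null field, so the unboxing cannot fail.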





[GitHub] beam pull request #2644: [BEAM-115] Fn API support for Python

2017-04-21 Thread robertwb
GitHub user robertwb opened a pull request:

https://github.com/apache/beam/pull/2644

[BEAM-115] Fn API support for Python


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/robertwb/incubator-beam fn-api

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2644.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2644


commit 4f5c5241949473d81b18cb034640189ea98c49df
Author: Robert Bradshaw 
Date:   2017-04-21T16:59:52Z

Add instructions to regenerate Python proto wrappers.

commit f4f94db5bb3b64c015fd95de64bafd34770aaa21
Author: Robert Bradshaw 
Date:   2017-04-21T17:00:20Z

Generate python proto wrappers for runner and fn API.

commit 6f0e487749b777c5606a69865b7e47134d878a58
Author: Robert Bradshaw 
Date:   2017-04-21T18:04:03Z

Ignore generated files for linter.

commit aa1425c86f5def43ff9032b8666352b747d20440
Author: Robert Bradshaw 
Date:   2017-04-21T18:09:03Z

Ignore generated files in rat plugin.

commit 91428b06f8d4ced0bd3965801920cd94ee8298c6
Author: Robert Bradshaw 
Date:   2017-04-21T19:15:02Z

Add apache licence to generated files.

commit 27f718e89d1829b5b217f263d4b4deb8832c947d
Author: Robert Bradshaw 
Date:   2017-04-20T20:59:25Z

Add fn api runner.

commit f987622d524fa4245089f82fbd150c0cf9ae380d
Author: Robert Bradshaw 
Date:   2017-04-20T21:20:29Z

Restore __init__.py

commit 45ccb571aed5e51ef893428c2a5350ba01864aac
Author: Robert Bradshaw 
Date:   2017-04-20T20:44:33Z

Add runner core files.

commit c6310ad8236578c05432244ced40fe572c2d71a7
Author: Robert Bradshaw 
Date:   2017-04-20T21:23:30Z

move files around

commit 0f9de53c7007e200e5f04c9a30349313b2bdff39
Author: Robert Bradshaw 
Date:   2017-04-20T21:27:14Z

more moving

commit 2f5c518bff028878b7d18a30141beea734cb36a0
Author: Robert Bradshaw 
Date:   2017-04-20T21:43:28Z

Rename runners.

commit 85b2172c1d8d6a5797015c15cd5a301de4a252c6
Author: Robert Bradshaw 
Date:   2017-04-20T21:53:26Z

cythonization works

commit 9f1177db48fa0230fab463cb7a5d784b96c9e084
Author: Robert Bradshaw 
Date:   2017-04-20T22:03:52Z

test module import renames

commit a0e8d792bbfac4550e4399b29c9004f67b82cbd0
Author: Robert Bradshaw 
Date:   2017-04-20T22:06:30Z

remove google3s

commit dfc0b2f3ca04594a2167eaadc7882ca257cdca5f
Author: Robert Bradshaw 
Date:   2017-04-20T22:07:37Z

move sdk_harness to sdk_worker

commit 8b08216c54e65fd0eb5a6c9588405cb4301267b2
Author: Robert Bradshaw 
Date:   2017-04-20T22:09:33Z

Remove unused end_time from statesampler.

commit 011dec575a4066eb7e50c0bff2bad78a9cadd449
Author: Robert Bradshaw 
Date:   2017-04-21T16:59:52Z

Add instructions to regenerate Python proto wrappers.

commit 4a74cdd528c6e9800c66152f9498592f51b59248
Author: Robert Bradshaw 
Date:   2017-04-21T17:00:20Z

Generate python proto wrappers for runner and fn API.

commit 5dcd25d144f784a5b9bd6bd668c2ceb8bc2680f3
Author: Robert Bradshaw 
Date:   2017-04-21T18:04:03Z

Ignore generated files for linter.

commit 5697ecdca1c14ce926f0be63399df77609c12aaf
Author: Robert Bradshaw 
Date:   2017-04-21T18:09:03Z

Ignore generated files in rat plugin.

commit 57a16aa810f5e4ef14c5f2ced901c3bb6fbc7b1a
Author: Robert Bradshaw 
Date:   2017-04-21T19:15:02Z

Add apache licence to generated files.

commit a8456be3830869e5b51e24970655819bc832bd79
Author: Robert Bradshaw 
Date:   2017-04-21T19:36:27Z

implement portpicker, fix more imports

commit c4633f174d436304da142606145a4827c9da7f4e
Author: Robert Bradshaw 
Date:   2017-04-21T20:04:22Z

portpicker fixes

commit f349f14e49e03b675ad7f7ab06e2c871c9d5eae4
Author: Robert Bradshaw 
Date:   

[jira] [Commented] (BEAM-1631) Flink runner: submit job to a Flink-on-YARN cluster

2017-04-21 Thread Vikas Kedigehalli (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979634#comment-15979634
 ] 

Vikas Kedigehalli commented on BEAM-1631:
-

[~aljoscha] I was able to come up with something that does not need a 
bin/flink installation: 
https://github.com/vikkyrk/incubator-beam/commit/7405d376db390aab0f4b658b34c35b2e50eca63b
 (just a hack, needs cleanup), but it still requires users to have a 
{{HADOOP_CONF_DIR}}. If we want to go with my approach, I am happy to clean 
it up and send out a PR. 

> Flink runner: submit job to a Flink-on-YARN cluster
> ---
>
> Key: BEAM-1631
> URL: https://issues.apache.org/jira/browse/BEAM-1631
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Davor Bonaci
>Assignee: Aljoscha Krettek
>
> As far as I understand, running Beam pipelines on a Flink cluster can be done 
> in two ways:
> * Run directly with a Flink runner, and specifying {{--flinkMaster}} pipeline 
> option via, say, {{mvn exec}}.
> * Produce a bundled JAR, and use {{bin/flink}} to submit the same pipeline.
> These two ways are equivalent, and work well on a standalone Flink cluster.
> Submitting to a Flink-on-YARN is more complicated. You can still produce a 
> bundled JAR, and use {{bin/flink -yid }} to submit such a job. 
> However, that seems impossible with a Flink runner directly.
> If so, we should add the ability to the Flink runner to submit a job to a 
> Flink-on-YARN cluster directly.





[jira] [Closed] (BEAM-1914) XML IO should comply with PTransform style guide

2017-04-21 Thread Eugene Kirpichov (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Kirpichov closed BEAM-1914.
--

> XML IO should comply with PTransform style guide
> 
>
> Key: BEAM-1914
> URL: https://issues.apache.org/jira/browse/BEAM-1914
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Eugene Kirpichov
>Assignee: Eugene Kirpichov
>  Labels: backward-incompatible, starter
> Fix For: First stable release
>
>
> Currently we have XmlSource and XmlSink in the Java SDK. They violate the 
> PTransform style guide in several respects:
> - They should be grouped into an XmlIO class with read() and write() verbs, 
> like all the other similar connectors
> - The source/sink classes should be made private or package-local
> - Should get rid of XmlSink.Bound - XmlSink itself should inherit from 
> FileBasedSink
> - Could optionally benefit from AutoValue
> See e.g. the PR with BigQuery fixes https://github.com/apache/beam/pull/2149





[jira] [Closed] (BEAM-1414) CountingInput should comply with PTransform style guide

2017-04-21 Thread Eugene Kirpichov (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Kirpichov closed BEAM-1414.
--
Resolution: Fixed

> CountingInput should comply with PTransform style guide
> ---
>
> Key: BEAM-1414
> URL: https://issues.apache.org/jira/browse/BEAM-1414
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Eugene Kirpichov
>Assignee: Eugene Kirpichov
>  Labels: backward-incompatible, starter
> Fix For: First stable release
>
>
> Suggested changes:
> - Rename the whole class and its inner transforms to sound more verb-like, 
> e.g.: GenerateRange.Bounded/Unbounded (as opposed to current 
> CountingInput.BoundedCountingInput)
> - Provide a more unified API between bounded and unbounded cases: 
> GenerateRange.from(100) should return a GenerateRange.Unbounded; 
> GenerateRange.from(100).to(200) should return a GenerateRange.Bounded. They 
> both should accept a timestampFn. The unbounded one _should not_ have a 
> withMaxNumRecords builder - that's redundant with specifying the range.
> - (optional) Use AutoValue
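The `from(...)` / `from(...).to(...)` shape suggested above can be sketched in plain Java. This is a hypothetical stand-in and not Beam's transform: the class name `RangeSketch`, the half-open interpretation of the range, and the `count()` helper are assumptions made for illustration. It also uses a single class with an optional upper bound rather than the separate Bounded and Unbounded types named in the suggestion.

```java
import java.util.OptionalLong;

// Sketch only: a hypothetical stand-in illustrating the builder shape.
final class RangeSketch {
  private final long from;
  private final OptionalLong to; // empty means unbounded

  private RangeSketch(long from, OptionalLong to) {
    this.from = from;
    this.to = to;
  }

  // Starts an unbounded range at the given value.
  static RangeSketch from(long from) {
    if (from < 0) {
      throw new IllegalArgumentException("from must be >= 0, got " + from);
    }
    return new RangeSketch(from, OptionalLong.empty());
  }

  // Returns a bounded copy covering [from, to); the original is unchanged.
  RangeSketch to(long to) {
    if (to < from) {
      throw new IllegalArgumentException("to must be >= from, got " + to);
    }
    return new RangeSketch(from, OptionalLong.of(to));
  }

  boolean isBounded() {
    return to.isPresent();
  }

  // Number of elements in [from, to); only defined for a bounded range.
  long count() {
    return to.orElseThrow(() -> new IllegalStateException("range is unbounded")) - from;
  }

  public static void main(String[] args) {
    System.out.println(RangeSketch.from(100).isBounded());
    System.out.println(RangeSketch.from(100).to(200).isBounded());
    System.out.println(RangeSketch.from(100).to(200).count());
  }
}
```

Because the upper bound is part of the range itself, a separate "maximum number of records" setting is unnecessary in the bounded case, which is the redundancy the suggestion points out.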





[jira] [Commented] (BEAM-1414) CountingInput should comply with PTransform style guide

2017-04-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979619#comment-15979619
 ] 

ASF GitHub Bot commented on BEAM-1414:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2601


> CountingInput should comply with PTransform style guide
> ---
>
> Key: BEAM-1414
> URL: https://issues.apache.org/jira/browse/BEAM-1414
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Eugene Kirpichov
>Assignee: Eugene Kirpichov
>  Labels: backward-incompatible, starter
> Fix For: First stable release
>
>
> Suggested changes:
> - Rename the whole class and its inner transforms to sound more verb-like, 
> e.g.: GenerateRange.Bounded/Unbounded (as opposed to current 
> CountingInput.BoundedCountingInput)
> - Provide a more unified API between bounded and unbounded cases: 
> GenerateRange.from(100) should return a GenerateRange.Unbounded; 
> GenerateRange.from(100).to(200) should return a GenerateRange.Bounded. They 
> both should accept a timestampFn. The unbounded one _should not_ have a 
> withMaxNumRecords builder - that's redundant with specifying the range.
> - (optional) Use AutoValue





[5/5] beam git commit: This closes #2601

2017-04-21 Thread jkff
This closes #2601


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/781c1552
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/781c1552
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/781c1552

Branch: refs/heads/master
Commit: 781c15522d188c461e991515173304b94b1190ae
Parents: 62f041e dffa6a8
Author: Eugene Kirpichov 
Authored: Fri Apr 21 16:53:58 2017 -0700
Committer: Eugene Kirpichov 
Committed: Fri Apr 21 16:53:58 2017 -0700

--
 .../translation/ReadUnboundTranslatorTest.java  |   6 +-
 .../EmptyFlattenAsCreateFactoryTest.java|  10 +-
 .../core/construction/PCollectionsTest.java |   6 +-
 .../PTransformReplacementsTest.java |   4 +-
 .../core/construction/PTransformsTest.java  |  17 +-
 .../core/construction/SdkComponentsTest.java|  16 +-
 .../runners/direct/DirectGraphVisitorTest.java  |   6 +-
 .../beam/runners/direct/DirectRunnerTest.java   |  10 +-
 .../runners/direct/EvaluationContextTest.java   |   4 +-
 .../beam/runners/flink/ReadSourceITCase.java|   4 +-
 .../flink/ReadSourceStreamingITCase.java|   4 +-
 .../streaming/StreamingSourceMetricsTest.java   |   7 +-
 .../org/apache/beam/sdk/io/CountingInput.java   | 283 ---
 .../org/apache/beam/sdk/io/CountingSource.java  |  47 +--
 .../apache/beam/sdk/io/GenerateSequence.java| 194 +
 .../org/apache/beam/sdk/values/PCollection.java |   9 +-
 .../java/org/apache/beam/sdk/PipelineTest.java  |  58 ++--
 .../apache/beam/sdk/io/CountingInputTest.java   | 221 ---
 .../apache/beam/sdk/io/CountingSourceTest.java  |   4 +-
 .../beam/sdk/io/GenerateSequenceTest.java   | 194 +
 .../apache/beam/sdk/metrics/MetricsTest.java|  12 +-
 .../sdk/runners/TransformHierarchyTest.java |   7 +-
 .../beam/sdk/testing/GatherAllPanesTest.java|   8 +-
 .../apache/beam/sdk/testing/PAssertTest.java|  12 +-
 .../apache/beam/sdk/transforms/FlattenTest.java |   8 +-
 .../apache/beam/sdk/transforms/ParDoTest.java   |   4 +-
 .../sdk/transforms/windowing/WindowTest.java|   4 +-
 .../beam/sdk/values/PCollectionListTest.java|  25 +-
 .../beam/sdk/values/PCollectionTupleTest.java   |   6 +-
 .../sdk/io/gcp/bigquery/BigQueryIOTest.java |   6 +-
 .../sdk/io/gcp/bigtable/BigtableWriteIT.java|   4 +-
 .../beam/sdk/io/gcp/datastore/V1WriteIT.java|   4 +-
 32 files changed, 540 insertions(+), 664 deletions(-)
--




[GitHub] beam pull request #2601: [BEAM-1414] Replaces CountingInput with style guide...

2017-04-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2601




[2/5] beam git commit: Deletes CountingInput

2017-04-21 Thread jkff
Deletes CountingInput


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/57eeaae1
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/57eeaae1
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/57eeaae1

Branch: refs/heads/master
Commit: 57eeaae11ea248c8145f467148e799d6c3565402
Parents: 6a9a24c
Author: Eugene Kirpichov 
Authored: Wed Apr 19 15:36:42 2017 -0700
Committer: Eugene Kirpichov 
Committed: Fri Apr 21 16:53:50 2017 -0700

--
 .../org/apache/beam/sdk/io/CountingInput.java   | 283 ---
 .../apache/beam/sdk/io/CountingInputTest.java   | 221 ---
 2 files changed, 504 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/57eeaae1/sdks/java/core/src/main/java/org/apache/beam/sdk/io/CountingInput.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/CountingInput.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/CountingInput.java
deleted file mode 100644
index ab006d4..000
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/CountingInput.java
+++ /dev/null
@@ -1,283 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.io;
-
-import static com.google.common.base.Preconditions.checkArgument;
-import static com.google.common.base.Preconditions.checkNotNull;
-
-import com.google.common.base.Optional;
-import org.apache.beam.sdk.io.CountingSource.NowTimestampFn;
-import org.apache.beam.sdk.io.Read.Unbounded;
-import org.apache.beam.sdk.transforms.PTransform;
-import org.apache.beam.sdk.transforms.SerializableFunction;
-import org.apache.beam.sdk.transforms.display.DisplayData;
-import org.apache.beam.sdk.values.PBegin;
-import org.apache.beam.sdk.values.PCollection;
-import org.apache.beam.sdk.values.PCollection.IsBounded;
-import org.joda.time.Duration;
-import org.joda.time.Instant;
-
-/**
- * A {@link PTransform} that produces longs. When used to produce a
- * {@link IsBounded#BOUNDED bounded} {@link PCollection}, {@link CountingInput} starts at {@code 0}
- * or starting value, and counts up to a specified maximum. When used to produce an
- * {@link IsBounded#UNBOUNDED unbounded} {@link PCollection}, it counts up to {@link Long#MAX_VALUE}
- * and then never produces more output. (In practice, this limit should never be reached.)
- *
- * The bounded {@link CountingInput} is implemented based on {@link OffsetBasedSource} and
- * {@link OffsetBasedSource.OffsetBasedReader}, so it performs efficient initial splitting and it
- * supports dynamic work rebalancing.
- *
- * To produce a bounded {@code PCollection} starting from {@code 0},
- * use {@link CountingInput#upTo(long)}:
- *
- * {@code
- * Pipeline p = ...
- * PTransform producer = CountingInput.upTo(1000);
- * PCollection bounded = p.apply(producer);
- * }
- *
- * To produce a bounded {@code PCollection} starting from {@code startOffset},
- * use {@link CountingInput#forSubrange(long, long)} instead.
- *
- * To produce an unbounded {@code PCollection}, use {@link CountingInput#unbounded()},
- * calling {@link UnboundedCountingInput#withTimestampFn(SerializableFunction)} to provide values
- * with timestamps other than {@link Instant#now}.
- *
- * {@code
- * Pipeline p = ...
- *
- * // To create an unbounded producer that uses processing time as the element timestamp.
- * PCollection unbounded = p.apply(CountingInput.unbounded());
- * // Or, to create an unbounded source that uses a provided function to set the element timestamp.
- * PCollection unboundedWithTimestamps =
- * p.apply(CountingInput.unbounded().withTimestampFn(someFn));
- * }
- */
-public class CountingInput {
-  /**
-   * Creates a {@link BoundedCountingInput} that will produce the specified number of elements,
-   * from {@code 0} to {@code numElements - 1}.
-   */
-  public static BoundedCountingInput upTo(long numElements) {
-checkArgument(numElements >= 0,
- 

[4/5] beam git commit: Replaces fromTo() with from().to()

2017-04-21 Thread jkff
Replaces fromTo() with from().to()


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/dffa6a88
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/dffa6a88
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/dffa6a88

Branch: refs/heads/master
Commit: dffa6a8832c5ead67a398e0787f7137f9f15fa1f
Parents: 57eeaae
Author: Eugene Kirpichov 
Authored: Fri Apr 21 15:18:18 2017 -0700
Committer: Eugene Kirpichov 
Committed: Fri Apr 21 16:53:50 2017 -0700

--
 .../translation/ReadUnboundTranslatorTest.java|  2 +-
 .../EmptyFlattenAsCreateFactoryTest.java  |  2 +-
 .../core/construction/PCollectionsTest.java   |  2 +-
 .../beam/runners/direct/DirectRunnerTest.java |  2 +-
 .../beam/runners/flink/ReadSourceITCase.java  |  2 +-
 .../runners/flink/ReadSourceStreamingITCase.java  |  2 +-
 .../streaming/StreamingSourceMetricsTest.java |  2 +-
 .../org/apache/beam/sdk/io/GenerateSequence.java  |  5 -
 .../org/apache/beam/sdk/values/PCollection.java   |  8 
 .../java/org/apache/beam/sdk/PipelineTest.java|  2 +-
 .../apache/beam/sdk/io/GenerateSequenceTest.java  | 18 +-
 .../org/apache/beam/sdk/metrics/MetricsTest.java  |  4 ++--
 .../beam/sdk/testing/GatherAllPanesTest.java  |  6 +++---
 .../org/apache/beam/sdk/testing/PAssertTest.java  |  8 
 .../apache/beam/sdk/transforms/FlattenTest.java   |  6 +++---
 .../beam/sdk/transforms/windowing/WindowTest.java |  2 +-
 .../beam/sdk/values/PCollectionListTest.java  |  2 +-
 .../beam/sdk/values/PCollectionTupleTest.java |  2 +-
 .../beam/sdk/io/gcp/bigtable/BigtableWriteIT.java |  2 +-
 .../beam/sdk/io/gcp/datastore/V1WriteIT.java  |  2 +-
 20 files changed, 38 insertions(+), 43 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/dffa6a88/runners/apex/src/test/java/org/apache/beam/runners/apex/translation/ReadUnboundTranslatorTest.java
--
diff --git a/runners/apex/src/test/java/org/apache/beam/runners/apex/translation/ReadUnboundTranslatorTest.java b/runners/apex/src/test/java/org/apache/beam/runners/apex/translation/ReadUnboundTranslatorTest.java
index e0cc251..6f54e23 100644
--- a/runners/apex/src/test/java/org/apache/beam/runners/apex/translation/ReadUnboundTranslatorTest.java
+++ b/runners/apex/src/test/java/org/apache/beam/runners/apex/translation/ReadUnboundTranslatorTest.java
@@ -92,7 +92,7 @@ public class ReadUnboundTranslatorTest {
 Pipeline p = Pipeline.create(options);
 
 Set expected = ContiguousSet.create(Range.closedOpen(0L, 10L), DiscreteDomain.longs());
-p.apply(GenerateSequence.fromTo(0, 10))
+p.apply(GenerateSequence.from(0).to(10))
 .apply(ParDo.of(new EmbeddedCollector()));
 
 ApexRunnerResult result = (ApexRunnerResult) p.run();

http://git-wip-us.apache.org/repos/asf/beam/blob/dffa6a88/runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/EmptyFlattenAsCreateFactoryTest.java
--
diff --git a/runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/EmptyFlattenAsCreateFactoryTest.java b/runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/EmptyFlattenAsCreateFactoryTest.java
index bfa3190..c388878 100644
--- a/runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/EmptyFlattenAsCreateFactoryTest.java
+++ b/runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/EmptyFlattenAsCreateFactoryTest.java
@@ -72,7 +72,7 @@ public class EmptyFlattenAsCreateFactoryTest {
   public void getInputNonEmptyThrows() {
 PCollectionList nonEmpty =
 PCollectionList.of(pipeline.apply("unbounded", GenerateSequence.from(0)))
-.and(pipeline.apply("bounded", GenerateSequence.fromTo(0, 100)));
+.and(pipeline.apply("bounded", GenerateSequence.from(0).to(100)));
 thrown.expect(IllegalArgumentException.class);
 thrown.expectMessage(nonEmpty.expand().toString());
 thrown.expectMessage(EmptyFlattenAsCreateFactory.class.getSimpleName());

http://git-wip-us.apache.org/repos/asf/beam/blob/dffa6a88/runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/PCollectionsTest.java
--
diff --git a/runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/PCollectionsTest.java b/runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/PCollectionsTest.java
index 48aa1f1..be3755c 100644
--- 

[3/5] beam git commit: Replaces all usages of CountingInput with GenerateSequence

2017-04-21 Thread jkff
Replaces all usages of CountingInput with GenerateSequence


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/6a9a24c0
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/6a9a24c0
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/6a9a24c0

Branch: refs/heads/master
Commit: 6a9a24c064518bb83a7383babdff9b263dc61346
Parents: 88c6612
Author: Eugene Kirpichov 
Authored: Wed Apr 19 15:32:08 2017 -0700
Committer: Eugene Kirpichov 
Committed: Fri Apr 21 16:53:50 2017 -0700

--
 .../translation/ReadUnboundTranslatorTest.java  |  6 +-
 .../EmptyFlattenAsCreateFactoryTest.java| 10 ++--
 .../core/construction/PCollectionsTest.java |  6 +-
 .../PTransformReplacementsTest.java |  4 +-
 .../core/construction/PTransformsTest.java  | 17 +++---
 .../core/construction/SdkComponentsTest.java| 16 +++---
 .../runners/direct/DirectGraphVisitorTest.java  |  6 +-
 .../beam/runners/direct/DirectRunnerTest.java   | 10 ++--
 .../runners/direct/EvaluationContextTest.java   |  4 +-
 .../beam/runners/flink/ReadSourceITCase.java|  4 +-
 .../flink/ReadSourceStreamingITCase.java|  4 +-
 .../streaming/StreamingSourceMetricsTest.java   |  7 ++-
 .../org/apache/beam/sdk/io/CountingSource.java  | 43 ---
 .../apache/beam/sdk/io/GenerateSequence.java|  5 ++
 .../org/apache/beam/sdk/values/PCollection.java |  9 +--
 .../java/org/apache/beam/sdk/PipelineTest.java  | 58 +++-
 .../beam/sdk/io/GenerateSequenceTest.java   | 18 +++---
 .../apache/beam/sdk/metrics/MetricsTest.java| 12 ++--
 .../sdk/runners/TransformHierarchyTest.java |  7 +--
 .../beam/sdk/testing/GatherAllPanesTest.java|  8 +--
 .../apache/beam/sdk/testing/PAssertTest.java| 12 ++--
 .../apache/beam/sdk/transforms/FlattenTest.java |  8 +--
 .../apache/beam/sdk/transforms/ParDoTest.java   |  4 +-
 .../sdk/transforms/windowing/WindowTest.java|  4 +-
 .../beam/sdk/values/PCollectionListTest.java| 25 +
 .../beam/sdk/values/PCollectionTupleTest.java   |  6 +-
 .../sdk/io/gcp/bigquery/BigQueryIOTest.java |  6 +-
 .../sdk/io/gcp/bigtable/BigtableWriteIT.java|  4 +-
 .../beam/sdk/io/gcp/datastore/V1WriteIT.java|  4 +-
 29 files changed, 162 insertions(+), 165 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/6a9a24c0/runners/apex/src/test/java/org/apache/beam/runners/apex/translation/ReadUnboundTranslatorTest.java
--
diff --git a/runners/apex/src/test/java/org/apache/beam/runners/apex/translation/ReadUnboundTranslatorTest.java b/runners/apex/src/test/java/org/apache/beam/runners/apex/translation/ReadUnboundTranslatorTest.java
index 6d19bb9..e0cc251 100644
--- a/runners/apex/src/test/java/org/apache/beam/runners/apex/translation/ReadUnboundTranslatorTest.java
+++ b/runners/apex/src/test/java/org/apache/beam/runners/apex/translation/ReadUnboundTranslatorTest.java
@@ -35,7 +35,7 @@ import org.apache.beam.runners.apex.translation.operators.ApexReadUnboundedInput
 import org.apache.beam.runners.apex.translation.utils.CollectionSource;
 import org.apache.beam.sdk.Pipeline;
 import org.apache.beam.sdk.coders.StringUtf8Coder;
-import org.apache.beam.sdk.io.CountingInput;
+import org.apache.beam.sdk.io.GenerateSequence;
 import org.apache.beam.sdk.io.Read;
 import org.apache.beam.sdk.options.PipelineOptionsFactory;
 import org.apache.beam.sdk.transforms.DoFn;
@@ -92,12 +92,12 @@ public class ReadUnboundTranslatorTest {
 Pipeline p = Pipeline.create(options);
 
 Set expected = ContiguousSet.create(Range.closedOpen(0L, 10L), DiscreteDomain.longs());
-p.apply(CountingInput.upTo(10))
+p.apply(GenerateSequence.fromTo(0, 10))
 .apply(ParDo.of(new EmbeddedCollector()));
 
 ApexRunnerResult result = (ApexRunnerResult) p.run();
 DAG dag = result.getApexDAG();
-String operatorName = "CountingInput.BoundedCountingInput/Read(BoundedCountingSource)";
+String operatorName = "GenerateSequence/Read(BoundedCountingSource)";
 DAG.OperatorMeta om = dag.getOperatorMeta(operatorName);
 Assert.assertNotNull(om);
 Assert.assertEquals(om.getOperator().getClass(), ApexReadUnboundedInputOperator.class);

http://git-wip-us.apache.org/repos/asf/beam/blob/6a9a24c0/runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/EmptyFlattenAsCreateFactoryTest.java
--
diff --git 
a/runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/EmptyFlattenAsCreateFactoryTest.java
 

[1/5] beam git commit: Introduces GenerateSequence transform

2017-04-21 Thread jkff
Repository: beam
Updated Branches:
  refs/heads/master 62f041e56 -> 781c15522


Introduces GenerateSequence transform

It is a replacement for CountingInput, which will be deprecated.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/88c66129
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/88c66129
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/88c66129

Branch: refs/heads/master
Commit: 88c66129ba0cff9c8319f21ad317597d9bd8b5cd
Parents: 62f041e
Author: Eugene Kirpichov 
Authored: Tue Apr 18 16:48:38 2017 -0700
Committer: Eugene Kirpichov 
Committed: Fri Apr 21 16:53:49 2017 -0700

--
 .../org/apache/beam/sdk/io/CountingInput.java   |   2 +-
 .../org/apache/beam/sdk/io/CountingSource.java  |   4 +-
 .../apache/beam/sdk/io/GenerateSequence.java| 194 +++
 .../apache/beam/sdk/io/CountingSourceTest.java  |   4 +-
 .../beam/sdk/io/GenerateSequenceTest.java   | 194 +++
 5 files changed, 393 insertions(+), 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/88c66129/sdks/java/core/src/main/java/org/apache/beam/sdk/io/CountingInput.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/CountingInput.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/CountingInput.java
index 72ebd97..ab006d4 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/CountingInput.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/CountingInput.java
@@ -247,7 +247,7 @@ public class CountingInput {
 public PCollection expand(PBegin begin) {
   Unbounded read =
   Read.from(
-  CountingSource.createUnbounded()
+  CountingSource.createUnboundedFrom(0)
   .withTimestampFn(timestampFn)
   .withRate(elementsPerPeriod, period));
   if (!maxNumRecords.isPresent() && !maxReadTime.isPresent()) {

http://git-wip-us.apache.org/repos/asf/beam/blob/88c66129/sdks/java/core/src/main/java/org/apache/beam/sdk/io/CountingSource.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/CountingSource.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/CountingSource.java
index 73b663d..dd018f4 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/CountingSource.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/CountingSource.java
@@ -103,8 +103,8 @@ public class CountingSource {
* Create a new {@link UnboundedCountingSource}.
*/
   // package-private to return a typed UnboundedCountingSource rather than the UnboundedSource type.
-  static UnboundedCountingSource createUnbounded() {
-return new UnboundedCountingSource(0, 1, 1L, Duration.ZERO, new NowTimestampFn());
+  static UnboundedCountingSource createUnboundedFrom(long start) {
+return new UnboundedCountingSource(start, 1, 1L, Duration.ZERO, new NowTimestampFn());
   }
 
   /**

http://git-wip-us.apache.org/repos/asf/beam/blob/88c66129/sdks/java/core/src/main/java/org/apache/beam/sdk/io/GenerateSequence.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/GenerateSequence.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/GenerateSequence.java
new file mode 100644
index 000..189539f
--- /dev/null
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/GenerateSequence.java
@@ -0,0 +1,194 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io;
+
+import static com.google.common.base.Preconditions.checkArgument;
+import static com.google.common.base.Preconditions.checkNotNull;
+
+import com.google.auto.value.AutoValue;
+import javax.annotation.Nullable;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.SerializableFunction;
+import 

[jira] [Commented] (BEAM-1988) utils.path.join does not correctly handle GCS bucket roots

2017-04-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979608#comment-15979608
 ] 

ASF GitHub Bot commented on BEAM-1988:
--

GitHub user sb2nov opened a pull request:

https://github.com/apache/beam/pull/2643

[BEAM-1988] Migrate from utils.path to BFS


R: @chamikaramj PTAL

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sb2nov/beam BEAM-1988-path-join-filesystem-2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2643.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2643


commit 75719dc0d91089ecb29622c2a4f68db644ae500a
Author: Sourabh Bajaj 
Date:   2017-04-21T23:39:34Z

[BEAM-1988] Migrate from utils.path to BFS




> utils.path.join does not correctly handle GCS bucket roots
> --
>
> Key: BEAM-1988
> URL: https://issues.apache.org/jira/browse/BEAM-1988
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Ahmet Altay
>Assignee: Sourabh Bajaj
> Fix For: First stable release
>
>
> Here:
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/utils/path.py#L22
> Joining a bucket root with a filename e.g. (gs://mybucket/ , myfile) results 
> in invalid 'gs://mybucket//myfile', notice the double // between mybucket and 
> myfile. (It actually does not handle anything that already ends with {{/}} 
> correctly)
> [~sb2nov] could you take this one? Also, should the `join` operation move to 
> BeamFileSystem-level code?
> (cc: [~chamikara])
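
The double-slash behaviour described above can be reproduced in a few lines. The following is a minimal Python sketch, not the code in utils/path.py and not the BFS change proposed in the pull request: `naive_join` and `join_gcs_path` are hypothetical names used only for illustration, and the sketch assumes every base path starts with gs://.

```python
GCS_PREFIX = "gs://"


def naive_join(base, name):
    """Illustrative only: always inserts a separator, even after a trailing '/'."""
    return base + "/" + name


def join_gcs_path(base, *parts):
    """Join components so that exactly one '/' separates them.

    Hypothetical helper: strips redundant slashes around each component
    before joining, so a bucket root such as 'gs://mybucket/' is handled.
    """
    if not base.startswith(GCS_PREFIX):
        raise ValueError("expected a gs:// path, got %r" % (base,))
    # Keep the scheme aside so its '//' is never touched by the stripping.
    pieces = [base[len(GCS_PREFIX):].rstrip("/")]
    pieces.extend(part.strip("/") for part in parts)
    return GCS_PREFIX + "/".join(piece for piece in pieces if piece)


if __name__ == "__main__":
    print(naive_join("gs://mybucket/", "myfile"))     # gs://mybucket//myfile
    print(join_gcs_path("gs://mybucket/", "myfile"))  # gs://mybucket/myfile
```

Setting the scheme aside before stripping is the point of the sketch: a plain rstrip on the whole string would also eat the slashes of a bare gs:// prefix.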





[GitHub] beam pull request #2643: [BEAM-1988] Migrate from utils.path to BFS

2017-04-21 Thread sb2nov
GitHub user sb2nov opened a pull request:

https://github.com/apache/beam/pull/2643

[BEAM-1988] Migrate from utils.path to BFS


R: @chamikaramj PTAL

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sb2nov/beam BEAM-1988-path-join-filesystem-2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2643.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2643


commit 75719dc0d91089ecb29622c2a4f68db644ae500a
Author: Sourabh Bajaj 
Date:   2017-04-21T23:39:34Z

[BEAM-1988] Migrate from utils.path to BFS




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Assigned] (BEAM-2020) Move CloudObject to Dataflow runner

2017-04-21 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik reassigned BEAM-2020:
---

Assignee: Thomas Groh  (was: Luke Cwik)

> Move CloudObject to Dataflow runner
> ---
>
> Key: BEAM-2020
> URL: https://issues.apache.org/jira/browse/BEAM-2020
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model-runner-api, sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Thomas Groh
> Fix For: First stable release
>
>
> This entails primarily eliminating Coder.asCloudObject() by adding the needed 
> accessors, and possibly a serialization registrar discipline, for coders in 
> the Runner API proto.





[jira] [Commented] (BEAM-1871) Thin Java SDK Core

2017-04-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979594#comment-15979594
 ] 

ASF GitHub Bot commented on BEAM-1871:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2640


> Thin Java SDK Core
> --
>
> Key: BEAM-1871
> URL: https://issues.apache.org/jira/browse/BEAM-1871
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Daniel Halperin
>Assignee: Luke Cwik
> Fix For: First stable release
>
>
> Before first stable release we need to thin out {{sdk-java-core}} module. 
> Some candidates for removal, but not an exhaustive list:
> {{sdk/io}}
> * anything BigQuery related
> * anything PubSub related
> * everything Protobuf related
> * TFRecordIO
> * XMLSink
> {{sdk/util}}
> * Everything GCS related
> * Everything Backoff related
> * Everything Google API related: ResponseInterceptors, RetryHttpBackoff, etc.
> * Everything CloudObject-related
> * Pubsub stuff
> {{sdk/coders}}
> * JAXBCoder
> * TableRowJsonCoder





[GitHub] beam pull request #2640: [BEAM-1871] Move Xml IO and related classes to new ...

2017-04-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2640




[1/4] beam git commit: [BEAM-1871] Move Xml IO and related classes to new sdks/java/io/xml package.

2017-04-21 Thread lcwik
Repository: beam
Updated Branches:
  refs/heads/master 022d5b657 -> 62f041e56


http://git-wip-us.apache.org/repos/asf/beam/blob/393a90c7/sdks/java/io/xml/src/main/java/org/apache/beam/sdk/io/xml/package-info.java
--
diff --git 
a/sdks/java/io/xml/src/main/java/org/apache/beam/sdk/io/xml/package-info.java 
b/sdks/java/io/xml/src/main/java/org/apache/beam/sdk/io/xml/package-info.java
new file mode 100644
index 000..9c5089a
--- /dev/null
+++ 
b/sdks/java/io/xml/src/main/java/org/apache/beam/sdk/io/xml/package-info.java
@@ -0,0 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/**
+ * Transforms for reading and writing Xml files.
+ */
+package org.apache.beam.sdk.io.xml;

http://git-wip-us.apache.org/repos/asf/beam/blob/393a90c7/sdks/java/io/xml/src/test/java/org/apache/beam/sdk/io/xml/JAXBCoderTest.java
--
diff --git 
a/sdks/java/io/xml/src/test/java/org/apache/beam/sdk/io/xml/JAXBCoderTest.java 
b/sdks/java/io/xml/src/test/java/org/apache/beam/sdk/io/xml/JAXBCoderTest.java
new file mode 100644
index 000..5f1330d
--- /dev/null
+++ 
b/sdks/java/io/xml/src/test/java/org/apache/beam/sdk/io/xml/JAXBCoderTest.java
@@ -0,0 +1,228 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.xml;
+
+import static org.hamcrest.Matchers.equalTo;
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertThat;
+
+import com.google.common.collect.ImmutableList;
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+import java.util.List;
+import java.util.concurrent.CountDownLatch;
+import java.util.concurrent.Executor;
+import java.util.concurrent.Executors;
+import java.util.concurrent.atomic.AtomicReference;
+import javax.xml.bind.annotation.XmlRootElement;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.CoderException;
+import org.apache.beam.sdk.coders.StandardCoder;
+import org.apache.beam.sdk.coders.VarIntCoder;
+import org.apache.beam.sdk.coders.VarLongCoder;
+import org.apache.beam.sdk.testing.CoderProperties;
+import org.apache.beam.sdk.util.CoderUtils;
+import org.apache.beam.sdk.util.SerializableUtils;
+import org.apache.beam.sdk.values.TypeDescriptor;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/** Unit tests for {@link JAXBCoder}. */
+@RunWith(JUnit4.class)
+public class JAXBCoderTest {
+
+  @XmlRootElement
+  static class TestType {
+    private String testString = null;
+    private int testInt;
+
+    public TestType() {}
+
+    public TestType(String testString, int testInt) {
+      this.testString = testString;
+      this.testInt = testInt;
+    }
+
+    public String getTestString() {
+      return testString;
+    }
+
+    public void setTestString(String testString) {
+      this.testString = testString;
+    }
+
+    public int getTestInt() {
+      return testInt;
+    }
+
+    public void setTestInt(int testInt) {
+      this.testInt = testInt;
+    }
+
+    @Override
+    public int hashCode() {
+      int hashCode = 1;
+      hashCode = 31 * hashCode + (testString == null ? 0 : testString.hashCode());
+      hashCode = 31 * hashCode + testInt;
+      return hashCode;
+    }
+
+    @Override
+    public boolean equals(Object obj) {
+      if (!(obj instanceof 

[3/4] beam git commit: [BEAM-1871] Move Xml IO and related classes to new sdks/java/io/xml package.

2017-04-21 Thread lcwik
[BEAM-1871] Move Xml IO and related classes to new sdks/java/io/xml package.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/393a90c7
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/393a90c7
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/393a90c7

Branch: refs/heads/master
Commit: 393a90c74a86d7484d047316a5ccb22cd360a4d0
Parents: 022d5b6
Author: Luke Cwik 
Authored: Fri Apr 21 15:45:04 2017 -0700
Committer: Luke Cwik 
Committed: Fri Apr 21 16:37:44 2017 -0700

--
 sdks/java/core/pom.xml  |  31 +-
 .../org/apache/beam/sdk/coders/JAXBCoder.java   | 201 -
 .../beam/sdk/coders/StringDelegateCoder.java|   3 +-
 .../org/apache/beam/sdk/io/FileBasedSource.java |   2 +-
 .../main/java/org/apache/beam/sdk/io/XmlIO.java | 476 --
 .../java/org/apache/beam/sdk/io/XmlSink.java| 153 
 .../java/org/apache/beam/sdk/io/XmlSource.java  | 404 -
 .../beam/sdk/testing/SourceTestUtils.java   |   2 +-
 .../apache/beam/sdk/coders/JAXBCoderTest.java   | 223 -
 .../org/apache/beam/sdk/io/XmlSinkTest.java | 253 --
 .../org/apache/beam/sdk/io/XmlSourceTest.java   | 892 --
 sdks/java/io/pom.xml|   1 +
 sdks/java/io/xml/pom.xml| 118 +++
 .../org/apache/beam/sdk/io/xml/JAXBCoder.java   | 203 +
 .../java/org/apache/beam/sdk/io/xml/XmlIO.java  | 469 ++
 .../org/apache/beam/sdk/io/xml/XmlSink.java | 160 
 .../org/apache/beam/sdk/io/xml/XmlSource.java   | 404 +
 .../apache/beam/sdk/io/xml/package-info.java|  22 +
 .../apache/beam/sdk/io/xml/JAXBCoderTest.java   | 228 +
 .../org/apache/beam/sdk/io/xml/XmlSinkTest.java | 253 ++
 .../apache/beam/sdk/io/xml/XmlSourceTest.java   | 893 +++
 21 files changed, 2755 insertions(+), 2636 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/393a90c7/sdks/java/core/pom.xml
--
diff --git a/sdks/java/core/pom.xml b/sdks/java/core/pom.xml
index 7af1444..ac7a3bb 100644
--- a/sdks/java/core/pom.xml
+++ b/sdks/java/core/pom.xml
@@ -232,36 +232,7 @@
   joda-time
 
 
-
-
-  org.codehaus.woodstox
-  stax2-api
-  ${stax2.version}
-  true
-
-
-
-  org.codehaus.woodstox
-  woodstox-core-asl
-  ${woodstox.version}
-  runtime
-  true
-  
-
-
-  javax.xml.stream
-  stax-api
-
-  
-
-
-
 
   org.tukaani

http://git-wip-us.apache.org/repos/asf/beam/blob/393a90c7/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/JAXBCoder.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/JAXBCoder.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/JAXBCoder.java
deleted file mode 100644
index ea636fc..000
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/JAXBCoder.java
+++ /dev/null
@@ -1,201 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.coders;
-
-import com.fasterxml.jackson.annotation.JsonCreator;
-import com.fasterxml.jackson.annotation.JsonProperty;
-import com.google.common.io.ByteStreams;
-import java.io.FilterInputStream;
-import java.io.FilterOutputStream;
-import java.io.IOException;
-import java.io.InputStream;
-import java.io.OutputStream;
-import javax.xml.bind.JAXBContext;
-import javax.xml.bind.JAXBException;
-import javax.xml.bind.Marshaller;
-import javax.xml.bind.Unmarshaller;
-import org.apache.beam.sdk.util.CloudObject;
-import org.apache.beam.sdk.util.EmptyOnDeserializationThreadLocal;
-import org.apache.beam.sdk.util.Structs;
-import org.apache.beam.sdk.util.VarInt;
-import org.apache.beam.sdk.values.TypeDescriptor;
-
-/**
- * A coder for JAXB annotated objects. This coder uses JAXB marshalling/unmarshalling mechanisms
- * to encode/decode 

[4/4] beam git commit: [BEAM-1871] Move Xml IO and related classes to new sdks/java/io/xml package.

2017-04-21 Thread lcwik
[BEAM-1871] Move Xml IO and related classes to new sdks/java/io/xml package.

This closes #2640


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/62f041e5
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/62f041e5
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/62f041e5

Branch: refs/heads/master
Commit: 62f041e56e8e252ce015fb530e6c4d26b8674d93
Parents: 022d5b6 393a90c
Author: Luke Cwik 
Authored: Fri Apr 21 16:38:08 2017 -0700
Committer: Luke Cwik 
Committed: Fri Apr 21 16:38:08 2017 -0700

--
 sdks/java/core/pom.xml  |  31 +-
 .../org/apache/beam/sdk/coders/JAXBCoder.java   | 201 -
 .../beam/sdk/coders/StringDelegateCoder.java|   3 +-
 .../org/apache/beam/sdk/io/FileBasedSource.java |   2 +-
 .../main/java/org/apache/beam/sdk/io/XmlIO.java | 476 --
 .../java/org/apache/beam/sdk/io/XmlSink.java| 153 
 .../java/org/apache/beam/sdk/io/XmlSource.java  | 404 -
 .../beam/sdk/testing/SourceTestUtils.java   |   2 +-
 .../apache/beam/sdk/coders/JAXBCoderTest.java   | 223 -
 .../org/apache/beam/sdk/io/XmlSinkTest.java | 253 --
 .../org/apache/beam/sdk/io/XmlSourceTest.java   | 892 --
 sdks/java/io/pom.xml|   1 +
 sdks/java/io/xml/pom.xml| 118 +++
 .../org/apache/beam/sdk/io/xml/JAXBCoder.java   | 203 +
 .../java/org/apache/beam/sdk/io/xml/XmlIO.java  | 469 ++
 .../org/apache/beam/sdk/io/xml/XmlSink.java | 160 
 .../org/apache/beam/sdk/io/xml/XmlSource.java   | 404 +
 .../apache/beam/sdk/io/xml/package-info.java|  22 +
 .../apache/beam/sdk/io/xml/JAXBCoderTest.java   | 228 +
 .../org/apache/beam/sdk/io/xml/XmlSinkTest.java | 253 ++
 .../apache/beam/sdk/io/xml/XmlSourceTest.java   | 893 +++
 21 files changed, 2755 insertions(+), 2636 deletions(-)
--




[1/2] beam git commit: [BEAM-1871] Remove unnecessary runtime dependencies for Google Cloud Storage.

2017-04-21 Thread lcwik
Repository: beam
Updated Branches:
  refs/heads/master eab04b029 -> 022d5b657


[BEAM-1871] Remove unnecessary runtime dependencies for Google Cloud Storage.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/50622ee9
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/50622ee9
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/50622ee9

Branch: refs/heads/master
Commit: 50622ee9b434d7be0a37e3169765d0408a9ecd5e
Parents: eab04b0
Author: Luke Cwik 
Authored: Fri Apr 21 14:28:20 2017 -0700
Committer: Luke Cwik 
Committed: Fri Apr 21 16:36:03 2017 -0700

--
 sdks/java/core/pom.xml | 12 
 1 file changed, 12 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/50622ee9/sdks/java/core/pom.xml
--
diff --git a/sdks/java/core/pom.xml b/sdks/java/core/pom.xml
index ea4b725..7af1444 100644
--- a/sdks/java/core/pom.xml
+++ b/sdks/java/core/pom.xml
@@ -144,18 +144,6 @@
   google-http-client
 
 
-
-  com.google.cloud.bigdataoss
-  gcsio
-  runtime
-
-
-
-  com.google.cloud.bigdataoss
-  util
-  runtime
-
-
 
 



[jira] [Commented] (BEAM-1871) Thin Java SDK Core

2017-04-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979589#comment-15979589
 ] 

ASF GitHub Bot commented on BEAM-1871:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2638


> Thin Java SDK Core
> --
>
> Key: BEAM-1871
> URL: https://issues.apache.org/jira/browse/BEAM-1871
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Daniel Halperin
>Assignee: Luke Cwik
> Fix For: First stable release
>
>
> Before first stable release we need to thin out {{sdk-java-core}} module. 
> Some candidates for removal, but not an exhaustive list:
> {{sdk/io}}
> * anything BigQuery related
> * anything PubSub related
> * everything Protobuf related
> * TFRecordIO
> * XMLSink
> {{sdk/util}}
> * Everything GCS related
> * Everything Backoff related
> * Everything Google API related: ResponseInterceptors, RetryHttpBackoff, etc.
> * Everything CloudObject-related
> * Pubsub stuff
> {{sdk/coders}}
> * JAXBCoder
> * TableRowJsonCoder





[2/2] beam git commit: [BEAM-1871] Remove unnecessary runtime dependencies for Google Cloud Storage

2017-04-21 Thread lcwik
[BEAM-1871] Remove unnecessary runtime dependencies for Google Cloud Storage

This closes #2638


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/022d5b65
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/022d5b65
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/022d5b65

Branch: refs/heads/master
Commit: 022d5b657e367b0a40eb1813f49271d928278f3a
Parents: eab04b0 50622ee
Author: Luke Cwik 
Authored: Fri Apr 21 16:36:31 2017 -0700
Committer: Luke Cwik 
Committed: Fri Apr 21 16:36:31 2017 -0700

--
 sdks/java/core/pom.xml | 12 
 1 file changed, 12 deletions(-)
--




[GitHub] beam pull request #2642: Add Cloud Object Translators for Coders

2017-04-21 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/2642

Add Cloud Object Translators for Coders

Currently unused. These will replace the coder.asCloudObject within
Dataflow.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam cloud_objects_in_df

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2642.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2642


commit e36dd885e0376a2a73ec693f7b66e5d640852b03
Author: Thomas Groh 
Date:   2017-04-21T17:56:15Z

Add Cloud Object Translators for Coders

Currently unused. These will replace the coder.asCloudObject within
Dataflow.






[jira] [Resolved] (BEAM-1860) SerializableCoder should not extend DeterministicStandardCoder

2017-04-21 Thread Thomas Groh (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Groh resolved BEAM-1860.
---
   Resolution: Fixed
Fix Version/s: Not applicable

This is fixed as part of the broader push to get a meaningful Coder Hierarchy.

> SerializableCoder should not extend DeterministicStandardCoder
> --
>
> Key: BEAM-1860
> URL: https://issues.apache.org/jira/browse/BEAM-1860
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 0.6.0
>Reporter: Wesley Tanaka
>Assignee: Thomas Groh
> Fix For: Not applicable
>
>
> Not sure if this is just a doc bug, but:
> https://beam.apache.org/documentation/sdks/javadoc/0.6.0/org/apache/beam/sdk/coders/SerializableCoder.html
>  says:
> SerializableCoder does not guarantee a deterministic encoding, as Java 
> serialization may produce different binary encodings for two equivalent 
> objects.
> Yet 
> https://beam.apache.org/documentation/sdks/javadoc/0.6.0/org/apache/beam/sdk/coders/DeterministicStandardCoder.html
>  says:
> A DeterministicStandardCoder is a StandardCoder that is deterministic, in the 
> sense that for objects considered equal according to Object.equals(Object), 
> the encoded bytes are also equal.
> These sound like they conflict, and thus that SerializableCoder should not 
> extend DeterministicStandardCoder
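
The conflict is easier to see with a concrete case of two equal objects whose serialized bytes differ. The sketch below is a Python analogy built on the standard pickle module, not Beam's Java SerializableCoder. It assumes Python 3.7 or later, where dicts preserve insertion order, and `encodings_match` is a hypothetical helper introduced only for this illustration.

```python
import pickle


def encodings_match(left, right):
    """Return True if both objects pickle to byte-for-byte identical output."""
    return pickle.dumps(left, protocol=4) == pickle.dumps(right, protocol=4)


# Two dicts with the same items, inserted in a different order.
first = {"a": 1, "b": 2}
second = {"b": 2, "a": 1}

# Equality ignores insertion order, but pickle writes items in iteration order.
objects_equal = first == second
bytes_equal = encodings_match(first, second)

if __name__ == "__main__":
    print("equal objects:", objects_equal)
    print("equal encodings:", bytes_equal)
```

A coder built on such a serializer cannot promise that equal values encode to equal bytes, which is the property the DeterministicStandardCoder javadoc quoted above describes.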





[jira] [Commented] (BEAM-1786) AutoService registration of coders, like we do with PipelineRunners

2017-04-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979588#comment-15979588
 ] 

ASF GitHub Bot commented on BEAM-1786:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2637


> AutoService registration of coders, like we do with PipelineRunners
> ---
>
> Key: BEAM-1786
> URL: https://issues.apache.org/jira/browse/BEAM-1786
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Luke Cwik
> Fix For: First stable release
>
>
> Today, registering coders for auxiliary data types for a library transform is 
> not very convenient. If the type appears in an output/covariant position then it 
> might be possible to use {{getDefaultOutputCoder}} to solve things. But for 
> writes/contravariant positions this is not applicable and the library 
> transform must contort itself to avoid requiring the user to come up with a 
> coder for a type they don't own.
> Probably the best case today is an explicit call to 
> {{LibraryTransform.registerCoders(Pipeline)}} which is far too manual.
> This could likely be solved quite easily with {{@AutoService}} and a static 
> global coder registry, as we do with pipeline runners.





[2/2] beam git commit: [BEAM-1786] Post Dataflow worker CoderRegistry clean-up

2017-04-21 Thread lcwik
[BEAM-1786] Post Dataflow worker CoderRegistry clean-up

This closes #2637


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/eab04b02
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/eab04b02
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/eab04b02

Branch: refs/heads/master
Commit: eab04b029f665926f8533895ffbe8f739e3bbb11
Parents: a79bf57 1479b23
Author: Luke Cwik 
Authored: Fri Apr 21 16:35:24 2017 -0700
Committer: Luke Cwik 
Committed: Fri Apr 21 16:35:24 2017 -0700

--
 .../src/main/java/org/apache/beam/sdk/coders/CoderRegistry.java | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)
--




[GitHub] beam pull request #2637: [BEAM-1786] Post Dataflow worker CoderRegistry clea...

2017-04-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2637




[1/2] beam git commit: [BEAM-1786] Post Dataflow worker CoderRegistry clean-up

2017-04-21 Thread lcwik
Repository: beam
Updated Branches:
  refs/heads/master a79bf5709 -> eab04b029


[BEAM-1786] Post Dataflow worker CoderRegistry clean-up


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/1479b233
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/1479b233
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/1479b233

Branch: refs/heads/master
Commit: 1479b2330f114268fcb8c8eb008a704b0a04c214
Parents: a79bf57
Author: Luke Cwik 
Authored: Fri Apr 21 14:04:51 2017 -0700
Committer: Luke Cwik 
Committed: Fri Apr 21 16:35:14 2017 -0700

--
 .../src/main/java/org/apache/beam/sdk/coders/CoderRegistry.java | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/1479b233/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/CoderRegistry.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/CoderRegistry.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/CoderRegistry.java
index 6b909d4..4238293 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/CoderRegistry.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/CoderRegistry.java
@@ -162,15 +162,12 @@ public class CoderRegistry implements CoderProvider {
     return new CoderRegistry();
   }
 
-  public CoderRegistry() {
+  private CoderRegistry() {
     coderFactoryMap = new HashMap<>(REGISTERED_CODER_FACTORIES_PER_CLASS);
     setFallbackCoderProvider(
         CoderProviders.firstOf(ProtoCoder.coderProvider(), SerializableCoder.PROVIDER));
   }
 
-  public void registerStandardCoders() {
-  }
-
   /**
    * Registers {@code coderClazz} as the default {@link Coder} class to handle encoding and
    * decoding instances of {@code clazz}, overriding prior registrations if any exist.



[jira] [Commented] (BEAM-2021) Fix Java's Coder class hierarchy

2017-04-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979565#comment-15979565
 ] 

ASF GitHub Bot commented on BEAM-2021:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2606


> Fix Java's Coder class hierarchy
> 
>
> Key: BEAM-2021
> URL: https://issues.apache.org/jira/browse/BEAM-2021
> Project: Beam
>  Issue Type: Improvement
>  Components: beam-model-runner-api, sdk-java-core
>Affects Versions: First stable release
>Reporter: Kenneth Knowles
>Assignee: Thomas Groh
>
> This is thoroughly out of hand. In the runner API world, there are two paths:
> 1. URN plus component coders plus custom payload (in the form of component 
> coders alongside an SdkFunctionSpec)
> 2. Custom coder (a single URN) and payload is serialized Java. I think this 
> never has component coders.
> The other base classes have now been shown to be extraneous: they favor 
> saving ~3 lines of boilerplate for rarely written code at the cost of 
> readability. Instead they should just be dropped.
> The custom payload is an Any proto in the runner API. But tying the Coder 
> interface to proto would be unfortunate from a design perspective and cannot 
> be done anyhow due to dependency hell.





[GitHub] beam pull request #2606: [BEAM-2021][BEAM-1871] Remove DeterministicStandard...

2017-04-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2606




[1/2] beam git commit: This closes #2606

2017-04-21 Thread tgroh
Repository: beam
Updated Branches:
  refs/heads/master cf5450f8a -> a79bf5709


This closes #2606


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/a79bf570
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/a79bf570
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/a79bf570

Branch: refs/heads/master
Commit: a79bf57097839997d8ff17c0def3a69f3c5dffb1
Parents: cf5450f e3b2521
Author: Thomas Groh 
Authored: Fri Apr 21 16:24:14 2017 -0700
Committer: Thomas Groh 
Committed: Fri Apr 21 16:24:14 2017 -0700

--
 .../org/apache/beam/sdk/coders/AtomicCoder.java |  5 ++-
 .../sdk/coders/DeterministicStandardCoder.java  | 39 
 .../beam/sdk/coders/CoderRegistryTest.java  | 18 +++--
 .../beam/sdk/coders/NullableCoderTest.java  |  5 ++-
 .../beam/sdk/util/SerializableUtilsTest.java| 12 --
 5 files changed, 22 insertions(+), 57 deletions(-)
--




[2/2] beam git commit: Remove DeterministicStandardCoder

2017-04-21 Thread tgroh
Remove DeterministicStandardCoder

This isn't a particularly useful Coder. It has no defined methods other
than verifyDeterministic, which has an empty implementation.
Additionally, there are no guarantees that a DeterministicStandardCoder
is deterministic.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/e3b25215
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/e3b25215
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/e3b25215

Branch: refs/heads/master
Commit: e3b25215a2d91d363e3889c94a94ae6c0fc7b14d
Parents: cf5450f
Author: Thomas Groh 
Authored: Wed Apr 19 17:58:58 2017 -0700
Committer: Thomas Groh 
Committed: Fri Apr 21 16:24:14 2017 -0700

--
 .../org/apache/beam/sdk/coders/AtomicCoder.java |  5 ++-
 .../sdk/coders/DeterministicStandardCoder.java  | 39 
 .../beam/sdk/coders/CoderRegistryTest.java  | 18 +++--
 .../beam/sdk/coders/NullableCoderTest.java  |  5 ++-
 .../beam/sdk/util/SerializableUtilsTest.java| 12 --
 5 files changed, 22 insertions(+), 57 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/e3b25215/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/AtomicCoder.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/AtomicCoder.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/AtomicCoder.java
index c024f89..816af87 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/AtomicCoder.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/AtomicCoder.java
@@ -28,10 +28,13 @@ import java.util.List;
  *
  * @param <T> the type of the values being transcoded
  */
-public abstract class AtomicCoder<T> extends DeterministicStandardCoder<T> {
+public abstract class AtomicCoder<T> extends StandardCoder<T> {
   protected AtomicCoder() { }
 
   @Override
+  public void verifyDeterministic() throws NonDeterministicException { }
+
+  @Override
   public final List<Coder<?>> getCoderArguments() {
 return null;
   }

http://git-wip-us.apache.org/repos/asf/beam/blob/e3b25215/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/DeterministicStandardCoder.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/DeterministicStandardCoder.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/DeterministicStandardCoder.java
deleted file mode 100644
index 8998ea5..000
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/DeterministicStandardCoder.java
+++ /dev/null
@@ -1,39 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.coders;
-
-/**
- * A {@link DeterministicStandardCoder} is a {@link StandardCoder} that is
- * deterministic, in the sense that for objects considered equal
- * according to {@link Object#equals(Object)}, the encoded bytes are
- * also equal.
- *
- * @param <T> the type of the values being transcoded
- */
-public abstract class DeterministicStandardCoder<T> extends StandardCoder<T> {
-  protected DeterministicStandardCoder() {}
-
-  /**
-   * {@inheritDoc}
-   *
-   * @throws NonDeterministicException never, unless overridden. A
-   * {@link DeterministicStandardCoder} is presumed deterministic.
-   */
-  @Override
-  public void verifyDeterministic() throws NonDeterministicException { }
-}

http://git-wip-us.apache.org/repos/asf/beam/blob/e3b25215/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/CoderRegistryTest.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/CoderRegistryTest.java
 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/CoderRegistryTest.java
index 774ca9d..10e011f 100644
--- 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/CoderRegistryTest.java
+++ 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/CoderRegistryTest.java
@@ 

[jira] [Created] (BEAM-2051) Reduce scope of the PCollectionView interface

2017-04-21 Thread Thomas Groh (JIRA)
Thomas Groh created BEAM-2051:
-

 Summary: Reduce scope of the PCollectionView interface
 Key: BEAM-2051
 URL: https://issues.apache.org/jira/browse/BEAM-2051
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-core
Reporter: Thomas Groh
Assignee: Thomas Groh


Users should only ever use a PCollectionView class as a token to access a view. 
A Runner can cast down to a more expressive type if required.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-2051) Reduce scope of the PCollectionView interface

2017-04-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979564#comment-15979564
 ] 

ASF GitHub Bot commented on BEAM-2051:
--

GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/2641

[BEAM-2051] Make SimplePCollectionView Visible

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---
View will be replaced as a marker interface. Runners can expect to
always receive a subclass of SimplePCollectionView, and cast to it when
methods are required.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam view_as_marker

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2641.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2641


commit b32c2fc99596e19ca31272c1d61ad40cc292446a
Author: Thomas Groh 
Date:   2017-04-21T16:46:28Z

Make SimplePCollectionView Visible

View will be replaced as a marker interface. Runners can expect to
always receive a subclass of SimplePCollectionView, and cast to it when
methods are required.




> Reduce scope of the PCollectionView interface
> -
>
> Key: BEAM-2051
> URL: https://issues.apache.org/jira/browse/BEAM-2051
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Thomas Groh
>Assignee: Thomas Groh
>
> Users should only ever use a PCollectionView class as a token to access a 
> view. A Runner can cast down to a more expressive type if required.





[GitHub] beam pull request #2641: [BEAM-2051] Make SimplePCollectionView Visible

2017-04-21 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/2641

[BEAM-2051] Make SimplePCollectionView Visible

View will be replaced as a marker interface. Runners can expect to
always receive a subclass of SimplePCollectionView, and cast to it when
methods are required.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam view_as_marker

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2641.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2641


commit b32c2fc99596e19ca31272c1d61ad40cc292446a
Author: Thomas Groh 
Date:   2017-04-21T16:46:28Z

Make SimplePCollectionView Visible

View will be replaced as a marker interface. Runners can expect to
always receive a subclass of SimplePCollectionView, and cast to it when
methods are required.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] beam pull request #2639: Change dataflow Job log from info to debug

2017-04-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2639




[2/2] beam git commit: This closes #2639

2017-04-21 Thread chamikara
This closes #2639


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/cf5450f8
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/cf5450f8
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/cf5450f8

Branch: refs/heads/master
Commit: cf5450f8ab294a2c9ebee34f9bac10fad4022bf3
Parents: a2047ac 06cf0b5
Author: Chamikara Jayalath 
Authored: Fri Apr 21 16:21:42 2017 -0700
Committer: Chamikara Jayalath 
Committed: Fri Apr 21 16:21:42 2017 -0700

--
 sdks/python/apache_beam/runners/dataflow/internal/apiclient.py | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)
--




[jira] [Commented] (BEAM-1988) utils.path.join does not correctly handle GCS bucket roots

2017-04-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979559#comment-15979559
 ] 

ASF GitHub Bot commented on BEAM-1988:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2585


> utils.path.join does not correctly handle GCS bucket roots
> --
>
> Key: BEAM-1988
> URL: https://issues.apache.org/jira/browse/BEAM-1988
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Ahmet Altay
>Assignee: Sourabh Bajaj
> Fix For: First stable release
>
>
> Here:
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/utils/path.py#L22
> Joining a bucket root with a filename, e.g. (gs://mybucket/ , myfile), results 
> in the invalid path 'gs://mybucket//myfile'; notice the double // between 
> mybucket and myfile. (It actually does not correctly handle anything that 
> already ends with {{/}}.)
> [~sb2nov] could you take this one? Also, should the `join` operation move to 
> BeamFileSystem-level code?
> (cc: [~chamikara])
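
The symptom reported above can be reproduced with a minimal sketch. Note that `naive_join` is a hypothetical stand-in written for illustration, not the code in `utils/path.py`; it assumes the join simply places a `/` between every pair of components.

```python
def naive_join(base, *parts):
    """Hypothetical join that always inserts '/' between components."""
    return '/'.join((base,) + parts)


# A base that already ends with '/' picks up a second slash.
print(naive_join('gs://mybucket/', 'myfile'))  # gs://mybucket//myfile
# A base without the trailing slash joins as expected.
print(naive_join('gs://mybucket', 'myfile'))   # gs://mybucket/myfile
```
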





[GitHub] beam pull request #2585: [BEAM-1988] Add join operation to the filesystem

2017-04-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2585




[1/2] beam git commit: [BEAM-1988] Add join operation to the filesystem

2017-04-21 Thread chamikara
Repository: beam
Updated Branches:
  refs/heads/master bebee2a72 -> a2047acdb


[BEAM-1988] Add join operation to the filesystem


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/82dcfc6f
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/82dcfc6f
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/82dcfc6f

Branch: refs/heads/master
Commit: 82dcfc6ffea46bc1dc5f12b3d8365af98caf7a94
Parents: bebee2a
Author: Sourabh Bajaj 
Authored: Tue Apr 18 16:18:38 2017 -0700
Committer: Chamikara Jayalath 
Committed: Fri Apr 21 16:18:40 2017 -0700

--
 sdks/python/apache_beam/io/filesystem.py| 12 +++
 sdks/python/apache_beam/io/gcp/gcsfilesystem.py | 19 +++
 .../apache_beam/io/gcp/gcsfilesystem_test.py|  9 ++
 sdks/python/apache_beam/io/localfilesystem.py   | 11 +++
 .../apache_beam/io/localfilesystem_test.py  | 33 ++--
 5 files changed, 82 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/82dcfc6f/sdks/python/apache_beam/io/filesystem.py
--
diff --git a/sdks/python/apache_beam/io/filesystem.py 
b/sdks/python/apache_beam/io/filesystem.py
index 3a71ac1..591d0b0 100644
--- a/sdks/python/apache_beam/io/filesystem.py
+++ b/sdks/python/apache_beam/io/filesystem.py
@@ -426,6 +426,18 @@ class FileSystem(object):
     return compression_type
 
   @abc.abstractmethod
+  def join(self, basepath, *paths):
+    """Join two or more pathname components for the filesystem
+
+    Args:
+      basepath: string path of the first component of the path
+      paths: path components to be added
+
+    Returns: full path after combining all the passed components
+    """
+    raise NotImplementedError
+
+  @abc.abstractmethod
   def mkdirs(self, path):
     """Recursively create directories for the provided path.
 

http://git-wip-us.apache.org/repos/asf/beam/blob/82dcfc6f/sdks/python/apache_beam/io/gcp/gcsfilesystem.py
--
diff --git a/sdks/python/apache_beam/io/gcp/gcsfilesystem.py 
b/sdks/python/apache_beam/io/gcp/gcsfilesystem.py
index a10a3d2..99f27f8 100644
--- a/sdks/python/apache_beam/io/gcp/gcsfilesystem.py
+++ b/sdks/python/apache_beam/io/gcp/gcsfilesystem.py
@@ -33,6 +33,25 @@ class GCSFileSystem(FileSystem):
 
   CHUNK_SIZE = gcsio.MAX_BATCH_OPERATION_SIZE  # Chuck size in batch operations
 
+  def join(self, basepath, *paths):
+    """Join two or more pathname components for the filesystem
+
+    Args:
+      basepath: string path of the first component of the path
+      paths: path components to be added
+
+    Returns: full path after combining all the passed components
+    """
+    if not basepath.startswith('gs://'):
+      raise ValueError('Basepath %r must be GCS path.', basepath)
+    path = basepath
+    for p in paths:
+      if path == '' or path.endswith('/'):
+        path += p
+      else:
+        path += '/' + p
+    return path
+
   def mkdirs(self, path):
     """Recursively create directories for the provided path.
 

http://git-wip-us.apache.org/repos/asf/beam/blob/82dcfc6f/sdks/python/apache_beam/io/gcp/gcsfilesystem_test.py
--
diff --git a/sdks/python/apache_beam/io/gcp/gcsfilesystem_test.py 
b/sdks/python/apache_beam/io/gcp/gcsfilesystem_test.py
index 5a1f10d..d6a8fd7 100644
--- a/sdks/python/apache_beam/io/gcp/gcsfilesystem_test.py
+++ b/sdks/python/apache_beam/io/gcp/gcsfilesystem_test.py
@@ -36,6 +36,15 @@ except ImportError:
 @unittest.skipIf(gcsfilesystem is None, 'GCP dependencies are not installed')
 class GCSFileSystemTest(unittest.TestCase):
 
+  def test_join(self):
+    file_system = gcsfilesystem.GCSFileSystem()
+    self.assertEqual('gs://bucket/path/to/file',
+                     file_system.join('gs://bucket/path', 'to', 'file'))
+    self.assertEqual('gs://bucket/path/to/file',
+                     file_system.join('gs://bucket/path', 'to/file'))
+    self.assertEqual('gs://bucket/path//to/file',
+                     file_system.join('gs://bucket/path', '/to/file'))
+
   @mock.patch('apache_beam.io.gcp.gcsfilesystem.gcsio')
   def test_match_multiples(self, mock_gcsio):
 # Prepare mocks.
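
The behaviour exercised by `test_join` can be tried without the GCP dependencies using a standalone sketch. `gcs_join` is a hypothetical free function that mirrors the logic of `GCSFileSystem.join` as it appears in this commit. It is not part of the Beam API. One difference from the diff, labelled in the code: the error message is built with `%` formatting, since passing `basepath` as a second argument to `ValueError` leaves the `%r` placeholder unfilled.

```python
def gcs_join(basepath, *paths):
    """Hypothetical standalone copy of the GCS join logic, for illustration."""
    if not basepath.startswith('gs://'):
        # Assumption: format the message so the offending path is shown.
        raise ValueError('Basepath %r must be GCS path.' % basepath)
    path = basepath
    for p in paths:
        # Only add a separator when the path so far does not end with one.
        if path == '' or path.endswith('/'):
            path += p
        else:
            path += '/' + p
    return path


# Bucket root from BEAM-1988: no doubled slash.
print(gcs_join('gs://mybucket/', 'myfile'))        # gs://mybucket/myfile
# Cases taken from test_join.
print(gcs_join('gs://bucket/path', 'to', 'file'))  # gs://bucket/path/to/file
print(gcs_join('gs://bucket/path', '/to/file'))    # gs://bucket/path//to/file
```

Only a trailing slash on the accumulated path is checked. A leading slash on a later component is not stripped, so the last call still yields a double slash, which is what the third assertion in `test_join` records.
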

http://git-wip-us.apache.org/repos/asf/beam/blob/82dcfc6f/sdks/python/apache_beam/io/localfilesystem.py
--
diff --git a/sdks/python/apache_beam/io/localfilesystem.py 
b/sdks/python/apache_beam/io/localfilesystem.py
index 7637f2a..fbb65bf 100644
--- a/sdks/python/apache_beam/io/localfilesystem.py
+++ b/sdks/python/apache_beam/io/localfilesystem.py
@@ -34,6 +34,17 @@ class 

[jira] [Commented] (BEAM-1575) Add ValidatesRunner test to PipelineTest.test_metrics_in_source

2017-04-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979531#comment-15979531
 ] 

ASF GitHub Bot commented on BEAM-1575:
--

GitHub user pabloem reopened a pull request:

https://github.com/apache/beam/pull/2593

[BEAM-1575] Adding validatesrunner test for metrics in sources



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pabloem/incubator-beam source-metrics-test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2593.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2593


commit 7aa16e3a56bea1e4b87e77f1757be5bb152b1ba4
Author: Pablo 
Date:   2017-04-19T16:44:54Z

Adding validatesrunner test for sources

commit c7307bc32fda18ae8fe834f73df24257b349d509
Author: Pablo 
Date:   2017-04-19T22:59:56Z

Fixing test

commit 1d96fe3bd330bf22599a313905d56505753517c9
Author: Pablo 
Date:   2017-04-19T23:08:30Z

Fix lint issues

commit fcc205d2a9614d74bde14c28752a07a0638751c9
Author: Pablo 
Date:   2017-04-20T17:00:11Z

Improving coverage. Fixing lint issue.

commit b0da056815f78a531fd6996a521d0e46bb634f47
Author: Pablo 
Date:   2017-04-20T18:31:04Z

Reusing existing source.

commit e20c4ead8513aacd7334d797f183590993f1cacc
Author: Pablo 
Date:   2017-04-20T19:24:09Z

Fixing lint issue




> Add ValidatesRunner test to PipelineTest.test_metrics_in_source
> ---
>
> Key: BEAM-1575
> URL: https://issues.apache.org/jira/browse/BEAM-1575
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>
> Currently, the source works only in unit tests. We need a source 
> that can be used with all runners.





[GitHub] beam pull request #2593: [BEAM-1575] Adding validatesrunner test for metrics...

2017-04-21 Thread pabloem
Github user pabloem closed the pull request at:

https://github.com/apache/beam/pull/2593




[GitHub] beam pull request #2593: [BEAM-1575] Adding validatesrunner test for metrics...

2017-04-21 Thread pabloem
GitHub user pabloem reopened a pull request:

https://github.com/apache/beam/pull/2593

[BEAM-1575] Adding validatesrunner test for metrics in sources



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pabloem/incubator-beam source-metrics-test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2593.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2593


commit 7aa16e3a56bea1e4b87e77f1757be5bb152b1ba4
Author: Pablo 
Date:   2017-04-19T16:44:54Z

Adding validatesrunner test for sources

commit c7307bc32fda18ae8fe834f73df24257b349d509
Author: Pablo 
Date:   2017-04-19T22:59:56Z

Fixing test

commit 1d96fe3bd330bf22599a313905d56505753517c9
Author: Pablo 
Date:   2017-04-19T23:08:30Z

Fix lint issues

commit fcc205d2a9614d74bde14c28752a07a0638751c9
Author: Pablo 
Date:   2017-04-20T17:00:11Z

Improving coverage. Fixing lint issue.

commit b0da056815f78a531fd6996a521d0e46bb634f47
Author: Pablo 
Date:   2017-04-20T18:31:04Z

Reusing existing source.

commit e20c4ead8513aacd7334d797f183590993f1cacc
Author: Pablo 
Date:   2017-04-20T19:24:09Z

Fixing lint issue






[jira] [Commented] (BEAM-1871) Thin Java SDK Core

2017-04-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979521#comment-15979521
 ] 

ASF GitHub Bot commented on BEAM-1871:
--

GitHub user lukecwik opened a pull request:

https://github.com/apache/beam/pull/2640

[BEAM-1871] Move Xml IO and related classes to new sdks/java/io/xml package.




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lukecwik/incubator-beam thin_sdk_core3

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2640.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2640


commit 823e2dc09f3fc7c510859335e472ed1457b222f5
Author: Luke Cwik 
Date:   2017-04-21T22:45:04Z

[BEAM-1871] Move Xml IO and related classes to new sdks/java/io/xml package.




> Thin Java SDK Core
> --
>
> Key: BEAM-1871
> URL: https://issues.apache.org/jira/browse/BEAM-1871
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Daniel Halperin
>Assignee: Luke Cwik
> Fix For: First stable release
>
>
> Before first stable release we need to thin out {{sdk-java-core}} module. 
> Some candidates for removal (not an exhaustive list):
> {{sdk/io}}
> * anything BigQuery related
> * anything PubSub related
> * everything Protobuf related
> * TFRecordIO
> * XMLSink
> {{sdk/util}}
> * Everything GCS related
> * Everything Backoff related
> * Everything Google API related: ResponseInterceptors, RetryHttpBackoff, etc.
> * Everything CloudObject-related
> * Pubsub stuff
> {{sdk/coders}}
> * JAXBCoder
> * TableRowJsonCoder





[GitHub] beam pull request #2640: [BEAM-1871] Move Xml IO and related classes to new ...

2017-04-21 Thread lukecwik
GitHub user lukecwik opened a pull request:

https://github.com/apache/beam/pull/2640

[BEAM-1871] Move Xml IO and related classes to new sdks/java/io/xml package.




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lukecwik/incubator-beam thin_sdk_core3

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2640.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2640


commit 823e2dc09f3fc7c510859335e472ed1457b222f5
Author: Luke Cwik 
Date:   2017-04-21T22:45:04Z

[BEAM-1871] Move Xml IO and related classes to new sdks/java/io/xml package.






[GitHub] beam pull request #2639: Change dataflow Job log from info to debug

2017-04-21 Thread vikkyrk
GitHub user vikkyrk opened a pull request:

https://github.com/apache/beam/pull/2639

Change dataflow Job log from info to debug



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vikkyrk/incubator-beam py_log

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2639.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2639


commit e232ddecb2bc7f8a5afa129fcd8d9763ed671b59
Author: Vikas Kedigehalli 
Date:   2017-04-21T22:26:50Z

Change dataflow Job log from info to debug






Jenkins build is back to normal : beam_PostCommit_Python_Verify #1956

2017-04-21 Thread Apache Jenkins Server
See 




[jira] [Assigned] (BEAM-2027) get error sometimes while running the same code using beam0.6

2017-04-21 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-2027:
-

Assignee: Aviem Zur  (was: Kenneth Knowles)

> get error sometimes while running the same code using beam0.6
> -
>
> Key: BEAM-2027
> URL: https://issues.apache.org/jira/browse/BEAM-2027
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core, runner-spark
> Environment: spark-1.6.2-bin-hadoop2.6, hadoop-2.6.0, source:hdfs 
> sink:hdfs
>Reporter: liyuntian
>Assignee: Aviem Zur
>
> When running a YARN job with Beam 0.6.0 that reads files from HDFS and writes 
> records back to HDFS (spark-1.6.2-bin-hadoop2.6, hadoop-2.6.0), I sometimes 
> get the error below: 
> 17/04/20 21:10:45 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 
> (TID 0, etl-develop-003): java.io.InvalidClassException: 
> org.apache.beam.runners.spark.coders.CoderHelpers$3; local class 
> incompatible: stream classdesc serialVersionUID = 1334222146820528045, local 
> class serialVersionUID = 5119956493581628999
>   at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:617)
>   at 
> java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1622)
>   at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
>   at 
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>   at 
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>   at 
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>   at 
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>   at 
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>   at 
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>   at 
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>   at 
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>   at 
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>   at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>   at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
>   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
>   at 
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>   at 
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>   at 
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>   at 
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>   at 
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>   at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>   at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> 

[jira] [Commented] (BEAM-301) Add a Beam SQL DSL

2017-04-21 Thread Tyler Akidau (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979469#comment-15979469
 ] 

Tyler Akidau commented on BEAM-301:
---

To keep the JIRA in sync w/ the dev list, the first of the two docs I 
referenced above is now available: http://s.apache.org/beam-streams-tables .

> Add a Beam SQL DSL
> --
>
> Key: BEAM-301
> URL: https://issues.apache.org/jira/browse/BEAM-301
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql, sdk-ideas
>Reporter: Jean-Baptiste Onofré
>Assignee: Xu Mingmin
>
> The SQL DSL helps developers build a Beam pipeline directly from a SQL 
> statement given as a String. 
> In Phase I, it supports INSERT/SELECT queries with FILTERs; an example SQL 
> statement is shown below:
> {code}
> INSERT INTO `SUB_USEREVENT` (`SITEID`, `PAGEID`, `PAGENAME`, `EVENTTIMESTAMP`)
> (SELECT STREAM `USEREVENT`.`SITEID`, `USEREVENT`.`PAGEID`, 
> `USEREVENT`.`PAGENAME`, `USEREVENT`.`EVENTTIMESTAMP`
> FROM `USEREVENT` AS `USEREVENT`
> WHERE `USEREVENT`.`SITEID` > 10)
> {code}
> A design doc is available at 
> https://docs.google.com/document/d/1Uc5xYTpO9qsLXtT38OfuoqSLimH_0a1Bz5BsCROMzCU/edit?usp=sharing.





[GitHub] beam pull request #2625: Coder.structuralValue(T) should never throw

2017-04-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2625


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] beam git commit: This closes #2625

2017-04-21 Thread tgroh
Repository: beam
Updated Branches:
  refs/heads/master 552ddb4ad -> bebee2a72


This closes #2625


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/bebee2a7
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/bebee2a7
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/bebee2a7

Branch: refs/heads/master
Commit: bebee2a7284ca3c463ce69dee217e4dfa998be57
Parents: 552ddb4 f8a10ff
Author: Thomas Groh 
Authored: Fri Apr 21 15:00:07 2017 -0700
Committer: Thomas Groh 
Committed: Fri Apr 21 15:00:07 2017 -0700

--
 .../apache/beam/runners/dataflow/internal/IsmFormat.java  |  2 +-
 .../src/main/java/org/apache/beam/sdk/coders/Coder.java   |  2 +-
 .../java/org/apache/beam/sdk/coders/DelegateCoder.java| 10 --
 .../src/main/java/org/apache/beam/sdk/coders/KvCoder.java |  2 +-
 .../java/org/apache/beam/sdk/coders/NullableCoder.java|  2 +-
 .../java/org/apache/beam/sdk/coders/StandardCoder.java|  2 +-
 .../org/apache/beam/sdk/coders/StringDelegateCoder.java   |  2 +-
 .../org/apache/beam/sdk/io/kafka/KafkaRecordCoder.java|  2 +-
 8 files changed, 15 insertions(+), 9 deletions(-)
--




[2/2] beam git commit: Coder.structuralValue(T) should never throw

2017-04-21 Thread tgroh
Coder.structuralValue(T) should never throw

In the worst case, encoding to a byte array should never fail due to IO.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/f8a10ffd
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/f8a10ffd
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/f8a10ffd

Branch: refs/heads/master
Commit: f8a10ffd7af93705d8011d5287eb2225e540a1fd
Parents: 552ddb4
Author: Thomas Groh 
Authored: Thu Apr 20 20:00:07 2017 -0700
Committer: Thomas Groh 
Committed: Fri Apr 21 15:00:07 2017 -0700

--
 .../apache/beam/runners/dataflow/internal/IsmFormat.java  |  2 +-
 .../src/main/java/org/apache/beam/sdk/coders/Coder.java   |  2 +-
 .../java/org/apache/beam/sdk/coders/DelegateCoder.java| 10 --
 .../src/main/java/org/apache/beam/sdk/coders/KvCoder.java |  2 +-
 .../java/org/apache/beam/sdk/coders/NullableCoder.java|  2 +-
 .../java/org/apache/beam/sdk/coders/StandardCoder.java|  2 +-
 .../org/apache/beam/sdk/coders/StringDelegateCoder.java   |  2 +-
 .../org/apache/beam/sdk/io/kafka/KafkaRecordCoder.java|  2 +-
 8 files changed, 15 insertions(+), 9 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/f8a10ffd/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/internal/IsmFormat.java
--
diff --git a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/internal/IsmFormat.java b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/internal/IsmFormat.java
index 6daddc6..33c27f8 100644
--- a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/internal/IsmFormat.java
+++ b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/internal/IsmFormat.java
@@ -403,7 +403,7 @@ public class IsmFormat {
 }
 
 @Override
-public Object structuralValue(IsmRecord record) throws Exception {
+public Object structuralValue(IsmRecord record) {
   checkNotNull(record);
   checkState(record.getKeyComponents().size() == keyComponentCoders.size(),
   "Expected the number of key component coders %s "

http://git-wip-us.apache.org/repos/asf/beam/blob/f8a10ffd/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/Coder.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/Coder.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/Coder.java
index 39efaf2..779961e 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/Coder.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/Coder.java
@@ -198,7 +198,7 @@ public interface Coder extends Serializable {
*
* See also {@link #consistentWithEquals()}.
*/
-  Object structuralValue(T value) throws Exception;
+  Object structuralValue(T value);
 
   /**
* Returns whether {@link #registerByteSizeObserver} cheap enough to

http://git-wip-us.apache.org/repos/asf/beam/blob/f8a10ffd/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/DelegateCoder.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/DelegateCoder.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/DelegateCoder.java
index 1762243..7e1154a 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/DelegateCoder.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/DelegateCoder.java
@@ -107,8 +107,14 @@ public final class DelegateCoder extends CustomCoder {
* coder.
*/
   @Override
-  public Object structuralValue(T value) throws Exception {
-return coder.structuralValue(toFn.apply(value));
+  public Object structuralValue(T value) {
+try {
+  IntermediateT intermediate = toFn.apply(value);
+  return coder.structuralValue(intermediate);
+} catch (Exception exn) {
+  throw new IllegalArgumentException(
+  "Unable to encode element '" + value + "' with coder '" + this + "'.", exn);
+}
   }
 
   @Override

http://git-wip-us.apache.org/repos/asf/beam/blob/f8a10ffd/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/KvCoder.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/KvCoder.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/KvCoder.java
index 3c61bf6..fcb906c 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/KvCoder.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/KvCoder.java
@@ -114,7 +114,7 @@ 

[jira] [Commented] (BEAM-2046) Better API for querying metrics

2017-04-21 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-2046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979464#comment-15979464
 ] 

Ismaël Mejía commented on BEAM-2046:


Huge +1. It would also be nice to have a method that receives a list (or 
varargs) of names in a namespace and returns a map with the associated existing 
values for the given metrics.

> Better API for querying metrics
> ---
>
> Key: BEAM-2046
> URL: https://issues.apache.org/jira/browse/BEAM-2046
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Affects Versions: First stable release
>Reporter: Malo Denielou
>Assignee: Davor Bonaci
>
> I just want to read a metric :-).
> Can we have a better API than this:
> Iterable seenMetrics = job.metrics()
> .queryMetrics(
> MetricsFilter.builder()
> .addNameFilter(MetricNameFilter.named("XX", "YY"))
> .build())
> .counters();
> long seenSentinels = Iterables.isEmpty(seenMetrics) ? 0
> : Iterables.getFirst(seenMetrics, null).committed();
> This is very clunky :-P.
> Ideally I'd like to read a metric with a name, and provide a default value if 
> the metric is not there.
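
One possible shape for the convenience requested above, as a language-neutral
sketch written in Python rather than against Beam's Java metrics API. It
assumes committed counter values are already available in a plain dictionary
keyed by (namespace, name); `read_counter` and `read_counters` are hypothetical
helper names, not existing Beam methods.

```python
def read_counter(counters, namespace, name, default=0):
    """Return the value of one counter, or `default` if it was never reported.

    `counters` is assumed to be a dict keyed by (namespace, name) tuples.
    """
    return counters.get((namespace, name), default)


def read_counters(counters, namespace, names):
    """Return a {name: value} map for the requested names that exist.

    Names with no reported value are left out of the result, so the caller
    can tell "missing" apart from "present with value 0".
    """
    return {
        name: counters[(namespace, name)]
        for name in names
        if (namespace, name) in counters
    }
```

With helpers like these, the multi-line query in the issue description would
collapse to a single call such as `read_counter(counters, "XX", "YY")`.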





[jira] [Updated] (BEAM-1988) utils.path.join does not correctly handle GCS bucket roots

2017-04-21 Thread Sourabh Bajaj (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sourabh Bajaj updated BEAM-1988:

Fix Version/s: First stable release

> utils.path.join does not correctly handle GCS bucket roots
> --
>
> Key: BEAM-1988
> URL: https://issues.apache.org/jira/browse/BEAM-1988
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Ahmet Altay
>Assignee: Sourabh Bajaj
> Fix For: First stable release
>
>
> Here:
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/utils/path.py#L22
> Joining a bucket root with a filename, e.g. (gs://mybucket/ , myfile), results 
> in the invalid path 'gs://mybucket//myfile'; notice the double // between 
> mybucket and myfile. (It actually does not correctly handle anything that 
> already ends with {{/}}.)
> [~sb2nov] could you take this one? Also, should the `join` operation move to 
> BeamFileSystem-level code?
> (cc: [~chamikara])
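
A minimal sketch of a join that avoids the doubled slash described above. This
is not the code in `apache_beam/utils/path.py`; `join_path` is a hypothetical
name, and the sketch assumes that later components are always relative (their
leading slashes are dropped rather than resetting the path, unlike
`os.path.join`).

```python
def join_path(base, *parts):
    """Join path components with exactly one '/' between each pair.

    A trailing '/' on the accumulated path (for example a bucket root such
    as 'gs://mybucket/') is reused as the separator instead of adding a
    second one.
    """
    result = base
    for part in parts:
        # Assumption: components after the first are relative, so any
        # leading slashes are treated as redundant separators.
        part = part.lstrip('/')
        if not part:
            continue
        if not result or result.endswith('/'):
            result += part
        else:
            result += '/' + part
    return result
```

Because the function only ever appends, the scheme prefix (`gs://`) is never
touched, which keeps the bucket root intact whether or not it ends with `/`.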





Jenkins build is back to stable : beam_PostCommit_Java_MavenInstall #3417

2017-04-21 Thread Apache Jenkins Server
See 




Jenkins build became unstable: beam_PostCommit_Java_MavenInstall #3416

2017-04-21 Thread Apache Jenkins Server
See 




[GitHub] beam pull request #2638: [BEAM-1871] Remove unnecessary runtime dependencies...

2017-04-21 Thread lukecwik
GitHub user lukecwik opened a pull request:

https://github.com/apache/beam/pull/2638

[BEAM-1871] Remove unnecessary runtime dependencies for Google Cloud Storage

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lukecwik/incubator-beam thin_sdk_core2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2638.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2638


commit f2dfa36ac68e474703797dd5c41d5bb3bdddb791
Author: Luke Cwik 
Date:   2017-04-21T21:28:20Z

[BEAM-1871] Remove unnecessary runtime dependencies for Google Cloud 
Storage.






[jira] [Commented] (BEAM-1871) Thin Java SDK Core

2017-04-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979421#comment-15979421
 ] 

ASF GitHub Bot commented on BEAM-1871:
--

GitHub user lukecwik opened a pull request:

https://github.com/apache/beam/pull/2638

[BEAM-1871] Remove unnecessary runtime dependencies for Google Cloud Storage



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lukecwik/incubator-beam thin_sdk_core2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2638.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2638


commit f2dfa36ac68e474703797dd5c41d5bb3bdddb791
Author: Luke Cwik 
Date:   2017-04-21T21:28:20Z

[BEAM-1871] Remove unnecessary runtime dependencies for Google Cloud 
Storage.




> Thin Java SDK Core
> --
>
> Key: BEAM-1871
> URL: https://issues.apache.org/jira/browse/BEAM-1871
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Daniel Halperin
>Assignee: Luke Cwik
> Fix For: First stable release
>
>
> Before the first stable release we need to thin out the {{sdk-java-core}} module. 
> Some candidates for removal (not an exhaustive list):
> {{sdk/io}}
> * anything BigQuery related
> * anything PubSub related
> * everything Protobuf related
> * TFRecordIO
> * XMLSink
> {{sdk/util}}
> * Everything GCS related
> * Everything Backoff related
> * Everything Google API related: ResponseInterceptors, RetryHttpBackoff, etc.
> * Everything CloudObject-related
> * Pubsub stuff
> {{sdk/coders}}
> * JAXBCoder
> * TableRowJsonCoder





[jira] [Commented] (BEAM-1786) AutoService registration of coders, like we do with PipelineRunners

2017-04-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979398#comment-15979398
 ] 

ASF GitHub Bot commented on BEAM-1786:
--

GitHub user lukecwik opened a pull request:

https://github.com/apache/beam/pull/2637

[BEAM-1786] Post Dataflow worker CoderRegistry clean-up



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lukecwik/incubator-beam thin_sdk_core

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2637.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2637


commit 1b8678136a420e360e23e6ff5deac5f368a7a43c
Author: Luke Cwik 
Date:   2017-04-21T21:04:51Z

[BEAM-1786] Post Dataflow worker CoderRegistry clean-up




> AutoService registration of coders, like we do with PipelineRunners
> ---
>
> Key: BEAM-1786
> URL: https://issues.apache.org/jira/browse/BEAM-1786
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Luke Cwik
> Fix For: First stable release
>
>
> Today, registering coders for auxiliary data types for a library transform is 
> not very convenient. If the type appears in an output/covariant position then it 
> might be possible to use {{getDefaultOutputCoder}} to solve things. But for 
> writes/contravariant positions this is not applicable and the library 
> transform must contort itself to avoid requiring the user to come up with a 
> coder for a type they don't own.
> Probably the best case today is an explicit call to 
> {{LibraryTransform.registerCoders(Pipeline)}} which is far too manual.
> This could likely be solved quite easily with {{@AutoService}} and a static 
> global coder registry, as we do with pipeline runners.





[GitHub] beam pull request #2637: [BEAM-1786] Post Dataflow worker CoderRegistry clea...

2017-04-21 Thread lukecwik
GitHub user lukecwik opened a pull request:

https://github.com/apache/beam/pull/2637

[BEAM-1786] Post Dataflow worker CoderRegistry clean-up



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lukecwik/incubator-beam thin_sdk_core

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2637.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2637


commit 1b8678136a420e360e23e6ff5deac5f368a7a43c
Author: Luke Cwik 
Date:   2017-04-21T21:04:51Z

[BEAM-1786] Post Dataflow worker CoderRegistry clean-up






[jira] [Created] (BEAM-2050) CombineTest and CombineFnsTest should not rely on behavior of opaque TestCombineFn

2017-04-21 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-2050:
-

 Summary: CombineTest and CombineFnsTest should not rely on 
behavior of opaque TestCombineFn
 Key: BEAM-2050
 URL: https://issues.apache.org/jira/browse/BEAM-2050
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-core
Reporter: Kenneth Knowles


This is a CombineFn that "exhaustively" uses all the capabilities of Combine. 
It results in readability problems for tests. Each test should clearly set 
things up and check particular functionality.





Build failed in Jenkins: beam_PostCommit_Python_Verify #1955

2017-04-21 Thread Apache Jenkins Server
See 


Changes:

[lcwik] Validates that input and output GCS paths specify a bucket

--
[...truncated 708.19 KB...]
Collecting setuptools (from pyhamcrest->-r postcommit_requirements.txt (line 1))
Collecting funcsigs>=1 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded 
/tmp/dataflow-requirements-cache/funcsigs-1.0.2.tar.gz
  File was already downloaded 
/tmp/dataflow-requirements-cache/setuptools-35.0.1.zip
  File was already downloaded 
/tmp/dataflow-requirements-cache/setuptools-35.0.1.zip
Collecting pbr>=0.11 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded /tmp/dataflow-requirements-cache/pbr-3.0.0.tar.gz
Collecting six (from pyhamcrest->-r postcommit_requirements.txt (line 1))
  File was already downloaded /tmp/dataflow-requirements-cache/six-1.10.0.tar.gz
Collecting packaging>=16.8 (from setuptools->pyhamcrest->-r 
postcommit_requirements.txt (line 1))
  File was already downloaded 
/tmp/dataflow-requirements-cache/packaging-16.8.tar.gz
Collecting six (from pyhamcrest->-r postcommit_requirements.txt (line 1))
  File was already downloaded /tmp/dataflow-requirements-cache/six-1.10.0.tar.gz
Collecting funcsigs>=1 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded 
/tmp/dataflow-requirements-cache/funcsigs-1.0.2.tar.gz
Collecting appdirs>=1.4.0 (from setuptools->pyhamcrest->-r 
postcommit_requirements.txt (line 1))
Collecting funcsigs>=1 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded 
/tmp/dataflow-requirements-cache/appdirs-1.4.3.tar.gz
  File was already downloaded 
/tmp/dataflow-requirements-cache/funcsigs-1.0.2.tar.gz
Collecting pbr>=0.11 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded /tmp/dataflow-requirements-cache/pbr-3.0.0.tar.gz
Collecting pyparsing (from packaging>=16.8->setuptools->pyhamcrest->-r 
postcommit_requirements.txt (line 1))
Collecting pbr>=0.11 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded /tmp/dataflow-requirements-cache/pbr-3.0.0.tar.gz
  File was already downloaded 
/tmp/dataflow-requirements-cache/pyparsing-2.2.0.tar.gz
Collecting packaging>=16.8 (from setuptools->pyhamcrest->-r 
postcommit_requirements.txt (line 1))
  File was already downloaded 
/tmp/dataflow-requirements-cache/packaging-16.8.tar.gz
Collecting packaging>=16.8 (from setuptools->pyhamcrest->-r 
postcommit_requirements.txt (line 1))
  File was already downloaded 
/tmp/dataflow-requirements-cache/packaging-16.8.tar.gz
Successfully downloaded pyhamcrest mock setuptools six funcsigs pbr packaging 
appdirs pyparsing
Collecting appdirs>=1.4.0 (from setuptools->pyhamcrest->-r 
postcommit_requirements.txt (line 1))
  File was already downloaded 
/tmp/dataflow-requirements-cache/appdirs-1.4.3.tar.gz
Collecting pyparsing (from packaging>=16.8->setuptools->pyhamcrest->-r 
postcommit_requirements.txt (line 1))
Collecting appdirs>=1.4.0 (from setuptools->pyhamcrest->-r 
postcommit_requirements.txt (line 1))
  File was already downloaded 
/tmp/dataflow-requirements-cache/appdirs-1.4.3.tar.gz
  File was already downloaded 
/tmp/dataflow-requirements-cache/pyparsing-2.2.0.tar.gz
Collecting pyparsing (from packaging>=16.8->setuptools->pyhamcrest->-r 
postcommit_requirements.txt (line 1))
  File was already downloaded 
/tmp/dataflow-requirements-cache/pyparsing-2.2.0.tar.gz
Successfully downloaded pyhamcrest mock setuptools six funcsigs pbr packaging 
appdirs pyparsing
Successfully downloaded pyhamcrest mock setuptools six funcsigs pbr packaging 
appdirs pyparsing
test_multiple_empty_outputs 
(apache_beam.transforms.ptransform_test.PTransformTest) ... ok
:132:
 UserWarning: Using fallback coder for typehint: List[Any].
  warnings.warn('Using fallback coder for typehint: %r.' % typehint)
:132:
 UserWarning: Using fallback coder for typehint: Union[].
  warnings.warn('Using fallback coder for typehint: %r.' % typehint)
DEPRECATION: pip install --download has been deprecated and will be removed in 
the future. Pip now has a download command that should be used instead.
Collecting pyhamcrest (from -r postcommit_requirements.txt (line 1))
  File was already downloaded 
/tmp/dataflow-requirements-cache/PyHamcrest-1.9.0.tar.gz
Collecting mock (from -r postcommit_requirements.txt (line 2))
  File was already downloaded /tmp/dataflow-requirements-cache/mock-2.0.0.tar.gz
Collecting setuptools (from pyhamcrest->-r postcommit_requirements.txt (line 1))
  File was already downloaded 
/tmp/dataflow-requirements-cache/setuptools-35.0.1.zip
Collecting six (from pyhamcrest->-r 

[jira] [Commented] (BEAM-2049) Remove KeyedCombineFn

2017-04-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979367#comment-15979367
 ] 

ASF GitHub Bot commented on BEAM-2049:
--

GitHub user kennknowles opened a pull request:

https://github.com/apache/beam/pull/2636

[BEAM-2049] Remove KeyedCombineFn


I think it is probably best to leave tweaking `StateTag` and `StateSpec` 
until a later PR...


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/beam delete-KeyedCombineFn

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2636.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2636


commit 6ccacf632dd712750a46ba992683bde676213819
Author: Kenneth Knowles 
Date:   2017-04-21T21:04:02Z

Remove KeyedCombineFn




> Remove KeyedCombineFn
> -
>
> Key: BEAM-2049
> URL: https://issues.apache.org/jira/browse/BEAM-2049
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
> Fix For: First stable release
>
>






[GitHub] beam pull request #2636: [BEAM-2049] Remove KeyedCombineFn

2017-04-21 Thread kennknowles
GitHub user kennknowles opened a pull request:

https://github.com/apache/beam/pull/2636

[BEAM-2049] Remove KeyedCombineFn


I think it is probably best to leave tweaking `StateTag` and `StateSpec` 
until a later PR...


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/beam delete-KeyedCombineFn

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2636.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2636


commit 6ccacf632dd712750a46ba992683bde676213819
Author: Kenneth Knowles 
Date:   2017-04-21T21:04:02Z

Remove KeyedCombineFn






[GitHub] beam pull request #2631: Update Dataflow Worker Version

2017-04-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2631




[2/2] beam git commit: Update Dataflow Worker Version

2017-04-21 Thread lcwik
Update Dataflow Worker Version

This closes #2631


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/552ddb4a
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/552ddb4a
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/552ddb4a

Branch: refs/heads/master
Commit: 552ddb4adec72fb9098f7d100bb14ef8956bf8f0
Parents: 6a1f581 78f8267
Author: Luke Cwik 
Authored: Fri Apr 21 14:02:54 2017 -0700
Committer: Luke Cwik 
Committed: Fri Apr 21 14:02:54 2017 -0700

--
 runners/google-cloud-dataflow-java/pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--




[jira] [Commented] (BEAM-2048) --worker_harness_container_image should issue a warning if not used with --runner DataflowRunner

2017-04-21 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-2048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979360#comment-15979360
 ] 

María GH commented on BEAM-2048:


Could a runner that reads an option it doesn't use issue an "N/A"-style warning?

> --worker_harness_container_image should issue a warning if not used with 
> --runner DataflowRunner
> 
>
> Key: BEAM-2048
> URL: https://issues.apache.org/jira/browse/BEAM-2048
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py
>Reporter: María GH
>Priority: Minor
>
> Running
> python -m apache_beam.examples.wordcount --output counts 
> --worker_harness_container_image 
> doesn't issue a warning saying that the job is being run by DirectRunner 
> (default) and therefore no worker_harness_container_image will be used.
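
A rough sketch of the kind of check being asked for, using only the standard
library. It is not Beam's option-parsing code: `parse_and_check` and
`DATAFLOW_ONLY_OPTIONS` are hypothetical names, and the assumption that
`--worker_harness_container_image` is meaningful only for `DataflowRunner`
comes from the issue description above.

```python
import argparse
import warnings

# Assumption: options listed here are only honoured by DataflowRunner.
DATAFLOW_ONLY_OPTIONS = ('worker_harness_container_image',)


def parse_and_check(argv):
    """Parse the options of interest and warn about ones the runner ignores.

    Returns the parsed namespace and the list of ignored option names.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument('--runner', default='DirectRunner')
    parser.add_argument('--worker_harness_container_image', default=None)
    # Unknown flags (such as --output) belong to the pipeline and are skipped.
    known, _ = parser.parse_known_args(argv)

    ignored = []
    if known.runner != 'DataflowRunner':
        for name in DATAFLOW_ONLY_OPTIONS:
            if getattr(known, name) is not None:
                ignored.append(name)
                warnings.warn(
                    '--%s is set but will be ignored by %s.'
                    % (name, known.runner),
                    UserWarning)
    return known, ignored
```

Keeping the runner-specific option names in one table is a deliberate choice
in this sketch: the same check then extends to other worker options without
adding a new branch for each one.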





[jira] [Created] (BEAM-2049) Remove KeyedCombineFn

2017-04-21 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-2049:
-

 Summary: Remove KeyedCombineFn
 Key: BEAM-2049
 URL: https://issues.apache.org/jira/browse/BEAM-2049
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-core
Reporter: Kenneth Knowles
Assignee: Kenneth Knowles
 Fix For: First stable release








[GitHub] beam pull request #2635: Makes cachedSplitResult transient in BigQuerySource...

2017-04-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2635




[1/2] beam git commit: Makes cachedSplitResult transient in BigQuerySourceBase

2017-04-21 Thread jkff
Repository: beam
Updated Branches:
  refs/heads/master 0527f6b66 -> 6a1f58156


Makes cachedSplitResult transient in BigQuerySourceBase


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/79d187c6
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/79d187c6
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/79d187c6

Branch: refs/heads/master
Commit: 79d187c6d403abb66a638fc720a79b45fa5288bf
Parents: 0527f6b
Author: Eugene Kirpichov 
Authored: Fri Apr 21 13:23:53 2017 -0700
Committer: Eugene Kirpichov 
Committed: Fri Apr 21 13:23:53 2017 -0700

--
 .../org/apache/beam/sdk/io/gcp/bigquery/BigQuerySourceBase.java| 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/79d187c6/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQuerySourceBase.java
--
diff --git a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQuerySourceBase.java b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQuerySourceBase.java
index 53d395b..ab7f4e8 100644
--- a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQuerySourceBase.java
+++ b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQuerySourceBase.java
@@ -68,7 +68,7 @@ abstract class BigQuerySourceBase extends BoundedSource {
   protected final BigQueryServices bqServices;
   protected final ValueProvider executingProject;
 
-  private List cachedSplitResult;
+  private transient List cachedSplitResult;
 
   BigQuerySourceBase(
   ValueProvider jobIdToken,



[GitHub] beam pull request #2635: Makes cachedSplitResult transient in BigQuerySource...

2017-04-21 Thread jkff
GitHub user jkff opened a pull request:

https://github.com/apache/beam/pull/2635

Makes cachedSplitResult transient in BigQuerySourceBase

Otherwise this leads to O(N^2) explosion in serialized size of splits.

R: @dhalperi 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jkff/incubator-beam bq-transient

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2635.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2635


commit 79d187c6d403abb66a638fc720a79b45fa5288bf
Author: Eugene Kirpichov 
Date:   2017-04-21T20:23:53Z

Makes cachedSplitResult transient in BigQuerySourceBase




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] beam-site pull request #197: Transfer content from Create Your Pipeline to P...

2017-04-21 Thread hadarhg
Github user hadarhg closed the pull request at:

https://github.com/apache/beam-site/pull/197




[jira] [Commented] (BEAM-2048) --worker_harness_container_image should issue a warning if not used with --runner DataflowRunner

2017-04-21 Thread Ahmet Altay (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15979300#comment-15979300
 ] 

Ahmet Altay commented on BEAM-2048:
---

The same is true for any `WorkerOptions`. Maybe we can rename the option group 
to something clearer. I am not sure what a good strategy would be for the 
general problem: a user may pass a set of options, and a particular runner may 
or may not use them. 

> --worker_harness_container_image should issue a warning if not used with 
> --runner DataflowRunner
> 
>
> Key: BEAM-2048
> URL: https://issues.apache.org/jira/browse/BEAM-2048
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py
>Reporter: María GH
>Assignee: Ahmet Altay
>Priority: Minor
>
> Running
> python -m apache_beam.examples.wordcount --output counts 
> --worker_harness_container_image 
> doesn't issue a warning saying that the job is being run by DirectRunner 
> (default) and therefore no worker_harness_container_image will be used.





[jira] [Assigned] (BEAM-2048) --worker_harness_container_image should issue a warning if not used with --runner DataflowRunner

2017-04-21 Thread Ahmet Altay (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay reassigned BEAM-2048:
-

Assignee: (was: Ahmet Altay)

> --worker_harness_container_image should issue a warning if not used with 
> --runner DataflowRunner
> 
>
> Key: BEAM-2048
> URL: https://issues.apache.org/jira/browse/BEAM-2048
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py
>Reporter: María GH
>Priority: Minor
>
> Running
> python -m apache_beam.examples.wordcount --output counts 
> --worker_harness_container_image 
> doesn't issue a warning saying that the job is being run by DirectRunner 
> (default) and therefore no worker_harness_container_image will be used.





[GitHub] beam pull request #2602: Validates that input and output GCS paths specify a...

2017-04-21 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2602



