[GitHub] beam pull request #2834: [BEAM-59] Move GcsFileSystem to gcp-core

2017-05-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2834


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] beam git commit: [BEAM-59] Move GcsFileSystem to gcp-core

2017-05-02 Thread dhalperi
Repository: beam
Updated Branches:
  refs/heads/master 027dd777d -> 5bfd3e049


[BEAM-59] Move GcsFileSystem to gcp-core

It is used by both runner and IO, so should be in core.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/b2a4ae2b
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/b2a4ae2b
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/b2a4ae2b

Branch: refs/heads/master
Commit: b2a4ae2b307b8c540ff8a40878521bf3d5e532ff
Parents: 027dd77
Author: Dan Halperin 
Authored: Tue May 2 11:08:16 2017 -0700
Committer: Dan Halperin 
Committed: Tue May 2 16:17:57 2017 -0700

--
 .../src/main/resources/beam/findbugs-filter.xml |   2 +-
 .../extensions/gcp/storage/GcsFileSystem.java   | 266 ++
 .../gcp/storage/GcsFileSystemRegistrar.java |  43 +++
 .../extensions/gcp/storage/GcsResourceId.java   | 128 +
 .../extensions/gcp/storage/package-info.java|  21 ++
 .../gcp/storage/GcsFileSystemRegistrarTest.java |  52 
 .../gcp/storage/GcsFileSystemTest.java  | 274 +++
 .../gcp/storage/GcsResourceIdTest.java  | 169 
 sdks/java/io/google-cloud-platform/pom.xml  |   5 -
 .../beam/sdk/io/gcp/storage/GcsFileSystem.java  | 266 --
 .../io/gcp/storage/GcsFileSystemRegistrar.java  |  43 ---
 .../beam/sdk/io/gcp/storage/GcsResourceId.java  | 128 -
 .../beam/sdk/io/gcp/storage/package-info.java   |  21 --
 .../gcp/storage/GcsFileSystemRegistrarTest.java |  52 
 .../sdk/io/gcp/storage/GcsFileSystemTest.java   | 274 ---
 .../sdk/io/gcp/storage/GcsResourceIdTest.java   | 169 
 16 files changed, 954 insertions(+), 959 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/b2a4ae2b/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml
--
diff --git a/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml b/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml
index d1d8b4d..28bbc3c 100644
--- a/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml
+++ b/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml
@@ -203,7 +203,7 @@
   
 
   
-
+
 
 
 

[jira] [Commented] (BEAM-59) Switch from IOChannelFactory to FileSystems

2017-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993995#comment-15993995
 ] 

ASF GitHub Bot commented on BEAM-59:


Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2834


> Switch from IOChannelFactory to FileSystems
> ---
>
> Key: BEAM-59
> URL: https://issues.apache.org/jira/browse/BEAM-59
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core, sdk-java-gcp
>Reporter: Daniel Halperin
>Assignee: Daniel Halperin
> Fix For: First stable release
>
>
> Right now, FileBasedSource and FileBasedSink communication is mediated by 
> IOChannelFactory. There are a number of issues:
> * Global configuration -- e.g., all 'gs://' URIs use the same credentials. 
> This should be per-source/per-sink/etc.
> * Supported APIs -- currently IOChannelFactory is in the "non-public API" 
> util package and subject to change. We need users to be able to add new 
> backends ('s3://', 'hdfs://', etc.) directly, without fear that they will be 
> broken.
> * Per-backend features: e.g., creating buckets in GCS/s3, setting expiration 
> time, etc.
> Updates:
> Design docs posted on dev@ list:
> Part 1: IOChannelFactory Redesign: 
> https://docs.google.com/document/d/11TdPyZ9_zmjokhNWM3Id-XJsVG3qel2lhdKTknmZ_7M/edit#
> Part 2: Configurable BeamFileSystem:
> https://docs.google.com/document/d/1-7vo9nLRsEEzDGnb562PuL4q9mUiq_ZVpCAiyyJw8p8/edit#heading=h.p3gc3colc2cs
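
The first two issues in the description above (global configuration and user-pluggable backends) can be illustrated with a minimal Python sketch. This is not Beam's FileSystems API. Every name below (FileSystem, GcsLikeFileSystem, LocalFileSystem, FileSystemRegistry, for_path, the options dict) is hypothetical and exists only to show the idea of choosing a backend by URI scheme while passing configuration per instance instead of globally.

```python
from urllib.parse import urlparse


class FileSystem:
    """Hypothetical base class: one backend per URI scheme."""

    scheme = None

    def __init__(self, options=None):
        # Configuration (for example credentials) is held per instance,
        # not in global state shared by every 'gs://' URI.
        self.options = dict(options or {})


class GcsLikeFileSystem(FileSystem):
    """Stand-in for a 'gs://' backend; performs no real I/O."""

    scheme = "gs"


class LocalFileSystem(FileSystem):
    """Stand-in for plain local paths."""

    scheme = "file"


class FileSystemRegistry:
    """Maps a URI scheme to the backend class registered for it."""

    def __init__(self):
        self._backends = {}

    def register(self, filesystem_cls):
        # Users can add new backends ('s3', 'hdfs', ...) through this call.
        self._backends[filesystem_cls.scheme] = filesystem_cls

    def for_path(self, path, options=None):
        # Paths without a scheme are treated as local files (an assumption).
        scheme = urlparse(path).scheme or "file"
        if scheme not in self._backends:
            raise ValueError("No filesystem registered for scheme %r" % scheme)
        return self._backends[scheme](options)
```

With this shape, two sinks writing to 'gs://' paths can each obtain their own backend instance with different credentials, and an unregistered scheme fails loudly rather than silently falling back.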



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[2/2] beam git commit: This closes #2834

2017-05-02 Thread dhalperi
This closes #2834


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/5bfd3e04
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/5bfd3e04
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/5bfd3e04

Branch: refs/heads/master
Commit: 5bfd3e049c0ca0744165b0243a645e8e427032d5
Parents: 027dd77 b2a4ae2
Author: Dan Halperin 
Authored: Tue May 2 16:18:01 2017 -0700
Committer: Dan Halperin 
Committed: Tue May 2 16:18:01 2017 -0700

--
 .../src/main/resources/beam/findbugs-filter.xml |   2 +-
 .../extensions/gcp/storage/GcsFileSystem.java   | 266 ++
 .../gcp/storage/GcsFileSystemRegistrar.java |  43 +++
 .../extensions/gcp/storage/GcsResourceId.java   | 128 +
 .../extensions/gcp/storage/package-info.java|  21 ++
 .../gcp/storage/GcsFileSystemRegistrarTest.java |  52 
 .../gcp/storage/GcsFileSystemTest.java  | 274 +++
 .../gcp/storage/GcsResourceIdTest.java  | 169 
 sdks/java/io/google-cloud-platform/pom.xml  |   5 -
 .../beam/sdk/io/gcp/storage/GcsFileSystem.java  | 266 --
 .../io/gcp/storage/GcsFileSystemRegistrar.java  |  43 ---
 .../beam/sdk/io/gcp/storage/GcsResourceId.java  | 128 -
 .../beam/sdk/io/gcp/storage/package-info.java   |  21 --
 .../gcp/storage/GcsFileSystemRegistrarTest.java |  52 
 .../sdk/io/gcp/storage/GcsFileSystemTest.java   | 274 ---
 .../sdk/io/gcp/storage/GcsResourceIdTest.java   | 169 
 16 files changed, 954 insertions(+), 959 deletions(-)
--




[GitHub] beam pull request #2850: PR only for Jenkins testing

2017-05-02 Thread markflyhigh
GitHub user markflyhigh opened a pull request:

https://github.com/apache/beam/pull/2850

PR only for Jenkins testing

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---

Add a new Jenkins branch to run the Java postcommit tests across JDK versions.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/markflyhigh/incubator-beam jenkins_java_sdk_version_test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2850.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2850


commit ea50a1e942ee3c5fd256c09e8aacdbcecd98ddbc
Author: Mark Liu 
Date:   2017-03-08T23:44:38Z

Jenkins build test

commit 95fa1842f9a00d4a6944ded0085a342cbd521152
Author: Mark Liu 
Date:   2017-03-08T23:55:08Z

use default branch

commit ca9fcd82fb4f9ac2902248c335709a0e7f8a5168
Author: Mark Liu 
Date:   2017-03-09T00:48:54Z

fic maven command line

commit 4f930b3909ea4a4f7f5f46ad2aac034e9d0f675e
Author: Mark Liu 
Date:   2017-03-09T01:13:18Z

fix seed job failed

commit ac6987ff889a6c2c2f70cb892eff884b79c502a1
Author: Mark Liu 
Date:   2017-03-09T01:21:49Z

fixup! fix maven config

commit 054d78415d786218b8503a0d892118a5a8a5c867
Author: Mark Liu 
Date:   2017-03-09T01:44:41Z

fixup

commit cd35cfccf008a26ead9028cfbcb9499d89fca36a
Author: Mark Liu 
Date:   2017-03-09T02:08:06Z

fixup! remove label

commit b39ac501538b37ca3e9702b073d806734c1e102f
Author: Mark Liu 
Date:   2017-03-09T02:23:33Z

Remove SeedJob failure notification to dev@

commit c28113a4321ff11e2d8ccb645dd3554f11943006
Author: Mark Liu 
Date:   2017-03-09T02:59:42Z

remove exclude python project in build due to Jenkins bug

commit 6c7741f9e54e321913293a862ef3c1f7b609629f
Author: Mark Liu 
Date:   2017-03-09T05:43:48Z

Run Java postcommit test suite

commit 9bb19d664808f5bef6fdcf6a333cd0d4a8b10f95
Author: Mark Liu 
Date:   2017-03-09T06:06:53Z

add label in axes

commit 21f054d898bcf3d9966688f4851957c33095bcb1
Author: Mark Liu 
Date:   2017-03-09T18:40:23Z

Add JDK 1.8 version

commit d60ce6889469f73190383bbf9f34e665f1becd94
Author: Mark Liu 
Date:   2017-03-15T19:45:27Z

fixup! add env inspecting command and clean maven command

commit c9ba3b0468267f84582721d4f177abdafbc9b643
Author: Mark Liu 
Date:   2017-03-15T22:45:21Z

fixup! Ignore API tests and Skip Python project build

commit 43e53603842adeac8399135ebc9b9b19a2218125
Author: Mark Liu 
Date:   2017-05-02T23:13:06Z

Add api test back and rebase branch






[GitHub] beam pull request #2849: Trigger for Java SDK post submit tests

2017-05-02 Thread vikkyrk
GitHub user vikkyrk opened a pull request:

https://github.com/apache/beam/pull/2849

Trigger for Java SDK post submit tests



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vikkyrk/incubator-beam post_submit_trigger

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2849.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2849


commit bc91825a836f421857fcf2439fc989a231c92322
Author: Vikas Kedigehalli 
Date:   2017-05-02T23:11:09Z

Triggers for Java SDK post submit tests






Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #3597

2017-05-02 Thread Apache Jenkins Server
See 




[GitHub] beam-site pull request #228: Fix broken HDFS link on built-in IO page

2017-05-02 Thread melap
GitHub user melap opened a pull request:

https://github.com/apache/beam-site/pull/228

Fix broken HDFS link on built-in IO page

R: @lukecwik @aaltay 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/melap/beam-site hdfslink

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam-site/pull/228.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #228


commit cda6b2a8903207bb29aa91b0faf287b76b2f09bc
Author: melissa 
Date:   2017-05-02T23:07:20Z

Fix broken HDFS link on built-in IO page






[GitHub] beam-site pull request #227: [BEAM-2142] Add composite transforms Python sni...

2017-05-02 Thread melap
GitHub user melap opened a pull request:

https://github.com/apache/beam-site/pull/227

[BEAM-2142] Add composite transforms Python snippets to programming guide

Add new snippets from BEAM-1926
R: @aaltay 


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/melap/beam-site compositesnippets

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam-site/pull/227.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #227


commit 1061b825f0888ee4454c6675bd039de2df7cbe51
Author: melissa 
Date:   2017-05-02T22:49:41Z

[BEAM-2142] Add composite transforms Python snippets to programming guide






[jira] [Commented] (BEAM-2142) add missing composite transforms Python snippets to programming guide

2017-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993940#comment-15993940
 ] 

ASF GitHub Bot commented on BEAM-2142:
--

GitHub user melap opened a pull request:

https://github.com/apache/beam-site/pull/227

[BEAM-2142] Add composite transforms Python snippets to programming guide

Add new snippets from BEAM-1926
R: @aaltay 


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/melap/beam-site compositesnippets

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam-site/pull/227.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #227


commit 1061b825f0888ee4454c6675bd039de2df7cbe51
Author: melissa 
Date:   2017-05-02T22:49:41Z

[BEAM-2142] Add composite transforms Python snippets to programming guide




> add missing composite transforms Python snippets to programming guide
> -
>
> Key: BEAM-2142
> URL: https://issues.apache.org/jira/browse/BEAM-2142
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: Melissa Pashniak
>Assignee: Melissa Pashniak
>






[jira] [Commented] (BEAM-1925) Make DoFn invocation logic of Python SDK more extensible

2017-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993935#comment-15993935
 ] 

ASF GitHub Bot commented on BEAM-1925:
--

GitHub user sb2nov opened a pull request:

https://github.com/apache/beam/pull/2848

[BEAM-1925] validate DoFn at pipeline creation time


R: @chamikaramj PTAL

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sb2nov/beam BEAM-1925-validate-dofn

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2848.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2848


commit 26b09ba71b60c330d9605a8baa039900a50c4c9d
Author: Sourabh Bajaj 
Date:   2017-05-02T22:47:54Z

[BEAM-1925] validate DoFn at pipeline creation time




> Make DoFn invocation logic of Python SDK more extensible
> 
>
> Key: BEAM-1925
> URL: https://issues.apache.org/jira/browse/BEAM-1925
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>
> DoFn invocation logic of Python SDK is currently in DoFnRunner class.
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/common.py#L54
> When a DoFnRunner is initialized, we parse the DoFn and create local state. We use 
> this state when invoking the DoFn methods process, start_bundle, and 
> finish_bundle. For example, we store a list of ArgPlaceholder objects within 
> the state of DoFnRunner to facilitate invocation of the process method.
> We will need to extend this functionality when adding new features to DoFn 
> class (for example to support Splittable DoFn [1]). So I think it's good to 
> refactor this code to be more extensible. 
> I think a good approach for this is to add DoFnInvoker and DoFnSignature 
> classes similar to Java SDK [2].
> In this approach:
> A DoFnSignature captures the signature of a DoFn including methods and 
> arguments.
> A DoFnInvoker implements a particular way DoFn methods will be executed 
> (initially we'll have simple and per-window invokers [3]).
> A runner uses DoFnRunner to execute methods of a given DoFn. At 
> initialization, DoFnRunner creates a DoFnSignature and a DoFnInvoker for the 
> given DoFn.
> DoFnSignature and DoFnInvoker methods will be used by SplittableDoFn 
> implementation as well. 
> [1] 
> https://docs.google.com/document/d/1h_zprJrOilivK2xfvl4L42vaX4DMYGfH1YDmi-s_ozM/edit#heading=h.e6patunrpiql
> [2] 
> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/DoFnSignature.java
> [3] 
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/common.py#L200
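
The split proposed in the description above can be sketched as follows. This is only an illustration under stated assumptions, not the Python SDK's implementation: the DoFn class here is a local stand-in rather than the real apache_beam class, and SimpleInvoker, run_bundle, and the methods dictionary are hypothetical names. The signature object records each DoFn method and its argument names once, the invoker decides how methods are called, and the runner creates both at initialization.

```python
import inspect


class DoFn:
    """Minimal stand-in for a DoFn; not the real apache_beam class."""

    def start_bundle(self):
        pass

    def process(self, element):
        raise NotImplementedError

    def finish_bundle(self):
        pass


class DoFnSignature:
    """Captures the methods of a DoFn and the argument names of each."""

    METHOD_NAMES = ("start_bundle", "process", "finish_bundle")

    def __init__(self, do_fn):
        self.do_fn = do_fn
        self.methods = {}
        for name in self.METHOD_NAMES:
            method = getattr(do_fn, name)
            # For a bound method, inspect.signature leaves out `self`.
            self.methods[name] = list(inspect.signature(method).parameters)


class SimpleInvoker:
    """One particular way of executing DoFn methods: call them directly."""

    def __init__(self, signature):
        self.signature = signature

    def invoke_start_bundle(self):
        self.signature.do_fn.start_bundle()

    def invoke_process(self, element):
        return self.signature.do_fn.process(element)

    def invoke_finish_bundle(self):
        self.signature.do_fn.finish_bundle()


class DoFnRunner:
    """Creates a signature and an invoker at initialization, then uses them."""

    def __init__(self, do_fn):
        self.signature = DoFnSignature(do_fn)
        self.invoker = SimpleInvoker(self.signature)

    def run_bundle(self, elements):
        outputs = []
        self.invoker.invoke_start_bundle()
        for element in elements:
            # A process method may return None or an iterable of outputs.
            outputs.extend(self.invoker.invoke_process(element) or [])
        self.invoker.invoke_finish_bundle()
        return outputs


class SplitWords(DoFn):
    def process(self, element):
        return element.split()
```

Because the runner only talks to the invoker, a different invoker (for example a per-window one) could be swapped in without changing DoFnRunner itself.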





[GitHub] beam pull request #2848: [BEAM-1925] validate DoFn at pipeline creation time

2017-05-02 Thread sb2nov
GitHub user sb2nov opened a pull request:

https://github.com/apache/beam/pull/2848

[BEAM-1925] validate DoFn at pipeline creation time


R: @chamikaramj PTAL

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sb2nov/beam BEAM-1925-validate-dofn

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2848.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2848


commit 26b09ba71b60c330d9605a8baa039900a50c4c9d
Author: Sourabh Bajaj 
Date:   2017-05-02T22:47:54Z

[BEAM-1925] validate DoFn at pipeline creation time






[jira] [Created] (BEAM-2142) add missing composite transforms Python snippets to programming guide

2017-05-02 Thread Melissa Pashniak (JIRA)
Melissa Pashniak created BEAM-2142:
--

 Summary: add missing composite transforms Python snippets to 
programming guide
 Key: BEAM-2142
 URL: https://issues.apache.org/jira/browse/BEAM-2142
 Project: Beam
  Issue Type: Bug
  Components: website
Reporter: Melissa Pashniak
Assignee: Melissa Pashniak








[jira] [Commented] (BEAM-2128) PostCommit Java_MavenInstall broken

2017-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993916#comment-15993916
 ] 

ASF GitHub Bot commented on BEAM-2128:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2795


> PostCommit Java_MavenInstall broken
> ---
>
> Key: BEAM-2128
> URL: https://issues.apache.org/jira/browse/BEAM-2128
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core, testing
>Reporter: Daniel Halperin
>Assignee: Kenneth Knowles
>
> The test 
> {{org.apache.beam.examples.cookbook.BigQueryTornadoesIT.testE2EBigQueryTornadoes}}
> is broken since PR #2666 was merged to master.
> https://builds.apache.org/view/Beam/job/beam_PostCommit_Java_MavenInstall/3533/





[GitHub] beam pull request #2795: [BEAM-2128] Restore status quo relationship between...

2017-05-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2795




[1/3] beam git commit: Instantiate runner briefly in Pipeline

2017-05-02 Thread kenn
Repository: beam
Updated Branches:
  refs/heads/master a552fb8f6 -> 027dd777d


Instantiate runner briefly in Pipeline

Today, some runners mutate the PipelineOptions in critical ways
when they are built, and BigQueryIO depends, during construction,
on options that are sometimes the subject of those mutations.

Changes to BigQueryIO and runners will likely fix this, but this
commit unbreaks a postcommit build.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/78b25723
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/78b25723
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/78b25723

Branch: refs/heads/master
Commit: 78b257237e3fa02d868c8a3a0ed898de4c59fd0b
Parents: 6d9b239
Author: Kenneth Knowles 
Authored: Mon May 1 09:57:06 2017 -0700
Committer: Kenneth Knowles 
Committed: Tue May 2 14:01:15 2017 -0700

--
 sdks/java/core/src/main/java/org/apache/beam/sdk/Pipeline.java | 3 +++
 1 file changed, 3 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/78b25723/sdks/java/core/src/main/java/org/apache/beam/sdk/Pipeline.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/Pipeline.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/Pipeline.java
index d4c46cc..ab8906a 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/Pipeline.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/Pipeline.java
@@ -148,6 +148,9 @@ public class Pipeline {
    * @return The newly created pipeline.
    */
   public static Pipeline create(PipelineOptions options) {
+    // TODO: fix runners that mutate PipelineOptions in this method, then remove this line
+    PipelineRunner.fromOptions(options);
+
     Pipeline pipeline = new Pipeline(options);
     LOG.debug("Creating {}", pipeline);
     return pipeline;
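
The effect of this workaround can be illustrated with a small Python sketch. It is not Beam code. MutatingRunner, the job_name key, and job_name_dependent_transform are hypothetical, and the specific mutation shown (filling in a missing job name) is an assumption chosen for illustration; the commit message only says that some runners mutate the options in critical ways when built. The point is that building the runner once inside create() makes those mutations happen before any transform reads the options.

```python
class MutatingRunner:
    """Hypothetical runner that adjusts the options when it is built."""

    @classmethod
    def from_options(cls, options):
        # Assumed mutation: fill in a job name when none was supplied.
        options.setdefault("job_name", "generated-job-name")
        return cls()


class Pipeline:
    def __init__(self, options):
        self.options = options

    @classmethod
    def create(cls, options):
        # Build the runner briefly and discard it, so that its mutations
        # are applied before any transform is constructed.
        options["runner"].from_options(options)
        return cls(options)


def job_name_dependent_transform(pipeline):
    """Stand-in for a transform that reads options at construction time."""
    return "temp-table-for-" + pipeline.options["job_name"]
```

Without the extra call in create(), the lookup of job_name in this sketch would raise a KeyError, which mirrors the kind of construction-time dependency the commit message describes.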



[3/3] beam git commit: This closes #2795: Restore status quo relationship between PipelineOptions initialization and PipelineRunners

2017-05-02 Thread kenn
This closes #2795: Restore status quo relationship between PipelineOptions 
initialization and PipelineRunners

  Instantiate runner briefly in Pipeline
  Skip null options when converting back to argv


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/027dd777
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/027dd777
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/027dd777

Branch: refs/heads/master
Commit: 027dd777dd3c207cc689b4e1b350ca984d47191a
Parents: a552fb8 78b2572
Author: Kenneth Knowles 
Authored: Tue May 2 15:33:28 2017 -0700
Committer: Kenneth Knowles 
Committed: Tue May 2 15:33:28 2017 -0700

--
 sdks/java/core/src/main/java/org/apache/beam/sdk/Pipeline.java   | 3 +++
 .../src/main/java/org/apache/beam/sdk/testing/TestPipeline.java  | 4 +++-
 2 files changed, 6 insertions(+), 1 deletion(-)
--




[GitHub] beam pull request #2847: [BEAM-818,BEAM-828] Make tempLocation an inaccessib...

2017-05-02 Thread kennknowles
GitHub user kennknowles opened a pull request:

https://github.com/apache/beam/pull/2847

[BEAM-818,BEAM-828] Make tempLocation an inaccessible ValueProvider


This is my first foray into `ValueProvider` and `PipelineOptions` 
excitement. The goal: allow users to access the `tempLocation` that they 
specify but make it inaccessible to `PTransform.expand(..)` implementations.

The natural way to do this would be to clone and alter the pipeline 
options, something that runners also ought to be doing when they tweak them 
like `TestDataflowRunner` does. I am informed this is not possible or feasible 
in a short timeframe.

So this is a hack that might not actually work that lets `get()` succeed 
but makes `isAccessible()` false. Throwing it out there as a strawman as I 
don't see a way to do this correctly given today's constraints.
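
The behavior described above, where `get()` succeeds while `isAccessible()` reports false, can be sketched in Python. This is not the Java `ValueProvider` interface or the code in this pull request. The classes StaticValueProvider and InaccessibleValueProvider and the helper describe_temp_location are hypothetical and only mirror the two methods named in the description.

```python
class StaticValueProvider:
    """A provider whose value is known and readable at construction time."""

    def __init__(self, value):
        self._value = value

    def is_accessible(self):
        return True

    def get(self):
        return self._value


class InaccessibleValueProvider:
    """Reports itself as inaccessible, yet get() still returns the value."""

    def __init__(self, value):
        self._value = value

    def is_accessible(self):
        # Transforms that check this at construction time will defer.
        return False

    def get(self):
        # Users who call get() directly still see what they specified.
        return self._value


def describe_temp_location(provider):
    """Stand-in for a transform that respects is_accessible()."""
    if provider.is_accessible():
        return "construction-time value: %s" % provider.get()
    return "deferred to runtime"
```

The sketch also shows why the description calls this a hack: nothing stops a transform from ignoring `is_accessible()` and calling `get()` anyway.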

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/beam runtime-ValueProvider

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2847.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2847


commit ebfb87a93448b05d1bdac1e919d2d83d41f00405
Author: Kenneth Knowles 
Date:   2017-05-02T21:36:52Z

Make tempLocation an inaccessible ValueProvider






[jira] [Commented] (BEAM-818) ValueProvider for tempLocation, runner, etc, that is unavailable to transforms during construction

2017-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993915#comment-15993915
 ] 

ASF GitHub Bot commented on BEAM-818:
-

GitHub user kennknowles opened a pull request:

https://github.com/apache/beam/pull/2847

[BEAM-818,BEAM-828] Make tempLocation an inaccessible ValueProvider


This is my first foray into `ValueProvider` and `PipelineOptions` 
excitement. The goal: allow users to access the `tempLocation` that they 
specify but make it inaccessible to `PTransform.expand(..)` implementations.

The natural way to do this would be to clone and alter the pipeline 
options, something that runners also ought to be doing when they tweak them 
like `TestDataflowRunner` does. I am informed this is not possible or feasible 
in a short timeframe.

So this is a hack that might not actually work that lets `get()` succeed 
but makes `isAccessible()` false. Throwing it out there as a strawman as I 
don't see a way to do this correctly given today's constraints.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/beam runtime-ValueProvider

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2847.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2847


commit ebfb87a93448b05d1bdac1e919d2d83d41f00405
Author: Kenneth Knowles 
Date:   2017-05-02T21:36:52Z

Make tempLocation an inaccessible ValueProvider




> ValueProvider for tempLocation, runner, etc, that is unavailable to 
> transforms during construction
> --
>
> Key: BEAM-818
> URL: https://issues.apache.org/jira/browse/BEAM-818
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Thomas Groh
>Assignee: Kenneth Knowles
>  Labels: backward-incompatible
> Fix For: First stable release
>
>
> This stops transforms from changing their operation based on 
> construction-time options, and instead requires that configuration to be 
> explicit, or to obtain the configuration at runtime.
> https://docs.google.com/document/d/1Wr05cYdqnCfrLLqSk--XmGMGgDwwNwWZaFbxLKvPqEQ/edit#





[jira] [Commented] (BEAM-2128) PostCommit Java_MavenInstall broken

2017-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993911#comment-15993911
 ] 

ASF GitHub Bot commented on BEAM-2128:
--

GitHub user vikkyrk opened a pull request:

https://github.com/apache/beam/pull/2846

[BEAM-2128] Remove job name usages from BigQueryIO at pipeline construction 
time



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vikkyrk/incubator-beam bqio

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2846.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2846


commit a70c69408e5ba863abef9061f95e143d9ae034da
Author: Vikas Kedigehalli 
Date:   2017-05-02T20:55:32Z

Remove job name usages from BigQueryIO at pipeline construction time




> PostCommit Java_MavenInstall broken
> ---
>
> Key: BEAM-2128
> URL: https://issues.apache.org/jira/browse/BEAM-2128
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core, testing
>Reporter: Daniel Halperin
>Assignee: Kenneth Knowles
>
> The test 
> {{org.apache.beam.examples.cookbook.BigQueryTornadoesIT.testE2EBigQueryTornadoes}}
> is broken since PR #2666 was merged to master.
> https://builds.apache.org/view/Beam/job/beam_PostCommit_Java_MavenInstall/3533/





[GitHub] beam pull request #2846: [BEAM-2128] Remove job name usages from BigQueryIO ...

2017-05-02 Thread vikkyrk
GitHub user vikkyrk opened a pull request:

https://github.com/apache/beam/pull/2846

[BEAM-2128] Remove job name usages from BigQueryIO at pipeline construction 
time

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vikkyrk/incubator-beam bqio

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2846.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2846


commit a70c69408e5ba863abef9061f95e143d9ae034da
Author: Vikas Kedigehalli 
Date:   2017-05-02T20:55:32Z

Remove job name usages from BigQueryIO at pipeline construction time






[jira] [Commented] (BEAM-885) Move PipelineOptions from Pipeline.create() to Pipeline.run()

2017-05-02 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993908#comment-15993908
 ] 

Kenneth Knowles commented on BEAM-885:
--

I think now that things are converging, this is a duplicate of BEAM-818.

> Move PipelineOptions from Pipeline.create() to Pipeline.run()
> -
>
> Key: BEAM-885
> URL: https://issues.apache.org/jira/browse/BEAM-885
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model-runner-api, sdk-java-core
>Reporter: Thomas Groh
>Assignee: Kenneth Knowles
>  Labels: backward-incompatible
> Fix For: First stable release
>
>
> The specification of a Pipeline should be independent of its PipelineOptions. 
> This delays specification of the options, including choices like Pipeline 
> Runner.





[jira] [Resolved] (BEAM-1926) Need 3 Python snippets for composite transforms section in programming guide

2017-05-02 Thread Melissa Pashniak (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Melissa Pashniak resolved BEAM-1926.

Resolution: Fixed

Thanks!

> Need 3 Python snippets for composite transforms section in programming guide
> 
>
> Key: BEAM-1926
> URL: https://issues.apache.org/jira/browse/BEAM-1926
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py, website
>Reporter: Melissa Pashniak
>Assignee: Melissa Pashniak
>Priority: Minor
> Fix For: First stable release
>
>
> (A PR will be out later today with a note in the 3 needed Python 
> blocks, pointing to this JIRA for ease of finding.)





[jira] [Closed] (BEAM-1926) Need 3 Python snippets for composite transforms section in programming guide

2017-05-02 Thread Melissa Pashniak (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Melissa Pashniak closed BEAM-1926.
--

> Need 3 Python snippets for composite transforms section in programming guide
> 
>
> Key: BEAM-1926
> URL: https://issues.apache.org/jira/browse/BEAM-1926
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py, website
>Reporter: Melissa Pashniak
>Assignee: Melissa Pashniak
>Priority: Minor
> Fix For: First stable release
>
>
> (A PR will be out later today with a note in the 3 needed Python 
> blocks, pointing to this JIRA for ease of finding.)





[jira] [Resolved] (BEAM-885) Move PipelineOptions from Pipeline.create() to Pipeline.run()

2017-05-02 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles resolved BEAM-885.
--
Resolution: Duplicate

> Move PipelineOptions from Pipeline.create() to Pipeline.run()
> -
>
> Key: BEAM-885
> URL: https://issues.apache.org/jira/browse/BEAM-885
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model-runner-api, sdk-java-core
>Reporter: Thomas Groh
>Assignee: Kenneth Knowles
>  Labels: backward-incompatible
> Fix For: First stable release
>
>
> The specification of a Pipeline should be independent of its PipelineOptions. 
> This delays specification of the options, including choices like Pipeline 
> Runner.





[jira] [Comment Edited] (BEAM-1986) Job ALREADY_EXISTS in post commit

2017-05-02 Thread Mark Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993906#comment-15993906
 ] 

Mark Liu edited comment on BEAM-1986 at 5/2/17 10:28 PM:
-

What exactly does "name of the main application" mean? In my opinion, for testing 
it could be something like "TestClassName-TestFunctionName". For a general pipeline, 
it could be "Class/ModuleName".

I found that 
[inspect|https://docs.python.org/2.7/library/inspect.html#inspect.getframeinfo] 
may be useful, since it can retrieve the caller's stack trace. However, someone mentioned 
[here|http://stackoverflow.com/a/2654130] that it's not recommended for 
production code since it depends on the CPython implementation. 


was (Author: markflyhigh):
What exactly does "name of the main application" mean? Ideally, for testing it 
could be something like "TestClassName-TestFunctionName". For a general pipeline, it 
could be "Class/ModuleName".

I found that 
[inspect|https://docs.python.org/2.7/library/inspect.html#inspect.getframeinfo] 
may be useful, since it can retrieve the caller's stack trace. However, someone mentioned 
[here|http://stackoverflow.com/a/2654130] that it's not recommended for 
production code since it depends on the CPython implementation. 

> Job ALREADY_EXISTS in post commit
> -
>
> Key: BEAM-1986
> URL: https://issues.apache.org/jira/browse/BEAM-1986
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Ahmet Altay
>Assignee: Mark Liu
>Priority: Minor
>
> I noticed a job failed with an ALREADY_EXISTS error, a sign that the same {{job_name}} 
> was auto-generated twice. Could we add a 1-second delay to prevent things like 
> this?
> https://builds.apache.org/view/Beam/job/beam_PostCommit_Python_Verify/1877/consoleFull
> cc: [~pabloem] Another perspective: would it make sense to add a small random 
> component (e.g. 1-2 digits) to the job name to reduce this issue? Or perhaps 
> include ms resolution. 





[jira] [Commented] (BEAM-1986) Job ALREADY_EXISTS in post commit

2017-05-02 Thread Mark Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993906#comment-15993906
 ] 

Mark Liu commented on BEAM-1986:


What exactly does "name of the main application" mean? Ideally, for testing it 
could be something like "TestClassName-TestFunctionName". For a general pipeline, it 
could be "Class/ModuleName".

I found that 
[inspect|https://docs.python.org/2.7/library/inspect.html#inspect.getframeinfo] 
may be useful, since it can retrieve the caller's stack trace. However, someone mentioned 
[here|http://stackoverflow.com/a/2654130] that it's not recommended for 
production code since it depends on the CPython implementation. 
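
A minimal sketch of the inspect-based idea, assuming a hypothetical helper 
named caller_name (it is not part of Beam). It walks up the call stack from 
inspect.currentframe() and reads the function name through 
inspect.getframeinfo. Since inspect.currentframe() may return None on 
implementations without Python stack frame support, which is the portability 
concern raised in the linked answer, the helper returns None in that case.

```python
import inspect


def caller_name(depth=1):
    """Return 'module.function' for the frame `depth` levels above this one.

    Hypothetical helper for illustration only. Returns None when stack
    frames are unavailable or the stack is shallower than `depth`.
    """
    frame = inspect.currentframe()
    try:
        for _ in range(depth):
            if frame is None:
                return None
            frame = frame.f_back
        if frame is None:
            return None
        info = inspect.getframeinfo(frame)
        module = frame.f_globals.get("__name__", "unknown")
        return "%s.%s" % (module, info.function)
    finally:
        # Drop the frame reference to avoid keeping a reference cycle alive.
        del frame
```

Called from inside a test method, caller_name() would yield the enclosing 
module and function name, which could serve as the descriptive part of a 
job name.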

> Job ALREADY_EXISTS in post commit
> -
>
> Key: BEAM-1986
> URL: https://issues.apache.org/jira/browse/BEAM-1986
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Ahmet Altay
>Assignee: Mark Liu
>Priority: Minor
>
> I noticed a job failed with an ALREADY_EXISTS error, a sign that the same {{job_name}} 
> was auto-generated twice. Could we add a 1-second delay to prevent things like 
> this?
> https://builds.apache.org/view/Beam/job/beam_PostCommit_Python_Verify/1877/consoleFull
> cc: [~pabloem] Another perspective: would it make sense to add a small random 
> component (e.g. 1-2 digits) to the job name to reduce this issue? Or perhaps 
> include ms resolution. 





Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #3596

2017-05-02 Thread Apache Jenkins Server
See 




Jenkins build is unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #3014

2017-05-02 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #3595

2017-05-02 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-1986) Job ALREADY_EXISTS in post commit

2017-05-02 Thread Ahmet Altay (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993875#comment-15993875
 ] 

Ahmet Altay commented on BEAM-1986:
---

{{beamapp}} is a placeholder for {{ApplicationName}}. Is it possible to get the 
name of the main application in Python? If yes, we should use that.

Let's add the {{RandomInteger}} part regardless of the above.

> Job ALREADY_EXISTS in post commit
> -
>
> Key: BEAM-1986
> URL: https://issues.apache.org/jira/browse/BEAM-1986
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Ahmet Altay
>Assignee: Mark Liu
>Priority: Minor
>
> I noticed a job failed with an ALREADY_EXISTS error, a sign that the same {{job_name}} 
> was auto-generated twice. Could we add a 1-second delay to prevent things like 
> this?
> https://builds.apache.org/view/Beam/job/beam_PostCommit_Python_Verify/1877/consoleFull
> cc: [~pabloem] Another perspective: would it make sense to add a small random 
> component (e.g. 1-2 digits) to the job name to reduce this issue? Or perhaps 
> include ms resolution. 





[jira] [Resolved] (BEAM-2139) Disable SplittableDoFn ValidatesRunner tests for Streaming Flink Runner

2017-05-02 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-2139.
-
   Resolution: Fixed
Fix Version/s: First stable release

> Disable SplittableDoFn ValidatesRunner tests for Streaming Flink Runner
> ---
>
> Key: BEAM-2139
> URL: https://issues.apache.org/jira/browse/BEAM-2139
> Project: Beam
>  Issue Type: Task
>  Components: runner-flink
>Reporter: Aljoscha Krettek
>Assignee: Aviem Zur
> Fix For: First stable release
>
>
> As discovered as part of BEAM-1763, there is a failing SDF test. We should 
> disable those tests for now until we fix it to unblock the open PR for 
> BEAM-1763.





Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #3594

2017-05-02 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2139) Disable SplittableDoFn ValidatesRunner tests for Streaming Flink Runner

2017-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993866#comment-15993866
 ] 

ASF GitHub Bot commented on BEAM-2139:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2824


> Disable SplittableDoFn ValidatesRunner tests for Streaming Flink Runner
> ---
>
> Key: BEAM-2139
> URL: https://issues.apache.org/jira/browse/BEAM-2139
> Project: Beam
>  Issue Type: Task
>  Components: runner-flink
>Reporter: Aljoscha Krettek
>Assignee: Aviem Zur
>
> As discovered as part of BEAM-1763, there is a failing SDF test. We should 
> disable those tests for now until we fix it to unblock the open PR for 
> BEAM-1763.





[GitHub] beam pull request #2824: [BEAM-2139] Disable SplittableDoFn ValidatesRunner ...

2017-05-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2824




[1/2] beam git commit: [BEAM-2139] Disable SplittableDoFn ValidatesRunner tests for Streaming Flink Runner

2017-05-02 Thread lcwik
Repository: beam
Updated Branches:
  refs/heads/master b5561f718 -> a552fb8f6


[BEAM-2139] Disable SplittableDoFn ValidatesRunner tests for Streaming Flink 
Runner


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/b2283bf2
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/b2283bf2
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/b2283bf2

Branch: refs/heads/master
Commit: b2283bf2785ee9fc3927895f38eba93a8945ed49
Parents: b5561f7
Author: Aviem Zur 
Authored: Tue May 2 18:01:54 2017 +0300
Committer: Luke Cwik 
Committed: Tue May 2 14:56:22 2017 -0700

--
 runners/flink/pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/b2283bf2/runners/flink/pom.xml
--
diff --git a/runners/flink/pom.xml b/runners/flink/pom.xml
index 59a3f2d..eb2b005 100644
--- a/runners/flink/pom.xml
+++ b/runners/flink/pom.xml
@@ -93,7 +93,7 @@
 org.apache.beam.sdk.testing.UsesMapState,
 org.apache.beam.sdk.testing.UsesCommittedMetrics,
 org.apache.beam.sdk.testing.UsesTestStream,
-
org.apache.beam.sdk.testing.UsesSplittableParDoWithWindowedSideInputs
+org.apache.beam.sdk.testing.UsesSplittableParDo
   
   none
   true



[2/2] beam git commit: [BEAM-2139] Disable SplittableDoFn ValidatesRunner tests for Streaming Flink Runner

2017-05-02 Thread lcwik
[BEAM-2139] Disable SplittableDoFn ValidatesRunner tests for Streaming Flink 
Runner

This closes #2824


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/a552fb8f
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/a552fb8f
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/a552fb8f

Branch: refs/heads/master
Commit: a552fb8f69d1bcab64377d1743f5e7d4d653509a
Parents: b5561f7 b2283bf
Author: Luke Cwik 
Authored: Tue May 2 14:56:52 2017 -0700
Committer: Luke Cwik 
Committed: Tue May 2 14:56:52 2017 -0700

--
 runners/flink/pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--




[jira] [Commented] (BEAM-59) Switch from IOChannelFactory to FileSystems

2017-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993859#comment-15993859
 ] 

ASF GitHub Bot commented on BEAM-59:


Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2818


> Switch from IOChannelFactory to FileSystems
> ---
>
> Key: BEAM-59
> URL: https://issues.apache.org/jira/browse/BEAM-59
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core, sdk-java-gcp
>Reporter: Daniel Halperin
>Assignee: Daniel Halperin
> Fix For: First stable release
>
>
> Right now, FileBasedSource and FileBasedSink communication is mediated by 
> IOChannelFactory. There are a number of issues:
> * Global configuration -- e.g., all 'gs://' URIs use the same credentials. 
> This should be per-source/per-sink/etc.
> * Supported APIs -- currently IOChannelFactory is in the "non-public API" 
> util package and subject to change. We need users to be able to add new 
> backends ('s3://', 'hdfs://', etc.) directly, without fear that they will be 
> broken.
> * Per-backend features: e.g., creating buckets in GCS/s3, setting expiration 
> time, etc.
> Updates:
> Design docs posted on dev@ list:
> Part 1: IOChannelFactory Redesign: 
> https://docs.google.com/document/d/11TdPyZ9_zmjokhNWM3Id-XJsVG3qel2lhdKTknmZ_7M/edit#
> Part 2: Configurable BeamFileSystem:
> https://docs.google.com/document/d/1-7vo9nLRsEEzDGnb562PuL4q9mUiq_ZVpCAiyyJw8p8/edit#heading=h.p3gc3colc2cs





[GitHub] beam pull request #2818: [BEAM-59] Delete old restrictions on output file pa...

2017-05-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2818




[2/2] beam git commit: This closes #2818

2017-05-02 Thread dhalperi
This closes #2818


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/b5561f71
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/b5561f71
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/b5561f71

Branch: refs/heads/master
Commit: b5561f7188630f6dbe0efc7218541196080877ef
Parents: ccbb00e 00bee9b
Author: Dan Halperin 
Authored: Tue May 2 14:53:11 2017 -0700
Committer: Dan Halperin 
Committed: Tue May 2 14:53:11 2017 -0700

--
 .../src/main/java/org/apache/beam/sdk/io/AvroIO.java | 14 --
 .../main/java/org/apache/beam/sdk/io/TFRecordIO.java | 14 --
 .../src/main/java/org/apache/beam/sdk/io/TextIO.java | 14 --
 .../test/java/org/apache/beam/sdk/io/TextIOTest.java | 15 ---
 4 files changed, 57 deletions(-)
--




[1/2] beam git commit: [BEAM-59] Delete old restrictions on output file paths

2017-05-02 Thread dhalperi
Repository: beam
Updated Branches:
  refs/heads/master ccbb00e38 -> b5561f718


[BEAM-59] Delete old restrictions on output file paths

These predate Apache Beam and are no longer relevant now that Text and Avro are 
implemented
in the SDK


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/00bee9b7
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/00bee9b7
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/00bee9b7

Branch: refs/heads/master
Commit: 00bee9b70f6fabbb6bfe0655188f8d2c0a7239f9
Parents: ccbb00e
Author: Dan Halperin 
Authored: Mon May 1 23:36:57 2017 -0700
Committer: Dan Halperin 
Committed: Tue May 2 14:48:21 2017 -0700

--
 .../src/main/java/org/apache/beam/sdk/io/AvroIO.java | 14 --
 .../main/java/org/apache/beam/sdk/io/TFRecordIO.java | 14 --
 .../src/main/java/org/apache/beam/sdk/io/TextIO.java | 14 --
 .../test/java/org/apache/beam/sdk/io/TextIOTest.java | 15 ---
 4 files changed, 57 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/00bee9b7/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
index 755cdb9..3bb61a2 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/AvroIO.java
@@ -24,7 +24,6 @@ import com.google.common.collect.ImmutableMap;
 import com.google.common.collect.Maps;
 import com.google.common.io.BaseEncoding;
 import java.util.Map;
-import java.util.regex.Pattern;
 import javax.annotation.Nullable;
 import org.apache.avro.Schema;
 import org.apache.avro.file.CodecFactory;
@@ -302,7 +301,6 @@ public class AvroIO {
  * in a common extension, if given by {@link #withSuffix}.
  */
 public Write to(String filenamePrefix) {
-  validateOutputComponent(filenamePrefix);
   return toBuilder().setFilenamePrefix(filenamePrefix).build();
 }
 
@@ -317,7 +315,6 @@ public class AvroIO {
  * See {@link ShardNameTemplate} for a description of shard templates.
  */
 public Write withSuffix(String filenameSuffix) {
-  validateOutputComponent(filenameSuffix);
   return toBuilder().setFilenameSuffix(filenameSuffix).build();
 }
 
@@ -474,17 +471,6 @@ public class AvroIO {
 }
   }
 
-  // Pattern which matches old-style shard output patterns, which are now
-  // disallowed.
-  private static final Pattern SHARD_OUTPUT_PATTERN = Pattern.compile("@([0-9]+|\\*)");
-
-  private static void validateOutputComponent(String partialFilePattern) {
-checkArgument(
-!SHARD_OUTPUT_PATTERN.matcher(partialFilePattern).find(),
-"Output name components are not allowed to contain @* or @N patterns: "
-+ partialFilePattern);
-  }
-
   /
 
   /** Disallow construction of utility class. */

http://git-wip-us.apache.org/repos/asf/beam/blob/00bee9b7/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TFRecordIO.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TFRecordIO.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TFRecordIO.java
index 8a1870e..fe0b97d 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TFRecordIO.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TFRecordIO.java
@@ -30,7 +30,6 @@ import java.nio.ByteOrder;
 import java.nio.channels.ReadableByteChannel;
 import java.nio.channels.WritableByteChannel;
 import java.util.NoSuchElementException;
-import java.util.regex.Pattern;
 import javax.annotation.Nullable;
 import org.apache.beam.sdk.coders.ByteArrayCoder;
 import org.apache.beam.sdk.coders.Coder;
@@ -268,7 +267,6 @@ public class TFRecordIO {
  * in a common extension, if given by {@link #withSuffix(String)}.
  */
 public Write to(String filenamePrefix) {
-  validateOutputComponent(filenamePrefix);
   return to(StaticValueProvider.of(filenamePrefix));
 }
 
@@ -285,7 +283,6 @@ public class TFRecordIO {
  * @see ShardNameTemplate
  */
 public Write withSuffix(String nameExtension) {
-  validateOutputComponent(nameExtension);
   return toBuilder().setFilenameSuffix(nameExtension).build();
 }
 
@@ -422,17 +419,6 @@ public class TFRecordIO {
 }
   }
 
-  // Pattern which matches old-style shard output patterns, which are now
-  // disallowed.
-  private static final Pattern SHARD_OUTPUT_PATTERN = Pattern.compile("@([0-9]+|\\*)");
-
-  private static void 

[jira] [Comment Edited] (BEAM-1986) Job ALREADY_EXISTS in post commit

2017-05-02 Thread Mark Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993850#comment-15993850
 ] 

Mark Liu edited comment on BEAM-1986 at 5/2/17 9:52 PM:


After investigation, I have some ideas for solving this and want to discuss them here:

1. Append a random number to the default job_name, which is pretty 
straightforward. 

2. I want to make job_name more descriptive, which can solve this problem 
and potentially benefit other test runners in the future. Currently, the 
default pattern is {code}"beamapp-${USER_NAME}-${DATETIME}"{code}, which makes it hard 
to tell which test is running. In the Java SDK, the default job_name format is more 
helpful: {code}"ApplicationName-UserName-Date-RandomInteger"{code} 
(https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptions.java#L261).
 However, Java provides ApplicationNameOptions.class to set this value, which is 
used when constructing the default job_name, but Python doesn't provide a similar 
option. 

I prefer the second solution, but it involves pipeline options design. I just 
want to raise ideas here and hear more thoughts. [~altay] [~pabloem]


was (Author: markflyhigh):
After investigation, I have some ideas for solving this and want to discuss them here:

1. Append a random number to the default job_name, which is pretty 
straightforward. 

2. I want to make job_name more descriptive, which can solve this problem 
and potentially benefit other test runners in the future. Currently, the 
default pattern is {code}"beamapp-${USER_NAME}-${DATETIME}"{code}, which makes it hard 
to tell which test is running. In the Java SDK, the default job_name format is more 
helpful: {code}"ApplicationName-UserName-Date-RandomInteger"{code} 
(https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptions.java#L261).
 However, Java provides ApplicationNameOptions.class to set this value, which is 
used when constructing the default job_name, but Python doesn't provide a similar 
option. 

I prefer the second solution, but it involves pipeline options design, so I 
want to raise ideas here and hear more thoughts. [~altay] [~pabloem]

> Job ALREADY_EXISTS in post commit
> -
>
> Key: BEAM-1986
> URL: https://issues.apache.org/jira/browse/BEAM-1986
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Ahmet Altay
>Assignee: Mark Liu
>Priority: Minor
>
> I noticed a job failed with an ALREADY_EXISTS error, a sign that the same {{job_name}} 
> was auto-generated twice. Could we add a 1-second delay to prevent things like 
> this?
> https://builds.apache.org/view/Beam/job/beam_PostCommit_Python_Verify/1877/consoleFull
> cc: [~pabloem] Another perspective: would it make sense to add a small random 
> component (e.g. 1-2 digits) to the job name to reduce this issue? Or perhaps 
> include ms resolution. 





[jira] [Comment Edited] (BEAM-1986) Job ALREADY_EXISTS in post commit

2017-05-02 Thread Mark Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993850#comment-15993850
 ] 

Mark Liu edited comment on BEAM-1986 at 5/2/17 9:51 PM:


After investigation, I have some ideas for solving this and want to discuss them here:

1. Append a random number to the default job_name, which is pretty 
straightforward. 

2. I want to make job_name more descriptive, which can solve this problem 
and potentially benefit other test runners in the future. Currently, the 
default pattern is {code}"beamapp-${USER_NAME}-${DATETIME}"{code}, which makes it hard 
to tell which test is running. In the Java SDK, the default job_name format is more 
helpful: {code}"ApplicationName-UserName-Date-RandomInteger"{code} 
(https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptions.java#L261).
 However, Java provides ApplicationNameOptions.class to set this value, which is 
used when constructing the default job_name, but Python doesn't provide a similar 
option. 

I prefer the second solution, but it involves pipeline options design, so I 
want to raise ideas here and hear more thoughts. [~altay] [~pabloem]


was (Author: markflyhigh):
After investigation, I have some ideas for solving this and want to discuss them here:

1. Append a random number to the default job_name, which is pretty 
straightforward. 

2. I want to make job_name more descriptive, which can solve this problem 
and potentially benefit other test runners in the future. Currently, the 
default pattern is "beamapp-${USER_NAME}-${DATETIME}", which makes it hard to tell 
which test is running. In the Java SDK, the default job_name format is more helpful: 
"ApplicationName-UserName-Date-RandomInteger" 
(https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptions.java#L261).
 However, Java provides ApplicationNameOptions.class to set this value, which is 
used when constructing the default job_name, but Python doesn't provide a similar 
option. 

I prefer the second solution, but it involves pipeline options design, so I 
want to raise ideas here and hear more thoughts. [~altay] [~pabloem]

> Job ALREADY_EXISTS in post commit
> -
>
> Key: BEAM-1986
> URL: https://issues.apache.org/jira/browse/BEAM-1986
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Ahmet Altay
>Assignee: Mark Liu
>Priority: Minor
>
> I noticed a job failed with an ALREADY_EXISTS error, a sign that the same {{job_name}} 
> was auto-generated twice. Could we add a 1-second delay to prevent things like 
> this?
> https://builds.apache.org/view/Beam/job/beam_PostCommit_Python_Verify/1877/consoleFull
> cc: [~pabloem] Another perspective: would it make sense to add a small random 
> component (e.g. 1-2 digits) to the job name to reduce this issue? Or perhaps 
> include ms resolution. 





[jira] [Commented] (BEAM-1986) Job ALREADY_EXISTS in post commit

2017-05-02 Thread Mark Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993850#comment-15993850
 ] 

Mark Liu commented on BEAM-1986:


After investigation, I have some ideas for solving this and want to discuss them here:

1. Append a random number to the default job_name, which is pretty 
straightforward. 

2. I want to make job_name more descriptive, which can solve this problem 
and potentially benefit other test runners in the future. Currently, the 
default pattern is "beamapp-${USER_NAME}-${DATETIME}", which makes it hard to tell 
which test is running. In the Java SDK, the default job_name format is more helpful: 
"ApplicationName-UserName-Date-RandomInteger" 
(https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/options/PipelineOptions.java#L261).
 However, Java provides ApplicationNameOptions.class to set this value, which is 
used when constructing the default job_name, but Python doesn't provide a similar 
option. 

I prefer the second solution, but it involves pipeline options design, so I 
want to raise ideas here and hear more thoughts. [~altay] [~pabloem]
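
The two options above could be combined roughly as follows. This is a minimal 
sketch, not the Python SDK's actual implementation: the function name 
default_job_name, its parameters, the date format, and the size of the random 
suffix are all assumptions made for illustration. It follows the 
"ApplicationName-UserName-Date-RandomInteger" shape and falls back to 
"beamapp" when no application name is given.

```python
import getpass
import random
import re
from datetime import datetime


def default_job_name(app_name="beamapp", user=None, now=None, rng=None):
    """Build '<app>-<user>-<MMDDHHMMSS>-<random int>' (hypothetical helper)."""
    user = user or getpass.getuser()
    now = now or datetime.now()
    rng = rng or random
    # The random suffix keeps two jobs started in the same second apart.
    suffix = rng.randint(0, 999999)
    raw = "%s-%s-%s-%d" % (app_name, user, now.strftime("%m%d%H%M%S"), suffix)
    # Assumed normalization: lowercase, and map anything outside [a-z0-9-] to '-'.
    return re.sub(r"[^a-z0-9-]", "-", raw.lower())
```

A test could pass something like "TestClassName-TestFunctionName" as app_name 
so the resulting job name shows which test launched it.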

> Job ALREADY_EXISTS in post commit
> -
>
> Key: BEAM-1986
> URL: https://issues.apache.org/jira/browse/BEAM-1986
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Ahmet Altay
>Assignee: Mark Liu
>Priority: Minor
>
> I noticed a job failed with an ALREADY_EXISTS error, a sign that the same {{job_name}} 
> was auto-generated twice. Could we add a 1-second delay to prevent things like 
> this?
> https://builds.apache.org/view/Beam/job/beam_PostCommit_Python_Verify/1877/consoleFull
> cc: [~pabloem] Another perspective: would it make sense to add a small random 
> component (e.g. 1-2 digits) to the job name to reduce this issue? Or perhaps 
> include ms resolution. 





[GitHub] beam pull request #2737: [BEAM-2093] Use the jackson version from the maven ...

2017-05-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2737




[2/2] beam git commit: This closes #2737

2017-05-02 Thread davor
This closes #2737


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/ccbb00e3
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/ccbb00e3
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/ccbb00e3

Branch: refs/heads/master
Commit: ccbb00e384b78882f103e986fbdbb37f7c31e5e4
Parents: 986d727 403a64b
Author: Davor Bonaci 
Authored: Tue May 2 14:44:29 2017 -0700
Committer: Davor Bonaci 
Committed: Tue May 2 14:44:29 2017 -0700

--
 pom.xml |  6 
 .../main/resources/archetype-resources/pom.xml  |  4 +--
 .../main/resources/archetype-resources/pom.xml  |  4 +--
 sdks/java/maven-archetypes/pom.xml  | 35 
 .../main/resources/archetype-resources/pom.xml  |  2 +-
 5 files changed, 46 insertions(+), 5 deletions(-)
--




[1/2] beam git commit: [BEAM-2093] Use the jackson version from the maven property in maven archetypes

2017-05-02 Thread davor
Repository: beam
Updated Branches:
  refs/heads/master 986d727f6 -> ccbb00e38


[BEAM-2093] Use the jackson version from the maven property in maven archetypes

[BEAM-2093] pom.xml organization cleanup, and use filtering for project version 
as well


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/403a64ba
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/403a64ba
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/403a64ba

Branch: refs/heads/master
Commit: 403a64ba23baf497eb0bf0bca89b9db1966cb1f7
Parents: 986d727
Author: Elek, Márton 
Authored: Thu Apr 27 14:52:05 2017 +0200
Committer: Davor Bonaci 
Committed: Tue May 2 14:44:12 2017 -0700

--
 pom.xml |  6 
 .../main/resources/archetype-resources/pom.xml  |  4 +--
 .../main/resources/archetype-resources/pom.xml  |  4 +--
 sdks/java/maven-archetypes/pom.xml  | 35 
 .../main/resources/archetype-resources/pom.xml  |  2 +-
 5 files changed, 46 insertions(+), 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/403a64ba/pom.xml
--
diff --git a/pom.xml b/pom.xml
index 4fa183b..59f6963 100644
--- a/pom.xml
+++ b/pom.xml
@@ -141,6 +141,7 @@
 2.0
 2.20
 2.20
+3.0.2
 
 -Werror
 
-Xpkginfo:always
@@ -216,6 +217,11 @@
 
   
 
+
+  org.apache.maven.plugins
+  maven-resources-plugin
+  ${maven-resources-plugin.version}
+
   
 
 

http://git-wip-us.apache.org/repos/asf/beam/blob/403a64ba/sdks/java/maven-archetypes/examples-java8/src/main/resources/archetype-resources/pom.xml
--
diff --git 
a/sdks/java/maven-archetypes/examples-java8/src/main/resources/archetype-resources/pom.xml
 
b/sdks/java/maven-archetypes/examples-java8/src/main/resources/archetype-resources/pom.xml
index 46e526e..508ff9c 100644
--- 
a/sdks/java/maven-archetypes/examples-java8/src/main/resources/archetype-resources/pom.xml
+++ 
b/sdks/java/maven-archetypes/examples-java8/src/main/resources/archetype-resources/pom.xml
@@ -27,7 +27,7 @@
   jar
 
   
-0.7.0-SNAPSHOT
+@project.version@
 2.20
   
 
@@ -224,7 +224,7 @@
 
   com.fasterxml.jackson.module
   jackson-module-scala_2.10
-  2.7.2
+  @jackson.version@
   runtime
 
   

http://git-wip-us.apache.org/repos/asf/beam/blob/403a64ba/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml
--
diff --git 
a/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml
 
b/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml
index 99835e4..511e875 100644
--- 
a/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml
+++ 
b/sdks/java/maven-archetypes/examples/src/main/resources/archetype-resources/pom.xml
@@ -27,7 +27,7 @@
   jar
 
   
-0.7.0-SNAPSHOT
+@project.version@
 2.20
   
 
@@ -224,7 +224,7 @@
 
   com.fasterxml.jackson.module
   jackson-module-scala_2.10
-  2.7.2
+  @jackson.version@
   runtime
 
   

http://git-wip-us.apache.org/repos/asf/beam/blob/403a64ba/sdks/java/maven-archetypes/pom.xml
--
diff --git a/sdks/java/maven-archetypes/pom.xml 
b/sdks/java/maven-archetypes/pom.xml
index 78e6f08..9d39a1e 100644
--- a/sdks/java/maven-archetypes/pom.xml
+++ b/sdks/java/maven-archetypes/pom.xml
@@ -35,6 +35,41 @@
 starter
   
 
+  
+
+  
+  
+src/main/resources
+true
+
+  archetype-resources/pom.xml
+
+  
+  
+  
+src/main/resources
+false
+
+  archetype-resources/pom.xml
+
+  
+
+
+
+  
+
+  org.apache.maven.plugins
+  maven-resources-plugin
+  
+
+  @
+
+false
+  
+
+  
+
+  
   
 

[jira] [Commented] (BEAM-2135) Rename hdfs module to hadoop-file-system, rename gcp-core to google-cloud-platform-core

2017-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993826#comment-15993826
 ] 

ASF GitHub Bot commented on BEAM-2135:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2844


> Rename hdfs module to hadoop-file-system, rename gcp-core to 
> google-cloud-platform-core
> ---
>
> Key: BEAM-2135
> URL: https://issues.apache.org/jira/browse/BEAM-2135
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions, sdk-java-gcp
>Affects Versions: First stable release
>Reporter: Luke Cwik
>Assignee: Luke Cwik
> Fix For: First stable release
>
>
> Rename hdfs module to hadoop-file-system, rename gcp-core to 
> google-cloud-platform-core
> Similarly rename directories as well.





Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #3593

2017-05-02 Thread Apache Jenkins Server
See 




[GitHub] beam pull request #2844: [BEAM-2135] Fix pointers to sdks/java/io/hadoop-fil...

2017-05-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2844




[2/2] beam git commit: [BEAM-2135] Fix pointers to sdks/java/io/hadoop-file-system

2017-05-02 Thread lcwik
[BEAM-2135] Fix pointers to sdks/java/io/hadoop-file-system

This closes #2844


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/986d727f
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/986d727f
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/986d727f

Branch: refs/heads/master
Commit: 986d727f6a115f7cfcd5dda9d39b8fb9cf16ffb2
Parents: 2047d8f 6a1a400
Author: Luke Cwik 
Authored: Tue May 2 14:38:11 2017 -0700
Committer: Luke Cwik 
Committed: Tue May 2 14:38:11 2017 -0700

--
 pom.xml   | 4 ++--
 sdks/java/javadoc/pom.xml | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)
--




[1/2] beam git commit: [BEAM-2135] Fix pointers to sdks/java/io/hadoop-file-system

2017-05-02 Thread lcwik
Repository: beam
Updated Branches:
  refs/heads/master 2047d8fc4 -> 986d727f6


[BEAM-2135] Fix pointers to sdks/java/io/hadoop-file-system


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/6a1a4009
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/6a1a4009
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/6a1a4009

Branch: refs/heads/master
Commit: 6a1a40099e18e8191a5b042a530bc300ed6d2528
Parents: 2047d8f
Author: Luke Cwik 
Authored: Tue May 2 14:03:57 2017 -0700
Committer: Luke Cwik 
Committed: Tue May 2 14:05:42 2017 -0700

--
 pom.xml   | 4 ++--
 sdks/java/javadoc/pom.xml | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/6a1a4009/pom.xml
--
diff --git a/pom.xml b/pom.xml
index 8c4ea9f..4fa183b 100644
--- a/pom.xml
+++ b/pom.xml
@@ -423,13 +423,13 @@
 
   
 org.apache.beam
-beam-sdks-java-io-hbase
+beam-sdks-java-io-hadoop-file-system
 ${project.version}
   
 
   
 org.apache.beam
-beam-sdks-java-io-hdfs
+beam-sdks-java-io-hbase
 ${project.version}
   
 

http://git-wip-us.apache.org/repos/asf/beam/blob/6a1a4009/sdks/java/javadoc/pom.xml
--
diff --git a/sdks/java/javadoc/pom.xml b/sdks/java/javadoc/pom.xml
index 0d0bec6..b6aa978 100644
--- a/sdks/java/javadoc/pom.xml
+++ b/sdks/java/javadoc/pom.xml
@@ -114,12 +114,12 @@
 
 
   org.apache.beam
-  beam-sdks-java-io-hbase
+  beam-sdks-java-io-hadoop-file-system
 
 
 
   org.apache.beam
-  beam-sdks-java-io-hdfs
+  beam-sdks-java-io-hbase
 
 
 



[GitHub] beam pull request #2845: Updating Dataflow API protos and client

2017-05-02 Thread pabloem
GitHub user pabloem opened a pull request:

https://github.com/apache/beam/pull/2845

Updating Dataflow API protos and client

r:@aaltay 
Ran job 2017-05-02_14_25_41-229271276661569030 with these changes.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pabloem/incubator-beam update-api

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2845.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2845


commit 539eea96019bf73dafc74a1719309323ee7e9dd0
Author: Pablo 
Date:   2017-05-02T21:23:44Z

Updating Dataflow API protos and client






[jira] [Commented] (BEAM-2135) Rename hdfs module to hadoop-file-system, rename gcp-core to google-cloud-platform-core

2017-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993754#comment-15993754
 ] 

ASF GitHub Bot commented on BEAM-2135:
--

GitHub user lukecwik opened a pull request:

https://github.com/apache/beam/pull/2844

[BEAM-2135] Fix pointers to sdks/java/io/hadoop-file-system

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lukecwik/incubator-beam hdfs

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2844.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2844


commit 6a1a40099e18e8191a5b042a530bc300ed6d2528
Author: Luke Cwik 
Date:   2017-05-02T21:03:57Z

[BEAM-2135] Fix pointers to sdks/java/io/hadoop-file-system




> Rename hdfs module to hadoop-file-system, rename gcp-core to 
> google-cloud-platform-core
> ---
>
> Key: BEAM-2135
> URL: https://issues.apache.org/jira/browse/BEAM-2135
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions, sdk-java-gcp
>Affects Versions: First stable release
>Reporter: Luke Cwik
>Assignee: Luke Cwik
> Fix For: First stable release
>
>
> Rename hdfs module to hadoop-file-system, rename gcp-core to 
> google-cloud-platform-core
> Similarly rename directories as well.





[GitHub] beam pull request #2844: [BEAM-2135] Fix pointers to sdks/java/io/hadoop-fil...

2017-05-02 Thread lukecwik
GitHub user lukecwik opened a pull request:

https://github.com/apache/beam/pull/2844

[BEAM-2135] Fix pointers to sdks/java/io/hadoop-file-system

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lukecwik/incubator-beam hdfs

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2844.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2844


commit 6a1a40099e18e8191a5b042a530bc300ed6d2528
Author: Luke Cwik 
Date:   2017-05-02T21:03:57Z

[BEAM-2135] Fix pointers to sdks/java/io/hadoop-file-system






[GitHub] beam pull request #2843: Fix pointers to GCP-core

2017-05-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2843




[2/2] beam git commit: [BEAM-2135] Fix pointers to GCP-core

2017-05-02 Thread lcwik
[BEAM-2135] Fix pointers to GCP-core

This closes #2843


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/2047d8fc
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/2047d8fc
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/2047d8fc

Branch: refs/heads/master
Commit: 2047d8fc4a642b3c0ede6afe1c8ff76cb57573dd
Parents: 4d6f6a1 6f8f821
Author: Luke Cwik 
Authored: Tue May 2 14:01:48 2017 -0700
Committer: Luke Cwik 
Committed: Tue May 2 14:01:48 2017 -0700

--
 examples/java/pom.xml  | 2 +-
 examples/java8/pom.xml | 2 +-
 pom.xml| 4 ++--
 runners/google-cloud-dataflow-java/pom.xml | 4 ++--
 sdks/java/harness/pom.xml  | 2 +-
 sdks/java/io/google-cloud-platform/pom.xml | 4 ++--
 6 files changed, 9 insertions(+), 9 deletions(-)
--




[1/2] beam git commit: [BEAM-2135] Fix pointers to GCP-core

2017-05-02 Thread lcwik
Repository: beam
Updated Branches:
  refs/heads/master 4d6f6a10f -> 2047d8fc4


[BEAM-2135] Fix pointers to GCP-core


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/6f8f8214
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/6f8f8214
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/6f8f8214

Branch: refs/heads/master
Commit: 6f8f82143db3c2ce6f07b34472a063113819bfdb
Parents: 4d6f6a1
Author: Dan Halperin 
Authored: Tue May 2 13:56:41 2017 -0700
Committer: Luke Cwik 
Committed: Tue May 2 14:01:03 2017 -0700

--
 examples/java/pom.xml  | 2 +-
 examples/java8/pom.xml | 2 +-
 pom.xml| 4 ++--
 runners/google-cloud-dataflow-java/pom.xml | 4 ++--
 sdks/java/harness/pom.xml  | 2 +-
 sdks/java/io/google-cloud-platform/pom.xml | 4 ++--
 6 files changed, 9 insertions(+), 9 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/6f8f8214/examples/java/pom.xml
--
diff --git a/examples/java/pom.xml b/examples/java/pom.xml
index d1db3c3..d673da2 100644
--- a/examples/java/pom.xml
+++ b/examples/java/pom.xml
@@ -453,7 +453,7 @@
 
 
   org.apache.beam
-  beam-sdks-java-extensions-gcp-core
+  
beam-sdks-java-extensions-google-cloud-platform-core
 
 
 

http://git-wip-us.apache.org/repos/asf/beam/blob/6f8f8214/examples/java8/pom.xml
--
diff --git a/examples/java8/pom.xml b/examples/java8/pom.xml
index a30c6a0..2180a49 100644
--- a/examples/java8/pom.xml
+++ b/examples/java8/pom.xml
@@ -203,7 +203,7 @@
 
 
   org.apache.beam
-  beam-sdks-java-extensions-gcp-core
+  
beam-sdks-java-extensions-google-cloud-platform-core
 
 
 

http://git-wip-us.apache.org/repos/asf/beam/blob/6f8f8214/pom.xml
--
diff --git a/pom.xml b/pom.xml
index 9d92d54..8c4ea9f 100644
--- a/pom.xml
+++ b/pom.xml
@@ -360,13 +360,13 @@
 
   
 org.apache.beam
-beam-sdks-java-extensions-gcp-core
+
beam-sdks-java-extensions-google-cloud-platform-core
 ${project.version}
   
 
   
 org.apache.beam
-beam-sdks-java-extensions-gcp-core
+
beam-sdks-java-extensions-google-cloud-platform-core
 tests
 ${project.version}
   

http://git-wip-us.apache.org/repos/asf/beam/blob/6f8f8214/runners/google-cloud-dataflow-java/pom.xml
--
diff --git a/runners/google-cloud-dataflow-java/pom.xml 
b/runners/google-cloud-dataflow-java/pom.xml
index 1f3db34..c0b6328 100644
--- a/runners/google-cloud-dataflow-java/pom.xml
+++ b/runners/google-cloud-dataflow-java/pom.xml
@@ -203,7 +203,7 @@
 
 
   org.apache.beam
-  beam-sdks-java-extensions-gcp-core
+  
beam-sdks-java-extensions-google-cloud-platform-core
 
 
 
@@ -372,7 +372,7 @@
 
 
   org.apache.beam
-  beam-sdks-java-extensions-gcp-core
+  
beam-sdks-java-extensions-google-cloud-platform-core
   tests
   test
 

http://git-wip-us.apache.org/repos/asf/beam/blob/6f8f8214/sdks/java/harness/pom.xml
--
diff --git a/sdks/java/harness/pom.xml b/sdks/java/harness/pom.xml
index 5cff5cc..73f08cc 100644
--- a/sdks/java/harness/pom.xml
+++ b/sdks/java/harness/pom.xml
@@ -83,7 +83,7 @@
 
 
   org.apache.beam
-  beam-sdks-java-extensions-gcp-core
+  
beam-sdks-java-extensions-google-cloud-platform-core
 
 
 

http://git-wip-us.apache.org/repos/asf/beam/blob/6f8f8214/sdks/java/io/google-cloud-platform/pom.xml
--
diff --git a/sdks/java/io/google-cloud-platform/pom.xml 
b/sdks/java/io/google-cloud-platform/pom.xml
index 6023489..3bdc5d0 100644
--- a/sdks/java/io/google-cloud-platform/pom.xml
+++ b/sdks/java/io/google-cloud-platform/pom.xml
@@ -68,7 +68,7 @@
 
 
   org.apache.beam
-  beam-sdks-java-extensions-gcp-core
+  
beam-sdks-java-extensions-google-cloud-platform-core
 
 
 
@@ -257,7 +257,7 @@
 
 
   org.apache.beam
-  beam-sdks-java-extensions-gcp-core
+  
beam-sdks-java-extensions-google-cloud-platform-core
   tests
   test
 



[jira] [Commented] (BEAM-1316) DoFn#startBundle should not be able to output

2017-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993732#comment-15993732
 ] 

ASF GitHub Bot commented on BEAM-1316:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2836


> DoFn#startBundle should not be able to output
> -
>
> Key: BEAM-1316
> URL: https://issues.apache.org/jira/browse/BEAM-1316
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Thomas Groh
>Assignee: Thomas Groh
> Fix For: First stable release
>
>
> While within startBundle and finishBundle, the window in which elements are 
> output is not generally defined. Elements must always be output from within a 
> windowed context, or the {{WindowFn}} used by the {{PCollection}} may not 
> operate appropriately.
> startBundle and finishBundle are suitable for operational duties, similarly 
> to {{setup}} and {{teardown}}, but within the scope of some collection of 
> input elements. This includes actions such as clearing field state within a 
> DoFn and ensuring all live RPCs complete successfully before committing 
> inputs.
> Sometimes it might be reasonable to output from {{@FinishBundle}} but it is 
> hard to imagine a situation where output from {{@StartBundle}} is useful in a 
> way that doesn't seriously abuse things.
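
The rule described above can be sketched with a toy bundle runner. This is a
simplified illustration, not Beam's runner code: BundleRunner and the way it
collects outputs are assumptions made for the example, and the DoFn classes
are plain Python objects rather than beam.DoFn subclasses.

```python
class BundleRunner(object):
  """Hypothetical runner that rejects any output from start_bundle."""

  def run(self, do_fn, elements):
    # start_bundle is for setup only, so any return value is an error.
    if do_fn.start_bundle() is not None:
      raise RuntimeError('start_bundle must not produce output')
    outputs = []
    for element in elements:
      # process may return None or an iterable of outputs.
      outputs.extend(do_fn.process(element) or [])
    # Output from finish_bundle is still accepted in this sketch.
    outputs.extend(do_fn.finish_bundle() or [])
    return outputs


class GoodDoFn(object):
  def start_bundle(self):
    self.seen = 0

  def process(self, element):
    self.seen += 1
    yield element * 2

  def finish_bundle(self):
    yield 'finish'


class BadDoFn(GoodDoFn):
  def start_bundle(self):
    return [1]
```

Running GoodDoFn over a bundle yields the processed elements followed by
'finish', while BadDoFn fails before any element is processed.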





[GitHub] beam pull request #2836: [BEAM-1316] Remove the usage of mock from ptransfor...

2017-05-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2836




[2/2] beam git commit: This closes #2836

2017-05-02 Thread altay
This closes #2836


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/4d6f6a10
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/4d6f6a10
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/4d6f6a10

Branch: refs/heads/master
Commit: 4d6f6a10fdb1b034443911f180ea60ef894df51a
Parents: 87a12af ab55ef3
Author: Ahmet Altay 
Authored: Tue May 2 13:59:10 2017 -0700
Committer: Ahmet Altay 
Committed: Tue May 2 13:59:10 2017 -0700

--
 .../apache_beam/transforms/ptransform_test.py   | 60 +---
 1 file changed, 40 insertions(+), 20 deletions(-)
--




[1/2] beam git commit: [BEAM-1316] Remove the usage of mock from ptransform tests

2017-05-02 Thread altay
Repository: beam
Updated Branches:
  refs/heads/master 87a12af6c -> 4d6f6a10f


[BEAM-1316] Remove the usage of mock from ptransform tests


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/ab55ef3b
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/ab55ef3b
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/ab55ef3b

Branch: refs/heads/master
Commit: ab55ef3b4248789aa4c9eaf4a6dab7262d673819
Parents: 87a12af
Author: Sourabh Bajaj 
Authored: Tue May 2 11:48:19 2017 -0700
Committer: Ahmet Altay 
Committed: Tue May 2 13:59:09 2017 -0700

--
 .../apache_beam/transforms/ptransform_test.py   | 60 +---
 1 file changed, 40 insertions(+), 20 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/ab55ef3b/sdks/python/apache_beam/transforms/ptransform_test.py
--
diff --git a/sdks/python/apache_beam/transforms/ptransform_test.py 
b/sdks/python/apache_beam/transforms/ptransform_test.py
index 80c9768..46c340c 100644
--- a/sdks/python/apache_beam/transforms/ptransform_test.py
+++ b/sdks/python/apache_beam/transforms/ptransform_test.py
@@ -22,7 +22,6 @@ from __future__ import absolute_import
 import operator
 import re
 import unittest
-import mock
 
 import hamcrest as hc
 from nose.plugins.attrib import attr
@@ -47,16 +46,6 @@ from apache_beam.utils.pipeline_options import TypeOptions
 # Disable frequent lint warning due to pipe operator for chaining transforms.
 # pylint: disable=expression-not-assigned
 
-class MyDoFn(beam.DoFn):
-  def start_bundle(self):
-pass
-
-  def process(self, element):
-pass
-
-  def finish_bundle(self):
-yield 'finish'
-
 
 class PTransformTest(unittest.TestCase):
   # Enable nose tests running in parallel
@@ -286,6 +275,13 @@ class PTransformTest(unittest.TestCase):
 self.assertStartswith(cm.exception.message, expected_error_prefix)
 
   def test_do_fn_with_finish(self):
+class MyDoFn(beam.DoFn):
+  def process(self, element):
+pass
+
+  def finish_bundle(self):
+yield 'finish'
+
 pipeline = TestPipeline()
 pcoll = pipeline | 'Start' >> beam.Create([1, 2, 3])
 result = pcoll | 'Do' >> beam.ParDo(MyDoFn())
@@ -300,22 +296,46 @@ class PTransformTest(unittest.TestCase):
 assert_that(result, matcher())
 pipeline.run()
 
-  @mock.patch.object(MyDoFn, 'start_bundle')
-  def test_do_fn_with_start(self, mock_method):
-mock_method.return_value = None
+  def test_do_fn_with_start(self):
+class MyDoFn(beam.DoFn):
+  def __init__(self):
+self.state = 'init'
+
+  def start_bundle(self):
+self.state = 'started'
+return None
+
+  def process(self, element):
+if self.state == 'started':
+  yield 'started'
+self.state = 'process'
+
 pipeline = TestPipeline()
-pipeline | 'Start' >> beam.Create([1, 2, 3]) | 'Do' >> beam.ParDo(MyDoFn())
+pcoll = pipeline | 'Start' >> beam.Create([1, 2, 3])
+result = pcoll | 'Do' >> beam.ParDo(MyDoFn())
+
+# May have many bundles, but each has a start and finish.
+def  matcher():
+  def match(actual):
+equal_to(['started'])(list(set(actual)))
+equal_to([1])([actual.count('started')])
+  return match
+
+assert_that(result, matcher())
 pipeline.run()
-self.assertTrue(mock_method.called)
 
-  @mock.patch.object(MyDoFn, 'start_bundle')
-  def test_do_fn_with_start_error(self, mock_method):
-mock_method.return_value = [1]
+  def test_do_fn_with_start_error(self):
+class MyDoFn(beam.DoFn):
+  def start_bundle(self):
+return [1]
+
+  def process(self, element):
+pass
+
 pipeline = TestPipeline()
 pipeline | 'Start' >> beam.Create([1, 2, 3]) | 'Do' >> beam.ParDo(MyDoFn())
 with self.assertRaises(RuntimeError):
   pipeline.run()
-self.assertTrue(mock_method.called)
 
   def test_filter(self):
 pipeline = TestPipeline()



[jira] [Commented] (BEAM-539) Error when writing to the root of a GCS location

2017-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993730#comment-15993730
 ] 

ASF GitHub Bot commented on BEAM-539:
-

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2770


> Error when writing to the root of a GCS location
> 
>
> Key: BEAM-539
> URL: https://issues.apache.org/jira/browse/BEAM-539
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Ahmet Altay
>Assignee: Chamikara Jayalath
>Priority: Minor
>  Labels: newbie, starter
> Fix For: First stable release
>
>
> User issue: 
> http://stackoverflow.com/questions/38811152/google-dataflow-python-pipeline-write-failure
> Reproduction: use a TextFileSink and set output locations as gs://mybucket 
> and it fails. Change it to gs://mybucket/ and it works.
> The final output path is generated here:
> https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/io/fileio.py#L495
> And this seemingly works in the Java SDK.
> Stack:
>   File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/iobase.py", 
> line 1058, in finish_bundle
> yield window.TimestampedValue(self.writer.close(), window.MAX_TIMESTAMP)
>   File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/fileio.py", 
> line 601, in close
> self.sink.close(self.temp_handle)
>   File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/fileio.py", 
> line 687, in close
> file_handle.close()
>   File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/gcsio.py", line 
> 617, in close
> self._flush_write_buffer()
>   File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/gcsio.py", line 
> 647, in _flush_write_buffer
> raise self.upload_thread.last_error  # pylint: disable=raising-bad-type
> HttpError: HttpError accessing 
> :
>  response: <{'status': '404', 'alternate-protocol': '443:quic', 
> 'content-length': '165', 'vary': 'Origin, X-Origin', 'server': 
> 'UploadServer', 'x-guploader-uploadid': 
> 'AEnB2Uq6ZGb_CsrMVxozv6aL48k4OMMiRgYVeVGmJrM-sMQWRGeGMkesOQg5F0W7HZuaqTBog_d4ml-DlIars_ZvJTejdfcbAUr4gswZWVieq82ufc3WR2g',
>  'date': 'Mon, 08 Aug 2016 21:29:46 GMT', 'alt-svc': 'quic=":443"; 
> ma=2592000; v="36,35,34,33,32,31,30"', 'content-type': 'application/json; 
> charset=UTF-8'}>, content <{
>  "error": {
>   "errors": [
>{
> "domain": "global",
> "reason": "notFound",
> "message": "Not Found"
>}
>   ],
>   "code": 404,
>   "message": "Not Found"
>  }
> }





[GitHub] beam pull request #2770: [BEAM-539] Fixes several issues of FileSink

2017-05-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2770




[GitHub] beam pull request #2843: Fix pointers to GCP-core

2017-05-02 Thread dhalperi
GitHub user dhalperi opened a pull request:

https://github.com/apache/beam/pull/2843

Fix pointers to GCP-core

R: @lukecwik 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/beam fix-build

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2843.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2843


commit a188fd55d7d28adb1cd74c4b878949a9c7d07d37
Author: Dan Halperin 
Date:   2017-05-02T20:56:41Z

Fix pointers to GCP-core






[1/2] beam git commit: [BEAM-539] Fixes several issues of FileSink.

2017-05-02 Thread chamikara
Repository: beam
Updated Branches:
  refs/heads/master 0ce01b63f -> 87a12af6c


[BEAM-539] Fixes several issues of FileSink.

(1) Updates FileSink to fail for file name prefixes that only contain a single 
component (for example GCS buckets).

For example, currently FileSink fails for  'gs://aaa' while passing for 
'gs://aaa/'. This change makes FileSink fail for both cases (and makes the 
behaviour consistent with Java).

(2) Updates the name of the temporary directory created by FileSink

Currently, for a filename prefix 'gs://aaa/bbb', the temp path would be of the 
form 'gs://aaa/bbb-temp-...'.
This is error prone since a user pattern 'gs://aaa/bbb*' would match temp 
files. This change makes the temp path format 'gs://aaa/beam-temp-bbb-...' 
instead.

To achieve the above, this adds a method 'split()' to the FileSystem interface
that is analogous to Python's 'os.path.split()' and has the
opposite effect of the current method FileSystem.join().
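
As a rough, standalone illustration of the behaviour described above (not the
code in this commit): split_path and temp_dir_for are hypothetical helpers
that mimic an os.path.split-style split for 'scheme://' paths and the new
'beam-temp-' naming, using plain string handling instead of the FileSystems
API.

```python
import time


def split_path(path):
  """Hypothetical split: returns (head, tail), keeping any 'scheme://'."""
  scheme, rest = '', path
  if '://' in path:
    scheme, rest = path.split('://', 1)
    scheme += '://'
  if '/' not in rest:
    # A single component such as 'gs://aaa' has nothing to split off.
    return scheme + rest, ''
  head, tail = rest.rsplit('/', 1)
  return scheme + head, tail


def temp_dir_for(file_path_prefix, now=None):
  """Hypothetical helper: builds '<base>/beam-temp-<last>-<timestamp>'."""
  base, last = split_path(file_path_prefix)
  if not last:
    # Re-split the base to detect a root such as 'gs://aaa' or 'gs://aaa/'.
    new_base, _ = split_path(base)
    if new_base == base:
      raise ValueError(
          'Cannot create a temporary directory for root path prefix %s'
          % file_path_prefix)
  if now is None:
    now = time.localtime()
  stamp = time.strftime('-%Y-%m-%d_%H-%M-%S', now)
  return base + '/' + 'beam-temp-' + last + stamp
```

With this naming, a user glob such as 'gs://aaa/bbb*' no longer matches the
temporary directory, because the directory name now starts with 'beam-temp-'
rather than 'bbb'.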


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/5ec48c58
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/5ec48c58
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/5ec48c58

Branch: refs/heads/master
Commit: 5ec48c58c0e32891224598db61ebb63e8731e9fb
Parents: 0ce01b6
Author: Chamikara Jayalath 
Authored: Fri Apr 28 14:38:35 2017 -0700
Committer: chamik...@google.com 
Committed: Tue May 2 13:56:00 2017 -0700

--
 sdks/python/apache_beam/io/fileio.py| 20 +--
 sdks/python/apache_beam/io/fileio_test.py   | 56 
 sdks/python/apache_beam/io/filesystem.py| 17 ++
 sdks/python/apache_beam/io/filesystems.py   | 18 +++
 sdks/python/apache_beam/io/gcp/gcsfilesystem.py | 34 +++-
 .../apache_beam/io/gcp/gcsfilesystem_test.py| 12 +
 sdks/python/apache_beam/io/localfilesystem.py   | 13 +
 .../apache_beam/io/localfilesystem_test.py  | 35 
 8 files changed, 200 insertions(+), 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/5ec48c58/sdks/python/apache_beam/io/fileio.py
--
diff --git a/sdks/python/apache_beam/io/fileio.py b/sdks/python/apache_beam/io/fileio.py
index bb77bfe..49562f7 100644
--- a/sdks/python/apache_beam/io/fileio.py
+++ b/sdks/python/apache_beam/io/fileio.py
@@ -139,12 +139,26 @@ class FileSink(iobase.Sink):
   @check_accessible(['file_path_prefix', 'file_name_suffix'])
   def initialize_write(self):
     file_path_prefix = self.file_path_prefix.get()
-    file_name_suffix = self.file_name_suffix.get()
-    tmp_dir = file_path_prefix + file_name_suffix + time.strftime(
-        '-temp-%Y-%m-%d_%H-%M-%S')
+
+    tmp_dir = self._create_temp_dir(file_path_prefix)
     FileSystems.mkdirs(tmp_dir)
     return tmp_dir
 
+  def _create_temp_dir(self, file_path_prefix):
+    base_path, last_component = FileSystems.split(file_path_prefix)
+    if not last_component:
+      # Trying to re-split the base_path to check if it's a root.
+      new_base_path, _ = FileSystems.split(base_path)
+      if base_path == new_base_path:
+        raise ValueError('Cannot create a temporary directory for root path '
+                         'prefix %s. Please specify a file path prefix with '
+                         'at least two components.',
+                         file_path_prefix)
+    path_components = [base_path,
+                       'beam-temp-' + last_component + time.strftime(
+                           '-%Y-%m-%d_%H-%M-%S')]
+    return FileSystems.join(*path_components)
+
   @check_accessible(['file_path_prefix', 'file_name_suffix'])
   def open_writer(self, init_result, uid):
     # A proper suffix is needed for AUTO compression detection.
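
For readers following the _create_temp_dir change in the hunk above, here is a
minimal standalone sketch of the same path logic. It is not the Beam code: it
assumes POSIX-style paths and uses posixpath.split / posixpath.join as
stand-ins for FileSystems.split / FileSystems.join, and the function name
create_temp_dir_path and its `now` parameter are hypothetical. Unlike the
diff, which passes the prefix as a second argument to ValueError, the sketch
interpolates it into the message with `%`.

```python
import posixpath
import time


def create_temp_dir_path(file_path_prefix, now=None):
    """Return a timestamped temp directory path next to file_path_prefix.

    Hypothetical helper; posixpath stands in for Beam's FileSystems.
    """
    base_path, last_component = posixpath.split(file_path_prefix)
    if not last_component:
        # The prefix ended with a separator. Re-split the base path: if that
        # changes nothing, the prefix is a root and has no parent to hold
        # a temporary directory.
        new_base_path, _ = posixpath.split(base_path)
        if base_path == new_base_path:
            raise ValueError(
                'Cannot create a temporary directory for root path prefix '
                '%s. Please specify a file path prefix with at least two '
                'components.' % file_path_prefix)
    if now is None:
        now = time.localtime()
    stamp = time.strftime('-%Y-%m-%d_%H-%M-%S', now)
    return posixpath.join(base_path, 'beam-temp-' + last_component + stamp)
```

With a fixed time of 2017-05-02 13:56:00, the prefix '/tmp/out/result' maps to
'/tmp/out/beam-temp-result-2017-05-02_13-56-00', while the prefix '/' raises
ValueError.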

http://git-wip-us.apache.org/repos/asf/beam/blob/5ec48c58/sdks/python/apache_beam/io/fileio_test.py
--
diff --git a/sdks/python/apache_beam/io/fileio_test.py 
b/sdks/python/apache_beam/io/fileio_test.py
index 2409873..13778d5 100644
--- a/sdks/python/apache_beam/io/fileio_test.py
+++ b/sdks/python/apache_beam/io/fileio_test.py
@@ -26,10 +26,12 @@ import tempfile
 import unittest
 
 import hamcrest as hc
+import mock
 
 import apache_beam as beam
 from apache_beam import coders
 from apache_beam.io import fileio
+from apache_beam.io.filesystem import BeamIOError
 from apache_beam.test_pipeline import TestPipeline
 from apache_beam.transforms.display import DisplayData
 from apache_beam.transforms.display_test import DisplayDataItemMatcher
@@ -184,6 +186,60 @@ class TestFileSink(_TestCaseWithTempDirCleanUp):
 self.assertTrue('][a][' in concat, concat)
 self.assertTrue('][b][' in concat, concat)
 
+  # Not using 'test' in name so that 

[2/2] beam git commit: This closes #2770

2017-05-02 Thread chamikara
This closes #2770


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/87a12af6
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/87a12af6
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/87a12af6

Branch: refs/heads/master
Commit: 87a12af6c948ea4678df91821d98902aacaa6b3b
Parents: 0ce01b6 5ec48c5
Author: chamik...@google.com 
Authored: Tue May 2 13:56:57 2017 -0700
Committer: chamik...@google.com 
Committed: Tue May 2 13:56:57 2017 -0700

--
 sdks/python/apache_beam/io/fileio.py| 20 +--
 sdks/python/apache_beam/io/fileio_test.py   | 56 
 sdks/python/apache_beam/io/filesystem.py| 17 ++
 sdks/python/apache_beam/io/filesystems.py   | 18 +++
 sdks/python/apache_beam/io/gcp/gcsfilesystem.py | 34 +++-
 .../apache_beam/io/gcp/gcsfilesystem_test.py| 12 +
 sdks/python/apache_beam/io/localfilesystem.py   | 13 +
 .../apache_beam/io/localfilesystem_test.py  | 35 
 8 files changed, 200 insertions(+), 5 deletions(-)
--




[GitHub] beam pull request #2831: Convert all unknown Coders into CustomCoder CloudOb...

2017-05-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2831




[1/2] beam git commit: [BEAM-2020] Convert all unknown Coders into CustomCoder CloudObjects

2017-05-02 Thread lcwik
Repository: beam
Updated Branches:
  refs/heads/master 3bd8a0f9f -> 0ce01b63f


[BEAM-2020] Convert all unknown Coders into CustomCoder CloudObjects

This ensures that all coders will be serializable, even if there is no
registered coder translator.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/0b523b68
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/0b523b68
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/0b523b68

Branch: refs/heads/master
Commit: 0b523b685cf5581f68b5318b7fa39550232625fe
Parents: 3bd8a0f
Author: Thomas Groh 
Authored: Tue May 2 10:21:27 2017 -0700
Committer: Luke Cwik 
Committed: Tue May 2 13:44:25 2017 -0700

--
 .../dataflow/util/CloudObjectTranslators.java   | 11 
 .../runners/dataflow/util/CloudObjects.java |  5 +---
 ...aultCoderCloudObjectTranslatorRegistrar.java |  2 +-
 .../runners/dataflow/util/CloudObjectsTest.java | 28 +++-
 4 files changed, 35 insertions(+), 11 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/0b523b68/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/CloudObjectTranslators.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/CloudObjectTranslators.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/CloudObjectTranslators.java
index 7a95a9e..c27bee7 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/CloudObjectTranslators.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/CloudObjectTranslators.java
@@ -313,10 +313,11 @@ class CloudObjectTranslators {
 
   private static final String CODER_FIELD = "serialized_coder";
   private static final String TYPE_FIELD = "type";
-  public static CloudObjectTranslator custom() {
-return new CloudObjectTranslator() {
+  public static CloudObjectTranslator javaSerialized() {
+return new CloudObjectTranslator() {
   @Override
-  public CloudObject toCloudObject(CustomCoder target) {
+  public CloudObject toCloudObject(Coder target) {
+// CustomCoder is used as the "marker" for a java-serialized coder
 CloudObject cloudObject = CloudObject.forClass(CustomCoder.class);
 Structs.addString(cloudObject, TYPE_FIELD, 
target.getClass().getName());
 Structs.addString(
@@ -327,10 +328,10 @@ class CloudObjectTranslators {
   }
 
   @Override
-  public CustomCoder fromCloudObject(CloudObject cloudObject) {
+  public Coder fromCloudObject(CloudObject cloudObject) {
 String serializedCoder = Structs.getString(cloudObject, CODER_FIELD);
 String type = Structs.getString(cloudObject, TYPE_FIELD);
-return (CustomCoder)
+return (Coder)
 SerializableUtils.deserializeFromByteArray(
 StringUtils.jsonStringToByteArray(serializedCoder), type);
   }

http://git-wip-us.apache.org/repos/asf/beam/blob/0b523b68/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/CloudObjects.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/CloudObjects.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/CloudObjects.java
index a55d10c..9383c48 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/CloudObjects.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/CloudObjects.java
@@ -67,7 +67,7 @@ public class CloudObjects {
 (CloudObjectTranslator) CODER_TRANSLATORS.get(coder.getClass());
 if (translator != null) {
   return translator.toCloudObject(coder);
-} else if (coder instanceof CustomCoder) {
+} else {
   CloudObjectTranslator customCoderTranslator = 
CODER_TRANSLATORS.get(CustomCoder.class);
   checkNotNull(
   customCoderTranslator,
@@ -77,9 +77,6 @@ public class CloudObjects {
   DefaultCoderCloudObjectTranslatorRegistrar.class.getSimpleName());
   return customCoderTranslator.toCloudObject(coder);
 }
-throw new IllegalArgumentException(
-String.format(
-"Non-Custom %s with no registered %s", Coder.class, 
CloudObjectTranslator.class));
   }
 
   public static Coder coderFromCloudObject(CloudObject cloudObject) {
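
The dispatch change in the CloudObjects diff above can be sketched outside of
Java as a registry lookup with an unconditional fallback. This is an
illustration only, written in Python: the names TRANSLATORS, BytesCoder,
MyCoder, serialized_fallback and as_cloud_object are hypothetical, the
'@type' values are placeholders, and JSON over the object's attributes stands
in for Java serialization.

```python
import json


class BytesCoder(object):
    """Stand-in for a coder that has a registered translator."""


class MyCoder(object):
    """Stand-in for a coder with no registered translator."""

    def __init__(self, tag):
        self.tag = tag


def serialized_fallback(coder):
    # Mirrors the idea of javaSerialized(): record the concrete type name
    # and a serialized form of the coder under a generic marker type.
    return {
        '@type': 'CustomCoder',
        'type': type(coder).__name__,
        'serialized_coder': json.dumps(vars(coder)),
    }


# Hypothetical registry: coder class -> function producing a cloud object.
TRANSLATORS = {BytesCoder: lambda coder: {'@type': 'kind:bytes'}}


def as_cloud_object(coder):
    translator = TRANSLATORS.get(type(coder))
    if translator is not None:
        return translator(coder)
    # After the change, every unknown coder takes the fallback instead of
    # raising for coders that are not CustomCoder subclasses.
    return serialized_fallback(coder)
```

The design point is that the fallback branch no longer has a precondition, so
as_cloud_object never rejects a coder merely because nothing was registered
for it.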


[2/2] beam git commit: [BEAM-2020] Convert all unknown Coders into CustomCoder CloudObjects

2017-05-02 Thread lcwik
[BEAM-2020] Convert all unknown Coders into CustomCoder CloudObjects

This closes #2831


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/0ce01b63
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/0ce01b63
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/0ce01b63

Branch: refs/heads/master
Commit: 0ce01b63f996a539f57795bfacdd3259655a4910
Parents: 3bd8a0f 0b523b6
Author: Luke Cwik 
Authored: Tue May 2 13:44:56 2017 -0700
Committer: Luke Cwik 
Committed: Tue May 2 13:44:56 2017 -0700

--
 .../dataflow/util/CloudObjectTranslators.java   | 11 
 .../runners/dataflow/util/CloudObjects.java |  5 +---
 ...aultCoderCloudObjectTranslatorRegistrar.java |  2 +-
 .../runners/dataflow/util/CloudObjectsTest.java | 28 +++-
 4 files changed, 35 insertions(+), 11 deletions(-)
--




[jira] [Commented] (BEAM-1327) Replace OutputTimeFn with enum

2017-05-02 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993689#comment-15993689
 ] 

Kenneth Knowles commented on BEAM-1327:
---

Also, the choice not to call it {{OutputTimeFn}} comes from its reduced scope: 
it has simple, well-defined semantics for combining timestamps in a variety of 
possible situations. Previously it seemed tied only to the output of GBK. I'm 
somewhat ambivalent on this. Mostly I want to choose a name we don't have to 
change again. The Runner API should then adopt that name.

> Replace OutputTimeFn with enum
> --
>
> Key: BEAM-1327
> URL: https://issues.apache.org/jira/browse/BEAM-1327
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Minor
>  Labels: backward-incompatible
> Fix For: First stable release
>
>
> The class {{OutputTimeFn}} is overkill for a Fn API crossing. There are only 
> three sensible values known: MIN, MAX, EOW. The interface is right for 
> implementing these, but the full class is left over from the days when there 
> was little cost to shipping new kinds of fns. An enum is concise.
> This can be done "mostly" backwards compatibly with legacy adapters in place, 
> but might be less confusing without them.
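
As a rough illustration of why an enum suffices for the three values named in
the ticket, here is a small Python sketch. It is not the Beam API: the enum
name TimestampCombiner and the combine method are assumptions made for this
example, and EOW is read here as "end of window".

```python
import enum


class TimestampCombiner(enum.Enum):
    """Hypothetical enum; only MIN, MAX and EOW come from the ticket."""
    MIN = 'min'
    MAX = 'max'
    EOW = 'eow'

    def combine(self, timestamps, window_end):
        # Each value fully determines the output timestamp, so no
        # user-supplied function object needs to cross an API boundary.
        if self is TimestampCombiner.MIN:
            return min(timestamps)
        if self is TimestampCombiner.MAX:
            return max(timestamps)
        return window_end
```

For input timestamps [3, 1, 2] in a window ending at 10, MIN yields 1, MAX
yields 3 and EOW yields 10.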



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (BEAM-40) Replace rawtype lambda-incompatible uses of SerializableFunction with SimpleFunction (as appropriate)

2017-05-02 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-40?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles resolved BEAM-40.
-
Resolution: Not A Problem

> Replace rawtype lambda-incompatible uses of SerializableFunction with 
> SimpleFunction (as appropriate)
> -
>
> Key: BEAM-40
> URL: https://issues.apache.org/jira/browse/BEAM-40
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Minor
>  Labels: Java8, backward-incompatible
> Fix For: First stable release
>
>
> When a lambda or method reference is used in Java 8 to provide a 
> SerializableFunction, it is instantiated at the raw type 
> SerializableFunction. We occasionally require reflective access to the actual 
> parameter for OutputT, but it will be unavailable.
> MapElements and FlatMapElements thus use the analogous abstract class 
> SimpleFunction in such situations to prevent use of a lambda or method 
> reference. They then support lambda via separate constructors that require 
> user help to determine the concrete output type.
> This ticket calls for an audit of such situations.





[jira] [Commented] (BEAM-2051) Reduce scope of the PCollectionView interface

2017-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993682#comment-15993682
 ] 

ASF GitHub Bot commented on BEAM-2051:
--

GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/2842

[BEAM-2051] Mark all PCollectionView methods internal

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---
Add a note that the methods should not be considered to be accessible.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam internal_view_methods

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2842.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2842


commit a7f7c55374015cdf33b71f9e944a940cfeceb712
Author: Thomas Groh 
Date:   2017-05-02T17:31:21Z

Mark all PCollectionView methods internal

Add a note that the methods should not be considered to be accessible.




> Reduce scope of the PCollectionView interface
> -
>
> Key: BEAM-2051
> URL: https://issues.apache.org/jira/browse/BEAM-2051
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Thomas Groh
>Assignee: Thomas Groh
> Fix For: First stable release
>
>
> Users should only ever use a PCollectionView class as a token to access a 
> view. A Runner can cast down to a more expressive type if required.





[jira] [Updated] (BEAM-40) Replace rawtype lambda-incompatible uses of SerializableFunction with SimpleFunction (as appropriate)

2017-05-02 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-40?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-40:

Fix Version/s: (was: First stable release)
   Not applicable

> Replace rawtype lambda-incompatible uses of SerializableFunction with 
> SimpleFunction (as appropriate)
> -
>
> Key: BEAM-40
> URL: https://issues.apache.org/jira/browse/BEAM-40
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Minor
>  Labels: Java8, backward-incompatible
> Fix For: Not applicable
>
>
> When a lambda or method reference is used in Java 8 to provide a 
> SerializableFunction, it is instantiated at the raw type 
> SerializableFunction. We occasionally require reflective access to the actual 
> parameter for OutputT, but it will be unavailable.
> MapElements and FlatMapElements thus use the analogous abstract class 
> SimpleFunction in such situations to prevent use of a lambda or method 
> reference. They then support lambda via separate constructors that require 
> user help to determine the concrete output type.
> This ticket calls for an audit of such situations.





[jira] [Resolved] (BEAM-1327) Replace OutputTimeFn with enum

2017-05-02 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles resolved BEAM-1327.
---
Resolution: Fixed

> Replace OutputTimeFn with enum
> --
>
> Key: BEAM-1327
> URL: https://issues.apache.org/jira/browse/BEAM-1327
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Minor
>  Labels: backward-incompatible
> Fix For: First stable release
>
>
> The class {{OutputTimeFn}} is overkill for a Fn API crossing. There are only 
> three sensible values known: MIN, MAX, EOW. The interface is right for 
> implementing these, but the full class is left over from the days when there 
> was little cost to shipping new kinds of fns. An enum is concise.
> This can be done "mostly" backwards compatibly with legacy adapters in place, 
> but might be less confusing without them.





[jira] [Commented] (BEAM-40) Replace rawtype lambda-incompatible uses of SerializableFunction with SimpleFunction (as appropriate)

2017-05-02 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-40?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993680#comment-15993680
 ] 

Kenneth Knowles commented on BEAM-40:
-

I've flipped through the SDK a bit and haven't found anything I care to adjust 
right now. Details:

In MapElements and FlatMapElements we were careful to devise a builder scheme 
so that users could not accidentally get bitten by rawtypes. In other places 
like Distinct, it seems the approach is to let them get bitten but give them 
methods to improve the behavior. Since the latter can be done without breaking 
superficial backwards compatibility, users will not be stuck if a problem comes 
up later.

> Replace rawtype lambda-incompatible uses of SerializableFunction with 
> SimpleFunction (as appropriate)
> -
>
> Key: BEAM-40
> URL: https://issues.apache.org/jira/browse/BEAM-40
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Minor
>  Labels: Java8, backward-incompatible
> Fix For: Not applicable
>
>
> When a lambda or method reference is used in Java 8 to provide a 
> SerializableFunction, it is instantiated at the raw type 
> SerializableFunction. We occasionally require reflective access to the actual 
> parameter for OutputT, but it will be unavailable.
> MapElements and FlatMapElements thus use the analogous abstract class 
> SimpleFunction in such situations to prevent use of a lambda or method 
> reference. They then support lambda via separate constructors that require 
> user help to determine the concrete output type.
> This ticket calls for an audit of such situations.





[GitHub] beam pull request #2842: [BEAM-2051] Mark all PCollectionView methods intern...

2017-05-02 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/2842

[BEAM-2051] Mark all PCollectionView methods internal

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---
Add a note that the methods should not be considered to be accessible.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam internal_view_methods

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2842.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2842


commit a7f7c55374015cdf33b71f9e944a940cfeceb712
Author: Thomas Groh 
Date:   2017-05-02T17:31:21Z

Mark all PCollectionView methods internal

Add a note that the methods should not be considered to be accessible.






[jira] [Updated] (BEAM-818) ValueProvider for tempLocation, runner, etc, that is unavailable to transforms during construction

2017-05-02 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-818:
-
Summary: ValueProvider for tempLocation, runner, etc, that is unavailable 
to transforms during construction  (was: Remove Pipeline.getPipelineOptions)

> ValueProvider for tempLocation, runner, etc, that is unavailable to 
> transforms during construction
> --
>
> Key: BEAM-818
> URL: https://issues.apache.org/jira/browse/BEAM-818
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Thomas Groh
>Assignee: Kenneth Knowles
>  Labels: backward-incompatible
> Fix For: First stable release
>
>
> This stops transforms from changing their operation based on 
> construction-time options, and instead requires that configuration to be 
> explicit, or to obtain the configuration at runtime.
> https://docs.google.com/document/d/1Wr05cYdqnCfrLLqSk--XmGMGgDwwNwWZaFbxLKvPqEQ/edit#





[jira] [Commented] (BEAM-818) Remove Pipeline.getPipelineOptions

2017-05-02 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993661#comment-15993661
 ] 

Kenneth Knowles commented on BEAM-818:
--

Upon exploring this, I think we need to do this another way:

BigQueryIO needs a ValueProvider - runtime only is fine - to have a temp 
location for its files. This is a reasonable sort of thing for any IO to need, 
and other options are similar.

So, we don't want it to read the string or validate against it, but we do want 
a way to get the ValueProvider in there without forcing users to always specify 
it (that is why it is a global option).

So I propose making runtime-only options that are unavailable even if a value 
happens to be specified.
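
A minimal sketch of what such a runtime-only option could look like, written
in Python for illustration. Nothing here is the Beam ValueProvider API: the
class RuntimeOnlyValueProvider, its enter_runtime hook and the example value
'gs://bucket/tmp' are all hypothetical.

```python
class RuntimeOnlyValueProvider(object):
    """Holds an option value that must not be read during construction."""

    def __init__(self, name, value=None):
        self._name = name
        self._value = value
        self._at_runtime = False

    def is_accessible(self):
        # False during pipeline construction, even if a value was supplied.
        return self._at_runtime

    def enter_runtime(self):
        # Hypothetical hook that a runner would call once execution starts.
        self._at_runtime = True

    def get(self):
        if not self._at_runtime:
            raise RuntimeError(
                '%s is only available at runtime' % self._name)
        return self._value
```

A transform can hold the provider and pass it along, but any attempt to read
or validate the value while the pipeline is being built fails, which is the
behaviour proposed in the comment above.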

> Remove Pipeline.getPipelineOptions
> --
>
> Key: BEAM-818
> URL: https://issues.apache.org/jira/browse/BEAM-818
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Thomas Groh
>Assignee: Kenneth Knowles
>  Labels: backward-incompatible
> Fix For: First stable release
>
>
> This stops transforms from changing their operation based on 
> construction-time options, and instead requires that configuration to be 
> explicit, or to obtain the configuration at runtime.
> https://docs.google.com/document/d/1Wr05cYdqnCfrLLqSk--XmGMGgDwwNwWZaFbxLKvPqEQ/edit#





Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #3592

2017-05-02 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-48) BigQueryIO.Read reimplemented as BoundedSource

2017-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-48?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993646#comment-15993646
 ] 

ASF GitHub Bot commented on BEAM-48:


Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2832


> BigQueryIO.Read reimplemented as BoundedSource
> --
>
> Key: BEAM-48
> URL: https://issues.apache.org/jira/browse/BEAM-48
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-gcp
>Reporter: Daniel Halperin
>Assignee: Pei He
> Fix For: 0.1.0-incubating
>
>
> BigQueryIO.Read is currently implemented in a hacky way: the 
> DirectPipelineRunner streams all rows in the table or query result directly 
> using the JSON API, in a single-threaded manner.
> In contrast, the DataflowPipelineRunner uses an entirely different code path 
> implemented in the Google Cloud Dataflow service. (A BigQuery export job to 
> GCS, followed by a parallel read from GCS).
> We need to reimplement BigQueryIO as a BoundedSource in order to support 
> other runners in a scalable way.
> I additionally suggest that we revisit the design of the BigQueryIO source in 
> the process. A short list:
> * Do not use TableRow as the default value for rows. It could be Map<String, Object> with well-defined types, for example, or an Avro GenericRecord. 
> Dropping TableRow will get around a variety of issues with types, fields 
> named 'f', etc., and it will also reduce confusion as we use TableRow objects 
> differently than usual (for good reason).
> * We could also directly add support for a RowParser to a user's POJO.
> * We should expose TableSchema as a side output from the BigQueryIO.Read.
> * Our builders for BigQueryIO.Read are useful and we should keep them. Where 
> possible we should also allow users to provide the JSON objects that 
> configure the underlying intermediate tables, query export, etc. This would 
> let users directly control result flattening, location of intermediate 
> tables, table decorators, etc., and also optimistically let users take 
> advantage of some new BigQuery features without code changes.
> * We could switch between a BigQuery export + parallel scan and an API 
> read based on factors such as the size of the table at pipeline 
> construction time.





[GitHub] beam pull request #2832: [BEAM-48] BigQuery: swap from asSingleton to asIter...

2017-05-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2832




[2/2] beam git commit: This closes #2832

2017-05-02 Thread dhalperi
This closes #2832


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/3bd8a0f9
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/3bd8a0f9
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/3bd8a0f9

Branch: refs/heads/master
Commit: 3bd8a0f9f2c0fa4676c090f70a8053de5275de8a
Parents: aaa5e55 059b351
Author: Dan Halperin 
Authored: Tue May 2 13:13:57 2017 -0700
Committer: Dan Halperin 
Committed: Tue May 2 13:13:57 2017 -0700

--
 .../apache/beam/sdk/io/gcp/bigquery/PassThroughThenCleanup.java  | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--




[1/2] beam git commit: BigQuery: swap from asSingleton to asIterable for Cleanup

2017-05-02 Thread dhalperi
Repository: beam
Updated Branches:
  refs/heads/master aaa5e55dc -> 3bd8a0f9f


BigQuery: swap from asSingleton to asIterable for Cleanup

asIterable can be simpler for runners to implement, as it does not
semantically require that the PCollection being viewed contains exactly
one element.
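
The difference in contract can be sketched with two plain Python functions.
They are not Beam views; as_singleton, as_iterable and the _MISSING sentinel
are hypothetical names that model only the cardinality rules.

```python
_MISSING = object()


def as_singleton(elements, default=_MISSING):
    """Model of a singleton view: zero elements needs a default, and more
    than one element is always an error."""
    items = list(elements)
    if not items and default is not _MISSING:
        return default
    if len(items) != 1:
        raise ValueError('expected exactly one element, got %d' % len(items))
    return items[0]


def as_iterable(elements):
    """Model of an iterable view: any number of elements is acceptable."""
    return list(elements)
```

An implementation of the iterable form has no cardinality check to enforce,
which is the simplification the commit message refers to.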


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/059b351e
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/059b351e
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/059b351e

Branch: refs/heads/master
Commit: 059b351e58ab746ee699ee5d8ff746a27ec7586e
Parents: aaa5e55
Author: Dan Halperin 
Authored: Tue May 2 10:37:11 2017 -0700
Committer: Dan Halperin 
Committed: Tue May 2 13:13:52 2017 -0700

--
 .../apache/beam/sdk/io/gcp/bigquery/PassThroughThenCleanup.java  | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/059b351e/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/PassThroughThenCleanup.java
--
diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/PassThroughThenCleanup.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/PassThroughThenCleanup.java
index 75f7b93..f49c4e1 100644
--- 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/PassThroughThenCleanup.java
+++ 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/PassThroughThenCleanup.java
@@ -53,9 +53,9 @@ class PassThroughThenCleanup extends 
PTransform cleanupSignalView = 
outputs.get(cleanupSignal)
 .setCoder(VoidCoder.of())
-.apply(View.asSingleton().withDefaultValue(null));
+.apply(View.asIterable());
 
 input.getPipeline()
 .apply("Create(CleanupOperation)", Create.of(cleanupOperation))



[jira] [Commented] (BEAM-2141) beam_PerformanceTests_JDBC have not passed in weeks

2017-05-02 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993629#comment-15993629
 ] 

Jean-Baptiste Onofré commented on BEAM-2141:


Sorry, I missed this Jira. I agree with disabling the test. However, I will 
investigate the source of the issue.

> beam_PerformanceTests_JDBC have not passed in weeks
> ---
>
> Key: BEAM-2141
> URL: https://issues.apache.org/jira/browse/BEAM-2141
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Daniel Halperin
> Fix For: Not applicable
>
>
> https://builds.apache.org/job/beam_PerformanceTests_JDBC/
> Disabling them, as no one seems to be maintaining them.





[GitHub] beam pull request #2835: [BEAM-2141] Disable JDBC tests

2017-05-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2835




[2/2] beam git commit: This closes #2835

2017-05-02 Thread dhalperi
This closes #2835


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/aaa5e55d
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/aaa5e55d
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/aaa5e55d

Branch: refs/heads/master
Commit: aaa5e55dc7e02d20e6f91482d96826bb297266f3
Parents: 9f6377f abfd006
Author: Dan Halperin 
Authored: Tue May 2 13:04:26 2017 -0700
Committer: Dan Halperin 
Committed: Tue May 2 13:04:26 2017 -0700

--
 .test-infra/jenkins/job_beam_PerformanceTests_JDBC.groovy | 3 +++
 1 file changed, 3 insertions(+)
--




[1/2] beam git commit: [BEAM-2141] Disable JDBC tests

2017-05-02 Thread dhalperi
Repository: beam
Updated Branches:
  refs/heads/master 9f6377fcc -> aaa5e55dc


[BEAM-2141] Disable JDBC tests

They do not pass, and they use up valuable executor space


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/abfd0066
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/abfd0066
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/abfd0066

Branch: refs/heads/master
Commit: abfd0066b3bdc8dfa930541b61c7da5c68cfacf0
Parents: 9f6377f
Author: Dan Halperin 
Authored: Tue May 2 11:45:58 2017 -0700
Committer: Dan Halperin 
Committed: Tue May 2 13:04:22 2017 -0700

--
 .test-infra/jenkins/job_beam_PerformanceTests_JDBC.groovy | 3 +++
 1 file changed, 3 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/abfd0066/.test-infra/jenkins/job_beam_PerformanceTests_JDBC.groovy
--
diff --git a/.test-infra/jenkins/job_beam_PerformanceTests_JDBC.groovy b/.test-infra/jenkins/job_beam_PerformanceTests_JDBC.groovy
index 8e581c2..ef73a26 100644
--- a/.test-infra/jenkins/job_beam_PerformanceTests_JDBC.groovy
+++ b/.test-infra/jenkins/job_beam_PerformanceTests_JDBC.groovy
@@ -57,4 +57,7 @@ job('beam_PerformanceTests_JDBC'){
     ]
 
     common_job_properties.buildPerformanceTest(delegate, argMap)
+
+    // [BEAM-2141] Perf tests do not pass.
+    disabled()
 }



[jira] [Commented] (BEAM-1340) Remove or make private public bits of the SDK that shouldn't be public

2017-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993622#comment-15993622
 ] 

ASF GitHub Bot commented on BEAM-1340:
--

GitHub user kennknowles opened a pull request:

https://github.com/apache/beam/pull/2841

[BEAM-1340,BEAM-1345] Move state, timers, windowing strategy into public folders

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---

I have also revised these to mark more bits experimental and internal, but
additional suggestions are welcome.

Even though most of `util` should be package private or private somewhere
else, I am focused on the pieces that should be _public_ somewhere else.

To decide what to move, I imagine that `util` did not exist, and
specifically that we did not publish its javadoc. Then what does a
user need to know about?

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/beam move-all-the-things

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2841.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2841


commit 40c489a300c23e038f9bdbea15135db361205ea0
Author: Kenneth Knowles 
Date:   2017-05-02T17:29:33Z

Move Java sdk.util.state to sdk.state

commit e3a3c7642c8a28f43f92a64ae044b56ab9254675
Author: Kenneth Knowles 
Date:   2017-05-02T17:41:01Z

Add @Internal and @Experimental to state package

commit 0b2d26c8a146779f3dce914f6d7ab9191d6182bf
Author: Kenneth Knowles 
Date:   2017-05-02T19:31:02Z

Mark TimeDomain experimental alongside Timers; improve javadoc

commit 1adbf521d8ffb9238dedf68826453502ca9ef7e7
Author: Kenneth Knowles 
Date:   2017-05-02T19:46:46Z

Move user-facing timer-related classes out of util

commit a5e8e145ed5d5d074a1bcfc3061821c5e53a60d6
Author: Kenneth Knowles 
Date:   2017-05-02T19:53:26Z

Move WindowingStrategy from util to values

WindowingStrategy is a property on PCollection that transform authors
regularly mess with. It is part of the public API.




> Remove or make private public bits of the SDK that shouldn't be public
> --
>
> Key: BEAM-1340
> URL: https://issues.apache.org/jira/browse/BEAM-1340
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core, sdk-java-extensions
>Reporter: Kenneth Knowles
>Priority: Blocker
>  Labels: backward-incompatible
> Fix For: First stable release
>
>
> This JIRA is for the many small changes that do not merit their own JIRA 
> towards getting the SDK's API surface right. For example, removal of 
> `DoFn.InputProvider` and `DoFn.OutputReceiver`.
> While the above is not quite backwards incompatible, succeeding at this task 
> surely will be.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-2141) beam_PerformanceTests_JDBC have not passed in weeks

2017-05-02 Thread Davor Bonaci (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993621#comment-15993621
 ] 

Davor Bonaci commented on BEAM-2141:


LGTM (but please keep this issue open for the real fix).

> beam_PerformanceTests_JDBC have not passed in weeks
> ---
>
> Key: BEAM-2141
> URL: https://issues.apache.org/jira/browse/BEAM-2141
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Daniel Halperin
> Fix For: Not applicable
>
>
> https://builds.apache.org/job/beam_PerformanceTests_JDBC/
> Disabling them, as no one seems to be maintaining them.





[GitHub] beam pull request #2841: [BEAM-1340,BEAM-1345] Move state, timers, windowing...

2017-05-02 Thread kennknowles
GitHub user kennknowles opened a pull request:

https://github.com/apache/beam/pull/2841

[BEAM-1340,BEAM-1345] Move state, timers, windowing strategy into public folders

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---

I have also revised these to mark more bits experimental and internal, but
additional suggestions are welcome.

Even though most of `util` should be package private or private somewhere
else, I am focused on the pieces that should be _public_ somewhere else.

To decide what to move, I imagine that `util` did not exist, and
specifically that we did not publish its javadoc. Then what does a
user need to know about?

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kennknowles/beam move-all-the-things

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2841.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2841


commit 40c489a300c23e038f9bdbea15135db361205ea0
Author: Kenneth Knowles 
Date:   2017-05-02T17:29:33Z

Move Java sdk.util.state to sdk.state

commit e3a3c7642c8a28f43f92a64ae044b56ab9254675
Author: Kenneth Knowles 
Date:   2017-05-02T17:41:01Z

Add @Internal and @Experimental to state package

commit 0b2d26c8a146779f3dce914f6d7ab9191d6182bf
Author: Kenneth Knowles 
Date:   2017-05-02T19:31:02Z

Mark TimeDomain experimental alongside Timers; improve javadoc

commit 1adbf521d8ffb9238dedf68826453502ca9ef7e7
Author: Kenneth Knowles 
Date:   2017-05-02T19:46:46Z

Move user-facing timer-related classes out of util

commit a5e8e145ed5d5d074a1bcfc3061821c5e53a60d6
Author: Kenneth Knowles 
Date:   2017-05-02T19:53:26Z

Move WindowingStrategy from util to values

WindowingStrategy is a property on PCollection that transform authors
regularly mess with. It is part of the public API.






[jira] [Resolved] (BEAM-2135) Rename hdfs module to hadoop-file-system, rename gcp-core to google-cloud-platform-core

2017-05-02 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-2135.
-
   Resolution: Fixed
Fix Version/s: First stable release

> Rename hdfs module to hadoop-file-system, rename gcp-core to 
> google-cloud-platform-core
> ---
>
> Key: BEAM-2135
> URL: https://issues.apache.org/jira/browse/BEAM-2135
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions, sdk-java-gcp
>Affects Versions: First stable release
>Reporter: Luke Cwik
>Assignee: Luke Cwik
> Fix For: First stable release
>
>
> Rename hdfs module to hadoop-file-system, rename gcp-core to 
> google-cloud-platform-core
> Similarly rename directories as well.





[GitHub] beam pull request #2828: [BEAM-2135] Move hdfs to hadoop-file-system

2017-05-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2828




[jira] [Commented] (BEAM-2135) Rename hdfs module to hadoop-file-system, rename gcp-core to google-cloud-platform-core

2017-05-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993614#comment-15993614
 ] 

ASF GitHub Bot commented on BEAM-2135:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2828


> Rename hdfs module to hadoop-file-system, rename gcp-core to 
> google-cloud-platform-core
> ---
>
> Key: BEAM-2135
> URL: https://issues.apache.org/jira/browse/BEAM-2135
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions, sdk-java-gcp
>Affects Versions: First stable release
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>
> Rename hdfs module to hadoop-file-system, rename gcp-core to 
> google-cloud-platform-core
> Similarly rename directories as well.





[2/5] beam git commit: [BEAM-2135] Move hdfs to hadoop-file-system

2017-05-02 Thread lcwik
http://git-wip-us.apache.org/repos/asf/beam/blob/bacd33c8/sdks/java/io/hdfs/src/main/java/org/apache/beam/sdk/io/hdfs/HDFSFileSink.java
--
diff --git a/sdks/java/io/hdfs/src/main/java/org/apache/beam/sdk/io/hdfs/HDFSFileSink.java b/sdks/java/io/hdfs/src/main/java/org/apache/beam/sdk/io/hdfs/HDFSFileSink.java
deleted file mode 100644
index aee73c4..000
--- a/sdks/java/io/hdfs/src/main/java/org/apache/beam/sdk/io/hdfs/HDFSFileSink.java
+++ /dev/null
@@ -1,478 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.io.hdfs;
-
-import static com.google.common.base.Preconditions.checkNotNull;
-import static com.google.common.base.Preconditions.checkState;
-
-import com.google.auto.value.AutoValue;
-import com.google.common.collect.Lists;
-import com.google.common.collect.Sets;
-import java.io.IOException;
-import java.net.URI;
-import java.security.PrivilegedExceptionAction;
-import java.util.Random;
-import java.util.Set;
-import javax.annotation.Nullable;
-import org.apache.avro.Schema;
-import org.apache.avro.generic.GenericRecord;
-import org.apache.avro.mapred.AvroKey;
-import org.apache.avro.mapreduce.AvroKeyOutputFormat;
-import org.apache.beam.sdk.annotations.Experimental;
-import org.apache.beam.sdk.coders.AvroCoder;
-import org.apache.beam.sdk.coders.Coder;
-import org.apache.beam.sdk.coders.StringUtf8Coder;
-import org.apache.beam.sdk.io.hadoop.SerializableConfiguration;
-import org.apache.beam.sdk.options.PipelineOptions;
-import org.apache.beam.sdk.transforms.SerializableFunction;
-import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
-import org.apache.beam.sdk.transforms.windowing.PaneInfo;
-import org.apache.beam.sdk.values.KV;
-import org.apache.hadoop.conf.Configuration;
-import org.apache.hadoop.fs.FileStatus;
-import org.apache.hadoop.fs.FileSystem;
-import org.apache.hadoop.fs.Path;
-import org.apache.hadoop.fs.PathFilter;
-import org.apache.hadoop.io.NullWritable;
-import org.apache.hadoop.io.Text;
-import org.apache.hadoop.mapreduce.Job;
-import org.apache.hadoop.mapreduce.JobContext;
-import org.apache.hadoop.mapreduce.JobID;
-import org.apache.hadoop.mapreduce.RecordWriter;
-import org.apache.hadoop.mapreduce.TaskAttemptContext;
-import org.apache.hadoop.mapreduce.TaskAttemptID;
-import org.apache.hadoop.mapreduce.TaskID;
-import org.apache.hadoop.mapreduce.TaskType;
-import org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter;
-import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
-import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
-import org.apache.hadoop.mapreduce.task.JobContextImpl;
-import org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl;
-
-/**
- * A {@link Sink} for writing records to a Hadoop filesystem using a Hadoop file-based
- * output
- * format.
- *
- * To write a {@link org.apache.beam.sdk.values.PCollection} of elements of type T to Hadoop
- * filesystem use {@link HDFSFileSink#to}, specify the path (this can be any Hadoop supported
- * filesystem: HDFS, S3, GCS etc), the Hadoop {@link FileOutputFormat}, the key class K and the
- * value class V and finally the {@link SerializableFunction} to map from T to {@link KV} of K
- * and V.
- *
- * {@code HDFSFileSink} can be used by {@link Write} to create write
- * transform. See example below.
- *
- * {@code HDFSFileSink} comes with helper methods to write text and Apache Avro. For example:
- *
- * 
- * {@code
- * HDFSFileSink sink =
- *   HDFSFileSink.toAvro(path, AvroCoder.of(CustomSpecificAvroClass.class));
- * avroRecordsPCollection.apply(Write.to(sink));
- * }
- * 
- *
- * @param <T> the type of elements of the input {@link org.apache.beam.sdk.values.PCollection}.
- * @param <K> the type of keys to be written to the sink via {@link FileOutputFormat}.
- * @param <V> the type of values to be written to the sink via {@link FileOutputFormat}.
- */
-@AutoValue
-@Experimental
-public abstract class HDFSFileSink extends Sink {
-
-  private static final JobID jobId = new JobID(
-  Long.toString(System.currentTimeMillis()),
-  new 

[4/5] beam git commit: [BEAM-2135] Move hdfs to hadoop-file-system

2017-05-02 Thread lcwik
[BEAM-2135] Move hdfs to hadoop-file-system


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/bacd33c8
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/bacd33c8
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/bacd33c8

Branch: refs/heads/master
Commit: bacd33c81d99f4a3d1b11eb391a7f790087fc2a1
Parents: 3161904
Author: Luke Cwik 
Authored: Tue May 2 09:20:49 2017 -0700
Committer: Luke Cwik 
Committed: Tue May 2 12:57:44 2017 -0700

--
 sdks/java/io/hadoop-file-system/README.md   |  43 ++
 sdks/java/io/hadoop-file-system/pom.xml | 195 ++
 .../apache/beam/sdk/io/hdfs/HDFSFileSink.java   | 478 ++
 .../apache/beam/sdk/io/hdfs/HDFSFileSource.java | 625 +++
 .../beam/sdk/io/hdfs/HadoopFileSystem.java  | 240 +++
 .../sdk/io/hdfs/HadoopFileSystemModule.java |  84 +++
 .../sdk/io/hdfs/HadoopFileSystemOptions.java|  49 ++
 .../hdfs/HadoopFileSystemOptionsRegistrar.java  |  35 ++
 .../sdk/io/hdfs/HadoopFileSystemRegistrar.java  |  62 ++
 .../beam/sdk/io/hdfs/HadoopResourceId.java  |  81 +++
 .../java/org/apache/beam/sdk/io/hdfs/Sink.java  | 195 ++
 .../org/apache/beam/sdk/io/hdfs/UGIHelper.java  |  38 ++
 .../java/org/apache/beam/sdk/io/hdfs/Write.java | 585 +
 .../apache/beam/sdk/io/hdfs/package-info.java   |  22 +
 .../beam/sdk/io/hdfs/HDFSFileSinkTest.java  | 172 +
 .../beam/sdk/io/hdfs/HDFSFileSourceTest.java| 231 +++
 .../sdk/io/hdfs/HadoopFileSystemModuleTest.java |  65 ++
 .../HadoopFileSystemOptionsRegistrarTest.java   |  49 ++
 .../io/hdfs/HadoopFileSystemOptionsTest.java|  48 ++
 .../io/hdfs/HadoopFileSystemRegistrarTest.java  |  81 +++
 .../beam/sdk/io/hdfs/HadoopFileSystemTest.java  | 247 
 sdks/java/io/hdfs/README.md |  43 --
 sdks/java/io/hdfs/pom.xml   | 195 --
 .../apache/beam/sdk/io/hdfs/HDFSFileSink.java   | 478 --
 .../apache/beam/sdk/io/hdfs/HDFSFileSource.java | 625 ---
 .../beam/sdk/io/hdfs/HadoopFileSystem.java  | 240 ---
 .../sdk/io/hdfs/HadoopFileSystemModule.java |  84 ---
 .../sdk/io/hdfs/HadoopFileSystemOptions.java|  49 --
 .../hdfs/HadoopFileSystemOptionsRegistrar.java  |  35 --
 .../sdk/io/hdfs/HadoopFileSystemRegistrar.java  |  62 --
 .../beam/sdk/io/hdfs/HadoopResourceId.java  |  81 ---
 .../java/org/apache/beam/sdk/io/hdfs/Sink.java  | 195 --
 .../org/apache/beam/sdk/io/hdfs/UGIHelper.java  |  38 --
 .../java/org/apache/beam/sdk/io/hdfs/Write.java | 585 -
 .../apache/beam/sdk/io/hdfs/package-info.java   |  22 -
 .../beam/sdk/io/hdfs/HDFSFileSinkTest.java  | 172 -
 .../beam/sdk/io/hdfs/HDFSFileSourceTest.java| 231 ---
 .../sdk/io/hdfs/HadoopFileSystemModuleTest.java |  65 --
 .../HadoopFileSystemOptionsRegistrarTest.java   |  49 --
 .../io/hdfs/HadoopFileSystemOptionsTest.java|  48 --
 .../io/hdfs/HadoopFileSystemRegistrarTest.java  |  81 ---
 .../beam/sdk/io/hdfs/HadoopFileSystemTest.java  | 247 
 sdks/java/io/pom.xml|   2 +-
 43 files changed, 3626 insertions(+), 3626 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/bacd33c8/sdks/java/io/hadoop-file-system/README.md
--
diff --git a/sdks/java/io/hadoop-file-system/README.md b/sdks/java/io/hadoop-file-system/README.md
new file mode 100644
index 000..3a734f2
--- /dev/null
+++ b/sdks/java/io/hadoop-file-system/README.md
@@ -0,0 +1,43 @@
+
+
+# HDFS IO
+
+This library provides HDFS sources and sinks to make it possible to read and
+write Apache Hadoop file formats from Apache Beam pipelines.
+
+Currently, only the read path is implemented. A `HDFSFileSource` allows any
+Hadoop `FileInputFormat` to be read as a `PCollection`.
+
+A `HDFSFileSource` can be read from using the
+`org.apache.beam.sdk.io.Read` transform. For example:
+
+```java
+HDFSFileSource source = HDFSFileSource.from(path, MyInputFormat.class,
+  MyKey.class, MyValue.class);
+PCollection<KV<MyKey, MyValue>> records = pipeline.apply(Read.from(mySource));
+```
+
+Alternatively, the `readFrom` method is a convenience method that returns a read
+transform. For example:
+
+```java
+PCollection<KV<MyKey, MyValue>> records = pipeline.apply(HDFSFileSource.readFrom(path,
+  MyInputFormat.class, MyKey.class, MyValue.class));
+```

http://git-wip-us.apache.org/repos/asf/beam/blob/bacd33c8/sdks/java/io/hadoop-file-system/pom.xml
--
diff --git a/sdks/java/io/hadoop-file-system/pom.xml b/sdks/java/io/hadoop-file-system/pom.xml
new file mode 100644
index 000..3ec9848
--- /dev/null
+++ 

[5/5] beam git commit: [BEAM-2135] Move hdfs to hadoop-file-system

2017-05-02 Thread lcwik
[BEAM-2135] Move hdfs to hadoop-file-system

This closes #2828


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/9f6377fc
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/9f6377fc
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/9f6377fc

Branch: refs/heads/master
Commit: 9f6377fcc1a996d5aa855f5c5755ec835804fbe4
Parents: 3161904 bacd33c
Author: Luke Cwik 
Authored: Tue May 2 12:58:25 2017 -0700
Committer: Luke Cwik 
Committed: Tue May 2 12:58:25 2017 -0700

--
 sdks/java/io/hadoop-file-system/README.md   |  43 ++
 sdks/java/io/hadoop-file-system/pom.xml | 195 ++
 .../apache/beam/sdk/io/hdfs/HDFSFileSink.java   | 478 ++
 .../apache/beam/sdk/io/hdfs/HDFSFileSource.java | 625 +++
 .../beam/sdk/io/hdfs/HadoopFileSystem.java  | 240 +++
 .../sdk/io/hdfs/HadoopFileSystemModule.java |  84 +++
 .../sdk/io/hdfs/HadoopFileSystemOptions.java|  49 ++
 .../hdfs/HadoopFileSystemOptionsRegistrar.java  |  35 ++
 .../sdk/io/hdfs/HadoopFileSystemRegistrar.java  |  62 ++
 .../beam/sdk/io/hdfs/HadoopResourceId.java  |  81 +++
 .../java/org/apache/beam/sdk/io/hdfs/Sink.java  | 195 ++
 .../org/apache/beam/sdk/io/hdfs/UGIHelper.java  |  38 ++
 .../java/org/apache/beam/sdk/io/hdfs/Write.java | 585 +
 .../apache/beam/sdk/io/hdfs/package-info.java   |  22 +
 .../beam/sdk/io/hdfs/HDFSFileSinkTest.java  | 172 +
 .../beam/sdk/io/hdfs/HDFSFileSourceTest.java| 231 +++
 .../sdk/io/hdfs/HadoopFileSystemModuleTest.java |  65 ++
 .../HadoopFileSystemOptionsRegistrarTest.java   |  49 ++
 .../io/hdfs/HadoopFileSystemOptionsTest.java|  48 ++
 .../io/hdfs/HadoopFileSystemRegistrarTest.java  |  81 +++
 .../beam/sdk/io/hdfs/HadoopFileSystemTest.java  | 247 
 sdks/java/io/hdfs/README.md |  43 --
 sdks/java/io/hdfs/pom.xml   | 195 --
 .../apache/beam/sdk/io/hdfs/HDFSFileSink.java   | 478 --
 .../apache/beam/sdk/io/hdfs/HDFSFileSource.java | 625 ---
 .../beam/sdk/io/hdfs/HadoopFileSystem.java  | 240 ---
 .../sdk/io/hdfs/HadoopFileSystemModule.java |  84 ---
 .../sdk/io/hdfs/HadoopFileSystemOptions.java|  49 --
 .../hdfs/HadoopFileSystemOptionsRegistrar.java  |  35 --
 .../sdk/io/hdfs/HadoopFileSystemRegistrar.java  |  62 --
 .../beam/sdk/io/hdfs/HadoopResourceId.java  |  81 ---
 .../java/org/apache/beam/sdk/io/hdfs/Sink.java  | 195 --
 .../org/apache/beam/sdk/io/hdfs/UGIHelper.java  |  38 --
 .../java/org/apache/beam/sdk/io/hdfs/Write.java | 585 -
 .../apache/beam/sdk/io/hdfs/package-info.java   |  22 -
 .../beam/sdk/io/hdfs/HDFSFileSinkTest.java  | 172 -
 .../beam/sdk/io/hdfs/HDFSFileSourceTest.java| 231 ---
 .../sdk/io/hdfs/HadoopFileSystemModuleTest.java |  65 --
 .../HadoopFileSystemOptionsRegistrarTest.java   |  49 --
 .../io/hdfs/HadoopFileSystemOptionsTest.java|  48 --
 .../io/hdfs/HadoopFileSystemRegistrarTest.java  |  81 ---
 .../beam/sdk/io/hdfs/HadoopFileSystemTest.java  | 247 
 sdks/java/io/pom.xml|   2 +-
 43 files changed, 3626 insertions(+), 3626 deletions(-)
--




[3/5] beam git commit: [BEAM-2135] Move hdfs to hadoop-file-system

2017-05-02 Thread lcwik
http://git-wip-us.apache.org/repos/asf/beam/blob/bacd33c8/sdks/java/io/hadoop-file-system/src/main/java/org/apache/beam/sdk/io/hdfs/Sink.java
--
diff --git a/sdks/java/io/hadoop-file-system/src/main/java/org/apache/beam/sdk/io/hdfs/Sink.java b/sdks/java/io/hadoop-file-system/src/main/java/org/apache/beam/sdk/io/hdfs/Sink.java
new file mode 100644
index 000..fe2db5f
--- /dev/null
+++ b/sdks/java/io/hadoop-file-system/src/main/java/org/apache/beam/sdk/io/hdfs/Sink.java
@@ -0,0 +1,195 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.io.hdfs;
+
+import java.io.Serializable;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.transforms.display.DisplayData;
+import org.apache.beam.sdk.transforms.display.HasDisplayData;
+import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
+import org.apache.beam.sdk.transforms.windowing.PaneInfo;
+
+/**
+ * This class is deprecated, and only exists for HDFSFileSink.
+ */
+@Deprecated
+public abstract class Sink implements Serializable, HasDisplayData {
+  /**
+   * Ensures that the sink is valid and can be written to before the write operation begins. One
+   * should use {@link com.google.common.base.Preconditions} to implement this method.
+   */
+  public abstract void validate(PipelineOptions options);
+
+  /**
+   * Returns an instance of a {@link WriteOperation} that can write to this Sink.
+   */
+  public abstract WriteOperation createWriteOperation();
+
+  /**
+   * {@inheritDoc}
+   *
+   * By default, does not register any display data. Implementors may override this method
+   * to provide their own display data.
+   */
+  @Override
+  public void populateDisplayData(DisplayData.Builder builder) {}
+
+  /**
+   * A {@link WriteOperation} defines the process of a parallel write of objects to a Sink.
+   *
+   * The {@code WriteOperation} defines how to perform initialization and finalization of a
+   * parallel write to a sink as well as how to create a {@link Sink.Writer} object that can write
+   * a bundle to the sink.
+   *
+   * Since operations in Beam may be run multiple times for redundancy or fault-tolerance,
+   * the initialization and finalization defined by a WriteOperation must be idempotent.
+   *
+   * {@code WriteOperation}s may be mutable; a {@code WriteOperation} is serialized after the
+   * call to {@code initialize} method and deserialized before calls to
+   * {@code createWriter} and {@code finalized}. However, it is not
+   * reserialized after {@code createWriter}, so {@code createWriter} should not mutate the
+   * state of the {@code WriteOperation}.
+   *
+   * See {@link Sink} for more detailed documentation about the process of writing to a Sink.
+   *
+   * @param  The type of objects to write
+   * @param  The result of a per-bundle write
+   */
+  public abstract static class WriteOperation implements Serializable {
+/**
+ * Performs initialization before writing to the sink. Called before writing begins.
+ */
+public abstract void initialize(PipelineOptions options) throws Exception;
+
+/**
+ * Indicates that the operation will be performing windowed writes.
+ */
+public abstract void setWindowedWrites(boolean windowedWrites);
+
+/**
+ * Given an Iterable of results from bundle writes, performs finalization after writing and
+ * closes the sink. Called after all bundle writes are complete.
+ *
+ * The results that are passed to finalize are those returned by bundles that completed
+ * successfully. Although bundles may have been run multiple times (for fault-tolerance), only
+ * one writer result will be passed to finalize for each bundle. An implementation of finalize
+ * should perform clean up of any failed and successfully retried bundles.  Note that these
+ * failed bundles will not have their writer result passed to finalize, so finalize should be
+ * capable of locating any temporary/partial output written by failed bundles.
+ *
+ * A best practice is to make 
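
The `finalize` contract in the `Sink.WriteOperation` javadoc above (idempotent, receives only the results of successful bundles, and must be able to locate leftovers from failed ones) can be illustrated with a small standalone sketch. This is not Beam code: the class `IdempotentFinalizeSketch`, the `temp-` and `part-` file naming, and both method names are assumptions made up for illustration, using only the Java standard library.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of an idempotent finalize step; not part of Beam. */
final class IdempotentFinalizeSketch {
  // Assumed naming convention: every bundle attempt writes a file starting with this prefix.
  private static final String TEMP_PREFIX = "temp-";

  /**
   * Promotes the temp file of each successful bundle to its final name, then deletes
   * any temp files left behind. Safe to call more than once with the same arguments.
   */
  static void finalizeWrite(Path dir, List<String> successfulTempNames) {
    try {
      for (int shard = 0; shard < successfulTempNames.size(); shard++) {
        Path temp = dir.resolve(successfulTempNames.get(shard));
        Path dest = dir.resolve("part-" + shard);
        // On a retry the temp file is already gone; skipping keeps the call idempotent.
        if (Files.exists(temp)) {
          Files.move(temp, dest, StandardCopyOption.REPLACE_EXISTING);
        }
      }
      // Whatever still carries the temp prefix came from attempts whose results were not passed in.
      try (DirectoryStream<Path> leftovers = Files.newDirectoryStream(dir, TEMP_PREFIX + "*")) {
        for (Path leftover : leftovers) {
          Files.deleteIfExists(leftover);
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Writes three attempts (one failed), finalizes twice, and returns the sorted file names. */
  static List<String> runDemo() {
    try {
      Path dir = Files.createTempDirectory("sink-sketch");
      Files.write(dir.resolve("temp-a"), "bundle 0".getBytes(StandardCharsets.UTF_8));
      Files.write(dir.resolve("temp-b"), "bundle 1".getBytes(StandardCharsets.UTF_8));
      Files.write(dir.resolve("temp-failed"), "partial".getBytes(StandardCharsets.UTF_8));

      List<String> results = Arrays.asList("temp-a", "temp-b");
      finalizeWrite(dir, results);
      finalizeWrite(dir, results); // Simulated retry: must leave the same files behind.

      List<String> names = new ArrayList<>();
      try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
        for (Path file : files) {
          names.add(file.getFileName().toString());
        }
      }
      Collections.sort(names);
      return names;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(runDemo()); // prints [part-0, part-1]
  }
}
```

Finding leftovers by prefix rather than from the result list reflects the javadoc's point that failed bundles never report a writer result, so finalization has to discover their partial output on its own.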

[1/5] beam git commit: [BEAM-2135] Move hdfs to hadoop-file-system

2017-05-02 Thread lcwik
Repository: beam
Updated Branches:
  refs/heads/master 3161904d9 -> 9f6377fcc


http://git-wip-us.apache.org/repos/asf/beam/blob/bacd33c8/sdks/java/io/hdfs/src/main/java/org/apache/beam/sdk/io/hdfs/Write.java
--
diff --git a/sdks/java/io/hdfs/src/main/java/org/apache/beam/sdk/io/hdfs/Write.java b/sdks/java/io/hdfs/src/main/java/org/apache/beam/sdk/io/hdfs/Write.java
deleted file mode 100644
index 86a9246..000
--- a/sdks/java/io/hdfs/src/main/java/org/apache/beam/sdk/io/hdfs/Write.java
+++ /dev/null
@@ -1,585 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.sdk.io.hdfs;
-
-import static com.google.common.base.Preconditions.checkArgument;
-import static com.google.common.base.Preconditions.checkNotNull;
-
-import com.google.common.collect.ImmutableList;
-import com.google.common.collect.Lists;
-import java.util.List;
-import java.util.UUID;
-import java.util.concurrent.ThreadLocalRandom;
-import javax.annotation.Nullable;
-import org.apache.beam.sdk.Pipeline;
-import org.apache.beam.sdk.coders.Coder;
-import org.apache.beam.sdk.coders.KvCoder;
-import org.apache.beam.sdk.coders.SerializableCoder;
-import org.apache.beam.sdk.coders.VoidCoder;
-import org.apache.beam.sdk.io.hdfs.Sink.WriteOperation;
-import org.apache.beam.sdk.io.hdfs.Sink.Writer;
-import org.apache.beam.sdk.options.PipelineOptions;
-import org.apache.beam.sdk.options.ValueProvider;
-import org.apache.beam.sdk.options.ValueProvider.StaticValueProvider;
-import org.apache.beam.sdk.transforms.Create;
-import org.apache.beam.sdk.transforms.DoFn;
-import org.apache.beam.sdk.transforms.GroupByKey;
-import org.apache.beam.sdk.transforms.PTransform;
-import org.apache.beam.sdk.transforms.ParDo;
-import org.apache.beam.sdk.transforms.View;
-import org.apache.beam.sdk.transforms.WithKeys;
-import org.apache.beam.sdk.transforms.display.DisplayData;
-import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
-import org.apache.beam.sdk.transforms.windowing.DefaultTrigger;
-import org.apache.beam.sdk.transforms.windowing.GlobalWindows;
-import org.apache.beam.sdk.transforms.windowing.Window;
-import org.apache.beam.sdk.values.KV;
-import org.apache.beam.sdk.values.PCollection;
-import org.apache.beam.sdk.values.PCollection.IsBounded;
-import org.apache.beam.sdk.values.PCollectionView;
-import org.apache.beam.sdk.values.PDone;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
-
-/**
- * This class is deprecated, and only exists currently for HDFSFileSink.
- */
-@Deprecated
-public class Write extends PTransform {
-  private static final Logger LOG = LoggerFactory.getLogger(Write.class);
-
-  private static final int UNKNOWN_SHARDNUM = -1;
-  private static final int UNKNOWN_NUMSHARDS = -1;
-
-  private final Sink sink;
-  // This allows the number of shards to be dynamically computed based on the input
-  // PCollection.
-  @Nullable
-  private final PTransform computeNumShards;
-  // We don't use a side input for static sharding, as we want this value to be updatable
-  // when a pipeline is updated.
-  @Nullable
-  private final ValueProvider numShardsProvider;
-  private boolean windowedWrites;
-
-  /**
-   * Creates a {@link Write} transform that writes to the given {@link Sink}, letting the runner
-   * control how many different shards are produced.
-   */
-  public static  Write to(Sink sink) {
-checkNotNull(sink, "sink");
-return new Write<>(sink, null /* runner-determined sharding */, null, false);
-  }
-
-  private Write(
-  Sink sink,
-  @Nullable PTransform computeNumShards,
-  @Nullable ValueProvider numShardsProvider,
-  boolean windowedWrites) {
-this.sink = sink;
-this.computeNumShards = computeNumShards;
-this.numShardsProvider = numShardsProvider;
-this.windowedWrites = windowedWrites;
-  }
-
-  @Override
-  public PDone expand(PCollection input) {
-checkArgument(IsBounded.BOUNDED == input.isBounded() || windowedWrites,
-"%s can only be applied to an unbounded PCollection if doing windowed writes",
-

Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #3591

2017-05-02 Thread Apache Jenkins Server
See 



