[jira] [Updated] (BEAM-559) DoFnTester should handle Setup / TearDown

2016-08-15 Thread Vikas Kedigehalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikas Kedigehalli updated BEAM-559:
---
Assignee: Daniel Halperin

> DoFnTester should handle Setup / TearDown
> -
>
> Key: BEAM-559
> URL: https://issues.apache.org/jira/browse/BEAM-559
> Project: Beam
>  Issue Type: Improvement
>Reporter: Vikas Kedigehalli
>Assignee: Daniel Halperin
>
> Now that DoFn supports setup and teardown, it would be nice for DoFnTester to 
> add them to its lifecycle so as to avoid calling these methods explicitly in 
> DoFn unit tests. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-559) DoFnTester should handle Setup / TearDown

2016-08-15 Thread Vikas Kedigehalli (JIRA)
Vikas Kedigehalli created BEAM-559:
--

 Summary: DoFnTester should handle Setup / TearDown
 Key: BEAM-559
 URL: https://issues.apache.org/jira/browse/BEAM-559
 Project: Beam
  Issue Type: Improvement
Reporter: Vikas Kedigehalli


Now that DoFn supports setup and teardown, it would be nice for DoFnTester to 
add them to its lifecycle so as to avoid calling these methods explicitly in 
DoFn unit tests. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (BEAM-534) README.md contains dead link

2016-08-15 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin closed BEAM-534.

   Resolution: Fixed
Fix Version/s: Not applicable

> README.md contains dead link
> 
>
> Key: BEAM-534
> URL: https://issues.apache.org/jira/browse/BEAM-534
> Project: Beam
>  Issue Type: Bug
>  Components: project-management
>Reporter: Frank Yellin
>Assignee: Jean-Baptiste Onofré
>Priority: Minor
> Fix For: Not applicable
>
>
> The final link on README.md is to 
> http://beam.incubator.apache.org/getting_started/
> Clinking on the link gives a 404.  I'm not quite sure what this is supposed 
> to be pointing to.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (BEAM-326) A minor update to KafkaIO javadoc

2016-08-15 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin closed BEAM-326.

   Resolution: Fixed
Fix Version/s: 0.2.0-incubating

> A minor update to KafkaIO javadoc
> -
>
> Key: BEAM-326
> URL: https://issues.apache.org/jira/browse/BEAM-326
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Raghu Angadi
>Assignee: Raghu Angadi
>Priority: Trivial
> Fix For: 0.2.0-incubating
>
>
> Update code sample for KafkaIO.write() to be consistent with another code 
> sample below it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (BEAM-306) Make java-only PubsubIO work in InProcessRunner

2016-08-15 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin closed BEAM-306.

   Resolution: Fixed
Fix Version/s: 0.1.0-incubating

> Make java-only PubsubIO work in InProcessRunner 
> 
>
> Key: BEAM-306
> URL: https://issues.apache.org/jira/browse/BEAM-306
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Mark Shields
>Assignee: Mark Shields
> Fix For: 0.1.0-incubating
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (BEAM-79) Gearpump runner

2016-08-15 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-79?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin updated BEAM-79:

Component/s: (was: runner-ideas)
 runner-gearpump

> Gearpump runner
> ---
>
> Key: BEAM-79
> URL: https://issues.apache.org/jira/browse/BEAM-79
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-gearpump
>Reporter: Tyler Akidau
>Assignee: Manu Zhang
>
> Intel is submitting Gearpump (http://www.gearpump.io) to ASF 
> (https://wiki.apache.org/incubator/GearpumpProposal). Appears to be a mix of 
> low-level primitives a la MillWheel, with some higher level primitives like 
> non-merging windowing mixed in. Seems like it would make a nice Beam runner.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (BEAM-459) Add BigInteger to TypeDescriptors

2016-08-15 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin closed BEAM-459.

   Resolution: Fixed
Fix Version/s: 0.2.0-incubating

> Add BigInteger to TypeDescriptors
> -
>
> Key: BEAM-459
> URL: https://issues.apache.org/jira/browse/BEAM-459
> Project: Beam
>  Issue Type: Bug
>Reporter: Jesse Anderson
>Assignee: Jesse Anderson
> Fix For: 0.2.0-incubating
>
>
> The TypeDescriptors class is missing a BigInteger TypeDescriptor method.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (BEAM-46) Unbounded sink for Google Cloud Bigtable

2016-08-15 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-46?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin closed BEAM-46.
---
   Resolution: Fixed
Fix Version/s: 0.3.0-incubating

> Unbounded sink for Google Cloud Bigtable
> 
>
> Key: BEAM-46
> URL: https://issues.apache.org/jira/browse/BEAM-46
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-gcp
>Reporter: Daniel Halperin
>Assignee: Ian Zhou
> Fix For: 0.3.0-incubating
>
>
> Google Cloud Bigtable is currently in Beta. 
> https://cloud.google.com/bigtable/ A bounded sink is included in the initial 
> code for Beam, and uses asynchronous row mutations (with bounded memory) for 
> maximum throughput.
> The unbounded sink code is in principle not too different. The key areas of 
> focus are better connection management, thread management, and fault 
> tolerance (e.g., so connections are not leaked if bundles fail) in the 
> unbounded case in which there are hundreds of active threads and very small 
> bundles.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (BEAM-38) Avoid repeated DoFn deserialization

2016-08-15 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-38?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin closed BEAM-38.
---
   Resolution: Fixed
Fix Version/s: 0.2.0-incubating

> Avoid repeated DoFn deserialization
> ---
>
> Key: BEAM-38
> URL: https://issues.apache.org/jira/browse/BEAM-38
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Kenneth Knowles
>Assignee: Thomas Groh
> Fix For: 0.2.0-incubating
>
>
> Alluded to in our technical vision, we wish to remove at least some 
> guarantees about when a fresh DoFn instance is created. Repeated 
> deserialization can become a major cost when bundles are very small, as in 
> some streaming pipelines.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (BEAM-31) When triggers are changed via pipeline update, stale finished triggers data applied

2016-08-15 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-31?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin updated BEAM-31:

Component/s: (was: runner-core)
 runner-dataflow

> When triggers are changed via pipeline update, stale finished triggers data 
> applied
> ---
>
> Key: BEAM-31
> URL: https://issues.apache.org/jira/browse/BEAM-31
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>  Labels: Triggers, Update
>
> TriggerRunner tracks which trigger subexpresions are finished. When the 
> trigger expession is updated via pipeline update, this data is applied 
> arbitrarily to the new trigger expression.
> Implementation note: trigger subexpressions are identified by number, and 
> their finished state stored in a bit set.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (BEAM-122) GlobalWindow and allowedLateness can cause inconsistent timer interpretation

2016-08-15 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin closed BEAM-122.

   Resolution: Fixed
Fix Version/s: 0.1.0-incubating

> GlobalWindow and allowedLateness can cause inconsistent timer interpretation 
> -
>
> Key: BEAM-122
> URL: https://issues.apache.org/jira/browse/BEAM-122
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Mark Shields
>Assignee: Mark Shields
> Fix For: 0.1.0-incubating
>
>
> In ReduceFnRunner we have code such as
>window.getMaxTimestamp().plus(windowingStrategy.getAllowedLateness())
> If window is global then maxTimestamp will be 
> BoundedWindow.TIMESTAMP_MAX_VALUE.
> Meanwhile, timestamps beyond BoundedWindow.TIMESTAMP_MAX_VALUE will be 
> clipped in most runners.
> This could cause the time of an expected timer (eg for garbage collection) to 
> not match the actual time of a fired timer.
> We should either make non-zero allowedLateness on the Global window illegal 
> (probably obnoxious) or explicitly clip it to zero.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-153) Support timeout in runner API

2016-08-15 Thread Daniel Halperin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15422039#comment-15422039
 ] 

Daniel Halperin commented on BEAM-153:
--

Pei, this seems pretty well covered by some of your recent job changes. Any 
thoughts?

[~pei...@gmail.com]

> Support timeout in runner API
> -
>
> Key: BEAM-153
> URL: https://issues.apache.org/jira/browse/BEAM-153
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Eugene Kirpichov
>
> Some users want to make sure that their pipeline doesn't run longer than X 
> minutes (e.g. because sometimes it runs longer than that due to bugs, and in 
> that case they'd rather auto-cancel it than incur the costs).
> The runner API should have a timeout option, so that if the pipeline isn't in 
> a terminal state by then, it is automatically cancelled.
> Naturally, this only applies to batch pipelines.
> A simple way to implement this for a blocking runner (such as 
> BlockingDataflowPipelineRunner) is a wrapper of the sort "start pipeline, and 
> cancel it after timeout" inside run(). For a non-blocking runner this will 
> require support on the underlying execution environment side.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (BEAM-86) CountingSource should expose PTransform> rather than sources directly

2016-08-15 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-86?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin closed BEAM-86.
---
   Resolution: Fixed
Fix Version/s: 0.1.0-incubating

> CountingSource should expose PTransform> rather 
> than sources directly
> ---
>
> Key: BEAM-86
> URL: https://issues.apache.org/jira/browse/BEAM-86
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Daniel Halperin
>Assignee: Thomas Groh
> Fix For: 0.1.0-incubating
>
>
> We can make the Source exposure be package-private for tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #835: Fix NPE in BigQueryIO.TransformingReader w...

2016-08-15 Thread peihe
GitHub user peihe opened a pull request:

https://github.com/apache/incubator-beam/pull/835

Fix NPE in BigQueryIO.TransformingReader when it is on top of an 
unsplittable reader.





You can merge this pull request into a Git repository by running:

$ git pull https://github.com/peihe/incubator-beam bq-custom-npe

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/835.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #835


commit 28e7d4cc46c67d5057e3ad6c9b4c32059be73fb2
Author: Pei He 
Date:   2016-08-16T00:23:20Z

Fix NPE in BigQueryIO.TransformingReader




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Jenkins build is back to stable : beam_PostCommit_RunnableOnService_GoogleCloudDataflow #961

2016-08-15 Thread Apache Jenkins Server
See 




[GitHub] incubator-beam pull request #832: [BEAM-557] Exclude guava-testlib from shad...

2016-08-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/832


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-557) Test-scoped dependencies should be excluded from shading package relocation

2016-08-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421926#comment-15421926
 ] 

ASF GitHub Bot commented on BEAM-557:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/832


> Test-scoped dependencies should be excluded from shading package relocation
> ---
>
> Key: BEAM-557
> URL: https://issues.apache.org/jira/browse/BEAM-557
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Minor
>
> Currently, guava-testlib is being relocated as part of the shading process, 
> but test-scope dependencies aren't bundled in the uber-jar. As a result, the 
> output JAR is unusable without recreating the same shading rules in a 
> consuming project.
> Note that this does not effect our maven test process because tests are run 
> on the unshaded JAR.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[1/2] incubator-beam git commit: Exclude guava-testlib from shading relocation

2016-08-15 Thread lcwik
Repository: incubator-beam
Updated Branches:
  refs/heads/master ab104a16c -> f1935d5d1


Exclude guava-testlib from shading relocation

Previously, guava-testlib guava-testlib was being relocated as part of
the shading process, but test-scope dependencies aren't bundled in the
uber-jar. As a result, the output JAR was unusable without recreating the
same shading rules in a consuming project.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/914ae520
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/914ae520
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/914ae520

Branch: refs/heads/master
Commit: 914ae520b15b8a0ed3304b206adcc802f9cce7fb
Parents: ab104a1
Author: Scott Wegner 
Authored: Mon Aug 15 15:39:34 2016 -0700
Committer: Luke Cwik 
Committed: Mon Aug 15 16:52:25 2016 -0700

--
 pom.xml| 2 ++
 runners/direct-java/pom.xml| 6 ++
 runners/google-cloud-dataflow-java/pom.xml | 6 ++
 sdks/java/core/pom.xml | 6 ++
 4 files changed, 20 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/914ae520/pom.xml
--
diff --git a/pom.xml b/pom.xml
index d5e609b..b5f30c1 100644
--- a/pom.xml
+++ b/pom.xml
@@ -633,6 +633,8 @@
   
 
   
+
 com.google.guava
 guava-testlib
 ${guava.version}

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/914ae520/runners/direct-java/pom.xml
--
diff --git a/runners/direct-java/pom.xml b/runners/direct-java/pom.xml
index 0a2b4b9..11481f1 100644
--- a/runners/direct-java/pom.xml
+++ b/runners/direct-java/pom.xml
@@ -192,6 +192,10 @@
  the second relocation. -->
 
   com.google.common
+  
+
+com.google.common.**.testing
+  
   
org.apache.beam.runners.direct.repackaged.com.google.common
 
 
@@ -264,6 +268,8 @@
 
 
 
+  
   com.google.guava
   guava-testlib
   test

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/914ae520/runners/google-cloud-dataflow-java/pom.xml
--
diff --git a/runners/google-cloud-dataflow-java/pom.xml 
b/runners/google-cloud-dataflow-java/pom.xml
index d130281..d5485ef 100644
--- a/runners/google-cloud-dataflow-java/pom.xml
+++ b/runners/google-cloud-dataflow-java/pom.xml
@@ -196,6 +196,10 @@
  the second relocation. -->
 
   com.google.common
+  
+
+com.google.common.**.testing
+  
   
org.apache.beam.sdk.repackaged.com.google.common
 
 
@@ -310,6 +314,8 @@
 
 
 
+  
   com.google.guava
   guava-testlib
   test

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/914ae520/sdks/java/core/pom.xml
--
diff --git a/sdks/java/core/pom.xml b/sdks/java/core/pom.xml
index aff4f66..fddccea 100644
--- a/sdks/java/core/pom.xml
+++ b/sdks/java/core/pom.xml
@@ -193,6 +193,10 @@
   exclude 'org.apache.beam.**', and remove the second 
relocation. -->
 
   com.google.common
+  
+
+com.google.common.**.testing
+  
   
org.apache.beam.sdk.repackaged.com.google.common
 
 
@@ -421,6 +425,8 @@
 
 
 
+  
   com.google.guava
   guava-testlib
   test



[2/2] incubator-beam git commit: [BEAM-557] Exclude guava-testlib from shading relocation

2016-08-15 Thread lcwik
[BEAM-557] Exclude guava-testlib from shading relocation

This closes #832


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/f1935d5d
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/f1935d5d
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/f1935d5d

Branch: refs/heads/master
Commit: f1935d5d139b1f6f7ccf0125752fd999be1dbb53
Parents: ab104a1 914ae52
Author: Luke Cwik 
Authored: Mon Aug 15 16:52:51 2016 -0700
Committer: Luke Cwik 
Committed: Mon Aug 15 16:52:51 2016 -0700

--
 pom.xml| 2 ++
 runners/direct-java/pom.xml| 6 ++
 runners/google-cloud-dataflow-java/pom.xml | 6 ++
 sdks/java/core/pom.xml | 6 ++
 4 files changed, 20 insertions(+)
--




[jira] [Commented] (BEAM-558) DataflowRunner does not support setup/teardown

2016-08-15 Thread Daniel Halperin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421903#comment-15421903
 ] 

Daniel Halperin commented on BEAM-558:
--

RunnableOnService tests: ParDoTest.java and ParDoLifecycleTest.java need to be 
re-enabled

https://github.com/apache/incubator-beam/blob/master/runners/google-cloud-dataflow-java/pom.xml#L68

> DataflowRunner does not support setup/teardown
> --
>
> Key: BEAM-558
> URL: https://issues.apache.org/jira/browse/BEAM-558
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Daniel Halperin
>Assignee: Thomas Groh
>Priority: Critical
>
> Tracking bug for tests we disable for Dataflow that depend on this behavior :)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-558) DataflowRunner does not support setup/teardown

2016-08-15 Thread Daniel Halperin (JIRA)
Daniel Halperin created BEAM-558:


 Summary: DataflowRunner does not support setup/teardown
 Key: BEAM-558
 URL: https://issues.apache.org/jira/browse/BEAM-558
 Project: Beam
  Issue Type: Bug
  Components: runner-dataflow
Reporter: Daniel Halperin
Assignee: Thomas Groh
Priority: Critical


Tracking bug for tests we disable for Dataflow that depend on this behavior :)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #834: BigtableIO StreamingWrite Setup and Teardo...

2016-08-15 Thread ianzhou1
GitHub user ianzhou1 opened a pull request:

https://github.com/apache/incubator-beam/pull/834

BigtableIO StreamingWrite Setup and Teardown



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ianzhou1/incubator-beam 
BigtableStreamingSetupTeardown

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/834.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #834


commit 5fd70ab69f8b56574c14f964e6c3f27ac78c1b92
Author: Ian Zhou 
Date:   2016-08-15T22:39:56Z

Modified BigtableIO to use DoFn setup/tearDown methods instead of 
startBundle/finishBundle

commit 5d2107f7cd5b105634e982992028e8ed732716f3
Author: Ian Zhou 
Date:   2016-08-15T23:26:56Z

Added flushing and logging




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] incubator-beam git commit: Exclude ParDoTest from Dataflow @RunnableOnService

2016-08-15 Thread dhalperi
Repository: incubator-beam
Updated Branches:
  refs/heads/master 7c680079b -> ab104a16c


Exclude ParDoTest from Dataflow @RunnableOnService

Until we implement it for Dataflow runner.

Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/23a229eb
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/23a229eb
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/23a229eb

Branch: refs/heads/master
Commit: 23a229ebfc33734319df80a4085470932f851ce8
Parents: 7c68007
Author: Daniel Halperin 
Authored: Mon Aug 15 15:21:15 2016 -0700
Committer: GitHub 
Committed: Mon Aug 15 15:21:15 2016 -0700

--
 runners/google-cloud-dataflow-java/pom.xml | 1 +
 1 file changed, 1 insertion(+)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/23a229eb/runners/google-cloud-dataflow-java/pom.xml
--
diff --git a/runners/google-cloud-dataflow-java/pom.xml 
b/runners/google-cloud-dataflow-java/pom.xml
index c32e184..d130281 100644
--- a/runners/google-cloud-dataflow-java/pom.xml
+++ b/runners/google-cloud-dataflow-java/pom.xml
@@ -66,6 +66,7 @@
 
   
 
org/apache/beam/sdk/transforms/ParDoLifecycleTest.java
+
org/apache/beam/sdk/transforms/ParDoTest.java
   
 
   



[GitHub] incubator-beam pull request #831: Exclude ParDoTest from Dataflow @RunnableO...

2016-08-15 Thread dhalperi
Github user dhalperi closed the pull request at:

https://github.com/apache/incubator-beam/pull/831


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] incubator-beam git commit: Closes #831

2016-08-15 Thread dhalperi
Closes #831


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/ab104a16
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/ab104a16
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/ab104a16

Branch: refs/heads/master
Commit: ab104a16cd1221dde0ed0716a7a4de99fed8565b
Parents: 7c68007 23a229e
Author: Dan Halperin 
Authored: Mon Aug 15 16:25:36 2016 -0700
Committer: Dan Halperin 
Committed: Mon Aug 15 16:25:36 2016 -0700

--
 runners/google-cloud-dataflow-java/pom.xml | 1 +
 1 file changed, 1 insertion(+)
--




[jira] [Commented] (BEAM-550) Datastore should support writes for Unbounded PCollections

2016-08-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421843#comment-15421843
 ] 

ASF GitHub Bot commented on BEAM-550:
-

Github user vikkyrk closed the pull request at:

https://github.com/apache/incubator-beam/pull/825


> Datastore should support writes for Unbounded PCollections 
> ---
>
> Key: BEAM-550
> URL: https://issues.apache.org/jira/browse/BEAM-550
> Project: Beam
>  Issue Type: Bug
>Reporter: Vikas Kedigehalli
>Assignee: Vikas Kedigehalli
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #825: [BEAM-550] DatastoreIO Write PTranform for...

2016-08-15 Thread vikkyrk
Github user vikkyrk closed the pull request at:

https://github.com/apache/incubator-beam/pull/825


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-550) Datastore should support writes for Unbounded PCollections

2016-08-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421823#comment-15421823
 ] 

ASF GitHub Bot commented on BEAM-550:
-

GitHub user vikkyrk opened a pull request:

https://github.com/apache/incubator-beam/pull/833

[BEAM-550] DatastoreIO Sink as ParDo

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---
- DatastoreIO doesn't have an atomic operations, making 
WriteOperation#(initialize/finalize) a noop. There is no benefit for the 
additional complexity the Sink API brings. Hence cleaning up those classes and 
directly using a ParDo to implement Datastore writes.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vikkyrk/incubator-beam vikasrk/datastore_sink

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/833.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #833


commit 3ad876dddbf3affe39093a748578aa5d93f7b4e0
Author: Vikas Kedigehalli 
Date:   2016-08-15T22:28:07Z

DatastoreIO Sink as ParDo




> Datastore should support writes for Unbounded PCollections 
> ---
>
> Key: BEAM-550
> URL: https://issues.apache.org/jira/browse/BEAM-550
> Project: Beam
>  Issue Type: Bug
>Reporter: Vikas Kedigehalli
>Assignee: Vikas Kedigehalli
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #833: [BEAM-550] DatastoreIO Sink as ParDo

2016-08-15 Thread vikkyrk
GitHub user vikkyrk opened a pull request:

https://github.com/apache/incubator-beam/pull/833

[BEAM-550] DatastoreIO Sink as ParDo

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---
- DatastoreIO doesn't have an atomic operations, making 
WriteOperation#(initialize/finalize) a noop. There is no benefit for the 
additional complexity the Sink API brings. Hence cleaning up those classes and 
directly using a ParDo to implement Datastore writes.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vikkyrk/incubator-beam vikasrk/datastore_sink

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/833.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #833


commit 3ad876dddbf3affe39093a748578aa5d93f7b4e0
Author: Vikas Kedigehalli 
Date:   2016-08-15T22:28:07Z

DatastoreIO Sink as ParDo




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-557) Test-scoped dependencies should be excluded from shading package relocation

2016-08-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421818#comment-15421818
 ] 

ASF GitHub Bot commented on BEAM-557:
-

GitHub user swegner opened a pull request:

https://github.com/apache/incubator-beam/pull/832

[BEAM-557] Exclude guava-testlib from shading relocation

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

Previously, guava-testlib guava-testlib was being relocated as part of
the shading process, but test-scope dependencies aren't bundled in the
uber-jar. As a result, the output JAR was unusable without recreating the
same shading rules in a consuming project.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/swegner/incubator-beam test-shading

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/832.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #832


commit 03aafacafa2b917d8469dbd7043caf87dfe99e5f
Author: Scott Wegner 
Date:   2016-08-15T22:39:34Z

Exclude guava-testlib from shading relocation

Previously, guava-testlib guava-testlib was being relocated as part of
the shading process, but test-scope dependencies aren't bundled in the
uber-jar. As a result, the output JAR was unusable without recreating the
same shading rules in a consuming project.




> Test-scoped dependencies should be excluded from shading package relocation
> ---
>
> Key: BEAM-557
> URL: https://issues.apache.org/jira/browse/BEAM-557
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Minor
>
> Currently, guava-testlib is being relocated as part of the shading process, 
> but test-scope dependencies aren't bundled in the uber-jar. As a result, the 
> output JAR is unusable without recreating the same shading rules in a 
> consuming project.
> Note that this does not effect our maven test process because tests are run 
> on the unshaded JAR.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #832: [BEAM-557] Exclude guava-testlib from shad...

2016-08-15 Thread swegner
GitHub user swegner opened a pull request:

https://github.com/apache/incubator-beam/pull/832

[BEAM-557] Exclude guava-testlib from shading relocation

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

Previously, guava-testlib guava-testlib was being relocated as part of
the shading process, but test-scope dependencies aren't bundled in the
uber-jar. As a result, the output JAR was unusable without recreating the
same shading rules in a consuming project.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/swegner/incubator-beam test-shading

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/832.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #832


commit 03aafacafa2b917d8469dbd7043caf87dfe99e5f
Author: Scott Wegner 
Date:   2016-08-15T22:39:34Z

Exclude guava-testlib from shading relocation

Previously, guava-testlib guava-testlib was being relocated as part of
the shading process, but test-scope dependencies aren't bundled in the
uber-jar. As a result, the output JAR was unusable without recreating the
same shading rules in a consuming project.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] incubator-beam git commit: Improve error handling in gcsio.py

2016-08-15 Thread dhalperi
Improve error handling in gcsio.py


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/4d6da9cf
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/4d6da9cf
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/4d6da9cf

Branch: refs/heads/python-sdk
Commit: 4d6da9cf19373a4f5a6c9513f44e12341c985a97
Parents: 1d53e28
Author: Charles Chen 
Authored: Mon Aug 15 15:12:00 2016 -0700
Committer: Dan Halperin 
Committed: Mon Aug 15 15:26:05 2016 -0700

--
 sdks/python/apache_beam/io/gcsio.py | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/4d6da9cf/sdks/python/apache_beam/io/gcsio.py
--
diff --git a/sdks/python/apache_beam/io/gcsio.py 
b/sdks/python/apache_beam/io/gcsio.py
index 7bb532c..4c733d9 100644
--- a/sdks/python/apache_beam/io/gcsio.py
+++ b/sdks/python/apache_beam/io/gcsio.py
@@ -29,6 +29,7 @@ import os
 import re
 import StringIO
 import threading
+import traceback
 
 from apitools.base.py.exceptions import HttpError
 import apitools.base.py.transfer as transfer
@@ -591,7 +592,8 @@ class GcsBufferedWriter(object):
   self.client.objects.Insert(self.insert_request, upload=self.upload)
 except Exception as e:  # pylint: disable=broad-except
   logging.error(
-  'Error in _start_upload while inserting file %s: %s', self.path, e)
+  'Error in _start_upload while inserting file %s: %s', self.path,
+  traceback.format_exc())
   self.upload_thread.last_error = e
 finally:
   self.child_conn.close()
@@ -623,6 +625,9 @@ class GcsBufferedWriter(object):
 self.closed = True
 self.conn.close()
 self.upload_thread.join()
+# Check for exception since the last _flush_write_buffer() call.
+if self.upload_thread.last_error:
+  raise self.upload_thread.last_error  # pylint: disable=raising-bad-type
 
   def __enter__(self):
 return self



[1/2] incubator-beam git commit: Closes #830

2016-08-15 Thread dhalperi
Repository: incubator-beam
Updated Branches:
  refs/heads/python-sdk 1d53e28e9 -> 3b40945e9


Closes #830


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/3b40945e
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/3b40945e
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/3b40945e

Branch: refs/heads/python-sdk
Commit: 3b40945e93844245eb88aa5fb544eacf86509098
Parents: 1d53e28 4d6da9c
Author: Dan Halperin 
Authored: Mon Aug 15 15:26:05 2016 -0700
Committer: Dan Halperin 
Committed: Mon Aug 15 15:26:05 2016 -0700

--
 sdks/python/apache_beam/io/gcsio.py | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)
--




[1/2] incubator-beam git commit: Revert the changes to the accidentaly reverted files

2016-08-15 Thread dhalperi
Repository: incubator-beam
Updated Branches:
  refs/heads/python-sdk b28e71979 -> 1d53e28e9


Revert the changes to the accidentaly reverted files


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/a4760fe3
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/a4760fe3
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/a4760fe3

Branch: refs/heads/python-sdk
Commit: a4760fe3fcd5a0dda31d137aee29b3cd825de9cf
Parents: b28e719
Author: Ahmet Altay 
Authored: Mon Aug 15 15:00:32 2016 -0700
Committer: Ahmet Altay 
Committed: Mon Aug 15 15:00:32 2016 -0700

--
 sdks/python/apache_beam/io/fileio.py| 25 
 sdks/python/apache_beam/io/gcsio.py |  9 +--
 sdks/python/apache_beam/io/gcsio_test.py| 23 +-
 sdks/python/apache_beam/runners/common.pxd  |  2 +-
 sdks/python/apache_beam/runners/common.py   | 18 ++
 .../python/apache_beam/runners/direct_runner.py |  9 +++
 .../runners/inprocess/transform_evaluator.py|  8 +++
 .../apache_beam/transforms/aggregator_test.py   |  8 +++
 8 files changed, 75 insertions(+), 27 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/a4760fe3/sdks/python/apache_beam/io/fileio.py
--
diff --git a/sdks/python/apache_beam/io/fileio.py 
b/sdks/python/apache_beam/io/fileio.py
index b1e091b..a6ce26a 100644
--- a/sdks/python/apache_beam/io/fileio.py
+++ b/sdks/python/apache_beam/io/fileio.py
@@ -283,7 +283,17 @@ class _CompressionType(object):
 self.identifier = identifier
 
   def __eq__(self, other):
-return self.identifier == other.identifier
+return (isinstance(other, _CompressionType) and
+self.identifier == other.identifier)
+
+  def __hash__(self):
+return hash(self.identifier)
+
+  def __ne__(self, other):
+return not self.__eq__(other)
+
+  def __repr__(self):
+return '_CompressionType(%s)' % self.identifier
 
 
 class CompressionTypes(object):
@@ -530,15 +540,22 @@ class FileSink(iobase.Sink):
 channel_factory.rename(old_name, final_name)
   except IOError as e:
 # May have already been copied.
-exists = channel_factory.exists(final_name)
+try:
+  exists = channel_factory.exists(final_name)
+except Exception as exists_e:  # pylint: disable=broad-except
+  logging.warning('Exception when checking if file %s exists: '
+  '%s', final_name, exists_e)
+  # Returning original exception after logging the exception from
+  # exists() call.
+  return (None, e)
 if not exists:
   logging.warning(('IOError in _rename_file. old_name: %s, '
'final_name: %s, err: %s'), old_name, final_name, e)
-  return(None, e)
+  return (None, e)
   except Exception as e:  # pylint: disable=broad-except
 logging.warning(('Exception in _rename_file. old_name: %s, '
  'final_name: %s, err: %s'), old_name, final_name, e)
-return(None, e)
+return (None, e)
   return (final_name, None)
 
 # ThreadPool crashes in old versions of Python (< 2.7.5) if created from a

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/a4760fe3/sdks/python/apache_beam/io/gcsio.py
--
diff --git a/sdks/python/apache_beam/io/gcsio.py 
b/sdks/python/apache_beam/io/gcsio.py
index 88fcfb8..7bb532c 100644
--- a/sdks/python/apache_beam/io/gcsio.py
+++ b/sdks/python/apache_beam/io/gcsio.py
@@ -234,8 +234,13 @@ class GcsIO(object):
  object=object_path)
   self.client.objects.Get(request)  # metadata
   return True
-except IOError:
-  return False
+except HttpError as http_error:
+  if http_error.status_code == 404:
+# HTTP 404 indicates that the file did not exist
+return False
+  else:
+# We re-raise all other exceptions
+raise
 
   @retry.with_exponential_backoff(
   retry_filter=retry.retry_on_server_errors_and_timeout_filter)

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/a4760fe3/sdks/python/apache_beam/io/gcsio_test.py
--
diff --git a/sdks/python/apache_beam/io/gcsio_test.py 
b/sdks/python/apache_beam/io/gcsio_test.py
index 7b15ef3..99c99b3 100644
--- a/sdks/python/apache_beam/io/gcsio_test.py
+++ b/sdks/python/apache_beam/io/gcsio_test.py
@@ -29,6 +29,7 @@ from apitools.base.py.exceptions import HttpError
 from apache_beam.internal.clients import storage
 
 from apache_beam.io import 

[GitHub] incubator-beam pull request #829: Revert the changes to the accidentaly reve...

2016-08-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/829


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] incubator-beam git commit: Closes #829

2016-08-15 Thread dhalperi
Closes #829


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/1d53e28e
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/1d53e28e
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/1d53e28e

Branch: refs/heads/python-sdk
Commit: 1d53e28e94610530b87f750e976e162ec378293f
Parents: b28e719 a4760fe
Author: Dan Halperin 
Authored: Mon Aug 15 15:24:52 2016 -0700
Committer: Dan Halperin 
Committed: Mon Aug 15 15:24:52 2016 -0700

--
 sdks/python/apache_beam/io/fileio.py| 25 
 sdks/python/apache_beam/io/gcsio.py |  9 +--
 sdks/python/apache_beam/io/gcsio_test.py| 23 +-
 sdks/python/apache_beam/runners/common.pxd  |  2 +-
 sdks/python/apache_beam/runners/common.py   | 18 ++
 .../python/apache_beam/runners/direct_runner.py |  9 +++
 .../runners/inprocess/transform_evaluator.py|  8 +++
 .../apache_beam/transforms/aggregator_test.py   |  8 +++
 8 files changed, 75 insertions(+), 27 deletions(-)
--




[GitHub] incubator-beam pull request #831: Exclude ParDoTest from Dataflow @RunnableO...

2016-08-15 Thread dhalperi
GitHub user dhalperi opened a pull request:

https://github.com/apache/incubator-beam/pull/831

Exclude ParDoTest from Dataflow @RunnableOnService

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---

Until we implement it for Dataflow runner.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/incubator-beam patch-1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/831.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #831


commit 23a229ebfc33734319df80a4085470932f851ce8
Author: Daniel Halperin 
Date:   2016-08-15T22:21:15Z

Exclude ParDoTest from Dataflow @RunnableOnService

Until we implement it for Dataflow runner.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Jenkins build became unstable: beam_PostCommit_RunnableOnService_GoogleCloudDataflow #960

2016-08-15 Thread Apache Jenkins Server
See 




[GitHub] incubator-beam pull request #830: Improve error handling in gcsio.py.

2016-08-15 Thread charlesccychen
GitHub user charlesccychen opened a pull request:

https://github.com/apache/incubator-beam/pull/830

Improve error handling in gcsio.py.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/charlesccychen/incubator-beam gcsio-fixes

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/830.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #830


commit a225d1677f59ce5aa50dc1b046ea2db004565873
Author: Charles Chen 
Date:   2016-08-15T22:12:00Z

Improve error handling in gcsio.py.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (BEAM-557) Test-scoped dependencies should be excluded from shading package relocation

2016-08-15 Thread Scott Wegner (JIRA)
Scott Wegner created BEAM-557:
-

 Summary: Test-scoped dependencies should be excluded from shading 
package relocation
 Key: BEAM-557
 URL: https://issues.apache.org/jira/browse/BEAM-557
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-core
Reporter: Scott Wegner
Assignee: Scott Wegner
Priority: Minor


Currently, guava-testlib is being relocated as part of the shading process, but 
test-scope dependencies aren't bundled in the uber-jar. As a result, the output 
JAR is unusable without recreating the same shading rules in a consuming 
project.

Note that this does not effect our maven test process because tests are run on 
the unshaded JAR.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #829: Revert the changes to the accidentaly reve...

2016-08-15 Thread aaltay
GitHub user aaltay opened a pull request:

https://github.com/apache/incubator-beam/pull/829

Revert the changes to the accidentaly reverted files




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aaltay/incubator-beam und

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/829.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #829


commit a4760fe3fcd5a0dda31d137aee29b3cd825de9cf
Author: Ahmet Altay 
Date:   2016-08-15T22:00:32Z

Revert the changes to the accidentaly reverted files




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


incubator-beam git commit: Closes #537

2016-08-15 Thread dhalperi
Repository: incubator-beam
Updated Branches:
  refs/heads/python-sdk 76de6109f -> b28e71979


Closes #537


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/b28e7197
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/b28e7197
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/b28e7197

Branch: refs/heads/python-sdk
Commit: b28e71979efafed64adba391db2abbcd7656d07d
Parents: 18ce31c 76de610
Author: Dan Halperin 
Authored: Mon Aug 15 14:36:32 2016 -0700
Committer: Dan Halperin 
Committed: Mon Aug 15 14:36:32 2016 -0700

--
 pom.xml |   3 +-
 sdks/pom.xml|   4 +-
 sdks/python/MANIFEST.in |  19 +++
 sdks/python/apache_beam/io/fileio.py|  25 +--
 sdks/python/apache_beam/io/gcsio.py |   9 +-
 sdks/python/apache_beam/io/gcsio_test.py|  23 +--
 sdks/python/apache_beam/runners/common.pxd  |   2 +-
 sdks/python/apache_beam/runners/common.py   |  18 +-
 .../python/apache_beam/runners/direct_runner.py |   9 +-
 .../runners/inprocess/transform_evaluator.py|   8 +-
 .../apache_beam/transforms/aggregator_test.py   |   8 +-
 sdks/python/apache_beam/utils/dependency.py |   1 -
 .../python/apache_beam/utils/dependency_test.py |   1 -
 sdks/python/apache_beam/version.py  |  42 -
 sdks/python/pom.xml | 169 +++
 sdks/python/setup.cfg   |   2 +
 sdks/python/setup.py|   2 +-
 sdks/python/tox.ini |   3 +
 18 files changed, 266 insertions(+), 82 deletions(-)
--




incubator-beam git commit: [BEAM-378] integrate setuptools in Maven build

2016-08-15 Thread dhalperi
Repository: incubator-beam
Updated Branches:
  refs/heads/python-sdk 18ce31c1d -> 76de6109f


[BEAM-378] integrate setuptools in Maven build

This PR provide an initial integration of the Python SDK in the Maven
build, relying on the exec-maven-plugin to call setuptools (BEAM-378).


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/76de6109
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/76de6109
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/76de6109

Branch: refs/heads/python-sdk
Commit: 76de6109fa19db34390edc593a62524a92f54ca0
Parents: 18ce31c
Author: Sergio Fernández 
Authored: Mon Aug 15 14:34:42 2016 -0700
Committer: Dan Halperin 
Committed: Mon Aug 15 14:35:22 2016 -0700

--
 pom.xml |   3 +-
 sdks/pom.xml|   4 +-
 sdks/python/MANIFEST.in |  19 +++
 sdks/python/apache_beam/io/fileio.py|  25 +--
 sdks/python/apache_beam/io/gcsio.py |   9 +-
 sdks/python/apache_beam/io/gcsio_test.py|  23 +--
 sdks/python/apache_beam/runners/common.pxd  |   2 +-
 sdks/python/apache_beam/runners/common.py   |  18 +-
 .../python/apache_beam/runners/direct_runner.py |   9 +-
 .../runners/inprocess/transform_evaluator.py|   8 +-
 .../apache_beam/transforms/aggregator_test.py   |   8 +-
 sdks/python/apache_beam/utils/dependency.py |   1 -
 .../python/apache_beam/utils/dependency_test.py |   1 -
 sdks/python/apache_beam/version.py  |  42 -
 sdks/python/pom.xml | 169 +++
 sdks/python/setup.cfg   |   2 +
 sdks/python/setup.py|   2 +-
 sdks/python/tox.ini |   3 +
 18 files changed, 266 insertions(+), 82 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/76de6109/pom.xml
--
diff --git a/pom.xml b/pom.xml
index 9e58ffe..afe24ee 100644
--- a/pom.xml
+++ b/pom.xml
@@ -778,7 +778,7 @@
 
   org.apache.rat
   apache-rat-plugin
-  0.11
+  0.12
   
 
${project.build.directory}/${project.build.finalName}.rat
 false
@@ -794,6 +794,7 @@
   **/test/resources/**/*.txt
   **/test/**/.placeholder
   .repository/**/*
+  **/nose-*.egg/**/*
 
   
   **/.checkstyle

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/76de6109/sdks/pom.xml
--
diff --git a/sdks/pom.xml b/sdks/pom.xml
index aa9cbed..fe37e96 100644
--- a/sdks/pom.xml
+++ b/sdks/pom.xml
@@ -34,6 +34,7 @@
 
   
 java
+python
   
 
   
@@ -53,8 +54,9 @@
 
   
 
+
   
 
   
 
-
\ No newline at end of file
+

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/76de6109/sdks/python/MANIFEST.in
--
diff --git a/sdks/python/MANIFEST.in b/sdks/python/MANIFEST.in
new file mode 100644
index 000..baa2fda
--- /dev/null
+++ b/sdks/python/MANIFEST.in
@@ -0,0 +1,19 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+# This file is used from Python to sync versions
+include pom.xml

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/76de6109/sdks/python/apache_beam/io/fileio.py
--
diff --git a/sdks/python/apache_beam/io/fileio.py 
b/sdks/python/apache_beam/io/fileio.py
index a6ce26a..b1e091b 100644
--- a/sdks/python/apache_beam/io/fileio.py
+++ b/sdks/python/apache_beam/io/fileio.py
@@ -283,17 +283,7 @@ class _CompressionType(object):
 self.identifier = identifier
 
   def __eq__(self, other):
-return (isinstance(other, _CompressionType) and
-self.identifier == other.identifier)
-
-  def __hash__(self):

[jira] [Commented] (BEAM-452) Implement DoFn per-instance setup and teardown methods

2016-08-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421690#comment-15421690
 ] 

ASF GitHub Bot commented on BEAM-452:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/690


> Implement DoFn per-instance setup and teardown methods
> --
>
> Key: BEAM-452
> URL: https://issues.apache.org/jira/browse/BEAM-452
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, runner-direct, runner-flink, 
> runner-spark, sdk-java-core
>Reporter: Thomas Groh
>Assignee: Thomas Groh
>
> https://docs.google.com/document/d/1LLQqggSePURt3XavKBGV7SZJYQ4NW8yCu63lBchzMRk/edit
> BEAM-38 permits DoFns to be reused across bundles. DoFn instances may need to 
> do per-instance setup and teardown, and to avoid redoing the work per-bundle, 
> the system should provide hooks to call before a DoFn is first used and after 
> it will no longer be used.
> DoFn#setup is called before any other calls to DoFn methods. DoFn#teardown is 
> called after any method throws an exception, or when the runner will no 
> longer use a DoFn instance (e.g. when it evicts it from a cache).
> Runners must call these methods appropriately in all cases (including if a 
> DoFn is used exactly once, for a single bundle, and discarded).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[1/5] incubator-beam git commit: Closes #690

2016-08-15 Thread dhalperi
Repository: incubator-beam
Updated Branches:
  refs/heads/master 0b1f66421 -> 7c680079b


Closes #690


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/7c680079
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/7c680079
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/7c680079

Branch: refs/heads/master
Commit: 7c680079b5074ff31257d7f8fff77af1dd9ea62c
Parents: 0b1f664 29cbdce
Author: Dan Halperin 
Authored: Mon Aug 15 14:16:54 2016 -0700
Committer: Dan Halperin 
Committed: Mon Aug 15 14:16:54 2016 -0700

--
 .../direct/BoundedReadEvaluatorFactory.java |   4 +
 .../beam/runners/direct/CloningThreadLocal.java |  43 --
 .../runners/direct/DoFnLifecycleManager.java| 106 +
 ...ecycleManagerRemovingTransformEvaluator.java |  79 
 .../runners/direct/DoFnLifecycleManagers.java   |  45 ++
 .../direct/ExecutorServiceParallelExecutor.java |   9 +-
 .../runners/direct/FlattenEvaluatorFactory.java |   3 +
 .../GroupAlsoByWindowEvaluatorFactory.java  |   6 +-
 .../direct/GroupByKeyOnlyEvaluatorFactory.java  |   4 +-
 .../direct/ParDoMultiEvaluatorFactory.java  |  55 ++-
 .../direct/ParDoSingleEvaluatorFactory.java |  42 +-
 ...readLocalInvalidatingTransformEvaluator.java |  63 ---
 .../direct/TransformEvaluatorFactory.java   |   8 +
 .../direct/TransformEvaluatorRegistry.java  |  41 ++
 .../direct/UnboundedReadEvaluatorFactory.java   |   3 +
 .../runners/direct/ViewEvaluatorFactory.java|   3 +
 .../runners/direct/WindowEvaluatorFactory.java  |   3 +
 .../runners/direct/CloningThreadLocalTest.java  |  92 
 ...leManagerRemovingTransformEvaluatorTest.java | 144 ++
 .../direct/DoFnLifecycleManagerTest.java| 168 +++
 .../direct/DoFnLifecycleManagersTest.java   | 142 ++
 ...LocalInvalidatingTransformEvaluatorTest.java | 135 --
 .../functions/FlinkDoFnFunction.java|  12 +-
 .../functions/FlinkMultiOutputDoFnFunction.java |  31 +-
 .../streaming/FlinkAbstractParDoWrapper.java|   2 +
 .../FlinkGroupAlsoByWindowWrapper.java  |   2 +
 runners/google-cloud-dataflow-java/pom.xml  |  10 +
 .../runners/spark/translation/DoFnFunction.java |  23 +-
 .../spark/translation/MultiDoFnFunction.java|   1 +
 .../spark/translation/SparkProcessContext.java  |  17 +
 .../org/apache/beam/sdk/transforms/DoFn.java|  31 +-
 .../beam/sdk/transforms/DoFnReflector.java  |  70 ++-
 .../org/apache/beam/sdk/transforms/OldDoFn.java |  25 ++
 .../org/apache/beam/sdk/transforms/ParDo.java   |  15 +-
 .../beam/sdk/transforms/DoFnReflectorTest.java  |  65 +++
 .../beam/sdk/transforms/ParDoLifecycleTest.java | 448 +++
 .../apache/beam/sdk/transforms/ParDoTest.java   |  15 +-
 37 files changed, 1553 insertions(+), 412 deletions(-)
--




[4/5] incubator-beam git commit: Move ParDo Lifecycle tests to their own file

2016-08-15 Thread dhalperi
Move ParDo Lifecycle tests to their own file

These tests are not yet functional in all runners, and this makes them
easier to ignore.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/29cbdceb
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/29cbdceb
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/29cbdceb

Branch: refs/heads/master
Commit: 29cbdceb5b78ce86ad0d90050d7542b0d5b45362
Parents: 12abb1b
Author: Thomas Groh 
Authored: Thu Aug 11 10:45:43 2016 -0700
Committer: Dan Halperin 
Committed: Mon Aug 15 14:16:54 2016 -0700

--
 runners/google-cloud-dataflow-java/pom.xml  |  10 +
 .../beam/sdk/transforms/ParDoLifecycleTest.java | 448 +++
 .../apache/beam/sdk/transforms/ParDoTest.java   | 405 -
 3 files changed, 458 insertions(+), 405 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/29cbdceb/runners/google-cloud-dataflow-java/pom.xml
--
diff --git a/runners/google-cloud-dataflow-java/pom.xml 
b/runners/google-cloud-dataflow-java/pom.xml
index 86991b7..c32e184 100644
--- a/runners/google-cloud-dataflow-java/pom.xml
+++ b/runners/google-cloud-dataflow-java/pom.xml
@@ -60,6 +60,16 @@
 true
   
 
+
+  
+runnable-on-service-tests
+
+  
+
org/apache/beam/sdk/transforms/ParDoLifecycleTest.java
+  
+
+  
+
   
 
   

[2/5] incubator-beam git commit: Add DoFn @Setup and @Teardown

2016-08-15 Thread dhalperi
Add DoFn @Setup and @Teardown

Methods annotated with these annotations are used to perform expensive
setup work and clean up a DoFn after another method throws an exception
or the DoFn is discarded.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/12abb1b0
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/12abb1b0
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/12abb1b0

Branch: refs/heads/master
Commit: 12abb1b02246b8d36021c7b1a970daf1b64ba4b9
Parents: cf0bf3b
Author: Thomas Groh 
Authored: Thu Jul 14 14:51:02 2016 -0700
Committer: Dan Halperin 
Committed: Mon Aug 15 14:16:54 2016 -0700

--
 .../runners/direct/DoFnLifecycleManager.java|  38 +-
 ...ecycleManagerRemovingTransformEvaluator.java |  39 +-
 .../runners/direct/DoFnLifecycleManagers.java   |  45 ++
 .../direct/ParDoMultiEvaluatorFactory.java  |   4 +-
 .../direct/ParDoSingleEvaluatorFactory.java |   4 +-
 .../direct/DoFnLifecycleManagerTest.java|  49 +++
 .../direct/DoFnLifecycleManagersTest.java   | 142 +++
 .../functions/FlinkDoFnFunction.java|  12 +-
 .../functions/FlinkMultiOutputDoFnFunction.java |  31 +-
 .../streaming/FlinkAbstractParDoWrapper.java|   2 +
 .../FlinkGroupAlsoByWindowWrapper.java  |   2 +
 .../runners/spark/translation/DoFnFunction.java |  23 +-
 .../spark/translation/MultiDoFnFunction.java|   1 +
 .../spark/translation/SparkProcessContext.java  |  17 +
 .../org/apache/beam/sdk/transforms/DoFn.java|  31 +-
 .../beam/sdk/transforms/DoFnReflector.java  |  70 +++-
 .../org/apache/beam/sdk/transforms/OldDoFn.java |  25 ++
 .../org/apache/beam/sdk/transforms/ParDo.java   |  15 +-
 .../beam/sdk/transforms/DoFnReflectorTest.java  |  65 +++
 .../apache/beam/sdk/transforms/ParDoTest.java   | 420 ++-
 20 files changed, 970 insertions(+), 65 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/12abb1b0/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DoFnLifecycleManager.java
--
diff --git 
a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DoFnLifecycleManager.java
 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DoFnLifecycleManager.java
index 2783657..3f4f2c6 100644
--- 
a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DoFnLifecycleManager.java
+++ 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DoFnLifecycleManager.java
@@ -18,6 +18,7 @@
 
 package org.apache.beam.runners.direct;
 
+import org.apache.beam.sdk.runners.PipelineRunner;
 import org.apache.beam.sdk.transforms.DoFn;
 import org.apache.beam.sdk.transforms.OldDoFn;
 import org.apache.beam.sdk.util.SerializableUtils;
@@ -26,6 +27,13 @@ import com.google.common.cache.CacheBuilder;
 import com.google.common.cache.CacheLoader;
 import com.google.common.cache.LoadingCache;
 
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.Iterator;
+
 /**
  * Manages {@link DoFn} setup, teardown, and serialization.
  *
@@ -35,6 +43,8 @@ import com.google.common.cache.LoadingCache;
  * {@link DoFn DoFns}.
  */
 class DoFnLifecycleManager {
+  private static final Logger LOG = 
LoggerFactory.getLogger(DoFnLifecycleManager.class);
+
   public static DoFnLifecycleManager of(OldDoFn original) {
 return new DoFnLifecycleManager(original);
   }
@@ -52,14 +62,30 @@ class DoFnLifecycleManager {
 
   public void remove() throws Exception {
 Thread currentThread = Thread.currentThread();
-outstanding.invalidate(currentThread);
+OldDoFn fn = outstanding.asMap().remove(currentThread);
+fn.teardown();
   }
 
   /**
-   * Remove all {@link DoFn DoFns} from this {@link DoFnLifecycleManager}.
+   * Remove all {@link DoFn DoFns} from this {@link DoFnLifecycleManager}. 
Returns all exceptions
+   * that were thrown while calling the remove methods.
+   *
+   * If the returned Collection is nonempty, an exception was thrown from 
at least one
+   * {@link DoFn#teardown()} method, and the {@link PipelineRunner} should 
throw an exception.
*/
-  public void removeAll() throws Exception {
-outstanding.invalidateAll();
+  public Collection removeAll() throws Exception {
+Iterator> fns = outstanding.asMap().values().iterator();
+Collection thrown = new ArrayList<>();
+while (fns.hasNext()) {
+  OldDoFn fn = fns.next();
+  fns.remove();
+  try {
+fn.teardown();
+  } catch (Exception e) {
+thrown.add(e);
+  }
+}
+return thrown;
   }
 
   private class DeserializingCacheLoader extends CacheLoader> {
@@ -71,8 +97,10 @@ class DoFnLifecycleMan

[3/5] incubator-beam git commit: Add TransformEvaluatorFactory#cleanup

2016-08-15 Thread dhalperi
Add TransformEvaluatorFactory#cleanup

This cleans up any state stored within the Transform Evaluator Factory.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/12b19677
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/12b19677
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/12b19677

Branch: refs/heads/master
Commit: 12b19677280c11b0dca203ef266769b05c90937e
Parents: 0b1f664
Author: Thomas Groh 
Authored: Fri Jul 15 11:27:00 2016 -0700
Committer: Dan Halperin 
Committed: Mon Aug 15 14:16:54 2016 -0700

--
 .../direct/BoundedReadEvaluatorFactory.java |  4 ++
 .../direct/ExecutorServiceParallelExecutor.java |  9 -
 .../runners/direct/FlattenEvaluatorFactory.java |  3 ++
 .../GroupAlsoByWindowEvaluatorFactory.java  |  6 ++-
 .../direct/GroupByKeyOnlyEvaluatorFactory.java  |  4 +-
 .../direct/ParDoMultiEvaluatorFactory.java  |  5 +++
 .../direct/ParDoSingleEvaluatorFactory.java |  5 +++
 .../direct/TransformEvaluatorFactory.java   |  8 
 .../direct/TransformEvaluatorRegistry.java  | 41 
 .../direct/UnboundedReadEvaluatorFactory.java   |  3 ++
 .../runners/direct/ViewEvaluatorFactory.java|  3 ++
 .../runners/direct/WindowEvaluatorFactory.java  |  3 ++
 12 files changed, 90 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/12b19677/runners/direct-java/src/main/java/org/apache/beam/runners/direct/BoundedReadEvaluatorFactory.java
--
diff --git 
a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/BoundedReadEvaluatorFactory.java
 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/BoundedReadEvaluatorFactory.java
index 2f4f86c..0c4b7fd 100644
--- 
a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/BoundedReadEvaluatorFactory.java
+++ 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/BoundedReadEvaluatorFactory.java
@@ -60,6 +60,10 @@ final class BoundedReadEvaluatorFactory implements 
TransformEvaluatorFactory {
 return getTransformEvaluator((AppliedPTransform) application, 
evaluationContext);
   }
 
+  @Override
+  public void cleanup() {
+  }
+
   /**
* Get a {@link TransformEvaluator} that produces elements for the provided 
application of
* {@link Bounded Read.Bounded}, initializing the queue of evaluators if 
required.

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/12b19677/runners/direct-java/src/main/java/org/apache/beam/runners/direct/ExecutorServiceParallelExecutor.java
--
diff --git 
a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/ExecutorServiceParallelExecutor.java
 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/ExecutorServiceParallelExecutor.java
index a0a5ec0..8c6c6ed 100644
--- 
a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/ExecutorServiceParallelExecutor.java
+++ 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/ExecutorServiceParallelExecutor.java
@@ -447,13 +447,18 @@ final class ExecutorServiceParallelExecutor implements 
PipelineExecutor {
 private boolean shouldShutdown() {
   boolean shouldShutdown = exceptionThrown || evaluationContext.isDone();
   if (shouldShutdown) {
+LOG.debug("Pipeline has terminated. Shutting down.");
+executorService.shutdown();
+try {
+  registry.cleanup();
+} catch (Exception e) {
+  visibleUpdates.add(VisibleExecutorUpdate.fromThrowable(e));
+}
 if (evaluationContext.isDone()) {
-  LOG.debug("Pipeline is finished. Shutting down. {}");
   while (!visibleUpdates.offer(VisibleExecutorUpdate.finished())) {
 visibleUpdates.poll();
   }
 }
-executorService.shutdown();
   }
   return shouldShutdown;
 }

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/12b19677/runners/direct-java/src/main/java/org/apache/beam/runners/direct/FlattenEvaluatorFactory.java
--
diff --git 
a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/FlattenEvaluatorFactory.java
 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/FlattenEvaluatorFactory.java
index c84f620..5a0d31d 100644
--- 
a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/FlattenEvaluatorFactory.java
+++ 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/FlattenEvaluatorFactory.java
@@ -43,6 +43,9 @@ class FlattenEvaluatorFactory implements 
TransformEvaluatorFactory {
 return evaluator;
   }
 
+  @Ov

[GitHub] incubator-beam pull request #690: [BEAM-452] Add DoFn setup and teardown met...

2016-08-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/690


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[5/5] incubator-beam git commit: Replace CloningThreadLocal with DoFnLifecycleManager

2016-08-15 Thread dhalperi
Replace CloningThreadLocal with DoFnLifecycleManager

This is a more focused interface that interacts with a DoFn before it
is available for use and after it has completed and the reference is
lost. It is required to properly support setup and teardown, as the
fields in a ThreadLocal cannot all be cleaned up without additional
tracking.

Part of BEAM-452.


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/cf0bf3bf
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/cf0bf3bf
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/cf0bf3bf

Branch: refs/heads/master
Commit: cf0bf3bf9fcab2b01d69ff90d9ba3f602a8a5bd4
Parents: 12b1967
Author: Thomas Groh 
Authored: Tue Jul 19 11:03:15 2016 -0700
Committer: Dan Halperin 
Committed: Mon Aug 15 14:16:54 2016 -0700

--
 .../beam/runners/direct/CloningThreadLocal.java |  43 --
 .../runners/direct/DoFnLifecycleManager.java|  78 ++
 ...ecycleManagerRemovingTransformEvaluator.java |  80 +++
 .../direct/ParDoMultiEvaluatorFactory.java  |  56 
 .../direct/ParDoSingleEvaluatorFactory.java |  43 +++---
 ...readLocalInvalidatingTransformEvaluator.java |  63 
 .../runners/direct/CloningThreadLocalTest.java  |  92 
 ...leManagerRemovingTransformEvaluatorTest.java | 144 +++
 .../direct/DoFnLifecycleManagerTest.java| 119 +++
 ...LocalInvalidatingTransformEvaluatorTest.java | 135 -
 10 files changed, 475 insertions(+), 378 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/cf0bf3bf/runners/direct-java/src/main/java/org/apache/beam/runners/direct/CloningThreadLocal.java
--
diff --git 
a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/CloningThreadLocal.java
 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/CloningThreadLocal.java
deleted file mode 100644
index b9dc4ca..000
--- 
a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/CloningThreadLocal.java
+++ /dev/null
@@ -1,43 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.beam.runners.direct;
-
-import org.apache.beam.sdk.util.SerializableUtils;
-
-import java.io.Serializable;
-
-/**
- * A {@link ThreadLocal} that obtains the initial value by cloning an original 
value.
- */
-class CloningThreadLocal extends ThreadLocal {
-  public static  CloningThreadLocal of(T original) {
-return new CloningThreadLocal<>(original);
-  }
-
-  private final T original;
-
-  private CloningThreadLocal(T original) {
-this.original = original;
-  }
-
-  @Override
-  public T initialValue() {
-return SerializableUtils.clone(original);
-  }
-}

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/cf0bf3bf/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DoFnLifecycleManager.java
--
diff --git 
a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DoFnLifecycleManager.java
 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DoFnLifecycleManager.java
new file mode 100644
index 000..2783657
--- /dev/null
+++ 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DoFnLifecycleManager.java
@@ -0,0 +1,78 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BA

[jira] [Created] (BEAM-556) typo in documentation

2016-08-15 Thread Frank Yellin (JIRA)
Frank Yellin created BEAM-556:
-

 Summary: typo in documentation
 Key: BEAM-556
 URL: https://issues.apache.org/jira/browse/BEAM-556
 Project: Beam
  Issue Type: Bug
  Components: sdk-py
Reporter: Frank Yellin
Assignee: Frances Perry
Priority: Trivial


transform.py:

ergument -> argument  

in documentation for parse_label_and_args



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-555) Documentation in BiqQueryIO.java has awkward cut-and-paste error.

2016-08-15 Thread Frank Yellin (JIRA)
Frank Yellin created BEAM-555:
-

 Summary: Documentation in BiqQueryIO.java has awkward 
cut-and-paste error.
 Key: BEAM-555
 URL: https://issues.apache.org/jira/browse/BEAM-555
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-core
Reporter: Frank Yellin
Assignee: Davor Bonaci
Priority: Trivial


Twice in the documentation, the sample code reads from samples.weather_stations 
and called the resulting TableRow "shakespeare".

I suspect that these lines of code were copied from a different example, and 
then only partially modified.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-554) Dataflow runner to support bounded writes in streaming mode.

2016-08-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421523#comment-15421523
 ] 

ASF GitHub Bot commented on BEAM-554:
-

GitHub user peihe opened a pull request:

https://github.com/apache/incubator-beam/pull/828

[BEAM-554] Set Gcs upload buffer size to 1M in DataflowRunner streaming mode

This PR solves OOMs when DataflowRunner streaming mode writes hundreds 
files in parallel.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/peihe/incubator-beam gcs-buffer-size

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/828.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #828


commit ee88e16d74d3e8e4ea566243769ae9ecc587c47c
Author: Pei He 
Date:   2016-08-15T19:22:11Z

Set Gcs upload buffer size to 1M in streaming mode in DataflowRunner




> Dataflow runner to support bounded writes in streaming mode.
> 
>
> Key: BEAM-554
> URL: https://issues.apache.org/jira/browse/BEAM-554
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Pei He
>Assignee: Pei He
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #828: [BEAM-554] Set Gcs upload buffer size to 1...

2016-08-15 Thread peihe
GitHub user peihe opened a pull request:

https://github.com/apache/incubator-beam/pull/828

[BEAM-554] Set Gcs upload buffer size to 1M in DataflowRunner streaming mode

This PR solves OOMs when DataflowRunner streaming mode writes hundreds 
files in parallel.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/peihe/incubator-beam gcs-buffer-size

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/828.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #828


commit ee88e16d74d3e8e4ea566243769ae9ecc587c47c
Author: Pei He 
Date:   2016-08-15T19:22:11Z

Set Gcs upload buffer size to 1M in streaming mode in DataflowRunner




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Updated] (BEAM-554) Dataflow runner to support bounded writes in streaming mode.

2016-08-15 Thread Pei He (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pei He updated BEAM-554:

Assignee: Pei He  (was: Davor Bonaci)
 Summary: Dataflow runner to support bounded writes in streaming mode.  
(was: Dataflow runner to support bounded in streaming mode.)

> Dataflow runner to support bounded writes in streaming mode.
> 
>
> Key: BEAM-554
> URL: https://issues.apache.org/jira/browse/BEAM-554
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Pei He
>Assignee: Pei He
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-554) Dataflow runner to support bounded in streaming mode.

2016-08-15 Thread Pei He (JIRA)
Pei He created BEAM-554:
---

 Summary: Dataflow runner to support bounded in streaming mode.
 Key: BEAM-554
 URL: https://issues.apache.org/jira/browse/BEAM-554
 Project: Beam
  Issue Type: Improvement
  Components: runner-dataflow
Reporter: Pei He
Assignee: Davor Bonaci






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (BEAM-522) Update FileSink.finalize_write() to be idempotent

2016-08-15 Thread Chamikara Jayalath (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath resolved BEAM-522.
-
   Resolution: Fixed
Fix Version/s: Not applicable

> Update FileSink.finalize_write() to be idempotent
> -
>
> Key: BEAM-522
> URL: https://issues.apache.org/jira/browse/BEAM-522
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
> Fix For: Not applicable
>
>
> Currently FileSink.finelize_write() in fileio.py [1] performs following 
> operations.
> (1) Obtains a list of temporary files as a side input
> (2) Renames each temporary file to the location where final output should be 
> stored.
> iobase.Sink.finalize_write() operation should be idempotent since runner 
> implementations may call this operation multiple times due to task failures. 
> Current implementation is not idempotent because if we re-run the operation 
> after renaming a sub-set of files, the operations may fail due to not being 
> able to find some files at source location (for example, [2] for GCS files).
> We can fix this by checking if the destination file is already available 
> before performing the rename and not performing the rename for files that are 
> already available at the destination.
> [1] 
> https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/io/fileio.py#L503
> [2] 
> https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/io/gcsio.py#L187
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-553) Add a custom text source

2016-08-15 Thread Chamikara Jayalath (JIRA)
Chamikara Jayalath created BEAM-553:
---

 Summary: Add a custom text source
 Key: BEAM-553
 URL: https://issues.apache.org/jira/browse/BEAM-553
 Project: Beam
  Issue Type: New Feature
  Components: sdk-py
Reporter: Chamikara Jayalath
Assignee: Chamikara Jayalath


Currently, the text source implementation available for Python SDK [1] is a 
Dataflow native source which only works efficiently for Dataflow runner. We 
should add a custom text source on top of custom file-based source framework 
[2] so that other runner implementations can potentially use the same text 
source implementation.

Custom text source implementation for Java SDK is at [3].

[1] 
https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/io/fileio.py#L70
[2] 
https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/io/filebasedsource.py
[3] 
https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/TextIO.java#L745



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (BEAM-552) TimeZone.getTimeZone has performance issue.

2016-08-15 Thread Satish Subhashrao Saley (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Satish Subhashrao Saley closed BEAM-552.

   Resolution: Invalid
Fix Version/s: Not applicable

Sorry, BEAM got auto selected as project. 

> TimeZone.getTimeZone has performance issue.
> ---
>
> Key: BEAM-552
> URL: https://issues.apache.org/jira/browse/BEAM-552
> Project: Beam
>  Issue Type: Bug
>Reporter: Satish Subhashrao Saley
> Fix For: Not applicable
>
>
> While preparing Json response, we calculate the time zone for time related 
> fields in response. Most of times user will not pass timezone. We can easily 
> avoid TimeZone.getTimeZone for GMT timezone. 
> {code}
> <0x0005c00876d8> (a java.lang.Class for java.util.TimeZone): 0 Thread(s) 
> sleeping, 28 Thread(s) waiting, 1 Thread(s) locking
> java.lang.Thread.State: BLOCKED (on object monitor)
>   at java.util.TimeZone.getTimeZone(TimeZone.java:516)
>   - locked <0x0005c00876d8> (a java.lang.Class for java.util.TimeZone)
>   at 
> org.apache.oozie.client.rest.JsonUtils.formatDateRfc822(JsonUtils.java:47)
>   at 
> org.apache.oozie.CoordinatorJobBean.toJSONObject(CoordinatorJobBean.java:1119)
>   at 
> org.apache.oozie.CoordinatorJobBean.toJSONArray(CoordinatorJobBean.java:1061)
>   at 
> org.apache.oozie.servlet.V1JobsServlet.getCoordinatorJobs(V1JobsServlet.java:359)
>   at 
> org.apache.oozie.servlet.V1JobsServlet.getJobs(V1JobsServlet.java:156)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-552) TimeZone.getTimeZone has performance issue.

2016-08-15 Thread Satish Subhashrao Saley (JIRA)
Satish Subhashrao Saley created BEAM-552:


 Summary: TimeZone.getTimeZone has performance issue.
 Key: BEAM-552
 URL: https://issues.apache.org/jira/browse/BEAM-552
 Project: Beam
  Issue Type: Bug
Reporter: Satish Subhashrao Saley


While preparing Json response, we calculate the time zone for time related 
fields in response. Most of times user will not pass timezone. We can easily 
avoid TimeZone.getTimeZone for GMT timezone. 
{code}
<0x0005c00876d8> (a java.lang.Class for java.util.TimeZone): 0 Thread(s) 
sleeping, 28 Thread(s) waiting, 1 Thread(s) locking

java.lang.Thread.State: BLOCKED (on object monitor)
at java.util.TimeZone.getTimeZone(TimeZone.java:516)
- locked <0x0005c00876d8> (a java.lang.Class for java.util.TimeZone)
at 
org.apache.oozie.client.rest.JsonUtils.formatDateRfc822(JsonUtils.java:47)
at 
org.apache.oozie.CoordinatorJobBean.toJSONObject(CoordinatorJobBean.java:1119)
at 
org.apache.oozie.CoordinatorJobBean.toJSONArray(CoordinatorJobBean.java:1061)
at 
org.apache.oozie.servlet.V1JobsServlet.getCoordinatorJobs(V1JobsServlet.java:359)
at 
org.apache.oozie.servlet.V1JobsServlet.getJobs(V1JobsServlet.java:156)

{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-435) DirectRunner GBK -- task per key?

2016-08-15 Thread Thomas Groh (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421262#comment-15421262
 ] 

Thomas Groh commented on BEAM-435:
--

This is resolved w.r.t. Write with 
https://github.com/apache/incubator-beam/pull/651

Otherwise, internal sharding is in general an implementation detail of the 
runner

> DirectRunner GBK -- task per key?
> -
>
> Key: BEAM-435
> URL: https://issues.apache.org/jira/browse/BEAM-435
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 0.1.0-incubating, 0.2.0-incubating
>Reporter: Daniel Halperin
>Assignee: Thomas Groh
>
> See [BEAM-434] -- is the direct runner producing a bundle per-key in the GBK 
> output?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-551) Support Dynamic PipelineOptions

2016-08-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15421067#comment-15421067
 ] 

ASF GitHub Bot commented on BEAM-551:
-

GitHub user sammcveety opened a pull request:

https://github.com/apache/incubator-beam/pull/827

[BEAM-551] Add ValueProvider class

Hi @lukecwik , can you please take a look?

Currently implemented for ValueProvider only.  Tests demonstrate 
intended functionality, once the questions below are answered:

- What's the best way to avoid threading issues with the static options in 
test?
- Is it reasonable to continue down the road of special-casing 
getter/setter validation for this class?
- What would be the preferred way to annotate T for ValueProvider to 
work around type erasure?

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sammcveety/incubator-beam sgmc/valueprovider

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/827.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #827


commit 867a080b98f585db2687d9aaf7d42de8c813
Author: sammcveety 
Date:   2016-08-14T00:45:58Z

Merge pull request #1 from apache/master

Update my fork.

commit bb7aea8db07b3a7ae988f75888716dd865e07e08
Author: sammcveety 
Date:   2016-08-14T00:48:18Z

Create ValueProvider.java

commit 5330800d9d5b54b897c53cb8ea40c6331fe3c36a
Author: sammcveety 
Date:   2016-08-14T00:48:54Z

Update ValueProvider.java

commit 9b9b17c2338612b4030d1c4a36f8fb4c5c179c2e
Author: sammcveety 
Date:   2016-08-14T00:50:56Z

Update ProxyInvocationHandler.java

commit b54832bd0c28691d1e6087b4901fb6ae11ebd2c9
Author: sammcveety 
Date:   2016-08-14T00:54:33Z

Create ValueProviderTest.java

commit e2d16a7ad66676a099d4160ecd4c31c369acd46e
Author: sammcveety 
Date:   2016-08-14T00:56:41Z

Update ValueProviderTest.java

commit afbe8157f711fca00ed37afbe461edd4ac78e30f
Author: sammcveety 
Date:   2016-08-14T16:34:23Z

Update ProxyInvocationHandler.java

commit e111045ae1125ed49ea1f26cdde31b3c1583ed2d
Author: sammcveety 
Date:   2016-08-14T16:58:29Z

Update ValueProviderTest.java

commit 0bff463a71c4f4b3ad13edd924fd03226a915a61
Author: sammcveety 
Date:   2016-08-14T20:11:49Z

Add tests.

commit c4f3acc427b828f0f0254f83e84690b14c0a0f98
Author: sammcveety 
Date:   2016-08-14T21:42:56Z

Iterations on tests.

commit f19435b688fed41d873b85ac7750d923025256a9
Author: sammcveety 
Date:   2016-08-15T01:32:37Z

Proposal for how to handle getters/setters.




> Support Dynamic PipelineOptions
> ---
>
> Key: BEAM-551
> URL: https://issues.apache.org/jira/browse/BEAM-551
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Sam McVeety
>Assignee: Frances Perry
>Priority: Minor
>
> During the graph construction phase, the given SDK generates an initial
> execution graph for the program.  At execution time, this graph is
> executed, either locally or by a service.  Currently, Beam only supports
> parameterization at graph construction time.  Both Flink and Spark supply
> functionality that allows a pre-compiled job to be run without SDK
> interaction with updated runtime parameters.
> In its current incarnation, Dataflow can read values of PipelineOptions at
> job submission time, but this requires the presence of an SDK to properly
> encode these values into the job.  We would like to build a common layer
> into the Beam model so that these dynamic options can be properly provided
> to jobs.
> Please see
> https://docs.google.com/document/d/1I-iIgWDYasb7ZmXbGBHdok_IK1r1YAJ90JG5Fz0_28o/edit
> for the high-level model, and
> https://docs.google.com/document/d/17I7HeNQmiIfOJi0aI70tgGMMkOSgGi8ZUH-MOnFatZ8/edit
> for
> the specific API proposal.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request #827: [BEAM-551] Add ValueProvider class

2016-08-15 Thread sammcveety
GitHub user sammcveety opened a pull request:

https://github.com/apache/incubator-beam/pull/827

[BEAM-551] Add ValueProvider class

Hi @lukecwik , can you please take a look?

Currently implemented for ValueProvider only.  Tests demonstrate 
intended functionality, once the questions below are answered:

- What's the best way to avoid threading issues with the static options in 
test?
- Is it reasonable to continue down the road of special-casing 
getter/setter validation for this class?
- What would be the preferred way to annotate T for ValueProvider to 
work around type erasure?

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sammcveety/incubator-beam sgmc/valueprovider

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/827.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #827


commit 867a080b98f585db2687d9aaf7d42de8c813
Author: sammcveety 
Date:   2016-08-14T00:45:58Z

Merge pull request #1 from apache/master

Update my fork.

commit bb7aea8db07b3a7ae988f75888716dd865e07e08
Author: sammcveety 
Date:   2016-08-14T00:48:18Z

Create ValueProvider.java

commit 5330800d9d5b54b897c53cb8ea40c6331fe3c36a
Author: sammcveety 
Date:   2016-08-14T00:48:54Z

Update ValueProvider.java

commit 9b9b17c2338612b4030d1c4a36f8fb4c5c179c2e
Author: sammcveety 
Date:   2016-08-14T00:50:56Z

Update ProxyInvocationHandler.java

commit b54832bd0c28691d1e6087b4901fb6ae11ebd2c9
Author: sammcveety 
Date:   2016-08-14T00:54:33Z

Create ValueProviderTest.java

commit e2d16a7ad66676a099d4160ecd4c31c369acd46e
Author: sammcveety 
Date:   2016-08-14T00:56:41Z

Update ValueProviderTest.java

commit afbe8157f711fca00ed37afbe461edd4ac78e30f
Author: sammcveety 
Date:   2016-08-14T16:34:23Z

Update ProxyInvocationHandler.java

commit e111045ae1125ed49ea1f26cdde31b3c1583ed2d
Author: sammcveety 
Date:   2016-08-14T16:58:29Z

Update ValueProviderTest.java

commit 0bff463a71c4f4b3ad13edd924fd03226a915a61
Author: sammcveety 
Date:   2016-08-14T20:11:49Z

Add tests.

commit c4f3acc427b828f0f0254f83e84690b14c0a0f98
Author: sammcveety 
Date:   2016-08-14T21:42:56Z

Iterations on tests.

commit f19435b688fed41d873b85ac7750d923025256a9
Author: sammcveety 
Date:   2016-08-15T01:32:37Z

Proposal for how to handle getters/setters.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (BEAM-551) Support Dynamic PipelineOptions

2016-08-15 Thread Sam McVeety (JIRA)
Sam McVeety created BEAM-551:


 Summary: Support Dynamic PipelineOptions
 Key: BEAM-551
 URL: https://issues.apache.org/jira/browse/BEAM-551
 Project: Beam
  Issue Type: New Feature
  Components: beam-model
Reporter: Sam McVeety
Assignee: Frances Perry
Priority: Minor


During the graph construction phase, the given SDK generates an initial
execution graph for the program.  At execution time, this graph is
executed, either locally or by a service.  Currently, Beam only supports
parameterization at graph construction time.  Both Flink and Spark supply
functionality that allows a pre-compiled job to be run without SDK
interaction with updated runtime parameters.

In its current incarnation, Dataflow can read values of PipelineOptions at
job submission time, but this requires the presence of an SDK to properly
encode these values into the job.  We would like to build a common layer
into the Beam model so that these dynamic options can be properly provided
to jobs.

Please see
https://docs.google.com/document/d/1I-iIgWDYasb7ZmXbGBHdok_IK1r1YAJ90JG5Fz0_28o/edit
for the high-level model, and
https://docs.google.com/document/d/17I7HeNQmiIfOJi0aI70tgGMMkOSgGi8ZUH-MOnFatZ8/edit
for
the specific API proposal.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (BEAM-492) Spark runner should call Pipeline.run() instead of SparkRunner.run()

2016-08-15 Thread Amit Sela (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Sela closed BEAM-492.
--
Resolution: Fixed

> Spark runner should call Pipeline.run() instead of SparkRunner.run()
> 
>
> Key: BEAM-492
> URL: https://issues.apache.org/jira/browse/BEAM-492
> Project: Beam
>  Issue Type: Bug
>Affects Versions: 0.1.0-incubating
>Reporter: Amit Sela
>Assignee: Amit Sela
> Fix For: 0.3.0-incubating
>
>
> Currently, in some places (streaming examples, etc.), the Pipeline executes 
> by calling SparkRunner.run() instead of calling Pipeline.run() which is kind 
> of "backwards" to the Beam API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (BEAM-492) Spark runner should call Pipeline.run() instead of SparkRunner.run()

2016-08-15 Thread Amit Sela (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Sela updated BEAM-492:
---
Fix Version/s: 0.3.0-incubating

> Spark runner should call Pipeline.run() instead of SparkRunner.run()
> 
>
> Key: BEAM-492
> URL: https://issues.apache.org/jira/browse/BEAM-492
> Project: Beam
>  Issue Type: Bug
>Affects Versions: 0.1.0-incubating
>Reporter: Amit Sela
>Assignee: Amit Sela
> Fix For: 0.3.0-incubating
>
>
> Currently, in some places (streaming examples, etc.), the Pipeline executes 
> by calling SparkRunner.run() instead of calling Pipeline.run() which is kind 
> of "backwards" to the Beam API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)