[jira] [Closed] (BEAM-305) In Spark runner tests - When using Create.of use it's #withCoder method instead of the created PCollection's #setCoder

2016-05-25 Thread Amit Sela (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Sela closed BEAM-305.
--
Resolution: Fixed

> In Spark runner tests - When using Create.of use it's #withCoder method 
> instead of the created PCollection's #setCoder
> --
>
> Key: BEAM-305
> URL: https://issues.apache.org/jira/browse/BEAM-305
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Amit Sela
>Priority: Minor
>
> See [~kenn] comment here:
> https://github.com/apache/incubator-beam/pull/179#discussion_r60171526



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-22) DirectPipelineRunner: support for unbounded collections

2016-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-22?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15301256#comment-15301256
 ] 

ASF GitHub Bot commented on BEAM-22:


GitHub user tgroh opened a pull request:

https://github.com/apache/incubator-beam/pull/392

[BEAM-22][BEAM-243] Execute NeedsRunner tests in the DirectRunner module

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---
This ensures that all tests that depend on a pipeline are executed.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/incubator-beam 
needs_runner_test_executions

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/392.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #392


commit b99e6ccd6c45c38ef8e14befb627dec71286b69e
Author: Thomas Groh 
Date:   2016-05-20T22:18:11Z

Execute NeedsRunner tests in the Direct Runner

commit 89ca5bb20394541dfbf0b1bbacef97f614657987
Author: Thomas Groh 
Date:   2016-05-26T00:45:26Z

Declare Dependencies on the Direct Runner

Changing the default Runner requires a runner to be available for
existing modules. Declare a dependency as appropriate.




> DirectPipelineRunner: support for unbounded collections
> ---
>
> Key: BEAM-22
> URL: https://issues.apache.org/jira/browse/BEAM-22
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-direct
>Reporter: Davor Bonaci
>Assignee: Thomas Groh
>
> DirectPipelineRunner currently runs over bounded PCollections only, and 
> implements only a portion of the Beam Model.
> We should improve it to faithfully implement the full Beam Model, such as add 
> ability to run over unbounded PCollections, and better resemble execution 
> model in a distributed system.
> This further enables features such as a testing source which may simulate 
> late data and test triggers in the pipeline. Finally, we may want to expose 
> an option to select between "debug" (single threaded), "chaos monkey" (test 
> as many model requirements as possible), and "performance" (multi-threaded).
> more testing (chaos monkey) 
> Once this is done, we should update this StackOverflow question:
> http://stackoverflow.com/questions/35350113/testing-triggers-with-processing-time/35401426#35401426



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (BEAM-117) Implement the API for Static Display Metadata

2016-05-25 Thread Ben Chambers (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Chambers resolved BEAM-117.
---
   Resolution: Fixed
Fix Version/s: 0.1.0-incubating

> Implement the API for Static Display Metadata
> -
>
> Key: BEAM-117
> URL: https://issues.apache.org/jira/browse/BEAM-117
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Ben Chambers
>Assignee: Scott Wegner
> Fix For: 0.1.0-incubating
>
>
> As described in the following doc, we would like the SDK to allow associating 
> display metadata with PTransforms.
> https://docs.google.com/document/d/11enEB9JwVp6vO0uOYYTMYTGkr3TdNfELwWqoiUg5ZxM/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Jenkins build is still unstable: beam_PostCommit_RunnableOnService_GoogleCloudDataflow #418

2016-05-25 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-122) GlobalWindow and allowedLateness can cause inconsistent timer interpretation

2016-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15300969#comment-15300969
 ] 

ASF GitHub Bot commented on BEAM-122:
-

GitHub user mshields822 opened a pull request:

https://github.com/apache/incubator-beam/pull/391

[BEAM-122][BEAM-311] Add GC hold if have data. Don't set timers beyond 
GlobalWindow.maxTim…

R: @kennknowles 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mshields822/incubator-beam beam-122

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/391.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #391


commit aed482670058ffc64ed37243a6ef5c4a21b2bd7c
Author: Mark Shields 
Date:   2016-05-24T20:52:41Z

Add GC hold if have data. Don't set timers beyond GlobalWindow.maxTimestamp




> GlobalWindow and allowedLateness can cause inconsistent timer interpretation 
> -
>
> Key: BEAM-122
> URL: https://issues.apache.org/jira/browse/BEAM-122
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Mark Shields
>Assignee: Mark Shields
>
> In ReduceFnRunner we have code such as
>window.getMaxTimestamp().plus(windowingStrategy.getAllowedLateness())
> If window is global then maxTimestamp will be 
> BoundedWindow.TIMESTAMP_MAX_VALUE.
> Meanwhile, timestamps beyond BoundedWindow.TIMESTAMP_MAX_VALUE will be 
> clipped in most runners.
> This could cause the time of an expected timer (eg for garbage collection) to 
> not match the actual time of a fired timer.
> We should either make non-zero allowedLateness on the Global window illegal 
> (probably obnoxious) or explicitly clip it to zero.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[2/2] incubator-beam git commit: This closes #388

2016-05-25 Thread kenn
This closes #388


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/d4c052c3
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/d4c052c3
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/d4c052c3

Branch: refs/heads/master
Commit: d4c052c32be8d5635b32fcdc214724a399b09c55
Parents: bde2a85 521bfff
Author: Kenneth Knowles 
Authored: Wed May 25 14:52:07 2016 -0700
Committer: Kenneth Knowles 
Committed: Wed May 25 14:52:07 2016 -0700

--
 .../java/org/apache/beam/sdk/io/PubsubUnboundedSource.java | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)
--




[GitHub] incubator-beam pull request: [BEAM-122][BEAM-311] Add GC hold if h...

2016-05-25 Thread mshields822
GitHub user mshields822 opened a pull request:

https://github.com/apache/incubator-beam/pull/391

[BEAM-122][BEAM-311] Add GC hold if have data. Don't set timers beyond 
GlobalWindow.maxTim…

R: @kennknowles 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mshields822/incubator-beam beam-122

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/391.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #391


commit aed482670058ffc64ed37243a6ef5c4a21b2bd7c
Author: Mark Shields 
Date:   2016-05-24T20:52:41Z

Add GC hold if have data. Don't set timers beyond GlobalWindow.maxTimestamp




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Closed] (BEAM-248) Register DisplayData from anonymous implementation PTransforms

2016-05-25 Thread Scott Wegner (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Wegner closed BEAM-248.
-
Resolution: Fixed

Done

> Register DisplayData from anonymous implementation PTransforms
> --
>
> Key: BEAM-248
> URL: https://issues.apache.org/jira/browse/BEAM-248
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>
> Most SDK PTransforms are implemented in terms of lower-level PTransforms, 
> often with anonymous user-fn implementations at the leaf-level. Currently 
> display data is only being registered on the composite node and not within 
> the anonymous implementation. As a result, the details are lost.
> We should register display data both in the composite and internal leaf 
> nodes, particularly when the implementation is anonymous.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (BEAM-117) Implement the API for Static Display Metadata

2016-05-25 Thread Scott Wegner (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Wegner updated BEAM-117:
--
Assignee: Ben Chambers  (was: Scott Wegner)

> Implement the API for Static Display Metadata
> -
>
> Key: BEAM-117
> URL: https://issues.apache.org/jira/browse/BEAM-117
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Ben Chambers
>Assignee: Ben Chambers
>
> As described in the following doc, we would like the SDK to allow associating 
> display metadata with PTransforms.
> https://docs.google.com/document/d/11enEB9JwVp6vO0uOYYTMYTGkr3TdNfELwWqoiUg5ZxM/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (BEAM-238) Add link URLs for sources / sinks

2016-05-25 Thread Scott Wegner (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Wegner closed BEAM-238.
-
Resolution: Fixed

Closing this.

> Add link URLs for sources / sinks
> -
>
> Key: BEAM-238
> URL: https://issues.apache.org/jira/browse/BEAM-238
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: Scott Wegner
>Assignee: Jean-Baptiste Onofré
>
> Where applicable, annotate sources/sink display data with link urls.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (BEAM-250) Exclude TextSource.minBundleSize default value from display data.

2016-05-25 Thread Scott Wegner (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Wegner closed BEAM-250.
-
Resolution: Fixed

Done

> Exclude TextSource.minBundleSize default value from display data.
> -
>
> Key: BEAM-250
> URL: https://issues.apache.org/jira/browse/BEAM-250
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Minor
>
> TextSource currently registers minBundleSize as display data. We should 
> exclude it when the value is default of 1.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (BEAM-251) Exclude PipelineOptions with @JsonIgnore from display data

2016-05-25 Thread Scott Wegner (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Scott Wegner closed BEAM-251.
-
Resolution: Fixed

This is done.

> Exclude PipelineOptions with @JsonIgnore from display data
> --
>
> Key: BEAM-251
> URL: https://issues.apache.org/jira/browse/BEAM-251
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Minor
>
> JsonIgnore properties are generally not useful and often very long. We should 
> exclude them from display data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[2/2] incubator-beam git commit: This closes #361

2016-05-25 Thread kenn
This closes #361


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/bde2a856
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/bde2a856
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/bde2a856

Branch: refs/heads/master
Commit: bde2a856f1652f6ebab7aae2d5e78eeb4683a15a
Parents: 9f97ea0 928a302
Author: Kenneth Knowles 
Authored: Wed May 25 14:16:54 2016 -0700
Committer: Kenneth Knowles 
Committed: Wed May 25 14:16:54 2016 -0700

--
 sdks/java/core/pom.xml  |  2 +-
 .../apache/beam/sdk/testing/NeedsRunner.java| 28 
 .../beam/sdk/testing/RunnableOnService.java |  2 +-
 .../java/org/apache/beam/sdk/PipelineTest.java  |  2 ++
 .../apache/beam/sdk/coders/AvroCoderTest.java   |  3 +++
 .../beam/sdk/coders/CoderRegistryTest.java  |  4 +++
 .../beam/sdk/io/AvroIOGeneratedClassTest.java   |  9 +++
 .../java/org/apache/beam/sdk/io/AvroIOTest.java |  7 -
 .../org/apache/beam/sdk/io/BigQueryIOTest.java  | 11 +---
 .../beam/sdk/io/CompressedSourceTest.java   | 15 +++
 .../apache/beam/sdk/io/CountingInputTest.java   |  2 ++
 .../apache/beam/sdk/io/CountingSourceTest.java  |  3 +++
 .../apache/beam/sdk/io/FileBasedSourceTest.java |  5 +++-
 .../beam/sdk/io/PubsubUnboundedSinkTest.java|  5 
 .../java/org/apache/beam/sdk/io/TextIOTest.java | 15 +++
 .../java/org/apache/beam/sdk/io/WriteTest.java  |  6 +
 .../org/apache/beam/sdk/io/XmlSourceTest.java   | 10 ---
 .../beam/sdk/runners/TransformTreeTest.java |  3 +++
 .../transforms/ApproximateQuantilesTest.java|  7 -
 .../sdk/transforms/ApproximateUniqueTest.java   |  6 +
 .../apache/beam/sdk/transforms/CombineTest.java |  4 +--
 .../apache/beam/sdk/transforms/DoFnTest.java|  7 -
 .../sdk/transforms/DoFnWithContextTest.java |  7 -
 .../apache/beam/sdk/transforms/FilterTest.java  |  4 +++
 .../sdk/transforms/FlatMapElementsTest.java |  5 
 .../apache/beam/sdk/transforms/FlattenTest.java |  4 +++
 .../beam/sdk/transforms/GroupByKeyTest.java |  5 +++-
 .../IntraBundleParallelizationTest.java |  6 -
 .../beam/sdk/transforms/MapElementsTest.java|  7 -
 .../apache/beam/sdk/transforms/ParDoTest.java   | 21 +--
 .../beam/sdk/transforms/PartitionTest.java  |  5 +++-
 .../apache/beam/sdk/transforms/SampleTest.java  |  3 ++-
 .../org/apache/beam/sdk/transforms/TopTest.java |  5 
 .../apache/beam/sdk/transforms/ViewTest.java|  4 +++
 .../beam/sdk/transforms/WithKeysTest.java   |  5 
 .../beam/sdk/transforms/WithTimestampsTest.java |  3 +++
 .../sdk/transforms/join/CoGroupByKeyTest.java   |  2 ++
 .../sdk/transforms/windowing/WindowingTest.java |  3 +++
 .../org/apache/beam/sdk/values/PDoneTest.java   |  2 ++
 39 files changed, 225 insertions(+), 22 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/bde2a856/sdks/java/core/src/test/java/org/apache/beam/sdk/io/BigQueryIOTest.java
--



Jenkins build became unstable: beam_PostCommit_RunnableOnService_GoogleCloudDataflow #417

2016-05-25 Thread Apache Jenkins Server
See 




[2/2] incubator-beam git commit: [BEAM-305] Replace usages of PCollection.setCoder with Create.of().withCoder in Spark Runner

2016-05-25 Thread amitsela
[BEAM-305] Replace usages of PCollection.setCoder with Create.of().withCoder in 
Spark Runner

This closes #386


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/9f97ea0a
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/9f97ea0a
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/9f97ea0a

Branch: refs/heads/master
Commit: 9f97ea0a7ddec61a3ff23841dbf74fa0a260dcae
Parents: a3fc40a 69da98a
Author: Sela 
Authored: Thu May 26 00:03:38 2016 +0300
Committer: Sela 
Committed: Thu May 26 00:03:38 2016 +0300

--
 .../test/java/org/apache/beam/runners/spark/DeDupTest.java   | 2 +-
 .../java/org/apache/beam/runners/spark/EmptyInputTest.java   | 2 +-
 .../org/apache/beam/runners/spark/SimpleWordCountTest.java   | 8 
 .../java/org/apache/beam/runners/spark/io/NumShardsTest.java | 2 +-
 .../beam/runners/spark/translation/CombineGloballyTest.java  | 2 +-
 .../beam/runners/spark/translation/CombinePerKeyTest.java| 2 +-
 .../runners/spark/translation/WindowedWordCountTest.java | 8 
 7 files changed, 13 insertions(+), 13 deletions(-)
--




[1/2] incubator-beam git commit: Use Create.of withCoder instead of setCoder on the created PCollection

2016-05-25 Thread amitsela
Repository: incubator-beam
Updated Branches:
  refs/heads/master a3fc40aa3 -> 9f97ea0a7


Use Create.of withCoder instead of setCoder on the created PCollection

One more left..


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/69da98a9
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/69da98a9
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/69da98a9

Branch: refs/heads/master
Commit: 69da98a93956add8e86d094cb866bf86c5626089
Parents: 78c8c52
Author: Ilya Ganelin 
Authored: Tue May 24 13:37:04 2016 -0700
Committer: Sela 
Committed: Wed May 25 23:59:15 2016 +0300

--
 .../test/java/org/apache/beam/runners/spark/DeDupTest.java   | 2 +-
 .../java/org/apache/beam/runners/spark/EmptyInputTest.java   | 2 +-
 .../org/apache/beam/runners/spark/SimpleWordCountTest.java   | 8 
 .../java/org/apache/beam/runners/spark/io/NumShardsTest.java | 2 +-
 .../beam/runners/spark/translation/CombineGloballyTest.java  | 2 +-
 .../beam/runners/spark/translation/CombinePerKeyTest.java| 2 +-
 .../runners/spark/translation/WindowedWordCountTest.java | 8 
 7 files changed, 13 insertions(+), 13 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/69da98a9/runners/spark/src/test/java/org/apache/beam/runners/spark/DeDupTest.java
--
diff --git 
a/runners/spark/src/test/java/org/apache/beam/runners/spark/DeDupTest.java 
b/runners/spark/src/test/java/org/apache/beam/runners/spark/DeDupTest.java
index 0b48bed..285a2d6 100644
--- a/runners/spark/src/test/java/org/apache/beam/runners/spark/DeDupTest.java
+++ b/runners/spark/src/test/java/org/apache/beam/runners/spark/DeDupTest.java
@@ -51,7 +51,7 @@ public class DeDupTest {
 SparkPipelineOptions options = 
PipelineOptionsFactory.as(SparkPipelineOptions.class);
 options.setRunner(SparkPipelineRunner.class);
 Pipeline p = Pipeline.create(options);
-PCollection input = 
p.apply(Create.of(LINES)).setCoder(StringUtf8Coder.of());
+PCollection input = 
p.apply(Create.of(LINES).withCoder(StringUtf8Coder.of()));
 PCollection output = 
input.apply(RemoveDuplicates.create());
 
 PAssert.that(output).containsInAnyOrder(EXPECTED_SET);

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/69da98a9/runners/spark/src/test/java/org/apache/beam/runners/spark/EmptyInputTest.java
--
diff --git 
a/runners/spark/src/test/java/org/apache/beam/runners/spark/EmptyInputTest.java 
b/runners/spark/src/test/java/org/apache/beam/runners/spark/EmptyInputTest.java
index 7b25e34..f227e94 100644
--- 
a/runners/spark/src/test/java/org/apache/beam/runners/spark/EmptyInputTest.java
+++ 
b/runners/spark/src/test/java/org/apache/beam/runners/spark/EmptyInputTest.java
@@ -46,7 +46,7 @@ public class EmptyInputTest {
 options.setRunner(SparkPipelineRunner.class);
 Pipeline p = Pipeline.create(options);
 List empty = Collections.emptyList();
-PCollection inputWords = 
p.apply(Create.of(empty)).setCoder(StringUtf8Coder.of());
+PCollection inputWords = 
p.apply(Create.of(empty).withCoder(StringUtf8Coder.of()));
 PCollection output = inputWords.apply(Combine.globally(new 
ConcatWords()));
 
 EvaluationResult res = SparkPipelineRunner.create().run(p);

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/69da98a9/runners/spark/src/test/java/org/apache/beam/runners/spark/SimpleWordCountTest.java
--
diff --git 
a/runners/spark/src/test/java/org/apache/beam/runners/spark/SimpleWordCountTest.java
 
b/runners/spark/src/test/java/org/apache/beam/runners/spark/SimpleWordCountTest.java
index eee120e..61ad24f 100644
--- 
a/runners/spark/src/test/java/org/apache/beam/runners/spark/SimpleWordCountTest.java
+++ 
b/runners/spark/src/test/java/org/apache/beam/runners/spark/SimpleWordCountTest.java
@@ -66,8 +66,8 @@ public class SimpleWordCountTest {
 SparkPipelineOptions options = 
PipelineOptionsFactory.as(SparkPipelineOptions.class);
 options.setRunner(SparkPipelineRunner.class);
 Pipeline p = Pipeline.create(options);
-PCollection inputWords = 
p.apply(Create.of(WORDS)).setCoder(StringUtf8Coder
-.of());
+PCollection inputWords = 
p.apply(Create.of(WORDS).withCoder(StringUtf8Coder
+.of()));
 PCollection output = inputWords.apply(new CountWords());
 
 PAssert.that(output).containsInAnyOrder(EXPECTED_COUNT_SET);
@@ -84,8 +84,8 @@ public class SimpleWordCountTest {
 SparkPipelineOptions options = 
PipelineOptionsFactory.as(SparkPipelineOptions.class);
 

[2/2] incubator-beam git commit: This closes #390

2016-05-25 Thread bchambers
This closes #390


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/a3fc40aa
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/a3fc40aa
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/a3fc40aa

Branch: refs/heads/master
Commit: a3fc40aa368de8bf5a775096abf0d392208f7061
Parents: 78c8c52 7abedfa
Author: bchambers 
Authored: Wed May 25 12:47:51 2016 -0700
Committer: bchambers 
Committed: Wed May 25 12:47:51 2016 -0700

--
 .../apache/beam/runners/direct/InProcessPipelineRunnerTest.java   | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
--




[1/2] incubator-beam git commit: Mark test PipelineOptions interface public

2016-05-25 Thread bchambers
Repository: incubator-beam
Updated Branches:
  refs/heads/master 78c8c528e -> a3fc40aa3


Mark test PipelineOptions interface public


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/7abedfae
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/7abedfae
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/7abedfae

Branch: refs/heads/master
Commit: 7abedfae13319c8e85bfc6a7f3fef4c7edeae9dd
Parents: 78c8c52
Author: Scott Wegner 
Authored: Wed May 25 11:40:45 2016 -0700
Committer: Scott Wegner 
Committed: Wed May 25 11:40:45 2016 -0700

--
 .../apache/beam/runners/direct/InProcessPipelineRunnerTest.java   | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/7abedfae/runners/direct-java/src/test/java/org/apache/beam/runners/direct/InProcessPipelineRunnerTest.java
--
diff --git 
a/runners/direct-java/src/test/java/org/apache/beam/runners/direct/InProcessPipelineRunnerTest.java
 
b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/InProcessPipelineRunnerTest.java
index e403019..5a92ce3 100644
--- 
a/runners/direct-java/src/test/java/org/apache/beam/runners/direct/InProcessPipelineRunnerTest.java
+++ 
b/runners/direct-java/src/test/java/org/apache/beam/runners/direct/InProcessPipelineRunnerTest.java
@@ -125,7 +125,8 @@ public class InProcessPipelineRunnerTest implements 
Serializable {
 p.run();
   }
 
-  interface ObjectPipelineOptions extends PipelineOptions {
+  /** {@link PipelineOptions} to inject bad object implementations. */
+  public interface ObjectPipelineOptions extends PipelineOptions {
 Object getValue();
 void setValue(Object value);
   }



[GitHub] incubator-beam pull request: [BEAM-308] Mark test PipelineOptions ...

2016-05-25 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/390


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-308) PipelineOptions throws when using package-private PipelineOptions interfaces from multiple packages

2016-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15300778#comment-15300778
 ] 

ASF GitHub Bot commented on BEAM-308:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-beam/pull/390


> PipelineOptions throws when using package-private PipelineOptions interfaces 
> from multiple packages
> ---
>
> Key: BEAM-308
> URL: https://issues.apache.org/jira/browse/BEAM-308
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Minor
>
> If a PipelineOptions instance is used as multiple PipelineOptions 
> package-private interfaces from different packages, {{PipelineOptions.as}} 
> will throw an exception:
> {quote}
> java.lang.IllegalArgumentException: non-public interfaces from different 
> packages
>   at java.lang.reflect.Proxy$ProxyClassFactory.apply(Proxy.java:652)
>   at java.lang.reflect.Proxy$ProxyClassFactory.apply(Proxy.java:592)
>   at java.lang.reflect.WeakCache$Factory.get(WeakCache.java:244)
>   at java.lang.reflect.WeakCache.get(WeakCache.java:141)
>   at java.lang.reflect.Proxy.getProxyClass0(Proxy.java:455)
>   at java.lang.reflect.Proxy.getProxyClass(Proxy.java:405)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.validateWellFormed(PipelineOptionsFactory.java:620)
>   at 
> org.apache.beam.sdk.options.ProxyInvocationHandler.as(ProxyInvocationHandler.java:209)
>   at 
> org.apache.beam.sdk.options.ProxyInvocationHandler.invoke(ProxyInvocationHandler.java:135)
>   at com.sun.proxy.$Proxy6.as(Unknown Source)
> {quote}
> This fails because ProxyInvocationHandler attempts to create a Java Proxy 
> object implementing the full set of interfaces, which has the 
> [restriction|https://docs.oracle.com/javase/7/docs/api/java/lang/reflect/Proxy.html#getProxyClass(java.lang.ClassLoader,%20java.lang.Class...)]:
>  
> bq. All non-public interfaces must be in the same package; otherwise, it 
> would not be possible for the proxy class to implement all of the interfaces, 
> regardless of what package it is defined in.
> This can be triggered in a couple edge-case scenarios:
> # {{PipelineOptions.as}} is called on the same PipelineOptions instance with 
> multiple package-private interfaces.
> # {{PipelineOptionsFactory.register}} is called with a package-private 
> interface, and then {{PipelineOptions.as}} is called with a different 
> package-private instance.
> We hit the second scenario in [DataflowSDK unit 
> tests|https://github.com/GoogleCloudPlatform/DataflowJavaSDK/pull/286#issuecomment-221438063].
>  It's hard to trigger, but possible.
> 
> I propose we make the behavior for package-private options explicit:
> # Give a better exception message if we hit this issue in 
> {{PipelineOptions.as}} listing the non-public interfaces and what packages 
> they're in.
> # Explicitly reject non-public interfaces from 
> {{PipelineOptionsFactory.register}}, since this state is global and is easier 
> to cause issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request: [BEAM-308] Mark test PipelineOptions ...

2016-05-25 Thread swegner
GitHub user swegner opened a pull request:

https://github.com/apache/incubator-beam/pull/390

[BEAM-308] Mark test PipelineOptions interface public

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/swegner/incubator-beam pipelineoptions-public

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/390.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #390


commit 7abedfae13319c8e85bfc6a7f3fef4c7edeae9dd
Author: Scott Wegner 
Date:   2016-05-25T18:40:45Z

Mark test PipelineOptions interface public




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-308) PipelineOptions throws when using package-private PipelineOptions interfaces from multiple packages

2016-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15300644#comment-15300644
 ] 

ASF GitHub Bot commented on BEAM-308:
-

GitHub user swegner opened a pull request:

https://github.com/apache/incubator-beam/pull/390

[BEAM-308] Mark test PipelineOptions interface public

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/swegner/incubator-beam pipelineoptions-public

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/390.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #390


commit 7abedfae13319c8e85bfc6a7f3fef4c7edeae9dd
Author: Scott Wegner 
Date:   2016-05-25T18:40:45Z

Mark test PipelineOptions interface public




> PipelineOptions throws when using package-private PipelineOptions interfaces 
> from multiple packages
> ---
>
> Key: BEAM-308
> URL: https://issues.apache.org/jira/browse/BEAM-308
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Minor
>
> If a PipelineOptions instance is used as multiple PipelineOptions 
> package-private interfaces from different packages, {{PipelineOptions.as}} 
> will throw an exception:
> {quote}
> java.lang.IllegalArgumentException: non-public interfaces from different 
> packages
>   at java.lang.reflect.Proxy$ProxyClassFactory.apply(Proxy.java:652)
>   at java.lang.reflect.Proxy$ProxyClassFactory.apply(Proxy.java:592)
>   at java.lang.reflect.WeakCache$Factory.get(WeakCache.java:244)
>   at java.lang.reflect.WeakCache.get(WeakCache.java:141)
>   at java.lang.reflect.Proxy.getProxyClass0(Proxy.java:455)
>   at java.lang.reflect.Proxy.getProxyClass(Proxy.java:405)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.validateWellFormed(PipelineOptionsFactory.java:620)
>   at 
> org.apache.beam.sdk.options.ProxyInvocationHandler.as(ProxyInvocationHandler.java:209)
>   at 
> org.apache.beam.sdk.options.ProxyInvocationHandler.invoke(ProxyInvocationHandler.java:135)
>   at com.sun.proxy.$Proxy6.as(Unknown Source)
> {quote}
> This fails because ProxyInvocationHandler attempts to create a Java Proxy 
> object implementing the full set of interfaces, which has the 
> [restriction|https://docs.oracle.com/javase/7/docs/api/java/lang/reflect/Proxy.html#getProxyClass(java.lang.ClassLoader,%20java.lang.Class...)]:
>  
> bq. All non-public interfaces must be in the same package; otherwise, it 
> would not be possible for the proxy class to implement all of the interfaces, 
> regardless of what package it is defined in.
> This can be triggered in a couple edge-case scenarios:
> # {{PipelineOptions.as}} is called on the same PipelineOptions instance with 
> multiple package-private interfaces.
> # {{PipelineOptionsFactory.register}} is called with a package-private 
> interface, and then {{PipelineOptions.as}} is called with a different 
> package-private instance.
> We hit the second scenario in [DataflowSDK unit 
> tests|https://github.com/GoogleCloudPlatform/DataflowJavaSDK/pull/286#issuecomment-221438063].
>  It's hard to trigger, but possible.
> 
> I propose we make the behavior for package-private options explicit:
> # Give a better exception message if we hit this issue in 
> {{PipelineOptions.as}} listing the non-public interfaces and what packages 
> they're in.
> # Explicitly reject non-public interfaces from 
> {{PipelineOptionsFactory.register}}, since this state is global and is easier 
> to cause issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-310) Exercise splitIntoBundles/generateInitialSplits in the Direct Runner

2016-05-25 Thread Thomas Groh (JIRA)
Thomas Groh created BEAM-310:


 Summary: Exercise splitIntoBundles/generateInitialSplits in the 
Direct Runner
 Key: BEAM-310
 URL: https://issues.apache.org/jira/browse/BEAM-310
 Project: Beam
  Issue Type: Improvement
  Components: runner-direct
Reporter: Thomas Groh
Assignee: Thomas Groh


BoundedSource#splitIntoBundles and UnboundedSource#generateInitialSplits are 
the methods by which sources can be accessed in parallel. Exercising these 
methods allows reads (and all transforms downstream) to be executed in parallel 
both pre and post a GroupByKey



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-306) Make java-only PubsubIO work in InProcessRunner

2016-05-25 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15300583#comment-15300583
 ] 

Kenneth Knowles commented on BEAM-306:
--

I think there's a confusing ambiguity at [this line in 
UnboundedSource|https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/io/UnboundedSource.java#L213]
 in the spec for getting a checkpoint.

"The returned object should not be modified."

By whom? We can't say that it applies to everyone, since the interface for the 
checkpoint includes a finalization method. I think the most reasonable 
interpretation (and anyhow the one that seems obvious anyhow) is that once the 
object is returned, the source no longer owns it; the context owns it. So the 
source should not hang on to it, or at least should not do anything observable 
with it. But [~dhalp...@google.com] and [~mil...@google.com] might have a 
different idea.

> Make java-only PubsubIO work in InProcessRunner 
> 
>
> Key: BEAM-306
> URL: https://issues.apache.org/jira/browse/BEAM-306
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Mark Shields
>Assignee: Mark Shields
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-309) Need an UnboundedSourceFromHell to test runner's respect of its implied contract

2016-05-25 Thread Mark Shields (JIRA)
Mark Shields created BEAM-309:
-

 Summary: Need an UnboundedSourceFromHell to test runner's respect 
of its implied contract
 Key: BEAM-309
 URL: https://issues.apache.org/jira/browse/BEAM-309
 Project: Beam
  Issue Type: Bug
  Components: testing
Reporter: Mark Shields
Assignee: Davor Bonaci


UnboundedSource has pretty specific requirements of its runner (as well as its 
derived classes). We really need a systematic test of all of this. Inspiration 
is BEAM-306 which only showed up when I tried to run an actual pipeline.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-309) Need an UnboundedSourceFromHell to test runner's respect of its implied contract

2016-05-25 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15300573#comment-15300573
 ] 

Kenneth Knowles commented on BEAM-309:
--

I think step 1 here might be to take that implied contract and make it 
explicit, then the implied contract can be empty :-)

> Need an UnboundedSourceFromHell to test runner's respect of its implied 
> contract
> 
>
> Key: BEAM-309
> URL: https://issues.apache.org/jira/browse/BEAM-309
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Mark Shields
>Assignee: Davor Bonaci
>
> UnboundedSource has pretty specific requirements of its runner (as well as 
> its derived classes). We really need a systematic test of all of this. 
> Inspiration is BEAM-306 which only showed up when I tried to run an actual 
> pipeline.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-308) PipelineOptions throws when using package-private PipelineOptions interfaces from multiple packages

2016-05-25 Thread Scott Wegner (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15300563#comment-15300563
 ] 

Scott Wegner commented on BEAM-308:
---

PR with failing test cases: 
https://github.com/apache/incubator-beam/pull/389/files

/cc [~bchambers] [~lcwik]

> PipelineOptions throws when using package-private PipelineOptions interfaces 
> from multiple packages
> ---
>
> Key: BEAM-308
> URL: https://issues.apache.org/jira/browse/BEAM-308
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Scott Wegner
>Priority: Minor
>
> If a PipelineOptions instance is used as multiple PipelineOptions 
> package-private interfaces from different packages, {{PipelineOptions.as}} 
> will throw an exception:
> {quote}
> java.lang.IllegalArgumentException: non-public interfaces from different 
> packages
>   at java.lang.reflect.Proxy$ProxyClassFactory.apply(Proxy.java:652)
>   at java.lang.reflect.Proxy$ProxyClassFactory.apply(Proxy.java:592)
>   at java.lang.reflect.WeakCache$Factory.get(WeakCache.java:244)
>   at java.lang.reflect.WeakCache.get(WeakCache.java:141)
>   at java.lang.reflect.Proxy.getProxyClass0(Proxy.java:455)
>   at java.lang.reflect.Proxy.getProxyClass(Proxy.java:405)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.validateWellFormed(PipelineOptionsFactory.java:620)
>   at 
> org.apache.beam.sdk.options.ProxyInvocationHandler.as(ProxyInvocationHandler.java:209)
>   at 
> org.apache.beam.sdk.options.ProxyInvocationHandler.invoke(ProxyInvocationHandler.java:135)
>   at com.sun.proxy.$Proxy6.as(Unknown Source)
> {quote}
> This fails because ProxyInvocationHandler attempts to create a Java Proxy 
> object implementing the full set of interfaces, which has the 
> [restriction|https://docs.oracle.com/javase/7/docs/api/java/lang/reflect/Proxy.html#getProxyClass(java.lang.ClassLoader,%20java.lang.Class...)]:
>  
> bq. All non-public interfaces must be in the same package; otherwise, it 
> would not be possible for the proxy class to implement all of the interfaces, 
> regardless of what package it is defined in.
> This can be triggered in a couple edge-case scenarios:
> # {{PipelineOptions.as}} is called on the same PipelineOptions instance with 
> multiple package-private interfaces.
> # {{PipelineOptionsFactory.register}} is called with a package-private 
> interface, and then {{PipelineOptions.as}} is called with a different 
> package-private instance.
> We hit the second scenario in [DataflowSDK unit 
> tests|https://github.com/GoogleCloudPlatform/DataflowJavaSDK/pull/286#issuecomment-221438063].
>  It's hard to trigger, but possible.
> 
> I propose we make the behavior for package-private options explicit:
> # Give a better exception message if we hit this issue in 
> {{PipelineOptions.as}} listing the non-public interfaces and what packages 
> they're in.
> # Explicitly reject non-public interfaces from 
> {{PipelineOptionsFactory.register}}, since this state is global and is easier 
> to cause issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request: Test case for package-private Pipelin...

2016-05-25 Thread swegner
GitHub user swegner opened a pull request:

https://github.com/apache/incubator-beam/pull/389

Test case for package-private PipelineOptions

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/swegner/incubator-beam 
pipelineoptions-interfaces

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/389.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #389


commit ebce2b71652a3ccff060abedfbf691b4c11a7836
Author: Scott Wegner 
Date:   2016-05-25T16:55:51Z

Test case for package-private PipelineOptions




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (BEAM-308) PipelineOptions throws when using package-private PipelineOptions interfaces from multiple packages

2016-05-25 Thread Scott Wegner (JIRA)
Scott Wegner created BEAM-308:
-

 Summary: PipelineOptions throws when using package-private 
PipelineOptions interfaces from multiple packages
 Key: BEAM-308
 URL: https://issues.apache.org/jira/browse/BEAM-308
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-core
Reporter: Scott Wegner
Assignee: Scott Wegner
Priority: Minor


If a PipelineOptions instance is used as multiple PipelineOptions 
package-private interfaces from different packages, {{PipelineOptions.as}} will 
throw an exception:

{quote}
java.lang.IllegalArgumentException: non-public interfaces from different 
packages

at java.lang.reflect.Proxy$ProxyClassFactory.apply(Proxy.java:652)
at java.lang.reflect.Proxy$ProxyClassFactory.apply(Proxy.java:592)
at java.lang.reflect.WeakCache$Factory.get(WeakCache.java:244)
at java.lang.reflect.WeakCache.get(WeakCache.java:141)
at java.lang.reflect.Proxy.getProxyClass0(Proxy.java:455)
at java.lang.reflect.Proxy.getProxyClass(Proxy.java:405)
at 
org.apache.beam.sdk.options.PipelineOptionsFactory.validateWellFormed(PipelineOptionsFactory.java:620)
at 
org.apache.beam.sdk.options.ProxyInvocationHandler.as(ProxyInvocationHandler.java:209)
at 
org.apache.beam.sdk.options.ProxyInvocationHandler.invoke(ProxyInvocationHandler.java:135)
at com.sun.proxy.$Proxy6.as(Unknown Source)
{quote}

This fails because ProxyInvocationHandler attempts to create a Java Proxy 
object implementing the full set of interfaces, which has the 
[restriction|https://docs.oracle.com/javase/7/docs/api/java/lang/reflect/Proxy.html#getProxyClass(java.lang.ClassLoader,%20java.lang.Class...)]:
 

bq. All non-public interfaces must be in the same package; otherwise, it would 
not be possible for the proxy class to implement all of the interfaces, 
regardless of what package it is defined in.

This can be triggered in a couple edge-case scenarios:

# {{PipelineOptions.as}} is called on the same PipelineOptions instance with 
multiple package-private interfaces.
# {{PipelineOptionsFactory.register}} is called with a package-private 
interface, and then {{PipelineOptions.as}} is called with a different 
package-private instance.

We hit the second scenario in [DataflowSDK unit 
tests|https://github.com/GoogleCloudPlatform/DataflowJavaSDK/pull/286#issuecomment-221438063].
 It's hard to trigger, but possible.



I propose we make the behavior for package-private options explicit:

# Give a better exception message if we hit this issue in 
{{PipelineOptions.as}} listing the non-public interfaces and what packages 
they're in.
# Explicitly reject non-public interfaces from 
{{PipelineOptionsFactory.register}}, since this state is global and is easier 
to cause issues.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] incubator-beam pull request: [BEAM-306] Serialize/Deserialize chec...

2016-05-25 Thread mshields822
GitHub user mshields822 opened a pull request:

https://github.com/apache/incubator-beam/pull/388

[BEAM-306] Serialize/Deserialize checkpoints

R: @dhalperi @tgroh 

The PubsubUnboundendSource implementation has an assertion to confirm the 
checkpoint from which a fresh reader is instantiated has come via 
deserialization from an earlier finalized checkpoint. The in-process runner was 
reusing the checkpoint object directly, so the assertion failed. This adds the 
serialize/deserialize to the in-process runner, which I believe is the best 
solution since other UnboundedSources may be caught by the same issue. It also 
forces the user to exercise their checkpoint coder.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mshields822/incubator-beam pubsub-inproc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/388.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #388


commit 4f4b526495a887bbc1c9b782850e9785a86bddc9
Author: Mark Shields 
Date:   2016-05-25T16:13:07Z

Serialize/Deserialize checkpoints




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-306) Make java-only PubsubIO work in InProcessRunner

2016-05-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15300312#comment-15300312
 ] 

ASF GitHub Bot commented on BEAM-306:
-

GitHub user mshields822 opened a pull request:

https://github.com/apache/incubator-beam/pull/388

[BEAM-306] Serialize/Deserialize checkpoints

R: @dhalperi @tgroh 

The PubsubUnboundendSource implementation has an assertion to confirm the 
checkpoint from which a fresh reader is instantiated has come via 
deserialization from an earlier finalized checkpoint. The in-process runner was 
reusing the checkpoint object directly, so the assertion failed. This adds the 
serialize/deserialize to the in-process runner, which I believe is the best 
solution since other UnboundedSources may be caught by the same issue. It also 
forces the user to exercise their checkpoint coder.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mshields822/incubator-beam pubsub-inproc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/388.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #388


commit 4f4b526495a887bbc1c9b782850e9785a86bddc9
Author: Mark Shields 
Date:   2016-05-25T16:13:07Z

Serialize/Deserialize checkpoints




> Make java-only PubsubIO work in InProcessRunner 
> 
>
> Key: BEAM-306
> URL: https://issues.apache.org/jira/browse/BEAM-306
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Mark Shields
>Assignee: Mark Shields
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-307) Upgrade/Test to Kafka 0.10

2016-05-25 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15300250#comment-15300250
 ] 

Jean-Baptiste Onofré commented on BEAM-307:
---

Agree, as the KafkaIO uses version range, I would like first to check if it 
works as it is with Kafka 0.10. Then, it makes sense to create a KafkaIO branch 
(more than package) with its own release cycle. I already discussed with 
[~davor] about that: I think we have to think about that, several branches for 
each IO, each with its own release cycle, not part of the "main" release 
process. However, I will move this discussion on the mailing list soon ;)

> Upgrade/Test to Kafka 0.10
> --
>
> Key: BEAM-307
> URL: https://issues.apache.org/jira/browse/BEAM-307
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-extensions
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>
> I gonna test at least that the KafkaIO works fine with Kafka 0.10 (I'm 
> preparing new complete samples around that).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-307) Upgrade/Test to Kafka 0.10

2016-05-25 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15300147#comment-15300147
 ] 

Aljoscha Krettek commented on BEAM-307:
---

It might be that we have to create separate packages for Kafka 0.9 and Kafka 
0.10 because they have different features. For example, Kafka 0.10 has support 
for timestamps in elements.

> Upgrade/Test to Kafka 0.10
> --
>
> Key: BEAM-307
> URL: https://issues.apache.org/jira/browse/BEAM-307
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-extensions
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>
> I gonna test at least that the KafkaIO works fine with Kafka 0.10 (I'm 
> preparing new complete samples around that).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-307) Upgrade/Test to Kafka 0.10

2016-05-25 Thread JIRA
Jean-Baptiste Onofré created BEAM-307:
-

 Summary: Upgrade/Test to Kafka 0.10
 Key: BEAM-307
 URL: https://issues.apache.org/jira/browse/BEAM-307
 Project: Beam
  Issue Type: Task
  Components: sdk-java-extensions
Reporter: Jean-Baptiste Onofré
Assignee: Jean-Baptiste Onofré


I gonna test at least that the KafkaIO works fine with Kafka 0.10 (I'm 
preparing new complete samples around that).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)