[jira] [Commented] (BEAM-2256) mongodb sdk MongoDbIO.BoundedMongoDbSource.splitKeysToFilters incorrect

2017-05-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16007655#comment-16007655
 ] 

ASF GitHub Bot commented on BEAM-2256:
--

Github user jbonofre closed the pull request at:

https://github.com/apache/beam/pull/3093


> mongodb sdk MongoDbIO.BoundedMongoDbSource.splitKeysToFilters incorrect
> ---
>
> Key: BEAM-2256
> URL: https://issues.apache.org/jira/browse/BEAM-2256
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Affects Versions: Not applicable, 0.4.0, 0.5.0, 0.6.0
>Reporter: wang hongmin
>Assignee: Jean-Baptiste Onofré
>  Labels: easyfix
> Fix For: 2.0.0
>
>   Original Estimate: 0.5h
>  Remaining Estimate: 0.5h
>
> When using beam-sdks-java-io-mongodb to count a large collection in MongoDB, I 
> always get a smaller number than the actual size of the collection. It seemed 
> like a puzzle until I dived into the source code and traced the execution. The 
> fault is in MongoDbIO.BoundedMongoDbSource.splitKeysToFilters(), which should 
> return one more filter than the size of its splitKeys argument.
> I checked every version of this file, and none appears to be correct.
> Additionally, there seems to be no suitable [Component], so I chose a similar one. :)
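For reference, a minimal sketch (not the actual MongoDbIO source; the method shape and filter syntax are simplified assumptions) of why N split keys must produce N + 1 range filters over _id, which is the invariant PR #3093 restores:

{code}
import java.util.ArrayList;
import java.util.List;

class SplitKeysToFiltersSketch {
  // For N split keys, emit N + 1 filters. Dropping the final "greater than the
  // last split key" filter is exactly what makes counts of large collections
  // come up short, as described in this issue.
  static List<String> splitKeysToFilters(List<String> splitKeys) {
    List<String> filters = new ArrayList<>();
    String lowestBound = null; // lower bound of the current range; null for the first range
    for (String splitKey : splitKeys) {
      if (lowestBound == null) {
        filters.add(String.format("{ \"_id\": { \"$lte\": \"%s\" } }", splitKey));
      } else {
        filters.add(String.format(
            "{ \"_id\": { \"$gt\": \"%s\", \"$lte\": \"%s\" } }", lowestBound, splitKey));
      }
      lowestBound = splitKey;
    }
    if (lowestBound != null) {
      // The previously missing filter: all documents after the last split key.
      filters.add(String.format("{ \"_id\": { \"$gt\": \"%s\" } }", lowestBound));
    }
    return filters;
  }
}
{code}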



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #3093: [BEAM-2256] Add the last previous range filter

2017-05-11 Thread jbonofre
Github user jbonofre closed the pull request at:

https://github.com/apache/beam/pull/3093


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Dataflow #3113

2017-05-11 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PostCommit_Python_Verify #2203

2017-05-11 Thread Apache Jenkins Server
See 


Changes:

[tgroh] Enable SerializableCoder to Serialize with Generic Types

[altay] Rename filesink to filebasedsink

[lcwik] Mark More values methods Internal

--
[...truncated 555.19 KB...]
  File "/usr/lib/python2.7/pickle.py", line 331, in save
self.save_reduce(obj=obj, *rv)
  File "/usr/lib/python2.7/pickle.py", line 419, in save_reduce
save(state)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File 
"
 line 163, in new_save_module_dict
return old_save_module_dict(pickler, obj)
  File 
"
 line 841, in save_module_dict
StockPickler.save_dict(pickler, obj)
  File "/usr/lib/python2.7/pickle.py", line 649, in save_dict
self._batch_setitems(obj.iteritems())
  File "/usr/lib/python2.7/pickle.py", line 681, in _batch_setitems
save(v)
  File "/usr/lib/python2.7/pickle.py", line 331, in save
self.save_reduce(obj=obj, *rv)
  File "/usr/lib/python2.7/pickle.py", line 419, in save_reduce
save(state)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File 
"
 line 163, in new_save_module_dict
return old_save_module_dict(pickler, obj)
  File 
"
 line 841, in save_module_dict
StockPickler.save_dict(pickler, obj)
  File "/usr/lib/python2.7/pickle.py", line 649, in save_dict
self._batch_setitems(obj.iteritems())
  File "/usr/lib/python2.7/pickle.py", line 681, in _batch_setitems
save(v)
  File "/usr/lib/python2.7/pickle.py", line 306, in save
rv = reduce(self.proto)
  File 
"
 line 276, in signalhandler
raise TimedOutException()
TimedOutException: 'test_flattened_side_input 
(apache_beam.transforms.sideinputs_test.SideInputsTest)'

==
ERROR: test_iterable_side_input 
(apache_beam.transforms.sideinputs_test.SideInputsTest)
--
Traceback (most recent call last):
  File 
"
 line 812, in run
test(orig)
  File 
"
 line 45, in __call__
return self.run(*arg, **kwarg)
  File 
"
 line 133, in run
self.runTest(result)
  File 
"
 line 151, in runTest
test(result)
  File "/usr/lib/python2.7/unittest/case.py", line 395, in __call__
return self.run(*args, **kwds)
  File "/usr/lib/python2.7/unittest/case.py", line 331, in run
testMethod()
  File 
"
 line 177, in test_iterable_side_input
pipeline.run()
  File 
"
 line 96, in run
result = super(TestPipeline, self).run()
  File 
"
 line 169, in run
if self._options.view_as(SetupOptions).save_main_session:
  File 
"
 line 227, in view_as
view = cls(self._flags)
  File 
"
 line 160, in __init__
self._visible_options, _ = parser.parse_known_args(flags)
  File "/usr/lib/python2.7/argparse.py", line 1722, in parse_known_args
namespace, args = self._parse_known_args(args, namespace)
  File "/usr/lib/python2.7/argparse.py", line 1928, in _parse_known_args
start_index = consume_optional(start_index)
  File "/usr/lib/python2.7/argparse.py", line 1868, in consume_optional
take_action(action, args, option_string)
  File "/usr/lib/python2.7/argparse.py", line 1780, in 

Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #2797

2017-05-11 Thread Apache Jenkins Server
See 




[jira] [Resolved] (BEAM-2268) NullPointerException in com.datatorrent.netlet.util.Slice via ApexStateInternals

2017-05-11 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles resolved BEAM-2268.
---
   Resolution: Workaround
 Assignee: Kenneth Knowles  (was: Thomas Weise)
Fix Version/s: Not applicable

> NullPointerException in com.datatorrent.netlet.util.Slice via 
> ApexStateInternals
> 
>
> Key: BEAM-2268
> URL: https://issues.apache.org/jira/browse/BEAM-2268
> Project: Beam
>  Issue Type: Bug
>  Components: runner-apex
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
> Fix For: Not applicable
>
>
> I've browsed the code, and this is a certain NPE in netlet 1.2.1 but our 
> dependencies are on 1.3.0.
> I have read over the dependency tree (both our project and generated 
> archetype) and deliberately added even tighter deps on 1.3.0. I have not 
> managed to force any dependency on 1.2.1, yet maven clearly logs its download 
> of 1.2.1 and it appears to be running against it.
> {code}
> java.lang.NullPointerException
>   at com.datatorrent.netlet.util.Slice.<init>(Slice.java:54)
>   at 
> org.apache.beam.runners.apex.translation.utils.ApexStateInternals$ApexStateInternalsFactory.stateInternalsForKey(ApexStateInternals.java:445)
>   at 
> org.apache.beam.runners.apex.translation.utils.ApexStateInternals$ApexStateInternalsFactory.stateInternalsForKey(ApexStateInternals.java:427)
> {code}
> Jenkins obviously has a different configuration, so that is a place to look 
> next, perhaps.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-2268) NullPointerException in com.datatorrent.netlet.util.Slice via ApexStateInternals

2017-05-11 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16007545#comment-16007545
 ] 

Kenneth Knowles commented on BEAM-2268:
---

Confirmed that more aggressive pruning solves it.

> NullPointerException in com.datatorrent.netlet.util.Slice via 
> ApexStateInternals
> 
>
> Key: BEAM-2268
> URL: https://issues.apache.org/jira/browse/BEAM-2268
> Project: Beam
>  Issue Type: Bug
>  Components: runner-apex
>Reporter: Kenneth Knowles
>Assignee: Thomas Weise
> Fix For: Not applicable
>
>
> I've browsed the code, and this is a certain NPE in netlet 1.2.1 but our 
> dependencies are on 1.3.0.
> I have read over the dependency tree (both our project and generated 
> archetype) and deliberately added even tighter deps on 1.3.0. I have not 
> managed to force any dependency on 1.2.1, yet maven clearly logs its download 
> of 1.2.1 and it appears to be running against it.
> {code}
> java.lang.NullPointerException
>   at com.datatorrent.netlet.util.Slice.<init>(Slice.java:54)
>   at 
> org.apache.beam.runners.apex.translation.utils.ApexStateInternals$ApexStateInternalsFactory.stateInternalsForKey(ApexStateInternals.java:445)
>   at 
> org.apache.beam.runners.apex.translation.utils.ApexStateInternals$ApexStateInternalsFactory.stateInternalsForKey(ApexStateInternals.java:427)
> {code}
> Jenkins obviously has a different configuration, so that is a place to look 
> next, perhaps.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Jenkins build is unstable: beam_PostCommit_Java_MavenInstall #3786

2017-05-11 Thread Apache Jenkins Server
See 




Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Flink #2796

2017-05-11 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-1345) Mark @Experimental and @Internal where needed in user-facing bits of the codebase

2017-05-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16007529#comment-16007529
 ] 

ASF GitHub Bot commented on BEAM-1345:
--

Github user tgroh closed the pull request at:

https://github.com/apache/beam/pull/3105


> Mark @Experimental and @Internal where needed in user-facing bits of the 
> codebase
> -
>
> Key: BEAM-1345
> URL: https://issues.apache.org/jira/browse/BEAM-1345
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core, sdk-java-extensions, sdk-java-gcp
>Reporter: Kenneth Knowles
>  Labels: backward-incompatible
> Fix For: 2.0.0
>
>
> A blanket JIRA to ensure that before a stable release we make sure to mark 
> those pieces that would be unwise to freeze yet and consider how best to 
> communicate this to users, who may just autocomplete those features in their 
> IDE anyhow. (conversely, put infrastructure in place to enforce freezing of 
> the rest)
> Not technically "backwards incompatible" with pre-Beam Dataflow SDKs, but 
> certainly needs to be on the burndown for the first stable release.
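For context, a small hypothetical example of what such a marking looks like in practice; the @Internal annotation and its package are the real ones applied in the commits further down this digest, while the class and method here are invented for illustration:

{code}
import org.apache.beam.sdk.annotations.Internal;

/** Hypothetical user-visible class with a method that is not part of the stable surface. */
public class ExampleOutputs {

  /**
   * For internal use only; no backwards-compatibility guarantees.
   *
   * <p>For use by primitive transforms only.
   */
  @Internal
  public static ExampleOutputs ofPrimitiveOutputsInternal() {
    return new ExampleOutputs();
  }

  private ExampleOutputs() {}
}
{code}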



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #3110: Cherry-pick #3109 into release-2.0.0

2017-05-11 Thread tgroh
Github user tgroh closed the pull request at:

https://github.com/apache/beam/pull/3110


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] beam pull request #3105: [BEAM-1345] Mark More values methods Internal

2017-05-11 Thread tgroh
Github user tgroh closed the pull request at:

https://github.com/apache/beam/pull/3105


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] beam git commit: [BEAM-1345] Mark More values methods Internal

2017-05-11 Thread lcwik
[BEAM-1345] Mark More values methods Internal

This closes #3105


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/a955b24b
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/a955b24b
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/a955b24b

Branch: refs/heads/release-2.0.0
Commit: a955b24b45e4b5947e6e9d3c0d727cb332d2354f
Parents: 905ebcc 1813a2e
Author: Luke Cwik 
Authored: Thu May 11 18:38:07 2017 -0700
Committer: Luke Cwik 
Committed: Thu May 11 18:38:07 2017 -0700

--
 .../main/java/org/apache/beam/sdk/values/PCollectionTuple.java | 6 +-
 .../src/main/java/org/apache/beam/sdk/values/TaggedPValue.java | 6 +-
 2 files changed, 10 insertions(+), 2 deletions(-)
--




[jira] [Commented] (BEAM-1345) Mark @Experimental and @Internal where needed in user-facing bits of the codebase

2017-05-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16007495#comment-16007495
 ] 

ASF GitHub Bot commented on BEAM-1345:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3104


> Mark @Experimental and @Internal where needed in user-facing bits of the 
> codebase
> -
>
> Key: BEAM-1345
> URL: https://issues.apache.org/jira/browse/BEAM-1345
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core, sdk-java-extensions, sdk-java-gcp
>Reporter: Kenneth Knowles
>  Labels: backward-incompatible
> Fix For: 2.0.0
>
>
> A blanket JIRA to ensure that before a stable release we make sure to mark 
> those pieces that would be unwise to freeze yet and consider how best to 
> communicate this to users, who may just autocomplete those features in their 
> IDE anyhow. (conversely, put infrastructure in place to enforce freezing of 
> the rest)
> Not technically "backwards incompatible" with pre-Beam Dataflow SDKs, but 
> certainly needs to be on the burndown for the first stable release.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[1/2] beam git commit: Mark More values methods Internal

2017-05-11 Thread lcwik
Repository: beam
Updated Branches:
  refs/heads/master 36a6cd69a -> d6ac39a23


Mark More values methods Internal

PCollectionTuple#ofPrimitiveOutputsInternal is internal.

TaggedPValue is internal.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/db7d71ae
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/db7d71ae
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/db7d71ae

Branch: refs/heads/master
Commit: db7d71aec5d6c4b7a15474c7ef9db28460fb0043
Parents: 36a6cd6
Author: Thomas Groh 
Authored: Thu May 11 16:17:54 2017 -0700
Committer: Luke Cwik 
Committed: Thu May 11 18:33:39 2017 -0700

--
 .../main/java/org/apache/beam/sdk/values/PCollectionTuple.java | 6 +-
 .../src/main/java/org/apache/beam/sdk/values/TaggedPValue.java | 6 +-
 2 files changed, 10 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/db7d71ae/sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollectionTuple.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollectionTuple.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollectionTuple.java
index d1bb6d7..793994f 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollectionTuple.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/values/PCollectionTuple.java
@@ -23,6 +23,7 @@ import java.util.LinkedHashMap;
 import java.util.Map;
 import java.util.Objects;
 import org.apache.beam.sdk.Pipeline;
+import org.apache.beam.sdk.annotations.Internal;
 import org.apache.beam.sdk.transforms.PTransform;
 import org.apache.beam.sdk.transforms.ParDo;
 import org.apache.beam.sdk.values.PCollection.IsBounded;
@@ -189,11 +190,14 @@ public class PCollectionTuple implements PInput, POutput {
   }
 
   /**
-   * Returns a {@link PCollectionTuple} with each of the given tags mapping to 
a new
+   * For internal use only; no backwards-compatibility 
guarantees.
+   *
+   * Returns a {@link PCollectionTuple} with each of the given tags mapping 
to a new
* output {@link PCollection}.
*
* For use by primitive transformations only.
*/
+  @Internal
   public static PCollectionTuple ofPrimitiveOutputsInternal(
   Pipeline pipeline,
   TupleTagList outputTags,

http://git-wip-us.apache.org/repos/asf/beam/blob/db7d71ae/sdks/java/core/src/main/java/org/apache/beam/sdk/values/TaggedPValue.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/values/TaggedPValue.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/values/TaggedPValue.java
index 3b4d599..c1e0209 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/values/TaggedPValue.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/values/TaggedPValue.java
@@ -21,11 +21,15 @@ package org.apache.beam.sdk.values;
 
 import com.google.auto.value.AutoValue;
 import com.google.common.collect.Iterables;
+import org.apache.beam.sdk.annotations.Internal;
 
 /**
- * A (TupleTag, PValue) pair used in the expansion of a {@link PInput} or 
{@link POutput}.
+ * For internal use only; no backwards-compatibility guarantees.
+ *
+ * A (TupleTag, PValue) pair used in the expansion of a {@link PInput} or 
{@link POutput}.
  */
 @AutoValue
+@Internal
 public abstract class TaggedPValue {
   public static TaggedPValue of(TupleTag tag, PValue value) {
 return new AutoValue_TaggedPValue(tag, value);



[2/2] beam git commit: [BEAM-1345] Mark More values methods Internal

2017-05-11 Thread lcwik
[BEAM-1345] Mark More values methods Internal

This closes #3104


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/d6ac39a2
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/d6ac39a2
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/d6ac39a2

Branch: refs/heads/master
Commit: d6ac39a23e50e82c013e5052b48cb6ffbf929195
Parents: 36a6cd6 db7d71a
Author: Luke Cwik 
Authored: Thu May 11 18:34:15 2017 -0700
Committer: Luke Cwik 
Committed: Thu May 11 18:34:15 2017 -0700

--
 .../main/java/org/apache/beam/sdk/values/PCollectionTuple.java | 6 +-
 .../src/main/java/org/apache/beam/sdk/values/TaggedPValue.java | 6 +-
 2 files changed, 10 insertions(+), 2 deletions(-)
--




[jira] [Resolved] (BEAM-2275) SerializableCoder fails to serialize when used with a generic type token

2017-05-11 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-2275.
-
   Resolution: Fixed
Fix Version/s: 2.0.0

> SerializableCoder fails to serialize when used with a generic type token
> 
>
> Key: BEAM-2275
> URL: https://issues.apache.org/jira/browse/BEAM-2275
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.0.0
>Reporter: Luke Cwik
>Assignee: Thomas Groh
>Priority: Blocker
> Fix For: 2.0.0
>
> Attachments: SerTest.java, stacktrace.txt
>
>
> The following code throws an exception reporting that the type descriptor is 
> not serializable:
> {code}
> SerializableCoder coder = SerializableCoder.of(new TypeDescriptor(){});
> CoderProperties.ensureSerializable(coder);
> {code}
> This is a regression relative to 0.6.0, where the type descriptor was never 
> serialized.
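A fix along these lines appears later in this digest (commit 34cbb090, which marks SerializableCoder's TypeDescriptor field transient and rebuilds it from the Class on demand). A simplified sketch of that pattern, with a hypothetical class name standing in for SerializableCoder:

{code}
import java.io.Serializable;
import org.apache.beam.sdk.values.TypeDescriptor;

class TransientDescriptorSketch<T> implements Serializable {
  private final Class<T> type;                        // serializable, always transmitted
  private transient TypeDescriptor<T> typeDescriptor; // may wrap a non-serializable type token

  TransientDescriptorSketch(Class<T> type, TypeDescriptor<T> typeDescriptor) {
    this.type = type;
    this.typeDescriptor = typeDescriptor;
  }

  TypeDescriptor<T> getEncodedTypeDescriptor() {
    if (typeDescriptor == null) {          // null after a serialization round trip
      typeDescriptor = TypeDescriptor.of(type);
    }
    return typeDescriptor;
  }
}
{code}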



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (BEAM-2275) SerializableCoder fails to serialize when used with a generic type token

2017-05-11 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-2275:

Attachment: SerTest.java
stacktrace.txt

I tried the user's supplied code before and after the fix and can confirm that 
the user's issue is resolved.

> SerializableCoder fails to serialize when used with a generic type token
> 
>
> Key: BEAM-2275
> URL: https://issues.apache.org/jira/browse/BEAM-2275
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.0.0
>Reporter: Luke Cwik
>Assignee: Thomas Groh
>Priority: Blocker
> Fix For: 2.0.0
>
> Attachments: SerTest.java, stacktrace.txt
>
>
> The following code throws an exception reporting that the type descriptor is 
> not serializable:
> {code}
> SerializableCoder coder = SerializableCoder.of(new TypeDescriptor(){});
> CoderProperties.ensureSerializable(coder);
> {code}
> This is a regression relative to 0.6.0, where the type descriptor was never 
> serialized.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (BEAM-2265) Python word count with DirectRunner gets stuck during application termination on Windows

2017-05-11 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-2265:

Attachment: SerTest.java
stacktrace.txt

I attempted the user's test before and after the fix and can confirm that fixing 
the SerializableCoder resolved the issue.

> Python word count with DirectRunner gets stuck during application termination 
> on Windows
> 
>
> Key: BEAM-2265
> URL: https://issues.apache.org/jira/browse/BEAM-2265
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Affects Versions: 2.0.0
>Reporter: Luke Cwik
>Assignee: Ahmet Altay
>Priority: Minor
>
> Using virtualenv 15 + python 2.7.13 + pip 9.0.1 on Windows 2016
> Example logs from DirectRunner:
> {code}
> (beamRC2)PS C:\Users\lcwik\.virtualenvs\beamRC2> python -m 
> apache_beam.examples.wordcount --input ".\input\*" --output l
> ocal_counts
> No handlers could be found for logger "oauth2client.contrib.multistore_file"
> INFO:root:Missing pipeline option (runner). Executing pipeline using the 
> default runner: DirectRunner.
> INFO:root:Running pipeline with DirectRunner.
> {code}
> Application gets stuck here, pressing ctrl-z gets it unstuck and the 
> remainder below is logged
> {code}
> INFO:root:Starting finalize_write threads with num_shards: 1, batches: 1, 
> num_threads: 1
> INFO:root:Renamed 1 shards in 0.14 seconds.
> INFO:root:number of empty lines: 47851
> INFO:root:average word length: 4
> {code}
> Output is correct, so it seems as though the bug is somewhere in shutdown.
> Happens when using a local or gs path with the DirectRunner. Enabling DEBUG 
> logging did not add any additional details.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Issue Comment Deleted] (BEAM-2265) Python word count with DirectRunner gets stuck during application termination on Windows

2017-05-11 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-2265:

Comment: was deleted

(was: I attempted the user's test before and after the fix and can confirm that 
fixing the SerializableCoder resolved the issue.)

> Python word count with DirectRunner gets stuck during application termination 
> on Windows
> 
>
> Key: BEAM-2265
> URL: https://issues.apache.org/jira/browse/BEAM-2265
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Affects Versions: 2.0.0
>Reporter: Luke Cwik
>Assignee: Ahmet Altay
>Priority: Minor
>
> Using virtualenv 15 + python 2.7.13 + pip 9.0.1 on Windows 2016
> Example logs from DirectRunner:
> {code}
> (beamRC2)PS C:\Users\lcwik\.virtualenvs\beamRC2> python -m 
> apache_beam.examples.wordcount --input ".\input\*" --output l
> ocal_counts
> No handlers could be found for logger "oauth2client.contrib.multistore_file"
> INFO:root:Missing pipeline option (runner). Executing pipeline using the 
> default runner: DirectRunner.
> INFO:root:Running pipeline with DirectRunner.
> {code}
> Application gets stuck here, pressing ctrl-z gets it unstuck and the 
> remainder below is logged
> {code}
> INFO:root:Starting finalize_write threads with num_shards: 1, batches: 1, 
> num_threads: 1
> INFO:root:Renamed 1 shards in 0.14 seconds.
> INFO:root:number of empty lines: 47851
> INFO:root:average word length: 4
> {code}
> Output is correct, so it seems as though the bug is somewhere in shutdown.
> Happens when using a local or gs path with the DirectRunner. Enabling DEBUG 
> logging did not add any additional details.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (BEAM-2265) Python word count with DirectRunner gets stuck during application termination on Windows

2017-05-11 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-2265:

Attachment: (was: SerTest.java)

> Python word count with DirectRunner gets stuck during application termination 
> on Windows
> 
>
> Key: BEAM-2265
> URL: https://issues.apache.org/jira/browse/BEAM-2265
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Affects Versions: 2.0.0
>Reporter: Luke Cwik
>Assignee: Ahmet Altay
>Priority: Minor
>
> Using virtualenv 15 + python 2.7.13 + pip 9.0.1 on Windows 2016
> Example logs from DirectRunner:
> {code}
> (beamRC2)PS C:\Users\lcwik\.virtualenvs\beamRC2> python -m 
> apache_beam.examples.wordcount --input ".\input\*" --output l
> ocal_counts
> No handlers could be found for logger "oauth2client.contrib.multistore_file"
> INFO:root:Missing pipeline option (runner). Executing pipeline using the 
> default runner: DirectRunner.
> INFO:root:Running pipeline with DirectRunner.
> {code}
> Application gets stuck here, pressing ctrl-z gets it unstuck and the 
> remainder below is logged
> {code}
> INFO:root:Starting finalize_write threads with num_shards: 1, batches: 1, 
> num_threads: 1
> INFO:root:Renamed 1 shards in 0.14 seconds.
> INFO:root:number of empty lines: 47851
> INFO:root:average word length: 4
> {code}
> Output is correct, so it seems as though the bug is somewhere in shutdown.
> Happens when using a local or gs path with the DirectRunner. Enabling DEBUG 
> logging did not add any additional details.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-2268) NullPointerException in com.datatorrent.netlet.util.Slice via ApexStateInternals

2017-05-11 Thread Thomas Weise (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16007473#comment-16007473
 ] 

Thomas Weise commented on BEAM-2268:


There was an email thread about it a while ago. Please remove the netlet 
dependency from your local maven repo and try again.

> NullPointerException in com.datatorrent.netlet.util.Slice via 
> ApexStateInternals
> 
>
> Key: BEAM-2268
> URL: https://issues.apache.org/jira/browse/BEAM-2268
> Project: Beam
>  Issue Type: Bug
>  Components: runner-apex
>Reporter: Kenneth Knowles
>Assignee: Thomas Weise
>
> I've browsed the code, and this is a certain NPE in netlet 1.2.1 but our 
> dependencies are on 1.3.0.
> I have read over the dependency tree (both our project and generated 
> archetype) and deliberately added even tighter deps on 1.3.0. I have not 
> managed to force any dependency on 1.2.1, yet maven clearly logs its download 
> of 1.2.1 and it appears to be running against it.
> {code}
> java.lang.NullPointerException
>   at com.datatorrent.netlet.util.Slice.<init>(Slice.java:54)
>   at 
> org.apache.beam.runners.apex.translation.utils.ApexStateInternals$ApexStateInternalsFactory.stateInternalsForKey(ApexStateInternals.java:445)
>   at 
> org.apache.beam.runners.apex.translation.utils.ApexStateInternals$ApexStateInternalsFactory.stateInternalsForKey(ApexStateInternals.java:427)
> {code}
> Jenkins obviously has a different configuration, so that is a place to look 
> next, perhaps.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #3112: Rename filesink to filebasedsink

2017-05-11 Thread sb2nov
Github user sb2nov closed the pull request at:

https://github.com/apache/beam/pull/3112


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] beam git commit: This closes #3112

2017-05-11 Thread altay
This closes #3112


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/905ebccf
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/905ebccf
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/905ebccf

Branch: refs/heads/release-2.0.0
Commit: 905ebccf257f74ce73b4af05611492ef3ee7f3fc
Parents: da2476d 717ab8c
Author: Ahmet Altay 
Authored: Thu May 11 18:20:06 2017 -0700
Committer: Ahmet Altay 
Committed: Thu May 11 18:20:06 2017 -0700

--
 sdks/python/apache_beam/io/__init__.py  |   2 +-
 sdks/python/apache_beam/io/avroio.py|   4 +-
 sdks/python/apache_beam/io/filebasedsink.py | 299 ++
 .../python/apache_beam/io/filebasedsink_test.py | 303 ++
 sdks/python/apache_beam/io/fileio.py| 304 ---
 sdks/python/apache_beam/io/fileio_test.py   | 303 --
 sdks/python/apache_beam/io/gcp/gcsio.py |   6 +-
 sdks/python/apache_beam/io/iobase.py|  12 +-
 sdks/python/apache_beam/io/textio.py|   4 +-
 sdks/python/apache_beam/io/tfrecordio.py|   6 +-
 .../apache_beam/testing/pipeline_verifiers.py   |   4 +-
 11 files changed, 623 insertions(+), 624 deletions(-)
--




[1/2] beam git commit: CP #3111 Rename filesink to filebasedsink

2017-05-11 Thread altay
Repository: beam
Updated Branches:
  refs/heads/release-2.0.0 da2476d0b -> 905ebccf2


CP #3111 Rename filesink to filebasedsink


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/717ab8c1
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/717ab8c1
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/717ab8c1

Branch: refs/heads/release-2.0.0
Commit: 717ab8c14cb8c166168c812d4bd16e603831af47
Parents: da2476d
Author: Sourabh Bajaj 
Authored: Thu May 11 17:23:40 2017 -0700
Committer: Ahmet Altay 
Committed: Thu May 11 18:19:55 2017 -0700

--
 sdks/python/apache_beam/io/__init__.py  |   2 +-
 sdks/python/apache_beam/io/avroio.py|   4 +-
 sdks/python/apache_beam/io/filebasedsink.py | 299 ++
 .../python/apache_beam/io/filebasedsink_test.py | 303 ++
 sdks/python/apache_beam/io/fileio.py| 304 ---
 sdks/python/apache_beam/io/fileio_test.py   | 303 --
 sdks/python/apache_beam/io/gcp/gcsio.py |   6 +-
 sdks/python/apache_beam/io/iobase.py|  12 +-
 sdks/python/apache_beam/io/textio.py|   4 +-
 sdks/python/apache_beam/io/tfrecordio.py|   6 +-
 .../apache_beam/testing/pipeline_verifiers.py   |   4 +-
 11 files changed, 623 insertions(+), 624 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/717ab8c1/sdks/python/apache_beam/io/__init__.py
--
diff --git a/sdks/python/apache_beam/io/__init__.py 
b/sdks/python/apache_beam/io/__init__.py
index 881ce68..6ea0efd 100644
--- a/sdks/python/apache_beam/io/__init__.py
+++ b/sdks/python/apache_beam/io/__init__.py
@@ -19,7 +19,7 @@
 
 # pylint: disable=wildcard-import
 from apache_beam.io.avroio import *
-from apache_beam.io.fileio import *
+from apache_beam.io.filebasedsink import *
 from apache_beam.io.iobase import Read
 from apache_beam.io.iobase import Sink
 from apache_beam.io.iobase import Write

http://git-wip-us.apache.org/repos/asf/beam/blob/717ab8c1/sdks/python/apache_beam/io/avroio.py
--
diff --git a/sdks/python/apache_beam/io/avroio.py 
b/sdks/python/apache_beam/io/avroio.py
index 1c08c68..e02e1f7 100644
--- a/sdks/python/apache_beam/io/avroio.py
+++ b/sdks/python/apache_beam/io/avroio.py
@@ -27,7 +27,7 @@ from avro import schema
 
 import apache_beam as beam
 from apache_beam.io import filebasedsource
-from apache_beam.io import fileio
+from apache_beam.io import filebasedsink
 from apache_beam.io import iobase
 from apache_beam.io.filesystem import CompressionTypes
 from apache_beam.io.iobase import Read
@@ -335,7 +335,7 @@ class WriteToAvro(beam.transforms.PTransform):
 return {'sink_dd': self._sink}
 
 
-class _AvroSink(fileio.FileSink):
+class _AvroSink(filebasedsink.FileBasedSink):
   """A sink to avro files."""
 
   def __init__(self,

http://git-wip-us.apache.org/repos/asf/beam/blob/717ab8c1/sdks/python/apache_beam/io/filebasedsink.py
--
diff --git a/sdks/python/apache_beam/io/filebasedsink.py 
b/sdks/python/apache_beam/io/filebasedsink.py
new file mode 100644
index 000..76c09fc
--- /dev/null
+++ b/sdks/python/apache_beam/io/filebasedsink.py
@@ -0,0 +1,299 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""File-based sink."""
+
+from __future__ import absolute_import
+
+import logging
+import os
+import re
+import time
+import uuid
+
+from apache_beam.internal import util
+from apache_beam.io import iobase
+from apache_beam.io.filesystem import BeamIOError
+from apache_beam.io.filesystem import CompressionTypes
+from apache_beam.io.filesystems import FileSystems
+from apache_beam.transforms.display import DisplayDataItem
+from apache_beam.options.value_provider import ValueProvider
+from apache_beam.options.value_provider import StaticValueProvider
+from apache_beam.options.value_provider import 

[1/2] beam git commit: Enable SerializableCoder to Serialize with Generic Types

2017-05-11 Thread lcwik
Repository: beam
Updated Branches:
  refs/heads/release-2.0.0 9582a1481 -> da2476d0b


Enable SerializableCoder to Serialize with Generic Types

A TypeToken that contains generics is not serializable. However, the
TypeDescriptor does not need to be transmitted via the serialized form,
so mark it as transient.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/34cbb090
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/34cbb090
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/34cbb090

Branch: refs/heads/release-2.0.0
Commit: 34cbb090077a7a9599c6e5e69acbfb145427a444
Parents: 9582a14
Author: Thomas Groh 
Authored: Thu May 11 17:16:53 2017 -0700
Committer: Luke Cwik 
Committed: Thu May 11 18:13:38 2017 -0700

--
 .../src/main/resources/beam/findbugs-filter.xml   | 10 ++
 .../org/apache/beam/sdk/coders/SerializableCoder.java |  5 -
 .../org/apache/beam/sdk/coders/SerializableCoderTest.java |  8 +++-
 3 files changed, 21 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/34cbb090/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml
--
diff --git a/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml 
b/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml
index 1db0e86..8ff0cb0 100644
--- a/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml
+++ b/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml
@@ -74,6 +74,16 @@
   
 
   
+
+
+
+
+  
+
+  
 
 
 

http://git-wip-us.apache.org/repos/asf/beam/blob/34cbb090/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/SerializableCoder.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/SerializableCoder.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/SerializableCoder.java
index 9aa8493..6691876 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/SerializableCoder.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/SerializableCoder.java
@@ -107,7 +107,7 @@ public class SerializableCoder 
extends CustomCoder {
   }
 
   private final Class type;
-  private final TypeDescriptor typeDescriptor;
+  private transient TypeDescriptor typeDescriptor;
 
   protected SerializableCoder(Class type, TypeDescriptor typeDescriptor) 
{
 this.type = type;
@@ -166,6 +166,9 @@ public class SerializableCoder 
extends CustomCoder {
 
   @Override
   public TypeDescriptor getEncodedTypeDescriptor() {
+if (typeDescriptor == null) {
+  typeDescriptor = TypeDescriptor.of(type);
+}
 return typeDescriptor;
   }
 

http://git-wip-us.apache.org/repos/asf/beam/blob/34cbb090/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/SerializableCoderTest.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/SerializableCoderTest.java
 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/SerializableCoderTest.java
index adb6652..dd4f6ca 100644
--- 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/SerializableCoderTest.java
+++ 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/SerializableCoderTest.java
@@ -125,12 +125,18 @@ public class SerializableCoderTest implements 
Serializable {
 assertEquals(coder.getRecordType(), MyRecord.class);
 CoderProperties.coderSerializable(coder);
 
-
 SerializableCoder decoded = SerializableUtils.clone(coder);
 assertThat(decoded.getRecordType(), 
Matchers.equalTo(MyRecord.class));
   }
 
   @Test
+  public  void 
testSerializableCoderIsSerializableWithGenericTypeToken()
+  throws Exception {
+SerializableCoder coder = SerializableCoder.of(new TypeDescriptor() 
{});
+CoderProperties.coderSerializable(coder);
+  }
+
+  @Test
   public void testNullEquals() {
 SerializableCoder coder = SerializableCoder.of(MyRecord.class);
 Assert.assertFalse(coder.equals(null));



[2/2] beam git commit: Cherry-pick #3109 into release-2.0.0

2017-05-11 Thread lcwik
Cherry-pick #3109 into release-2.0.0

This closes #3110


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/da2476d0
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/da2476d0
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/da2476d0

Branch: refs/heads/release-2.0.0
Commit: da2476d0b02867d4c018123dbe06aa9be436b9c3
Parents: 9582a14 34cbb09
Author: Luke Cwik 
Authored: Thu May 11 18:14:09 2017 -0700
Committer: Luke Cwik 
Committed: Thu May 11 18:14:09 2017 -0700

--
 .../src/main/resources/beam/findbugs-filter.xml   | 10 ++
 .../org/apache/beam/sdk/coders/SerializableCoder.java |  5 -
 .../org/apache/beam/sdk/coders/SerializableCoderTest.java |  8 +++-
 3 files changed, 21 insertions(+), 2 deletions(-)
--




Build failed in Jenkins: beam_PostCommit_Python_Verify #2202

2017-05-11 Thread Apache Jenkins Server
See 


Changes:

[sourabhbajaj] Remove unused test data

--
[...truncated 559.21 KB...]
  File "/usr/lib/python2.7/pickle.py", line 224, in dump
self.save(obj)
  File "/usr/lib/python2.7/pickle.py", line 331, in save
self.save_reduce(obj=obj, *rv)
  File "/usr/lib/python2.7/pickle.py", line 419, in save_reduce
save(state)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File 
"
 line 163, in new_save_module_dict
return old_save_module_dict(pickler, obj)
  File 
"
 line 841, in save_module_dict
StockPickler.save_dict(pickler, obj)
  File "/usr/lib/python2.7/pickle.py", line 649, in save_dict
self._batch_setitems(obj.iteritems())
  File "/usr/lib/python2.7/pickle.py", line 681, in _batch_setitems
save(v)
  File "/usr/lib/python2.7/pickle.py", line 331, in save
self.save_reduce(obj=obj, *rv)
  File "/usr/lib/python2.7/pickle.py", line 419, in save_reduce
save(state)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File 
"
 line 163, in new_save_module_dict
return old_save_module_dict(pickler, obj)
  File 
"
 line 841, in save_module_dict
StockPickler.save_dict(pickler, obj)
  File "/usr/lib/python2.7/pickle.py", line 649, in save_dict
self._batch_setitems(obj.iteritems())
  File "/usr/lib/python2.7/pickle.py", line 681, in _batch_setitems
save(v)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.7/pickle.py", line 600, in save_list
self._batch_appends(iter(obj))
  File "/usr/lib/python2.7/pickle.py", line 636, in _batch_appends
save(tmp[0])
  File "/usr/lib/python2.7/pickle.py", line 331, in save
self.save_reduce(obj=obj, *rv)
  File "/usr/lib/python2.7/pickle.py", line 419, in save_reduce
save(state)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File 
"
 line 163, in new_save_module_dict
return old_save_module_dict(pickler, obj)
  File 
"
 line 841, in save_module_dict
StockPickler.save_dict(pickler, obj)
  File "/usr/lib/python2.7/pickle.py", line 649, in save_dict
self._batch_setitems(obj.iteritems())
  File "/usr/lib/python2.7/pickle.py", line 681, in _batch_setitems
save(v)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.7/pickle.py", line 600, in save_list
self._batch_appends(iter(obj))
  File "/usr/lib/python2.7/pickle.py", line 633, in _batch_appends
save(x)
  File "/usr/lib/python2.7/pickle.py", line 331, in save
self.save_reduce(obj=obj, *rv)
  File "/usr/lib/python2.7/pickle.py", line 419, in save_reduce
save(state)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File 
"
 line 163, in new_save_module_dict
return old_save_module_dict(pickler, obj)
  File 
"
 line 841, in save_module_dict
StockPickler.save_dict(pickler, obj)
  File "/usr/lib/python2.7/pickle.py", line 649, in save_dict
self._batch_setitems(obj.iteritems())
  File "/usr/lib/python2.7/pickle.py", line 680, in _batch_setitems
save(k)
  File "/usr/lib/python2.7/pickle.py", line 269, in save
def save(self, obj):
  File 
"
 line 276, in signalhandler
raise TimedOutException()
TimedOutException: 'test_iterable_side_input 
(apache_beam.transforms.sideinputs_test.SideInputsTest)'

==
ERROR: test_multi_valued_singleton_side_input 
(apache_beam.transforms.sideinputs_test.SideInputsTest)

[jira] [Commented] (BEAM-2265) Python word count with DirectRunner gets stuck during application termination on Windows

2017-05-11 Thread Ahmet Altay (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16007461#comment-16007461
 ] 

Ahmet Altay commented on BEAM-2265:
---

My summary of the last 2 comments is: the DirectRunner on Windows gets stuck at 
the end for inputs that are not small. If that is the case, in my opinion it does 
not need to be a blocker for the 2.0.0 release.

> Python word count with DirectRunner gets stuck during application termination 
> on Windows
> 
>
> Key: BEAM-2265
> URL: https://issues.apache.org/jira/browse/BEAM-2265
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Affects Versions: 2.0.0
>Reporter: Luke Cwik
>Assignee: Ahmet Altay
>Priority: Minor
>
> Using virtualenv 15 + python 2.7.13 + pip 9.0.1 on Windows 2016
> Example logs from DirectRunner:
> {code}
> (beamRC2)PS C:\Users\lcwik\.virtualenvs\beamRC2> python -m 
> apache_beam.examples.wordcount --input ".\input\*" --output l
> ocal_counts
> No handlers could be found for logger "oauth2client.contrib.multistore_file"
> INFO:root:Missing pipeline option (runner). Executing pipeline using the 
> default runner: DirectRunner.
> INFO:root:Running pipeline with DirectRunner.
> {code}
> Application gets stuck here, pressing ctrl-z gets it unstuck and the 
> remainder below is logged
> {code}
> INFO:root:Starting finalize_write threads with num_shards: 1, batches: 1, 
> num_threads: 1
> INFO:root:Renamed 1 shards in 0.14 seconds.
> INFO:root:number of empty lines: 47851
> INFO:root:average word length: 4
> {code}
> Output is correct, so it seems as though the bug is somewhere in shutdown.
> Happens when using a local or gs path with the DirectRunner. Enabling DEBUG 
> logging did not add any additional details.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (BEAM-2265) Python word count gets stuck during application termination on Windows

2017-05-11 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-2265:

Affects Version/s: 2.0.0
 Priority: Minor  (was: Major)

> Python word count gets stuck during application termination on Windows
> --
>
> Key: BEAM-2265
> URL: https://issues.apache.org/jira/browse/BEAM-2265
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Affects Versions: 2.0.0
>Reporter: Luke Cwik
>Assignee: Ahmet Altay
>Priority: Minor
>
> Using virtualenv 15 + python 2.7.13 + pip 9.0.1 on Windows 2016
> Example logs from DirectRunner:
> {code}
> (beamRC2)PS C:\Users\lcwik\.virtualenvs\beamRC2> python -m 
> apache_beam.examples.wordcount --input ".\input\*" --output l
> ocal_counts
> No handlers could be found for logger "oauth2client.contrib.multistore_file"
> INFO:root:Missing pipeline option (runner). Executing pipeline using the 
> default runner: DirectRunner.
> INFO:root:Running pipeline with DirectRunner.
> {code}
> Application gets stuck here, pressing ctrl-z gets it unstuck and the 
> remainder below is logged
> {code}
> INFO:root:Starting finalize_write threads with num_shards: 1, batches: 1, 
> num_threads: 1
> INFO:root:Renamed 1 shards in 0.14 seconds.
> INFO:root:number of empty lines: 47851
> INFO:root:average word length: 4
> {code}
> Output is correct, so it seems as though the bug is somewhere in shutdown.
> Happens when using a local or gs path with the DirectRunner. Enabling DEBUG 
> logging did not add any additional details.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (BEAM-2265) Python word count with DirectRunner gets stuck during application termination on Windows

2017-05-11 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-2265:

Summary: Python word count with DirectRunner gets stuck during application 
termination on Windows  (was: Python word count gets stuck during application 
termination on Windows)

> Python word count with DirectRunner gets stuck during application termination 
> on Windows
> 
>
> Key: BEAM-2265
> URL: https://issues.apache.org/jira/browse/BEAM-2265
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Affects Versions: 2.0.0
>Reporter: Luke Cwik
>Assignee: Ahmet Altay
>Priority: Minor
>
> Using virtualenv 15 + python 2.7.13 + pip 9.0.1 on Windows 2016
> Example logs from DirectRunner:
> {code}
> (beamRC2)PS C:\Users\lcwik\.virtualenvs\beamRC2> python -m 
> apache_beam.examples.wordcount --input ".\input\*" --output l
> ocal_counts
> No handlers could be found for logger "oauth2client.contrib.multistore_file"
> INFO:root:Missing pipeline option (runner). Executing pipeline using the 
> default runner: DirectRunner.
> INFO:root:Running pipeline with DirectRunner.
> {code}
> Application gets stuck here, pressing ctrl-z gets it unstuck and the 
> remainder below is logged
> {code}
> INFO:root:Starting finalize_write threads with num_shards: 1, batches: 1, 
> num_threads: 1
> INFO:root:Renamed 1 shards in 0.14 seconds.
> INFO:root:number of empty lines: 47851
> INFO:root:average word length: 4
> {code}
> Output is correct, so it seems as though the bug is somewhere in shutdown.
> Happens when using a local or gs path with the DirectRunner. Enabling DEBUG 
> logging did not add any additional details.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (BEAM-2265) Python word count gets stuck during application termination on Windows

2017-05-11 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-2265:

Description: 
Using virtualenv 15 + python 2.7.13 + pip 9.0.1 on Windows 2016

Example logs from DirectRunner:
{code}
(beamRC2)PS C:\Users\lcwik\.virtualenvs\beamRC2> python -m 
apache_beam.examples.wordcount --input ".\input\*" --output l
ocal_counts
No handlers could be found for logger "oauth2client.contrib.multistore_file"
INFO:root:Missing pipeline option (runner). Executing pipeline using the 
default runner: DirectRunner.
INFO:root:Running pipeline with DirectRunner.
{code}

Application gets stuck here, pressing ctrl-z gets it unstuck and the remainder 
below is logged
{code}
INFO:root:Starting finalize_write threads with num_shards: 1, batches: 1, 
num_threads: 1
INFO:root:Renamed 1 shards in 0.14 seconds.
INFO:root:number of empty lines: 47851
INFO:root:average word length: 4
{code}

Output is correct, so it seems as though the bug is somewhere in shutdown.
Happens when using a local or gs path with the DirectRunner. Enabling DEBUG 
logging did not add any additional details.

  was:
Using virtualenv 15 + python 2.7.13 + pip 9.0.1 on Windows 2016

Example logs from DirectRunner:
{code}
(beamRC2)PS C:\Users\lcwik\.virtualenvs\beamRC2> python -m 
apache_beam.examples.wordcount --input ".\input\*" --output l
ocal_counts
No handlers could be found for logger "oauth2client.contrib.multistore_file"
INFO:root:Missing pipeline option (runner). Executing pipeline using the 
default runner: DirectRunner.
INFO:root:Running pipeline with DirectRunner.
{code}

Application gets stuck here, pressing ctrl-z gets it unstuck and the remainder 
below is logged
{code}
INFO:root:Starting finalize_write threads with num_shards: 1, batches: 1, 
num_threads: 1
INFO:root:Renamed 1 shards in 0.14 seconds.
INFO:root:number of empty lines: 47851
INFO:root:average word length: 4
{code}

Output is correct, so it seems as though the bug is somewhere in shutdown.
Happens when using a local or gs path with the DirectRunner or using 
DataflowRunner. Enabling DEBUG logging did not add any additional details.


> Python word count gets stuck during application termination on Windows
> --
>
> Key: BEAM-2265
> URL: https://issues.apache.org/jira/browse/BEAM-2265
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Luke Cwik
>Assignee: Ahmet Altay
>
> Using virtualenv 15 + python 2.7.13 + pip 9.0.1 on Windows 2016
> Example logs from DirectRunner:
> {code}
> (beamRC2)PS C:\Users\lcwik\.virtualenvs\beamRC2> python -m 
> apache_beam.examples.wordcount --input ".\input\*" --output l
> ocal_counts
> No handlers could be found for logger "oauth2client.contrib.multistore_file"
> INFO:root:Missing pipeline option (runner). Executing pipeline using the 
> default runner: DirectRunner.
> INFO:root:Running pipeline with DirectRunner.
> {code}
> Application gets stuck here, pressing ctrl-z gets it unstuck and the 
> remainder below is logged
> {code}
> INFO:root:Starting finalize_write threads with num_shards: 1, batches: 1, 
> num_threads: 1
> INFO:root:Renamed 1 shards in 0.14 seconds.
> INFO:root:number of empty lines: 47851
> INFO:root:average word length: 4
> {code}
> Output is correct, so it seems as though the bug is somewhere in shutdown.
> Happens when using a local or gs path with the DirectRunner. Enabling DEBUG 
> logging did not add any additional details.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-2265) Python word count gets stuck during application termination on Windows

2017-05-11 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16007460#comment-16007460
 ] 

Luke Cwik commented on BEAM-2265:
-

It turns out I tried 3 more times and was unable to reproduce this with the 
DataflowRunner. With the DirectRunner I can reproduce it reliably, though.

> Python word count gets stuck during application termination on Windows
> --
>
> Key: BEAM-2265
> URL: https://issues.apache.org/jira/browse/BEAM-2265
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Luke Cwik
>Assignee: Ahmet Altay
>
> Using virtualenv 15 + python 2.7.13 + pip 9.0.1 on Windows 2016
> Example logs from DirectRunner:
> {code}
> (beamRC2)PS C:\Users\lcwik\.virtualenvs\beamRC2> python -m 
> apache_beam.examples.wordcount --input ".\input\*" --output l
> ocal_counts
> No handlers could be found for logger "oauth2client.contrib.multistore_file"
> INFO:root:Missing pipeline option (runner). Executing pipeline using the 
> default runner: DirectRunner.
> INFO:root:Running pipeline with DirectRunner.
> {code}
> Application gets stuck here, pressing ctrl-z gets it unstuck and the 
> remainder below is logged
> {code}
> INFO:root:Starting finalize_write threads with num_shards: 1, batches: 1, 
> num_threads: 1
> INFO:root:Renamed 1 shards in 0.14 seconds.
> INFO:root:number of empty lines: 47851
> INFO:root:average word length: 4
> {code}
> Output is correct, so it seems as though the bug is somewhere in shutdown.
> Happens when using a local or gs path with the DirectRunner. Enabling DEBUG 
> logging did not add any additional details.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #3112: Rename filesink to filebasedsink

2017-05-11 Thread sb2nov
GitHub user sb2nov opened a pull request:

https://github.com/apache/beam/pull/3112

Rename filesink to filebasedsink

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---

R: @aaltay PTAL

cc: @dhalperi

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sb2nov/beam BEAM-3111

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3112.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3112


commit 2fd1403cc31cc8fd80b8914ae64efc29e5d256ce
Author: Sourabh Bajaj 
Date:   2017-05-12T00:23:40Z

Rename filesink to filebasedsink




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] beam pull request #3100: Cherry pick #3093: [BEAM-2256] Add the last previou...

2017-05-11 Thread lukecwik
Github user lukecwik closed the pull request at:

https://github.com/apache/beam/pull/3100


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] beam git commit: This closes #3111

2017-05-11 Thread altay
This closes #3111


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/36a6cd69
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/36a6cd69
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/36a6cd69

Branch: refs/heads/master
Commit: 36a6cd69aa01b7e122e6e84ff1218adc7ba059c9
Parents: 652fcb7 2f17782
Author: Ahmet Altay 
Authored: Thu May 11 18:04:24 2017 -0700
Committer: Ahmet Altay 
Committed: Thu May 11 18:04:24 2017 -0700

--
 sdks/python/apache_beam/io/__init__.py  |   2 +-
 sdks/python/apache_beam/io/avroio.py|   4 +-
 sdks/python/apache_beam/io/filebasedsink.py | 299 ++
 .../python/apache_beam/io/filebasedsink_test.py | 303 ++
 sdks/python/apache_beam/io/fileio.py| 304 ---
 sdks/python/apache_beam/io/fileio_test.py   | 303 --
 sdks/python/apache_beam/io/gcp/gcsio.py |   6 +-
 sdks/python/apache_beam/io/iobase.py|  12 +-
 sdks/python/apache_beam/io/textio.py|   4 +-
 sdks/python/apache_beam/io/tfrecordio.py|   6 +-
 .../apache_beam/testing/pipeline_verifiers.py   |   4 +-
 11 files changed, 623 insertions(+), 624 deletions(-)
--




[GitHub] beam pull request #3111: Beam rename FileSink to FileBasedSink

2017-05-11 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3111


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] beam git commit: Rename filesink to filebasedsink

2017-05-11 Thread altay
Repository: beam
Updated Branches:
  refs/heads/master 652fcb7e9 -> 36a6cd69a


Rename filesink to filebasedsink


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/2f177820
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/2f177820
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/2f177820

Branch: refs/heads/master
Commit: 2f177820c4e72fe92dc47d595847ddecdae7afcd
Parents: 652fcb7
Author: Sourabh Bajaj 
Authored: Thu May 11 17:23:40 2017 -0700
Committer: Ahmet Altay 
Committed: Thu May 11 18:04:18 2017 -0700

--
 sdks/python/apache_beam/io/__init__.py  |   2 +-
 sdks/python/apache_beam/io/avroio.py|   4 +-
 sdks/python/apache_beam/io/filebasedsink.py | 299 ++
 .../python/apache_beam/io/filebasedsink_test.py | 303 ++
 sdks/python/apache_beam/io/fileio.py| 304 ---
 sdks/python/apache_beam/io/fileio_test.py   | 303 --
 sdks/python/apache_beam/io/gcp/gcsio.py |   6 +-
 sdks/python/apache_beam/io/iobase.py|  12 +-
 sdks/python/apache_beam/io/textio.py|   4 +-
 sdks/python/apache_beam/io/tfrecordio.py|   6 +-
 .../apache_beam/testing/pipeline_verifiers.py   |   4 +-
 11 files changed, 623 insertions(+), 624 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/2f177820/sdks/python/apache_beam/io/__init__.py
--
diff --git a/sdks/python/apache_beam/io/__init__.py 
b/sdks/python/apache_beam/io/__init__.py
index 881ce68..6ea0efd 100644
--- a/sdks/python/apache_beam/io/__init__.py
+++ b/sdks/python/apache_beam/io/__init__.py
@@ -19,7 +19,7 @@
 
 # pylint: disable=wildcard-import
 from apache_beam.io.avroio import *
-from apache_beam.io.fileio import *
+from apache_beam.io.filebasedsink import *
 from apache_beam.io.iobase import Read
 from apache_beam.io.iobase import Sink
 from apache_beam.io.iobase import Write

http://git-wip-us.apache.org/repos/asf/beam/blob/2f177820/sdks/python/apache_beam/io/avroio.py
--
diff --git a/sdks/python/apache_beam/io/avroio.py 
b/sdks/python/apache_beam/io/avroio.py
index 1c08c68..e02e1f7 100644
--- a/sdks/python/apache_beam/io/avroio.py
+++ b/sdks/python/apache_beam/io/avroio.py
@@ -27,7 +27,7 @@ from avro import schema
 
 import apache_beam as beam
 from apache_beam.io import filebasedsource
-from apache_beam.io import fileio
+from apache_beam.io import filebasedsink
 from apache_beam.io import iobase
 from apache_beam.io.filesystem import CompressionTypes
 from apache_beam.io.iobase import Read
@@ -335,7 +335,7 @@ class WriteToAvro(beam.transforms.PTransform):
 return {'sink_dd': self._sink}
 
 
-class _AvroSink(fileio.FileSink):
+class _AvroSink(filebasedsink.FileBasedSink):
   """A sink to avro files."""
 
   def __init__(self,

http://git-wip-us.apache.org/repos/asf/beam/blob/2f177820/sdks/python/apache_beam/io/filebasedsink.py
--
diff --git a/sdks/python/apache_beam/io/filebasedsink.py 
b/sdks/python/apache_beam/io/filebasedsink.py
new file mode 100644
index 000..76c09fc
--- /dev/null
+++ b/sdks/python/apache_beam/io/filebasedsink.py
@@ -0,0 +1,299 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""File-based sink."""
+
+from __future__ import absolute_import
+
+import logging
+import os
+import re
+import time
+import uuid
+
+from apache_beam.internal import util
+from apache_beam.io import iobase
+from apache_beam.io.filesystem import BeamIOError
+from apache_beam.io.filesystem import CompressionTypes
+from apache_beam.io.filesystems import FileSystems
+from apache_beam.transforms.display import DisplayDataItem
+from apache_beam.options.value_provider import ValueProvider
+from apache_beam.options.value_provider import StaticValueProvider
+from apache_beam.options.value_provider import check_accessible
+

[GitHub] beam pull request #3109: Enable SerializableCoder to Serialize with Generic ...

2017-05-11 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3109


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] beam git commit: This closes #3109

2017-05-11 Thread tgroh
Repository: beam
Updated Branches:
  refs/heads/master fc77ca7cb -> 652fcb7e9


This closes #3109


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/652fcb7e
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/652fcb7e
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/652fcb7e

Branch: refs/heads/master
Commit: 652fcb7e9e427b3ae44dbfcdd537298d9dd07cb5
Parents: fc77ca7 c39c9ae
Author: Thomas Groh 
Authored: Thu May 11 17:55:50 2017 -0700
Committer: Thomas Groh 
Committed: Thu May 11 17:55:50 2017 -0700

--
 .../src/main/resources/beam/findbugs-filter.xml   | 10 ++
 .../org/apache/beam/sdk/coders/SerializableCoder.java |  5 -
 .../org/apache/beam/sdk/coders/SerializableCoderTest.java |  8 +++-
 3 files changed, 21 insertions(+), 2 deletions(-)
--




[2/2] beam git commit: Enable SerializableCoder to Serialize with Generic Types

2017-05-11 Thread tgroh
Enable SerializableCoder to Serialize with Generic Types

A TypeToken that contains generics is not serializable. However, the
TypeDescriptor does not need to be transmitted via the serialized form,
so mark it as transient.
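
As a minimal sketch of what this enables (using only the SerializableCoder.of,
TypeDescriptor, and SerializableUtils.clone calls exercised in the diff and test
below; the GenericCoderSketch and MyRecord names are hypothetical):

{code}
import java.io.Serializable;
import java.util.ArrayList;

import org.apache.beam.sdk.coders.SerializableCoder;
import org.apache.beam.sdk.util.SerializableUtils;
import org.apache.beam.sdk.values.TypeDescriptor;

public class GenericCoderSketch {

  // Hypothetical element type, used only for illustration.
  static class MyRecord implements Serializable {
    String name;
  }

  public static void main(String[] args) {
    // The anonymous TypeDescriptor subclass captures a generic type token that is
    // itself not serializable.
    SerializableCoder<ArrayList<MyRecord>> coder =
        SerializableCoder.of(new TypeDescriptor<ArrayList<MyRecord>>() {});

    // Previously, Java-serializing the coder failed because the captured
    // TypeDescriptor was serialized along with it; with the field transient and
    // rebuilt lazily in getEncodedTypeDescriptor(), the round trip succeeds.
    SerializableCoder<ArrayList<MyRecord>> cloned = SerializableUtils.clone(coder);
    System.out.println(cloned.getEncodedTypeDescriptor());
  }
}
{code}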


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/c39c9aec
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/c39c9aec
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/c39c9aec

Branch: refs/heads/master
Commit: c39c9aecf859095bfde87267e7eb9d9a4fd1682d
Parents: fc77ca7
Author: Thomas Groh 
Authored: Thu May 11 17:16:53 2017 -0700
Committer: Thomas Groh 
Committed: Thu May 11 17:55:50 2017 -0700

--
 .../src/main/resources/beam/findbugs-filter.xml   | 10 ++
 .../org/apache/beam/sdk/coders/SerializableCoder.java |  5 -
 .../org/apache/beam/sdk/coders/SerializableCoderTest.java |  8 +++-
 3 files changed, 21 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/c39c9aec/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml
--
diff --git a/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml 
b/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml
index 1db0e86..8ff0cb0 100644
--- a/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml
+++ b/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml
@@ -74,6 +74,16 @@
   
 
   
+
+
+
+
+  
+
+  
 
 
 

http://git-wip-us.apache.org/repos/asf/beam/blob/c39c9aec/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/SerializableCoder.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/SerializableCoder.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/SerializableCoder.java
index 9aa8493..6691876 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/SerializableCoder.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/SerializableCoder.java
@@ -107,7 +107,7 @@ public class SerializableCoder 
extends CustomCoder {
   }
 
   private final Class type;
-  private final TypeDescriptor typeDescriptor;
+  private transient TypeDescriptor typeDescriptor;
 
   protected SerializableCoder(Class type, TypeDescriptor typeDescriptor) 
{
 this.type = type;
@@ -166,6 +166,9 @@ public class SerializableCoder 
extends CustomCoder {
 
   @Override
   public TypeDescriptor getEncodedTypeDescriptor() {
+if (typeDescriptor == null) {
+  typeDescriptor = TypeDescriptor.of(type);
+}
 return typeDescriptor;
   }
 

http://git-wip-us.apache.org/repos/asf/beam/blob/c39c9aec/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/SerializableCoderTest.java
--
diff --git 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/SerializableCoderTest.java
 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/SerializableCoderTest.java
index adb6652..dd4f6ca 100644
--- 
a/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/SerializableCoderTest.java
+++ 
b/sdks/java/core/src/test/java/org/apache/beam/sdk/coders/SerializableCoderTest.java
@@ -125,12 +125,18 @@ public class SerializableCoderTest implements 
Serializable {
 assertEquals(coder.getRecordType(), MyRecord.class);
 CoderProperties.coderSerializable(coder);
 
-
 SerializableCoder decoded = SerializableUtils.clone(coder);
 assertThat(decoded.getRecordType(), 
Matchers.equalTo(MyRecord.class));
   }
 
   @Test
+  public  void 
testSerializableCoderIsSerializableWithGenericTypeToken()
+  throws Exception {
+SerializableCoder coder = SerializableCoder.of(new TypeDescriptor() 
{});
+CoderProperties.coderSerializable(coder);
+  }
+
+  @Test
   public void testNullEquals() {
 SerializableCoder coder = SerializableCoder.of(MyRecord.class);
 Assert.assertFalse(coder.equals(null));



[jira] [Commented] (BEAM-2265) Python word count gets stuck during application termination on Windows

2017-05-11 Thread Chamikara Jayalath (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16007442#comment-16007442
 ] 

Chamikara Jayalath commented on BEAM-2265:
--

I tried running word count on Windows as well. Word count on Windows with the
DirectRunner passes for small inputs but gets stuck for the input
gs://dataflow-samples/shakespeare/. Jobs did not get stuck when using the
DataflowRunner.

> Python word count gets stuck during application termination on Windows
> --
>
> Key: BEAM-2265
> URL: https://issues.apache.org/jira/browse/BEAM-2265
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Luke Cwik
>Assignee: Ahmet Altay
>
> Using virtualenv 15 + python 2.7.13 + pip 9.0.1 on Windows 2016
> Example logs from DirectRunner:
> {code}
> (beamRC2)PS C:\Users\lcwik\.virtualenvs\beamRC2> python -m 
> apache_beam.examples.wordcount --input ".\input\*" --output local_counts
> No handlers could be found for logger "oauth2client.contrib.multistore_file"
> INFO:root:Missing pipeline option (runner). Executing pipeline using the 
> default runner: DirectRunner.
> INFO:root:Running pipeline with DirectRunner.
> {code}
> The application gets stuck here; pressing ctrl-z gets it unstuck, and the
> remainder below is logged:
> {code}
> INFO:root:Starting finalize_write threads with num_shards: 1, batches: 1, 
> num_threads: 1
> INFO:root:Renamed 1 shards in 0.14 seconds.
> INFO:root:number of empty lines: 47851
> INFO:root:average word length: 4
> {code}
> The output is correct, so it seems as though the bug is somewhere in shutdown.
> This happens when using a local or gs path with the DirectRunner or when using
> the DataflowRunner. Enabling DEBUG logging did not add any additional details.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (BEAM-2275) SerializableCoder fails to serialize when used with a generic type token

2017-05-11 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-2275:

Priority: Blocker  (was: Major)

> SerializableCoder fails to serialize when used with a generic type token
> 
>
> Key: BEAM-2275
> URL: https://issues.apache.org/jira/browse/BEAM-2275
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 2.0.0
>Reporter: Luke Cwik
>Assignee: Thomas Groh
>Priority: Blocker
>
> The following code throws an exception stating that the type descriptor is
> not serializable:
> {code}
> SerializableCoder coder = SerializableCoder.of(new TypeDescriptor(){});
> CoderProperties.ensureSerializable(coder);
> {code}
> This is a regression from 0.6.0, where the type descriptor was never
> serialized.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-2275) SerializableCoder fails to serialize when used with a generic type token

2017-05-11 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-2275:
---

 Summary: SerializableCoder fails to serialize when used with a 
generic type token
 Key: BEAM-2275
 URL: https://issues.apache.org/jira/browse/BEAM-2275
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-core
Affects Versions: 2.0.0
Reporter: Luke Cwik
Assignee: Thomas Groh


The following code throws an exception stating that the type descriptor is not
serializable:
{code}
SerializableCoder coder = SerializableCoder.of(new TypeDescriptor(){});
CoderProperties.ensureSerializable(coder);
{code}

This is a regression from 0.6.0, where the type descriptor was never serialized.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #3111: Beam rename FileSink to FileBasedSink

2017-05-11 Thread sb2nov
GitHub user sb2nov opened a pull request:

https://github.com/apache/beam/pull/3111

Beam rename FileSink to FileBasedSink

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---

R: @chamikaramj  @aaltay PTAL


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sb2nov/beam BEAM-rename-filesink

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3111.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3111


commit 82f7204bee41a644a9f6f1537764dbea49cb0f59
Author: Sourabh Bajaj 
Date:   2017-05-12T00:23:40Z

Rename filesink to filebasedsink

commit b394a2b5b9d6a98f6815c8d95c66a5426c910388
Author: Sourabh Bajaj 
Date:   2017-05-12T00:29:51Z

Rename fileio to filebasedsink




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] beam pull request #3110: Cherry-pick #3109 into release-2.0.0

2017-05-11 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/3110

Cherry-pick #3109 into release-2.0.0

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---
Enable SerializableCoder to Serialize with Generic Types

A TypeToken that contains generics is not serializable. However, the
TypeDescriptor does not need to be transmitted via the serialized form,
so mark it as transient.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam serializable_coder_generic

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3110.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3110


commit 22e87daf4c8a5fe520a3808186b9a50aadb7eebb
Author: Thomas Groh 
Date:   2017-05-12T00:16:53Z

Enable SerializableCoder to Serialize with Generic Types

A TypeToken that contains generics is not serializable. However, the
TypeDescriptor does not need to be transmitted via the serialized form,
so mark it as transient.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] beam pull request #3109: Enable SerializableCoder to Serialize with Generic ...

2017-05-11 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/3109

Enable SerializableCoder to Serialize with Generic Types

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---
A TypeToken that contains generics is not serializable. However, the
TypeDescriptor does not need to be transmitted via the serialized form,
so mark it as transient.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam serializable_coder_generic_master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3109.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3109


commit 29956f71bf8e2002f12431d3bfaa2ea0fd6e5f4f
Author: Thomas Groh 
Date:   2017-05-12T00:16:53Z

Enable SerializableCoder to Serialize with Generic Types

A TypeToken that contains generics is not serializable. However, the
TypeDescriptor does not need to be transmitted via the serialized form,
so mark it as transient.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] beam git commit: Remove unused test data

2017-05-11 Thread dhalperi
Repository: beam
Updated Branches:
  refs/heads/release-2.0.0 1cc32c65e -> 9582a1481


Remove unused test data


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/c26ed015
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/c26ed015
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/c26ed015

Branch: refs/heads/release-2.0.0
Commit: c26ed015777d5523c1eac7f183ee4a8c49c40852
Parents: 1cc32c6
Author: Sourabh Bajaj 
Authored: Thu May 11 16:41:06 2017 -0700
Committer: Ahmet Altay 
Committed: Thu May 11 17:17:17 2017 -0700

--
 sdks/python/apache_beam/testing/data/privatekey.p12 | Bin 2452 -> 0 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/c26ed015/sdks/python/apache_beam/testing/data/privatekey.p12
--
diff --git a/sdks/python/apache_beam/testing/data/privatekey.p12 
b/sdks/python/apache_beam/testing/data/privatekey.p12
deleted file mode 100644
index c369ecb..000
Binary files a/sdks/python/apache_beam/testing/data/privatekey.p12 and 
/dev/null differ



[GitHub] beam pull request #3108: Remove unused test data

2017-05-11 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3108


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] beam git commit: This closes #3108

2017-05-11 Thread dhalperi
This closes #3108


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/9582a148
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/9582a148
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/9582a148

Branch: refs/heads/release-2.0.0
Commit: 9582a1481a530345e4187eb9bd96f8aa38c2a1ae
Parents: 1cc32c6 c26ed01
Author: Dan Halperin 
Authored: Thu May 11 17:29:31 2017 -0700
Committer: Dan Halperin 
Committed: Thu May 11 17:29:31 2017 -0700

--
 sdks/python/apache_beam/testing/data/privatekey.p12 | Bin 2452 -> 0 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)
--




Build failed in Jenkins: beam_PostCommit_Python_Verify #2201

2017-05-11 Thread Apache Jenkins Server
See 


Changes:

[robertwb] [BEAM-1340] Add __all__ tags to modules in package

[altay] Add internal comments to metrics

[robertwb] [BEAM-1345] Annotate public members of pvalue.

[robertwb] [BEAM-1345] Mark apache_beam/internal as internal.

[robertwb] [BEAM-1345] Clearly delineate public api in apache_beam/typehints.

[robertwb] Move assert_that, equal_to, is_empty to apache_beam.testing.util

[altay] Remove some internal details from the public API.

[dhalperi] Don't deploy jdk1.8-tests module

[robertwb] Fix due to GBKO name change.

--
[...truncated 579.50 KB...]
save(v)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.7/pickle.py", line 600, in save_list
self._batch_appends(iter(obj))
  File "/usr/lib/python2.7/pickle.py", line 633, in _batch_appends
save(x)
  File "/usr/lib/python2.7/pickle.py", line 331, in save
self.save_reduce(obj=obj, *rv)
  File "/usr/lib/python2.7/pickle.py", line 419, in save_reduce
save(state)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File 
"
 line 163, in new_save_module_dict
return old_save_module_dict(pickler, obj)
  File 
"
 line 841, in save_module_dict
StockPickler.save_dict(pickler, obj)
  File "/usr/lib/python2.7/pickle.py", line 649, in save_dict
self._batch_setitems(obj.iteritems())
  File "/usr/lib/python2.7/pickle.py", line 681, in _batch_setitems
save(v)
  File "/usr/lib/python2.7/pickle.py", line 331, in save
self.save_reduce(obj=obj, *rv)
  File "/usr/lib/python2.7/pickle.py", line 419, in save_reduce
save(state)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File 
"
 line 163, in new_save_module_dict
return old_save_module_dict(pickler, obj)
  File 
"
 line 841, in save_module_dict
StockPickler.save_dict(pickler, obj)
  File "/usr/lib/python2.7/pickle.py", line 649, in save_dict
self._batch_setitems(obj.iteritems())
  File "/usr/lib/python2.7/pickle.py", line 681, in _batch_setitems
save(v)
  File "/usr/lib/python2.7/pickle.py", line 331, in save
self.save_reduce(obj=obj, *rv)
  File "/usr/lib/python2.7/pickle.py", line 419, in save_reduce
save(state)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File 
"
 line 163, in new_save_module_dict
return old_save_module_dict(pickler, obj)
  File 
"
 line 841, in save_module_dict
StockPickler.save_dict(pickler, obj)
  File "/usr/lib/python2.7/pickle.py", line 649, in save_dict
self._batch_setitems(obj.iteritems())
  File "/usr/lib/python2.7/pickle.py", line 681, in _batch_setitems
save(v)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File 
"
 line 1311, in save_function
obj.__dict__), obj=obj)
  File "/usr/lib/python2.7/pickle.py", line 401, in save_reduce
save(args)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.7/pickle.py", line 562, in save_tuple
save(element)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File "/usr/lib/python2.7/pickle.py", line 548, in save_tuple
save(element)
  File "/usr/lib/python2.7/pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File 
"
 line 1057, in save_cell
pickler.save_reduce(_create_cell, (obj.cell_contents,), obj=obj)
  File "/usr/lib/python2.7/pickle.py", line 400, in save_reduce
save(func)
  File "/usr/lib/python2.7/pickle.py", line 279, in save
self.write(self.get(x[0]))
  File 

[GitHub] beam pull request #3108: Remove unused test data

2017-05-11 Thread aaltay
GitHub user aaltay opened a pull request:

https://github.com/apache/beam/pull/3108

Remove unused test data

R: @dhalperi 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aaltay/beam mcp3

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3108.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3108


commit c26ed015777d5523c1eac7f183ee4a8c49c40852
Author: Sourabh Bajaj 
Date:   2017-05-11T23:41:06Z

Remove unused test data




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (BEAM-2274) beam on spark runner run much slower than using spark

2017-05-11 Thread liyuntian (JIRA)
liyuntian created BEAM-2274:
---

 Summary: beam on spark runner run much slower than using spark
 Key: BEAM-2274
 URL: https://issues.apache.org/jira/browse/BEAM-2274
 Project: Beam
  Issue Type: Test
  Components: runner-spark
Reporter: liyuntian
Assignee: Amit Sela


I ran a job that reads HDFS files using Read.from(HDFSFileSource.from()) and
applies a few ParDo.of functions. I also ran the same job reading the HDFS files
with sc.textFile(file) and applying the equivalent RDD operations, but the Beam
job is much slower than the Spark job. Is there something Beam should improve,
or is something wrong with my system or my code? Thank you.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[17/19] beam git commit: Move assert_that, equal_to, is_empty to apache_beam.testing.util

2017-05-11 Thread altay
Move assert_that, equal_to, is_empty to apache_beam.testing.util


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/2070f118
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/2070f118
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/2070f118

Branch: refs/heads/release-2.0.0
Commit: 2070f1182d49e3b7b3e9ed8a35173cb165fa5bfb
Parents: d0da682
Author: Charles Chen 
Authored: Thu May 11 15:07:30 2017 -0700
Committer: Ahmet Altay 
Committed: Thu May 11 16:20:37 2017 -0700

--
 .../examples/complete/autocomplete_test.py  |   4 +-
 .../examples/complete/estimate_pi_test.py   |   4 +-
 .../complete/game/hourly_team_score_test.py |   4 +-
 .../examples/complete/game/user_score_test.py   |   4 +-
 .../apache_beam/examples/complete/tfidf_test.py |   4 +-
 .../complete/top_wikipedia_sessions_test.py |   4 +-
 .../cookbook/bigquery_side_input_test.py|   4 +-
 .../cookbook/bigquery_tornadoes_test.py |   6 +-
 .../examples/cookbook/coders_test.py|   4 +-
 .../examples/cookbook/combiners_test.py |   6 +-
 .../examples/cookbook/custom_ptransform_test.py |   4 +-
 .../examples/cookbook/filters_test.py   |  12 ++-
 .../examples/cookbook/mergecontacts.py  |  14 +--
 .../apache_beam/examples/snippets/snippets.py   |  17 +--
 .../examples/snippets/snippets_test.py  |  30 +++---
 .../apache_beam/examples/wordcount_debugging.py |   6 +-
 sdks/python/apache_beam/io/avroio_test.py   |   4 +-
 .../python/apache_beam/io/concat_source_test.py |   4 +-
 .../apache_beam/io/filebasedsource_test.py  |   4 +-
 sdks/python/apache_beam/io/sources_test.py  |   4 +-
 sdks/python/apache_beam/io/textio_test.py   |   5 +-
 sdks/python/apache_beam/io/tfrecordio_test.py   |  24 +++--
 sdks/python/apache_beam/pipeline_test.py|   4 +-
 .../portability/maptask_executor_runner_test.py |   6 +-
 sdks/python/apache_beam/runners/runner_test.py  |   4 +-
 sdks/python/apache_beam/testing/util.py | 107 +++
 sdks/python/apache_beam/testing/util_test.py|  50 +
 .../apache_beam/transforms/combiners_test.py|   2 +-
 .../apache_beam/transforms/create_test.py   |   3 +-
 .../apache_beam/transforms/ptransform_test.py   |   2 +-
 .../apache_beam/transforms/sideinputs_test.py   |   2 +-
 .../apache_beam/transforms/trigger_test.py  |   2 +-
 sdks/python/apache_beam/transforms/util.py  |  79 --
 sdks/python/apache_beam/transforms/util_test.py |  50 -
 .../apache_beam/transforms/window_test.py   |   2 +-
 .../transforms/write_ptransform_test.py |   2 +-
 .../typehints/typed_pipeline_test.py|   2 +-
 37 files changed, 271 insertions(+), 218 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/2070f118/sdks/python/apache_beam/examples/complete/autocomplete_test.py
--
diff --git a/sdks/python/apache_beam/examples/complete/autocomplete_test.py 
b/sdks/python/apache_beam/examples/complete/autocomplete_test.py
index 438633a..378d222 100644
--- a/sdks/python/apache_beam/examples/complete/autocomplete_test.py
+++ b/sdks/python/apache_beam/examples/complete/autocomplete_test.py
@@ -22,8 +22,8 @@ import unittest
 import apache_beam as beam
 from apache_beam.examples.complete import autocomplete
 from apache_beam.testing.test_pipeline import TestPipeline
-from apache_beam.transforms.util import assert_that
-from apache_beam.transforms.util import equal_to
+from apache_beam.testing.util import assert_that
+from apache_beam.testing.util import equal_to
 
 
 class AutocompleteTest(unittest.TestCase):

http://git-wip-us.apache.org/repos/asf/beam/blob/2070f118/sdks/python/apache_beam/examples/complete/estimate_pi_test.py
--
diff --git a/sdks/python/apache_beam/examples/complete/estimate_pi_test.py 
b/sdks/python/apache_beam/examples/complete/estimate_pi_test.py
index 12d8379..fd51309 100644
--- a/sdks/python/apache_beam/examples/complete/estimate_pi_test.py
+++ b/sdks/python/apache_beam/examples/complete/estimate_pi_test.py
@@ -22,8 +22,8 @@ import unittest
 
 from apache_beam.examples.complete import estimate_pi
 from apache_beam.testing.test_pipeline import TestPipeline
-from apache_beam.transforms.util import assert_that
-from apache_beam.transforms.util import BeamAssertException
+from apache_beam.testing.util import assert_that
+from apache_beam.testing.util import BeamAssertException
 
 
 def in_between(lower, upper):

http://git-wip-us.apache.org/repos/asf/beam/blob/2070f118/sdks/python/apache_beam/examples/complete/game/hourly_team_score_test.py

[10/19] beam git commit: [BEAM-1345] Clearly delineate public API in apache_beam/options

2017-05-11 Thread altay
[BEAM-1345] Clearly delineate public API in apache_beam/options


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/6d77f958
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/6d77f958
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/6d77f958

Branch: refs/heads/release-2.0.0
Commit: 6d77f958de666ccf5f59e907c292efbc9272b49b
Parents: 0a0cc2d
Author: Charles Chen 
Authored: Thu May 11 13:31:18 2017 -0700
Committer: Ahmet Altay 
Committed: Thu May 11 16:20:36 2017 -0700

--
 .../apache_beam/options/pipeline_options.py | 20 +---
 .../options/pipeline_options_validator.py   |  2 ++
 .../apache_beam/options/value_provider.py   |  8 
 3 files changed, 27 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/6d77f958/sdks/python/apache_beam/options/pipeline_options.py
--
diff --git a/sdks/python/apache_beam/options/pipeline_options.py 
b/sdks/python/apache_beam/options/pipeline_options.py
index b79d85d..983d128 100644
--- a/sdks/python/apache_beam/options/pipeline_options.py
+++ b/sdks/python/apache_beam/options/pipeline_options.py
@@ -25,6 +25,20 @@ from apache_beam.options.value_provider import 
RuntimeValueProvider
 from apache_beam.options.value_provider import ValueProvider
 
 
+__all__ = [
+'PipelineOptions',
+'StandardOptions',
+'TypeOptions',
+'DirectOptions',
+'GoogleCloudOptions',
+'WorkerOptions',
+'DebugOptions',
+'ProfilingOptions',
+'SetupOptions',
+'TestOptions',
+]
+
+
 def _static_value_provider_of(value_type):
   Helper function to plug a ValueProvider into argparse.
 
@@ -42,7 +56,7 @@ def _static_value_provider_of(value_type):
   return _f
 
 
-class BeamArgumentParser(argparse.ArgumentParser):
+class _BeamArgumentParser(argparse.ArgumentParser):
   """An ArgumentParser that supports ValueProvider options.
 
   Example Usage::
@@ -133,7 +147,7 @@ class PipelineOptions(HasDisplayData):
 """
 self._flags = flags
 self._all_options = kwargs
-parser = BeamArgumentParser()
+parser = _BeamArgumentParser()
 
 for cls in type(self).mro():
   if cls == PipelineOptions:
@@ -187,7 +201,7 @@ class PipelineOptions(HasDisplayData):
 # TODO(BEAM-1319): PipelineOption sub-classes in the main session might be
 # repeated. Pick last unique instance of each subclass to avoid conflicts.
 subset = {}
-parser = BeamArgumentParser()
+parser = _BeamArgumentParser()
 for cls in PipelineOptions.__subclasses__():
   subset[str(cls)] = cls
 for cls in subset.values():

http://git-wip-us.apache.org/repos/asf/beam/blob/6d77f958/sdks/python/apache_beam/options/pipeline_options_validator.py
--
diff --git a/sdks/python/apache_beam/options/pipeline_options_validator.py 
b/sdks/python/apache_beam/options/pipeline_options_validator.py
index 5c1ce2a..24d2e55 100644
--- a/sdks/python/apache_beam/options/pipeline_options_validator.py
+++ b/sdks/python/apache_beam/options/pipeline_options_validator.py
@@ -16,6 +16,8 @@
 #
 
 """Pipeline options validator.
+
+For internal use only; no backwards-compatibility guarantees.
 """
 import re
 

http://git-wip-us.apache.org/repos/asf/beam/blob/6d77f958/sdks/python/apache_beam/options/value_provider.py
--
diff --git a/sdks/python/apache_beam/options/value_provider.py 
b/sdks/python/apache_beam/options/value_provider.py
index c00d7bc..40bddba 100644
--- a/sdks/python/apache_beam/options/value_provider.py
+++ b/sdks/python/apache_beam/options/value_provider.py
@@ -24,6 +24,14 @@ from functools import wraps
 from apache_beam import error
 
 
+__all__ = [
+'ValueProvider',
+'StaticValueProvider',
+'RuntimeValueProvider',
+'check_accessible',
+]
+
+
 class ValueProvider(object):
   def is_accessible(self):
 raise NotImplementedError(



[03/19] beam git commit: [BEAM-1345] Mark Pipeline as public.

2017-05-11 Thread altay
[BEAM-1345] Mark Pipeline as public.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/0ce25430
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/0ce25430
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/0ce25430

Branch: refs/heads/release-2.0.0
Commit: 0ce25430ffec49fb2d94271e4af6225cee20388c
Parents: aeeefc1
Author: Robert Bradshaw 
Authored: Thu May 11 13:30:32 2017 -0700
Committer: Ahmet Altay 
Committed: Thu May 11 16:20:36 2017 -0700

--
 sdks/python/apache_beam/pipeline.py | 7 +++
 1 file changed, 7 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/0ce25430/sdks/python/apache_beam/pipeline.py
--
diff --git a/sdks/python/apache_beam/pipeline.py 
b/sdks/python/apache_beam/pipeline.py
index 83c7287..ec8dde4 100644
--- a/sdks/python/apache_beam/pipeline.py
+++ b/sdks/python/apache_beam/pipeline.py
@@ -67,6 +67,9 @@ from apache_beam.options.pipeline_options_validator import 
PipelineOptionsValida
 from apache_beam.utils.annotations import deprecated
 
 
+__all__ = ['Pipeline']
+
+
 class Pipeline(object):
   """A pipeline object that manages a DAG of PValues and their PTransforms.
 
@@ -182,6 +185,8 @@ class Pipeline(object):
   def visit(self, visitor):
 """Visits depth-first every node of a pipeline's DAG.
 
+Runner-internal implementation detail; no backwards-compatibility 
guarantees
+
 Args:
   visitor: PipelineVisitor object whose callbacks will be called for each
 node visited. See PipelineVisitor comments.
@@ -333,6 +338,7 @@ class Pipeline(object):
 return Visitor.ok
 
   def to_runner_api(self):
+"""For internal use only; no backwards-compatibility guarantees."""
 from apache_beam.runners import pipeline_context
 from apache_beam.runners.api import beam_runner_api_pb2
 context = pipeline_context.PipelineContext()
@@ -346,6 +352,7 @@ class Pipeline(object):
 
   @staticmethod
   def from_runner_api(proto, runner, options):
+"""For internal use only; no backwards-compatibility guarantees."""
 p = Pipeline(runner=runner, options=options)
 from apache_beam.runners import pipeline_context
 context = pipeline_context.PipelineContext(proto.components)



[13/19] beam git commit: Remove some internal details from the public API.

2017-05-11 Thread altay
Remove some internal details from the public API.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/e12cf0df
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/e12cf0df
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/e12cf0df

Branch: refs/heads/release-2.0.0
Commit: e12cf0dfd810ec26965ae98e5a2256f560c6ff6d
Parents: 2070f11
Author: Robert Bradshaw 
Authored: Wed May 10 15:55:46 2017 -0700
Committer: Ahmet Altay 
Committed: Thu May 11 16:20:37 2017 -0700

--
 sdks/python/apache_beam/pipeline.py |  4 ++--
 .../runners/dataflow/dataflow_runner.py |  6 +++---
 .../runners/dataflow/dataflow_runner_test.py|  6 +++---
 .../apache_beam/runners/direct/executor.py  |  2 +-
 .../runners/direct/transform_evaluator.py   | 10 -
 sdks/python/apache_beam/transforms/__init__.py  |  2 +-
 sdks/python/apache_beam/transforms/core.py  | 22 ++--
 .../python/apache_beam/transforms/ptransform.py |  6 +++---
 .../apache_beam/transforms/ptransform_test.py   | 12 +--
 sdks/python/apache_beam/typehints/typecheck.py  |  2 +-
 10 files changed, 36 insertions(+), 36 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/e12cf0df/sdks/python/apache_beam/pipeline.py
--
diff --git a/sdks/python/apache_beam/pipeline.py 
b/sdks/python/apache_beam/pipeline.py
index 79480d7..5048534 100644
--- a/sdks/python/apache_beam/pipeline.py
+++ b/sdks/python/apache_beam/pipeline.py
@@ -77,8 +77,8 @@ class Pipeline(object):
   the PValues are the edges.
 
   All the transforms applied to the pipeline must have distinct full labels.
-  If same transform instance needs to be applied then a clone should be created
-  with a new label (e.g., transform.clone('new label')).
+  If same transform instance needs to be applied then the right shift operator
+  should be used to designate new names (e.g. `input | "label" >> 
my_tranform`).
   """
 
   def __init__(self, runner=None, options=None, argv=None):

http://git-wip-us.apache.org/repos/asf/beam/blob/e12cf0df/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
--
diff --git a/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py 
b/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
index da8de9d..0ecd22a 100644
--- a/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
+++ b/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
@@ -160,7 +160,7 @@ class DataflowRunner(PipelineRunner):
 
 class GroupByKeyInputVisitor(PipelineVisitor):
   """A visitor that replaces `Any` element type for input `PCollection` of
-  a `GroupByKey` or `GroupByKeyOnly` with a `KV` type.
+  a `GroupByKey` or `_GroupByKeyOnly` with a `KV` type.
 
   TODO(BEAM-115): Once Python SDk is compatible with the new Runner API,
   we could directly replace the coder instead of mutating the element type.
@@ -169,8 +169,8 @@ class DataflowRunner(PipelineRunner):
   def visit_transform(self, transform_node):
 # Imported here to avoid circular dependencies.
 # pylint: disable=wrong-import-order, wrong-import-position
-from apache_beam.transforms.core import GroupByKey, GroupByKeyOnly
-if isinstance(transform_node.transform, (GroupByKey, GroupByKeyOnly)):
+from apache_beam.transforms.core import GroupByKey, _GroupByKeyOnly
+if isinstance(transform_node.transform, (GroupByKey, _GroupByKeyOnly)):
   pcoll = transform_node.inputs[0]
   input_type = pcoll.element_type
   # If input_type is not specified, then treat it as `Any`.

http://git-wip-us.apache.org/repos/asf/beam/blob/e12cf0df/sdks/python/apache_beam/runners/dataflow/dataflow_runner_test.py
--
diff --git a/sdks/python/apache_beam/runners/dataflow/dataflow_runner_test.py 
b/sdks/python/apache_beam/runners/dataflow/dataflow_runner_test.py
index ac9b028..ff4b51d 100644
--- a/sdks/python/apache_beam/runners/dataflow/dataflow_runner_test.py
+++ b/sdks/python/apache_beam/runners/dataflow/dataflow_runner_test.py
@@ -37,7 +37,7 @@ from apache_beam.runners.dataflow.dataflow_runner import 
DataflowRuntimeExceptio
 from apache_beam.runners.dataflow.internal.clients import dataflow as 
dataflow_api
 from apache_beam.testing.test_pipeline import TestPipeline
 from apache_beam.transforms.display import DisplayDataItem
-from apache_beam.transforms.core import GroupByKeyOnly
+from apache_beam.transforms.core import _GroupByKeyOnly
 from apache_beam.typehints import typehints
 
 # Protect against environments where apitools library is not 

[05/19] beam git commit: Add internal usage only comments to util/

2017-05-11 Thread altay
Add internal usage only comments to util/


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/9730f56e
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/9730f56e
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/9730f56e

Branch: refs/heads/release-2.0.0
Commit: 9730f56eaf62c8164a517d5fe0cc03650c39a732
Parents: 65aa0ff
Author: Ahmet Altay 
Authored: Thu May 11 11:34:59 2017 -0700
Committer: Ahmet Altay 
Committed: Thu May 11 16:20:36 2017 -0700

--
 sdks/python/apache_beam/utils/__init__.py| 5 -
 sdks/python/apache_beam/utils/annotations.py | 4 +++-
 sdks/python/apache_beam/utils/counters.py| 5 -
 sdks/python/apache_beam/utils/processes.py   | 6 +-
 sdks/python/apache_beam/utils/profiler.py| 5 -
 sdks/python/apache_beam/utils/proto_utils.py | 2 ++
 sdks/python/apache_beam/utils/retry.py   | 2 ++
 sdks/python/apache_beam/utils/timestamp.py   | 5 -
 sdks/python/apache_beam/utils/urns.py| 2 ++
 9 files changed, 30 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/9730f56e/sdks/python/apache_beam/utils/__init__.py
--
diff --git a/sdks/python/apache_beam/utils/__init__.py 
b/sdks/python/apache_beam/utils/__init__.py
index 74cf45d..635c80f 100644
--- a/sdks/python/apache_beam/utils/__init__.py
+++ b/sdks/python/apache_beam/utils/__init__.py
@@ -15,4 +15,7 @@
 # limitations under the License.
 #
 
-"""A package containing utilities."""
+"""A package containing internal utilities.
+
+For internal use only; no backwards-compatibility guarantees.
+"""

http://git-wip-us.apache.org/repos/asf/beam/blob/9730f56e/sdks/python/apache_beam/utils/annotations.py
--
diff --git a/sdks/python/apache_beam/utils/annotations.py 
b/sdks/python/apache_beam/utils/annotations.py
index 92318b1..017dd6b 100644
--- a/sdks/python/apache_beam/utils/annotations.py
+++ b/sdks/python/apache_beam/utils/annotations.py
@@ -15,7 +15,9 @@
 # limitations under the License.
 #
 
-""" Deprecated and experimental annotations.
+"""Deprecated and experimental annotations.
+
+For internal use only; no backwards-compatibility guarantees.
 
 Annotations come in two flavors: deprecated and experimental
 

http://git-wip-us.apache.org/repos/asf/beam/blob/9730f56e/sdks/python/apache_beam/utils/counters.py
--
diff --git a/sdks/python/apache_beam/utils/counters.py 
b/sdks/python/apache_beam/utils/counters.py
index e41d732..b379461 100644
--- a/sdks/python/apache_beam/utils/counters.py
+++ b/sdks/python/apache_beam/utils/counters.py
@@ -18,7 +18,10 @@
 # cython: profile=False
 # cython: overflowcheck=True
 
-"""Counters collect the progress of the Worker for reporting to the service."""
+"""Counters collect the progress of the Worker for reporting to the service.
+
+For internal use only; no backwards-compatibility guarantees.
+"""
 
 import threading
 from apache_beam.transforms import cy_combiners

http://git-wip-us.apache.org/repos/asf/beam/blob/9730f56e/sdks/python/apache_beam/utils/processes.py
--
diff --git a/sdks/python/apache_beam/utils/processes.py 
b/sdks/python/apache_beam/utils/processes.py
index e089090..e5fd9c8 100644
--- a/sdks/python/apache_beam/utils/processes.py
+++ b/sdks/python/apache_beam/utils/processes.py
@@ -14,7 +14,11 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
-"""Cross-platform utilities for creating subprocesses."""
+
+"""Cross-platform utilities for creating subprocesses.
+
+For internal use only; no backwards-compatibility guarantees.
+"""
 
 import platform
 import subprocess

http://git-wip-us.apache.org/repos/asf/beam/blob/9730f56e/sdks/python/apache_beam/utils/profiler.py
--
diff --git a/sdks/python/apache_beam/utils/profiler.py 
b/sdks/python/apache_beam/utils/profiler.py
index 852b659..a2c3f6a 100644
--- a/sdks/python/apache_beam/utils/profiler.py
+++ b/sdks/python/apache_beam/utils/profiler.py
@@ -15,7 +15,10 @@
 # limitations under the License.
 #
 
-"""A profiler context manager based on cProfile.Profile objects."""
+"""A profiler context manager based on cProfile.Profile objects.
+
+For internal use only; no backwards-compatibility guarantees.
+"""
 
 import cProfile
 import logging

http://git-wip-us.apache.org/repos/asf/beam/blob/9730f56e/sdks/python/apache_beam/utils/proto_utils.py
--
diff --git 

[04/19] beam git commit: [BEAM-1345] Clearly delineate public api in runners package.

2017-05-11 Thread altay
[BEAM-1345] Clearly delineate public api in runners package.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/aeeefc17
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/aeeefc17
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/aeeefc17

Branch: refs/heads/release-2.0.0
Commit: aeeefc1725bca11f765f57bf2833d606a0e6d6ba
Parents: 6d77f95
Author: Robert Bradshaw 
Authored: Thu May 11 13:41:24 2017 -0700
Committer: Ahmet Altay 
Committed: Thu May 11 16:20:36 2017 -0700

--
 sdks/python/apache_beam/runners/api/__init__.py | 4 +++-
 sdks/python/apache_beam/runners/common.py   | 5 -
 sdks/python/apache_beam/runners/dataflow/__init__.py| 9 +
 sdks/python/apache_beam/runners/dataflow/dataflow_runner.py | 3 +++
 .../apache_beam/runners/dataflow/test_dataflow_runner.py| 3 +++
 sdks/python/apache_beam/runners/direct/__init__.py  | 6 +-
 sdks/python/apache_beam/runners/direct/direct_runner.py | 3 +++
 sdks/python/apache_beam/runners/pipeline_context.py | 6 ++
 sdks/python/apache_beam/runners/portability/__init__.py | 2 ++
 sdks/python/apache_beam/runners/runner.py   | 3 +++
 sdks/python/apache_beam/runners/worker/__init__.py  | 2 ++
 11 files changed, 43 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/aeeefc17/sdks/python/apache_beam/runners/api/__init__.py
--
diff --git a/sdks/python/apache_beam/runners/api/__init__.py 
b/sdks/python/apache_beam/runners/api/__init__.py
index e94673c..bf95208 100644
--- a/sdks/python/apache_beam/runners/api/__init__.py
+++ b/sdks/python/apache_beam/runners/api/__init__.py
@@ -15,7 +15,9 @@
 # limitations under the License.
 #
 
-"""Checked in to avoid protoc dependency for Python development.
+"""For internal use only; no backwards-compatibility guarantees.
+
+Checked in to avoid protoc dependency for Python development.
 
 Regenerate files with::
 

http://git-wip-us.apache.org/repos/asf/beam/blob/aeeefc17/sdks/python/apache_beam/runners/common.py
--
diff --git a/sdks/python/apache_beam/runners/common.py 
b/sdks/python/apache_beam/runners/common.py
index 86db711..8453569 100644
--- a/sdks/python/apache_beam/runners/common.py
+++ b/sdks/python/apache_beam/runners/common.py
@@ -17,7 +17,10 @@
 
 # cython: profile=True
 
-"""Worker operations executor."""
+"""Worker operations executor.
+
+For internal use only; no backwards-compatibility guarantees.
+"""
 
 import sys
 import traceback

http://git-wip-us.apache.org/repos/asf/beam/blob/aeeefc17/sdks/python/apache_beam/runners/dataflow/__init__.py
--
diff --git a/sdks/python/apache_beam/runners/dataflow/__init__.py 
b/sdks/python/apache_beam/runners/dataflow/__init__.py
index cce3aca..6674ba5 100644
--- a/sdks/python/apache_beam/runners/dataflow/__init__.py
+++ b/sdks/python/apache_beam/runners/dataflow/__init__.py
@@ -14,3 +14,12 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
+
+"""The DataflowRunner executes pipelines on Google Cloud Dataflow.
+
+Anything in this package not imported here is an internal implementation detail
+with no backwards-compatibility guarantees.
+"""
+
+from apache_beam.runners.dataflow.dataflow_runner import DataflowRunner
+from apache_beam.runners.dataflow.test_dataflow_runner import 
TestDataflowRunner

http://git-wip-us.apache.org/repos/asf/beam/blob/aeeefc17/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
--
diff --git a/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py 
b/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
index 796a67b..3d8437c 100644
--- a/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
+++ b/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
@@ -46,6 +46,9 @@ from apache_beam.typehints import typehints
 from apache_beam.options.pipeline_options import StandardOptions
 
 
+__all__ = ['DataflowRunner']
+
+
 class DataflowRunner(PipelineRunner):
   """A runner that creates job graphs and submits them for remote execution.
 

http://git-wip-us.apache.org/repos/asf/beam/blob/aeeefc17/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py
--
diff --git a/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py 
b/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py
index 4fd1026..b339882 100644
--- 

[11/19] beam git commit: [BEAM-1345] Mark windowed value as experimental

2017-05-11 Thread altay
[BEAM-1345] Mark windowed value as experimental


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/5b095118
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/5b095118
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/5b095118

Branch: refs/heads/release-2.0.0
Commit: 5b09511873136cc6aff8c75f3767c2d81e288940
Parents: 9730f56
Author: Sourabh Bajaj 
Authored: Thu May 11 11:58:08 2017 -0700
Committer: Ahmet Altay 
Committed: Thu May 11 16:20:36 2017 -0700

--
 sdks/python/apache_beam/utils/windowed_value.py | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/5b095118/sdks/python/apache_beam/utils/windowed_value.py
--
diff --git a/sdks/python/apache_beam/utils/windowed_value.py 
b/sdks/python/apache_beam/utils/windowed_value.py
index 87c26d1..be27854 100644
--- a/sdks/python/apache_beam/utils/windowed_value.py
+++ b/sdks/python/apache_beam/utils/windowed_value.py
@@ -16,10 +16,12 @@
 #
 
 """Core windowing data structures.
+
+This module is experimental. No backwards-compatibility guarantees.
 """
 
 # This module is carefully crafted to have optimal performance when
-# compiled whiel still being valid Python.  Care needs to be taken when
+# compiled while still being valid Python.  Care needs to be taken when
 # editing this file as WindowedValues are created for every element for
 # every step in a Beam pipeline.
 



[09/19] beam git commit: [BEAM-1340] Adds __all__ tags to classes in package apache_beam/io.

2017-05-11 Thread altay
[BEAM-1340] Adds __all__ tags to classes in package apache_beam/io.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/0c784f95
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/0c784f95
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/0c784f95

Branch: refs/heads/release-2.0.0
Commit: 0c784f958f11eb0e448cc448d2b9d36fd90e9972
Parents: dec27d8
Author: chamik...@google.com 
Authored: Thu May 11 11:46:46 2017 -0700
Committer: Ahmet Altay 
Committed: Thu May 11 16:20:36 2017 -0700

--
 sdks/python/apache_beam/io/concat_source.py  | 4 +++-
 sdks/python/apache_beam/io/filebasedsource.py| 2 ++
 sdks/python/apache_beam/io/fileio.py | 7 +++
 sdks/python/apache_beam/io/filesystem.py | 3 +++
 sdks/python/apache_beam/io/filesystems.py| 2 ++
 sdks/python/apache_beam/io/gcp/gcsfilesystem.py  | 2 ++
 sdks/python/apache_beam/io/gcp/gcsio.py  | 3 +++
 sdks/python/apache_beam/io/gcp/pubsub.py | 2 ++
 sdks/python/apache_beam/io/gcp/tests/bigquery_matcher.py | 3 +++
 sdks/python/apache_beam/io/iobase.py | 2 ++
 sdks/python/apache_beam/io/localfilesystem.py| 2 ++
 sdks/python/apache_beam/io/range_trackers.py | 7 ++-
 sdks/python/apache_beam/io/source_test_utils.py  | 8 
 13 files changed, 45 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/0c784f95/sdks/python/apache_beam/io/concat_source.py
--
diff --git a/sdks/python/apache_beam/io/concat_source.py 
b/sdks/python/apache_beam/io/concat_source.py
index dfd1695..56c4cca 100644
--- a/sdks/python/apache_beam/io/concat_source.py
+++ b/sdks/python/apache_beam/io/concat_source.py
@@ -15,7 +15,9 @@
 # limitations under the License.
 #
 
-"""Concat Source, which reads the union of several other sources.
+"""For internal use only; no backwards-compatibility guarantees.
+
+Concat Source, which reads the union of several other sources.
 """
 
 import bisect

http://git-wip-us.apache.org/repos/asf/beam/blob/0c784f95/sdks/python/apache_beam/io/filebasedsource.py
--
diff --git a/sdks/python/apache_beam/io/filebasedsource.py 
b/sdks/python/apache_beam/io/filebasedsource.py
index 215e015..bb9efc4 100644
--- a/sdks/python/apache_beam/io/filebasedsource.py
+++ b/sdks/python/apache_beam/io/filebasedsource.py
@@ -38,6 +38,8 @@ from apache_beam.options.value_provider import 
check_accessible
 
 MAX_NUM_THREADS_FOR_SIZE_ESTIMATION = 25
 
+__all__ = ['FileBasedSource']
+
 
 class FileBasedSource(iobase.BoundedSource):
   """A ``BoundedSource`` for reading a file glob of a given type."""

http://git-wip-us.apache.org/repos/asf/beam/blob/0c784f95/sdks/python/apache_beam/io/fileio.py
--
diff --git a/sdks/python/apache_beam/io/fileio.py 
b/sdks/python/apache_beam/io/fileio.py
index ca3a759..aa18093 100644
--- a/sdks/python/apache_beam/io/fileio.py
+++ b/sdks/python/apache_beam/io/fileio.py
@@ -37,6 +37,8 @@ from apache_beam.options.value_provider import 
check_accessible
 
 DEFAULT_SHARD_NAME_TEMPLATE = '-S-of-N'
 
+__all__ = ['FileBasedSink']
+
 
 class FileSink(iobase.Sink):
   """A sink to a GCS or local files.
@@ -280,6 +282,11 @@ class FileSink(iobase.Sink):
 return type(self) == type(other) and self.__dict__ == other.__dict__
 
 
+# Using FileBasedSink for the public API to be symmetric with FileBasedSource.
+# TODO: move code from FileSink to here and delete that class.
+FileBasedSink = FileSink
+
+
 class FileSinkWriter(iobase.Writer):
   """The writer for FileSink.
   """

http://git-wip-us.apache.org/repos/asf/beam/blob/0c784f95/sdks/python/apache_beam/io/filesystem.py
--
diff --git a/sdks/python/apache_beam/io/filesystem.py 
b/sdks/python/apache_beam/io/filesystem.py
index 3d35f3e..db6a1d0 100644
--- a/sdks/python/apache_beam/io/filesystem.py
+++ b/sdks/python/apache_beam/io/filesystem.py
@@ -30,6 +30,9 @@ logger = logging.getLogger(__name__)
 
 DEFAULT_READ_BUFFER_SIZE = 16 * 1024 * 1024
 
+__all__ = ['CompressionTypes', 'CompressedFile', 'FileMetadata', 'FileSystem',
+   'MatchResult']
+
 
 class CompressionTypes(object):
   """Enum-like class representing known compression types."""

http://git-wip-us.apache.org/repos/asf/beam/blob/0c784f95/sdks/python/apache_beam/io/filesystems.py
--
diff --git a/sdks/python/apache_beam/io/filesystems.py 

[18/19] beam git commit: Add __all__ tags to modules in package apache_beam/testing

2017-05-11 Thread altay
Add __all__ tags to modules in package apache_beam/testing


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/0f910b43
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/0f910b43
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/0f910b43

Branch: refs/heads/release-2.0.0
Commit: 0f910b43cbccca73d3b492560b9fc0d3f0901e21
Parents: a6f5888
Author: Charles Chen 
Authored: Wed May 10 23:20:20 2017 -0700
Committer: Ahmet Altay 
Committed: Thu May 11 16:54:12 2017 -0700

--
 sdks/python/apache_beam/testing/pipeline_verifiers.py |  8 
 sdks/python/apache_beam/testing/test_pipeline.py  |  5 +
 sdks/python/apache_beam/testing/test_stream.py| 14 +-
 sdks/python/apache_beam/testing/test_utils.py |  6 +-
 4 files changed, 31 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/0f910b43/sdks/python/apache_beam/testing/pipeline_verifiers.py
--
diff --git a/sdks/python/apache_beam/testing/pipeline_verifiers.py 
b/sdks/python/apache_beam/testing/pipeline_verifiers.py
index 5a6082a..a08eb54 100644
--- a/sdks/python/apache_beam/testing/pipeline_verifiers.py
+++ b/sdks/python/apache_beam/testing/pipeline_verifiers.py
@@ -32,6 +32,14 @@ from apache_beam.runners.runner import PipelineState
 from apache_beam.testing import test_utils as utils
 from apache_beam.utils import retry
 
+
+__all__ = [
+'PipelineStateMatcher',
+'FileChecksumMatcher',
+'retry_on_io_error_and_server_error',
+]
+
+
 try:
   from apitools.base.py.exceptions import HttpError
 except ImportError:

http://git-wip-us.apache.org/repos/asf/beam/blob/0f910b43/sdks/python/apache_beam/testing/test_pipeline.py
--
diff --git a/sdks/python/apache_beam/testing/test_pipeline.py 
b/sdks/python/apache_beam/testing/test_pipeline.py
index 20f4839..13b1639 100644
--- a/sdks/python/apache_beam/testing/test_pipeline.py
+++ b/sdks/python/apache_beam/testing/test_pipeline.py
@@ -27,6 +27,11 @@ from apache_beam.options.pipeline_options import 
PipelineOptions
 from nose.plugins.skip import SkipTest
 
 
+__all__ = [
+'TestPipeline',
+]
+
+
 class TestPipeline(Pipeline):
   """TestPipeline class is used inside of Beam tests that can be configured to
   run against pipeline runner.
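
A hedged usage sketch of the class documented above, paired with assert_that/equal_to from apache_beam.testing.util (which another commit in this series moves there); the element values are made up and constructor options vary by environment:

{code}
import apache_beam as beam
from apache_beam.testing.test_pipeline import TestPipeline
from apache_beam.testing.util import assert_that, equal_to

# Build a tiny pipeline inside a unit test and verify its output.
p = TestPipeline()
squares = (p
           | beam.Create([1, 2, 3])
           | beam.Map(lambda x: x * x))
assert_that(squares, equal_to([1, 4, 9]))
p.run()
{code}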

http://git-wip-us.apache.org/repos/asf/beam/blob/0f910b43/sdks/python/apache_beam/testing/test_stream.py
--
diff --git a/sdks/python/apache_beam/testing/test_stream.py 
b/sdks/python/apache_beam/testing/test_stream.py
index 7ae27b7..a06bcd0 100644
--- a/sdks/python/apache_beam/testing/test_stream.py
+++ b/sdks/python/apache_beam/testing/test_stream.py
@@ -15,7 +15,10 @@
 # limitations under the License.
 #
 
-"""Provides TestStream for verifying streaming runner semantics."""
+"""Provides TestStream for verifying streaming runner semantics.
+
+For internal use only; no backwards-compatibility guarantees.
+"""
 
 from abc import ABCMeta
 from abc import abstractmethod
@@ -28,6 +31,15 @@ from apache_beam.utils import timestamp
 from apache_beam.utils.windowed_value import WindowedValue
 
 
+__all__ = [
+'Event',
+'ElementEvent',
+'WatermarkEvent',
+'ProcessingTimeEvent',
+'TestStream',
+]
+
+
 class Event(object):
   """Test stream event to be emitted during execution of a TestStream."""
 

http://git-wip-us.apache.org/repos/asf/beam/blob/0f910b43/sdks/python/apache_beam/testing/test_utils.py
--
diff --git a/sdks/python/apache_beam/testing/test_utils.py 
b/sdks/python/apache_beam/testing/test_utils.py
index 666207e..9feb80e 100644
--- a/sdks/python/apache_beam/testing/test_utils.py
+++ b/sdks/python/apache_beam/testing/test_utils.py
@@ -15,7 +15,10 @@
 # limitations under the License.
 #
 
-"""Utility methods for testing"""
+"""Utility methods for testing
+
+For internal use only; no backwards-compatibility guarantees.
+"""
 
 import hashlib
 import imp
@@ -23,6 +26,7 @@ from mock import Mock, patch
 
 from apache_beam.utils import retry
 
+
 DEFAULT_HASHING_ALG = 'sha1'
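
For readers skimming these diffs: the added __all__ tags are what make the public surface explicit, because a wildcard import only picks up the names listed there. A minimal, self-contained sketch (not Beam code; the module name beam_all_demo is invented for illustration):

{code}
import sys
import types

# A throwaway module that mimics what these commits do: it declares __all__
# so a wildcard import only exposes the intended public names.
demo = types.ModuleType('beam_all_demo')
demo.__all__ = ['TestPipeline']
demo.TestPipeline = type('TestPipeline', (object,), {})
demo._InternalHelper = type('_InternalHelper', (object,), {})
sys.modules['beam_all_demo'] = demo

from beam_all_demo import *  # noqa: E402,F403

print('TestPipeline' in globals())      # True  - listed in __all__
print('_InternalHelper' in globals())   # False - hidden from wildcard imports
{code}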
 
 



[12/19] beam git commit: Mark internal modules in python datastoreio

2017-05-11 Thread altay
Mark internal modules in python datastoreio


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/0a0cc2d6
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/0a0cc2d6
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/0a0cc2d6

Branch: refs/heads/release-2.0.0
Commit: 0a0cc2d6bd5c3c0e6ab9d1388d5d3c8dc5ca7760
Parents: 5b09511
Author: Vikas Kedigehalli 
Authored: Thu May 11 11:33:24 2017 -0700
Committer: Ahmet Altay 
Committed: Thu May 11 16:20:36 2017 -0700

--
 sdks/python/apache_beam/io/gcp/datastore/v1/fake_datastore.py | 7 +--
 sdks/python/apache_beam/io/gcp/datastore/v1/helper.py | 5 -
 2 files changed, 9 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/0a0cc2d6/sdks/python/apache_beam/io/gcp/datastore/v1/fake_datastore.py
--
diff --git a/sdks/python/apache_beam/io/gcp/datastore/v1/fake_datastore.py 
b/sdks/python/apache_beam/io/gcp/datastore/v1/fake_datastore.py
index bc4d07f..0caf6d6 100644
--- a/sdks/python/apache_beam/io/gcp/datastore/v1/fake_datastore.py
+++ b/sdks/python/apache_beam/io/gcp/datastore/v1/fake_datastore.py
@@ -15,7 +15,11 @@
 # limitations under the License.
 #
 
-"""Fake datastore used for unit testing."""
+"""Fake datastore used for unit testing.
+
+For internal use only; no backwards-compatibility guarantees.
+"""
+
 import uuid
 
 # Protect against environments where datastore library is not available.
@@ -27,7 +31,6 @@ except ImportError:
   pass
 # pylint: enable=wrong-import-order, wrong-import-position
 
-
 def create_run_query(entities, batch_size):
   """A fake datastore run_query method that returns entities in batches.
 

http://git-wip-us.apache.org/repos/asf/beam/blob/0a0cc2d6/sdks/python/apache_beam/io/gcp/datastore/v1/helper.py
--
diff --git a/sdks/python/apache_beam/io/gcp/datastore/v1/helper.py 
b/sdks/python/apache_beam/io/gcp/datastore/v1/helper.py
index a61884f..9e2c053 100644
--- a/sdks/python/apache_beam/io/gcp/datastore/v1/helper.py
+++ b/sdks/python/apache_beam/io/gcp/datastore/v1/helper.py
@@ -15,7 +15,10 @@
 # limitations under the License.
 #
 
-"""Cloud Datastore helper functions."""
+"""Cloud Datastore helper functions.
+
+For internal use only; no backwards-compatibility guarantees.
+"""
 import sys
 
 # Protect against environments where datastore library is not available.
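
The "protect against environments where datastore library is not available" comment above refers to a guarded-import pattern. A rough sketch of the idea (the datastore import path is only an example of an optional dependency; the helper name is made up):

{code}
# Keep the module importable even when the optional dependency is missing;
# fail only when a caller actually needs it.
try:
    from google.cloud.proto.datastore.v1 import datastore_pb2
except ImportError:
    datastore_pb2 = None


def new_run_query_request():
    if datastore_pb2 is None:
        raise ImportError('Datastore dependencies are not installed; '
                          'install apache-beam[gcp] to use this helper.')
    return datastore_pb2.RunQueryRequest()
{code}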



[02/19] beam git commit: fix lint error in fake_datastore.py

2017-05-11 Thread altay
fix lint error in fake_datastore.py


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/dec27d8f
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/dec27d8f
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/dec27d8f

Branch: refs/heads/release-2.0.0
Commit: dec27d8f205f7a27e427ae9e451c4c875fa8cf04
Parents: 7fd012b
Author: Vikas Kedigehalli 
Authored: Thu May 11 14:39:55 2017 -0700
Committer: Ahmet Altay 
Committed: Thu May 11 16:20:36 2017 -0700

--
 sdks/python/apache_beam/io/gcp/datastore/v1/fake_datastore.py | 1 +
 1 file changed, 1 insertion(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/dec27d8f/sdks/python/apache_beam/io/gcp/datastore/v1/fake_datastore.py
--
diff --git a/sdks/python/apache_beam/io/gcp/datastore/v1/fake_datastore.py 
b/sdks/python/apache_beam/io/gcp/datastore/v1/fake_datastore.py
index 0caf6d6..2332579 100644
--- a/sdks/python/apache_beam/io/gcp/datastore/v1/fake_datastore.py
+++ b/sdks/python/apache_beam/io/gcp/datastore/v1/fake_datastore.py
@@ -31,6 +31,7 @@ except ImportError:
   pass
 # pylint: enable=wrong-import-order, wrong-import-position
 
+
 def create_run_query(entities, batch_size):
   """A fake datastore run_query method that returns entities in batches.
 



[07/19] beam git commit: Add internal comments to metrics

2017-05-11 Thread altay
Add internal comments to metrics


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/a6543abb
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/a6543abb
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/a6543abb

Branch: refs/heads/release-2.0.0
Commit: a6543abb37e65fe6c7afcaf3ea670c75aee1f7d0
Parents: 6d02da0
Author: Ahmet Altay 
Authored: Thu May 11 14:50:33 2017 -0700
Committer: Ahmet Altay 
Committed: Thu May 11 16:20:36 2017 -0700

--
 sdks/python/apache_beam/metrics/__init__.py   | 1 +
 sdks/python/apache_beam/metrics/cells.py  | 2 ++
 sdks/python/apache_beam/metrics/execution.py  | 6 ++
 sdks/python/apache_beam/metrics/metric.py | 4 
 sdks/python/apache_beam/metrics/metricbase.py | 2 ++
 5 files changed, 11 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/a6543abb/sdks/python/apache_beam/metrics/__init__.py
--
diff --git a/sdks/python/apache_beam/metrics/__init__.py 
b/sdks/python/apache_beam/metrics/__init__.py
index 164d1a8..8ce7bbb 100644
--- a/sdks/python/apache_beam/metrics/__init__.py
+++ b/sdks/python/apache_beam/metrics/__init__.py
@@ -15,3 +15,4 @@
 # limitations under the License.
 #
 from apache_beam.metrics.metric import Metrics
+from apache_beam.metrics.metric import MetricsFilter

http://git-wip-us.apache.org/repos/asf/beam/blob/a6543abb/sdks/python/apache_beam/metrics/cells.py
--
diff --git a/sdks/python/apache_beam/metrics/cells.py 
b/sdks/python/apache_beam/metrics/cells.py
index fbe3ad3..ba840f7 100644
--- a/sdks/python/apache_beam/metrics/cells.py
+++ b/sdks/python/apache_beam/metrics/cells.py
@@ -29,6 +29,8 @@ import threading
 from apache_beam.metrics.metricbase import Counter
 from apache_beam.metrics.metricbase import Distribution
 
+__all__ = ['DistributionResult']
+
 
 class CellCommitState(object):
   """For internal use only; no backwards-compatibility guarantees.

http://git-wip-us.apache.org/repos/asf/beam/blob/a6543abb/sdks/python/apache_beam/metrics/execution.py
--
diff --git a/sdks/python/apache_beam/metrics/execution.py 
b/sdks/python/apache_beam/metrics/execution.py
index a06ec0c..675e49c 100644
--- a/sdks/python/apache_beam/metrics/execution.py
+++ b/sdks/python/apache_beam/metrics/execution.py
@@ -24,7 +24,7 @@ Available classes:
 
 - MetricKey - Internal key for a metric.
 - MetricResult - Current status of a metric's updates/commits.
-- MetricsEnvironment - Keeps track of MetricsContainer and other metrics
+- _MetricsEnvironment - Keeps track of MetricsContainer and other metrics
 information for every single execution working thread.
 - MetricsContainer - Holds the metrics of a single step and a single
 unit-of-commit (bundle).
@@ -36,9 +36,7 @@ from apache_beam.metrics.cells import CounterCell, 
DistributionCell
 
 
 class MetricKey(object):
-  """
-
-  Key used to identify instance of metric cell.
+  """Key used to identify instance of metric cell.
 
   Metrics are internally keyed by the step name they associated with and
   the name of the metric.

http://git-wip-us.apache.org/repos/asf/beam/blob/a6543abb/sdks/python/apache_beam/metrics/metric.py
--
diff --git a/sdks/python/apache_beam/metrics/metric.py 
b/sdks/python/apache_beam/metrics/metric.py
index 33db4e1..f99c0c4 100644
--- a/sdks/python/apache_beam/metrics/metric.py
+++ b/sdks/python/apache_beam/metrics/metric.py
@@ -30,6 +30,8 @@ from apache_beam.metrics.execution import MetricsEnvironment
 from apache_beam.metrics.metricbase import Counter, Distribution
 from apache_beam.metrics.metricbase import MetricName
 
+__all__ = ['Metrics', 'MetricsFilter']
+
 
 class Metrics(object):
   """Lets users create/access metric objects during pipeline execution."""
@@ -146,6 +148,8 @@ class MetricResults(object):
 class MetricsFilter(object):
   """Simple object to filter metrics results.
 
+  This class is experimental. No backwards-compatibility guarantees.
+
   If filters by matching a result's step-namespace-name with three internal
   sets. No execution/matching logic is added to this object, so that it may
   be used to construct arguments as an RPC request. It is left for runners
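
For context, a small usage sketch of the Metrics / MetricsFilter pair touched above, run on the direct runner. CountChars is a made-up DoFn, and the shape of the query result reflects this SDK version only, since MetricsFilter is explicitly experimental:

{code}
import apache_beam as beam
from apache_beam.metrics import Metrics
from apache_beam.metrics.metric import MetricsFilter

class CountChars(beam.DoFn):
    def __init__(self):
        # Counter keyed by (namespace, name); the namespace here is the class.
        self.chars = Metrics.counter(self.__class__, 'chars')

    def process(self, element):
        self.chars.inc(len(element))
        yield element

p = beam.Pipeline()
_ = p | beam.Create(['ab', 'cde']) | beam.ParDo(CountChars())
result = p.run()
result.wait_until_finish()

# Query the committed value back through the experimental filter API.
filtered = result.metrics().query(MetricsFilter().with_name('chars'))
for counter in filtered['counters']:
    print(counter.key, counter.committed)
{code}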

http://git-wip-us.apache.org/repos/asf/beam/blob/a6543abb/sdks/python/apache_beam/metrics/metricbase.py
--
diff --git a/sdks/python/apache_beam/metrics/metricbase.py 
b/sdks/python/apache_beam/metrics/metricbase.py
index fa0ca75..699f29c 100644
--- 

[14/19] beam git commit: [BEAM-1345] Mark apache_beam/internal as internal.

2017-05-11 Thread altay
[BEAM-1345] Mark apache_beam/internal as internal.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/2bb95fd6
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/2bb95fd6
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/2bb95fd6

Branch: refs/heads/release-2.0.0
Commit: 2bb95fd68635ea1932f3a7d7cc59fcef1644deea
Parents: 36fcd36
Author: Robert Bradshaw 
Authored: Thu May 11 12:09:28 2017 -0700
Committer: Ahmet Altay 
Committed: Thu May 11 16:20:37 2017 -0700

--
 sdks/python/apache_beam/internal/__init__.py | 2 ++
 sdks/python/apache_beam/internal/gcp/__init__.py | 2 ++
 2 files changed, 4 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/2bb95fd6/sdks/python/apache_beam/internal/__init__.py
--
diff --git a/sdks/python/apache_beam/internal/__init__.py 
b/sdks/python/apache_beam/internal/__init__.py
index cce3aca..0bce5d6 100644
--- a/sdks/python/apache_beam/internal/__init__.py
+++ b/sdks/python/apache_beam/internal/__init__.py
@@ -14,3 +14,5 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
+
+"""For internal use only; no backwards-compatibility guarantees."""

http://git-wip-us.apache.org/repos/asf/beam/blob/2bb95fd6/sdks/python/apache_beam/internal/gcp/__init__.py
--
diff --git a/sdks/python/apache_beam/internal/gcp/__init__.py 
b/sdks/python/apache_beam/internal/gcp/__init__.py
index cce3aca..0bce5d6 100644
--- a/sdks/python/apache_beam/internal/gcp/__init__.py
+++ b/sdks/python/apache_beam/internal/gcp/__init__.py
@@ -14,3 +14,5 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 #
+
+"""For internal use only; no backwards-compatibility guarantees."""



[2/2] beam git commit: This closes #3106

2017-05-11 Thread altay
This closes #3106


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/fc77ca7c
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/fc77ca7c
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/fc77ca7c

Branch: refs/heads/master
Commit: fc77ca7cb18f5dc93dca9427112a3eaaf1494788
Parents: 07e5536 a45d63f
Author: Ahmet Altay 
Authored: Thu May 11 17:01:38 2017 -0700
Committer: Ahmet Altay 
Committed: Thu May 11 17:01:38 2017 -0700

--
 sdks/python/apache_beam/testing/data/privatekey.p12 | Bin 2452 -> 0 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)
--




[GitHub] beam pull request #3106: Remove unused test data

2017-05-11 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3106


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] beam git commit: Remove unused test data

2017-05-11 Thread altay
Repository: beam
Updated Branches:
  refs/heads/master 07e5536bc -> fc77ca7cb


Remove unused test data


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/a45d63f6
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/a45d63f6
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/a45d63f6

Branch: refs/heads/master
Commit: a45d63f6362e88d034704a4749cef7eedc76a691
Parents: 07e5536
Author: Sourabh Bajaj 
Authored: Thu May 11 16:41:06 2017 -0700
Committer: Sourabh Bajaj 
Committed: Thu May 11 16:41:06 2017 -0700

--
 sdks/python/apache_beam/testing/data/privatekey.p12 | Bin 2452 -> 0 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/a45d63f6/sdks/python/apache_beam/testing/data/privatekey.p12
--
diff --git a/sdks/python/apache_beam/testing/data/privatekey.p12 
b/sdks/python/apache_beam/testing/data/privatekey.p12
deleted file mode 100644
index c369ecb..000
Binary files a/sdks/python/apache_beam/testing/data/privatekey.p12 and 
/dev/null differ



[jira] [Created] (BEAM-2273) mvn clean doesn't fully clean up archetypes.

2017-05-11 Thread Jason Kuster (JIRA)
Jason Kuster created BEAM-2273:
--

 Summary: mvn clean doesn't fully clean up archetypes.
 Key: BEAM-2273
 URL: https://issues.apache.org/jira/browse/BEAM-2273
 Project: Beam
  Issue Type: Bug
  Components: build-system
Reporter: Jason Kuster
Assignee: Jason Kuster
 Fix For: 2.0.0






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-2273) mvn clean doesn't fully clean up archetypes.

2017-05-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16007389#comment-16007389
 ] 

ASF GitHub Bot commented on BEAM-2273:
--

GitHub user jasonkuster opened a pull request:

https://github.com/apache/beam/pull/3107

[BEAM-2273] Fully clean up archetypes when running mvn clean

Signed-off-by: Jason Kuster 

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---
R: @dhalperi @davorbonaci 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jasonkuster/beam clean-up

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3107.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3107


commit 131f757996630762ecc0cce8c5b568441c651945
Author: Jason Kuster 
Date:   2017-05-11T23:46:25Z

Fully clean up archetypes when running mvn clean

Signed-off-by: Jason Kuster 




> mvn clean doesn't fully clean up archetypes.
> 
>
> Key: BEAM-2273
> URL: https://issues.apache.org/jira/browse/BEAM-2273
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Jason Kuster
>Assignee: Jason Kuster
> Fix For: 2.0.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #3107: [BEAM-2273] Fully clean up archetypes when running ...

2017-05-11 Thread jasonkuster
GitHub user jasonkuster opened a pull request:

https://github.com/apache/beam/pull/3107

[BEAM-2273] Fully clean up archetypes when running mvn clean

Signed-off-by: Jason Kuster 

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---
R: @dhalperi @davorbonaci 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jasonkuster/beam clean-up

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3107.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3107


commit 131f757996630762ecc0cce8c5b568441c651945
Author: Jason Kuster 
Date:   2017-05-11T23:46:25Z

Fully clean up archetypes when running mvn clean

Signed-off-by: Jason Kuster 




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] beam pull request #3106: Remove unused test data

2017-05-11 Thread sb2nov
GitHub user sb2nov opened a pull request:

https://github.com/apache/beam/pull/3106

Remove unused test data

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---

R: @aaltay PTAL

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sb2nov/beam BEAM-remove-unused-test-data

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3106.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3106


commit a45d63f6362e88d034704a4749cef7eedc76a691
Author: Sourabh Bajaj 
Date:   2017-05-11T23:41:06Z

Remove unused test data




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-1345) Mark @Experimental and @Internal where needed in user-facing bits of the codebase

2017-05-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16007380#comment-16007380
 ] 

ASF GitHub Bot commented on BEAM-1345:
--

GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/3105

[BEAM-1345] Mark More values methods Internal

PCollectionTuple#ofPrimitiveOutputsInternal is internal.

TaggedPValue is internal.

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam documentation

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3105.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3105


commit 2042d7e3ff0d36ebae66e8363976276760d468d9
Author: Thomas Groh 
Date:   2017-05-11T23:17:54Z

Mark More values methods Internal

PCollectionTuple#ofPrimitiveOutputsInternal is internal.

TaggedPValue is internal.




> Mark @Experimental and @Internal where needed in user-facing bits of the 
> codebase
> -
>
> Key: BEAM-1345
> URL: https://issues.apache.org/jira/browse/BEAM-1345
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core, sdk-java-extensions, sdk-java-gcp
>Reporter: Kenneth Knowles
>  Labels: backward-incompatible
> Fix For: 2.0.0
>
>
> A blanket JIRA to ensure that before a stable release we make sure to mark 
> those pieces that would be unwise to freeze yet and consider how best to 
> communicate this to users, who may just autocomplete those features in their 
> IDE anyhow. (conversely, put infrastructure in place to enforce freezing of 
> the rest)
> Not technically "backwards incompatible" with pre-Beam Dataflow SDKs, but 
> certainly needs to be on the burndown for the first stable release.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #3105: [BEAM-1345] Mark More values methods Internal

2017-05-11 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/3105

[BEAM-1345] Mark More values methods Internal

PCollectionTuple#ofPrimitiveOutputsInternal is internal.

TaggedPValue is internal.

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam documentation

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3105.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3105


commit 2042d7e3ff0d36ebae66e8363976276760d468d9
Author: Thomas Groh 
Date:   2017-05-11T23:17:54Z

Mark More values methods Internal

PCollectionTuple#ofPrimitiveOutputsInternal is internal.

TaggedPValue is internal.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-1345) Mark @Experimental and @Internal where needed in user-facing bits of the codebase

2017-05-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16007379#comment-16007379
 ] 

ASF GitHub Bot commented on BEAM-1345:
--

GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/3104

[BEAM-1345] Mark More values methods Internal

PCollectionTuple#ofPrimitiveOutputsInternal is internal.

TaggedPValue is internal.

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam documents_master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3104.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3104


commit 207d459dd535c8abb944cf4d0ca47676fc530117
Author: Thomas Groh 
Date:   2017-05-11T23:17:54Z

Mark More values methods Internal

PCollectionTuple#ofPrimitiveOutputsInternal is internal.

TaggedPValue is internal.




> Mark @Experimental and @Internal where needed in user-facing bits of the 
> codebase
> -
>
> Key: BEAM-1345
> URL: https://issues.apache.org/jira/browse/BEAM-1345
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core, sdk-java-extensions, sdk-java-gcp
>Reporter: Kenneth Knowles
>  Labels: backward-incompatible
> Fix For: 2.0.0
>
>
> A blanket JIRA to ensure that before a stable release we make sure to mark 
> those pieces that would be unwise to freeze yet and consider how best to 
> communicate this to users, who may just autocomplete those features in their 
> IDE anyhow. (conversely, put infrastructure in place to enforce freezing of 
> the rest)
> Not technically "backwards incompatible" with pre-Beam Dataflow SDKs, but 
> certainly needs to be on the burndown for the first stable release.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #3104: [BEAM-1345] Mark More values methods Internal

2017-05-11 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/3104

[BEAM-1345] Mark More values methods Internal

PCollectionTuple#ofPrimitiveOutputsInternal is internal.

TaggedPValue is internal.

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam documents_master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3104.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3104


commit 207d459dd535c8abb944cf4d0ca47676fc530117
Author: Thomas Groh 
Date:   2017-05-11T23:17:54Z

Mark More values methods Internal

PCollectionTuple#ofPrimitiveOutputsInternal is internal.

TaggedPValue is internal.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-1158) Create HiveIO

2017-05-11 Thread Seshadri Raghunathan (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16007375#comment-16007375
 ] 

Seshadri Raghunathan commented on BEAM-1158:


Based on design review comments from the community and on further discussion and 
follow-up with Ismaël Mejía, we have updated the design document.

Below are the details -
Design proposal - 
https://docs.google.com/document/d/1aeQRLXjVr38Z03_zWkHO9YQhtnj0jHoCfhsSNm-wxtA/edit
Draft implementation - 
https://github.com/seshadri-cr/beam/commit/78cdf8772f2cd5bb9cd018b1c99c3ad0854157c1

Looking forward to further review from the wider community so we can move forward on 
this proposal.

Thanks,
Seshadri

> Create HiveIO
> -
>
> Key: BEAM-1158
> URL: https://issues.apache.org/jira/browse/BEAM-1158
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Affects Versions: Not applicable
>Reporter: Ismaël Mejía
>Assignee: Seshadri Raghunathan
>Priority: Minor
>
> Support for reading and writing to Hive. The HiveIO will directly use the 
> native API and HiveQL



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Flink #2793

2017-05-11 Thread Apache Jenkins Server
See 




[GitHub] beam pull request #3103: Cherry-pick #3070 #3087 #3088 #3095 #3096 #3090 #30...

2017-05-11 Thread aaltay
GitHub user aaltay opened a pull request:

https://github.com/apache/beam/pull/3103

Cherry-pick #3070 #3087 #3088 #3095 #3096 #3090 #3098 #3089 #3075 #3099 
#3094 #3086 #3065 #3101

R: @dhalperi 
cc: @davorbonaci @sb2nov @charlesccychen @robertwb @chamikaramj

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aaltay/beam mcp1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3103.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3103


commit 7fd012b63f7ff8b10d80bce53a860ad7102673d6
Author: Robert Bradshaw 
Date:   2017-05-11T00:13:06Z

[BEAM-1345] Clearly delineate public api in apache_beam/coders.

commit 6d02da03981f08037538e63e9246efef63a36ea0
Author: Charles Chen 
Date:   2017-05-11T06:06:36Z

[BEAM-1340] Add __all__ tags to modules in package apache_beam/transforms

commit 0a0cc2d6bd5c3c0e6ab9d1388d5d3c8dc5ca7760
Author: Vikas Kedigehalli 
Date:   2017-05-11T18:33:24Z

Mark internal modules in python datastoreio

commit 9730f56eaf62c8164a517d5fe0cc03650c39a732
Author: Ahmet Altay 
Date:   2017-05-11T18:34:59Z

Add internal usage only comments to util/

commit 0c784f958f11eb0e448cc448d2b9d36fd90e9972
Author: chamik...@google.com 
Date:   2017-05-11T18:46:46Z

[BEAM-1340] Adds __all__ tags to classes in package apache_beam/io.

commit 5b09511873136cc6aff8c75f3767c2d81e288940
Author: Sourabh Bajaj 
Date:   2017-05-11T18:58:08Z

[BEAM-1345] Mark windowed value as experimental

commit 36fcd36cb779592bc3964d4949a70df5a1cc1428
Author: Robert Bradshaw 
Date:   2017-05-11T19:54:13Z

[BEAM-1345] Clearly delineate public api in apache_beam/typehints.

commit 0ce25430ffec49fb2d94271e4af6225cee20388c
Author: Robert Bradshaw 
Date:   2017-05-11T20:30:32Z

[BEAM-1345] Mark Pipeline as public.

commit 6d77f958de666ccf5f59e907c292efbc9272b49b
Author: Charles Chen 
Date:   2017-05-11T20:31:18Z

[BEAM-1345] Clearly delineate public API in apache_beam/options

commit aeeefc1725bca11f765f57bf2833d606a0e6d6ba
Author: Robert Bradshaw 
Date:   2017-05-11T20:41:24Z

[BEAM-1345] Clearly delineate public api in runners package.

commit dec27d8f205f7a27e427ae9e451c4c875fa8cf04
Author: Vikas Kedigehalli 
Date:   2017-05-11T21:39:55Z

fix lint error in fake_datastore.py

commit a6543abb37e65fe6c7afcaf3ea670c75aee1f7d0
Author: Ahmet Altay 
Date:   2017-05-11T21:50:33Z

Add internal comments to metrics

commit e12cf0dfd810ec26965ae98e5a2256f560c6ff6d
Author: Robert Bradshaw 
Date:   2017-05-10T22:55:46Z

Remove some internal details from the public API.

commit d0da682d5e981217c90571febdff728b8abbbc14
Author: Robert Bradshaw 
Date:   2017-05-11T19:07:00Z

[BEAM-1345] Annotate public members of pvalue.

commit 2bb95fd68635ea1932f3a7d7cc59fcef1644deea
Author: Robert Bradshaw 
Date:   2017-05-11T19:09:28Z

[BEAM-1345] Mark apache_beam/internal as internal.

commit 2070f1182d49e3b7b3e9ed8a35173cb165fa5bfb
Author: Charles Chen 
Date:   2017-05-11T22:07:30Z

Move assert_that, equal_to, is_empty to apache_beam.testing.util

commit a6f5888e42ef34eacfe6384c0de9df9371ae968c
Author: Robert Bradshaw 
Date:   2017-05-11T22:47:54Z

Fix due to GBKO name change.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (BEAM-2272) Test scope & runtime dependencies need reevaluation

2017-05-11 Thread Daniel Halperin (JIRA)
Daniel Halperin created BEAM-2272:
-

 Summary: Test scope & runtime dependencies need reevaluation
 Key: BEAM-2272
 URL: https://issues.apache.org/jira/browse/BEAM-2272
 Project: Beam
  Issue Type: Bug
  Components: runner-dataflow, runner-direct, sdk-java-core
Reporter: Daniel Halperin
Assignee: Thomas Groh
 Fix For: 2.1.0


Now that SDK core has been thinned out, many of the modules that have it in 
their {{dependenciesToScan}} will need to update their test-scoped dependencies 
for things that moved out of SDK.

Known examples include protobuf and XML and perhaps others.

We should re-enable dependency analysis on test scope and manually validate and 
ignore specific known runtime/test-runtime dependencies.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-2271) Release guide or pom.xml needs update to avoid releasing Python binary artifacts

2017-05-11 Thread Daniel Halperin (JIRA)
Daniel Halperin created BEAM-2271:
-

 Summary: Release guide or pom.xml needs update to avoid releasing 
Python binary artifacts
 Key: BEAM-2271
 URL: https://issues.apache.org/jira/browse/BEAM-2271
 Project: Beam
  Issue Type: Bug
  Components: sdk-py
Reporter: Daniel Halperin
Assignee: Ahmet Altay
 Fix For: 2.1.0


The following directories (and children) were discovered in 2.0.0-RC2 and were 
present in 0.6.0.

{code}
sdks/python: build   dist.eggs   nose-1.3.7-py2.7.egg  (and child contents)
{code}

Ideally, these artifacts, which are created during setup and testing, would get 
created in the {{sdks/python/target/}} subfolder where they will automatically 
get ignored. More info below.
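
One possible (hypothetical, untested) way to steer the setuptools output into that subfolder from setup.py itself; the directory names and the use of distutils command options here are assumptions, not the agreed fix:

{code}
from setuptools import setup

setup(
    name='example-package',   # placeholder project metadata
    version='0.0.1',
    # Equivalent to [build]/[egg_info] sections in setup.cfg: send build
    # artifacts under target/ so the source-release excludes catch them.
    options={
        'build': {'build_base': 'target/build'},
        'egg_info': {'egg_base': 'target'},
    },
)
{code}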

For 2.0.0, we will manually remove these files from the source release RC3+. 
This should be fixed before the next release.

Here is a list of other paths that get excluded, should they be useful.

{code}
%regex[(?!((?!${project.build.directory}/)[^/]+/)*src/).*${project.build.directory}.*]
%regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?maven-eclipse\.xml]
%regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?\.project]
%regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?\.classpath]
%regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?[^/]*\.iws]
%regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?\.idea(/.*)?]
%regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?out(/.*)?]
%regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?[^/]*\.ipr]
%regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?[^/]*\.iml]
%regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?\.settings(/.*)?]
%regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?\.externalToolBuilders(/.*)?]
%regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?\.deployables(/.*)?]
%regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?\.wtpmodules(/.*)?]
%regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?cobertura\.ser]
%regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?pom\.xml\.releaseBackup]
%regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?release\.properties]
{code}

This list is stored inside of this jar, which you can find by tracking 
maven-assembly-plugin from the root apache pom: 
https://mvnrepository.com/artifact/org.apache.apache.resources/apache-source-release-assembly-descriptor/1.0.6


http://svn.apache.org/repos/asf/maven/pom/tags/apache-18/pom.xml



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (BEAM-2270) Examples archetype bundles Hadoop 2.6 in its jar for ApexRunner; cannot run on Hadoop 2.7?

2017-05-11 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-2270:
-

 Summary: Examples archetype bundles Hadoop 2.6 in its jar for 
ApexRunner; cannot run on Hadoop 2.7?
 Key: BEAM-2270
 URL: https://issues.apache.org/jira/browse/BEAM-2270
 Project: Beam
  Issue Type: Bug
  Components: examples-java, runner-apex, sdk-java-extensions
Reporter: Kenneth Knowles
Assignee: Thomas Weise


In an instantiated examples archetype, with {{-P apex-runner}}, Apex depends on 
Hadoop 2.6.0 and this is bundled into the examples jar.

In order to get this to run on Hadoop 2.7.3 I added this to the profile:

{code}
  <properties>
    <hadoop.version>2.7.3</hadoop.version>
  </properties>

  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-yarn-client</artifactId>
      <version>${hadoop.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>${hadoop.version}</version>
    </dependency>
  </dependencies>
{code}

It is not clear to me what the best path is here. Clearly the way we bundle is 
brittle and probably not the recommended best practice. But perhaps the runner's 
deps could also be changed to {{provided}} scope.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #3101: Fix due to GBKO name change.

2017-05-11 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3101


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] beam git commit: Closes #3101

2017-05-11 Thread robertwb
Closes #3101


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/07e5536b
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/07e5536b
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/07e5536b

Branch: refs/heads/master
Commit: 07e5536bc128a324e213eec1a6bd97beb950dcc6
Parents: 37e0b43 fc3e28d
Author: Robert Bradshaw 
Authored: Thu May 11 15:50:35 2017 -0700
Committer: Robert Bradshaw 
Committed: Thu May 11 15:50:35 2017 -0700

--
 .../apache_beam/runners/portability/maptask_executor_runner.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--




[1/2] beam git commit: Fix due to GBKO name change.

2017-05-11 Thread robertwb
Repository: beam
Updated Branches:
  refs/heads/master 37e0b437e -> 07e5536bc


Fix due to GBKO name change.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/fc3e28de
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/fc3e28de
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/fc3e28de

Branch: refs/heads/master
Commit: fc3e28de37acc9de597518cf9fe793d1ddebcf2a
Parents: 37e0b43
Author: Robert Bradshaw 
Authored: Thu May 11 15:47:54 2017 -0700
Committer: Robert Bradshaw 
Committed: Thu May 11 15:50:31 2017 -0700

--
 .../apache_beam/runners/portability/maptask_executor_runner.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/fc3e28de/sdks/python/apache_beam/runners/portability/maptask_executor_runner.py
--
diff --git 
a/sdks/python/apache_beam/runners/portability/maptask_executor_runner.py 
b/sdks/python/apache_beam/runners/portability/maptask_executor_runner.py
index 077871e..ddfc4cc 100644
--- a/sdks/python/apache_beam/runners/portability/maptask_executor_runner.py
+++ b/sdks/python/apache_beam/runners/portability/maptask_executor_runner.py
@@ -243,7 +243,7 @@ class MapTaskExecutorRunner(PipelineRunner):
 (label, write_sideinput_op))
 return output_buffer
 
-  def run_GroupByKeyOnly(self, transform_node):
+  def run__GroupByKeyOnly(self, transform_node):
 map_task_index, producer_index, output_index = self.outputs[
 transform_node.inputs[0]]
 grouped_element_coder = self._get_coder(transform_node.outputs[None],
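
Why a one-character rename ripples into this runner: the handler is looked up by the transform's class name, so renaming GroupByKeyOnly to _GroupByKeyOnly changes the expected method name to run__GroupByKeyOnly (note the double underscore). A toy sketch of that dispatch convention, illustrative only and not the actual runner code:

{code}
class _GroupByKeyOnly(object):
    """Stand-in for the renamed transform class."""

class MiniRunner(object):
    def run_transform(self, transform):
        # Handler name is derived from the class name: 'run_' + name.
        # With the leading underscore in '_GroupByKeyOnly' this becomes
        # 'run__GroupByKeyOnly', hence the fix in the diff above.
        handler = getattr(self, 'run_%s' % transform.__class__.__name__)
        return handler(transform)

    def run__GroupByKeyOnly(self, transform):
        return 'grouped: %s' % transform.__class__.__name__

print(MiniRunner().run_transform(_GroupByKeyOnly()))  # grouped: _GroupByKeyOnly
{code}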



[jira] [Created] (BEAM-2269) Apex Runner unable to launch local cluster on Windows

2017-05-11 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-2269:
---

 Summary: Apex Runner unable to launch local cluster on Windows
 Key: BEAM-2269
 URL: https://issues.apache.org/jira/browse/BEAM-2269
 Project: Beam
  Issue Type: Bug
  Components: runner-apex
Affects Versions: 2.0.0
 Environment: Windows 2016 against Apache Beam 2.0.0 RC2
JDK 8u131
Maven 3.5.0
Reporter: Luke Cwik


Command:
{code}
mvn compile exec:java -P apex-runner -D exec.mainClass=beamrc.WordCount -D 
exec.args="--runner=ApexRunner --output=output/apex-counts"
{code}

Fails with exception:
{code}
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:293)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Error creating local cluster
at 
org.apache.apex.engine.EmbeddedAppLauncherImpl.getController(EmbeddedAppLauncherImpl.java:122)
at 
org.apache.apex.engine.EmbeddedAppLauncherImpl.launchApp(EmbeddedAppLauncherImpl.java:71)
at 
org.apache.apex.engine.EmbeddedAppLauncherImpl.launchApp(EmbeddedAppLauncherImpl.java:46)
at org.apache.beam.runners.apex.ApexRunner.run(ApexRunner.java:175)
at org.apache.beam.runners.apex.ApexRunner.run(ApexRunner.java:73)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:295)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:281)
at beamrc.WordCount.main(WordCount.java:184)
... 6 more
Caused by: java.lang.NullPointerException
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:482)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at 
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:808)
at org.apache.hadoop.util.Shell.execCommand(Shell.java:791)
at 
org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:656)
at 
org.apache.hadoop.fs.RawLocalFileSystem.create(RawLocalFileSystem.java:298)
at org.apache.hadoop.fs.FileSystem.primitiveCreate(FileSystem.java:1014)
at 
org.apache.hadoop.fs.DelegateToFileSystem.createInternal(DelegateToFileSystem.java:85)
at 
org.apache.hadoop.fs.ChecksumFs$ChecksumFSOutputSummer.<init>(ChecksumFs.java:347)
at org.apache.hadoop.fs.ChecksumFs.createInternal(ChecksumFs.java:394)
at 
org.apache.hadoop.fs.AbstractFileSystem.create(AbstractFileSystem.java:577)
at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:680)
at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:676)
at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
at org.apache.hadoop.fs.FileContext.create(FileContext.java:676)
at 
com.datatorrent.common.util.AsyncFSStorageAgent.copyToHDFS(AsyncFSStorageAgent.java:119)
at 
com.datatorrent.common.util.AsyncFSStorageAgent.flush(AsyncFSStorageAgent.java:156)
at 
com.datatorrent.stram.plan.physical.PhysicalPlan.initCheckpoint(PhysicalPlan.java:1230)
at 
com.datatorrent.stram.plan.physical.PhysicalPlan.<init>(PhysicalPlan.java:495)
at 
com.datatorrent.stram.StreamingContainerManager.<init>(StreamingContainerManager.java:425)
at 
com.datatorrent.stram.StreamingContainerManager.<init>(StreamingContainerManager.java:413)
at 
com.datatorrent.stram.StramLocalCluster.<init>(StramLocalCluster.java:314)
at 
org.apache.apex.engine.EmbeddedAppLauncherImpl.getController(EmbeddedAppLauncherImpl.java:120)
... 13 more
{code}

If Apex doesn't support Windows for local cluster execution then this should 
either be closed as an unsupported use case or changed to be a feature request.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[1/2] beam git commit: Don't deploy jdk1.8-tests module

2017-05-11 Thread dhalperi
Repository: beam
Updated Branches:
  refs/heads/release-2.0.0 cf1ce7b24 -> 65aa0ffd3


Don't deploy jdk1.8-tests module


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/7422dedb
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/7422dedb
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/7422dedb

Branch: refs/heads/release-2.0.0
Commit: 7422dedbc5605191cb0a79c6120d61eb95fa2b09
Parents: cf1ce7b
Author: Dan Halperin 
Authored: Thu May 11 14:03:08 2017 -0700
Committer: Dan Halperin 
Committed: Thu May 11 15:50:41 2017 -0700

--
 pom.xml  | 6 ++
 sdks/java/io/hadoop/jdk1.8-tests/pom.xml | 7 +++
 2 files changed, 13 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/7422dedb/pom.xml
--
diff --git a/pom.xml b/pom.xml
index d192c8b..3d02096 100644
--- a/pom.xml
+++ b/pom.xml
@@ -1179,6 +1179,12 @@
         </plugin>
         <plugin>
           <groupId>org.apache.maven.plugins</groupId>
+          <artifactId>maven-deploy-plugin</artifactId>
+          <version>2.8.2</version>
+        </plugin>
+
+        <plugin>
+          <groupId>org.apache.maven.plugins</groupId>
           <artifactId>maven-jar-plugin</artifactId>
           <version>3.0.2</version>
           <configuration>

http://git-wip-us.apache.org/repos/asf/beam/blob/7422dedb/sdks/java/io/hadoop/jdk1.8-tests/pom.xml
--
diff --git a/sdks/java/io/hadoop/jdk1.8-tests/pom.xml 
b/sdks/java/io/hadoop/jdk1.8-tests/pom.xml
index c1731a8..ec83bbb 100644
--- a/sdks/java/io/hadoop/jdk1.8-tests/pom.xml
+++ b/sdks/java/io/hadoop/jdk1.8-tests/pom.xml
@@ -80,6 +80,13 @@
   none
 
   
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-deploy-plugin</artifactId>
+        <configuration>
+          <skip>true</skip>
+        </configuration>
+      </plugin>
 
   
   

[GitHub] beam pull request #3102: Cherry-pick #3097 into release-2.0.0 branch

2017-05-11 Thread dhalperi
Github user dhalperi closed the pull request at:

https://github.com/apache/beam/pull/3102


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] beam git commit: This closes #3102

2017-05-11 Thread dhalperi
This closes #3102


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/65aa0ffd
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/65aa0ffd
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/65aa0ffd

Branch: refs/heads/release-2.0.0
Commit: 65aa0ffd331b8d9262cb55074b5426e4a1b222ff
Parents: cf1ce7b 7422ded
Author: Dan Halperin 
Authored: Thu May 11 15:51:50 2017 -0700
Committer: Dan Halperin 
Committed: Thu May 11 15:51:50 2017 -0700

--
 pom.xml  | 6 ++
 sdks/java/io/hadoop/jdk1.8-tests/pom.xml | 7 +++
 2 files changed, 13 insertions(+)
--




[GitHub] beam pull request #3102: Cherry-pick #3097 into release-2.0.0 branch

2017-05-11 Thread dhalperi
GitHub user dhalperi opened a pull request:

https://github.com/apache/beam/pull/3102

Cherry-pick #3097 into release-2.0.0 branch

Don't deploy jdk1.8-tests module

R: @davorbonaci 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/beam cp-3097

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3102.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3102


commit 7422dedbc5605191cb0a79c6120d61eb95fa2b09
Author: Dan Halperin 
Date:   2017-05-11T21:03:08Z

Don't deploy jdk1.8-tests module




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] beam git commit: This closes #3097

2017-05-11 Thread dhalperi
This closes #3097


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/37e0b437
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/37e0b437
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/37e0b437

Branch: refs/heads/master
Commit: 37e0b437e14991b6627efa8c3fc5b215e8925f81
Parents: fe62567 d76349b
Author: Dan Halperin 
Authored: Thu May 11 15:49:39 2017 -0700
Committer: Dan Halperin 
Committed: Thu May 11 15:49:39 2017 -0700

--
 pom.xml  | 6 ++
 sdks/java/io/hadoop/jdk1.8-tests/pom.xml | 7 +++
 2 files changed, 13 insertions(+)
--




[GitHub] beam pull request #3101: Fix due to GBKO name change.

2017-05-11 Thread robertwb
GitHub user robertwb opened a pull request:

https://github.com/apache/beam/pull/3101

Fix due to GBKO name change.

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/robertwb/incubator-beam gbko

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3101.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3101


commit 792ef35d967b1505c9f23378d1753cc498bbb926
Author: Robert Bradshaw 
Date:   2017-05-11T22:47:54Z

Fix due to GBKO name change.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] beam pull request #3065: [BEAM-1345] Make a couple more things private in th...

2017-05-11 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3065


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] beam git commit: This closes #3065

2017-05-11 Thread altay
This closes #3065


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/fe625678
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/fe625678
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/fe625678

Branch: refs/heads/master
Commit: fe625678f55d9c9cc288b644749d7916015cf080
Parents: 1b31167 98e685d
Author: Ahmet Altay 
Authored: Thu May 11 15:43:21 2017 -0700
Committer: Ahmet Altay 
Committed: Thu May 11 15:43:21 2017 -0700

--
 sdks/python/apache_beam/pipeline.py |  4 ++--
 .../runners/dataflow/dataflow_runner.py |  6 +++---
 .../runners/dataflow/dataflow_runner_test.py|  6 +++---
 .../apache_beam/runners/direct/executor.py  |  2 +-
 .../runners/direct/transform_evaluator.py   | 10 -
 sdks/python/apache_beam/transforms/__init__.py  |  2 +-
 sdks/python/apache_beam/transforms/core.py  | 22 ++--
 .../python/apache_beam/transforms/ptransform.py |  6 +++---
 .../apache_beam/transforms/ptransform_test.py   | 12 +--
 sdks/python/apache_beam/typehints/typecheck.py  |  2 +-
 10 files changed, 36 insertions(+), 36 deletions(-)
--




[jira] [Created] (BEAM-2268) NullPointerException in com.datatorrent.netlet.util.Slice via ApexStateInternals

2017-05-11 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-2268:
-

 Summary: NullPointerException in com.datatorrent.netlet.util.Slice 
via ApexStateInternals
 Key: BEAM-2268
 URL: https://issues.apache.org/jira/browse/BEAM-2268
 Project: Beam
  Issue Type: Bug
  Components: runner-apex
Reporter: Kenneth Knowles
Assignee: Thomas Weise


I've browsed the code, and this is a certain NPE in netlet 1.2.1, but our 
dependencies are on 1.3.0.

I have read over the dependency tree (both our project and generated archetype) 
and deliberately added even tighter deps on 1.3.0. I have not managed to force 
any dependency on 1.2.1, yet maven clearly logs its download of 1.2.1 and it 
appears to be running against it.

{code}
java.lang.NullPointerException
at com.datatorrent.netlet.util.Slice.<init>(Slice.java:54)
at 
org.apache.beam.runners.apex.translation.utils.ApexStateInternals$ApexStateInternalsFactory.stateInternalsForKey(ApexStateInternals.java:445)
at 
org.apache.beam.runners.apex.translation.utils.ApexStateInternals$ApexStateInternalsFactory.stateInternalsForKey(ApexStateInternals.java:427)
{code}

Jenkins obviously has a different configuration, so that is perhaps the next 
place to look.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[1/2] beam git commit: Remove some internal details from the public API.

2017-05-11 Thread altay
Repository: beam
Updated Branches:
  refs/heads/master 1b31167fb -> fe625678f


Remove some internal details from the public API.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/98e685d8
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/98e685d8
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/98e685d8

Branch: refs/heads/master
Commit: 98e685d85a4cf75faee8156516d7d3aff60077f3
Parents: 1b31167
Author: Robert Bradshaw 
Authored: Wed May 10 15:55:46 2017 -0700
Committer: Ahmet Altay 
Committed: Thu May 11 15:43:17 2017 -0700

--
 sdks/python/apache_beam/pipeline.py |  4 ++--
 .../runners/dataflow/dataflow_runner.py |  6 +++---
 .../runners/dataflow/dataflow_runner_test.py|  6 +++---
 .../apache_beam/runners/direct/executor.py  |  2 +-
 .../runners/direct/transform_evaluator.py   | 10 -
 sdks/python/apache_beam/transforms/__init__.py  |  2 +-
 sdks/python/apache_beam/transforms/core.py  | 22 ++--
 .../python/apache_beam/transforms/ptransform.py |  6 +++---
 .../apache_beam/transforms/ptransform_test.py   | 12 +--
 sdks/python/apache_beam/typehints/typecheck.py  |  2 +-
 10 files changed, 36 insertions(+), 36 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/98e685d8/sdks/python/apache_beam/pipeline.py
--
diff --git a/sdks/python/apache_beam/pipeline.py 
b/sdks/python/apache_beam/pipeline.py
index 79480d7..5048534 100644
--- a/sdks/python/apache_beam/pipeline.py
+++ b/sdks/python/apache_beam/pipeline.py
@@ -77,8 +77,8 @@ class Pipeline(object):
   the PValues are the edges.
 
   All the transforms applied to the pipeline must have distinct full labels.
-  If same transform instance needs to be applied then a clone should be created
-  with a new label (e.g., transform.clone('new label')).
+  If same transform instance needs to be applied then the right shift operator
+  should be used to designate new names (e.g. `input | "label" >> 
my_tranform`).
   """
 
   def __init__(self, runner=None, options=None, argv=None):

http://git-wip-us.apache.org/repos/asf/beam/blob/98e685d8/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
--
diff --git a/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py 
b/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
index da8de9d..0ecd22a 100644
--- a/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
+++ b/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py
@@ -160,7 +160,7 @@ class DataflowRunner(PipelineRunner):
 
 class GroupByKeyInputVisitor(PipelineVisitor):
   """A visitor that replaces `Any` element type for input `PCollection` of
-  a `GroupByKey` or `GroupByKeyOnly` with a `KV` type.
+  a `GroupByKey` or `_GroupByKeyOnly` with a `KV` type.
 
   TODO(BEAM-115): Once Python SDk is compatible with the new Runner API,
   we could directly replace the coder instead of mutating the element type.
@@ -169,8 +169,8 @@ class DataflowRunner(PipelineRunner):
   def visit_transform(self, transform_node):
 # Imported here to avoid circular dependencies.
 # pylint: disable=wrong-import-order, wrong-import-position
-from apache_beam.transforms.core import GroupByKey, GroupByKeyOnly
-if isinstance(transform_node.transform, (GroupByKey, GroupByKeyOnly)):
+from apache_beam.transforms.core import GroupByKey, _GroupByKeyOnly
+if isinstance(transform_node.transform, (GroupByKey, _GroupByKeyOnly)):
   pcoll = transform_node.inputs[0]
   input_type = pcoll.element_type
   # If input_type is not specified, then treat it as `Any`.

http://git-wip-us.apache.org/repos/asf/beam/blob/98e685d8/sdks/python/apache_beam/runners/dataflow/dataflow_runner_test.py
--
diff --git a/sdks/python/apache_beam/runners/dataflow/dataflow_runner_test.py 
b/sdks/python/apache_beam/runners/dataflow/dataflow_runner_test.py
index ac9b028..ff4b51d 100644
--- a/sdks/python/apache_beam/runners/dataflow/dataflow_runner_test.py
+++ b/sdks/python/apache_beam/runners/dataflow/dataflow_runner_test.py
@@ -37,7 +37,7 @@ from apache_beam.runners.dataflow.dataflow_runner import 
DataflowRuntimeExceptio
 from apache_beam.runners.dataflow.internal.clients import dataflow as 
dataflow_api
 from apache_beam.testing.test_pipeline import TestPipeline
 from apache_beam.transforms.display import DisplayDataItem
-from apache_beam.transforms.core import GroupByKeyOnly
+from apache_beam.transforms.core import _GroupByKeyOnly
 from apache_beam.typehints import typehints

[jira] [Commented] (BEAM-1340) Remove or make private public bits of the SDK that shouldn't be public

2017-05-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16007309#comment-16007309
 ] 

ASF GitHub Bot commented on BEAM-1340:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3086


> Remove or make private public bits of the SDK that shouldn't be public
> --
>
> Key: BEAM-1340
> URL: https://issues.apache.org/jira/browse/BEAM-1340
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core, sdk-java-extensions
>Reporter: Kenneth Knowles
>Priority: Blocker
>  Labels: backward-incompatible
> Fix For: 2.0.0
>
>
> This JIRA is for the many small changes that do not merit their own JIRA 
> towards getting the SDK's API surface right. For example, removal of 
> `DoFn.InputProvider` and `DoFn.OutputReceiver`.
> While the above is not quite backwards incompatible, succeeding at this task 
> surely will be.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #3086: [BEAM-1340] Move assert_that, equal_to, is_empty to...

2017-05-11 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3086


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[2/2] beam git commit: Closes #3086

2017-05-11 Thread robertwb
Closes #3086


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/1b31167f
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/1b31167f
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/1b31167f

Branch: refs/heads/master
Commit: 1b31167fb8bac634d40e5ab5b86b0e928e00f309
Parents: 4a62fab 0b7356e
Author: Robert Bradshaw 
Authored: Thu May 11 15:27:59 2017 -0700
Committer: Robert Bradshaw 
Committed: Thu May 11 15:27:59 2017 -0700

--
 .../examples/complete/autocomplete_test.py  |   4 +-
 .../examples/complete/estimate_pi_test.py   |   4 +-
 .../complete/game/hourly_team_score_test.py |   4 +-
 .../examples/complete/game/user_score_test.py   |   4 +-
 .../apache_beam/examples/complete/tfidf_test.py |   4 +-
 .../complete/top_wikipedia_sessions_test.py |   4 +-
 .../cookbook/bigquery_side_input_test.py|   4 +-
 .../cookbook/bigquery_tornadoes_test.py |   6 +-
 .../examples/cookbook/coders_test.py|   4 +-
 .../examples/cookbook/combiners_test.py |   6 +-
 .../examples/cookbook/custom_ptransform_test.py |   4 +-
 .../examples/cookbook/filters_test.py   |  12 ++-
 .../examples/cookbook/mergecontacts.py  |  14 +--
 .../apache_beam/examples/snippets/snippets.py   |  17 +--
 .../examples/snippets/snippets_test.py  |  30 +++---
 .../apache_beam/examples/wordcount_debugging.py |   6 +-
 sdks/python/apache_beam/io/avroio_test.py   |   4 +-
 .../python/apache_beam/io/concat_source_test.py |   4 +-
 .../apache_beam/io/filebasedsource_test.py  |   4 +-
 sdks/python/apache_beam/io/sources_test.py  |   4 +-
 sdks/python/apache_beam/io/textio_test.py   |   5 +-
 sdks/python/apache_beam/io/tfrecordio_test.py   |  24 +++--
 sdks/python/apache_beam/pipeline_test.py|   4 +-
 .../portability/maptask_executor_runner_test.py |   6 +-
 sdks/python/apache_beam/runners/runner_test.py  |   4 +-
 sdks/python/apache_beam/testing/util.py | 107 +++
 sdks/python/apache_beam/testing/util_test.py|  50 +
 .../apache_beam/transforms/combiners_test.py|   2 +-
 .../apache_beam/transforms/create_test.py   |   3 +-
 .../apache_beam/transforms/ptransform_test.py   |   2 +-
 .../apache_beam/transforms/sideinputs_test.py   |   2 +-
 .../apache_beam/transforms/trigger_test.py  |   2 +-
 sdks/python/apache_beam/transforms/util.py  |  79 --
 sdks/python/apache_beam/transforms/util_test.py |  50 -
 .../apache_beam/transforms/window_test.py   |   2 +-
 .../transforms/write_ptransform_test.py |   2 +-
 .../typehints/typed_pipeline_test.py|   2 +-
 37 files changed, 271 insertions(+), 218 deletions(-)
--



