[jira] [Comment Edited] (BEAM-1765) Remove Aggregators from Spark runner

2017-03-20 Thread Aviem Zur (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15934063#comment-15934063
 ] 

Aviem Zur edited comment on BEAM-1765 at 3/21/17 4:57 AM:
--

Awesome to see we are moving forward on removing aggregators!

How can this be removed from the runners independently of the Java SDK?

Most uses of aggregators in the Spark runner are to pass these aggregators to SDK 
methods and constructors which require them. For example: {{ReduceFnRunner}}.

Another use is the {{PAssert}} success and counter aggregators, which allow 
runners to ensure that all assertions in tests actually ran, rather than passing 
tests that should have failed.

The only other use is the support of aggregators in {{SparkPipelineResult}} 
which can be removed independently.


was (Author: aviemzur):
Awesome to hear we are moving forward on removing aggregators!

How can this be removed from the runners independently of the Java SDK?

Most uses of aggregators in the Spark runner are to pass these aggregators to SDK 
methods and constructors which require them. For example: {{ReduceFnRunner}}.

Another use is the {{PAssert}} success and counter aggregators, which allow 
runners to ensure that all assertions in tests actually ran, rather than passing 
tests that should have failed.

The only other use is the support of aggregators in {{SparkPipelineResult}} 
which can be removed independently.

> Remove Aggregators from Spark runner
> 
>
> Key: BEAM-1765
> URL: https://issues.apache.org/jira/browse/BEAM-1765
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Pablo Estrada
>Assignee: Amit Sela
>
> I have started removing aggregators from the Java SDK, but runners use them 
> in different ways that I can't figure out well. This is to track the 
> independent effort in Spark.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Issue Comment Deleted] (BEAM-1735) Retry 403:rateLimitExceeded in GCS

2017-03-20 Thread Rafael Fernandez (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rafael Fernandez updated BEAM-1735:
---
Comment: was deleted

(was: Customer issue -- see more in buganizer.

Thanks!
r)

> Retry 403:rateLimitExceeded in GCS
> --
>
> Key: BEAM-1735
> URL: https://issues.apache.org/jira/browse/BEAM-1735
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
>Reporter: Rafael Fernandez
>Assignee: Daniel Halperin
>
> The GCS documentation [1] states that rateLimitExceeded, a 403 error, should 
> be retried exponentially. We currently do not retry it.
> [1] https://cloud.google.com/storage/docs/json_api/v1/status-codes 
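As background, the exponential retry the GCS documentation calls for can be sketched as below (the function names, delays, and attempt count are illustrative assumptions, not Beam's or the GCS client's actual retry policy):

```python
import random
import time


def retry_with_backoff(call, is_retryable, max_attempts=5, base_delay=1.0):
    """Retry `call` with exponential backoff plus jitter.

    `is_retryable` should decide which errors (e.g. an HTTP 403 whose
    reason is rateLimitExceeded) warrant another attempt.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as e:
            if attempt == max_attempts - 1 or not is_retryable(e):
                raise
            # Wait base_delay * 2^attempt, plus random jitter so many
            # workers do not retry in lockstep.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
```

The jitter matters for rate-limit errors in particular: without it, all workers that were throttled together retry together and trip the limit again.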





Jenkins build became unstable: beam_PostCommit_Java_RunnableOnService_Dataflow #2599

2017-03-20 Thread Apache Jenkins Server
See 




[jira] [Updated] (BEAM-1735) Retry 403:rateLimitExceeded in GCS

2017-03-20 Thread Rafael Fernandez (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rafael Fernandez updated BEAM-1735:
---

Customer issue -- see more in buganizer.

Thanks!
r



> Retry 403:rateLimitExceeded in GCS
> --
>
> Key: BEAM-1735
> URL: https://issues.apache.org/jira/browse/BEAM-1735
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
>Reporter: Rafael Fernandez
>Assignee: Daniel Halperin
>
> The GCS documentation [1] states that rateLimitExceeded, a 403 error, should 
> be retried exponentially. We currently do not retry it.
> [1] https://cloud.google.com/storage/docs/json_api/v1/status-codes 





[jira] [Commented] (BEAM-1765) Remove Aggregators from Spark runner

2017-03-20 Thread Aviem Zur (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15934063#comment-15934063
 ] 

Aviem Zur commented on BEAM-1765:
-

How can this be removed from the runners independently of the Java SDK?

Most uses of aggregators in the Spark runner are to pass these aggregators to SDK 
methods and constructors which require them. For example: {{ReduceFnRunner}}.

Another use is the {{PAssert}} success and counter aggregators, which allow 
runners to ensure that all assertions in tests actually ran, rather than passing 
tests that should have failed.

The only other use is the support of aggregators in {{SparkPipelineResult}} 
which can be removed independently.
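The counter check described above can be sketched as follows (hypothetical names, not the actual {{PAssert}} implementation): a test must fail not only when an assertion fails, but also when fewer assertions ran than were declared, since zero failures alone could mean the assertions were silently skipped.

```python
def verify_passert_counters(expected, successes, failures):
    """Fail if any assertion failed, or if some assertions never ran at all."""
    if failures > 0:
        raise AssertionError("%d assertion(s) failed" % failures)
    if successes != expected:
        # Catches the silent-skip case: no failures, but not every
        # declared assertion actually executed.
        raise AssertionError(
            "expected %d successful assertions, saw %d" % (expected, successes))
```

This is why removing the success/failure counter aggregators is not purely mechanical: something still has to carry these counts out of the pipeline.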

> Remove Aggregators from Spark runner
> 
>
> Key: BEAM-1765
> URL: https://issues.apache.org/jira/browse/BEAM-1765
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Pablo Estrada
>Assignee: Amit Sela
>
> I have started removing aggregators from the Java SDK, but runners use them 
> in different ways that I can't figure out well. This is to track the 
> independent effort in Spark.





[jira] [Comment Edited] (BEAM-1765) Remove Aggregators from Spark runner

2017-03-20 Thread Aviem Zur (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15934063#comment-15934063
 ] 

Aviem Zur edited comment on BEAM-1765 at 3/21/17 4:04 AM:
--

Awesome to hear we are moving forward on removing aggregators!

How can this be removed from the runners independently of the Java SDK?

Most uses of aggregators in the Spark runner are to pass these aggregators to SDK 
methods and constructors which require them. For example: {{ReduceFnRunner}}.

Another use is the {{PAssert}} success and counter aggregators, which allow 
runners to ensure that all assertions in tests actually ran, rather than passing 
tests that should have failed.

The only other use is the support of aggregators in {{SparkPipelineResult}} 
which can be removed independently.


was (Author: aviemzur):
How can this be removed from the runners independently of the Java SDK?

Most uses of aggregators in the Spark runner are to pass these aggregators to SDK 
methods and constructors which require them. For example: {{ReduceFnRunner}}.

Another use is the {{PAssert}} success and counter aggregators, which allow 
runners to ensure that all assertions in tests actually ran, rather than passing 
tests that should have failed.

The only other use is the support of aggregators in {{SparkPipelineResult}} 
which can be removed independently.

> Remove Aggregators from Spark runner
> 
>
> Key: BEAM-1765
> URL: https://issues.apache.org/jira/browse/BEAM-1765
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Pablo Estrada
>Assignee: Amit Sela
>
> I have started removing aggregators from the Java SDK, but runners use them 
> in different ways that I can't figure out well. This is to track the 
> independent effort in Spark.





[jira] [Closed] (BEAM-1518) Support deflate (zlib) in CompressedSource and FileBasedSink

2017-03-20 Thread Neville Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neville Li closed BEAM-1518.

Resolution: Fixed

> Support deflate (zlib) in CompressedSource and FileBasedSink
> 
>
> Key: BEAM-1518
> URL: https://issues.apache.org/jira/browse/BEAM-1518
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Affects Versions: 0.5.0
>Reporter: Neville Li
>Assignee: Neville Li
>Priority: Minor
> Fix For: 0.6.0
>
>
> `.deflate` files are quite common in Hadoop and also supported by TensorFlow 
> in TFRecord file format.
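A minimal sketch of reading and writing such data with Python's standard `zlib` module (illustrative helpers, not Beam's `CompressedSource`/`FileBasedSink` implementation, which are in the Java SDK):

```python
import zlib


def write_deflate(path: str, data: bytes) -> None:
    # zlib.compress emits a zlib-wrapped DEFLATE stream (RFC 1950); some
    # readers of `.deflate` files instead expect the raw stream, which
    # zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS) would produce.
    with open(path, "wb") as f:
        f.write(zlib.compress(data, 9))


def read_deflate(path: str) -> bytes:
    with open(path, "rb") as f:
        return zlib.decompress(f.read())
```

Whether a given consumer expects the zlib wrapper or the raw stream varies by tool, so a real source/sink has to pick the variant its ecosystem (e.g. Hadoop, TFRecord) uses.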





[jira] [Updated] (BEAM-1518) Support deflate (zlib) in CompressedSource and FileBasedSink

2017-03-20 Thread Neville Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neville Li updated BEAM-1518:
-
Fix Version/s: 0.6.0

> Support deflate (zlib) in CompressedSource and FileBasedSink
> 
>
> Key: BEAM-1518
> URL: https://issues.apache.org/jira/browse/BEAM-1518
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Affects Versions: 0.5.0
>Reporter: Neville Li
>Assignee: Neville Li
>Priority: Minor
> Fix For: 0.6.0
>
>
> `.deflate` files are quite common in Hadoop and also supported by TensorFlow 
> in TFRecord file format.





[jira] [Commented] (BEAM-1735) Retry 403:rateLimitExceeded in GCS

2017-03-20 Thread Daniel Halperin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15934060#comment-15934060
 ] 

Daniel Halperin commented on BEAM-1735:
---

Can you provide more information about this bug, [~rfernand]?

Most importantly: Can you provide a minimal reproduction?

If not, some additional Qs:
* How do you know this is not retried?
* Do you have a link to specific affected code?
* Which part of the codebase is this in? There are many places in which the 
GCP-related code interacts with GCS.

> Retry 403:rateLimitExceeded in GCS
> --
>
> Key: BEAM-1735
> URL: https://issues.apache.org/jira/browse/BEAM-1735
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
>Reporter: Rafael Fernandez
>Assignee: Daniel Halperin
>
> The GCS documentation [1] states that rateLimitExceeded, a 403 error, should 
> be retried exponentially. We currently do not retry it.
> [1] https://cloud.google.com/storage/docs/json_api/v1/status-codes 





[jira] [Commented] (BEAM-1771) Clean up dataflow/google references/URLs in examples

2017-03-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15934025#comment-15934025
 ] 

ASF GitHub Bot commented on BEAM-1771:
--

GitHub user melap opened a pull request:

https://github.com/apache/beam/pull/2281

[BEAM-1771] Clean up dataflow/google references/URLs in examples

R: @aaltay @davorbonaci 

In examples/java/README, I separated and created a section for each runner 
so it's easy for other runners to add their commands also. Please verify I 
didn't mess up the command lines when doing that, as I'm not familiar with the 
bundling. Long term, I am wondering if most of the README content would be 
better off in the Word Count walkthrough on the website, as it's similar to 
what's there already, and then we could use tabs for Java/Python and the 
different runners. It would also mean the user no longer needs to go back and 
forth between the walkthrough and the README.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/melap/beam examples

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2281.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2281


commit 81bcbb4d19843a9f404ed9c8bffb64f66e41fb6d
Author: melissa 
Date:   2017-03-21T02:35:22Z

[BEAM-1771] Clean up dataflow/google references/URLs in examples




> Clean up dataflow/google references/URLs in examples
> 
>
> Key: BEAM-1771
> URL: https://issues.apache.org/jira/browse/BEAM-1771
> Project: Beam
>  Issue Type: Improvement
>  Components: examples-java
>Reporter: Melissa Pashniak
>Assignee: Melissa Pashniak
>






[GitHub] beam pull request #2281: [BEAM-1771] Clean up dataflow/google references/URL...

2017-03-20 Thread melap
GitHub user melap opened a pull request:

https://github.com/apache/beam/pull/2281

[BEAM-1771] Clean up dataflow/google references/URLs in examples

R: @aaltay @davorbonaci 

In examples/java/README, I separated and created a section for each runner 
so it's easy for other runners to add their commands also. Please verify I 
didn't mess up the command lines when doing that, as I'm not familiar with the 
bundling. Long term, I am wondering if most of the README content would be 
better off in the Word Count walkthrough on the website, as it's similar to 
what's there already, and then we could use tabs for Java/Python and the 
different runners. It would also mean the user no longer needs to go back and 
forth between the walkthrough and the README.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/melap/beam examples

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2281.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2281


commit 81bcbb4d19843a9f404ed9c8bffb64f66e41fb6d
Author: melissa 
Date:   2017-03-21T02:35:22Z

[BEAM-1771] Clean up dataflow/google references/URLs in examples




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Assigned] (BEAM-632) Dataflow runner does not correctly flatten duplicate inputs

2017-03-20 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin reassigned BEAM-632:


Assignee: Thomas Groh  (was: Kenneth Knowles)

> Dataflow runner does not correctly flatten duplicate inputs
> ---
>
> Key: BEAM-632
> URL: https://issues.apache.org/jira/browse/BEAM-632
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Daniel Halperin
>Assignee: Thomas Groh
>Priority: Critical
> Fix For: First stable release
>
>
> https://github.com/apache/incubator-beam/pull/960
> Builds #1148+ are failing the new test that [~tgroh] added in that PR.
> https://builds.apache.org/job/beam_PostCommit_RunnableOnService_GoogleCloudDataflow/changes





[jira] [Commented] (BEAM-632) Dataflow runner does not correctly flatten duplicate inputs

2017-03-20 Thread Daniel Halperin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15934019#comment-15934019
 ] 

Daniel Halperin commented on BEAM-632:
--

Thomas, would you mind taking a pass on this one?

> Dataflow runner does not correctly flatten duplicate inputs
> ---
>
> Key: BEAM-632
> URL: https://issues.apache.org/jira/browse/BEAM-632
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Daniel Halperin
>Assignee: Thomas Groh
>Priority: Critical
> Fix For: First stable release
>
>
> https://github.com/apache/incubator-beam/pull/960
> Builds #1148+ are failing the new test that [~tgroh] added in that PR.
> https://builds.apache.org/job/beam_PostCommit_RunnableOnService_GoogleCloudDataflow/changes





[jira] [Created] (BEAM-1771) Clean up dataflow/google references/URLs in examples

2017-03-20 Thread Melissa Pashniak (JIRA)
Melissa Pashniak created BEAM-1771:
--

 Summary: Clean up dataflow/google references/URLs in examples
 Key: BEAM-1771
 URL: https://issues.apache.org/jira/browse/BEAM-1771
 Project: Beam
  Issue Type: Improvement
  Components: examples-java
Reporter: Melissa Pashniak
Assignee: Melissa Pashniak








[jira] [Comment Edited] (BEAM-741) Values transform does not use the correct output coder when values is an Iterable

2017-03-20 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15581093#comment-15581093
 ] 

Kenneth Knowles edited comment on BEAM-741 at 3/21/17 2:21 AM:
---

Great investigation. I actually think the SDK should also always prefer the 
transform's coder. But, also, for input of type {{KV}}, the expected 
behavior is for the registry to associate the type {{V}} with the value coder 
and thus in this context provide exactly the same coder. So I'm going to reopen 
and see about both of these.

I am struck by this conflict: the transform has some more detailed information 
about its output, but also if the user sets a coder on the input PCollection, 
they have even more information than a transform with a type variable, like 
{{Values}}. Maybe they know something about the data distribution. If both the 
registry and each transform try to adhere to the rule of propagating the user's 
intent, I think they should end up largely equivalent.


was (Author: kenn):
Great investigation. I actually think the SDK should also always prefer the 
transform's coder. But, also, for input of type {{KV}}, the expected 
behavior is for the registry to associate the type {{V}} with the value coder 
and thus in this context provide exactly the same coder. So I'm going to reopen 
and see about both of these.

I am struck by this conflict: the transform has some more detailed information 
about its output, but also if the user sets a coder on the input PCollection, 
they have even more information than a transform with a type variable, like 
{{Values}}. Maybe they know something about the data distribution. If both the 
registry and each transform try to adhere to the rule of propagating the user's 
intent, I think they should end up largely equivalent.

> Values transform does not use the correct output coder when values is an 
> Iterable
> 
>
> Key: BEAM-741
> URL: https://issues.apache.org/jira/browse/BEAM-741
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Andrew Martin
>Assignee: Kenneth Knowles
> Fix For: Not applicable
>
>






[jira] [Created] (BEAM-1770) DoFn javadoc claims no runner supports state or timers

2017-03-20 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-1770:
-

 Summary: DoFn javadoc claims no runner supports state or timers
 Key: BEAM-1770
 URL: https://issues.apache.org/jira/browse/BEAM-1770
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-core
Reporter: Kenneth Knowles
Assignee: Kenneth Knowles








[jira] [Commented] (BEAM-1218) De-Googlify Python SDK

2017-03-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933987#comment-15933987
 ] 

ASF GitHub Bot commented on BEAM-1218:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2278


> De-Googlify Python SDK
> --
>
> Key: BEAM-1218
> URL: https://issues.apache.org/jira/browse/BEAM-1218
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py
>Reporter: Mark Liu
>Assignee: Ahmet Altay
>






[jira] [Resolved] (BEAM-1647) Dataflow RunnableOnService tests timing out reliably

2017-03-20 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles resolved BEAM-1647.
---
   Resolution: Fixed
Fix Version/s: Not applicable

> Dataflow RunnableOnService tests timing out reliably
> 
>
> Key: BEAM-1647
> URL: https://issues.apache.org/jira/browse/BEAM-1647
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, testing
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Blocker
> Fix For: Not applicable
>
>
> Since [build 
> 2487|https://builds.apache.org/view/Beam/job/beam_PostCommit_Java_RunnableOnService_Dataflow/2487/]
>  our build time seems to have pushed up past 1h40m and is timing out.
> It was just barely under that anyhow, so I don't expect this is a 
> catastrophic change but probably something small. Some builds (at HEAD) seem 
> to sneak under the limit.





[1/2] beam git commit: Clean source files from dataflow/google references

2017-03-20 Thread altay
Repository: beam
Updated Branches:
  refs/heads/master 4ffd43ed7 -> f7855842d


Clean source files from dataflow/google references


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/be224134
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/be224134
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/be224134

Branch: refs/heads/master
Commit: be2241346669777862e20eb6def8c50d67fb8249
Parents: 4ffd43e
Author: Ahmet Altay 
Authored: Mon Mar 20 18:14:31 2017 -0700
Committer: Ahmet Altay 
Committed: Mon Mar 20 19:19:17 2017 -0700

--
 sdks/python/apache_beam/__init__.py   |  2 +-
 sdks/python/apache_beam/io/iobase.py  | 11 ---
 sdks/python/apache_beam/transforms/trigger.py |  2 +-
 3 files changed, 6 insertions(+), 9 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/be224134/sdks/python/apache_beam/__init__.py
--
diff --git a/sdks/python/apache_beam/__init__.py 
b/sdks/python/apache_beam/__init__.py
index e680c9e..9921498 100644
--- a/sdks/python/apache_beam/__init__.py
+++ b/sdks/python/apache_beam/__init__.py
@@ -17,7 +17,7 @@
 
 """Apache Beam SDK for Python.
 
-Apache Beam 
+Apache Beam 
 provides a simple, powerful programming model for building both batch
 and streaming parallel data processing pipelines.
 

http://git-wip-us.apache.org/repos/asf/beam/blob/be224134/sdks/python/apache_beam/io/iobase.py
--
diff --git a/sdks/python/apache_beam/io/iobase.py 
b/sdks/python/apache_beam/io/iobase.py
index bd40a3e..057f853 100644
--- a/sdks/python/apache_beam/io/iobase.py
+++ b/sdks/python/apache_beam/io/iobase.py
@@ -559,9 +559,9 @@ class RangeTracker(object):
 
 
 class Sink(HasDisplayData):
-  """A resource that can be written to using the ``df.io.Write`` transform.
+  """A resource that can be written to using the ``beam.io.Write`` transform.
 
-  Here ``df`` stands for Dataflow Python code imported in following manner.
+  Here ``beam`` stands for Apache Beam Python code imported in following 
manner.
   ``import apache_beam as beam``.
 
   A parallel write to an ``iobase.Sink`` consists of three phases:
@@ -572,9 +572,6 @@ class Sink(HasDisplayData):
   3. A sequential *finalization* phase (e.g., committing the writes, merging
  output files, etc.)
 
-  For exact definition of a Dataflow bundle please see
-  https://cloud.google.com/dataflow/faq.
-
   Implementing a new sink requires extending two classes.
 
   1. iobase.Sink
@@ -594,7 +591,7 @@ class Sink(HasDisplayData):
   single record from the bundle and ``close()`` which is called once
   at the end of writing a bundle.
 
-  See also ``df.io.fileio.FileSink`` which provides a simpler API for writing
+  See also ``beam.io.fileio.FileSink`` which provides a simpler API for writing
   sinks that produce files.
 
   **Execution of the Write transform**
@@ -692,7 +689,7 @@ class Sink(HasDisplayData):
 
   For more information on creating new sinks please refer to the official
   documentation at
-  ``https://cloud.google.com/dataflow/model/custom-io#creating-sinks``.
+  
``https://beam.apache.org/documentation/sdks/python-custom-io#creating-sinks``
   """
 
   def initialize_write(self):

http://git-wip-us.apache.org/repos/asf/beam/blob/be224134/sdks/python/apache_beam/transforms/trigger.py
--
diff --git a/sdks/python/apache_beam/transforms/trigger.py 
b/sdks/python/apache_beam/transforms/trigger.py
index b55d602..e35c349 100644
--- a/sdks/python/apache_beam/transforms/trigger.py
+++ b/sdks/python/apache_beam/transforms/trigger.py
@@ -122,7 +122,7 @@ class WatermarkHoldStateTag(StateTag):
 class TriggerFn(object):
   """A TriggerFn determines when window (panes) are emitted.
 
-  See https://cloud.google.com/dataflow/model/triggers.
+  See https://beam.apache.org/documentation/programming-guide/#triggers
   """
   __metaclass__ = ABCMeta
 



[2/2] beam git commit: This closes #2278

2017-03-20 Thread altay
This closes #2278


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/f7855842
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/f7855842
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/f7855842

Branch: refs/heads/master
Commit: f7855842d9c132473bef3096bba7cb9f5c281567
Parents: 4ffd43e be22413
Author: Ahmet Altay 
Authored: Mon Mar 20 19:19:37 2017 -0700
Committer: Ahmet Altay 
Committed: Mon Mar 20 19:19:37 2017 -0700

--
 sdks/python/apache_beam/__init__.py   |  2 +-
 sdks/python/apache_beam/io/iobase.py  | 11 ---
 sdks/python/apache_beam/transforms/trigger.py |  2 +-
 3 files changed, 6 insertions(+), 9 deletions(-)
--




[jira] [Commented] (BEAM-1769) Travis - python only executes py27 tox environment

2017-03-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933982#comment-15933982
 ] 

ASF GitHub Bot commented on BEAM-1769:
--

GitHub user sb2nov opened a pull request:

https://github.com/apache/beam/pull/2280

[BEAM-1769] Travis should run all tox tests

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---
R: @aaltay PTAL

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sb2nov/beam 
BEAM-1769-travis-should-run-all-tests

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2280.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2280


commit 82d8b31ebab34c3ea200b256244931dff723e2a9
Author: Sourabh Bajaj 
Date:   2017-03-21T02:13:30Z

[BEAM-1769] Travis should run all tox tests




> Travis - python only executes py27 tox environment
> --
>
> Key: BEAM-1769
> URL: https://issues.apache.org/jira/browse/BEAM-1769
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Ahmet Altay
>Assignee: Sourabh Bajaj
>
> https://github.com/apache/beam/blob/master/.travis.yml#L71





[GitHub] beam pull request #2280: [BEAM-1769] Travis should run all tox tests

2017-03-20 Thread sb2nov
GitHub user sb2nov opened a pull request:

https://github.com/apache/beam/pull/2280

[BEAM-1769] Travis should run all tox tests

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---
R: @aaltay PTAL

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sb2nov/beam 
BEAM-1769-travis-should-run-all-tests

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2280.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2280


commit 82d8b31ebab34c3ea200b256244931dff723e2a9
Author: Sourabh Bajaj 
Date:   2017-03-21T02:13:30Z

[BEAM-1769] Travis should run all tox tests






[jira] [Created] (BEAM-1769) Travis - python only executes py27 tox environment

2017-03-20 Thread Ahmet Altay (JIRA)
Ahmet Altay created BEAM-1769:
-

 Summary: Travis - python only executes py27 tox environment
 Key: BEAM-1769
 URL: https://issues.apache.org/jira/browse/BEAM-1769
 Project: Beam
  Issue Type: Bug
  Components: sdk-py
Reporter: Ahmet Altay
Assignee: Sourabh Bajaj


https://github.com/apache/beam/blob/master/.travis.yml#L71





[jira] [Commented] (BEAM-1768) assert_that always passes for empty inputs

2017-03-20 Thread Robert Bradshaw (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933939#comment-15933939
 ] 

Robert Bradshaw commented on BEAM-1768:
---

Looks like this was introduced at https://github.com/apache/beam/pull/1699
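For context, a hypothetical illustration of this bug class (not Beam's actual code): a per-element check is vacuously true on an empty collection because the check never runs, whereas a matcher over the whole contents fails as expected.

```python
def per_element_equal(elements, predicate):
    # Vacuously passes on []: the loop body never runs, so nothing is
    # ever checked. This is the bug pattern.
    for e in elements:
        if not predicate(e):
            return False
    return True


def contents_equal(elements, expected):
    # Compares the whole contents, so an unexpectedly empty input
    # correctly fails against a non-empty expectation.
    return sorted(elements) == sorted(expected)
```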

> assert_that always passes for empty inputs
> --
>
> Key: BEAM-1768
> URL: https://issues.apache.org/jira/browse/BEAM-1768
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Affects Versions: 0.5.0, 0.6.0
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Critical
> Fix For: First stable release
>
>






[GitHub] beam pull request #2279: [BEAM-1768] Fix assert_that for empty inputs

2017-03-20 Thread robertwb
GitHub user robertwb opened a pull request:

https://github.com/apache/beam/pull/2279

[BEAM-1768] Fix assert_that for empty inputs

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/robertwb/incubator-beam passert

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2279.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2279


commit f46ac4d1e92d2080a97e0f29d084816f1b97af54
Author: Robert Bradshaw 
Date:   2017-03-21T01:20:01Z

[BEAM-1768] Fix assert_that for empty inputs






[jira] [Updated] (BEAM-1768) assert_that always passes for empty inputs

2017-03-20 Thread Robert Bradshaw (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Bradshaw updated BEAM-1768:
--
Summary: assert_that always passes for empty inputs  (was: PAssert always 
passes for empty inputs)

> assert_that always passes for empty inputs
> --
>
> Key: BEAM-1768
> URL: https://issues.apache.org/jira/browse/BEAM-1768
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Affects Versions: 0.5.0, 0.6.0
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>Priority: Critical
> Fix For: First stable release
>
>






[jira] [Created] (BEAM-1768) PAssert always passes for empty inputs

2017-03-20 Thread Robert Bradshaw (JIRA)
Robert Bradshaw created BEAM-1768:
-

 Summary: PAssert always passes for empty inputs
 Key: BEAM-1768
 URL: https://issues.apache.org/jira/browse/BEAM-1768
 Project: Beam
  Issue Type: Bug
  Components: sdk-py
Affects Versions: 0.6.0, 0.5.0
Reporter: Robert Bradshaw
Assignee: Robert Bradshaw
Priority: Critical
 Fix For: First stable release








[jira] [Commented] (BEAM-1218) De-Googlify Python SDK

2017-03-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933936#comment-15933936
 ] 

ASF GitHub Bot commented on BEAM-1218:
--

GitHub user aaltay opened a pull request:

https://github.com/apache/beam/pull/2278

[BEAM-1218] Clean source files from dataflow/google references

R: @chamikaramj 


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aaltay/beam text

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2278.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2278


commit c3e8c39dee6230e178280ada651b7a1b11a92f9b
Author: Ahmet Altay 
Date:   2017-03-21T01:14:31Z

Clean source files from dataflow/google references




> De-Googlify Python SDK
> --
>
> Key: BEAM-1218
> URL: https://issues.apache.org/jira/browse/BEAM-1218
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py
>Reporter: Mark Liu
>Assignee: Ahmet Altay
>






[GitHub] beam pull request #2278: [BEAM-1218] Clean source files from dataflow/google...

2017-03-20 Thread aaltay
GitHub user aaltay opened a pull request:

https://github.com/apache/beam/pull/2278

[BEAM-1218] Clean source files from dataflow/google references

R: @chamikaramj 


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aaltay/beam text

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2278.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2278


commit c3e8c39dee6230e178280ada651b7a1b11a92f9b
Author: Ahmet Altay 
Date:   2017-03-21T01:14:31Z

Clean source files from dataflow/google references






[2/3] beam-site git commit: Regenerate website

2017-03-20 Thread altay
Regenerate website


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/7d9208ea
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/7d9208ea
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/7d9208ea

Branch: refs/heads/asf-site
Commit: 7d9208ea92fb34c294455b0fb96066678abaf9f0
Parents: af0821f
Author: Ahmet Altay 
Authored: Mon Mar 20 17:57:45 2017 -0700
Committer: Ahmet Altay 
Committed: Mon Mar 20 17:57:45 2017 -0700

--
 content/get-started/quickstart-py/index.html | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam-site/blob/7d9208ea/content/get-started/quickstart-py/index.html
--
diff --git a/content/get-started/quickstart-py/index.html 
b/content/get-started/quickstart-py/index.html
index b717987..ce774cb 100644
--- a/content/get-started/quickstart-py/index.html
+++ b/content/get-started/quickstart-py/index.html
@@ -221,8 +221,11 @@ environment’s directories.
 
 Download and install
 
-Install the latest Python SDK from PyPI:
-  pip install apache-beam
+Install the latest Python SDK from PyPI:
+
+pip install 
apache-beam
+
+
 
 Execute a pipeline locally
 
@@ -234,14 +237,13 @@ environment’s directories.
 
 
 
-# 
As part of the initial setup, install gcp specific extra components.
-pip install dist/apache-beam-*.tar.gz .[gcp]
+# 
As part of the initial setup, install Google Cloud Platform specific extra 
components.
+pip install apache-beam[gcp]
 python -m apache_beam.examples.wordcount --input 
gs://dataflow-samples/shakespeare/kinglear.txt \
  --output 
gs://your-gcs-bucket/counts \
  --runner DataflowRunner \
  --project your-gcp-project \
- --temp_location 
gs://your-gcs-bucket/tmp/ \
- --sdk_location 
dist/apache-beam-*.tar.gz
+ --temp_location 
gs://your-gcs-bucket/tmp/
 
 
 



[GitHub] beam-site pull request #192: Simplify python quick start.

2017-03-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam-site/pull/192




[3/3] beam-site git commit: This closes #192

2017-03-20 Thread altay
This closes #192


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/0b21f131
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/0b21f131
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/0b21f131

Branch: refs/heads/asf-site
Commit: 0b21f131a93a5f3fcc877fcf1ecec668d9afd6f6
Parents: 61de614 7d9208e
Author: Ahmet Altay 
Authored: Mon Mar 20 17:57:46 2017 -0700
Committer: Ahmet Altay 
Committed: Mon Mar 20 17:57:46 2017 -0700

--
 content/get-started/quickstart-py/index.html | 14 --
 src/get-started/quickstart-py.md | 12 +++-
 2 files changed, 15 insertions(+), 11 deletions(-)
--




[1/3] beam-site git commit: Simplify python quick start.

2017-03-20 Thread altay
Repository: beam-site
Updated Branches:
  refs/heads/asf-site 61de614fb -> 0b21f131a


Simplify python quick start.


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/af0821f3
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/af0821f3
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/af0821f3

Branch: refs/heads/asf-site
Commit: af0821f3628705036addc24f00d6ecb2a7fd9a4c
Parents: 61de614
Author: Ahmet Altay 
Authored: Mon Mar 20 15:31:14 2017 -0700
Committer: Ahmet Altay 
Committed: Mon Mar 20 17:57:02 2017 -0700

--
 src/get-started/quickstart-py.md | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam-site/blob/af0821f3/src/get-started/quickstart-py.md
--
diff --git a/src/get-started/quickstart-py.md b/src/get-started/quickstart-py.md
index c04bf09..6d729bd 100644
--- a/src/get-started/quickstart-py.md
+++ b/src/get-started/quickstart-py.md
@@ -63,7 +63,10 @@ For instructions using other shells, see the [virtualenv 
documentation](https://
 ### Download and install
 
 Install the latest Python SDK from PyPI:
-  `pip install apache-beam`
+
+```
+pip install apache-beam
+```
 
 ## Execute a pipeline locally
 
@@ -78,14 +81,13 @@ python -m apache_beam.examples.wordcount --input README.md 
--output counts
 
 {:.runner-dataflow}
 ```
-# As part of the initial setup, install gcp specific extra components.
-pip install dist/apache-beam-*.tar.gz .[gcp]
+# As part of the initial setup, install Google Cloud Platform specific extra 
components.
+pip install apache-beam[gcp]
 python -m apache_beam.examples.wordcount --input 
gs://dataflow-samples/shakespeare/kinglear.txt \
  --output 
gs:///counts \
  --runner DataflowRunner \
  --project your-gcp-project \
- --temp_location 
gs:///tmp/ \
- --sdk_location 
dist/apache-beam-*.tar.gz
+ --temp_location 
gs:///tmp/
 ```
 
 ## Next Steps



[jira] [Commented] (BEAM-1751) Singleton ByteKeyRange with BigtableIO and Dataflow runner

2017-03-20 Thread peay (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933915#comment-15933915
 ] 

peay commented on BEAM-1751:


Yes, correct. I have an unbounded source in this pipeline, and I use the 
collection from Bigtable as a side input to this other streaming source.

> Singleton ByteKeyRange with BigtableIO and Dataflow runner
> --
>
> Key: BEAM-1751
> URL: https://issues.apache.org/jira/browse/BEAM-1751
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-java-gcp
>Affects Versions: 0.5.0
>Reporter: peay
>Assignee: Daniel Halperin
>
> I am getting this exception on a smallish table of a couple hundred rows 
> from Bigtable, when running on Dataflow with a single worker.
> This doesn't occur with the direct runner on my laptop, only when running on 
> Dataflow. Backtrace is from Beam 0.5.
> {code}java.lang.IllegalArgumentException: Start [xx] must be less 
> than end [xx]
>   at 
> org.apache.beam.sdk.repackaged.com.google.common.base.Preconditions.checkArgument(Preconditions.java:146)
>   at 
> org.apache.beam.sdk.io.range.ByteKeyRange.(ByteKeyRange.java:288)
>   at 
> org.apache.beam.sdk.io.range.ByteKeyRange.withEndKey(ByteKeyRange.java:278)
>   at 
> org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$BigtableSource.withEndKey(BigtableIO.java:728)
>   at 
> org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$BigtableReader.splitAtFraction(BigtableIO.java:1034)
>   at 
> org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$BigtableReader.splitAtFraction(BigtableIO.java:953)
>   at 
> org.apache.beam.runners.dataflow.DataflowUnboundedReadFromBoundedSource$BoundedToUnboundedSourceAdapter$ResidualSource.getCheckpointMark(DataflowUnboundedReadFromBoundedSource.java:530)
>   at 
> org.apache.beam.runners.dataflow.DataflowUnboundedReadFromBoundedSource$BoundedToUnboundedSourceAdapter$Reader.getCheckpointMark(DataflowUnboundedReadFromBoundedSource.java:386)
>   at 
> org.apache.beam.runners.dataflow.DataflowUnboundedReadFromBoundedSource$BoundedToUnboundedSourceAdapter$Reader.getCheckpointMark(DataflowUnboundedReadFromBoundedSource.java:283)
>   at 
> com.google.cloud.dataflow.worker.runners.worker.StreamingModeExecutionContext.flushState(StreamingModeExecutionContext.java:278)
>   at 
> com.google.cloud.dataflow.worker.runners.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:778)
>   at 
> com.google.cloud.dataflow.worker.runners.worker.StreamingDataflowWorker.access$700(StreamingDataflowWorker.java:105)
>   at 
> com.google.cloud.dataflow.worker.runners.worker.StreamingDataflowWorker$9.run(StreamingDataflowWorker.java:858)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> This is in the log right before:
> {code}
> "Proposing to split 
> ByteKeyRangeTracker{range=ByteKeyRange{startKey=[xx], endKey=[]}, 
> position=null} at fraction 0.0 (key [xx])"   
> {code}
> I have replaced the actual key with {{xx}}, but it is always the same 
> everywhere. In 
> https://github.com/apache/beam/blob/e68a70e08c9fe00df9ec163d1532da130f69588a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/range/ByteKeyRange.java#L260,
>  the end position is obtained by truncating the fractional part of {{size * 
> fraction}}, such that the resulting offset can just be zero if {{fraction}} 
> is too small. {{ByteKeyRange}} does not allow a singleton range, however. Since 
> {{fraction}} is zero here, the call to {{splitAtFraction}} fails. 
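The truncation problem described above can be sketched in plain Python (function name and sizes hypothetical; the real logic lives in {{ByteKeyRange}}):

```python
def split_offset(range_size, fraction):
    # Truncating size * fraction (as int() does here) drops the fractional
    # part, so any fraction below 1/range_size collapses to offset 0,
    # i.e. the proposed split key equals the range's own start key.
    return int(range_size * fraction)

assert split_offset(200, 0.0) == 0    # fraction 0.0, as in the log above
assert split_offset(200, 0.004) == 0  # small fractions also collapse
assert split_offset(200, 0.5) == 100  # larger fractions behave as expected
```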





[jira] [Updated] (BEAM-632) Dataflow runner does not correctly flatten duplicate inputs

2017-03-20 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin updated BEAM-632:
-
Fix Version/s: First stable release

> Dataflow runner does not correctly flatten duplicate inputs
> ---
>
> Key: BEAM-632
> URL: https://issues.apache.org/jira/browse/BEAM-632
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Daniel Halperin
>Assignee: Kenneth Knowles
>Priority: Critical
> Fix For: First stable release
>
>
> https://github.com/apache/incubator-beam/pull/960
> Builds #1148+ are failing the new test that [~tgroh] added in that PR.
> https://builds.apache.org/job/beam_PostCommit_RunnableOnService_GoogleCloudDataflow/changes





[jira] [Commented] (BEAM-302) Add Scio Scala DSL to Beam

2017-03-20 Thread Nicholaus E Halecky (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933886#comment-15933886
 ] 

Nicholaus E Halecky commented on BEAM-302:
--

Hi all! Wonderful to see the progress made here so far. What is the current 
status of this effort?

> Add Scio Scala DSL to Beam
> --
>
> Key: BEAM-302
> URL: https://issues.apache.org/jira/browse/BEAM-302
> Project: Beam
>  Issue Type: Wish
>  Components: sdk-ideas
>Reporter: Jean-Baptiste Onofré
>Assignee: Neville Li
>






[jira] [Commented] (BEAM-775) Remove Aggregators from the Java SDK

2017-03-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933841#comment-15933841
 ] 

ASF GitHub Bot commented on BEAM-775:
-

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2277


> Remove Aggregators from the Java SDK
> 
>
> Key: BEAM-775
> URL: https://issues.apache.org/jira/browse/BEAM-775
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Ben Chambers
>Assignee: Ben Chambers
>  Labels: backward-incompatible
> Fix For: First stable release
>
>






[GitHub] beam pull request #2277: [BEAM-775] Remove Aggregators from BigQuery and Pub...

2017-03-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2277




[1/2] beam git commit: Remove Aggregators from BigQuery and PubSub

2017-03-20 Thread dhalperi
Repository: beam
Updated Branches:
  refs/heads/master 8d240981b -> 4ffd43ed7


Remove Aggregators from BigQuery and PubSub


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/695936ff
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/695936ff
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/695936ff

Branch: refs/heads/master
Commit: 695936ffaac03799d4ee972fb99b73202582e7fa
Parents: 8d24098
Author: Pablo 
Authored: Mon Mar 20 15:39:53 2017 -0700
Committer: Dan Halperin 
Committed: Mon Mar 20 17:15:35 2017 -0700

--
 .../apache/beam/sdk/io/PubsubUnboundedSink.java | 24 
 .../beam/sdk/io/PubsubUnboundedSource.java  |  8 +++
 .../beam/sdk/io/gcp/bigquery/BigQueryIO.java| 13 ---
 3 files changed, 19 insertions(+), 26 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/695936ff/sdks/java/core/src/main/java/org/apache/beam/sdk/io/PubsubUnboundedSink.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/PubsubUnboundedSink.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/PubsubUnboundedSink.java
index c726fd7..f41b5b7 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/PubsubUnboundedSink.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/PubsubUnboundedSink.java
@@ -45,15 +45,15 @@ import org.apache.beam.sdk.coders.NullableCoder;
 import org.apache.beam.sdk.coders.StringUtf8Coder;
 import org.apache.beam.sdk.coders.VarIntCoder;
 import org.apache.beam.sdk.io.PubsubIO.PubsubMessage;
+import org.apache.beam.sdk.metrics.Counter;
+import org.apache.beam.sdk.metrics.Metrics;
 import org.apache.beam.sdk.options.PubsubOptions;
 import org.apache.beam.sdk.options.ValueProvider;
-import org.apache.beam.sdk.transforms.Aggregator;
 import org.apache.beam.sdk.transforms.DoFn;
 import org.apache.beam.sdk.transforms.GroupByKey;
 import org.apache.beam.sdk.transforms.PTransform;
 import org.apache.beam.sdk.transforms.ParDo;
 import org.apache.beam.sdk.transforms.SimpleFunction;
-import org.apache.beam.sdk.transforms.Sum;
 import org.apache.beam.sdk.transforms.display.DisplayData;
 import org.apache.beam.sdk.transforms.display.DisplayData.Builder;
 import org.apache.beam.sdk.transforms.windowing.AfterFirst;
@@ -164,8 +164,7 @@ public class PubsubUnboundedSink extends 
PTransform {
* Convert elements to messages and shard them.
*/
   private static class ShardFn extends DoFn> {
-private final Aggregator elementCounter =
-createAggregator("elements", Sum.ofLongs());
+private final Counter elementCounter = Metrics.counter(ShardFn.class, 
"elements");
 private final Coder elementCoder;
 private final int numShards;
 private final RecordIdMethod recordIdMethod;
@@ -181,7 +180,7 @@ public class PubsubUnboundedSink extends 
PTransform {
 
 @ProcessElement
 public void processElement(ProcessContext c) throws Exception {
-  elementCounter.addValue(1L);
+  elementCounter.inc();
   byte[] elementBytes = null;
   Map attributes = ImmutableMap.of();
   if (formatFn != null) {
@@ -242,12 +241,9 @@ public class PubsubUnboundedSink extends 
PTransform {
 @Nullable
 private transient PubsubClient pubsubClient;
 
-private final Aggregator batchCounter =
-createAggregator("batches", Sum.ofLongs());
-private final Aggregator elementCounter =
-createAggregator("elements", Sum.ofLongs());
-private final Aggregator byteCounter =
-createAggregator("bytes", Sum.ofLongs());
+private final Counter batchCounter = Metrics.counter(WriterFn.class, 
"batches");
+private final Counter elementCounter = Metrics.counter(WriterFn.class, 
"elements");
+private final Counter byteCounter = Metrics.counter(WriterFn.class, 
"bytes");
 
 WriterFn(
 PubsubClientFactory pubsubFactory, ValueProvider topic,
@@ -269,9 +265,9 @@ public class PubsubUnboundedSink extends 
PTransform {
   int n = pubsubClient.publish(topic.get(), messages);
   checkState(n == messages.size(), "Attempted to publish %s messages but 
%s were successful",
  messages.size(), n);
-  batchCounter.addValue(1L);
-  elementCounter.addValue((long) messages.size());
-  byteCounter.addValue((long) bytes);
+  batchCounter.inc();
+  elementCounter.inc(messages.size());
+  byteCounter.inc(bytes);
 }
 
 @StartBundle
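The migration pattern in the diff above replaces combiner-backed {{Aggregator}} fields with named, inc-only metric counters. A minimal plain-Python stand-in for the counter semantics (not Beam's implementation; names are illustrative):

```python
from collections import defaultdict

class Counter:
    """Inc-only counter cell keyed by (namespace, name), mirroring the
    Metrics.counter(Class, String) calls in the diff above."""
    _cells = defaultdict(int)  # (namespace, name) -> running total

    def __init__(self, namespace, name):
        self.key = (namespace, name)

    def inc(self, n=1):
        Counter._cells[self.key] += n

# Usage mirroring WriterFn: one counter per metric, namespaced by class.
batch_counter = Counter('WriterFn', 'batches')
element_counter = Counter('WriterFn', 'elements')

batch_counter.inc()      # was: batchCounter.addValue(1L)
element_counter.inc(25)  # was: elementCounter.addValue((long) messages.size())

assert Counter._cells[('WriterFn', 'batches')] == 1
assert Counter._cells[('WriterFn', 'elements')] == 25
```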


[2/2] beam git commit: This closes #2277

2017-03-20 Thread dhalperi
This closes #2277


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/4ffd43ed
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/4ffd43ed
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/4ffd43ed

Branch: refs/heads/master
Commit: 4ffd43ed7728c51f6af0b40970b741cc961e9636
Parents: 8d24098 695936f
Author: Dan Halperin 
Authored: Mon Mar 20 17:15:39 2017 -0700
Committer: Dan Halperin 
Committed: Mon Mar 20 17:15:39 2017 -0700

--
 .../apache/beam/sdk/io/PubsubUnboundedSink.java | 24 
 .../beam/sdk/io/PubsubUnboundedSource.java  |  8 +++
 .../beam/sdk/io/gcp/bigquery/BigQueryIO.java| 13 ---
 3 files changed, 19 insertions(+), 26 deletions(-)
--




Jenkins build is back to stable : beam_PostCommit_Java_MavenInstall #2963

2017-03-20 Thread Apache Jenkins Server
See 




[jira] [Comment Edited] (BEAM-1761) Unintended unboxing of potential null pointer in AutoValue_HDFSFileSource

2017-03-20 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933746#comment-15933746
 ] 

Ted Yu edited comment on BEAM-1761 at 3/20/17 11:01 PM:


I checked the most recent master branch.
AutoValue_HDFSFileSink.java is generated code and HDFSFileSource.java doesn't 
contain the above quoted snippet :-)


was (Author: yuzhih...@gmail.com):
I checked most recent master branch.
AutoValue_HDFSFileSink.java is gone and HDFSFileSource.java doesn't contain the 
above quoted snippet :-)

> Unintended unboxing of potential null pointer in AutoValue_HDFSFileSource
> -
>
> Key: BEAM-1761
> URL: https://issues.apache.org/jira/browse/BEAM-1761
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
> Fix For: Not applicable
>
>
> {code}
>   if (validateSource == null) {
> missing += " validateSource";
>   }
> ...
>   return new AutoValue_HDFSFileSource(
>   this.filepattern,
>   this.formatClass,
>   this.coder,
>   this.inputConverter,
>   this.serializableConfiguration,
>   this.serializableSplit,
>   this.username,
>   this.validateSource);
> {code}
> If validateSource is null, it would be unboxed in call to ctor of 
> AutoValue_HDFSFileSource





[jira] [Resolved] (BEAM-1762) Python SDK Error Message no python 3 compatible

2017-03-20 Thread Ahmet Altay (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay resolved BEAM-1762.
---
   Resolution: Fixed
Fix Version/s: First stable release

> Python SDK Error Message no python 3 compatible
> ---
>
> Key: BEAM-1762
> URL: https://issues.apache.org/jira/browse/BEAM-1762
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: julien lhermitte
>Assignee: Ahmet Altay
>Priority: Trivial
> Fix For: First stable release
>
>
> The error message when checking for the correct python versions is not 
> forward compatible with future python versions.





[jira] [Resolved] (BEAM-1761) Unintended unboxing of potential null pointer in AutoValue_HDFSFileSource

2017-03-20 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved BEAM-1761.
--
   Resolution: Not A Problem
Fix Version/s: Not applicable

> Unintended unboxing of potential null pointer in AutoValue_HDFSFileSource
> -
>
> Key: BEAM-1761
> URL: https://issues.apache.org/jira/browse/BEAM-1761
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
> Fix For: Not applicable
>
>
> {code}
>   if (validateSource == null) {
> missing += " validateSource";
>   }
> ...
>   return new AutoValue_HDFSFileSource(
>   this.filepattern,
>   this.formatClass,
>   this.coder,
>   this.inputConverter,
>   this.serializableConfiguration,
>   this.serializableSplit,
>   this.username,
>   this.validateSource);
> {code}
> If validateSource is null, it would be unboxed in call to ctor of 
> AutoValue_HDFSFileSource





[jira] [Commented] (BEAM-1762) Python SDK Error Message no python 3 compatible

2017-03-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933759#comment-15933759
 ] 

ASF GitHub Bot commented on BEAM-1762:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2275


> Python SDK Error Message no python 3 compatible
> ---
>
> Key: BEAM-1762
> URL: https://issues.apache.org/jira/browse/BEAM-1762
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: julien lhermitte
>Assignee: Ahmet Altay
>Priority: Trivial
>
> The error message when checking for the correct python versions is not 
> forward compatible with future python versions.





[2/2] beam git commit: This closes #2275

2017-03-20 Thread altay
This closes #2275


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/8d240981
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/8d240981
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/8d240981

Branch: refs/heads/master
Commit: 8d240981b69803771bf025e4834686678ac68982
Parents: 9b48a2d 30b5fe5
Author: Ahmet Altay 
Authored: Mon Mar 20 15:52:09 2017 -0700
Committer: Ahmet Altay 
Committed: Mon Mar 20 15:52:09 2017 -0700

--
 sdks/python/apache_beam/__init__.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--




[GitHub] beam pull request #2275: [BEAM-1762] Make error message python 3.6 compatibl...

2017-03-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2275




[1/2] beam git commit: Make error message python 3.6 compatible

2017-03-20 Thread altay
Repository: beam
Updated Branches:
  refs/heads/master 9b48a2d78 -> 8d240981b


Make error message python 3.6 compatible


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/30b5fe55
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/30b5fe55
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/30b5fe55

Branch: refs/heads/master
Commit: 30b5fe552cbf40a6914d327ac5455394ee615493
Parents: 9b48a2d
Author: Julien L 
Authored: Mon Mar 20 14:09:08 2017 -0400
Committer: Ahmet Altay 
Committed: Mon Mar 20 15:51:40 2017 -0700

--
 sdks/python/apache_beam/__init__.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/30b5fe55/sdks/python/apache_beam/__init__.py
--
diff --git a/sdks/python/apache_beam/__init__.py 
b/sdks/python/apache_beam/__init__.py
index 5a63fff..e680c9e 100644
--- a/sdks/python/apache_beam/__init__.py
+++ b/sdks/python/apache_beam/__init__.py
@@ -69,7 +69,7 @@ import sys
 if not (sys.version_info[0] == 2 and sys.version_info[1] == 7):
   raise RuntimeError(
   'Dataflow SDK for Python is supported only on Python 2.7. '
-  'It is not supported on Python [%s].' % sys.version_info)
+  'It is not supported on Python ['+ str(sys.version_info) + '].')
 
 # pylint: disable=wrong-import-position
 import apache_beam.internal.pickler
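A plain-Python illustration of why the old %-formatting failed (using a plain tuple as a stand-in for {{sys.version_info}}, which formats the same way):

```python
# Stand-in for sys.version_info; the real object is a struct sequence but
# behaves like a tuple under %-formatting.
version = (3, 6, 0, 'final', 0)

# A tuple operand to % is unpacked as the full argument list, so five items
# against a single %s raises TypeError, masking the intended RuntimeError
# on unsupported Python versions.
try:
    'It is not supported on Python [%s].' % version
    format_failed = False
except TypeError:
    format_failed = True
assert format_failed

# Either concatenation (the committed fix) or wrapping in a one-tuple works:
fixed = 'It is not supported on Python [' + str(version) + '].'
also_fixed = 'It is not supported on Python [%s].' % (version,)
assert fixed == also_fixed
```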



Jenkins build became unstable: beam_PostCommit_Java_MavenInstall #2962

2017-03-20 Thread Apache Jenkins Server
See 




[jira] [Resolved] (BEAM-536) Aggregator.py. More misleading documentation. More bad documentation

2017-03-20 Thread Ahmet Altay (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay resolved BEAM-536.
--
   Resolution: Fixed
Fix Version/s: Not applicable

> Aggregator.py.  More misleading documentation.  More bad documentation
> --
>
> Key: BEAM-536
> URL: https://issues.apache.org/jira/browse/BEAM-536
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Frank Yellin
>Priority: Minor
> Fix For: Not applicable
>
>
> The last paragraph of the documentation for Aggregator is:
> You can also query the combined value(s) of an aggregator by calling
> aggregated_value() or aggregated_values() on the result object returned after
> running a pipeline.
> There are multiple problems in this one sentence!
> #1) There is no such method aggregated_value() that I can find anywhere.
> #2) DirectRunner implements aggregated_values(), but DirectPipelineRunner 
> does not.  The latter is the far more interesting case.
> #3) When I use a BlockingDirectPipelineRunner and ask for its 
> aggregated_values(), I get an error message indicating that this is not 
> implemented in DirectPipelineRunner.  Very confusing since I never asked for 
> a DirectPipelineRunner.
> It is clear that this is because BlockingDirectPipelineRunner is a method 
> rather than a class.  Is this really the right thing?  Will there be other 
> confusing error messages.
> #4) The documentation for aggregated_values() says "returns a dict of step 
> names to values of the aggregator."  I have no idea what a "step" means in 
> this context.  In practice, it seems to be a single-element dictionary whose 
> key is 'user--' prefixed onto the aggregator name.  





[jira] [Commented] (BEAM-536) Aggregator.py. More misleading documentation. More bad documentation

2017-03-20 Thread Ahmet Altay (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933754#comment-15933754
 ] 

Ahmet Altay commented on BEAM-536:
--

Agreed. All the mentioned issues are resolved. The Aggregator API has been 
replaced with metrics and BlockingDataflowPipelineRunner has been removed.

> Aggregator.py.  More misleading documentation.  More bad documentation
> --
>
> Key: BEAM-536
> URL: https://issues.apache.org/jira/browse/BEAM-536
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Frank Yellin
>Priority: Minor
>
> The last paragraph of the documentation for Aggregator is:
> You can also query the combined value(s) of an aggregator by calling
> aggregated_value() or aggregated_values() on the result object returned after
> running a pipeline.
> There are multiple problems in this one sentence!
> #1) There is no such method aggregated_value() that I can find anywhere.
> #2) DirectRunner implements aggregated_values(), but DirectPipelineRunner 
> does not.  The latter is the far more interesting case.
> #3) When I use a BlockingDirectPipelineRunner and ask for its 
> aggregated_values(), I get an error message indicating that this is not 
> implemented in DirectPipelineRunner.  Very confusing since I never asked for 
> a DirectPipelineRunner.
> It is clear that this is because BlockingDirectPipelineRunner is a method 
> rather than a class.  Is this really the right thing?  Will there be other 
> confusing error messages?
> #4) The documentation for aggregated_values() says "returns a dict of step 
> names to values of the aggregator."  I have no idea what a "step" means in 
> this context.  In practice, it seems to be a single-element dictionary whose 
> key is 'user--' prefixed onto the aggregator name.  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (BEAM-1753) ImportError (cannot import name descriptor) in new venv after 'python setup.py install'

2017-03-20 Thread Ahmet Altay (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay reassigned BEAM-1753:
-

Assignee: María GH  (was: Ahmet Altay)

> ImportError (cannot import name descriptor) in new venv after 'python 
> setup.py install'
> ---
>
> Key: BEAM-1753
> URL: https://issues.apache.org/jira/browse/BEAM-1753
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: María GH
>Assignee: María GH
>
> After 'python setup.py install' in a clean virtual environment, I get the 
> following when running nosetest:
> (dataflow) mariagh (ppp_inmaster *) python $ nosetests --logging-level=INFO 
> apache_beam/io/fileio_test.py
> /Users/mariagh/Documents/venvs/dataflow/lib/python2.7/site-packages/nose/plugins/manager.py:395:
>  RuntimeWarning: Unable to load plugin beam_test_plugin = 
> test_config:BeamTestPlugin: (dill 0.2.5 
> (/Users/mariagh/Documents/venvs/dataflow/lib/python2.7/site-packages), 
> Requirement.parse('dill==0.2.6'))
>   RuntimeWarning)
> Failure: ImportError (cannot import name descriptor) ... ERROR
> ==
> ERROR: Failure: ImportError (cannot import name descriptor)
> --
> Traceback (most recent call last):
>   File 
> "/Users/mariagh/Documents/venvs/dataflow/lib/python2.7/site-packages/nose/loader.py",
>  line 418, in loadTestsFromName
> addr.filename, addr.module)
>   File 
> "/Users/mariagh/Documents/venvs/dataflow/lib/python2.7/site-packages/nose/importer.py",
>  line 47, in importFromPath
> return self.importFromDir(dir_path, fqname)
>   File 
> "/Users/mariagh/Documents/venvs/dataflow/lib/python2.7/site-packages/nose/importer.py",
>  line 94, in importFromDir
> mod = load_module(part_fqname, fh, filename, desc)
>   File 
> "/Users/mariagh/Documents/beam/incubator-beam/sdks/python/apache_beam/__init__.py",
>  line 77, in <module>
> from apache_beam import coders
>   File 
> "/Users/mariagh/Documents/beam/incubator-beam/sdks/python/apache_beam/coders/__init__.py",
>  line 18, in <module>
> from apache_beam.coders.coders import *
>   File 
> "/Users/mariagh/Documents/beam/incubator-beam/sdks/python/apache_beam/coders/coders.py",
>  line 26, in <module>
> from apache_beam.utils import proto_utils
>   File 
> "/Users/mariagh/Documents/beam/incubator-beam/sdks/python/apache_beam/utils/proto_utils.py",
>  line 18, in <module>
> from google.protobuf import any_pb2
>   File "build/bdist.macosx-10.11-x86_64/egg/google/protobuf/any_pb2.py", line 
> 6, in <module>
> ImportError: cannot import name descriptor
> --
> Ran 1 test in 0.001s
> FAILED (errors=1)



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-1751) Singleton ByteKeyRange with BigtableIO and Dataflow runner

2017-03-20 Thread Daniel Halperin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933751#comment-15933751
 ] 

Daniel Halperin commented on BEAM-1751:
---

Are you running with the {{--streaming}} option set?

> Singleton ByteKeyRange with BigtableIO and Dataflow runner
> --
>
> Key: BEAM-1751
> URL: https://issues.apache.org/jira/browse/BEAM-1751
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-java-gcp
>Affects Versions: 0.5.0
>Reporter: peay
>Assignee: Daniel Halperin
>
> I am getting this exception on a smallish table of a couple hundreds of rows 
> from Bigtable, when running on Dataflow with a single worker.
> This doesn't occur with the direct runner on my laptop, only when running on 
> Dataflow. Backtrace is from Beam 0.5.
> {code}java.lang.IllegalArgumentException: Start [xx] must be less 
> than end [xx]
>   at 
> org.apache.beam.sdk.repackaged.com.google.common.base.Preconditions.checkArgument(Preconditions.java:146)
>   at 
> org.apache.beam.sdk.io.range.ByteKeyRange.<init>(ByteKeyRange.java:288)
>   at 
> org.apache.beam.sdk.io.range.ByteKeyRange.withEndKey(ByteKeyRange.java:278)
>   at 
> org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$BigtableSource.withEndKey(BigtableIO.java:728)
>   at 
> org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$BigtableReader.splitAtFraction(BigtableIO.java:1034)
>   at 
> org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$BigtableReader.splitAtFraction(BigtableIO.java:953)
>   at 
> org.apache.beam.runners.dataflow.DataflowUnboundedReadFromBoundedSource$BoundedToUnboundedSourceAdapter$ResidualSource.getCheckpointMark(DataflowUnboundedReadFromBoundedSource.java:530)
>   at 
> org.apache.beam.runners.dataflow.DataflowUnboundedReadFromBoundedSource$BoundedToUnboundedSourceAdapter$Reader.getCheckpointMark(DataflowUnboundedReadFromBoundedSource.java:386)
>   at 
> org.apache.beam.runners.dataflow.DataflowUnboundedReadFromBoundedSource$BoundedToUnboundedSourceAdapter$Reader.getCheckpointMark(DataflowUnboundedReadFromBoundedSource.java:283)
>   at 
> com.google.cloud.dataflow.worker.runners.worker.StreamingModeExecutionContext.flushState(StreamingModeExecutionContext.java:278)
>   at 
> com.google.cloud.dataflow.worker.runners.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:778)
>   at 
> com.google.cloud.dataflow.worker.runners.worker.StreamingDataflowWorker.access$700(StreamingDataflowWorker.java:105)
>   at 
> com.google.cloud.dataflow.worker.runners.worker.StreamingDataflowWorker$9.run(StreamingDataflowWorker.java:858)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> This is in the log right before:
> {code}
> "Proposing to split 
> ByteKeyRangeTracker{range=ByteKeyRange{startKey=[xx], endKey=[]}, 
> position=null} at fraction 0.0 (key [xx])"   
> {code}
> I have replaced the actual key with {{xx}}, but it is always the same 
> everywhere. In 
> https://github.com/apache/beam/blob/e68a70e08c9fe00df9ec163d1532da130f69588a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/range/ByteKeyRange.java#L260,
>  the end position is obtained by truncating the fractional part of {{size * 
> fraction}}, such that the resulting offset can just be zero if {{fraction}} 
> is too small. {{ByteKeyRange}} does not allow a singleton range, however. Since 
> {{fraction}} is zero here, the call to {{splitAtFraction}} fails. 
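
The truncation described above is easy to reproduce in isolation. The sketch 
below is a hypothetical simplification, not Beam's actual {{ByteKeyRange}} 
code: the split offset is {{size * fraction}} truncated toward zero, so any 
sufficiently small fraction proposes an end key equal to the start key, which 
the range constructor then rejects.

```java
// Hypothetical sketch of the truncation in ByteKeyRange.interpolateKey:
// a tiny fraction truncates to offset 0, producing a singleton range that
// fails checkArgument(start < end).
public class SplitFractionDemo {

    // Offset of the proposed split key, truncating size * fraction to a long.
    static long offsetForFraction(long size, double fraction) {
        return (long) (size * fraction);
    }

    public static void main(String[] args) {
        long size = 200; // smallish table: a couple hundred rows
        for (double fraction : new double[] {0.0, 0.004, 0.5}) {
            long offset = offsetForFraction(size, fraction);
            System.out.printf("fraction=%.3f -> offset=%d%s%n",
                fraction, offset,
                offset == 0 ? "  (singleton range: start must be less than end)" : "");
        }
    }
}
```

With fraction 0.0, as in the quoted log line, the offset is 0 and the proposed 
end key collapses onto the start key, matching the 
{{IllegalArgumentException}} in the backtrace.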



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-1761) Unintended unboxing of potential null pointer in AutoValue_HDFSFileSource

2017-03-20 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933746#comment-15933746
 ] 

Ted Yu commented on BEAM-1761:
--

I checked the most recent master branch.
AutoValue_HDFSFileSink.java is gone and HDFSFileSource.java doesn't contain the 
above quoted snippet :-)

> Unintended unboxing of potential null pointer in AutoValue_HDFSFileSource
> -
>
> Key: BEAM-1761
> URL: https://issues.apache.org/jira/browse/BEAM-1761
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
>
> {code}
>   if (validateSource == null) {
> missing += " validateSource";
>   }
> ...
>   return new AutoValue_HDFSFileSource(
>   this.filepattern,
>   this.formatClass,
>   this.coder,
>   this.inputConverter,
>   this.serializableConfiguration,
>   this.serializableSplit,
>   this.username,
>   this.validateSource);
> {code}
> If validateSource is null, it would be unboxed in call to ctor of 
> AutoValue_HDFSFileSource
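
The failure mode reported here can be shown with a minimal sketch (class and 
field names are hypothetical, not the generated AutoValue code): assigning a 
null {{Boolean}} to a primitive {{boolean}} constructor parameter auto-unboxes 
and throws {{NullPointerException}}.

```java
// Minimal illustration of the unboxing hazard: a null Boolean passed where a
// primitive boolean is expected throws NullPointerException at the call site.
public class UnboxingDemo {

    static class Source {
        final boolean validateSource;

        Source(boolean validateSource) { // primitive parameter forces unboxing
            this.validateSource = validateSource;
        }
    }

    static String build(Boolean validateSource) {
        try {
            new Source(validateSource); // NPE here when validateSource == null
            return "ok";
        } catch (NullPointerException e) {
            return "NPE on unboxing";
        }
    }

    public static void main(String[] args) {
        System.out.println(build(Boolean.TRUE)); // ok
        System.out.println(build(null));         // NPE on unboxing
    }
}
```

The `missing += " validateSource"` bookkeeping in the quoted snippet never 
prevents this: it only builds an error message, while the constructor call 
still unboxes the null value.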



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-775) Remove Aggregators from the Java SDK

2017-03-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933743#comment-15933743
 ] 

ASF GitHub Bot commented on BEAM-775:
-

GitHub user pabloem opened a pull request:

https://github.com/apache/beam/pull/2277

[BEAM-775] Remove Aggregators from BigQuery and PubSub

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pabloem/incubator-beam agg-bq-pubsub

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2277.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2277






> Remove Aggregators from the Java SDK
> 
>
> Key: BEAM-775
> URL: https://issues.apache.org/jira/browse/BEAM-775
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Ben Chambers
>Assignee: Ben Chambers
>  Labels: backward-incompatible
> Fix For: First stable release
>
>




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam pull request #2277: [BEAM-775] Remove Aggregators from BigQuery and Pub...

2017-03-20 Thread pabloem
GitHub user pabloem opened a pull request:

https://github.com/apache/beam/pull/2277

[BEAM-775] Remove Aggregators from BigQuery and PubSub

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pabloem/incubator-beam agg-bq-pubsub

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2277.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2277






---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Assigned] (BEAM-1751) Singleton ByteKeyRange with BigtableIO and Dataflow runner

2017-03-20 Thread Davor Bonaci (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davor Bonaci reassigned BEAM-1751:
--

Assignee: Daniel Halperin  (was: Davor Bonaci)

> Singleton ByteKeyRange with BigtableIO and Dataflow runner
> --
>
> Key: BEAM-1751
> URL: https://issues.apache.org/jira/browse/BEAM-1751
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-java-gcp
>Affects Versions: 0.5.0
>Reporter: peay
>Assignee: Daniel Halperin
>
> I am getting this exception on a smallish table of a couple hundreds of rows 
> from Bigtable, when running on Dataflow with a single worker.
> This doesn't occur with the direct runner on my laptop, only when running on 
> Dataflow. Backtrace is from Beam 0.5.
> {code}java.lang.IllegalArgumentException: Start [xx] must be less 
> than end [xx]
>   at 
> org.apache.beam.sdk.repackaged.com.google.common.base.Preconditions.checkArgument(Preconditions.java:146)
>   at 
> org.apache.beam.sdk.io.range.ByteKeyRange.<init>(ByteKeyRange.java:288)
>   at 
> org.apache.beam.sdk.io.range.ByteKeyRange.withEndKey(ByteKeyRange.java:278)
>   at 
> org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$BigtableSource.withEndKey(BigtableIO.java:728)
>   at 
> org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$BigtableReader.splitAtFraction(BigtableIO.java:1034)
>   at 
> org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$BigtableReader.splitAtFraction(BigtableIO.java:953)
>   at 
> org.apache.beam.runners.dataflow.DataflowUnboundedReadFromBoundedSource$BoundedToUnboundedSourceAdapter$ResidualSource.getCheckpointMark(DataflowUnboundedReadFromBoundedSource.java:530)
>   at 
> org.apache.beam.runners.dataflow.DataflowUnboundedReadFromBoundedSource$BoundedToUnboundedSourceAdapter$Reader.getCheckpointMark(DataflowUnboundedReadFromBoundedSource.java:386)
>   at 
> org.apache.beam.runners.dataflow.DataflowUnboundedReadFromBoundedSource$BoundedToUnboundedSourceAdapter$Reader.getCheckpointMark(DataflowUnboundedReadFromBoundedSource.java:283)
>   at 
> com.google.cloud.dataflow.worker.runners.worker.StreamingModeExecutionContext.flushState(StreamingModeExecutionContext.java:278)
>   at 
> com.google.cloud.dataflow.worker.runners.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:778)
>   at 
> com.google.cloud.dataflow.worker.runners.worker.StreamingDataflowWorker.access$700(StreamingDataflowWorker.java:105)
>   at 
> com.google.cloud.dataflow.worker.runners.worker.StreamingDataflowWorker$9.run(StreamingDataflowWorker.java:858)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}
> This is in the log right before:
> {code}
> "Proposing to split 
> ByteKeyRangeTracker{range=ByteKeyRange{startKey=[xx], endKey=[]}, 
> position=null} at fraction 0.0 (key [xx])"   
> {code}
> I have replaced the actual key with {{xx}}, but it is always the same 
> everywhere. In 
> https://github.com/apache/beam/blob/e68a70e08c9fe00df9ec163d1532da130f69588a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/range/ByteKeyRange.java#L260,
>  the end position is obtained by truncating the fractional part of {{size * 
> fraction}}, such that the resulting offset can just be zero if {{fraction}} 
> is too small. {{ByteKeyRange}} does not allow a singleton range, however. Since 
> {{fraction}} is zero here, the call to {{splitAtFraction}} fails. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Resolved] (BEAM-1568) Ineffective null check in IsmFormat#structuralValue

2017-03-20 Thread Davor Bonaci (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davor Bonaci resolved BEAM-1568.

   Resolution: Fixed
Fix Version/s: First stable release

> Ineffective null check in IsmFormat#structuralValue
> ---
>
> Key: BEAM-1568
> URL: https://issues.apache.org/jira/browse/BEAM-1568
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
> Fix For: First stable release
>
>
> {code}
> public Object structuralValue(IsmRecord<V> record) throws Exception {
>   checkState(record.getKeyComponents().size() == 
> keyComponentCoders.size(),
>   "Expected the number of key component coders %s "
>   + "to match the number of key components %s.",
>   keyComponentCoders.size(), record.getKeyComponents());
>   if (record != null && consistentWithEquals()) {
> {code}
> record is de-referenced before the null check.
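
Why the guard is dead code, in a simplified sketch (the names are 
illustrative, not the real {{IsmFormat}} types): the dereference in the first 
statement means a null argument never reaches the {{record != null}} test.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch of an ineffective null check: the method dereferences
// `record` before testing it for null, so the guard can never fire.
public class NullCheckDemo {

    static class IsmRecord {
        List<Object> keyComponents = new ArrayList<>();
        List<Object> getKeyComponents() { return keyComponents; }
    }

    static String structuralValue(IsmRecord record) {
        try {
            int n = record.getKeyComponents().size(); // dereference happens first
            if (record != null) {                     // dead guard for null inputs
                return "components=" + n;
            }
            return "null record";                     // unreachable
        } catch (NullPointerException e) {
            return "NPE before the null check";
        }
    }

    public static void main(String[] args) {
        System.out.println(structuralValue(new IsmRecord()));
        System.out.println(structuralValue(null));
    }
}
```

The straightforward fix is to validate the argument up front (for example with 
a {{checkNotNull(record)}} at the top of the method) and drop the dead guard.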



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (BEAM-1568) Ineffective null check in IsmFormat#structuralValue

2017-03-20 Thread Davor Bonaci (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davor Bonaci reassigned BEAM-1568:
--

Assignee: Ted Yu  (was: Davor Bonaci)

> Ineffective null check in IsmFormat#structuralValue
> ---
>
> Key: BEAM-1568
> URL: https://issues.apache.org/jira/browse/BEAM-1568
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
> Fix For: First stable release
>
>
> {code}
> public Object structuralValue(IsmRecord<V> record) throws Exception {
>   checkState(record.getKeyComponents().size() == 
> keyComponentCoders.size(),
>   "Expected the number of key component coders %s "
>   + "to match the number of key components %s.",
>   keyComponentCoders.size(), record.getKeyComponents());
>   if (record != null && consistentWithEquals()) {
> {code}
> record is de-referenced before the null check.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-1761) Unintended unboxing of potential null pointer in AutoValue_HDFSFileSource

2017-03-20 Thread Davor Bonaci (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933742#comment-15933742
 ] 

Davor Bonaci commented on BEAM-1761:


(That one is done.)

> Unintended unboxing of potential null pointer in AutoValue_HDFSFileSource
> -
>
> Key: BEAM-1761
> URL: https://issues.apache.org/jira/browse/BEAM-1761
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
>
> {code}
>   if (validateSource == null) {
> missing += " validateSource";
>   }
> ...
>   return new AutoValue_HDFSFileSource(
>   this.filepattern,
>   this.formatClass,
>   this.coder,
>   this.inputConverter,
>   this.serializableConfiguration,
>   this.serializableSplit,
>   this.username,
>   this.validateSource);
> {code}
> If validateSource is null, it would be unboxed in call to ctor of 
> AutoValue_HDFSFileSource



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (BEAM-1754) Will Dataflow ever support Node.js with an SDK similar to Java or Python?

2017-03-20 Thread Davor Bonaci (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davor Bonaci reassigned BEAM-1754:
--

Assignee: (was: Davor Bonaci)

> Will Dataflow ever support Node.js with an SDK similar to Java or Python?
> -
>
> Key: BEAM-1754
> URL: https://issues.apache.org/jira/browse/BEAM-1754
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-ideas
>Reporter: Diego Zuluaga
>Priority: Critical
>  Labels: node.js
>
> I like the philosophy behind DataFlow and found the Java and Python samples 
> highly comprehensible. However, I have to admit that for most Node.js 
> developers who have little background on typed languages and are used to get 
> up to speed with frameworks incredibly fast, learning Dataflow might take 
> some learning curve that they/we're not used to. So, I wonder if at any point 
> in time Dataflow will provide a Node.js SDK. Maybe this is out of the 
> question, but I wanted to run it by the team as it would be awesome to have 
> something along these lines!
> Thanks,
> Diego
> Question originally posted in SO:
> http://stackoverflow.com/questions/42893436/will-dataflow-ever-support-node-js-with-and-sdk-similar-to-java-or-python



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (BEAM-1754) Will Dataflow ever support Node.js with an SDK similar to Java or Python?

2017-03-20 Thread Davor Bonaci (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933740#comment-15933740
 ] 

Davor Bonaci commented on BEAM-1754:


It is hard to say whether there'll be such an SDK/DSL in Beam and/or be 
supported for execution by a particular Beam runner.

It is certainly within Beam's scope and vision to allow for multiple SDKs/DSLs, 
including Node.js. If someone wants to contribute it and a Beam committer is 
willing to support it, it is certainly within the realm of possibility. If/when 
that might happen, I cannot speculate.

> Will Dataflow ever support Node.js with an SDK similar to Java or Python?
> -
>
> Key: BEAM-1754
> URL: https://issues.apache.org/jira/browse/BEAM-1754
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-ideas
>Reporter: Diego Zuluaga
>Assignee: Davor Bonaci
>Priority: Critical
>  Labels: node.js
>
> I like the philosophy behind DataFlow and found the Java and Python samples 
> highly comprehensible. However, I have to admit that for most Node.js 
> developers who have little background on typed languages and are used to get 
> up to speed with frameworks incredibly fast, learning Dataflow might take 
> some learning curve that they/we're not used to. So, I wonder if at any point 
> in time Dataflow will provide a Node.js SDK. Maybe this is out of the 
> question, but I wanted to run it by the team as it would be awesome to have 
> something along these lines!
> Thanks,
> Diego
> Question originally posted in SO:
> http://stackoverflow.com/questions/42893436/will-dataflow-ever-support-node-js-with-and-sdk-similar-to-java-or-python



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (BEAM-1754) Will Dataflow ever support Node.js with an SDK similar to Java or Python?

2017-03-20 Thread Davor Bonaci (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davor Bonaci updated BEAM-1754:
---
Labels: node.js  (was: newbie node.js)

> Will Dataflow ever support Node.js with an SDK similar to Java or Python?
> -
>
> Key: BEAM-1754
> URL: https://issues.apache.org/jira/browse/BEAM-1754
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-ideas
>Reporter: Diego Zuluaga
>Assignee: Davor Bonaci
>Priority: Critical
>  Labels: node.js
>
> I like the philosophy behind DataFlow and found the Java and Python samples 
> highly comprehensible. However, I have to admit that for most Node.js 
> developers who have little background on typed languages and are used to get 
> up to speed with frameworks incredibly fast, learning Dataflow might take 
> some learning curve that they/we're not used to. So, I wonder if at any point 
> in time Dataflow will provide a Node.js SDK. Maybe this is out of the 
> question, but I wanted to run it by the team as it would be awesome to have 
> something along these lines!
> Thanks,
> Diego
> Question originally posted in SO:
> http://stackoverflow.com/questions/42893436/will-dataflow-ever-support-node-js-with-and-sdk-similar-to-java-or-python



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] beam-site pull request #192: Simplify python quick start.

2017-03-20 Thread aaltay
GitHub user aaltay opened a pull request:

https://github.com/apache/beam-site/pull/192

Simplify python quick start.

R: @chamikaramj 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aaltay/beam-site quick

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam-site/pull/192.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #192


commit 22026fe0b4a9483c31ed4e99350b8eb4046fac33
Author: Ahmet Altay 
Date:   2017-03-20T22:31:14Z

Simplify python quick start.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] beam pull request #2227: BEAM-1568 neffective null check in IsmFormat#struct...

2017-03-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2227


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-1761) Unintended unboxing of potential null pointer in AutoValue_HDFSFileSource

2017-03-20 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933732#comment-15933732
 ] 

Ted Yu commented on BEAM-1761:
--

Can you review the PR for BEAM-1568 first?

I can work on this once that is resolved.

> Unintended unboxing of potential null pointer in AutoValue_HDFSFileSource
> -
>
> Key: BEAM-1761
> URL: https://issues.apache.org/jira/browse/BEAM-1761
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
>
> {code}
>   if (validateSource == null) {
> missing += " validateSource";
>   }
> ...
>   return new AutoValue_HDFSFileSource(
>   this.filepattern,
>   this.formatClass,
>   this.coder,
>   this.inputConverter,
>   this.serializableConfiguration,
>   this.serializableSplit,
>   this.username,
>   this.validateSource);
> {code}
> If validateSource is null, it would be unboxed in call to ctor of 
> AutoValue_HDFSFileSource



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[2/2] beam git commit: This closes #2227

2017-03-20 Thread davor
This closes #2227


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/9b48a2d7
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/9b48a2d7
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/9b48a2d7

Branch: refs/heads/master
Commit: 9b48a2d78e37f51b0fa9184b4a743abda177d730
Parents: 59aa0da 656d195
Author: Davor Bonaci 
Authored: Mon Mar 20 15:33:28 2017 -0700
Committer: Davor Bonaci 
Committed: Mon Mar 20 15:33:28 2017 -0700

--
 .../java/org/apache/beam/runners/dataflow/internal/IsmFormat.java | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
--




[1/2] beam git commit: BEAM-1568 neffective null check in IsmFormat#structuralValue

2017-03-20 Thread davor
Repository: beam
Updated Branches:
  refs/heads/master 59aa0dab7 -> 9b48a2d78


BEAM-1568 neffective null check in IsmFormat#structuralValue


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/656d1958
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/656d1958
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/656d1958

Branch: refs/heads/master
Commit: 656d1958f4b43d326542f0ae9c5f2650967e7de3
Parents: 59aa0da
Author: tedyu 
Authored: Sat Mar 11 19:59:19 2017 -0800
Committer: Davor Bonaci 
Committed: Mon Mar 20 15:33:22 2017 -0700

--
 .../java/org/apache/beam/runners/dataflow/internal/IsmFormat.java | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/656d1958/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/internal/IsmFormat.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/internal/IsmFormat.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/internal/IsmFormat.java
index 5b733c8..6daddc6 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/internal/IsmFormat.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/internal/IsmFormat.java
@@ -404,12 +404,13 @@ public class IsmFormat {
 
 @Override
 public Object structuralValue(IsmRecord record) throws Exception {
+  checkNotNull(record);
   checkState(record.getKeyComponents().size() == keyComponentCoders.size(),
   "Expected the number of key component coders %s "
   + "to match the number of key components %s.",
   keyComponentCoders.size(), record.getKeyComponents());
 
-  if (record != null && consistentWithEquals()) {
+  if (consistentWithEquals()) {
 ArrayList<Object> keyComponentStructuralValues = new ArrayList<>();
 for (int i = 0; i < keyComponentCoders.size(); ++i) {
   keyComponentStructuralValues.add(



[jira] [Commented] (BEAM-1447) Autodetect streaming/not streaming in DataflowRunner

2017-03-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933731#comment-15933731
 ] 

ASF GitHub Bot commented on BEAM-1447:
--

GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/2276

[BEAM-1447] Remove need for Streaming Flag in Dataflow

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---
Autodetect if streaming is required, and if so run in Streaming.

I would love to have tests for this, but given the location it occurs,
I'm not sure how to do so effectively.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam automatic_streaming_dataflow

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2276.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2276


commit 3ac32c7e0658089e6d28832ea7ade292d3a8d769
Author: Thomas Groh 
Date:   2017-03-20T21:40:05Z

Remove need for Streaming Flag in Dataflow

Autodetect if streaming is required, and if so run in Streaming.




> Autodetect streaming/not streaming in DataflowRunner
> 
>
> Key: BEAM-1447
> URL: https://issues.apache.org/jira/browse/BEAM-1447
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Assignee: Thomas Groh
>
> Once pipeline surgery happens after construction, the Dataflow runner should 
> be able to automatically decide how to execute a pipeline based on 
> PCollection boundedness.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Jenkins build is back to stable : beam_PostCommit_Java_RunnableOnService_Spark #1297

2017-03-20 Thread Apache Jenkins Server
See 




[jira] [Assigned] (BEAM-1760) Potential null dereference in HDFSFileSink#doFinalize

2017-03-20 Thread Davor Bonaci (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davor Bonaci reassigned BEAM-1760:
--

Assignee: Ted Yu  (was: Davor Bonaci)

> Potential null dereference in HDFSFileSink#doFinalize
> -
>
> Key: BEAM-1760
> URL: https://issues.apache.org/jira/browse/BEAM-1760
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
>
> Here is related code:
> {code}
>   for (FileStatus s : statuses) {
> String name = s.getPath().getName();
> int pos = name.indexOf('.');
> String ext = pos > 0 ? name.substring(pos) : "";
> fs.rename(
> s.getPath(),
> new Path(s.getPath().getParent(), String.format("part-r-%05d%s", 
> i, ext)));
> i++;
>   }
> }
> {code}
> We should check whether s.getPath().getParent() is null.
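The hazard in the quoted loop is that `getParent()` can return null (for a root path or a bare filename), so the `new Path(parent, ...)` call can dereference null. A guarded version of the rename-target computation can be sketched as follows — this is a Python illustration of the pattern, assuming a hypothetical `parent_of` helper that mirrors Hadoop's nullable `Path#getParent`, not the HDFSFileSink fix itself.

```python
import posixpath

def parent_of(path):
    # Returns None for a root or bare filename, mirroring how
    # Hadoop's Path#getParent can return null.
    parent = posixpath.dirname(path)
    return parent if parent not in ("", path) else None

def rename_target(path, index):
    # Compute the "part-r-%05d%s" target used in the quoted loop,
    # but fail loudly instead of dereferencing a missing parent.
    name = posixpath.basename(path)
    pos = name.find(".")
    ext = name[pos:] if pos > 0 else ""
    parent = parent_of(path)
    if parent is None:
        raise ValueError("cannot finalize: %r has no parent directory" % path)
    return posixpath.join(parent, "part-r-%05d%s" % (index, ext))
```

The equivalent Java fix would be an explicit `if (parent == null)` check (or a precondition) before constructing the destination `Path`.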





[jira] [Commented] (BEAM-1760) Potential null dereference in HDFSFileSink#doFinalize

2017-03-20 Thread Davor Bonaci (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933721#comment-15933721
 ] 

Davor Bonaci commented on BEAM-1760:


[~yuzhih...@gmail.com], would you mind submitting a PR to fix this?

> Potential null dereference in HDFSFileSink#doFinalize
> -
>
> Key: BEAM-1760
> URL: https://issues.apache.org/jira/browse/BEAM-1760
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
>
> Here is related code:
> {code}
>   for (FileStatus s : statuses) {
> String name = s.getPath().getName();
> int pos = name.indexOf('.');
> String ext = pos > 0 ? name.substring(pos) : "";
> fs.rename(
> s.getPath(),
> new Path(s.getPath().getParent(), String.format("part-r-%05d%s", 
> i, ext)));
> i++;
>   }
> }
> {code}
> We should check whether s.getPath().getParent() is null.





[jira] [Updated] (BEAM-1761) Unintended unboxing of potential null pointer in AutoValue_HDFSFileSource

2017-03-20 Thread Davor Bonaci (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davor Bonaci updated BEAM-1761:
---
Component/s: (was: sdk-java-core)
 sdk-java-extensions

> Unintended unboxing of potential null pointer in AutoValue_HDFSFileSource
> -
>
> Key: BEAM-1761
> URL: https://issues.apache.org/jira/browse/BEAM-1761
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
>
> {code}
>   if (validateSource == null) {
> missing += " validateSource";
>   }
> ...
>   return new AutoValue_HDFSFileSource(
>   this.filepattern,
>   this.formatClass,
>   this.coder,
>   this.inputConverter,
>   this.serializableConfiguration,
>   this.serializableSplit,
>   this.username,
>   this.validateSource);
> {code}
> If validateSource is null, it would be unboxed in the call to the
> constructor of AutoValue_HDFSFileSource.
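The failure mode here is that a `Boolean` field holding null is auto-unboxed to `boolean` inside the generated constructor, throwing a NullPointerException far from the real cause. The defensive pattern — collect the names of all unset required properties and fail with one clear message, as the quoted builder's `missing` string does — can be sketched like this. This is an illustrative Python analogue with hypothetical names, not the AutoValue-generated code.

```python
def build_source(filepattern=None, validate_source=None):
    # Accumulate the names of unset required properties, as the quoted
    # AutoValue builder does, instead of letting a None reach a context
    # that requires a concrete boolean (the Java unboxing site).
    missing = ""
    if filepattern is None:
        missing += " filepattern"
    if validate_source is None:
        missing += " validateSource"
    if missing:
        raise ValueError("Missing required properties:" + missing)
    return {"filepattern": filepattern, "validateSource": bool(validate_source)}
```

The point of checking before construction is that the error names the offending property, rather than surfacing as an opaque NullPointerException from generated code.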





[jira] [Commented] (BEAM-1761) Unintended unboxing of potential null pointer in AutoValue_HDFSFileSource

2017-03-20 Thread Davor Bonaci (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933718#comment-15933718
 ] 

Davor Bonaci commented on BEAM-1761:


[~yuzhih...@gmail.com], would you mind submitting a PR to fix this?

> Unintended unboxing of potential null pointer in AutoValue_HDFSFileSource
> -
>
> Key: BEAM-1761
> URL: https://issues.apache.org/jira/browse/BEAM-1761
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Ted Yu
>Assignee: Davor Bonaci
>Priority: Minor
>
> {code}
>   if (validateSource == null) {
> missing += " validateSource";
>   }
> ...
>   return new AutoValue_HDFSFileSource(
>   this.filepattern,
>   this.formatClass,
>   this.coder,
>   this.inputConverter,
>   this.serializableConfiguration,
>   this.serializableSplit,
>   this.username,
>   this.validateSource);
> {code}
> If validateSource is null, it would be unboxed in the call to the
> constructor of AutoValue_HDFSFileSource.





[jira] [Assigned] (BEAM-1761) Unintended unboxing of potential null pointer in AutoValue_HDFSFileSource

2017-03-20 Thread Davor Bonaci (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davor Bonaci reassigned BEAM-1761:
--

Assignee: Ted Yu  (was: Davor Bonaci)

> Unintended unboxing of potential null pointer in AutoValue_HDFSFileSource
> -
>
> Key: BEAM-1761
> URL: https://issues.apache.org/jira/browse/BEAM-1761
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
>
> {code}
>   if (validateSource == null) {
> missing += " validateSource";
>   }
> ...
>   return new AutoValue_HDFSFileSource(
>   this.filepattern,
>   this.formatClass,
>   this.coder,
>   this.inputConverter,
>   this.serializableConfiguration,
>   this.serializableSplit,
>   this.username,
>   this.validateSource);
> {code}
> If validateSource is null, it would be unboxed in the call to the
> constructor of AutoValue_HDFSFileSource.





[jira] [Assigned] (BEAM-1767) Remove Aggregators from Dataflow runner

2017-03-20 Thread Davor Bonaci (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davor Bonaci reassigned BEAM-1767:
--

Assignee: Pablo Estrada  (was: Davor Bonaci)

> Remove Aggregators from Dataflow runner
> ---
>
> Key: BEAM-1767
> URL: https://issues.apache.org/jira/browse/BEAM-1767
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>
> I have started removing aggregators from the Java SDK, but runners use them 
> in different ways that I can't figure out well. This is to track the 
> independent effort in Dataflow.





[jira] [Commented] (BEAM-772) Implement Metrics support for Dataflow Runner

2017-03-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933696#comment-15933696
 ] 

ASF GitHub Bot commented on BEAM-772:
-

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2223


> Implement Metrics support for Dataflow Runner
> -
>
> Key: BEAM-772
> URL: https://issues.apache.org/jira/browse/BEAM-772
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-dataflow
>Reporter: Ben Chambers
>Assignee: Ben Chambers
>






[2/2] beam git commit: Closes #2223

2017-03-20 Thread bchambers
Closes #2223


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/59aa0dab
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/59aa0dab
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/59aa0dab

Branch: refs/heads/master
Commit: 59aa0dab728213cfdb049892b09ac09a1c3b3846
Parents: 1d9772a 6de412a
Author: bchambers 
Authored: Mon Mar 20 14:48:56 2017 -0700
Committer: bchambers 
Committed: Mon Mar 20 14:48:56 2017 -0700

--
 .../beam/runners/direct/DirectMetrics.java  |  70 +-
 .../beam/runners/direct/DirectMetricsTest.java  |  86 ++-
 .../beam/runners/dataflow/DataflowMetrics.java  | 212 +
 .../runners/dataflow/DataflowPipelineJob.java   |  14 +-
 .../runners/dataflow/DataflowMetricsTest.java   | 236 +++
 .../beam/sdk/metrics/MetricFiltering.java   |  99 
 .../beam/sdk/metrics/MetricFilteringTest.java   |  72 ++
 7 files changed, 655 insertions(+), 134 deletions(-)
--




[GitHub] beam pull request #2223: [BEAM-772] Adding support for metrics querying in D...

2017-03-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2223


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[1/2] beam git commit: Support for querying metrics in Dataflow Runner

2017-03-20 Thread bchambers
Repository: beam
Updated Branches:
  refs/heads/master 1d9772a3a -> 59aa0dab7


Support for querying metrics in Dataflow Runner

Added MetricsFiltering class for helper methods related to matching step
names.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/6de412a5
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/6de412a5
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/6de412a5

Branch: refs/heads/master
Commit: 6de412a5dfb3000ab5d354ada8761789230d3ce3
Parents: 1d9772a
Author: Pablo 
Authored: Fri Mar 10 16:10:31 2017 -0800
Committer: bchambers 
Committed: Mon Mar 20 14:48:22 2017 -0700

--
 .../beam/runners/direct/DirectMetrics.java  |  70 +-
 .../beam/runners/direct/DirectMetricsTest.java  |  86 ++-
 .../beam/runners/dataflow/DataflowMetrics.java  | 212 +
 .../runners/dataflow/DataflowPipelineJob.java   |  14 +-
 .../runners/dataflow/DataflowMetricsTest.java   | 236 +++
 .../beam/sdk/metrics/MetricFiltering.java   |  99 
 .../beam/sdk/metrics/MetricFilteringTest.java   |  72 ++
 7 files changed, 655 insertions(+), 134 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/6de412a5/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectMetrics.java
--
diff --git 
a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectMetrics.java
 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectMetrics.java
index fa8f9c3..f04dc21 100644
--- 
a/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectMetrics.java
+++ 
b/runners/direct-java/src/main/java/org/apache/beam/runners/direct/DirectMetrics.java
@@ -20,12 +20,10 @@ package org.apache.beam.runners.direct;
 import static java.util.Arrays.asList;
 
 import com.google.auto.value.AutoValue;
-import com.google.common.base.Objects;
 import com.google.common.collect.ImmutableList;
 import java.util.ArrayList;
 import java.util.Map;
 import java.util.Map.Entry;
-import java.util.Set;
 import java.util.concurrent.ConcurrentHashMap;
 import java.util.concurrent.ConcurrentMap;
 import java.util.concurrent.ExecutorService;
@@ -35,9 +33,9 @@ import javax.annotation.concurrent.GuardedBy;
 import org.apache.beam.runners.direct.DirectRunner.CommittedBundle;
 import org.apache.beam.sdk.metrics.DistributionData;
 import org.apache.beam.sdk.metrics.DistributionResult;
+import org.apache.beam.sdk.metrics.MetricFiltering;
 import org.apache.beam.sdk.metrics.MetricKey;
 import org.apache.beam.sdk.metrics.MetricName;
-import org.apache.beam.sdk.metrics.MetricNameFilter;
 import org.apache.beam.sdk.metrics.MetricQueryResults;
 import org.apache.beam.sdk.metrics.MetricResult;
 import org.apache.beam.sdk.metrics.MetricResults;
@@ -258,7 +256,7 @@ class DirectMetrics extends MetricResults {
   MetricsFilter filter,
   ImmutableList.Builder resultsBuilder,
   Map.Entry entry) {
-if (matches(filter, entry.getKey())) {
+if (MetricFiltering.matches(filter, entry.getKey())) {
   resultsBuilder.add(DirectMetricResult.create(
   entry.getKey().metricName(),
   entry.getKey().stepName(),
@@ -267,70 +265,6 @@ class DirectMetrics extends MetricResults {
 }
   }
 
-  // Matching logic is implemented here rather than in MetricsFilter because 
we would like
-  // MetricsFilter to act as a "dumb" value-object, with the possibility of 
replacing it with
-  // a Proto/JSON/etc. schema object.
-  private boolean matches(MetricsFilter filter, MetricKey key) {
-return matchesName(key.metricName(), filter.names())
-&& matchesScope(key.stepName(), filter.steps());
-  }
-
-  /**
-  * {@code subPathMatches(haystack, needle)} returns true if {@code needle}
-  * represents a path within {@code haystack}. For example, "foo/bar" is in 
"a/foo/bar/b",
-  * but not "a/fool/bar/b" or "a/foo/bart/b".
-  */
-  public boolean subPathMatches(String haystack, String needle) {
-int location = haystack.indexOf(needle);
-int end = location + needle.length();
-if (location == -1) {
-  return false;  // needle not found
-} else if (location != 0 && haystack.charAt(location - 1) != '/') {
-  return false; // the first entry in needle wasn't exactly matched
-} else if (end != haystack.length() && haystack.charAt(end) != '/') {
-  return false; // the last entry in needle wasn't exactly matched
-} else {
-  return true;
-}
-  }
-
-  /**
-   * {@code matchesScope(actualScope, scopes)} returns true if the scope of a 
metric is matched
-   * by any of the filters in {@code scopes}. A metric scope is a path of type 
"A/B/D". 

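The `subPathMatches` logic removed in the diff above — "foo/bar" is in "a/foo/bar/b", but not "a/fool/bar/b" or "a/foo/bart/b" — translates directly. This is a Python sketch of that matching rule (the moved code lives in Java's `MetricFiltering`); like the original, it only considers the first occurrence of `needle`.

```python
def sub_path_matches(haystack, needle):
    # True when `needle` is a complete, /-delimited sub-path of `haystack`.
    location = haystack.find(needle)
    if location == -1:
        return False  # needle not found at all
    end = location + len(needle)
    if location != 0 and haystack[location - 1] != "/":
        return False  # needle starts in the middle of a path component
    if end != len(haystack) and haystack[end] != "/":
        return False  # needle ends in the middle of a path component
    return True
```

Keeping this logic out of `MetricsFilter` itself (as the removed comment explains) leaves the filter a plain value object that could later be swapped for a Proto/JSON schema.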
[GitHub] beam-site pull request #168: Add windowing section to programming guide

2017-03-20 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam-site/pull/168




[2/3] beam-site git commit: Regenerate website

2017-03-20 Thread davor
Regenerate website


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/fa7d6168
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/fa7d6168
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/fa7d6168

Branch: refs/heads/asf-site
Commit: fa7d61680afb67248c2b9ae55417bc0ce7148c9b
Parents: ddb6079
Author: Davor Bonaci 
Authored: Mon Mar 20 14:56:42 2017 -0700
Committer: Davor Bonaci 
Committed: Mon Mar 20 14:56:42 2017 -0700

--
 .../documentation/programming-guide/index.html  | 248 ++-
 content/images/fixed-time-windows.png   | Bin 0 -> 11717 bytes
 content/images/session-windows.png  | Bin 0 -> 16697 bytes
 content/images/sliding-time-windows.png | Bin 0 -> 16537 bytes
 content/images/unwindowed-pipeline-bounded.png  | Bin 0 -> 9589 bytes
 content/images/windowing-pipeline-bounded.png   | Bin 0 -> 13325 bytes
 content/images/windowing-pipeline-unbounded.png | Bin 0 -> 21890 bytes
 7 files changed, 245 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam-site/blob/fa7d6168/content/documentation/programming-guide/index.html
--
diff --git a/content/documentation/programming-guide/index.html 
b/content/documentation/programming-guide/index.html
index 19853df..9d0a3b6 100644
--- a/content/documentation/programming-guide/index.html
+++ b/content/documentation/programming-guide/index.html
@@ -369,7 +369,7 @@
 
 The bounded (or unbounded) nature of your PCollection affects how Beam processes your 
data. A bounded PCollection can be 
processed using a batch job, which might read the entire data set once, and 
perform processing in a job of finite length. An unbounded PCollection must be processed using a 
streaming job that runs continuously, as the entire collection can never be 
available for processing at any one time.
 
-When performing an operation that groups elements in an unbounded PCollection, Beam requires a concept called 
Windowing to divide a continuously updating data set into 
logical windows of finite size.  Beam processes each window as a bundle, and 
processing continues as the data set is generated. These logical windows are 
determined by some characteristic associated with a data element, such as a 
timestamp.
+When performing an operation that groups elements in an unbounded PCollection, Beam requires a concept called 
windowing to divide a continuously updating data set into 
logical windows of finite size.  Beam processes each window as a bundle, and 
processing continues as the data set is generated. These logical windows are 
determined by some characteristic associated with a data element, such as a 
timestamp.
 
 Element timestamps
 
@@ -1522,8 +1522,250 @@ tree, [2]
 
 The Beam SDK for Python does not support annotating 
data types with a default coder. If you would like to set a default coder, use 
the method described in the previous section, Setting the default coder for 
a type.
 
-
-
+Working with windowing
+
+Windowing subdivides a PCollection 
according to the timestamps of its individual elements. Transforms that 
aggregate multiple elements, such as GroupByKey and Combine, work implicitly on a per-window 
basis—that is, they process each PCollection as a succession of multiple, 
finite windows, though the entire collection itself may be of unbounded 
size.
+
+A related concept, called triggers, determines when to 
emit the results of aggregation as unbounded data arrives. Using a trigger can 
help to refine the windowing strategy for your PCollection to deal with late-arriving data or 
to provide early results. See the triggers section for 
more information.
+
+Windowing basics
+
+Some Beam transforms, such as GroupByKey and Combine, group multiple elements by a common 
key. Ordinarily, that grouping operation groups all of the elements that have 
the same key within the entire data set. With an unbounded data set, it is 
impossible to collect all of the elements, since new elements are constantly 
being added and may be infinitely many (e.g. streaming data). If you are 
working with unbounded PCollections, 
windowing is especially useful.
+
+In the Beam model, any PCollection 
(including unbounded PCollections) can 
be subdivided into logical windows. Each element in a PCollection is assigned to one or more windows 
according to the PCollection’s 
windowing function, and each individual window contains a finite number of 
elements. Grouping transforms then consider each PCollection’s elements on a per-window 
basis. GroupByKey, for example, 
implicitly groups the elements of a PCollection by key and window.
+
+Caution: The default windowing behavior is to assign all 

[3/3] beam-site git commit: This closes #168

2017-03-20 Thread davor
This closes #168


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/61de614f
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/61de614f
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/61de614f

Branch: refs/heads/asf-site
Commit: 61de614fb259b3fb5bb623245bae801bbc0759ad
Parents: 3b3bc65 fa7d616
Author: Davor Bonaci 
Authored: Mon Mar 20 14:56:42 2017 -0700
Committer: Davor Bonaci 
Committed: Mon Mar 20 14:56:42 2017 -0700

--
 .../documentation/programming-guide/index.html  | 248 ++-
 content/images/fixed-time-windows.png   | Bin 0 -> 11717 bytes
 content/images/session-windows.png  | Bin 0 -> 16697 bytes
 content/images/sliding-time-windows.png | Bin 0 -> 16537 bytes
 content/images/unwindowed-pipeline-bounded.png  | Bin 0 -> 9589 bytes
 content/images/windowing-pipeline-bounded.png   | Bin 0 -> 13325 bytes
 content/images/windowing-pipeline-unbounded.png | Bin 0 -> 21890 bytes
 src/documentation/programming-guide.md  | 226 -
 src/images/fixed-time-windows.png   | Bin 0 -> 11717 bytes
 src/images/session-windows.png  | Bin 0 -> 16697 bytes
 src/images/sliding-time-windows.png | Bin 0 -> 16537 bytes
 src/images/unwindowed-pipeline-bounded.png  | Bin 0 -> 9589 bytes
 src/images/windowing-pipeline-bounded.png   | Bin 0 -> 13325 bytes
 src/images/windowing-pipeline-unbounded.png | Bin 0 -> 21890 bytes
 14 files changed, 467 insertions(+), 7 deletions(-)
--




[1/3] beam-site git commit: Add windowing section to programming guide

2017-03-20 Thread davor
Repository: beam-site
Updated Branches:
  refs/heads/asf-site 3b3bc65c2 -> 61de614fb


Add windowing section to programming guide


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/ddb60795
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/ddb60795
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/ddb60795

Branch: refs/heads/asf-site
Commit: ddb6079538a1683b798c72644eaa32f5aa7211bb
Parents: 3b3bc65
Author: melissa 
Authored: Wed Mar 1 19:54:17 2017 -0800
Committer: Davor Bonaci 
Committed: Mon Mar 20 14:56:17 2017 -0700

--
 src/documentation/programming-guide.md  | 226 ++-
 src/images/fixed-time-windows.png   | Bin 0 -> 11717 bytes
 src/images/session-windows.png  | Bin 0 -> 16697 bytes
 src/images/sliding-time-windows.png | Bin 0 -> 16537 bytes
 src/images/unwindowed-pipeline-bounded.png  | Bin 0 -> 9589 bytes
 src/images/windowing-pipeline-bounded.png   | Bin 0 -> 13325 bytes
 src/images/windowing-pipeline-unbounded.png | Bin 0 -> 21890 bytes
 7 files changed, 222 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam-site/blob/ddb60795/src/documentation/programming-guide.md
--
diff --git a/src/documentation/programming-guide.md 
b/src/documentation/programming-guide.md
index 57b49e8..e91a856 100644
--- a/src/documentation/programming-guide.md
+++ b/src/documentation/programming-guide.md
@@ -197,7 +197,7 @@ A `PCollection` can be either **bounded** or **unbounded** 
in size. A **bounded*
 
 The bounded (or unbounded) nature of your `PCollection` affects how Beam 
processes your data. A bounded `PCollection` can be processed using a batch 
job, which might read the entire data set once, and perform processing in a job 
of finite length. An unbounded `PCollection` must be processed using a 
streaming job that runs continuously, as the entire collection can never be 
available for processing at any one time.
 
-When performing an operation that groups elements in an unbounded 
`PCollection`, Beam requires a concept called **Windowing** to divide a 
continuously updating data set into logical windows of finite size.  Beam 
processes each window as a bundle, and processing continues as the data set is 
generated. These logical windows are determined by some characteristic 
associated with a data element, such as a **timestamp**.
+When performing an operation that groups elements in an unbounded 
`PCollection`, Beam requires a concept called **windowing** to divide a 
continuously updating data set into logical windows of finite size.  Beam 
processes each window as a bundle, and processing continues as the data set is 
generated. These logical windows are determined by some characteristic 
associated with a data element, such as a **timestamp**.
 
  Element timestamps
 
@@ -1193,7 +1193,7 @@ To set the default Coder for a Java Integer 
int values for a pipeline.
 
-```java  
+```java
 PipelineOptions options = PipelineOptionsFactory.create();
 Pipeline p = Pipeline.create(options);
 
@@ -1235,7 +1235,225 @@ public class MyCustomDataType {
 {:.language-py}
 The Beam SDK for Python does not support annotating data types with a default 
coder. If you would like to set a default coder, use the method described in 
the previous section, *Setting the default coder for a type*.
 
-
-
+## Working with windowing
+
+Windowing subdivides a `PCollection` according to the timestamps of its 
individual elements. Transforms that aggregate multiple elements, such as 
`GroupByKey` and `Combine`, work implicitly on a per-window basis—that is, 
they process each `PCollection` as a succession of multiple, finite windows, 
though the entire collection itself may be of unbounded size.
+
+A related concept, called **triggers**, determines when to emit the results of 
aggregation as unbounded data arrives. Using a trigger can help to refine the 
windowing strategy for your `PCollection` to deal with late-arriving data or to 
provide early results. See the [triggers](#triggers) section for more 
information.
+
+### Windowing basics
+
+Some Beam transforms, such as `GroupByKey` and `Combine`, group multiple 
elements by a common key. Ordinarily, that grouping operation groups all of the 
elements that have the same key within the entire data set. With an unbounded 
data set, it is impossible to collect all of the elements, since new elements 
are constantly being added and may be infinitely many (e.g. streaming data). If 
you are working with unbounded `PCollection`s, windowing is especially useful.
+
+In the Beam model, any `PCollection` (including unbounded `PCollection`s) can 
be subdivided into logical windows. 
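The per-window grouping described in this section can be sketched concretely. Below is a minimal Python illustration, assuming fixed 60-second windows; it is a toy model of the idea that `GroupByKey` implicitly groups by (key, window), not Beam's windowing implementation.

```python
WINDOW_SIZE_SECS = 60  # assumed fixed-window size for the sketch

def fixed_window(timestamp_secs, size=WINDOW_SIZE_SECS):
    # Assign an element to the fixed window covering its timestamp,
    # represented as a (start, end) pair.
    start = (timestamp_secs // size) * size
    return (start, start + size)

def group_by_key_and_window(timestamped_kvs, size=WINDOW_SIZE_SECS):
    # Grouping on windowed data groups per (key, window), so an
    # unbounded stream is processed as a succession of finite windows.
    groups = {}
    for ts, key, value in timestamped_kvs:
        groups.setdefault((key, fixed_window(ts, size)), []).append(value)
    return groups
```

Elements with the same key but timestamps in different windows end up in separate groups, which is what makes aggregation feasible over an unbounded collection.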

[jira] [Reopened] (BEAM-1157) Create HBaseIO

2017-03-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía reopened BEAM-1157:


> Create HBaseIO
> --
>
> Key: BEAM-1157
> URL: https://issues.apache.org/jira/browse/BEAM-1157
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Affects Versions: Not applicable
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
> Fix For: 0.6.0
>
>
> Support for reading and writing to HBase. An initial plan is to keep a 
> similar API to that of BigTableIO so users can switch between both systems 
> (when possible).





[jira] [Resolved] (BEAM-1157) Create HBaseIO

2017-03-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía resolved BEAM-1157.

Resolution: Fixed

> Create HBaseIO
> --
>
> Key: BEAM-1157
> URL: https://issues.apache.org/jira/browse/BEAM-1157
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Affects Versions: Not applicable
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
> Fix For: 0.6.0
>
>
> Support for reading and writing to HBase. An initial plan is to keep a 
> similar API to that of BigTableIO so users can switch between both systems 
> (when possible).





[jira] [Closed] (BEAM-1157) Create HBaseIO

2017-03-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía closed BEAM-1157.
--

> Create HBaseIO
> --
>
> Key: BEAM-1157
> URL: https://issues.apache.org/jira/browse/BEAM-1157
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Affects Versions: Not applicable
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
> Fix For: 0.6.0
>
>
> Support for reading and writing to HBase. An initial plan is to keep a 
> similar API to that of BigTableIO so users can switch between both systems 
> (when possible).





[jira] [Updated] (BEAM-1157) Create HBaseIO

2017-03-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-1157:
---
Issue Type: New Feature  (was: Improvement)

> Create HBaseIO
> --
>
> Key: BEAM-1157
> URL: https://issues.apache.org/jira/browse/BEAM-1157
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Affects Versions: Not applicable
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
> Fix For: 0.6.0
>
>
> Support for reading and writing to HBase. An initial plan is to keep a 
> similar API to that of BigTableIO so users can switch between both systems 
> (when possible).





Jenkins build is back to stable : beam_PostCommit_Java_RunnableOnService_Dataflow #2595

2017-03-20 Thread Apache Jenkins Server
See 




[jira] [Created] (BEAM-1767) Remove Aggregators from Dataflow runner

2017-03-20 Thread Pablo Estrada (JIRA)
Pablo Estrada created BEAM-1767:
---

 Summary: Remove Aggregators from Dataflow runner
 Key: BEAM-1767
 URL: https://issues.apache.org/jira/browse/BEAM-1767
 Project: Beam
  Issue Type: Bug
  Components: runner-dataflow
Reporter: Pablo Estrada
Assignee: Davor Bonaci


I have started removing aggregators from the Java SDK, but runners use them in 
different ways that I can't figure out well. This is to track the independent 
effort in Dataflow.





[jira] [Created] (BEAM-1765) Remove Aggregators from Spark runner

2017-03-20 Thread Pablo Estrada (JIRA)
Pablo Estrada created BEAM-1765:
---

 Summary: Remove Aggregators from Spark runner
 Key: BEAM-1765
 URL: https://issues.apache.org/jira/browse/BEAM-1765
 Project: Beam
  Issue Type: Bug
  Components: runner-spark
Reporter: Pablo Estrada
Assignee: Amit Sela


I have started removing aggregators from the Java SDK, but runners use them in 
different ways that I can't figure out well. This is to track the independent 
effort in Spark.





[jira] [Created] (BEAM-1766) Remove Aggregators from Apex runner

2017-03-20 Thread Pablo Estrada (JIRA)
Pablo Estrada created BEAM-1766:
---

 Summary: Remove Aggregators from Apex runner
 Key: BEAM-1766
 URL: https://issues.apache.org/jira/browse/BEAM-1766
 Project: Beam
  Issue Type: Bug
  Components: runner-apex
Reporter: Pablo Estrada


I have started removing aggregators from the Java SDK, but runners use them in 
different ways that I can't figure out well. This is to track the independent 
effort in Apex.





[jira] [Created] (BEAM-1764) Remove aggregators from Flink Runner

2017-03-20 Thread Pablo Estrada (JIRA)
Pablo Estrada created BEAM-1764:
---

 Summary: Remove aggregators from Flink Runner
 Key: BEAM-1764
 URL: https://issues.apache.org/jira/browse/BEAM-1764
 Project: Beam
  Issue Type: Bug
  Components: runner-flink
Reporter: Pablo Estrada
Assignee: Aljoscha Krettek


I have started removing aggregators from the Java SDK, but runners use them in 
different ways that I can't figure out well. This is to track the independent 
effort in Flink.





[jira] [Commented] (BEAM-1103) Add Tests For Aggregators in Flink Runner

2017-03-20 Thread Pablo Estrada (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933620#comment-15933620
 ] 

Pablo Estrada commented on BEAM-1103:
-

Perhaps we should close this since aggregators are being removed?

> Add Tests For Aggregators in Flink Runner
> -
>
> Key: BEAM-1103
> URL: https://issues.apache.org/jira/browse/BEAM-1103
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Aljoscha Krettek
>
> We currently don't have tests that verify that aggregator values are 
> correctly forwarded to Flink.
> They didn't work correctly in the Batch Flink runner, as seen in BEAM-1102.





[jira] [Commented] (BEAM-536) Aggregator.py. More misleading documentation. More bad documentation

2017-03-20 Thread Pablo Estrada (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933618#comment-15933618
 ] 

Pablo Estrada commented on BEAM-536:


Maybe mark as resolved?

> Aggregator.py.  More misleading documentation.  More bad documentation
> --
>
> Key: BEAM-536
> URL: https://issues.apache.org/jira/browse/BEAM-536
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Frank Yellin
>Priority: Minor
>
> The last paragraph of the documentation for Aggregator is:
> You can also query the combined value(s) of an aggregator by calling
> aggregated_value() or aggregated_values() on the result object returned after
> running a pipeline.
> There are multiple problems in this one sentence!
> #1) There is no such method aggregated_value() that I can find anywhere.
> #2) DirectRunner implements aggregated_values(), but DirectPipelineRunner 
> does not.  The latter is the far more interesting case.
> #3) When I use a BlockingDirectPipelineRunner and ask for its 
> aggregated_values(), I get an error message indicating that this is not 
> implemented in DirectPipelineRunner.  Very confusing since I never asked for 
> a DirectPipelineRunner.
> It is clear that this is because BlockingDirectPipelineRunner is a method 
> rather than a class.  Is this really the right thing?  Will there be other 
> confusing error messages?
> #4) The documentation for aggregated_values() says "returns a dict of step 
> names to values of the aggregator."  I have no idea what a "step" means in 
> this context.  In practice, it seems to be a single-element dictionary whose 
> key is 'user--' prefixed onto the aggregator name.  
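The 'user--' key shape described in #4 can be illustrated with a small self-contained sketch. This is a mock of the observed DirectRunner behavior only; it does not call Beam, and the name mock_aggregated_values is hypothetical:

```python
# Hypothetical mock of the dict shape the reporter observes from
# aggregated_values(): a single-element dict whose key is the aggregator
# name with a 'user--' prefix, rather than a bare step name.
def mock_aggregated_values(aggregator_name, value):
    """Mimic the single-element result dict described in #4."""
    return {'user--' + aggregator_name: value}

print(mock_aggregated_values('empty_lines', 5))  # {'user--empty_lines': 5}
```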



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Jenkins build is back to normal : beam_PostCommit_Python_Verify #1558

2017-03-20 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-1753) ImportError (cannot import name descriptor) in new venv after 'python setup.py install'

2017-03-20 Thread Ahmet Altay (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933611#comment-15933611
 ] 

Ahmet Altay commented on BEAM-1753:
---

I cannot reproduce this on a Mac.

Here is what I did:
- Fresh clone/fresh virtualenv
- python setup.py install
- pip install nose
- pip install pyhamcrest
- nosetests --logging-level=INFO apache_beam/io/fileio_test.py

Output I got:

$ nosetests --logging-level=INFO apache_beam/io/fileio_test.py
/Users/altay/Desktop/beam/tesm/beam/sdks/python/venv_tesm/lib/python2.7/site-packages/nose/plugins/manager.py:395:
 RuntimeWarning: Unable to load plugin beam_test_plugin = 
test_config:BeamTestPlugin: No module named test_config
  RuntimeWarning)
test_size_of_files_in_glob_complete 
(apache_beam.io.fileio_test.TestChannelFactory) ... ok
test_size_of_files_in_glob_incomplete 
(apache_beam.io.fileio_test.TestChannelFactory) ... ok
test_seekable (apache_beam.io.fileio_test.TestCompressedFile) ... ok
test_tell (apache_beam.io.fileio_test.TestCompressedFile) ... ok
test_empty_write (apache_beam.io.fileio_test.TestFileSink) ... ok
test_file_sink_display_data (apache_beam.io.fileio_test.TestFileSink) ... ok
test_file_sink_io_error (apache_beam.io.fileio_test.TestFileSink) ... ok
test_file_sink_multi_shards (apache_beam.io.fileio_test.TestFileSink) ... ok
test_file_sink_writing (apache_beam.io.fileio_test.TestFileSink) ... ok
test_fixed_shard_write (apache_beam.io.fileio_test.TestFileSink) ... 
/Users/altay/Desktop/beam/tesm/beam/sdks/python/apache_beam/coders/typecoders.py:132:
 UserWarning: Using fallback coder for typehint: Any.
  warnings.warn('Using fallback coder for typehint: %r.' % typehint)
ok
test_rename_batch (apache_beam.io.fileio_test.TestFileSink) ... ok

--
Ran 11 tests in 1.227s

OK

This is the output of 'pip list':
apache-beam (0.7.0.dev0)
appdirs (1.4.3)
avro (1.8.1)
crcmod (1.7)
dill (0.2.6)
funcsigs (1.0.2)
httplib2 (0.9.2)
mock (2.0.0)
nose (1.3.7)
oauth2client (3.0.0)
packaging (16.8)
pbr (2.0.0)
pip (9.0.1)
protobuf (3.2.0)
pyasn1 (0.2.3)
pyasn1-modules (0.0.8)
PyHamcrest (1.9.0)
pyparsing (2.2.0)
PyYAML (3.12)
rsa (3.4.2)
setuptools (34.3.2)
six (1.10.0)
wheel (0.30.0a0)

> ImportError (cannot import name descriptor) in new venv after 'python 
> setup.py install'
> ---
>
> Key: BEAM-1753
> URL: https://issues.apache.org/jira/browse/BEAM-1753
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: María GH
>Assignee: Ahmet Altay
>
> After 'python setup.py install' in a clean virtual environment, I get the 
> following when running nosetest:
> (dataflow) mariagh (ppp_inmaster *) python $ nosetests --logging-level=INFO 
> apache_beam/io/fileio_test.py
> /Users/mariagh/Documents/venvs/dataflow/lib/python2.7/site-packages/nose/plugins/manager.py:395:
>  RuntimeWarning: Unable to load plugin beam_test_plugin = 
> test_config:BeamTestPlugin: (dill 0.2.5 
> (/Users/mariagh/Documents/venvs/dataflow/lib/python2.7/site-packages), 
> Requirement.parse('dill==0.2.6'))
>   RuntimeWarning)
> Failure: ImportError (cannot import name descriptor) ... ERROR
> ==
> ERROR: Failure: ImportError (cannot import name descriptor)
> --
> Traceback (most recent call last):
>   File 
> "/Users/mariagh/Documents/venvs/dataflow/lib/python2.7/site-packages/nose/loader.py",
>  line 418, in loadTestsFromName
> addr.filename, addr.module)
>   File 
> "/Users/mariagh/Documents/venvs/dataflow/lib/python2.7/site-packages/nose/importer.py",
>  line 47, in importFromPath
> return self.importFromDir(dir_path, fqname)
>   File 
> "/Users/mariagh/Documents/venvs/dataflow/lib/python2.7/site-packages/nose/importer.py",
>  line 94, in importFromDir
> mod = load_module(part_fqname, fh, filename, desc)
>   File 
> "/Users/mariagh/Documents/beam/incubator-beam/sdks/python/apache_beam/__init__.py",
> line 77, in <module>
> from apache_beam import coders
>   File 
> "/Users/mariagh/Documents/beam/incubator-beam/sdks/python/apache_beam/coders/__init__.py",
> line 18, in <module>
> from apache_beam.coders.coders import *
>   File 
> "/Users/mariagh/Documents/beam/incubator-beam/sdks/python/apache_beam/coders/coders.py",
> line 26, in <module>
> from apache_beam.utils import proto_utils
>   File 
> "/Users/mariagh/Documents/beam/incubator-beam/sdks/python/apache_beam/utils/proto_utils.py",
> line 18, in <module>
> from google.protobuf import any_pb2
>   File "build/bdist.macosx-10.11-x86_64/egg/google/protobuf/any_pb2.py", line 
> 6, in <module>
> ImportError: cannot import name descriptor
> 
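One way to narrow down an import failure like the one above is to print where Python actually resolves each package from; the egg path in the traceback (build/.../egg/google/protobuf/any_pb2.py) suggests the package came from a setuptools egg rather than a pip-installed wheel. A minimal, hedged diagnostic sketch (resolve_path is an illustrative helper, not Beam or protobuf API):

```python
# Hedged diagnostic sketch: report the file a module would be imported from,
# to spot a stale egg or an unexpected site-packages entry in a broken venv.
import importlib.util

def resolve_path(module_name):
    """Return the path a module resolves to, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        # Raised when a parent package is missing or the name is invalid.
        return None
    return spec.origin if spec else None

print(resolve_path('json') is not None)    # True: stdlib always resolves
print(resolve_path('no_such_module_xyz'))  # None
```

Running this inside the failing virtualenv with 'google.protobuf' as the argument would show whether the import is being served from the egg path seen in the traceback.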

Jenkins build became unstable: beam_PostCommit_Java_RunnableOnService_Spark #1296

2017-03-20 Thread Apache Jenkins Server
See 




Jenkins build is back to stable : beam_PostCommit_Java_MavenInstall #2960

2017-03-20 Thread Apache Jenkins Server
See 




Jenkins build is back to stable : beam_PostCommit_Java_RunnableOnService_Spark #1295

2017-03-20 Thread Apache Jenkins Server
See 



