Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Spark #2482

2017-06-26 Thread Apache Jenkins Server
See 




[GitHub] beam pull request #3448: [BEAM-2521] Use installed distribution name for sdk...

2017-06-26 Thread aaltay
GitHub user aaltay opened a pull request:

https://github.com/apache/beam/pull/3448

[BEAM-2521] Use installed distribution name for sdk name

Choose SDK name based on installed distributions. This would make it easier 
for downstream distributions directly depending on Beam to use custom sdk 
names. (It is also cleaner than using container versions.)
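
A minimal sketch of the idea described above, assuming Python 3.8+ and the standard `importlib.metadata` module. The distribution name `example-downstream-sdk` and both display names are hypothetical placeholders; the lookup mechanism the pull request actually uses is not shown in this message.

```python
from importlib import metadata

# Hypothetical candidates, most specific first: a downstream distribution
# that depends on Beam, then Beam itself. Names are illustrative only.
CANDIDATES = [
    ("example-downstream-sdk", "Example Downstream SDK for Python"),
    ("apache-beam", "Apache Beam SDK for Python"),
]


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def choose_sdk_name(candidates=CANDIDATES, lookup=installed_version):
    """Pick the SDK name of the first candidate distribution that is installed."""
    for dist_name, sdk_name in candidates:
        version = lookup(dist_name)
        if version is not None:
            return sdk_name, version
    # Nothing found: report an unknown SDK rather than guessing.
    return "Unknown SDK", None
```

Passing `lookup` as a parameter keeps the selection logic testable without installing any package.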

R: @charlesccychen 
cc: @tvalentyn 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aaltay/beam dist2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3448.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3448


commit 98bf7e8e78b6b8ad615d438b04af6c3a3863ac79
Author: Ahmet Altay 
Date:   2017-06-27T06:22:36Z

Use installed distribution name for sdk name




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2521) Simplify packaging for python distributions

2017-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16064345#comment-16064345
 ] 

ASF GitHub Bot commented on BEAM-2521:
--

GitHub user aaltay opened a pull request:

https://github.com/apache/beam/pull/3448

[BEAM-2521] Use installed distribution name for sdk name

Choose SDK name based on installed distributions. This would make it easier 
for downstream distributions directly depending on Beam to use custom sdk 
names. (It is also cleaner than using container versions.)

R: @charlesccychen 
cc: @tvalentyn 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aaltay/beam dist2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3448.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3448


commit 98bf7e8e78b6b8ad615d438b04af6c3a3863ac79
Author: Ahmet Altay 
Date:   2017-06-27T06:22:36Z

Use installed distribution name for sdk name




> Simplify packaging for python distributions
> ---
>
> Key: BEAM-2521
> URL: https://issues.apache.org/jira/browse/BEAM-2521
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Ahmet Altay
>Assignee: Ahmet Altay
> Fix For: 2.1.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Spark #2481

2017-06-26 Thread Apache Jenkins Server
See 




[GitHub] beam pull request #3444: Implement streaming GroupByKey in Python DirectRunn...

2017-06-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3444




[2/2] beam git commit: This closes #3444

2017-06-26 Thread altay
This closes #3444


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/8036001d
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/8036001d
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/8036001d

Branch: refs/heads/master
Commit: 8036001da6f90eac20d787a05d35a51e30146278
Parents: 95e6bbe eb379e7
Author: Ahmet Altay 
Authored: Mon Jun 26 22:49:19 2017 -0700
Committer: Ahmet Altay 
Committed: Mon Jun 26 22:49:19 2017 -0700

--
 .../apache_beam/runners/direct/direct_runner.py |  29 +++-
 .../runners/direct/evaluation_context.py|   2 +-
 .../runners/direct/transform_evaluator.py   | 138 ++-
 sdks/python/apache_beam/runners/direct/util.py  |  25 ++--
 .../runners/direct/watermark_manager.py |  26 ++--
 .../apache_beam/testing/test_stream_test.py |  37 -
 sdks/python/apache_beam/transforms/trigger.py   |  16 +++
 7 files changed, 239 insertions(+), 34 deletions(-)
--




[1/2] beam git commit: Implement streaming GroupByKey in Python DirectRunner

2017-06-26 Thread altay
Repository: beam
Updated Branches:
  refs/heads/master 95e6bbe50 -> 8036001da


Implement streaming GroupByKey in Python DirectRunner


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/eb379e76
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/eb379e76
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/eb379e76

Branch: refs/heads/master
Commit: eb379e76adaa9c4b4e24a4b3c5757be8523d95c4
Parents: 95e6bbe
Author: Charles Chen 
Authored: Mon Jun 26 16:54:00 2017 -0700
Committer: Ahmet Altay 
Committed: Mon Jun 26 22:49:06 2017 -0700

--
 .../apache_beam/runners/direct/direct_runner.py |  29 +++-
 .../runners/direct/evaluation_context.py|   2 +-
 .../runners/direct/transform_evaluator.py   | 138 ++-
 sdks/python/apache_beam/runners/direct/util.py  |  25 ++--
 .../runners/direct/watermark_manager.py |  26 ++--
 .../apache_beam/testing/test_stream_test.py |  37 -
 sdks/python/apache_beam/transforms/trigger.py   |  16 +++
 7 files changed, 239 insertions(+), 34 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/eb379e76/sdks/python/apache_beam/runners/direct/direct_runner.py
--
diff --git a/sdks/python/apache_beam/runners/direct/direct_runner.py b/sdks/python/apache_beam/runners/direct/direct_runner.py
index d80ef10..2a75977 100644
--- a/sdks/python/apache_beam/runners/direct/direct_runner.py
+++ b/sdks/python/apache_beam/runners/direct/direct_runner.py
@@ -34,6 +34,7 @@ from apache_beam.runners.runner import PipelineRunner
 from apache_beam.runners.runner import PipelineState
 from apache_beam.runners.runner import PValueCache
 from apache_beam.transforms.core import _GroupAlsoByWindow
+from apache_beam.transforms.core import _GroupByKeyOnly
 from apache_beam.options.pipeline_options import DirectOptions
 from apache_beam.options.pipeline_options import StandardOptions
 from apache_beam.options.value_provider import RuntimeValueProvider
@@ -47,6 +48,13 @@ K = typehints.TypeVariable('K')
 V = typehints.TypeVariable('V')
 
 
+@typehints.with_input_types(typehints.KV[K, V])
+@typehints.with_output_types(typehints.KV[K, typehints.Iterable[V]])
+class _StreamingGroupByKeyOnly(_GroupByKeyOnly):
+  """Streaming GroupByKeyOnly placeholder for overriding in DirectRunner."""
+  pass
+
+
 @typehints.with_input_types(typehints.KV[K, typehints.Iterable[V]])
 @typehints.with_output_types(typehints.KV[K, typehints.Iterable[V]])
 class _StreamingGroupAlsoByWindow(_GroupAlsoByWindow):
@@ -79,17 +87,24 @@ class DirectRunner(PipelineRunner):
     except NotImplementedError:
       return transform.expand(pcoll)
 
+  def apply__GroupByKeyOnly(self, transform, pcoll):
+    if (transform.__class__ == _GroupByKeyOnly and
+        pcoll.pipeline._options.view_as(StandardOptions).streaming):
+      # Use specialized streaming implementation, if requested.
+      type_hints = transform.get_type_hints()
+      return pcoll | (_StreamingGroupByKeyOnly()
+                      .with_input_types(*type_hints.input_types[0])
+                      .with_output_types(*type_hints.output_types[0]))
+    return transform.expand(pcoll)
+
   def apply__GroupAlsoByWindow(self, transform, pcoll):
     if (transform.__class__ == _GroupAlsoByWindow and
         pcoll.pipeline._options.view_as(StandardOptions).streaming):
       # Use specialized streaming implementation, if requested.
-      raise NotImplementedError(
-          'Streaming support is not yet available on the DirectRunner.')
-      # TODO(ccy): enable when streaming implementation is plumbed through.
-      # type_hints = transform.get_type_hints()
-      # return pcoll | (_StreamingGroupAlsoByWindow(transform.windowing)
-      #                 .with_input_types(*type_hints.input_types[0])
-      #                 .with_output_types(*type_hints.output_types[0]))
+      type_hints = transform.get_type_hints()
+      return pcoll | (_StreamingGroupAlsoByWindow(transform.windowing)
+                      .with_input_types(*type_hints.input_types[0])
+                      .with_output_types(*type_hints.output_types[0]))
     return transform.expand(pcoll)
 
   def run(self, pipeline):
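
The override pattern in the hunk above can be sketched in isolation. This is a simplified illustration, assuming the runner dispatches to a method named `apply_<ClassName>` by walking the transform's class hierarchy, as the method names in the diff suggest. All classes below are hypothetical stand-ins, not Beam's own.

```python
class PTransform:
    """Stand-in transform: expand() just records which class handled the input."""

    def expand(self, pcoll):
        return ("expanded", type(self).__name__, pcoll)


class GroupByKeyOnly(PTransform):
    pass


class StreamingGroupByKeyOnly(GroupByKeyOnly):
    """Placeholder subclass that a streaming-aware runner can treat differently."""


class Runner:
    def __init__(self, streaming):
        self.streaming = streaming

    def apply(self, transform, pcoll):
        # Look for apply_<ClassName> on each class in the transform's MRO.
        for cls in type(transform).__mro__:
            handler = getattr(self, "apply_%s" % cls.__name__, None)
            if handler is not None:
                return handler(transform, pcoll)
        return transform.expand(pcoll)

    def apply_GroupByKeyOnly(self, transform, pcoll):
        # The exact-class check matters: the streaming subclass also reaches
        # this handler through the MRO, and must not be swapped again.
        if type(transform) is GroupByKeyOnly and self.streaming:
            return self.apply(StreamingGroupByKeyOnly(), pcoll)
        return transform.expand(pcoll)
```

Without the exact-class comparison, the substituted subclass would match the same handler and be replaced again without end.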

http://git-wip-us.apache.org/repos/asf/beam/blob/eb379e76/sdks/python/apache_beam/runners/direct/evaluation_context.py
--
diff --git a/sdks/python/apache_beam/runners/direct/evaluation_context.py b/sdks/python/apache_beam/runners/direct/evaluation_context.py
index 669a68a..54c407c 100644
--- a/sdks/python/apache_beam/runners/direct/evaluation_context.py
+++ b/sdks/python/apache_beam/runners/direct/evaluation_context.py
@@ -213,7 +213,7 @@ class EvaluationContext(object):
   result.unprocessed_bundl

[jira] [Commented] (BEAM-2287) UDAF support

2017-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16064280#comment-16064280
 ] 

ASF GitHub Bot commented on BEAM-2287:
--

GitHub user XuMingmin opened a pull request:

https://github.com/apache/beam/pull/3447

[BEAM-2287] UDAF support

R: @xumingming @takidau 

Add an abstract class `BeamSqlUdaf` following the UDAF definition in 
Calcite. COUNT/SUM/AVG/MAX/MIN are also rewritten in this new format.
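
As a rough illustration of the shape such an aggregate takes, here is a Python sketch. It assumes the init/add/merge/result convention that, as far as I understand, Calcite uses for user-defined aggregates. The class names `Udaf` and `Avg` are hypothetical, and the real `BeamSqlUdaf` signature (a Java class) is not shown in this message.

```python
from abc import ABC, abstractmethod


class Udaf(ABC):
    """Hypothetical aggregate interface: accumulate per partition, then merge."""

    @abstractmethod
    def init(self):
        """Return a fresh, empty accumulator."""

    @abstractmethod
    def add(self, accumulator, value):
        """Fold one input value into an accumulator and return it."""

    @abstractmethod
    def merge(self, accumulators):
        """Combine accumulators built on separate partitions."""

    @abstractmethod
    def result(self, accumulator):
        """Turn the final accumulator into the output value."""


class Avg(Udaf):
    """AVG expressed in that shape; the accumulator is a (sum, count) pair."""

    def init(self):
        return (0.0, 0)

    def add(self, accumulator, value):
        total, count = accumulator
        return (total + value, count + 1)

    def merge(self, accumulators):
        accumulators = list(accumulators)
        return (sum(a[0] for a in accumulators), sum(a[1] for a in accumulators))

    def result(self, accumulator):
        total, count = accumulator
        return total / count if count else None
```

Keeping the sum and count separate until `result` is what lets partial aggregates from different partitions be merged correctly.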

Note that the unit test is ignored after rebasing on BEAM-2446. It will be 
re-opened in BEAM-2520.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/XuMingmin/beam BEAM-2287

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3447.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3447


commit eab43c0830655e4497693a9e24f6a560ba742858
Author: mingmxu 
Date:   2017-06-26T23:03:51Z

support of UDAF + rebase




> UDAF support
> 
>
> Key: BEAM-2287
> URL: https://issues.apache.org/jira/browse/BEAM-2287
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Xu Mingmin
>Assignee: Xu Mingmin
>  Labels: dsl_sql_merge
>
> Create an aggregation wrapper, to accept UDAF functions, with 
> {{AggregateFunctionImpl}};





[GitHub] beam pull request #3447: [BEAM-2287] UDAF support

2017-06-26 Thread XuMingmin
GitHub user XuMingmin opened a pull request:

https://github.com/apache/beam/pull/3447

[BEAM-2287] UDAF support

R: @xumingming @takidau 

Add an abstract class `BeamSqlUdaf` following the UDAF definition in 
Calcite. COUNT/SUM/AVG/MAX/MIN are also rewritten in this new format.

Note that the unit test is ignored after rebasing on BEAM-2446. It will be 
re-opened in BEAM-2520.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/XuMingmin/beam BEAM-2287

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3447.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3447


commit eab43c0830655e4497693a9e24f6a560ba742858
Author: mingmxu 
Date:   2017-06-26T23:03:51Z

support of UDAF + rebase






[jira] [Commented] (BEAM-2523) GCP IO exposes protobuf on its API surface, causing user pain

2017-06-26 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16064247#comment-16064247
 ] 

Luke Cwik commented on BEAM-2523:
-

Based on the dependency dump, it seems that no protobuf-java is being 
included, which doesn't seem correct.

Also, even though GCP IO may have dependencies which have 3 different versions, 
how is it that all 3 are being pulled in on the classpath? (Maven should be 
resolving dependencies so only one major version of the same dependency is ever 
used.)

> GCP IO exposes protobuf on its API surface, causing user pain
> -
>
> Key: BEAM-2523
> URL: https://issues.apache.org/jira/browse/BEAM-2523
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
> Fix For: 2.1.0
>
>
> Putting the SDK, DataflowRunner, and GCP IO on the same classpath results in 
> (at least) three versions of protobuf getting pulled in. These should be made 
> to converge. We should consider using maven enforcer, which I think can check 
> this.
> {code}
> [INFO] com.example:foo:jar:0.1
> [INFO] +- org.apache.beam:beam-sdks-java-core:jar:2.0.0:compile
> [INFO] +- org.apache.beam:beam-sdks-java-io-google-cloud-platform:jar:2.0.0:compile
> [INFO] |  +- org.apache.beam:beam-sdks-java-extensions-protobuf:jar:2.0.0:compile
> [INFO] |  |  \- (com.google.protobuf:protobuf-java:jar:3.2.0:compile - omitted for duplicate)
> [INFO] |  +- com.google.api.grpc:grpc-google-pubsub-v1:jar:0.1.0:compile
> [INFO] |  |  +- (com.google.protobuf:protobuf-java:jar:3.0.0:compile - omitted for conflict with 3.2.0)
> [INFO] |  |  \- com.google.api.grpc:grpc-google-iam-v1:jar:0.1.0:compile
> [INFO] |  |     \- (com.google.protobuf:protobuf-java:jar:3.0.0:compile - omitted for conflict with 3.2.0)
> [INFO] |  +- com.google.cloud.datastore:datastore-v1-proto-client:jar:1.4.0:compile
> [INFO] |  |  +- (com.google.cloud.datastore:datastore-v1-protos:jar:1.3.0:compile - omitted for duplicate)
> [INFO] |  |  +- (com.google.http-client:google-http-client:jar:1.20.0:compile - omitted for conflict with 1.22.0)
> [INFO] |  |  +- com.google.http-client:google-http-client-protobuf:jar:1.20.0:compile
> [INFO] |  |  |  +- (com.google.http-client:google-http-client:jar:1.20.0:compile - omitted for conflict with 1.22.0)
> [INFO] |  |  |  \- (com.google.protobuf:protobuf-java:jar:2.4.1:compile - omitted for conflict with 3.2.0)
> [INFO] |  +- com.google.cloud.datastore:datastore-v1-protos:jar:1.3.0:compile
> [INFO] |  |  +- (com.google.protobuf:protobuf-java:jar:3.0.0:compile - omitted for conflict with 3.2.0)
> [INFO] |  +- com.google.cloud.bigtable:bigtable-protos:jar:0.9.6.2:compile
> [INFO] |  |  +- (com.google.code.findbugs:jsr305:jar:3.0.1:compile - omitted for duplicate)
> [INFO] |  |  +- (com.google.protobuf:protobuf-java:jar:3.2.0:compile - omitted for duplicate)
> {code}
> Incidentally, the dependency plugin stopped supporting the verbose tree, so 
> we can't even visually inspect this except by downgrading.





[jira] [Issue Comment Deleted] (BEAM-2523) GCP IO exposes protobuf on its API surface, causing user pain

2017-06-26 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-2523:
--
Comment: was deleted

(was: Putting the SDK, DataflowRunner, and GCP IO on the same classpath 
results in (at least) three versions of protobuf getting pulled in. These 
should be made to converge. We should consider using maven enforcer, which I 
think can check this.

{code}
[INFO] com.example:foo:jar:0.1
[INFO] +- org.apache.beam:beam-sdks-java-core:jar:2.0.0:compile
[INFO] +- org.apache.beam:beam-sdks-java-io-google-cloud-platform:jar:2.0.0:compile
[INFO] |  +- org.apache.beam:beam-sdks-java-extensions-protobuf:jar:2.0.0:compile
[INFO] |  |  \- (com.google.protobuf:protobuf-java:jar:3.2.0:compile - omitted for duplicate)
[INFO] |  +- com.google.api.grpc:grpc-google-pubsub-v1:jar:0.1.0:compile
[INFO] |  |  +- (com.google.protobuf:protobuf-java:jar:3.0.0:compile - omitted for conflict with 3.2.0)
[INFO] |  |  \- com.google.api.grpc:grpc-google-iam-v1:jar:0.1.0:compile
[INFO] |  |     \- (com.google.protobuf:protobuf-java:jar:3.0.0:compile - omitted for conflict with 3.2.0)
[INFO] |  +- com.google.cloud.datastore:datastore-v1-proto-client:jar:1.4.0:compile
[INFO] |  |  +- (com.google.cloud.datastore:datastore-v1-protos:jar:1.3.0:compile - omitted for duplicate)
[INFO] |  |  +- (com.google.http-client:google-http-client:jar:1.20.0:compile - omitted for conflict with 1.22.0)
[INFO] |  |  +- com.google.http-client:google-http-client-protobuf:jar:1.20.0:compile
[INFO] |  |  |  +- (com.google.http-client:google-http-client:jar:1.20.0:compile - omitted for conflict with 1.22.0)
[INFO] |  |  |  \- (com.google.protobuf:protobuf-java:jar:2.4.1:compile - omitted for conflict with 3.2.0)
[INFO] |  +- com.google.cloud.datastore:datastore-v1-protos:jar:1.3.0:compile
[INFO] |  |  +- (com.google.protobuf:protobuf-java:jar:3.0.0:compile - omitted for conflict with 3.2.0)
[INFO] |  +- com.google.cloud.bigtable:bigtable-protos:jar:0.9.6.2:compile
[INFO] |  |  +- (com.google.code.findbugs:jsr305:jar:3.0.1:compile - omitted for duplicate)
[INFO] |  |  +- (com.google.protobuf:protobuf-java:jar:3.2.0:compile - omitted for duplicate)
{code}

Incidentally, the dependency plugin stopped supporting the verbose tree, so we 
can't even visually inspect this except by downgrading.)

> GCP IO exposes protobuf on its API surface, causing user pain
> -
>
> Key: BEAM-2523
> URL: https://issues.apache.org/jira/browse/BEAM-2523
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
> Fix For: 2.1.0
>
>
> Putting the SDK, DataflowRunner, and GCP IO on the same classpath results in 
> (at least) three versions of protobuf getting pulled in. These should be made 
> to converge. We should consider using maven enforcer, which I think can check 
> this.
> {code}
> [INFO] com.example:foo:jar:0.1
> [INFO] +- org.apache.beam:beam-sdks-java-core:jar:2.0.0:compile
> [INFO] +- org.apache.beam:beam-sdks-java-io-google-cloud-platform:jar:2.0.0:compile
> [INFO] |  +- org.apache.beam:beam-sdks-java-extensions-protobuf:jar:2.0.0:compile
> [INFO] |  |  \- (com.google.protobuf:protobuf-java:jar:3.2.0:compile - omitted for duplicate)
> [INFO] |  +- com.google.api.grpc:grpc-google-pubsub-v1:jar:0.1.0:compile
> [INFO] |  |  +- (com.google.protobuf:protobuf-java:jar:3.0.0:compile - omitted for conflict with 3.2.0)
> [INFO] |  |  \- com.google.api.grpc:grpc-google-iam-v1:jar:0.1.0:compile
> [INFO] |  |     \- (com.google.protobuf:protobuf-java:jar:3.0.0:compile - omitted for conflict with 3.2.0)
> [INFO] |  +- com.google.cloud.datastore:datastore-v1-proto-client:jar:1.4.0:compile
> [INFO] |  |  +- (com.google.cloud.datastore:datastore-v1-protos:jar:1.3.0:compile - omitted for duplicate)
> [INFO] |  |  +- (com.google.http-client:google-http-client:jar:1.20.0:compile - omitted for conflict with 1.22.0)
> [INFO] |  |  +- com.google.http-client:google-http-client-protobuf:jar:1.20.0:compile
> [INFO] |  |  |  +- (com.google.http-client:google-http-client:jar:1.20.0:compile - omitted for conflict with 1.22.0)
> [INFO] |  |  |  \- (com.google.protobuf:protobuf-java:jar:2.4.1:compile - omitted for conflict with 3.2.0)
> [INFO] |  +- com.google.cloud.datastore:datastore-v1-protos:jar:1.3.0:compile
> [INFO] |  |  +- (com.google.protobuf:protobuf-java:jar:3.0.0:compile - omitted for conflict with 3.2.0)
> [INFO] |  +- com.google.cloud.bigtable:bigtable-protos:jar:0.9.6.2:compile
> [INFO] |  |  +- (com.google.code.findbugs:jsr305:jar:3.0.1:compile - omitted for duplicate)
> [INFO] |  |  +- (com.google.protobuf:protobuf-java:jar:3.2.0:compile - omitted for duplicate)
> {code}
> Incidentally, the d

[jira] [Updated] (BEAM-2523) GCP IO exposes protobuf on its API surface, causing user pain

2017-06-26 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-2523:
--
Description: 
Putting the SDK, DataflowRunner, and GCP IO on the same classpath results in 
(at least) three versions of protobuf getting pulled in. These should be made 
to converge. We should consider using maven enforcer, which I think can check 
this.

{code}
[INFO] com.example:foo:jar:0.1
[INFO] +- org.apache.beam:beam-sdks-java-core:jar:2.0.0:compile
[INFO] +- org.apache.beam:beam-sdks-java-io-google-cloud-platform:jar:2.0.0:compile
[INFO] |  +- org.apache.beam:beam-sdks-java-extensions-protobuf:jar:2.0.0:compile
[INFO] |  |  \- (com.google.protobuf:protobuf-java:jar:3.2.0:compile - omitted for duplicate)
[INFO] |  +- com.google.api.grpc:grpc-google-pubsub-v1:jar:0.1.0:compile
[INFO] |  |  +- (com.google.protobuf:protobuf-java:jar:3.0.0:compile - omitted for conflict with 3.2.0)
[INFO] |  |  \- com.google.api.grpc:grpc-google-iam-v1:jar:0.1.0:compile
[INFO] |  |     \- (com.google.protobuf:protobuf-java:jar:3.0.0:compile - omitted for conflict with 3.2.0)
[INFO] |  +- com.google.cloud.datastore:datastore-v1-proto-client:jar:1.4.0:compile
[INFO] |  |  +- (com.google.cloud.datastore:datastore-v1-protos:jar:1.3.0:compile - omitted for duplicate)
[INFO] |  |  +- (com.google.http-client:google-http-client:jar:1.20.0:compile - omitted for conflict with 1.22.0)
[INFO] |  |  +- com.google.http-client:google-http-client-protobuf:jar:1.20.0:compile
[INFO] |  |  |  +- (com.google.http-client:google-http-client:jar:1.20.0:compile - omitted for conflict with 1.22.0)
[INFO] |  |  |  \- (com.google.protobuf:protobuf-java:jar:2.4.1:compile - omitted for conflict with 3.2.0)
[INFO] |  +- com.google.cloud.datastore:datastore-v1-protos:jar:1.3.0:compile
[INFO] |  |  +- (com.google.protobuf:protobuf-java:jar:3.0.0:compile - omitted for conflict with 3.2.0)
[INFO] |  +- com.google.cloud.bigtable:bigtable-protos:jar:0.9.6.2:compile
[INFO] |  |  +- (com.google.code.findbugs:jsr305:jar:3.0.1:compile - omitted for duplicate)
[INFO] |  |  +- (com.google.protobuf:protobuf-java:jar:3.2.0:compile - omitted for duplicate)
{code}

Incidentally, the dependency plugin stopped supporting the verbose tree, so we 
can't even visually inspect this except by downgrading.

> GCP IO exposes protobuf on its API surface, causing user pain
> -
>
> Key: BEAM-2523
> URL: https://issues.apache.org/jira/browse/BEAM-2523
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
> Fix For: 2.1.0
>
>
> Putting the SDK, DataflowRunner, and GCP IO on the same classpath results in 
> (at least) three versions of protobuf getting pulled in. These should be made 
> to converge. We should consider using maven enforcer, which I think can check 
> this.
> {code}
> [INFO] com.example:foo:jar:0.1
> [INFO] +- org.apache.beam:beam-sdks-java-core:jar:2.0.0:compile
> [INFO] +- org.apache.beam:beam-sdks-java-io-google-cloud-platform:jar:2.0.0:compile
> [INFO] |  +- org.apache.beam:beam-sdks-java-extensions-protobuf:jar:2.0.0:compile
> [INFO] |  |  \- (com.google.protobuf:protobuf-java:jar:3.2.0:compile - omitted for duplicate)
> [INFO] |  +- com.google.api.grpc:grpc-google-pubsub-v1:jar:0.1.0:compile
> [INFO] |  |  +- (com.google.protobuf:protobuf-java:jar:3.0.0:compile - omitted for conflict with 3.2.0)
> [INFO] |  |  \- com.google.api.grpc:grpc-google-iam-v1:jar:0.1.0:compile
> [INFO] |  |     \- (com.google.protobuf:protobuf-java:jar:3.0.0:compile - omitted for conflict with 3.2.0)
> [INFO] |  +- com.google.cloud.datastore:datastore-v1-proto-client:jar:1.4.0:compile
> [INFO] |  |  +- (com.google.cloud.datastore:datastore-v1-protos:jar:1.3.0:compile - omitted for duplicate)
> [INFO] |  |  +- (com.google.http-client:google-http-client:jar:1.20.0:compile - omitted for conflict with 1.22.0)
> [INFO] |  |  +- com.google.http-client:google-http-client-protobuf:jar:1.20.0:compile
> [INFO] |  |  |  +- (com.google.http-client:google-http-client:jar:1.20.0:compile - omitted for conflict with 1.22.0)
> [INFO] |  |  |  \- (com.google.protobuf:protobuf-java:jar:2.4.1:compile - omitted for conflict with 3.2.0)
> [INFO] |  +- com.google.cloud.datastore:datastore-v1-protos:jar:1.3.0:compile
> [INFO] |  |  +- (com.google.protobuf:protobuf-java:jar:3.0.0:compile - omitted for conflict with 3.2.0)
> [INFO] |  +- com.google.cloud.bigtable:bigtable-protos:jar:0.9.6.2:compile
> [INFO] |  |  +- (com.google.code.findbugs:jsr305:jar:3.0.1:compile - omitted for duplicate)
> [INFO] |  |  +- (com.google.protobuf:protobuf-java:jar:3.2.0:compile - omitted for duplicate)
> {code}
> Incidentally, the dependency plugi

[jira] [Commented] (BEAM-2523) GCP IO exposes protobuf on its API surface, causing user pain

2017-06-26 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16064239#comment-16064239
 ] 

Kenneth Knowles commented on BEAM-2523:
---

Putting the SDK, DataflowRunner, and GCP IO on the same classpath results in 
(at least) three versions of protobuf getting pulled in. These should be made 
to converge. We should consider using maven enforcer, which I think can check 
this.

{code}
[INFO] com.example:foo:jar:0.1
[INFO] +- org.apache.beam:beam-sdks-java-core:jar:2.0.0:compile
[INFO] +- org.apache.beam:beam-sdks-java-io-google-cloud-platform:jar:2.0.0:compile
[INFO] |  +- org.apache.beam:beam-sdks-java-extensions-protobuf:jar:2.0.0:compile
[INFO] |  |  \- (com.google.protobuf:protobuf-java:jar:3.2.0:compile - omitted for duplicate)
[INFO] |  +- com.google.api.grpc:grpc-google-pubsub-v1:jar:0.1.0:compile
[INFO] |  |  +- (com.google.protobuf:protobuf-java:jar:3.0.0:compile - omitted for conflict with 3.2.0)
[INFO] |  |  \- com.google.api.grpc:grpc-google-iam-v1:jar:0.1.0:compile
[INFO] |  |     \- (com.google.protobuf:protobuf-java:jar:3.0.0:compile - omitted for conflict with 3.2.0)
[INFO] |  +- com.google.cloud.datastore:datastore-v1-proto-client:jar:1.4.0:compile
[INFO] |  |  +- (com.google.cloud.datastore:datastore-v1-protos:jar:1.3.0:compile - omitted for duplicate)
[INFO] |  |  +- (com.google.http-client:google-http-client:jar:1.20.0:compile - omitted for conflict with 1.22.0)
[INFO] |  |  +- com.google.http-client:google-http-client-protobuf:jar:1.20.0:compile
[INFO] |  |  |  +- (com.google.http-client:google-http-client:jar:1.20.0:compile - omitted for conflict with 1.22.0)
[INFO] |  |  |  \- (com.google.protobuf:protobuf-java:jar:2.4.1:compile - omitted for conflict with 3.2.0)
[INFO] |  +- com.google.cloud.datastore:datastore-v1-protos:jar:1.3.0:compile
[INFO] |  |  +- (com.google.protobuf:protobuf-java:jar:3.0.0:compile - omitted for conflict with 3.2.0)
[INFO] |  +- com.google.cloud.bigtable:bigtable-protos:jar:0.9.6.2:compile
[INFO] |  |  +- (com.google.code.findbugs:jsr305:jar:3.0.1:compile - omitted for duplicate)
[INFO] |  |  +- (com.google.protobuf:protobuf-java:jar:3.2.0:compile - omitted for duplicate)
{code}

Incidentally, the dependency plugin stopped supporting the verbose tree, so we 
can't even visually inspect this except by downgrading.
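
The "(at least) three versions" observation can be checked mechanically. The sketch below assumes the verbose tree output has been captured as text with wrapped lines rejoined. The sample lines are copied from the tree in this comment, and the function name `requested_versions` is my own, not part of any Maven tooling.

```python
import re

# Sample lines copied from the dependency tree above, one entry per line.
TREE = r"""
[INFO] |  |  \- (com.google.protobuf:protobuf-java:jar:3.2.0:compile - omitted for duplicate)
[INFO] |  +- com.google.api.grpc:grpc-google-pubsub-v1:jar:0.1.0:compile
[INFO] |  |  +- (com.google.protobuf:protobuf-java:jar:3.0.0:compile - omitted for conflict with 3.2.0)
[INFO] |  |  |  \- (com.google.protobuf:protobuf-java:jar:2.4.1:compile - omitted for conflict with 3.2.0)
"""

# Matches groupId:artifactId:jar:version:scope coordinates.
COORDINATE = re.compile(r"([\w.\-]+):([\w.\-]+):jar:([\w.\-]+):\w+")


def requested_versions(tree_text, group_id, artifact_id):
    """Collect every distinct version of one artifact requested in the tree."""
    versions = set()
    for match in COORDINATE.finditer(tree_text):
        group, artifact, version = match.groups()
        if (group, artifact) == (group_id, artifact_id):
            versions.add(version)
    # Plain string sort; adequate for display, not a real version comparison.
    return sorted(versions)


found = requested_versions(TREE, "com.google.protobuf", "protobuf-java")
```

More than one entry in `found` means different parts of the tree ask for different versions, even though Maven resolves a single one.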

> GCP IO exposes protobuf on its API surface, causing user pain
> -
>
> Key: BEAM-2523
> URL: https://issues.apache.org/jira/browse/BEAM-2523
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
> Fix For: 2.1.0
>
>






[jira] [Updated] (BEAM-2523) GCP IO exposes protobuf on its API surface, causing user pain

2017-06-26 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-2523:
--
Summary: GCP IO exposes protobuf on its API surface, causing user pain  
(was: Dataflow runner exposes protobuf on its API surface, causing user pain)

> GCP IO exposes protobuf on its API surface, causing user pain
> -
>
> Key: BEAM-2523
> URL: https://issues.apache.org/jira/browse/BEAM-2523
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
> Fix For: 2.1.0
>
>






[jira] [Issue Comment Deleted] (BEAM-940) ByteBuddyOnTimerInvokerFactory: key the cache with a (Class, id) tuple or OnTimerMethod

2017-06-26 Thread Innocent (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Innocent updated BEAM-940:
--
Comment: was deleted

(was: Do you mean LoadingCache> was 
renamed to ByteBuddyOnTimerInvokerFactory, or are you referring to 
ByteBuddyDoFnOnTimerInvokerFactory?)

> ByteBuddyOnTimerInvokerFactory: key the cache with a (Class, id) tuple or 
> OnTimerMethod
> ---
>
> Key: BEAM-940
> URL: https://issues.apache.org/jira/browse/BEAM-940
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Innocent
>Priority: Trivial
>
> Right now it is a {{LoadingCache>}}. 
> It is correct but just a bit less straightforward than we might like.





[jira] [Created] (BEAM-2523) Dataflow runner exposes protobuf on its API surface, causing user pain

2017-06-26 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-2523:
-

 Summary: Dataflow runner exposes protobuf on its API surface, 
causing user pain
 Key: BEAM-2523
 URL: https://issues.apache.org/jira/browse/BEAM-2523
 Project: Beam
  Issue Type: Bug
  Components: runner-dataflow
Reporter: Kenneth Knowles
Assignee: Kenneth Knowles
 Fix For: 2.1.0








[jira] [Updated] (BEAM-2498) Dataflow runner should shade Runner/Fn API protos

2017-06-26 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles updated BEAM-2498:
--
Fix Version/s: 2.1.0

> Dataflow runner should shade Runner/Fn API protos
> -
>
> Key: BEAM-2498
> URL: https://issues.apache.org/jira/browse/BEAM-2498
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
> Fix For: 2.1.0
>
>
> Just checked, and runners-core-construction is shaded but not the Runner API 
> protos. There may be a technical reason this cannot be done trivially, but we 
> need to work at it.





[jira] [Commented] (BEAM-2522) upgrading jackson

2017-06-26 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16064198#comment-16064198
 ] 

Luke Cwik commented on BEAM-2522:
-

[~antonymayi] Do you want to propose a PR which does this change?

> upgrading jackson
> -
>
> Key: BEAM-2522
> URL: https://issues.apache.org/jira/browse/BEAM-2522
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Antony Mayi
>Assignee: Davor Bonaci
>  Labels: security
>
> Please consider upgrading Jackson to mitigate its [deserialization 
> vulnerability in 
> 2.8.8|https://github.com/FasterXML/jackson-databind/issues/1599]





[jira] [Updated] (BEAM-2522) upgrading jackson

2017-06-26 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-2522:

Priority: Minor  (was: Major)

> upgrading jackson
> -
>
> Key: BEAM-2522
> URL: https://issues.apache.org/jira/browse/BEAM-2522
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Antony Mayi
>Assignee: Davor Bonaci
>Priority: Minor
>  Labels: security
>
> Please consider upgrading Jackson to mitigate its [deserialization 
> vulnerability in 
> 2.8.8|https://github.com/FasterXML/jackson-databind/issues/1599]





[jira] [Commented] (BEAM-210) Allow control of empty ON_TIME panes analogous to final panes

2017-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16064192#comment-16064192
 ] 

ASF GitHub Bot commented on BEAM-210:
-

GitHub user peihe opened a pull request:

https://github.com/apache/beam/pull/3446

[BEAM-210] WindowingStrategy: add OnTimeBehavior to control whether to emit 
empty ON_TIME pane.

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/peihe/incubator-beam backport-4

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3446.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3446


commit 311ad8c59439948a0c7f4a2fcece589696b4d09f
Author: Pei He 
Date:   2017-06-20T23:09:26Z

WindowingStrategy: add OnTimeBehavior to control whether to emit empty 
ON_TIME pane.

commit 26ffd22ee9cb64d54a69f92bfd834796640ee875
Author: 波特 
Date:   2017-05-26T09:46:55Z

ReduceFnRunner.onTrigger: add short circuit for empty pane, and move 
inputWM and pane after the short circuit.




> Allow control of empty ON_TIME panes analogous to final panes
> -
>
> Key: BEAM-210
> URL: https://issues.apache.org/jira/browse/BEAM-210
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model-runner-api, sdk-java-core
>Reporter: Mark Shields
>Assignee: Pei He
>
> Today, ON_TIME panes are emitted whether or not they are empty. We had 
> decided that for final panes the user would want to control this behavior, to 
> control data volume. But for ON_TIME panes no such control exists. The 
> rationale is perhaps that the ON_TIME pane is a fundamental result that 
> should not be elided. To be considered: whether this is what we want.
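
The control being asked for can be pictured with a small Java sketch. This is only an illustration of the decision, not the code in the pull request: the enum name, its two constants, and the helper method are assumptions made up for this example.

```java
/** Hypothetical mirror of the requested knob; the constant names are assumptions. */
enum OnTimeBehaviorSketch { FIRE_ALWAYS, FIRE_IF_NON_EMPTY }

final class OnTimePaneSketch {
  /** Decides whether an ON_TIME pane holding the given number of elements is emitted. */
  static boolean shouldEmitOnTimePane(OnTimeBehaviorSketch behavior, int elementsInPane) {
    if (elementsInPane > 0) {
      return true; // a non-empty ON_TIME pane is emitted under either behavior
    }
    // An empty pane is emitted only when the user asked for unconditional firing.
    return behavior == OnTimeBehaviorSketch.FIRE_ALWAYS;
  }
}
```

With FIRE_ALWAYS the sketch reproduces the behavior described above as "today"; FIRE_IF_NON_EMPTY is the variant that would cut data volume.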





[GitHub] beam pull request #3446: [BEAM-210] WindowingStrategy: add OnTimeBehavior to...

2017-06-26 Thread peihe
GitHub user peihe opened a pull request:

https://github.com/apache/beam/pull/3446

[BEAM-210] WindowingStrategy: add OnTimeBehavior to control whether to emit 
empty ON_TIME pane.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/peihe/incubator-beam backport-4

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3446.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3446


commit 311ad8c59439948a0c7f4a2fcece589696b4d09f
Author: Pei He 
Date:   2017-06-20T23:09:26Z

WindowingStrategy: add OnTimeBehavior to control whether to emit empty 
ON_TIME pane.

commit 26ffd22ee9cb64d54a69f92bfd834796640ee875
Author: 波特 
Date:   2017-05-26T09:46:55Z

ReduceFnRunner.onTrigger: add short circuit for empty pane, and move 
inputWM and pane after the short circuit.






[jira] [Created] (BEAM-2522) upgrading jackson

2017-06-26 Thread Antony Mayi (JIRA)
Antony Mayi created BEAM-2522:
-

 Summary: upgrading jackson
 Key: BEAM-2522
 URL: https://issues.apache.org/jira/browse/BEAM-2522
 Project: Beam
  Issue Type: Task
  Components: sdk-java-core
Affects Versions: 2.0.0, 2.1.0
Reporter: Antony Mayi
Assignee: Davor Bonaci


Please consider upgrading Jackson to mitigate its [deserialization vulnerability 
in 2.8.8|https://github.com/FasterXML/jackson-databind/issues/1599]





[jira] [Resolved] (BEAM-2356) support logical operators

2017-06-26 Thread James Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Xu resolved BEAM-2356.

   Resolution: Fixed
Fix Version/s: Not applicable

> support logical operators
> -
>
> Key: BEAM-2356
> URL: https://issues.apache.org/jira/browse/BEAM-2356
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: James Xu
>Assignee: James Xu
> Fix For: Not applicable
>
>






[jira] [Resolved] (BEAM-2191) Support Set operators

2017-06-26 Thread James Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Xu resolved BEAM-2191.

   Resolution: Fixed
Fix Version/s: Not applicable

> Support Set operators
> 
>
> Key: BEAM-2191
> URL: https://issues.apache.org/jira/browse/BEAM-2191
> Project: Beam
>  Issue Type: Task
>  Components: dsl-sql
>Reporter: James Xu
>Assignee: James Xu
> Fix For: Not applicable
>
>
> support the set operators in query:
>   query UNION [ ALL | DISTINCT ] query
>   query EXCEPT [ ALL | DISTINCT ] query
>   query MINUS [ ALL | DISTINCT ] query
>   query INTERSECT [ ALL | DISTINCT ] query





[jira] [Resolved] (BEAM-2325) Support Set Operator: INTERSECT, EXCEPT, UNION

2017-06-26 Thread James Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Xu resolved BEAM-2325.

   Resolution: Fixed
Fix Version/s: Not applicable

> Support Set Operator: INTERSECT, EXCEPT, UNION
> --
>
> Key: BEAM-2325
> URL: https://issues.apache.org/jira/browse/BEAM-2325
> Project: Beam
>  Issue Type: Sub-task
>  Components: dsl-sql
>Reporter: James Xu
>Assignee: James Xu
> Fix For: Not applicable
>
>






[GitHub] beam pull request #3445: [BEAM-2509] Enable grpc controller in fn_api_runne...

2017-06-26 Thread vikkyrk
GitHub user vikkyrk opened a pull request:

https://github.com/apache/beam/pull/3445

[BEAM-2509] Enable grpc controller in fn_api_runner



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vikkyrk/incubator-beam fn_api_runner_test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3445.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3445


commit 556d291bc3154695450ac18076c82790ecb7
Author: Vikas Kedigehalli 
Date:   2017-06-27T01:47:39Z

Enable grpc controller in fn_api_runner






[jira] [Commented] (BEAM-2509) Fn API Runner hangs in grpc controller mode

2017-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16064104#comment-16064104
 ] 

ASF GitHub Bot commented on BEAM-2509:
--

GitHub user vikkyrk opened a pull request:

https://github.com/apache/beam/pull/3445

[BEAM-2509] Enable grpc controller in fn_api_runner



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vikkyrk/incubator-beam fn_api_runner_test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3445.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3445


commit 556d291bc3154695450ac18076c82790ecb7
Author: Vikas Kedigehalli 
Date:   2017-06-27T01:47:39Z

Enable grpc controller in fn_api_runner




> Fn API Runner hangs in grpc controller mode
> ---
>
> Key: BEAM-2509
> URL: https://issues.apache.org/jira/browse/BEAM-2509
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model-fn-api, sdk-py
>Reporter: Vikas Kedigehalli
>Assignee: Luke Cwik
>Priority: Minor
>
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/fn_api_runner.py#L312
>  tests only run in direct mode, but we should run in grpc mode as well. 
> Currently the grpc mode is broken and needs fixing. Once we enable it, these 
> tests can catch issues like https://github.com/apache/beam/pull/3431





[jira] [Commented] (BEAM-2140) Fix SplittableDoFn ValidatesRunner tests in FlinkRunner

2017-06-26 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16064098#comment-16064098
 ] 

Kenneth Knowles commented on BEAM-2140:
---

[~lzljs3620320] You are correct: the decision about whether a window is 
expired is based on the input watermark. That is why it does not continue 
processing processing-time timers.

> Fix SplittableDoFn ValidatesRunner tests in FlinkRunner
> ---
>
> Key: BEAM-2140
> URL: https://issues.apache.org/jira/browse/BEAM-2140
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>
> As discovered as part of BEAM-1763, there is a failing SDF test. We disabled 
> the tests to unblock the open PR for BEAM-1763.





Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #4219

2017-06-26 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2140) Fix SplittableDoFn ValidatesRunner tests in FlinkRunner

2017-06-26 Thread Jingsong Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16064090#comment-16064090
 ] 

Jingsong Lee commented on BEAM-2140:


On 1: If the decision about whether the window has expired is based on the output 
watermark hold (I always thought it was the input watermark), then the window does 
not end and we need to continue processing processing-time timers. 

> Fix SplittableDoFn ValidatesRunner tests in FlinkRunner
> ---
>
> Key: BEAM-2140
> URL: https://issues.apache.org/jira/browse/BEAM-2140
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>
> As discovered as part of BEAM-1763, there is a failing SDF test. We disabled 
> the tests to unblock the open PR for BEAM-1763.





[jira] [Commented] (BEAM-2447) Reintroduce DoFn.ProcessContinuation

2017-06-26 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16064074#comment-16064074
 ] 

Kenneth Knowles commented on BEAM-2447:
---

You have forbidden the behavior of returning {{resume()}} after a failed 
{{tryClaim()}} call. I am suggesting to remove the proscription by making the 
one and only allowed behavior automatically followed.

> Reintroduce DoFn.ProcessContinuation
> 
>
> Key: BEAM-2447
> URL: https://issues.apache.org/jira/browse/BEAM-2447
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Eugene Kirpichov
>Assignee: Eugene Kirpichov
>
> ProcessContinuation.resume() is useful for tailing files - when we reach 
> current EOF, we want to voluntarily suspend the process() call rather than 
> wait for runner to checkpoint us.
> In BEAM-1903, DoFn.ProcessContinuation was removed because there was 
> ambiguity about the semantics of resume() especially w.r.t. the following 
> situation described in 
> https://docs.google.com/document/d/1BGc8pM1GOvZhwR9SARSVte-20XEoBUxrGJ5gTWXdv3c/edit
>  : the runner has taken a checkpoint on the tracker, and then the 
> ProcessElement call returns resume() signaling that the work is still not 
> done - then there's 2 checkpoints to deal with.
> Instead, the proper way to refine this semantics is:
> - After checkpoint() on a RestrictionTracker, the tracker MUST fail all 
> subsequent tryClaim() calls, and MUST succeed in checkDone().
> - After a failed tryClaim() call, the ProcessElement method MUST return stop()
> - So ProcessElement can return resume() only *instead* of doing tryClaim()
> - Then, if the runner has already taken a checkpoint but tracker has returned 
> resume(), we do not need to take a new checkpoint - the one already taken 
> already accurately describes the remainder of the work.
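
The contract in the list above can be sketched with a toy Java model. Everything here is hypothetical and written only to illustrate the rules: ToyOffsetTracker is not Beam's RestrictionTracker, ToyContinuation stands in for the stop()/resume() result, and the "available" parameter models the current end of a tailed file.

```java
/** Toy continuation signal using the stop()/resume() vocabulary from the issue. */
enum ToyContinuation { STOP, RESUME }

/** Toy tracker over the offset range [from, to); not Beam's RestrictionTracker. */
final class ToyOffsetTracker {
  private long to;
  private long lastClaimed;

  ToyOffsetTracker(long from, long to) {
    this.to = to;
    this.lastClaimed = from - 1; // nothing claimed yet
  }

  /** Claims the next offset; returns false once it falls outside the current range. */
  boolean tryClaim(long offset) {
    if (offset <= lastClaimed) {
      throw new IllegalArgumentException("offsets must be claimed in increasing order");
    }
    if (offset >= to) {
      return false;
    }
    lastClaimed = offset;
    return true;
  }

  /** Returns the unclaimed remainder and shrinks this range so every later claim fails. */
  long[] checkpoint() {
    long[] residual = {lastClaimed + 1, to};
    to = lastClaimed + 1;
    return residual;
  }

  /** True when the last claimed offset reaches the end of the current range. */
  boolean checkDone() {
    return lastClaimed + 1 >= to;
  }
}

final class ToyProcess {
  /** Processes offsets from start; 'available' models the current EOF of a tailed file. */
  static ToyContinuation process(ToyOffsetTracker tracker, long start, long available) {
    for (long offset = start; ; offset++) {
      if (offset >= available) {
        // Reached the current EOF: suspend voluntarily instead of attempting a claim.
        return ToyContinuation.RESUME;
      }
      if (!tracker.tryClaim(offset)) {
        // A failed claim must be followed by stop().
        return ToyContinuation.STOP;
      }
      // ... emit output for this offset ...
    }
  }
}
```

In this model a checkpoint taken after a RESUME already describes the whole remainder, which is the point of the last rule: no second checkpoint is needed.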





[jira] [Updated] (BEAM-1962) Connection should be closed in case start() throws exception

2017-06-26 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated BEAM-1962:
-
Description: 
In JmsIO#start() :
{code}
  try {
    Connection connection;
    if (spec.getUsername() != null) {
      connection =
          connectionFactory.createConnection(spec.getUsername(), spec.getPassword());
    } else {
      connection = connectionFactory.createConnection();
    }
    connection.start();
    this.connection = connection;
  } catch (Exception e) {
    throw new IOException("Error connecting to JMS", e);
  }
{code}
If start() throws an exception, the connection should be closed.

  was:
In JmsIO#start() :

{code}
  try {
    Connection connection;
    if (spec.getUsername() != null) {
      connection =
          connectionFactory.createConnection(spec.getUsername(), spec.getPassword());
    } else {
      connection = connectionFactory.createConnection();
    }
    connection.start();
    this.connection = connection;
  } catch (Exception e) {
    throw new IOException("Error connecting to JMS", e);
  }
{code}
If start() throws an exception, the connection should be closed.


> Connection should be closed in case start() throws exception
> 
>
> Key: BEAM-1962
> URL: https://issues.apache.org/jira/browse/BEAM-1962
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Ted Yu
>Assignee: Jean-Baptiste Onofré
>Priority: Minor
>
> In JmsIO#start() :
> {code}
>   try {
>     Connection connection;
>     if (spec.getUsername() != null) {
>       connection =
>           connectionFactory.createConnection(spec.getUsername(), spec.getPassword());
>     } else {
>       connection = connectionFactory.createConnection();
>     }
>     connection.start();
>     this.connection = connection;
>   } catch (Exception e) {
>     throw new IOException("Error connecting to JMS", e);
>   }
> {code}
> If start() throws an exception, the connection should be closed.
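
One possible shape of the fix, as a minimal sketch rather than the actual JmsIO change: SketchConnection and SketchConnectionFactory are hypothetical stand-ins for the JMS interfaces (only the calls used here are modeled) so that the example compiles on its own.

```java
import java.io.IOException;

/** Hypothetical stand-in for a JMS connection; only start() and close() are modeled. */
interface SketchConnection {
  void start() throws Exception;
  void close() throws Exception;
}

/** Hypothetical stand-in for a JMS connection factory. */
interface SketchConnectionFactory {
  SketchConnection createConnection() throws Exception;
}

final class JmsStartSketch {
  private SketchConnection connection;

  /** Creates and starts a connection; closes it again if anything after creation fails. */
  void start(SketchConnectionFactory factory) throws IOException {
    SketchConnection candidate = null;
    try {
      candidate = factory.createConnection();
      candidate.start();
      this.connection = candidate; // published only after start() succeeded
    } catch (Exception e) {
      if (candidate != null) {
        try {
          candidate.close(); // release the half-opened connection
        } catch (Exception closeError) {
          e.addSuppressed(closeError); // keep the original failure as the primary cause
        }
      }
      throw new IOException("Error connecting to JMS", e);
    }
  }

  boolean isConnected() {
    return connection != null;
  }
}
```

Recording a failure of close() as a suppressed exception keeps the original start() failure visible to the caller; whether that is the right trade-off for JmsIO is a design choice left open here.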





[jira] [Updated] (BEAM-2335) Document various maven commands for running tests

2017-06-26 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated BEAM-2335:
-
Description: 
In this discussion thread, various maven commands for running / not running 
selected tests were mentioned:

http://search-hadoop.com/m/Beam/gfKHFd9bPDh5WJr1?subj=Re+How+can+I+disable+running+Python+SDK+tests+when+testing+my+Java+change+

We should document these commands under 
https://beam.apache.org/contribute/testing/ 

Borisa raised the following questions:

How do I execute only one test marked as @NeedsRunner?
How do I execute one specific test in java io?
How do I execute one specific test in any of the runners?
How do I use beamTestpipelineoptions with a few JSON examples?
Will mvn clean verify execute ALL tests against all runners?


For #1 above, we can create a profile that runs only the tests in the NeedsRunner 
category.
See the following:
http://stackoverflow.com/questions/3100924/how-to-run-junit-tests-by-category-in-maven

  was:
In this discussion thread, various maven commands for running / not running 
selected tests were mentioned:

http://search-hadoop.com/m/Beam/gfKHFd9bPDh5WJr1?subj=Re+How+can+I+disable+running+Python+SDK+tests+when+testing+my+Java+change+

We should document these commands under 
https://beam.apache.org/contribute/testing/ 

Borisa raised the following questions:

How do I execute only one test marked as @NeedsRunner?
How do I execute one specific test in java io?
How do I execute one specific test in any of the runners?
How do I use beamTestpipelineoptions with a few JSON examples?
Will mvn clean verify execute ALL tests against all runners?

For #1 above, we can create a profile that runs only the tests in the NeedsRunner 
category.
See the following:
http://stackoverflow.com/questions/3100924/how-to-run-junit-tests-by-category-in-maven


> Document various maven commands for running tests
> -
>
> Key: BEAM-2335
> URL: https://issues.apache.org/jira/browse/BEAM-2335
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Ted Yu
>
> In this discussion thread, various maven commands for running / not running 
> selected tests were mentioned:
> http://search-hadoop.com/m/Beam/gfKHFd9bPDh5WJr1?subj=Re+How+can+I+disable+running+Python+SDK+tests+when+testing+my+Java+change+
> We should document these commands under 
> https://beam.apache.org/contribute/testing/ 
> Borisa raised the following questions:
> How do I execute only one test marked as @NeedsRunner?
> How do I execute one specific test in java io?
> How do I execute one specific test in any of the runners?
> How do I use beamTestpipelineoptions with a few JSON examples?
> Will mvn clean verify execute ALL tests against all runners?
> For #1 above, we can create a profile that runs only the tests in the 
> NeedsRunner category.
> See the following:
> http://stackoverflow.com/questions/3100924/how-to-run-junit-tests-by-category-in-maven





Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #4218

2017-06-26 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Spark #2480

2017-06-26 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2447) Reintroduce DoFn.ProcessContinuation

2017-06-26 Thread Eugene Kirpichov (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16064054#comment-16064054
 ] 

Eugene Kirpichov commented on BEAM-2447:


If the runner did not already take a checkpoint, and the DoFn returns resume(), 
and the runner treats it as stop(), then the DoFn won't be resumed even though 
it should.

Perhaps you meant treat everything as resume()? In that case, the DoFn will 
never terminate, even if it's done with the restriction, because it'll 
constantly be resumed and will keep producing empty checkpoints.

> Reintroduce DoFn.ProcessContinuation
> 
>
> Key: BEAM-2447
> URL: https://issues.apache.org/jira/browse/BEAM-2447
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Eugene Kirpichov
>Assignee: Eugene Kirpichov
>
> ProcessContinuation.resume() is useful for tailing files - when we reach 
> current EOF, we want to voluntarily suspend the process() call rather than 
> wait for runner to checkpoint us.
> In BEAM-1903, DoFn.ProcessContinuation was removed because there was 
> ambiguity about the semantics of resume() especially w.r.t. the following 
> situation described in 
> https://docs.google.com/document/d/1BGc8pM1GOvZhwR9SARSVte-20XEoBUxrGJ5gTWXdv3c/edit
>  : the runner has taken a checkpoint on the tracker, and then the 
> ProcessElement call returns resume() signaling that the work is still not 
> done - then there's 2 checkpoints to deal with.
> Instead, the proper way to refine this semantics is:
> - After checkpoint() on a RestrictionTracker, the tracker MUST fail all 
> subsequent tryClaim() calls, and MUST succeed in checkDone().
> - After a failed tryClaim() call, the ProcessElement method MUST return stop()
> - So ProcessElement can return resume() only *instead* of doing tryClaim()
> - Then, if the runner has already taken a checkpoint but tracker has returned 
> resume(), we do not need to take a new checkpoint - the one already taken 
> already accurately describes the remainder of the work.





[jira] [Created] (BEAM-2521) Simplify packaging for python distributions

2017-06-26 Thread Ahmet Altay (JIRA)
Ahmet Altay created BEAM-2521:
-

 Summary: Simplify packaging for python distributions
 Key: BEAM-2521
 URL: https://issues.apache.org/jira/browse/BEAM-2521
 Project: Beam
  Issue Type: Bug
  Components: sdk-py
Reporter: Ahmet Altay
Assignee: Ahmet Altay
 Fix For: 2.1.0








[jira] [Created] (BEAM-2520) add UDF/UDAF in BeamSql.query/simpleQuery

2017-06-26 Thread Xu Mingmin (JIRA)
Xu Mingmin created BEAM-2520:


 Summary: add UDF/UDAF in BeamSql.query/simpleQuery
 Key: BEAM-2520
 URL: https://issues.apache.org/jira/browse/BEAM-2520
 Project: Beam
  Issue Type: Improvement
  Components: dsl-sql
Reporter: Xu Mingmin
Assignee: Xu Mingmin








[jira] [Commented] (BEAM-2503) use static table name in BeamSql.simpleQuery

2017-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16064049#comment-16064049
 ] 

ASF GitHub Bot commented on BEAM-2503:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3427


> use static table name in BeamSql.simpleQuery
> 
>
> Key: BEAM-2503
> URL: https://issues.apache.org/jira/browse/BEAM-2503
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Xu Mingmin
>Assignee: Xu Mingmin
>  Labels: dsl_sql_merge
>
> As discussed in #3372, we agree that it's more clear to have a static table 
> name in {{BeamSql.simpleQuery}}. 
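
A reduced Java sketch of what a fixed table name means for callers. It keeps only the name comparison from the associated commit and leaves out the SQL parsing step entirely; the class and method names are made up for this illustration.

```java
final class StaticTableNameSketch {
  /** The single table name a simple query may select from. */
  static final String PCOLLECTION_TABLE_NAME = "PCOLLECTION";

  /** Rejects any table name other than PCOLLECTION, ignoring case. */
  static void validateTableName(String tableName) {
    if (!tableName.equalsIgnoreCase(PCOLLECTION_TABLE_NAME)) {
      throw new IllegalStateException("Use fixed table name " + PCOLLECTION_TABLE_NAME);
    }
  }
}
```

Under this rule a query such as "select c1 from PCOLLECTION" passes, while one that selects from TABLE_A is rejected as soon as the transform is constructed.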





[GitHub] beam pull request #3427: [BEAM-2503] use static table name in BeamSql.simple...

2017-06-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3427




[2/2] beam git commit: [BEAM-2503] This closes #3427

2017-06-26 Thread takidau
[BEAM-2503] This closes #3427


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/bd99528a
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/bd99528a
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/bd99528a

Branch: refs/heads/DSL_SQL
Commit: bd99528af89450b44d94abc42a8b884e00cbc26e
Parents: 8e9b930 1f61204
Author: Tyler Akidau 
Authored: Mon Jun 26 17:36:58 2017 -0700
Committer: Tyler Akidau 
Committed: Mon Jun 26 17:36:58 2017 -0700

--
 .../java/org/apache/beam/dsls/sql/BeamSql.java  | 27 +++-
 .../dsls/sql/BeamSqlDslAggregationTest.java |  6 ++---
 .../beam/dsls/sql/BeamSqlDslFilterTest.java |  2 +-
 .../beam/dsls/sql/BeamSqlDslProjectTest.java|  2 +-
 4 files changed, 25 insertions(+), 12 deletions(-)
--




[1/2] beam git commit: use static table name PCOLLECTION in BeamSql.simpleQuery.

2017-06-26 Thread takidau
Repository: beam
Updated Branches:
  refs/heads/DSL_SQL 8e9b930bc -> bd99528af


use static table name PCOLLECTION in BeamSql.simpleQuery.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/1f612049
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/1f612049
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/1f612049

Branch: refs/heads/DSL_SQL
Commit: 1f612049b83a67070d13aae790d61e0f71d79ca7
Parents: 8e9b930
Author: mingmxu 
Authored: Thu Jun 22 16:50:58 2017 -0700
Committer: mingmxu 
Committed: Thu Jun 22 16:50:58 2017 -0700

--
 .../java/org/apache/beam/dsls/sql/BeamSql.java  | 27 +++-
 .../dsls/sql/BeamSqlDslAggregationTest.java |  6 ++---
 .../beam/dsls/sql/BeamSqlDslFilterTest.java |  2 +-
 .../beam/dsls/sql/BeamSqlDslProjectTest.java|  2 +-
 4 files changed, 25 insertions(+), 12 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/1f612049/dsls/sql/src/main/java/org/apache/beam/dsls/sql/BeamSql.java
--
diff --git a/dsls/sql/src/main/java/org/apache/beam/dsls/sql/BeamSql.java b/dsls/sql/src/main/java/org/apache/beam/dsls/sql/BeamSql.java
index e68188b..5f90380 100644
--- a/dsls/sql/src/main/java/org/apache/beam/dsls/sql/BeamSql.java
+++ b/dsls/sql/src/main/java/org/apache/beam/dsls/sql/BeamSql.java
@@ -50,9 +50,8 @@ PCollection inputTableB = p.apply(TextIO.read().from("/my/input/path
 .apply(...);
 
 //run a simple query, and register the output as a table in BeamSql;
-String sql1 = "select MY_FUNC(c1), c2 from TABLE_A";
-PCollection outputTableA = inputTableA.apply(BeamSql.simpleQuery(sql1)
-.withUdf("MY_FUNC", myFunc));
+String sql1 = "select MY_FUNC(c1), c2 from PCOLLECTION";
+PCollection outputTableA = inputTableA.apply(BeamSql.simpleQuery(sql1));
 
 //run a JOIN with one table from TextIO, and one table from another query
 PCollection outputTableB = PCollectionTuple.of(
@@ -91,6 +90,8 @@ public class BeamSql {
*
* This is a simplified form of {@link #query(String)} where the query must reference
* a single input table.
+   *
+   * Make sure to query it from a static table name PCOLLECTION.
*/
   public static PTransform, PCollection>
   simpleQuery(String sqlQuery) throws Exception {
@@ -151,15 +152,20 @@ public class BeamSql {
*/
   private static class SimpleQueryTransform
   extends PTransform, PCollection> {
+private static final String PCOLLECTION_TABLE_NAME = "PCOLLECTION";
 BeamSqlEnv sqlEnv = new BeamSqlEnv();
 private String sqlQuery;
 
 public SimpleQueryTransform(String sqlQuery) {
   this.sqlQuery = sqlQuery;
+  validateQuery();
 }
 
-@Override
-public PCollection expand(PCollection input) {
+// public SimpleQueryTransform withUdf(String udfName){
+// throw new UnsupportedOperationException("Pending for UDF support");
+// }
+
+private void validateQuery() {
   SqlNode sqlNode;
   try {
 sqlNode = sqlEnv.planner.parseQuery(sqlQuery);
@@ -171,12 +177,19 @@ public class BeamSql {
   if (sqlNode instanceof SqlSelect) {
 SqlSelect select = (SqlSelect) sqlNode;
 String tableName = select.getFrom().toString();
-return PCollectionTuple.of(new TupleTag(tableName), input)
-.apply(new QueryTransform(sqlQuery, sqlEnv));
+if (!tableName.equalsIgnoreCase(PCOLLECTION_TABLE_NAME)) {
+  throw new IllegalStateException("Use fixed table name " + PCOLLECTION_TABLE_NAME);
+}
   } else {
 throw new UnsupportedOperationException(
 "Sql operation: " + sqlNode.toString() + " is not supported!");
   }
 }
+
+@Override
+public PCollection expand(PCollection input) {
+  return PCollectionTuple.of(new TupleTag(PCOLLECTION_TABLE_NAME), input)
+  .apply(new QueryTransform(sqlQuery, sqlEnv));
+}
   }
 }

http://git-wip-us.apache.org/repos/asf/beam/blob/1f612049/dsls/sql/src/test/java/org/apache/beam/dsls/sql/BeamSqlDslAggregationTest.java
--
diff --git a/dsls/sql/src/test/java/org/apache/beam/dsls/sql/BeamSqlDslAggregationTest.java b/dsls/sql/src/test/java/org/apache/beam/dsls/sql/BeamSqlDslAggregationTest.java
index f7349c6..b0509ae 100644
--- a/dsls/sql/src/test/java/org/apache/beam/dsls/sql/BeamSqlDslAggregationTest.java
+++ b/dsls/sql/src/test/java/org/apache/beam/dsls/sql/BeamSqlDslAggregationTest.java
@@ -37,7 +37,7 @@ public class BeamSqlDslAggregationTest extends BeamSqlDslBase {
*/
   @Test
   public void testAggregationWithoutWindow() throws Exception {
-String sql = "SELECT f_int2, COUNT(*) AS `size` FROM TABLE_A GROUP BY f_int2";
+String sql = "SELECT f_int2, COUNT(*) AS `size

[jira] [Commented] (BEAM-2447) Reintroduce DoFn.ProcessContinuation

2017-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16064043#comment-16064043
 ] 

ASF GitHub Bot commented on BEAM-2447:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3383


> Reintroduce DoFn.ProcessContinuation
> 
>
> Key: BEAM-2447
> URL: https://issues.apache.org/jira/browse/BEAM-2447
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Eugene Kirpichov
>Assignee: Eugene Kirpichov
>
> ProcessContinuation.resume() is useful for tailing files - when we reach 
> current EOF, we want to voluntarily suspend the process() call rather than 
> wait for runner to checkpoint us.
> In BEAM-1903, DoFn.ProcessContinuation was removed because there was 
> ambiguity about the semantics of resume() especially w.r.t. the following 
> situation described in 
> https://docs.google.com/document/d/1BGc8pM1GOvZhwR9SARSVte-20XEoBUxrGJ5gTWXdv3c/edit
>  : the runner has taken a checkpoint on the tracker, and then the 
> ProcessElement call returns resume(), signaling that the work is still not 
> done - then there are two checkpoints to deal with.
> Instead, the proper way to refine these semantics is:
> - After checkpoint() on a RestrictionTracker, the tracker MUST fail all 
> subsequent tryClaim() calls, and MUST succeed in checkDone().
> - After a failed tryClaim() call, the ProcessElement method MUST return stop()
> - So ProcessElement can return resume() only *instead* of doing tryClaim()
> - Then, if the runner has already taken a checkpoint but ProcessElement has 
> returned resume(), we do not need to take a new checkpoint - the one already 
> taken accurately describes the remainder of the work.
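
The contract in the list above can be sketched with a toy model. This is only an illustration under stated assumptions: `ToyOffsetTracker` and `process` are hypothetical names, the strings "resume" and "stop" stand in for the ProcessContinuation values, and none of this is Beam's actual RestrictionTracker API.

```python
class ToyOffsetTracker:
    """Hypothetical offset-range tracker; not Beam's RestrictionTracker API."""

    def __init__(self, start, stop):
        self.start = start
        self.stop = stop
        self.last_claimed = None
        self.checkpointed = False

    def _next_offset(self):
        # First offset that has not been claimed yet.
        return self.start if self.last_claimed is None else self.last_claimed + 1

    def try_claim(self, offset):
        # After checkpoint(), every subsequent claim must fail.
        if self.checkpointed or offset >= self.stop:
            return False
        self.last_claimed = offset
        return True

    def checkpoint(self):
        # Keep the claimed prefix as the primary restriction and return the
        # unclaimed remainder as the residual.
        residual = (self._next_offset(), self.stop)
        self.stop = residual[0]
        self.checkpointed = True
        return residual

    def check_done(self):
        # Succeeds only if everything in [start, stop) has been claimed.
        if self._next_offset() < self.stop:
            raise ValueError("unclaimed work remains")


def process(tracker, available):
    """Toy ProcessElement: `available` is the current end of input (EOF)."""
    offset = tracker.start
    while True:
        if offset >= available:
            return "resume"  # at current EOF: resume *instead* of claiming
        if not tracker.try_claim(offset):
            return "stop"    # a failed claim must be followed by stop()
        offset += 1
```

In this sketch, whichever of checkpoint() or the "resume" return happens first, the single residual returned by checkpoint() describes all remaining work, so no second checkpoint is needed.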



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #3383: [BEAM-2447] Reintroduces DoFn.ProcessContinuation (...

2017-06-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3383


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (BEAM-2519) Record type of state in Python InMemoryUnmergedState

2017-06-26 Thread Charles Chen (JIRA)
Charles Chen created BEAM-2519:
--

 Summary: Record type of state in Python InMemoryUnmergedState
 Key: BEAM-2519
 URL: https://issues.apache.org/jira/browse/BEAM-2519
 Project: Beam
  Issue Type: Bug
  Components: sdk-py
Reporter: Charles Chen
Assignee: Ahmet Altay


Currently, the Python InMemoryUnmergedState implementation does not record the 
type of tag used to store state.  This means that, for example, it is hard to 
enumerate all state of a specific type of tag (e.g. WatermarkHolds).  We should 
fix this so that we can more cleanly extract, for example, the earliest 
WatermarkHold for the state of a given key.
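
A minimal sketch of the idea, assuming state is indexed by tag type as well as by window and tag name. `ToyUnmergedState` and the two type labels are hypothetical and do not reflect the real InMemoryUnmergedState layout.

```python
from collections import defaultdict

# Hypothetical tag-type labels used only in this sketch.
WATERMARK_HOLD = "watermark_hold"
VALUE = "value"


class ToyUnmergedState:
    """Hypothetical in-memory state store that records the type of each tag."""

    def __init__(self):
        # tag type -> {(window, tag name) -> list of stored values}
        self._by_type = defaultdict(dict)

    def add(self, window, tag_type, tag_name, value):
        self._by_type[tag_type].setdefault((window, tag_name), []).append(value)

    def entries_of_type(self, tag_type):
        # Enumerate every (window, tag name, values) entry of one tag type.
        return [(window, name, list(values))
                for (window, name), values in self._by_type.get(tag_type, {}).items()]

    def earliest_hold(self):
        # Earliest watermark hold across all windows, or None if none is set.
        holds = [t for _, _, values in self.entries_of_type(WATERMARK_HOLD)
                 for t in values]
        return min(holds) if holds else None
```

One such store per key would make "the earliest WatermarkHold for the state of a given key" a single lookup rather than a scan over untyped tags.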





[1/3] beam git commit: Bump Dataflow worker to 0623

2017-06-26 Thread jkff
Repository: beam
Updated Branches:
  refs/heads/master 1ea1de4aa -> 95e6bbe50


Bump Dataflow worker to 0623


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/2052cc76
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/2052cc76
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/2052cc76

Branch: refs/heads/master
Commit: 2052cc7689b4ed53f817f56dd71a32235fb083ca
Parents: bec32fe
Author: Eugene Kirpichov 
Authored: Fri Jun 23 10:16:30 2017 -0700
Committer: Eugene Kirpichov 
Committed: Mon Jun 26 17:25:04 2017 -0700

--
 runners/google-cloud-dataflow-java/pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/2052cc76/runners/google-cloud-dataflow-java/pom.xml
--
diff --git a/runners/google-cloud-dataflow-java/pom.xml 
b/runners/google-cloud-dataflow-java/pom.xml
index fbb0b87..2ba163b 100644
--- a/runners/google-cloud-dataflow-java/pom.xml
+++ b/runners/google-cloud-dataflow-java/pom.xml
@@ -33,7 +33,7 @@
   jar
 
   
-
beam-master-20170622
+
beam-master-20170623
 
1
 
6
   



[3/3] beam git commit: This closes #3383: [BEAM-2447] Reintroduces DoFn.ProcessContinuation (Dataflow worker compatibility part)

2017-06-26 Thread jkff
This closes #3383: [BEAM-2447] Reintroduces DoFn.ProcessContinuation (Dataflow 
worker compatibility part)


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/95e6bbe5
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/95e6bbe5
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/95e6bbe5

Branch: refs/heads/master
Commit: 95e6bbe5065a983f9c71b79b20df1fda83c9fd1b
Parents: 1ea1de4 2052cc7
Author: Eugene Kirpichov 
Authored: Mon Jun 26 17:25:14 2017 -0700
Committer: Eugene Kirpichov 
Committed: Mon Jun 26 17:25:14 2017 -0700

--
 runners/google-cloud-dataflow-java/pom.xml | 2 +-
 .../src/main/java/org/apache/beam/sdk/transforms/DoFn.java | 3 +++
 .../sdk/transforms/reflect/ByteBuddyDoFnInvokerFactory.java| 6 ++
 .../org/apache/beam/sdk/transforms/reflect/DoFnInvoker.java| 4 +++-
 4 files changed, 13 insertions(+), 2 deletions(-)
--




[2/3] beam git commit: Reintroduces DoFn.ProcessContinuation (Dataflow worker compatibility part)

2017-06-26 Thread jkff
Reintroduces DoFn.ProcessContinuation (Dataflow worker compatibility part)


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/bec32fe9
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/bec32fe9
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/bec32fe9

Branch: refs/heads/master
Commit: bec32fe93c6b5c16563d7ea4b877a2dee3352fee
Parents: 1ea1de4
Author: Eugene Kirpichov 
Authored: Fri Jun 16 14:56:07 2017 -0700
Committer: Eugene Kirpichov 
Committed: Mon Jun 26 17:25:04 2017 -0700

--
 .../src/main/java/org/apache/beam/sdk/transforms/DoFn.java | 3 +++
 .../sdk/transforms/reflect/ByteBuddyDoFnInvokerFactory.java| 6 ++
 .../org/apache/beam/sdk/transforms/reflect/DoFnInvoker.java| 4 +++-
 3 files changed, 12 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/bec32fe9/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java
index e711ac2..fb6d0ee 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java
@@ -677,6 +677,9 @@ public abstract class DoFn implements 
Serializable, HasDisplayD
   @Experimental(Kind.SPLITTABLE_DO_FN)
   public @interface UnboundedPerElement {}
 
+  /** Temporary, do not use. See 
https://issues.apache.org/jira/browse/BEAM-1904 */
+  public class ProcessContinuation {}
+
   /**
* Finalize the {@link DoFn} construction to prepare for processing.
* This method should be called by runners before any processing methods.

http://git-wip-us.apache.org/repos/asf/beam/blob/bec32fe9/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/ByteBuddyDoFnInvokerFactory.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/ByteBuddyDoFnInvokerFactory.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/ByteBuddyDoFnInvokerFactory.java
index 5d5887a..4f67db4 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/ByteBuddyDoFnInvokerFactory.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/ByteBuddyDoFnInvokerFactory.java
@@ -49,6 +49,7 @@ import net.bytebuddy.implementation.bytecode.Throw;
 import net.bytebuddy.implementation.bytecode.assign.Assigner;
 import net.bytebuddy.implementation.bytecode.assign.Assigner.Typing;
 import net.bytebuddy.implementation.bytecode.assign.TypeCasting;
+import net.bytebuddy.implementation.bytecode.constant.NullConstant;
 import net.bytebuddy.implementation.bytecode.constant.TextConstant;
 import net.bytebuddy.implementation.bytecode.member.FieldAccess;
 import net.bytebuddy.implementation.bytecode.member.MethodInvocation;
@@ -667,6 +668,11 @@ public class ByteBuddyDoFnInvokerFactory implements 
DoFnInvokerFactory {
   }
   return new StackManipulation.Compound(pushParameters);
 }
+
+@Override
+protected StackManipulation afterDelegation(MethodDescription 
instrumentedMethod) {
+  return new StackManipulation.Compound(NullConstant.INSTANCE, 
MethodReturn.REFERENCE);
+}
   }
 
   private static class UserCodeMethodInvocation implements StackManipulation {

http://git-wip-us.apache.org/repos/asf/beam/blob/bec32fe9/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/DoFnInvoker.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/DoFnInvoker.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/DoFnInvoker.java
index 6fd4052..ed81f42 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/DoFnInvoker.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/DoFnInvoker.java
@@ -53,8 +53,10 @@ public interface DoFnInvoker {
* Invoke the {@link DoFn.ProcessElement} method on the bound {@link DoFn}.
*
* @param extra Factory for producing extra parameter objects (such as 
window), if necessary.
+   * @return {@code null} - see <a href="https://issues.apache.org/jira/browse/BEAM-1904">JIRA</a>
+   * tracking the complete removal of {@link DoFn.ProcessContinuation}.
*/
-  void invokeProcessElement(ArgumentProvider extra);
+  DoFn.ProcessContinuation invokeProcessElement(ArgumentProvider extra);
 
   /** Invoke the appropriate {@link DoFn.OnTimer} method on the bound {@link 
DoFn}. */
   void invokeOnTimer(String timerId, ArgumentProvider 
arguments);



[jira] [Created] (BEAM-2518) Support TimestampCombiner in Python streaming mode GroupByKey

2017-06-26 Thread Charles Chen (JIRA)
Charles Chen created BEAM-2518:
--

 Summary: Support TimestampCombiner in Python streaming mode 
GroupByKey
 Key: BEAM-2518
 URL: https://issues.apache.org/jira/browse/BEAM-2518
 Project: Beam
  Issue Type: Bug
  Components: sdk-py
Reporter: Charles Chen
Assignee: Charles Chen


Currently, streaming mode GroupByKey in Python does not respect the specified 
TimestampCombiner semantics for output elements.  We should implement this to 
better conform to the Beam model.
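
As a rough illustration of what respecting a timestamp combiner means for grouped output, here is a toy model for a single window. The function names and the combiner labels ("earliest", "latest", "end_of_window") are assumptions made for this sketch, not the SDK's identifiers.

```python
def combined_timestamp(element_timestamps, window_max_timestamp, combiner):
    """Pick the output timestamp for one key in one window (toy model)."""
    if combiner == "earliest":
        return min(element_timestamps)
    if combiner == "latest":
        return max(element_timestamps)
    if combiner == "end_of_window":
        return window_max_timestamp
    raise ValueError("unknown combiner: %r" % (combiner,))


def group_with_timestamps(elements, window_max_timestamp, combiner):
    """Group (key, value, timestamp) triples that all fall in one window."""
    grouped = {}
    for key, value, timestamp in elements:
        values, stamps = grouped.setdefault(key, ([], []))
        values.append(value)
        stamps.append(timestamp)
    # Each key maps to (grouped values, combined output timestamp).
    return {key: (values, combined_timestamp(stamps, window_max_timestamp, combiner))
            for key, (values, stamps) in grouped.items()}
```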





Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Spark #2479

2017-06-26 Thread Apache Jenkins Server
See 




[GitHub] beam pull request #3444: Implement streaming GroupByKey in Python DirectRunn...

2017-06-26 Thread charlesccychen
GitHub user charlesccychen opened a pull request:

https://github.com/apache/beam/pull/3444

Implement streaming GroupByKey in Python DirectRunner

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/charlesccychen/beam streaming-gbk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3444.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3444


commit a39d4c02105754710ec5350d448d37cb20c9fcc4
Author: Charles Chen 
Date:   2017-06-26T23:54:00Z

Implement streaming GroupByKey in Python DirectRunner






[jira] [Resolved] (BEAM-2437) quickstart.py docs is missing the path to MANIFEST.in

2017-06-26 Thread Ahmet Altay (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay resolved BEAM-2437.
---
Resolution: Fixed

> quickstart.py docs is missing the path to MANIFEST.in
> -
>
> Key: BEAM-2437
> URL: https://issues.apache.org/jira/browse/BEAM-2437
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Jonathan Bingham
>Assignee: Sourabh Bajaj
>Priority: Minor
>  Labels: easyfix
> Fix For: 2.1.0
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> SUMMARY
> The wordcount example in quickstart-py does not work with the sample code 
> without modification.
> OBSERVED
> Copy-pasting from the doc page doesn't work:
> python -m apache_beam.examples.wordcount --input MANIFEST.in --output counts
> Error message: IOError: No files found based on the file pattern MANIFEST.in
> EXPECTED
> The example tells me to set the path to MANIFEST.in, or gives a pseudo-path 
> that I can substitute in the right path prefix.
> python -m apache_beam.examples.wordcount --input 
> /[path-to-git-clone-dir]/beam/sdks/python/MANIFEST.in --output counts
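
The failure reported above comes from a relative pattern that matches nothing in the current working directory. A small pre-flight check along these lines can surface that before launching a pipeline; `resolve_input` is a hypothetical helper written for this note, not part of the Beam SDK.

```python
import glob
import os


def resolve_input(pattern, base_dir=None):
    """Expand a file pattern, failing early with a hint if nothing matches."""
    full_pattern = os.path.join(base_dir, pattern) if base_dir else pattern
    matches = sorted(glob.glob(full_pattern))
    if not matches:
        raise IOError(
            "No files found based on the file pattern %s "
            "(current directory: %s)" % (full_pattern, os.getcwd()))
    return matches
```

Passing the directory that contains MANIFEST.in as `base_dir` plays the role of the path prefix suggested in the report.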





[jira] [Commented] (BEAM-2437) quickstart.py docs is missing the path to MANIFEST.in

2017-06-26 Thread Sourabh Bajaj (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063939#comment-16063939
 ] 

Sourabh Bajaj commented on BEAM-2437:
-

This can be closed now.



[jira] [Commented] (BEAM-2437) quickstart.py docs is missing the path to MANIFEST.in

2017-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063936#comment-16063936
 ] 

ASF GitHub Bot commented on BEAM-2437:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam-site/pull/261




[3/3] beam-site git commit: This closes #261

2017-06-26 Thread altay
This closes #261


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/3ab9c27e
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/3ab9c27e
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/3ab9c27e

Branch: refs/heads/asf-site
Commit: 3ab9c27eb524869d39631810c56f807848912251
Parents: 7360cb7 9d94c4b
Author: Ahmet Altay 
Authored: Mon Jun 26 15:56:23 2017 -0700
Committer: Ahmet Altay 
Committed: Mon Jun 26 15:56:23 2017 -0700

--
 content/contribute/maturity-model/index.html | 2 +-
 content/get-started/quickstart-py/index.html | 2 +-
 src/contribute/maturity-model.md | 2 +-
 src/get-started/quickstart-py.md | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)
--




[2/3] beam-site git commit: Regenerate website

2017-06-26 Thread altay
Regenerate website


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/9d94c4bc
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/9d94c4bc
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/9d94c4bc

Branch: refs/heads/asf-site
Commit: 9d94c4bc5818d24de48524f7c016b773e62bf373
Parents: 2220c8e
Author: Ahmet Altay 
Authored: Mon Jun 26 15:56:23 2017 -0700
Committer: Ahmet Altay 
Committed: Mon Jun 26 15:56:23 2017 -0700

--
 content/contribute/maturity-model/index.html | 2 +-
 content/get-started/quickstart-py/index.html | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam-site/blob/9d94c4bc/content/contribute/maturity-model/index.html
--
diff --git a/content/contribute/maturity-model/index.html 
b/content/contribute/maturity-model/index.html
index ad623c6..3a3bcf9 100644
--- a/content/contribute/maturity-model/index.html
+++ b/content/contribute/maturity-model/index.html
@@ -281,7 +281,7 @@ graduation process and is no longer being 
maintained.
 
   QU50
   The project strives to respond to documented bug reports in a 
timely manner.
-  YES. The project has resolved https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20status%20in%20(Resolved%2C%20Closed)">550
 issues during incubation.Even further, https://issues.apache.org/jira/browse/BEAM/?selectedTab%3Dcom.atlassian.jira.jira-projects-plugin:components-panel=undefined&selectedTab=com.atlassian.jira.jira-projects-plugin:components-panel";>all
 project components have designated a single committer who gets assigned 
all newly filed issues for a triage/re-assignment to ensure timely 
action.
+  YES. The project has resolved https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20status%20in%20(Resolved%2C%20Closed)">550
 issues during incubation.Even further, https://issues.apache.org/jira/projects/BEAM?selectedItem=com.atlassian.jira.jira-projects-plugin%3Acomponents-page&selectedTab%3Dcom.atlassian.jira.jira-projects-plugin%3Acomponents-panel=undefined";>all
 project components have designated a single committer who gets assigned 
all newly filed issues for a triage/re-assignment to ensure timely 
action.
 
 
 

http://git-wip-us.apache.org/repos/asf/beam-site/blob/9d94c4bc/content/get-started/quickstart-py/index.html
--
diff --git a/content/get-started/quickstart-py/index.html 
b/content/get-started/quickstart-py/index.html
index d9e1f98..c56034d 100644
--- a/content/get-started/quickstart-py/index.html
+++ b/content/get-started/quickstart-py/index.html
@@ -263,7 +263,7 @@ environment’s directories.
 
 For example, to run wordcount.py, 
run:
 
-python -m apache_beam.examples.wordcount --input 
MANIFEST.in --output counts
+python -m apache_beam.examples.wordcount --input 
 --output counts
 
 
 



[GitHub] beam-site pull request #261: [BEAM-2437] Input path should be flexible in qu...

2017-06-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam-site/pull/261




[1/3] beam-site git commit: [BEAM-2437] Input path should be flexible in quickstart

2017-06-26 Thread altay
Repository: beam-site
Updated Branches:
  refs/heads/asf-site 7360cb748 -> 3ab9c27eb


[BEAM-2437] Input path should be flexible in quickstart


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/2220c8ed
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/2220c8ed
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/2220c8ed

Branch: refs/heads/asf-site
Commit: 2220c8edbdb28ba546b5be8e86929065178ef9e2
Parents: 7360cb7
Author: Sourabh Bajaj 
Authored: Mon Jun 26 10:58:33 2017 -0700
Committer: Ahmet Altay 
Committed: Mon Jun 26 15:53:14 2017 -0700

--
 src/contribute/maturity-model.md | 2 +-
 src/get-started/quickstart-py.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam-site/blob/2220c8ed/src/contribute/maturity-model.md
--
diff --git a/src/contribute/maturity-model.md b/src/contribute/maturity-model.md
index 6fe272b..6a2d89f 100644
--- a/src/contribute/maturity-model.md
+++ b/src/contribute/maturity-model.md
@@ -138,7 +138,7 @@ The following table summarizes project's self-assessment 
against the Apache Matu
 
   QU50
   The project strives to respond to documented bug reports in a 
timely manner.
-  YES. The project has resolved https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20status%20in%20(Resolved%2C%20Closed)">550
 issues during incubation.Even further, https://issues.apache.org/jira/browse/BEAM/?selectedTab%3Dcom.atlassian.jira.jira-projects-plugin:components-panel=undefined&selectedTab=com.atlassian.jira.jira-projects-plugin:components-panel";>all
 project components have designated a single committer who gets assigned 
all newly filed issues for a triage/re-assignment to ensure timely 
action.
+  YES. The project has resolved https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20status%20in%20(Resolved%2C%20Closed)">550
 issues during incubation.Even further, https://issues.apache.org/jira/projects/BEAM?selectedItem=com.atlassian.jira.jira-projects-plugin%3Acomponents-page&selectedTab%3Dcom.atlassian.jira.jira-projects-plugin%3Acomponents-panel=undefined";>all
 project components have designated a single committer who gets assigned 
all newly filed issues for a triage/re-assignment to ensure timely 
action.
 
 
 

http://git-wip-us.apache.org/repos/asf/beam-site/blob/2220c8ed/src/get-started/quickstart-py.md
--
diff --git a/src/get-started/quickstart-py.md b/src/get-started/quickstart-py.md
index 3ebdf3e..f5cf2aa 100644
--- a/src/get-started/quickstart-py.md
+++ b/src/get-started/quickstart-py.md
@@ -102,7 +102,7 @@ For example, to run `wordcount.py`, run:
 
 {:.runner-direct}
 ```
-python -m apache_beam.examples.wordcount --input MANIFEST.in --output counts
+python -m apache_beam.examples.wordcount --input  --output 
counts
 ```
 
 {:.runner-dataflow}



[GitHub] beam pull request #3443: [BEAM-2511] Implements TextIO.ReadAll

2017-06-26 Thread jkff
GitHub user jkff opened a pull request:

https://github.com/apache/beam/pull/3443

[BEAM-2511] Implements TextIO.ReadAll

Reads a PCollection of filenames. Part of the plan at 
http://s.apache.org/textio-sdf. Currently implemented pretty naively, and 
without SDF: expands glob, splits each file into 64MB chunks, reads each chunk 
using existing TextReader code. Pretty trivial, except had to duplicate code 
for managing compression - but this is tested by adding a ReadAll test to every 
Read test.

This won't advance the watermark very well because the chunks are 
unordered. However, in streaming pipelines people will hopefully be ingesting 
PCollections of small-ish files, so this won't matter much. And TextIO doesn't 
report timestamps of elements anyway, so in fact it doesn't matter at all. One 
of the next steps is to also develop an SDF version of this, and have runners 
that support SDF use it via an override.
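
The expand-then-split step described above can be sketched as follows. This is an illustration only: `split_into_chunks` and `plan_reads` are hypothetical names, a dict of file sizes stands in for an expanded glob, and the sketch ignores how a real reader handles a line that straddles a chunk boundary.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64MB, the chunk size mentioned in the description


def split_into_chunks(file_size, chunk_size=CHUNK_SIZE):
    """Return half-open [start, end) byte ranges covering file_size bytes."""
    return [(start, min(start + chunk_size, file_size))
            for start in range(0, file_size, chunk_size)]


def plan_reads(file_sizes, chunk_size=CHUNK_SIZE):
    """Turn {filename: size in bytes} into a flat list of (name, start, end)."""
    return [(name, start, end)
            for name, size in sorted(file_sizes.items())
            for start, end in split_into_chunks(size, chunk_size)]
```

Each (name, start, end) triple is an independent unit of work, which is why the resulting chunks are unordered from the point of view of the watermark.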

R: @reuvenlax 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jkff/incubator-beam textio-read-all

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3443.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3443


commit f7f0f1e4e9105a678524894a2304520541359d33
Author: Eugene Kirpichov 
Date:   2017-06-24T01:01:53Z

Splits large TextIOTest into TextIOReadTest and TextIOWriteTest

commit 79ae1e8d4bbe92fad06837555db471368007bd45
Author: Eugene Kirpichov 
Date:   2017-06-24T01:02:10Z

Adds TextIO.readAll(), implemented rather naively






[GitHub] beam pull request #3442: Splits large TextIOTest into TextIOReadTest and Tex...

2017-06-26 Thread jkff
GitHub user jkff opened a pull request:

https://github.com/apache/beam/pull/3442

Splits large TextIOTest into TextIOReadTest and TextIOWriteTest

R: @reuvenlax  

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jkff/incubator-beam split-textio-test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3442.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3442


commit f7f0f1e4e9105a678524894a2304520541359d33
Author: Eugene Kirpichov 
Date:   2017-06-24T01:01:53Z

Splits large TextIOTest into TextIOReadTest and TextIOWriteTest






Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #4217

2017-06-26 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Spark #2478

2017-06-26 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-1265) Add streaming support to Python DirectRunner

2017-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063816#comment-16063816
 ] 

ASF GitHub Bot commented on BEAM-1265:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3440


> Add streaming support to Python DirectRunner
> 
>
> Key: BEAM-1265
> URL: https://issues.apache.org/jira/browse/BEAM-1265
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py
>Reporter: Ahmet Altay
>Assignee: Charles Chen
>
> Continue the work started in https://issues.apache.org/jira/browse/BEAM-428



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #3440: [BEAM-1265] Remove old deprecated PubSub code

2017-06-26 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3440




[1/2] beam git commit: Remove old deprecated PubSub code

2017-06-26 Thread altay
flow/internal/dependency.py
@@ -73,7 +73,7 @@ from apache_beam.options.pipeline_options import SetupOptions
 # Update this version to the next version whenever there is a change that will
 # require changes to the execution environment.
 # This should be in the beam-[version]-[date] format, date is optional.
-BEAM_CONTAINER_VERSION = 'beam-2.1.0-20170601'
+BEAM_CONTAINER_VERSION = 'beam-2.1.0-20170626'
 
 # Standard file names used for staging files.
 WORKFLOW_TARBALL_FILE = 'workflow.tar.gz'



[2/2] beam git commit: This closes #3440

2017-06-26 Thread altay
This closes #3440


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/1ea1de4a
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/1ea1de4a
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/1ea1de4a

Branch: refs/heads/master
Commit: 1ea1de4aa9d32e3c5a596ccd7d84aff1cc2a7428
Parents: 16f87f4 926f949
Author: Ahmet Altay 
Authored: Mon Jun 26 14:23:08 2017 -0700
Committer: Ahmet Altay 
Committed: Mon Jun 26 14:23:08 2017 -0700

--
 sdks/python/apache_beam/io/gcp/pubsub.py| 71 +---
 .../runners/dataflow/internal/dependency.py |  2 +-
 2 files changed, 2 insertions(+), 71 deletions(-)
--




[jira] [Updated] (BEAM-2517) Document how to build Python SDK from BEAM head in contribution guide.

2017-06-26 Thread Valentyn Tymofieiev (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev updated BEAM-2517:
--
Component/s: (was: sdk-py)
 website

> Document how to build Python SDK from BEAM head in contribution guide.
> --
>
> Key: BEAM-2517
> URL: https://issues.apache.org/jira/browse/BEAM-2517
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: Valentyn Tymofieiev
>Priority: Minor
>  Labels: starter
>
> We should add instructions on how to build the Python SDK from Beam head to 
> the Beam contributor guide [1].
> The commands can be as follows:
> cd ./beam/sdks/python
> python setup.py sdist
> SDK tarball will appear in ./dist/
> [1]: https://beam.apache.org/contribute/contribution-guide





[jira] [Commented] (BEAM-2271) Release guide or pom.xml needs update to avoid releasing Python binary artifacts

2017-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063797#comment-16063797
 ] 

ASF GitHub Bot commented on BEAM-2271:
--

GitHub user sb2nov opened a pull request:

https://github.com/apache/beam/pull/3441

[BEAM-2271] Add more files to mvn clean

@aaltay I can't test this as I don't have permissions to follow the release 
guide. Can you test this out?


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sb2nov/beam BEAM-2271-fix-release-filter-1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3441.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3441






> Release guide or pom.xml needs update to avoid releasing Python binary 
> artifacts
> 
>
> Key: BEAM-2271
> URL: https://issues.apache.org/jira/browse/BEAM-2271
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Daniel Halperin
>Assignee: Ahmet Altay
> Fix For: 2.1.0
>
>
> The following directories (and children) were discovered in 2.0.0-RC2 and 
> were present in 0.6.0.
> {code}
> sdks/python: build   dist   .eggs   nose-1.3.7-py2.7.egg  (and child 
> contents)
> {code}
> Ideally, these artifacts, which are created during setup and testing, would 
> get created in the {{sdks/python/target/}} subfolder where they will 
> automatically get ignored. More info below.
> For 2.0.0, we will manually remove these files from the source release RC3+. 
> This should be fixed before the next release.
> Here is a list of other paths that get excluded, should they be useful.
> {code}
> 
> 
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/).*${project.build.directory}.*]
> 
> 
>  
> 
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?maven-eclipse\.xml]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?\.project]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?\.classpath]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?[^/]*\.iws]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?\.idea(/.*)?]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?out(/.*)?]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?[^/]*\.ipr]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?[^/]*\.iml]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?\.settings(/.*)?]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?\.externalToolBuilders(/.*)?]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?\.deployables(/.*)?]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?\.wtpmodules(/.*)?]
> 
> 
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?cobertura\.ser]
> 
> 
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?pom\.xml\.releaseBackup]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?release\.properties]
>   
> {code}
> This list is stored inside of this jar, which you can find by tracking 
> maven-assembly-plugin from the root apache pom: 
> https://mvnrepository.com/artifact/org.apache.apache.resources/apache-source-release-assembly-descriptor/1.0.6
> http://svn.apache.org/repos/asf/maven/pom/tags/apache-18/pom.xml
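
As a rough illustration of why the Python artifacts slip through, the first
exclusion pattern quoted above can be tried out in Python. This is only a
sketch under assumptions: it replaces ${project.build.directory} with the
literal string "target", it assumes the pattern has to match the whole
relative path, and the sample paths are made up. Maven's real interpolation
and matching may differ.

```python
import re

# Assumption: ${project.build.directory} is replaced by the literal "target".
BUILD_DIR = "target"

# First pattern from the list above, with the %regex[...] wrapper removed.
BUILD_OUTPUT_EXCLUDE = re.compile(
    r"(?!((?!" + BUILD_DIR + r"/)[^/]+/)*src/).*" + BUILD_DIR + r".*"
)


def is_excluded(path):
    """Return True if the whole path matches the build-output exclusion."""
    return BUILD_OUTPUT_EXCLUDE.fullmatch(path) is not None


# Hypothetical sample paths, for illustration only.
for sample in [
    "sdks/python/target/build/lib/x.py",
    "sdks/python/build/lib/x.py",
    "sdks/python/dist/apache-beam-2.0.0.tar.gz",
    "sdks/java/core/src/main/target/Foo.java",
]:
    print(sample, is_excluded(sample))
```

Under these assumptions only the path below sdks/python/target/ is excluded,
which is consistent with the suggestion to create the Python artifacts there.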





[GitHub] beam pull request #3441: [BEAM-2271] Add more files to mvn clean

2017-06-26 Thread sb2nov
GitHub user sb2nov opened a pull request:

https://github.com/apache/beam/pull/3441

[BEAM-2271] Add more files to mvn clean

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---

@aaltay I can't test this as I don't have permissions to follow the release 
guide. Can you test this out?


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sb2nov/beam BEAM-2271-fix-release-filter-1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3441.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3441








[GitHub] beam pull request #3440: Remove old deprecated PubSub code

2017-06-26 Thread charlesccychen
GitHub user charlesccychen opened a pull request:

https://github.com/apache/beam/pull/3440

Remove old deprecated PubSub code

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/charlesccychen/beam remove-old-pubsub

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3440.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3440


commit 926f949580c3a21df72a8836feda1f6b947850ec
Author: Charles Chen 
Date:   2017-06-26T20:00:14Z

Remove old deprecated PubSub code






[jira] [Assigned] (BEAM-2517) Document how to build Python SDK from BEAM head in contribution guide.

2017-06-26 Thread Ahmet Altay (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay reassigned BEAM-2517:
-

Assignee: (was: Ahmet Altay)

> Document how to build Python SDK from BEAM head in contribution guide.
> --
>
> Key: BEAM-2517
> URL: https://issues.apache.org/jira/browse/BEAM-2517
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Valentyn Tymofieiev
>Priority: Minor
>  Labels: starter
>
> We should add instructions on how to build the Python SDK from BEAM head to the 
> BEAM contributor guide [1].
> The commands can be as follows:
> cd ./beam/sdks/python
> python setup.py sdist
> SDK tarball will appear in ./dist/
> [1]: https://beam.apache.org/contribute/contribution-guide





[jira] [Updated] (BEAM-2517) Document how to build Python SDK from BEAM head in contribution guide.

2017-06-26 Thread Valentyn Tymofieiev (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev updated BEAM-2517:
--
Environment: (was: We should add instructions how to build Python SDK 
from BEAM head to BEAM contributor guide[1] .

The commands can be as follows:
cd ./beam/sdks/python
python setup.py sdist
SDK tarball will appear in ./dist/

[1]: https://beam.apache.org/contribute/contribution-guide)

> Document how to build Python SDK from BEAM head in contribution guide.
> --
>
> Key: BEAM-2517
> URL: https://issues.apache.org/jira/browse/BEAM-2517
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Valentyn Tymofieiev
>Assignee: Ahmet Altay
>Priority: Minor
>  Labels: starter
>






[jira] [Updated] (BEAM-2517) Document how to build Python SDK from BEAM head in contribution guide.

2017-06-26 Thread Valentyn Tymofieiev (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev updated BEAM-2517:
--
Labels: starter  (was: )

> Document how to build Python SDK from BEAM head in contribution guide.
> --
>
> Key: BEAM-2517
> URL: https://issues.apache.org/jira/browse/BEAM-2517
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Valentyn Tymofieiev
>Assignee: Ahmet Altay
>Priority: Minor
>  Labels: starter
>
> We should add instructions on how to build the Python SDK from BEAM head to the 
> BEAM contributor guide [1].
> The commands can be as follows:
> cd ./beam/sdks/python
> python setup.py sdist
> SDK tarball will appear in ./dist/
> [1]: https://beam.apache.org/contribute/contribution-guide





[jira] [Created] (BEAM-2517) Document how to build Python SDK from BEAM head in contribution guide.

2017-06-26 Thread Valentyn Tymofieiev (JIRA)
Valentyn Tymofieiev created BEAM-2517:
-

 Summary: Document how to build Python SDK from BEAM head in 
contribution guide.
 Key: BEAM-2517
 URL: https://issues.apache.org/jira/browse/BEAM-2517
 Project: Beam
  Issue Type: Bug
  Components: sdk-py
 Environment: We should add instructions on how to build the Python SDK 
from BEAM head to the BEAM contributor guide [1].

The commands can be as follows:
cd ./beam/sdks/python
python setup.py sdist
SDK tarball will appear in ./dist/

[1]: https://beam.apache.org/contribute/contribution-guide
Reporter: Valentyn Tymofieiev
Assignee: Ahmet Altay
Priority: Minor








[jira] [Updated] (BEAM-2517) Document how to build Python SDK from BEAM head in contribution guide.

2017-06-26 Thread Valentyn Tymofieiev (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Valentyn Tymofieiev updated BEAM-2517:
--
Description: 
We should add instructions on how to build the Python SDK from BEAM head to the 
BEAM contributor guide [1].

The commands can be as follows:
cd ./beam/sdks/python
python setup.py sdist
SDK tarball will appear in ./dist/

[1]: https://beam.apache.org/contribute/contribution-guide
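
For anyone who prefers to script these steps, here is a minimal Python sketch
of the same flow. The helper names (sdist_command, find_tarballs, build_sdk)
are hypothetical and not part of Beam; the sketch only assumes that setup.py
lives in the SDK directory and that the tarball lands in ./dist/, as
described above.

```python
import glob
import os
import subprocess
import sys


def sdist_command():
    """Command from the description: run setup.py sdist with this interpreter."""
    return [sys.executable, "setup.py", "sdist"]


def find_tarballs(sdk_dir):
    """List source tarballs under <sdk_dir>/dist, sorted by name."""
    return sorted(glob.glob(os.path.join(sdk_dir, "dist", "*.tar.gz")))


def build_sdk(sdk_dir):
    """Run the sdist build inside sdk_dir and return the tarballs found afterwards."""
    subprocess.check_call(sdist_command(), cwd=sdk_dir)
    return find_tarballs(sdk_dir)
```

Calling build_sdk("./beam/sdks/python") from the directory that contains the
clone would correspond to the two commands listed in the description.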

> Document how to build Python SDK from BEAM head in contribution guide.
> --
>
> Key: BEAM-2517
> URL: https://issues.apache.org/jira/browse/BEAM-2517
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Valentyn Tymofieiev
>Assignee: Ahmet Altay
>Priority: Minor
>  Labels: starter
>
> We should add instructions on how to build the Python SDK from BEAM head to the 
> BEAM contributor guide [1].
> The commands can be as follows:
> cd ./beam/sdks/python
> python setup.py sdist
> SDK tarball will appear in ./dist/
> [1]: https://beam.apache.org/contribute/contribution-guide





Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #4216

2017-06-26 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2140) Fix SplittableDoFn ValidatesRunner tests in FlinkRunner

2017-06-26 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063594#comment-16063594
 ] 

Kenneth Knowles commented on BEAM-2140:
---

There is an open need to be able to set a timer with an associated _output_ 
watermark hold, which may be related. This has come up in conversation with 
[~reuvenlax].

For a processing time timer, it would be like {{timer.withOutputTime(new 
Instant(...)).setRelative(...)}} and for an event time timer, it would look 
similarly like {{timer.withOutputTime(new Instant(...)).set(...))}} and this 
would

These permit the {{@OnTimer}} callback to output elements with that timestamp. 
Both also manifest as output watermark holds.

It sounds like you want to set an event time timer if you want it to be fired 
on the input watermark going to +inf.

> Fix SplittableDoFn ValidatesRunner tests in FlinkRunner
> ---
>
> Key: BEAM-2140
> URL: https://issues.apache.org/jira/browse/BEAM-2140
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>
> As discovered as part of BEAM-1763, there is a failing SDF test. We disabled 
> the tests to unblock the open PR for BEAM-1763.





[jira] [Comment Edited] (BEAM-2140) Fix SplittableDoFn ValidatesRunner tests in FlinkRunner

2017-06-26 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063594#comment-16063594
 ] 

Kenneth Knowles edited comment on BEAM-2140 at 6/26/17 6:49 PM:


There is an open need to be able to set a timer with an associated _output_ 
watermark hold, which may be related. This has come up in conversation with 
[~reuvenlax].

For a processing time timer, it would be like {{timer.withOutputTime(new 
Instant(...)).setRelative(...)}} and for an event time timer, it would look 
similar, like {{timer.withOutputTime(new Instant(...)).set(...)}}. These 
permit the {{@OnTimer}} callback to output elements with that timestamp. Both 
also manifest as output watermark holds.

It sounds like you want to set an event time timer if you want it to be fired 
on the input watermark going to +inf.
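
To make the distinction concrete, here is a deliberately simplified Python
model of the two ideas in this comment. It is not Beam's implementation:
timestamps are plain numbers, the function names are hypothetical, and it
assumes that the output watermark is the input watermark capped by the
earliest active hold and that an event time timer fires once the input
watermark reaches its timestamp.

```python
import math


def output_watermark(input_watermark, holds):
    """Output watermark: the input watermark capped by the earliest active hold."""
    return min([input_watermark] + list(holds))


def fired_event_time_timers(input_watermark, timers):
    """Event time timers whose timestamp the input watermark has reached."""
    return sorted(t for t in timers if t <= input_watermark)


# A hold at 40 keeps the output watermark back; the input watermark is unaffected.
print(output_watermark(100, [40]))

# Once the input watermark goes to +inf, every event time timer is eligible to fire.
print(fired_event_time_timers(math.inf, [200, 50]))
```

In this model a hold never changes which timers fire, since firing depends
only on the input watermark.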


was (Author: kenn):
There is an open need to be able to set a timer with an associated _output_ 
watermark hold, which may be related. This has come up in conversation with 
[~reuvenlax].

For a processing time timer, it would be like {{timer.withOutputTime(new 
Instant(...)).setRelative(...)}} and for an event time timer, it would look 
similarly like {{timer.withOutputTime(new Instant(...)).set(...))}} and this 
would

These permit the {{@OnTimer}} callback to output elements with that timestamp. 
Both also manifest as output watermark holds.

It sounds like you want to set an event time timer if you want it to be fired 
on the input watermark going to +inf.

> Fix SplittableDoFn ValidatesRunner tests in FlinkRunner
> ---
>
> Key: BEAM-2140
> URL: https://issues.apache.org/jira/browse/BEAM-2140
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>
> As discovered as part of BEAM-1763, there is a failing SDF test. We disabled 
> the tests to unblock the open PR for BEAM-1763.





[jira] [Comment Edited] (BEAM-2140) Fix SplittableDoFn ValidatesRunner tests in FlinkRunner

2017-06-26 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063581#comment-16063581
 ] 

Kenneth Knowles edited comment on BEAM-2140 at 6/26/17 6:40 PM:


A watermark hold constrains the output watermark, not the input watermark.


was (Author: kenn):
A watermark holds constrains the output watermark, not the input watermark.

> Fix SplittableDoFn ValidatesRunner tests in FlinkRunner
> ---
>
> Key: BEAM-2140
> URL: https://issues.apache.org/jira/browse/BEAM-2140
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>
> As discovered as part of BEAM-1763, there is a failing SDF test. We disabled 
> the tests to unblock the open PR for BEAM-1763.





[jira] [Commented] (BEAM-2140) Fix SplittableDoFn ValidatesRunner tests in FlinkRunner

2017-06-26 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063581#comment-16063581
 ] 

Kenneth Knowles commented on BEAM-2140:
---

A watermark holds constrains the output watermark, not the input watermark.

> Fix SplittableDoFn ValidatesRunner tests in FlinkRunner
> ---
>
> Key: BEAM-2140
> URL: https://issues.apache.org/jira/browse/BEAM-2140
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>
> As discovered as part of BEAM-1763, there is a failing SDF test. We disabled 
> the tests to unblock the open PR for BEAM-1763.





Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Spark #2477

2017-06-26 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2140) Fix SplittableDoFn ValidatesRunner tests in FlinkRunner

2017-06-26 Thread Eugene Kirpichov (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063548#comment-16063548
 ] 

Eugene Kirpichov commented on BEAM-2140:


On 1: the timer should not be dropped because ProcessFn is setting a watermark 
hold (bullet 3 in my previous comment here).

On 2: correct, it uses the low-level state/timer APIs - partially because it 
was written before the high-level APIs were available; partially because the 
high-level APIs don't support watermark holds; partially because we didn't want 
to tie the availability of one complex feature in a runner to the availability 
of another complex feature.

> Fix SplittableDoFn ValidatesRunner tests in FlinkRunner
> ---
>
> Key: BEAM-2140
> URL: https://issues.apache.org/jira/browse/BEAM-2140
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Aljoscha Krettek
>Assignee: Aljoscha Krettek
>
> As discovered as part of BEAM-1763, there is a failing SDF test. We disabled 
> the tests to unblock the open PR for BEAM-1763.





[jira] [Assigned] (BEAM-2437) quickstart.py docs is missing the path to MANIFEST.in

2017-06-26 Thread Sourabh Bajaj (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sourabh Bajaj reassigned BEAM-2437:
---

Assignee: Sourabh Bajaj  (was: Ahmet Altay)

> quickstart.py docs is missing the path to MANIFEST.in
> -
>
> Key: BEAM-2437
> URL: https://issues.apache.org/jira/browse/BEAM-2437
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Jonathan Bingham
>Assignee: Sourabh Bajaj
>Priority: Minor
>  Labels: easyfix
> Fix For: 2.1.0
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> SUMMARY
> The wordcount example in quickstart-py does not work with the sample code 
> without modification.
> OBSERVED
> Copy-pasting from the doc page doesn't work:
> python -m apache_beam.examples.wordcount --input MANIFEST.in --output counts
> Error message: IOError: No files found based on the file pattern MANIFEST.in
> EXPECTED
> The example tells me to set the path to MANIFEST.in, or gives a pseudo-path 
> that I can substitute in the right path prefix.
> python -m apache_beam.examples.wordcount --input 
> /[path-to-git-clone-dir]/beam/sdks/python/MANIFEST.in --output counts
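
One way to avoid the failure described above is to resolve the input path
before launching the example. The following is a sketch, not part of the
quickstart: resolve_input and wordcount_command are hypothetical helpers, and
the only things taken from the report are the module name
apache_beam.examples.wordcount and the --input and --output flags.

```python
import glob
import os
import sys


def resolve_input(pattern):
    """Return the absolute form of pattern, or raise if nothing matches it."""
    absolute = os.path.abspath(os.path.expanduser(pattern))
    if not glob.glob(absolute):
        raise IOError("No files found based on the file pattern %s" % absolute)
    return absolute


def wordcount_command(input_pattern, output_prefix):
    """Build the wordcount command line with a verified, absolute input path."""
    return [
        sys.executable, "-m", "apache_beam.examples.wordcount",
        "--input", resolve_input(input_pattern),
        "--output", output_prefix,
    ]
```

The resulting list can be passed to subprocess.check_call, so a missing
input file is reported before the pipeline starts.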





[jira] [Commented] (BEAM-2437) quickstart.py docs is missing the path to MANIFEST.in

2017-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063519#comment-16063519
 ] 

ASF GitHub Bot commented on BEAM-2437:
--

GitHub user sb2nov opened a pull request:

https://github.com/apache/beam-site/pull/261

[BEAM-2437] Input path should be flexible in quickstart

R: @aaltay PTAL

The manifest and pom files will all be missing, so we should just let the user 
point to any text file of their choice.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sb2nov/beam-site BEAM-2437-fix-input-file-path

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam-site/pull/261.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #261






> quickstart.py docs is missing the path to MANIFEST.in
> -
>
> Key: BEAM-2437
> URL: https://issues.apache.org/jira/browse/BEAM-2437
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Jonathan Bingham
>Assignee: Ahmet Altay
>Priority: Minor
>  Labels: easyfix
> Fix For: 2.1.0
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> SUMMARY
> The wordcount example in quickstart-py does not work with the sample code 
> without modification.
> OBSERVED
> Copy-pasting from the doc page doesn't work:
> python -m apache_beam.examples.wordcount --input MANIFEST.in --output counts
> Error message: IOError: No files found based on the file pattern MANIFEST.in
> EXPECTED
> The example tells me to set the path to MANIFEST.in, or gives a pseudo-path 
> that I can substitute in the right path prefix.
> python -m apache_beam.examples.wordcount --input 
> /[path-to-git-clone-dir]/beam/sdks/python/MANIFEST.in --output counts





[GitHub] beam-site pull request #261: [BEAM-2437] Input path should be flexible in qu...

2017-06-26 Thread sb2nov
GitHub user sb2nov opened a pull request:

https://github.com/apache/beam-site/pull/261

[BEAM-2437] Input path should be flexible in quickstart

R: @aaltay PTAL

The manifest and pom files will all be missing, so we should just let the user 
point to any text file of their choice.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sb2nov/beam-site BEAM-2437-fix-input-file-path

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam-site/pull/261.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #261








[jira] [Commented] (BEAM-2516) User reports 4 minutes to process 1 million line CSV in DirectRunner

2017-06-26 Thread Thomas Groh (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063511#comment-16063511
 ] 

Thomas Groh commented on BEAM-2516:
---

Worth investigating. We do expect things to take notably longer than, for 
example, a tuned Unix utility, but this seems a bit over the top.

> User reports 4 minutes to process 1 million line CSV in DirectRunner
> 
>
> Key: BEAM-2516
> URL: https://issues.apache.org/jira/browse/BEAM-2516
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Reporter: Kenneth Knowles
>Assignee: Thomas Groh
>Priority: Minor
>
> https://stackoverflow.com/questions/44736414/simple-apache-beam-manipulations-work-very-slow
> I don't know what the expectations are here, so I wasn't ready to say this is 
> WAI. Low priority since it isn't what the runner is for anyhow, but this 
> seems like the scale of data that should be snappy. Worth investigating, or 
> maybe you can quickly indicate why it is expected?





[jira] [Created] (BEAM-2516) User reports 4 minutes to process 1 million line CSV in DirectRunner

2017-06-26 Thread Kenneth Knowles (JIRA)
Kenneth Knowles created BEAM-2516:
-

 Summary: User reports 4 minutes to process 1 million line CSV in 
DirectRunner
 Key: BEAM-2516
 URL: https://issues.apache.org/jira/browse/BEAM-2516
 Project: Beam
  Issue Type: Bug
  Components: runner-direct
Reporter: Kenneth Knowles
Assignee: Thomas Groh
Priority: Minor


https://stackoverflow.com/questions/44736414/simple-apache-beam-manipulations-work-very-slow

I don't know what the expectations are here, so I wasn't ready to say this is 
WAI. Low priority since it isn't what the runner is for anyhow, but this seems 
like the scale of data that should be snappy. Worth investigating, or maybe you 
can quickly indicate why it is expected?





[jira] [Commented] (BEAM-2514) Improve error message for missing required options of Beam pipeline

2017-06-26 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063397#comment-16063397
 ] 

Luke Cwik commented on BEAM-2514:
-

This change assumes that all users are CLI users (compared to the existing code, 
which assumed that all users were programmatic users). Should we word the 
message so it's useful for both CLI users and users who set the options 
programmatically?

> Improve error message for missing required options of Beam pipeline
> ---
>
> Key: BEAM-2514
> URL: https://issues.apache.org/jira/browse/BEAM-2514
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Affects Versions: 2.0.0
>Reporter: Manu Zhang
>Assignee: Manu Zhang
>Priority: Minor
>






[GitHub] beam pull request #3439: Minor fixes for DSL SQL

2017-06-26 Thread iemejia
GitHub user iemejia opened a pull request:

https://github.com/apache/beam/pull/3439

Minor fixes for DSL SQL

- Make the example execution runner agnostic and some minor fixes.
- Enable findbugs for the profile release and fix current issues.

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/iemejia/beam DSL_SQL

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3439.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3439


commit 0eadc1042f4fdb322bb1e41fd4b11916cdd50b94
Author: Ismaël Mejía 
Date:   2017-06-26T14:37:51Z

Small fixes to make the example run in a runner agnostic way

commit 4e8b07709e7606c9885b37671203d7e731f83b47
Author: Ismaël Mejía 
Date:   2017-06-26T15:19:01Z

Add findbugs validation and fix existing findbugs issues






[jira] [Created] (BEAM-2515) BeamSql: refactor the MockedBeamSqlTable and related tests

2017-06-26 Thread James Xu (JIRA)
James Xu created BEAM-2515:
--

 Summary: BeamSql: refactor the MockedBeamSqlTable and related tests
 Key: BEAM-2515
 URL: https://issues.apache.org/jira/browse/BEAM-2515
 Project: Beam
  Issue Type: Bug
  Components: dsl-sql
Reporter: James Xu
Assignee: James Xu


MockedBeamSqlTable is only for Bounded data sources; with another Unbounded 
mock being added, some refactoring will be needed.





[jira] [Comment Edited] (BEAM-2490) ReadFromText function is not taking all data with glob operator (*)

2017-06-26 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063224#comment-16063224
 ] 

Guillermo Rodríguez Cano edited comment on BEAM-2490 at 6/26/17 3:07 PM:
-

 Hello [~chamikara] and [~altay], and thanks for the comments,

 here you have some details of the setup I have used for the Direct runner so 
far (where apache beam version also applies for the Dataflow runner):
* OS: Mac OS X Sierra 10.12.5 
* Apache Beam: 2.0.0
* Python: 2.7.13

 I tried the HEAD from the official repository (git hash: 
[16f87f49f20796e29d01ed363a9097ea5420583c|https://github.com/apache/beam/tree/16f87f49f20796e29d01ed363a9097ea5420583c])
 as suggested by [~altay] and I cannot conclude yet whether it works or not. It 
seems that gz files are read 'more' than before because there is a higher 
memory usage than when using the current release of Apache Beam (and the amount 
of memory used is comparable to the case when the same non-compressed files are 
processed with the pipeline). However, it is extremely slow (again, with the 
Direct Runner), slower than using the non-compressed files.
Therefore, as a test of the HEAD, I am now running the pipeline on only one of 
those gzip files, but the task hasn't completed (maybe I have just discovered some 
performance bug in that fix, https://github.com/apache/beam/pull/3428, because 
it feels very slow...). I'll report on this when done (will also try two files).

I am not sure if this would be faster in GCP, but I could try this anyway on 
Dataflow, though I am not sure if I can have Dataflow run the HEAD of the 
repository. I tried following the advice in the official 
[documentation|https://cloud.google.com/dataflow/pipelines/dependencies-python], 
but I have not managed to get the repository properly packaged for the workers to 
pick it up.

was (Author: wileeam):
 Hello [~chamikara] and [~altay], and thanks for the comments,

 here you have some details of the setup I have used for the Direct runner so 
far (where apache beam version also applies for the Dataflow runner):
* OS: Mac OS X Sierra 10.12.5 
* Apache Beam: 2.0.0
* Python: 2.7.13

 I tried the HEAD from the official repository (git hash: 
[16f87f49f20796e29d01ed363a9097ea5420583c|https://github.com/apache/beam/tree/16f87f49f20796e29d01ed363a9097ea5420583c])
 as suggested by [~altay] and I cannot conclude yet whether it works or not. It 
seems that gz files are read 'more' than before because there is a higher 
memory usage than when using the current release of Apache Beam (and the amount 
of memory used is comparable to the case when the same non-compressed files are 
processed with the pipeline). However, it is extremely slow (again, with the 
Direct Runner), slower than using the non-compressed files.
Therefore, as a test of the HEAD I am now running only one of those gzip files 
now but the task hasn't completed (maybe then I just discovered some 
performance bug in that fix, https://github.com/apache/beam/pull/3428, because 
it feels very slow...).

I am not sure if this would be faster in GCP but I could try this anyways on 
Dataflow though I am not sure if I can have Dataflow run the HEAD of the 
repository. I tried following the advice on the official 
[documentation|https://cloud.google.com/dataflow/pipelines/dependencies-python] 
but I don't manage to get the repository properly packed for the workers to 
pick it up.

> ReadFromText function is not taking all data with glob operator (*) 
> 
>
> Key: BEAM-2490
> URL: https://issues.apache.org/jira/browse/BEAM-2490
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Affects Versions: 2.0.0
> Environment: Usage with Google Cloud Platform: Dataflow runner
>Reporter: Olivier NGUYEN QUOC
>Assignee: Chamikara Jayalath
> Fix For: Not applicable
>
>
> I run a very simple pipeline:
> * Read my files from Google Cloud Storage
> * Split with '\n' char
> * Write in on a Google Cloud Storage
> I have 8 files that match with the pattern:
> * my_files_2016090116_20160902_060051_xx.csv.gz (229.25 MB)
> * my_files_2016090117_20160902_060051_xx.csv.gz (184.1 MB)
> * my_files_2016090118_20160902_060051_xx.csv.gz (171.73 MB)
> * my_files_2016090119_20160902_060051_xx.csv.gz (151.34 MB)
> * my_files_2016090120_20160902_060051_xx.csv.gz (129.69 MB)
> * my_files_2016090121_20160902_060051_xx.csv.gz (151.7 MB)
> * my_files_2016090122_20160902_060051_xx.csv.gz (346.46 MB)
> * my_files_2016090122_20160902_060051_xx.csv.gz (222.57 MB)
> This code should take them all:
> {code:python}
> beam.io.ReadFromText(
>   "gs://_folder1/my_files_20160901*.csv.gz",
>   skip_header_lines=1,
>   compression_ty

[jira] [Commented] (BEAM-2490) ReadFromText function is not taking all data with glob operator (*)

2017-06-26 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063224#comment-16063224
 ] 

Guillermo Rodríguez Cano commented on BEAM-2490:


 Hello [~chamikara] and [~altay], and thanks for the comments.

 Here are some details of the setup I have used so far with the Direct runner
(the Apache Beam version also applies to the Dataflow runner):
* OS: Mac OS X Sierra 10.12.5 
* Apache Beam: 2.0.0
* Python: 2.7.13

 I tried the HEAD from the official repository (git hash: 
[16f87f49f20796e29d01ed363a9097ea5420583c|https://github.com/apache/beam/tree/16f87f49f20796e29d01ed363a9097ea5420583c])
 as suggested by [~altay], but I cannot yet conclude whether it works. The gz
files seem to be read more completely than before: memory usage is higher than
with the current release of Apache Beam, and comparable to processing the same
files uncompressed. However, it is extremely slow (again, with the Direct
Runner), slower than with the uncompressed files.
As a test of the HEAD I am therefore now running only one of those gzip files,
but the task hasn't completed yet (maybe I have just discovered a performance
bug in that fix, https://github.com/apache/beam/pull/3428, because it feels
very slow...).

I am not sure whether this would be faster on GCP. I could try it on Dataflow
anyway, but I am not sure whether Dataflow can run the HEAD of the repository.
I tried following the advice in the official
[documentation|https://cloud.google.com/dataflow/pipelines/dependencies-python]
but I have not managed to package the repository so that the workers pick it
up.
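
One way to decide whether the gzip files are read completely, independent of
the runner, might be to count the decompressed lines locally and compare that
number with the number of records the pipeline writes. The following is only a
minimal sketch using the Python 3 standard library (not Beam, and not the
Python 2.7 setup listed above); the helper name count_gzip_lines is
hypothetical and it assumes the files have been copied to a local directory.

```python
import glob
import gzip


def count_gzip_lines(pattern, skip_header_lines=0):
    """Return {path: number of data lines} for each local gzip file matching pattern."""
    counts = {}
    for path in sorted(glob.glob(pattern)):
        # Open in text mode so that iteration yields decompressed lines.
        with gzip.open(path, "rt") as handle:
            total = sum(1 for _ in handle)
        # Mirror skip_header_lines from the pipeline: drop the header rows.
        counts[path] = max(total - skip_header_lines, 0)
    return counts
```

Summing the returned values for the local copies of the input files gives the
record count that a complete read with skip_header_lines=1 would be expected
to produce.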

> ReadFromText function is not taking all data with glob operator (*) 
> 
>
> Key: BEAM-2490
> URL: https://issues.apache.org/jira/browse/BEAM-2490
> Project: Beam
> Issue Type: Bug
> Components: sdk-py
> Affects Versions: 2.0.0
> Environment: Usage with Google Cloud Platform: Dataflow runner
> Reporter: Olivier NGUYEN QUOC
> Assignee: Chamikara Jayalath
> Fix For: Not applicable
>
>
> I run a very simple pipeline:
> * Read my files from Google Cloud Storage
> * Split with '\n' char
> * Write it to Google Cloud Storage
> I have 8 files that match the pattern:
> * my_files_2016090116_20160902_060051_xx.csv.gz (229.25 MB)
> * my_files_2016090117_20160902_060051_xx.csv.gz (184.1 MB)
> * my_files_2016090118_20160902_060051_xx.csv.gz (171.73 MB)
> * my_files_2016090119_20160902_060051_xx.csv.gz (151.34 MB)
> * my_files_2016090120_20160902_060051_xx.csv.gz (129.69 MB)
> * my_files_2016090121_20160902_060051_xx.csv.gz (151.7 MB)
> * my_files_2016090122_20160902_060051_xx.csv.gz (346.46 MB)
> * my_files_2016090122_20160902_060051_xx.csv.gz (222.57 MB)
> This code should take them all:
> {code:python}
> beam.io.ReadFromText(
>   "gs://_folder1/my_files_20160901*.csv.gz",
>   skip_header_lines=1,
>   compression_type=beam.io.filesystem.CompressionTypes.GZIP
>   )
> {code}
> It runs without errors, but the pipeline outputs only a 288.62 MB file
> (instead of a 1.5 GB file).
> The whole pipeline code:
> {code:python}
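> import apache_beam as beam
>
> # p is an existing beam.Pipeline object (its construction is not shown here).
> data = (
>     p
>     | 'ReadMyFiles' >> beam.io.ReadFromText(
>         "gs://_folder1/my_files_20160901*.csv.gz",
>         skip_header_lines=1,
>         compression_type=beam.io.filesystem.CompressionTypes.GZIP)
>     # With default settings ReadFromText already emits one element per
>     # line, so this split passes each line through unchanged.
>     | 'SplitLines' >> beam.FlatMap(lambda x: x.split('\n'))
> )
> output = (
>     data
>     | "Write" >> beam.io.WriteToText(
>         'gs://XXX_folder2/test.csv', num_shards=1)
> )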
> {code}
> Dataflow indicates that the estimated size of the output after the
> ReadFromText step is only 602.29 MB, which corresponds neither to the size of
> any single input file nor to the total size of the files matching the pattern.
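
As a quick cross-check of the numbers in the description above, the listed
file names can be matched against the glob and their sizes summed. This is a
minimal sketch only: it uses Python's standard fnmatch module as a stand-in
for the matching that Beam and GCS perform (an assumption, the two may differ
in corner cases), and the helper name matched_total_mb is hypothetical. Names
and sizes are copied as listed in the issue, including the repeated
2016090122 entry.

```python
from fnmatch import fnmatch

# File names and sizes in MB, copied as listed in the issue description.
LISTED_FILES = [
    ("my_files_2016090116_20160902_060051_xx.csv.gz", 229.25),
    ("my_files_2016090117_20160902_060051_xx.csv.gz", 184.1),
    ("my_files_2016090118_20160902_060051_xx.csv.gz", 171.73),
    ("my_files_2016090119_20160902_060051_xx.csv.gz", 151.34),
    ("my_files_2016090120_20160902_060051_xx.csv.gz", 129.69),
    ("my_files_2016090121_20160902_060051_xx.csv.gz", 151.7),
    ("my_files_2016090122_20160902_060051_xx.csv.gz", 346.46),
    ("my_files_2016090122_20160902_060051_xx.csv.gz", 222.57),
]


def matched_total_mb(pattern, files):
    """Return (number of matching entries, their summed size in MB)."""
    matched = [(name, size) for name, size in files if fnmatch(name, pattern)]
    return len(matched), sum(size for _, size in matched)


count, total_mb = matched_total_mb("my_files_20160901*.csv.gz", LISTED_FILES)
print(count, round(total_mb, 2))
```

All eight listed entries match the pattern and their sizes add up to roughly
1587 MB, in line with the 1.5 GB the reporter expected, so neither the
288.62 MB output nor the 602.29 MB estimate is explained by the pattern
itself.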



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-2140) Fix SplittableDoFn ValidatesRunner tests in FlinkRunner

2017-06-26 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063129#comment-16063129
 ] 

Kenneth Knowles commented on BEAM-2140:
---

Dropping the processing-time timer is the intended behavior when it comes in 
for an expired window.

> Fix SplittableDoFn ValidatesRunner tests in FlinkRunner
> ---
>
> Key: BEAM-2140
> URL: https://issues.apache.org/jira/browse/BEAM-2140
> Project: Beam
> Issue Type: Bug
> Components: runner-flink
> Reporter: Aljoscha Krettek
> Assignee: Aljoscha Krettek
>
> As discovered as part of BEAM-1763, there is a failing SDF test. We disabled 
> the tests to unblock the open PR for BEAM-1763.





Jenkins build is back to normal : beam_PostCommit_Java_ValidatesRunner_Dataflow #3449

2017-06-26 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #4215

2017-06-26 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2514) Improve error message for missing required options of Beam pipeline

2017-06-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063047#comment-16063047
 ] 

ASF GitHub Bot commented on BEAM-2514:
--

GitHub user manuzhang opened a pull request:

https://github.com/apache/beam/pull/3438

[BEAM-2514] Use option name in missing required value message

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/manuzhang/beam BEAM-2514

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3438.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3438


commit ab85c2ce8e9948e340d2019085b6cbb8e7c4
Author: manuzhang 
Date:   2017-06-26T12:55:13Z

[BEAM-2514] Use option name in missing required value message




> Improve error message for missing required options of Beam pipeline
> ---
>
> Key: BEAM-2514
> URL: https://issues.apache.org/jira/browse/BEAM-2514
> Project: Beam
> Issue Type: Improvement
> Components: sdk-java-core
> Affects Versions: 2.0.0
> Reporter: Manu Zhang
> Assignee: Manu Zhang
> Priority: Minor
>






[GitHub] beam pull request #3438: [BEAM-2514] Use option name in missing required val...

2017-06-26 Thread manuzhang
GitHub user manuzhang opened a pull request:

https://github.com/apache/beam/pull/3438

[BEAM-2514] Use option name in missing required value message

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`.
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/manuzhang/beam BEAM-2514

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3438.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3438


commit ab85c2ce8e9948e340d2019085b6cbb8e7c4
Author: manuzhang 
Date:   2017-06-26T12:55:13Z

[BEAM-2514] Use option name in missing required value message




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Spark #2476

2017-06-26 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-688) Failure of beam-sdks-java-maven-archetypes-starter with undeclared dependency error

2017-06-26 Thread Manu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16062985#comment-16062985
 ] 

Manu Zhang commented on BEAM-688:
-

I'm seeing this error again on latest master.

> Failure of beam-sdks-java-maven-archetypes-starter with undeclared dependency 
> error
> ---
>
> Key: BEAM-688
> URL: https://issues.apache.org/jira/browse/BEAM-688
> Project: Beam
> Issue Type: Bug
> Components: build-system
> Reporter: Scott Wegner
> Labels: flake
>
> The starter archetype has flaky dependencies. It is reported to fail reliably
> on repeated installs.
> {noformat}
> [INFO] --- maven-dependency-plugin:2.10:analyze-only (default) @ 
> beam-sdks-java-maven-archetypes-starter ---
> [WARNING] Used undeclared dependencies found:
> [WARNING]org.slf4j:slf4j-api:jar:1.7.14:runtime
> {noformat}





[jira] [Created] (BEAM-2514) Improve error message for missing required options of Beam pipeline

2017-06-26 Thread Manu Zhang (JIRA)
Manu Zhang created BEAM-2514:


 Summary: Improve error message for missing required options of Beam pipeline
 Key: BEAM-2514
 URL: https://issues.apache.org/jira/browse/BEAM-2514
 Project: Beam
 Issue Type: Improvement
 Components: sdk-java-core
 Affects Versions: 2.0.0
 Reporter: Manu Zhang
 Assignee: Manu Zhang
 Priority: Minor







