[jira] [Work logged] (BEAM-5040) BigQueryIO retries infinitely in WriteTable and WriteRename

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5040?focusedWorklogId=129115&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129115
 ]

ASF GitHub Bot logged work on BEAM-5040:


Author: ASF GitHub Bot
Created on: 31/Jul/18 05:05
Start Date: 31/Jul/18 05:05
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #6080: [BEAM-5040] Fix 
retry bug for BigQuery jobs.
URL: https://github.com/apache/beam/pull/6080#issuecomment-409095915
 
 
   Run Java PreCommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129115)
Time Spent: 2h 10m  (was: 2h)

> BigQueryIO retries infinitely in WriteTable and WriteRename
> ---
>
> Key: BEAM-5040
> URL: https://issues.apache.org/jira/browse/BEAM-5040
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.5.0
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> BigQueryIO retries infinitely in WriteTable and WriteRename
> Several failure scenarios with the current code:
>  # It's possible for a load job to return failure even though it actually 
> succeeded (e.g. the reply might have timed out). In this case, BigQueryIO 
> will retry the job which will fail again (because the job id has already been 
> used), leading to indefinite retries. Correct behavior is to stop retrying as 
> the load job has succeeded.
>  # It's possible for a load job to be accepted by BigQuery, but then to fail 
> on the BigQuery side. In this case a retry with the same job id will fail as 
> that job id has already been used. BigQueryIO will sometimes detect this, but 
> if the worker has restarted it will instead issue a load with the old job id 
> and go into a retry loop. Correct behavior is to generate a new deterministic 
> job id and retry using that new job id.
>  # In many cases of worker restart, BigQueryIO ends up in infinite retry 
> loops.
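
A minimal illustrative sketch (in Java) of the retry policy the description above calls for: derive each attempt's job id deterministically, and on an apparent submission failure check the job's real state before retrying, so a lost reply is treated as success and a genuinely failed id is replaced by the next deterministic id. The JobService and JobState names below are hypothetical stand-ins, not Beam's actual BigQueryIO classes (the real change lives in WriteTable/WriteRename and BigQueryHelpers.getRetryJobId).

  import java.io.IOException;

  public class RetryJobSketch {

    enum JobState { SUCCEEDED, FAILED, UNKNOWN }

    // Hypothetical stand-in for the service that submits and inspects load/copy jobs.
    interface JobService {
      JobState pollJob(String jobId);
      void startJob(String jobId) throws IOException;
    }

    static boolean runWithDeterministicRetries(
        JobService service, String jobIdPrefix, int maxRetryJobs) throws InterruptedException {
      int attempt = 0;
      for (int polls = 0; polls < maxRetryJobs; ++polls) {
        // Deterministic id: the same (prefix, attempt) pair always yields the same job id,
        // so a restarted worker re-issues the identical request instead of minting a new job.
        String jobId = jobIdPrefix + "-" + attempt;
        try {
          service.startJob(jobId);
        } catch (IOException e) {
          // The request may have reached the service even though the reply was lost;
          // fall through and poll the job's real state instead of blindly resubmitting.
        }
        switch (service.pollJob(jobId)) {
          case SUCCEEDED:
            return true;        // scenario 1: the earlier "failure" was only a lost reply
          case FAILED:
            attempt++;          // scenario 2: this id is burned server-side; move to the next id
            break;
          default:
            Thread.sleep(1000); // still pending or not yet visible; poll the same id again
        }
      }
      return false;             // bounded attempts: no infinite retry loop
    }
  }

In the actual fix (see the WriteRename diff quoted further down in this thread), getRetryJobId plays the role of the poll-and-decide step.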



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5040) BigQueryIO retries infinitely in WriteTable and WriteRename

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5040?focusedWorklogId=129114&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129114
 ]

ASF GitHub Bot logged work on BEAM-5040:


Author: ASF GitHub Bot
Created on: 31/Jul/18 04:47
Start Date: 31/Jul/18 04:47
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #6080: [BEAM-5040] Fix 
retry bug for BigQuery jobs.
URL: https://github.com/apache/beam/pull/6080#issuecomment-409093386
 
 
   run Dataflow ValidatesRunner


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129114)
Time Spent: 2h  (was: 1h 50m)

> BigQueryIO retries infinitely in WriteTable and WriteRename
> ---
>
> Key: BEAM-5040
> URL: https://issues.apache.org/jira/browse/BEAM-5040
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.5.0
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> BigQueryIO retries infinitely in WriteTable and WriteRename
> Several failure scenarios with the current code:
>  # It's possible for a load job to return failure even though it actually 
> succeeded (e.g. the reply might have timed out). In this case, BigQueryIO 
> will retry the job which will fail again (because the job id has already been 
> used), leading to indefinite retries. Correct behavior is to stop retrying as 
> the load job has succeeded.
>  # It's possible for a load job to be accepted by BigQuery, but then to fail 
> on the BigQuery side. In this case a retry with the same job id will fail as 
> that job id has already been used. BigQueryIO will sometimes detect this, but 
> if the worker has restarted it will instead issue a load with the old job id 
> and go into a retry loop. Correct behavior is to generate a new deterministic 
> job id and retry using that new job id.
>  # In many cases of worker restart, BigQueryIO ends up in infinite retry 
> loops.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4793) Enable schemas for all runners

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4793?focusedWorklogId=129107&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129107
 ]

ASF GitHub Bot logged work on BEAM-4793:


Author: ASF GitHub Bot
Created on: 31/Jul/18 04:07
Start Date: 31/Jul/18 04:07
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #6103: [BEAM-4793] Enable 
schemas for Dataflow runner.
URL: https://github.com/apache/beam/pull/6103#issuecomment-409087977
 
 
   Run Java PreCommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129107)
Time Spent: 1h 10m  (was: 1h)

> Enable schemas for all runners
> --
>
> Key: BEAM-4793
> URL: https://issues.apache.org/jira/browse/BEAM-4793
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Currently schemas are only enabled in the direct runner.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4793) Enable schemas for all runners

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4793?focusedWorklogId=129106&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129106
 ]

ASF GitHub Bot logged work on BEAM-4793:


Author: ASF GitHub Bot
Created on: 31/Jul/18 04:06
Start Date: 31/Jul/18 04:06
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #6103: [BEAM-4793] Enable 
schemas for Dataflow runner.
URL: https://github.com/apache/beam/pull/6103#issuecomment-409087932
 
 
   Run Java PreCommit
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129106)
Time Spent: 1h  (was: 50m)

> Enable schemas for all runners
> --
>
> Key: BEAM-4793
> URL: https://issues.apache.org/jira/browse/BEAM-4793
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently schemas are only enabled in the direct runner.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4793) Enable schemas for all runners

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4793?focusedWorklogId=129103&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129103
 ]

ASF GitHub Bot logged work on BEAM-4793:


Author: ASF GitHub Bot
Created on: 31/Jul/18 03:48
Start Date: 31/Jul/18 03:48
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #6103: [BEAM-4793] Enable 
schemas for Dataflow runner.
URL: https://github.com/apache/beam/pull/6103#issuecomment-409085372
 
 
   Run Java PreCommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129103)
Time Spent: 50m  (was: 40m)

> Enable schemas for all runners
> --
>
> Key: BEAM-4793
> URL: https://issues.apache.org/jira/browse/BEAM-4793
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Currently schemas are only enabled in the direct runner.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4006) Futurize and fix python 2 compatibility for transforms subpackage

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4006?focusedWorklogId=129100&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129100
 ]

ASF GitHub Bot logged work on BEAM-4006:


Author: ASF GitHub Bot
Created on: 31/Jul/18 03:25
Start Date: 31/Jul/18 03:25
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on a change in pull request #5729: 
[BEAM-4006] Futurize transforms subpackage
URL: https://github.com/apache/beam/pull/5729#discussion_r206385144
 
 

 ##
 File path: sdks/python/apache_beam/transforms/window.py
 ##
 @@ -246,10 +263,33 @@ def __init__(self, value, timestamp):
 self.value = value
 self.timestamp = Timestamp.of(timestamp)
 
-  def __cmp__(self, other):
-if type(self) is not type(other):
-  return cmp(type(self), type(other))
-return cmp((self.value, self.timestamp), (other.value, other.timestamp))
+  def __eq__(self, other):
 
 Review comment:
   With this PR, the test becomes flaky; in other words, it only passes sometimes. It 
may still flake if we implement all ops manually - did you try running the test 
multiple times when all ops are implemented?
   
   I don't understand yet what change in behavior triggers this (we should find 
out), but I think we need to fix the test regardless of this: 
https://github.com/apache/beam/pull/6104.
   
   Performance-wise, last week I used 
https://github.com/apache/beam/compare/master...tvalentyn:transforms_microbenchmark
 to compare different options, and did not notice a significant difference when 
using @total_ordering (we could double check), so I favored the decorator to 
reduce the boilerplate. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129100)
Time Spent: 13.5h  (was: 13h 20m)

> Futurize and fix python 2 compatibility for transforms subpackage
> -
>
> Key: BEAM-4006
> URL: https://issues.apache.org/jira/browse/BEAM-4006
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Matthias Feys
>Priority: Major
>  Time Spent: 13.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4006) Futurize and fix python 2 compatibility for transforms subpackage

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4006?focusedWorklogId=129099&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129099
 ]

ASF GitHub Bot logged work on BEAM-4006:


Author: ASF GitHub Bot
Created on: 31/Jul/18 03:25
Start Date: 31/Jul/18 03:25
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on a change in pull request #5729: 
[BEAM-4006] Futurize transforms subpackage
URL: https://github.com/apache/beam/pull/5729#discussion_r206385144
 
 

 ##
 File path: sdks/python/apache_beam/transforms/window.py
 ##
 @@ -246,10 +263,33 @@ def __init__(self, value, timestamp):
 self.value = value
 self.timestamp = Timestamp.of(timestamp)
 
-  def __cmp__(self, other):
-if type(self) is not type(other):
-  return cmp(type(self), type(other))
-return cmp((self.value, self.timestamp), (other.value, other.timestamp))
+  def __eq__(self, other):
 
 Review comment:
   With this PR, the test becomes flaky; in other words, it only passes sometimes. It 
may still flake if we implement all ops manually - did you try running it 
multiple times?
   
   I don't understand yet what change in behavior triggers this (we should find 
out), but I think we need to fix the test regardless of this: 
https://github.com/apache/beam/pull/6104.
   
   Performance-wise, last week I used 
https://github.com/apache/beam/compare/master...tvalentyn:transforms_microbenchmark
 to compare different options, and did not notice a significant difference when 
using @total_ordering (we could double check), so I favored the decorator to 
reduce the boilerplate. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129099)
Time Spent: 13h 20m  (was: 13h 10m)

> Futurize and fix python 2 compatibility for transforms subpackage
> -
>
> Key: BEAM-4006
> URL: https://issues.apache.org/jira/browse/BEAM-4006
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Matthias Feys
>Priority: Major
>  Time Spent: 13h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4793) Enable schemas for all runners

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4793?focusedWorklogId=129098&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129098
 ]

ASF GitHub Bot logged work on BEAM-4793:


Author: ASF GitHub Bot
Created on: 31/Jul/18 03:24
Start Date: 31/Jul/18 03:24
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #6103: [BEAM-4793] Enable 
schemas for Dataflow runner.
URL: https://github.com/apache/beam/pull/6103#issuecomment-409082080
 
 
   Run Java PreCommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129098)
Time Spent: 40m  (was: 0.5h)

> Enable schemas for all runners
> --
>
> Key: BEAM-4793
> URL: https://issues.apache.org/jira/browse/BEAM-4793
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently schemas are only enabled in the direct runner.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Java_GradleBuild #1124

2018-07-30 Thread Apache Jenkins Server
See 


Changes:

[tweise] [BEAM-2930] Side inputs are not yet supported in streaming mode.

--
[...truncated 17.88 MB...]
Jul 31, 2018 2:44:01 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Batch 
mutations together as step s39
Jul 31, 2018 2:44:01 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Write 
mutations to Spanner as step s40
Jul 31, 2018 2:44:01 AM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: Staging pipeline description to 
gs://temp-storage-for-end-to-end-tests/spannerwriteit0testreportfailures-jenkins-0731024357-352539dc/output/results/staging/
Jul 31, 2018 2:44:01 AM org.apache.beam.runners.dataflow.util.PackageUtil 
tryStagePackage
INFO: Uploading <115993 bytes, hash zP-XwZrdU46-Vtp-Z_6wjA> to 
gs://temp-storage-for-end-to-end-tests/spannerwriteit0testreportfailures-jenkins-0731024357-352539dc/output/results/staging/pipeline-zP-XwZrdU46-Vtp-Z_6wjA.pb

org.apache.beam.sdk.io.gcp.spanner.SpannerWriteIT > testReportFailures 
STANDARD_OUT
Dataflow SDK version: 2.7.0-SNAPSHOT

org.apache.beam.sdk.io.gcp.spanner.SpannerWriteIT > testReportFailures 
STANDARD_ERROR
Jul 31, 2018 2:44:02 AM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: To access the Dataflow monitoring console, please navigate to 
https://console.cloud.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-07-30_19_44_01-13649990934639948326?project=apache-beam-testing

org.apache.beam.sdk.io.gcp.spanner.SpannerWriteIT > testReportFailures 
STANDARD_OUT
Submitted job: 2018-07-30_19_44_01-13649990934639948326

org.apache.beam.sdk.io.gcp.spanner.SpannerWriteIT > testReportFailures 
STANDARD_ERROR
Jul 31, 2018 2:44:02 AM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: To cancel the job using the 'gcloud' tool, run:
> gcloud dataflow jobs --project=apache-beam-testing cancel 
--region=us-central1 2018-07-30_19_44_01-13649990934639948326
Jul 31, 2018 2:44:02 AM org.apache.beam.runners.dataflow.TestDataflowRunner 
run
INFO: Running Dataflow job 2018-07-30_19_44_01-13649990934639948326 with 0 
expected assertions.
Jul 31, 2018 2:44:12 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T02:44:01.756Z: Autoscaling is enabled for job 
2018-07-30_19_44_01-13649990934639948326. The number of workers will be between 
1 and 1000.
Jul 31, 2018 2:44:12 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T02:44:01.789Z: Autoscaling was automatically enabled for 
job 2018-07-30_19_44_01-13649990934639948326.
Jul 31, 2018 2:44:12 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T02:44:04.485Z: Checking required Cloud APIs are enabled.
Jul 31, 2018 2:44:12 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T02:44:04.663Z: Checking permissions granted to controller 
Service Account.
Jul 31, 2018 2:44:12 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T02:44:08.845Z: Worker configuration: n1-standard-1 in 
us-central1-b.
Jul 31, 2018 2:44:12 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T02:44:09.227Z: Expanding CoGroupByKey operations into 
optimizable parts.
Jul 31, 2018 2:44:12 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T02:44:09.614Z: Expanding GroupByKey operations into 
optimizable parts.
Jul 31, 2018 2:44:12 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T02:44:09.651Z: Lifting ValueCombiningMappingFns into 
MergeBucketsMappingFns
Jul 31, 2018 2:44:12 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T02:44:09.846Z: Fusing adjacent ParDo, Read, Write, and 
Flatten operations
Jul 31, 2018 2:44:12 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T02:44:09.876Z: Elided trivial flatten 
Jul 31, 2018 2:44:12 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T02:44:09.908Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/Wait/Map into SpannerIO.Write/Write 
mutations to Cloud Spanner/Create seed/Read(CreateSource)
Jul 31, 2018 2:44:12 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T02:44:09.943Z: Fusing consumer SpannerIO.Write/Write 

Jenkins build is back to normal : beam_PerformanceTests_XmlIOIT #575

2018-07-30 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : beam_PerformanceTests_Compressed_TextIOIT_HDFS #480

2018-07-30 Thread Apache Jenkins Server
See 




[jira] [Closed] (BEAM-5039) python postcommit broken in call to WriteToPubSub

2018-07-30 Thread Udi Meiri (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Udi Meiri closed BEAM-5039.
---
   Resolution: Fixed
Fix Version/s: Not applicable

Green since https://builds.apache.org/job/beam_PostCommit_Python_Verify/5613/

> python postcommit broken in call to WriteToPubSub
> -
>
> Key: BEAM-5039
> URL: https://issues.apache.org/jira/browse/BEAM-5039
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> ERROR: test_streaming_wordcount_it 
> (apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT)
> --
> Traceback (most recent call last):
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/apache_beam/examples/streaming_wordcount_it_test.py",
>  line 105, in test_streaming_wordcount_it
> self.test_pipeline.get_full_options_as_args(**extra_opts))
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/apache_beam/examples/streaming_wordcount.py",
>  line 90, in run
> output | beam.io.WriteToPubSub(known_args.output_topic)
> TypeError: __init__() takes at least 3 arguments (2 given)
> https://builds.apache.org/job/beam_PostCommit_Python_Verify/5597/consoleText



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2930) Flink support for portable side input

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2930?focusedWorklogId=129086&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129086
 ]

ASF GitHub Bot logged work on BEAM-2930:


Author: ASF GitHub Bot
Created on: 31/Jul/18 02:02
Start Date: 31/Jul/18 02:02
Worklog Time Spent: 10m 
  Work Description: tweise closed pull request #6082: [BEAM-2930] Side 
inputs are not yet supported in streaming mode.
URL: https://github.com/apache/beam/pull/6082
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingPortablePipelineTranslator.java
 
b/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingPortablePipelineTranslator.java
index d376e58ec4b..0a48edb8d6b 100644
--- 
a/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingPortablePipelineTranslator.java
+++ 
b/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingPortablePipelineTranslator.java
@@ -433,17 +433,18 @@ private void translateImpulse(
   throw new RuntimeException(e);
 }
 
-String inputPCollectionId = 
Iterables.getOnlyElement(transform.getInputsMap().values());
+String inputPCollectionId = stagePayload.getInput();
+// TODO: https://issues.apache.org/jira/browse/BEAM-2930
+if (stagePayload.getSideInputsCount() > 0) {
+  throw new UnsupportedOperationException(
+  "[BEAM-2930] streaming translator does not support side inputs: " + 
transform);
+}
 
 Map, OutputTag>> tagsToOutputTags = 
Maps.newLinkedHashMap();
 Map, Coder>> tagsToCoders = 
Maps.newLinkedHashMap();
 // TODO: does it matter which output we designate as "main"
-TupleTag mainOutputTag;
-if (!outputs.isEmpty()) {
-  mainOutputTag = new TupleTag(outputs.keySet().iterator().next());
-} else {
-  mainOutputTag = null;
-}
+final TupleTag mainOutputTag =
+outputs.isEmpty() ? null : new 
TupleTag(outputs.keySet().iterator().next());
 
 // associate output tags with ids, output manager uses these Integer ids 
to serialize state
 BiMap outputIndexMap =


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129086)
Time Spent: 2h  (was: 1h 50m)

> Flink support for portable side input
> -
>
> Key: BEAM-2930
> URL: https://issues.apache.org/jira/browse/BEAM-2930
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 2h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[beam] branch master updated: [BEAM-2930] Side inputs are not yet supported in streaming mode.

2018-07-30 Thread thw
This is an automated email from the ASF dual-hosted git repository.

thw pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
 new 53400a1  [BEAM-2930] Side inputs are not yet supported in streaming 
mode.
53400a1 is described below

commit 53400a1c3e739502d1b854867c77ede6d9f94da5
Author: Thomas Weise 
AuthorDate: Thu Jul 26 19:55:27 2018 -0700

[BEAM-2930] Side inputs are not yet supported in streaming mode.
---
 .../flink/FlinkStreamingPortablePipelineTranslator.java   | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git 
a/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingPortablePipelineTranslator.java
 
b/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingPortablePipelineTranslator.java
index d376e58..0a48edb 100644
--- 
a/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingPortablePipelineTranslator.java
+++ 
b/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingPortablePipelineTranslator.java
@@ -433,17 +433,18 @@ public class FlinkStreamingPortablePipelineTranslator
   throw new RuntimeException(e);
 }
 
-String inputPCollectionId = 
Iterables.getOnlyElement(transform.getInputsMap().values());
+String inputPCollectionId = stagePayload.getInput();
+// TODO: https://issues.apache.org/jira/browse/BEAM-2930
+if (stagePayload.getSideInputsCount() > 0) {
+  throw new UnsupportedOperationException(
+  "[BEAM-2930] streaming translator does not support side inputs: " + 
transform);
+}
 
 Map, OutputTag>> tagsToOutputTags = 
Maps.newLinkedHashMap();
 Map, Coder>> tagsToCoders = 
Maps.newLinkedHashMap();
 // TODO: does it matter which output we designate as "main"
-TupleTag mainOutputTag;
-if (!outputs.isEmpty()) {
-  mainOutputTag = new TupleTag(outputs.keySet().iterator().next());
-} else {
-  mainOutputTag = null;
-}
+final TupleTag mainOutputTag =
+outputs.isEmpty() ? null : new 
TupleTag(outputs.keySet().iterator().next());
 
 // associate output tags with ids, output manager uses these Integer ids 
to serialize state
 BiMap outputIndexMap =



[jira] [Work logged] (BEAM-3906) Get Python Wheel Validation Automated

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3906?focusedWorklogId=129085&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129085
 ]

ASF GitHub Bot logged work on BEAM-3906:


Author: ASF GitHub Bot
Created on: 31/Jul/18 02:00
Start Date: 31/Jul/18 02:00
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on issue #4943: [BEAM-3906] Automate 
Validation Aganist Python Wheel
URL: https://github.com/apache/beam/pull/4943#issuecomment-409068965
 
 
   @aaltay @boyuanzz  Any other comments on this PR?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129085)
Time Spent: 27h 50m  (was: 27h 40m)

> Get Python Wheel Validation Automated
> -
>
> Key: BEAM-3906
> URL: https://issues.apache.org/jira/browse/BEAM-3906
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-python, testing
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 27h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3906) Get Python Wheel Validation Automated

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3906?focusedWorklogId=129084&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129084
 ]

ASF GitHub Bot logged work on BEAM-3906:


Author: ASF GitHub Bot
Created on: 31/Jul/18 01:59
Start Date: 31/Jul/18 01:59
Worklog Time Spent: 10m 
  Work Description: yifanzou removed a comment on issue #4943: [BEAM-3906] 
Automate Validation Aganist Python Wheel
URL: https://github.com/apache/beam/pull/4943#issuecomment-408555184
 
 
   @aaltay 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129084)
Time Spent: 27h 40m  (was: 27.5h)

> Get Python Wheel Validation Automated
> -
>
> Key: BEAM-3906
> URL: https://issues.apache.org/jira/browse/BEAM-3906
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-python, testing
>Reporter: yifan zou
>Assignee: yifan zou
>Priority: Major
>  Time Spent: 27h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Java_GradleBuild #1123

2018-07-30 Thread Apache Jenkins Server
See 


Changes:

[tweise] [BEAM-5023] Send HarnessId in DataClient (#6066)

--
[...truncated 18.33 MB...]
Jul 31, 2018 1:34:31 AM org.apache.beam.runners.dataflow.TestDataflowRunner 
run
INFO: Running Dataflow job 2018-07-30_18_34_30-11522354173662062825 with 0 
expected assertions.
Jul 31, 2018 1:34:41 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T01:34:30.201Z: Autoscaling is enabled for job 
2018-07-30_18_34_30-11522354173662062825. The number of workers will be between 
1 and 1000.
Jul 31, 2018 1:34:41 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T01:34:30.243Z: Autoscaling was automatically enabled for 
job 2018-07-30_18_34_30-11522354173662062825.
Jul 31, 2018 1:34:41 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T01:34:32.920Z: Checking required Cloud APIs are enabled.
Jul 31, 2018 1:34:41 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T01:34:33.113Z: Checking permissions granted to controller 
Service Account.
Jul 31, 2018 1:34:41 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T01:34:36.683Z: Worker configuration: n1-standard-1 in 
us-central1-b.
Jul 31, 2018 1:34:41 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T01:34:37.155Z: Expanding CoGroupByKey operations into 
optimizable parts.
Jul 31, 2018 1:34:41 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T01:34:37.410Z: Expanding GroupByKey operations into 
optimizable parts.
Jul 31, 2018 1:34:41 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T01:34:37.456Z: Lifting ValueCombiningMappingFns into 
MergeBucketsMappingFns
Jul 31, 2018 1:34:41 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T01:34:37.722Z: Fusing adjacent ParDo, Read, Write, and 
Flatten operations
Jul 31, 2018 1:34:41 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T01:34:37.758Z: Elided trivial flatten 
Jul 31, 2018 1:34:41 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T01:34:37.804Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/Wait/Map into SpannerIO.Write/Write 
mutations to Cloud Spanner/Create seed/Read(CreateSource)
Jul 31, 2018 1:34:41 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T01:34:37.842Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Read information schema into SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/Wait/Map
Jul 31, 2018 1:34:41 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T01:34:37.886Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/BatchViewOverrides.GroupByWindowHashAsKeyAndWindowAsSortKey/BatchViewOverrides.GroupByKeyAndSortValuesOnly/Write
 into SpannerIO.Write/Write mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/BatchViewOverrides.GroupByWindowHashAsKeyAndWindowAsSortKey/ParDo(UseWindowHashAsKeyAndWindowAsSortKey)
Jul 31, 2018 1:34:41 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T01:34:37.928Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/ParDo(IsmRecordForSingularValuePerWindow) 
into SpannerIO.Write/Write mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/BatchViewOverrides.GroupByWindowHashAsKeyAndWindowAsSortKey/BatchViewOverrides.GroupByKeyAndSortValuesOnly/Read
Jul 31, 2018 1:34:41 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T01:34:37.968Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/WithKeys/AddKeys/Map
 into SpannerIO.Write/Write mutations to Cloud Spanner/Read information schema
Jul 31, 2018 1:34:41 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T01:34:38.006Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/Combine.globally(Singleton)/Combine.perKey(Singleton)/Combine.GroupedValues
 into SpannerIO.Write/Write mutations to Cloud Spanner/Schema 

[jira] [Work logged] (BEAM-2930) Flink support for portable side input

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2930?focusedWorklogId=129082&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129082
 ]

ASF GitHub Bot logged work on BEAM-2930:


Author: ASF GitHub Bot
Created on: 31/Jul/18 01:35
Start Date: 31/Jul/18 01:35
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #6082: [BEAM-2930] Side 
inputs are not yet supported in streaming mode.
URL: https://github.com/apache/beam/pull/6082#issuecomment-409065086
 
 
   Run Java Precommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129082)
Time Spent: 1h 50m  (was: 1h 40m)

> Flink support for portable side input
> -
>
> Key: BEAM-2930
> URL: https://issues.apache.org/jira/browse/BEAM-2930
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=129079&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129079
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 31/Jul/18 01:26
Start Date: 31/Jul/18 01:26
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #6073: 
[BEAM-4176] Validate Runner Tests generalization and enable for local reference 
runner
URL: https://github.com/apache/beam/pull/6073#discussion_r206368150
 
 

 ##
 File path: 
buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy
 ##
 @@ -1389,5 +1402,39 @@ artifactId=${project.name}
 args argsNeeded
   }
 }
+
+
+/** 
***/
+
+// Method to create the PortableValidatesRunnerTask.
+project.ext.createPortableValidatesRunnerTask = {
+  def config = it ? it as PortableValidatesRunnerConfig : new 
PortableValidatesRunnerConfig()
+  def name = config.name ? config.name : "validatesPortableRunner"
+  def testCategories = {
+includeCategories 'org.apache.beam.sdk.testing.ValidatesRunner'
+excludeCategories 
'org.apache.beam.sdk.testing.FlattenWithHeterogeneousCoders'
+excludeCategories 'org.apache.beam.sdk.testing.LargeKeys$Above100MB'
+excludeCategories 'org.apache.beam.sdk.testing.UsesCommittedMetrics'
+excludeCategories 'org.apache.beam.sdk.testing.UsesGaugeMetrics'
+excludeCategories 'org.apache.beam.sdk.testing.UsesDistributionMetrics'
+excludeCategories 'org.apache.beam.sdk.testing.UsesAttemptedMetrics'
+excludeCategories 'org.apache.beam.sdk.testing.UsesTimersInParDo'
+excludeCategories 'org.apache.beam.sdk.testing.UsesTestStream'
+  }
+  testCategories = config.testCategories ? config.testCategories : 
testCategories
+  project.tasks.create(name: name, type: Test) {
+group = "Verification"
+description = "Validates the PortableRunner with JobServer 
${config.jobServerDriver}"
+systemProperty "beamTestPipelineOptions", JsonOutput.toJson([
+  
"--runner=org.apache.beam.runners.reference.testing.TestPortableRunner",
+  "--jobServerDriver=${config.jobServerDriver}",
+  config.jobServerConfig ? 
"--jobServerConfig=${config.jobServerConfig}" : "",
+])
+classpath = project.configurations.validatesPortableRunner
 
 Review comment:
   I will make it pluggable and document it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129079)
Time Spent: 7h 50m  (was: 7h 40m)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Priority: Major
>  Time Spent: 7h 50m
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=129077&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129077
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 31/Jul/18 01:26
Start Date: 31/Jul/18 01:26
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #6073: 
[BEAM-4176] Validate Runner Tests generalization and enable for local reference 
runner
URL: https://github.com/apache/beam/pull/6073#discussion_r206369797
 
 

 ##
 File path: 
buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy
 ##
 @@ -1389,5 +1402,39 @@ artifactId=${project.name}
 args argsNeeded
   }
 }
+
+
+/** 
***/
+
+// Method to create the PortableValidatesRunnerTask.
+project.ext.createPortableValidatesRunnerTask = {
+  def config = it ? it as PortableValidatesRunnerConfig : new 
PortableValidatesRunnerConfig()
+  def name = config.name ? config.name : "validatesPortableRunner"
+  def testCategories = {
+includeCategories 'org.apache.beam.sdk.testing.ValidatesRunner'
+excludeCategories 
'org.apache.beam.sdk.testing.FlattenWithHeterogeneousCoders'
+excludeCategories 'org.apache.beam.sdk.testing.LargeKeys$Above100MB'
+excludeCategories 'org.apache.beam.sdk.testing.UsesCommittedMetrics'
+excludeCategories 'org.apache.beam.sdk.testing.UsesGaugeMetrics'
+excludeCategories 'org.apache.beam.sdk.testing.UsesDistributionMetrics'
+excludeCategories 'org.apache.beam.sdk.testing.UsesAttemptedMetrics'
+excludeCategories 'org.apache.beam.sdk.testing.UsesTimersInParDo'
+excludeCategories 'org.apache.beam.sdk.testing.UsesTestStream'
+  }
+  testCategories = config.testCategories ? config.testCategories : 
testCategories
+  project.tasks.create(name: name, type: Test) {
+group = "Verification"
+description = "Validates the PortableRunner with JobServer 
${config.jobServerDriver}"
+systemProperty "beamTestPipelineOptions", JsonOutput.toJson([
+  
"--runner=org.apache.beam.runners.reference.testing.TestPortableRunner",
+  "--jobServerDriver=${config.jobServerDriver}",
+  config.jobServerConfig ? 
"--jobServerConfig=${config.jobServerConfig}" : "",
 
 Review comment:
   Done


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129077)
Time Spent: 7.5h  (was: 7h 20m)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Priority: Major
>  Time Spent: 7.5h
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=129076&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129076
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 31/Jul/18 01:26
Start Date: 31/Jul/18 01:26
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #6073: 
[BEAM-4176] Validate Runner Tests generalization and enable for local reference 
runner
URL: https://github.com/apache/beam/pull/6073#discussion_r206367704
 
 

 ##
 File path: 
buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy
 ##
 @@ -1389,5 +1402,39 @@ artifactId=${project.name}
 args argsNeeded
   }
 }
+
+
+/** 
***/
+
+// Method to create the PortableValidatesRunnerTask.
+project.ext.createPortableValidatesRunnerTask = {
+  def config = it ? it as PortableValidatesRunnerConfig : new 
PortableValidatesRunnerConfig()
+  def name = config.name ? config.name : "validatesPortableRunner"
+  def testCategories = {
+includeCategories 'org.apache.beam.sdk.testing.ValidatesRunner'
 
 Review comment:
   The runner can provide test categories in config.testCategories, so it can 
control the appropriate test category.
   Should I add only the exclude categories in the parameters?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129076)
Time Spent: 7.5h  (was: 7h 20m)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Priority: Major
>  Time Spent: 7.5h
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=129078&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129078
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 31/Jul/18 01:26
Start Date: 31/Jul/18 01:26
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #6073: 
[BEAM-4176] Validate Runner Tests generalization and enable for local reference 
runner
URL: https://github.com/apache/beam/pull/6073#discussion_r206371013
 
 

 ##
 File path: runners/direct-java/build.gradle
 ##
 @@ -147,3 +156,5 @@ createJavaExamplesArchetypeValidationTask(type: 
'MobileGaming',
   gcsBucket: gcsBucket,
   bqDataset: bqDataset,
   pubsubTopic: pubsubTopic)
+
+createPortableValidatesRunnerTask(name: "validatesPortableRunner", 
jobServerDriver: 
"org.apache.beam.runners.direct.portable.job.ReferenceRunnerJobServer")
 
 Review comment:
   done


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129078)
Time Spent: 7h 40m  (was: 7.5h)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Priority: Major
>  Time Spent: 7h 40m
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=129080&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129080
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 31/Jul/18 01:26
Start Date: 31/Jul/18 01:26
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #6073: 
[BEAM-4176] Validate Runner Tests generalization and enable for local reference 
runner
URL: https://github.com/apache/beam/pull/6073#discussion_r206369028
 
 

 ##
 File path: 
buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy
 ##
 @@ -184,6 +185,18 @@ class BeamModulePlugin implements Plugin {
 String tag = null // Sets the image tag (optional).
   }
 
+  // A class defining the configuration for PortableValidatesRunner.
+  class PortableValidatesRunnerConfig {
 
 Review comment:
   Done


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129080)
Time Spent: 8h  (was: 7h 50m)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Priority: Major
>  Time Spent: 8h
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is back to normal : beam_PostCommit_Py_VR_Dataflow #664

2018-07-30 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PerformanceTests_XmlIOIT #574

2018-07-30 Thread Apache Jenkins Server
See 


Changes:

[ehudm] Make with_attributes kwarg optional.

[ehudm] Add decode and encode steps to streaming_wordcount

[qinyeli] Interactive Beam -- display update

[ehudm] Fix PubSubMessageMatcher not acking messages.

[pablo] Add GitScm poll trigger for post-commit tests.

[tweise] [BEAM-5023] Send HarnessId in DataClient (#6066)

--
[...truncated 252.95 KB...]
INFO: 2018-07-31T01:03:18.459Z: Fusing consumer Write xml 
files/WriteFiles/FinalizeTempFileBundles/Reshuffle.ViaRandomKey/Reshuffle/ExpandIterable
 into Write xml 
files/WriteFiles/FinalizeTempFileBundles/Reshuffle.ViaRandomKey/Reshuffle/GroupByKey/GroupByWindow
Jul 31, 2018 1:03:25 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T01:03:18.513Z: Fusing consumer Write xml 
files/WriteFiles/GatherTempFileResults/Reshuffle/GroupByKey/GroupByWindow into 
Write xml files/WriteFiles/GatherTempFileResults/Reshuffle/GroupByKey/Read
Jul 31, 2018 1:03:25 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T01:03:18.559Z: Fusing consumer Write xml 
files/WriteFiles/GatherTempFileResults/Reshuffle/GroupByKey/Reify into Write 
xml files/WriteFiles/GatherTempFileResults/Reshuffle/Window.Into()/Window.Assign
Jul 31, 2018 1:03:25 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T01:03:18.605Z: Fusing consumer Write xml 
files/WriteFiles/FinalizeTempFileBundles/Finalize into Write xml 
files/WriteFiles/GatherTempFileResults/Reshuffle.ViaRandomKey/Values/Values/Map
Jul 31, 2018 1:03:25 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T01:03:18.645Z: Fusing consumer Write xml 
files/WriteFiles/FinalizeTempFileBundles/Reshuffle.ViaRandomKey/Pair with 
random key into Write xml files/WriteFiles/FinalizeTempFileBundles/Finalize
Jul 31, 2018 1:03:25 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T01:03:18.694Z: Fusing consumer Read xml 
files/ReadAllViaFileBasedSource/Read ranges into Read xml 
files/ReadAllViaFileBasedSource/Reshuffle/Values/Values/Map
Jul 31, 2018 1:03:25 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T01:03:18.742Z: Fusing consumer Get file names/Values/Map 
into Write xml 
files/WriteFiles/FinalizeTempFileBundles/Reshuffle.ViaRandomKey/Values/Values/Map
Jul 31, 2018 1:03:25 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T01:03:18.790Z: Fusing consumer Write xml 
files/WriteFiles/FinalizeTempFileBundles/Reshuffle.ViaRandomKey/Values/Values/Map
 into Write xml 
files/WriteFiles/FinalizeTempFileBundles/Reshuffle.ViaRandomKey/Reshuffle/ExpandIterable
Jul 31, 2018 1:03:25 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T01:03:18.842Z: Fusing consumer Write xml 
files/WriteFiles/GatherTempFileResults/Reshuffle/GroupByKey/Write into Write 
xml files/WriteFiles/GatherTempFileResults/Reshuffle/GroupByKey/Reify
Jul 31, 2018 1:03:25 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T01:03:18.893Z: Fusing consumer Read xml 
files/ReadAllViaFileBasedSource/Reshuffle/Reshuffle/GroupByKey/Write into Read 
xml files/ReadAllViaFileBasedSource/Reshuffle/Reshuffle/GroupByKey/Reify
Jul 31, 2018 1:03:25 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T01:03:18.947Z: Fusing consumer Read xml 
files/ReadAllViaFileBasedSource/Reshuffle/Reshuffle/GroupByKey/Reify into Read 
xml 
files/ReadAllViaFileBasedSource/Reshuffle/Reshuffle/Window.Into()/Window.Assign
Jul 31, 2018 1:03:25 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T01:03:18.994Z: Fusing consumer Read xml 
files/ReadAllViaFileBasedSource/Reshuffle/Values/Values/Map into Read xml 
files/ReadAllViaFileBasedSource/Reshuffle/Reshuffle/ExpandIterable
Jul 31, 2018 1:03:25 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T01:03:19.047Z: Fusing consumer Write xml 
files/WriteFiles/GatherTempFileResults/Reshuffle.ViaRandomKey/Reshuffle/ExpandIterable
 into Write xml 
files/WriteFiles/GatherTempFileResults/Reshuffle.ViaRandomKey/Reshuffle/GroupByKey/GroupByWindow
Jul 31, 2018 1:03:25 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T01:03:19.102Z: Fusing consumer Write xml 
files/WriteFiles/GatherTempFileResults/Reshuffle.ViaRandomKey/Reshuffle/GroupByKey/GroupByWindow
 into Write xml 
files/WriteFiles/GatherTempFileResults/Reshuffle.ViaRandomKey/Reshuffle/GroupByKey/Read
   

Build failed in Jenkins: beam_PerformanceTests_Compressed_TextIOIT_HDFS #479

2018-07-30 Thread Apache Jenkins Server
See 


Changes:

[ehudm] Make with_attributes kwarg optional.

[ehudm] Add decode and encode steps to streaming_wordcount

[qinyeli] Interactive Beam -- display update

[ehudm] Fix PubSubMessageMatcher not acking messages.

[pablo] Add GitScm poll trigger for post-commit tests.

[tweise] [BEAM-5023] Send HarnessId in DataClient (#6066)

--
[...truncated 2.13 KB...]
DEBUG: SDK update checks are disabled.
[beam_PerformanceTests_Compressed_TextIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins7405554649077551436.sh
+ cp /home/jenkins/.kube/config 

[beam_PerformanceTests_Compressed_TextIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins5314194702348045826.sh
+ kubectl 
--kubeconfig=
 create namespace beam-performancetests-compressed-textioit-hdfs-479
namespace "beam-performancetests-compressed-textioit-hdfs-479" created
[beam_PerformanceTests_Compressed_TextIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins8839118963763456246.sh
++ kubectl config current-context
+ kubectl 
--kubeconfig=
 config set-context gke_apache-beam-testing_us-central1-a_io-datastores 
--namespace=beam-performancetests-compressed-textioit-hdfs-479
Context "gke_apache-beam-testing_us-central1-a_io-datastores" modified.
[beam_PerformanceTests_Compressed_TextIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins4051221563849058508.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_Compressed_TextIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins4476448276002481032.sh
+ rm -rf .beam_env
[beam_PerformanceTests_Compressed_TextIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins2839030187893588207.sh
+ rm -rf .perfkit_env
[beam_PerformanceTests_Compressed_TextIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins1570890421550287420.sh
+ virtualenv .beam_env --system-site-packages
New python executable in 

Also creating executable in 

Installing setuptools, pkg_resources, pip, wheel...done.
Running virtualenv with interpreter /usr/bin/python2
[beam_PerformanceTests_Compressed_TextIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins3864410940020842136.sh
+ virtualenv .perfkit_env --system-site-packages
New python executable in 

Also creating executable in 

Installing setuptools, pkg_resources, pip, wheel...done.
Running virtualenv with interpreter /usr/bin/python2
[beam_PerformanceTests_Compressed_TextIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins1408968765665145304.sh
+ .beam_env/bin/pip install --upgrade setuptools pip
Requirement already up-to-date: setuptools in 
./.beam_env/lib/python2.7/site-packages (40.0.0)
Requirement already up-to-date: pip in ./.beam_env/lib/python2.7/site-packages 
(18.0)
[beam_PerformanceTests_Compressed_TextIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins3009530272904727137.sh
+ .perfkit_env/bin/pip install --upgrade setuptools pip
Requirement already up-to-date: setuptools in 
./.perfkit_env/lib/python2.7/site-packages (40.0.0)
Requirement already up-to-date: pip in 
./.perfkit_env/lib/python2.7/site-packages (18.0)
[beam_PerformanceTests_Compressed_TextIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins4198030641263055346.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_Compressed_TextIOIT_HDFS] $ /bin/bash -xe 
/tmp/jenkins913902941710435856.sh
+ .beam_env/bin/pip install -e 'src/sdks/python/[gcp,test]'
Obtaining 
file://
Requirement already satisfied: avro<2.0.0,>=1.8.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from apache-beam==2.7.0.dev0) 
(1.8.2)
Requirement already satisfied: crcmod<2.0,>=1.7 in 
/usr/lib/python2.7/dist-packages (from apache-beam==2.7.0.dev0) (1.7)
Requirement already satisfied: dill<=0.2.8.2,>=0.2.6 in 
/home/jenkins/.local/lib/python2.7/site-packages (from apache-beam==2.7.0.dev0) 
(0.2.8.2)
Requirement already satisfied: grpcio<2,>=1.8 in 
/home/jenkins/.local/lib/python2.7/site-packages (from apache-beam==2.7.0.dev0) 
(1.13.0)
Requirement already satisfied: hdfs<3.0.0,>=2.1.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from 

[jira] [Work logged] (BEAM-5040) BigQueryIO retries infinitely in WriteTable and WriteRename

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5040?focusedWorklogId=129061=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129061
 ]

ASF GitHub Bot logged work on BEAM-5040:


Author: ASF GitHub Bot
Created on: 31/Jul/18 01:00
Start Date: 31/Jul/18 01:00
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on a change in pull request #6080: 
[BEAM-5040] Fix retry bug for BigQuery jobs.
URL: https://github.com/apache/beam/pull/6080#discussion_r206367334
 
 

 ##
 File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/WriteRename.java
 ##
 @@ -138,10 +142,31 @@ private void copy(
 
     String projectId = ref.getProjectId();
     Job lastFailedCopyJob = null;
-    for (int i = 0; i < BatchLoads.MAX_RETRY_JOBS; ++i) {
-      String jobId = jobIdPrefix + "-" + i;
-      JobReference jobRef = new JobReference().setProjectId(projectId).setJobId(jobId);
-      jobService.startCopyJob(jobRef, copyConfig);
+    String jobId = jobIdPrefix + "-0";
+    String bqLocation =
+        BigQueryHelpers.getDatasetLocation(datasetService, ref.getProjectId(), ref.getDatasetId());
+    for (int i = 0; i < maxRetryJobs; ++i) {
+      JobReference jobRef =
+          new JobReference().setProjectId(projectId).setJobId(jobId).setLocation(bqLocation);
+      LOG.info("Starting copy job for table {} using {}, attempt {}", ref, jobRef, i);
+      try {
+        jobService.startCopyJob(jobRef, copyConfig);
+      } catch (IOException e) {
+        LOG.warn("Copy job {} failed with {}", jobRef, e);
+        // It's possible that the job actually made it to BQ even though we got a failure here.
+        // For example, the response from BQ may have timed out returning. getRetryJobId will
+        // return the correct job id to use on retry, or a job id to continue polling (if it
+        // turns out the job has not actually failed yet).
+        RetryJobIdResult result =
+            BigQueryHelpers.getRetryJobId(jobId, projectId, bqLocation, jobService);
+        jobId = result.jobId;
+        if (result.shouldRetry) {
+          // Try the copy again with the new job id.
+          continue;
 
 Review comment:
   Done
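
   For readers skimming the digest: the hunk above makes each attempt use a deterministic job id and consults BigQueryHelpers.getRetryJobId after a failed submit instead of blindly reusing the same id. Below is a minimal, self-contained Java sketch of that shape. The JobStarter/RetryOracle/JobPoller interfaces are stand-ins invented for the sketch (not Beam APIs), and because the hunk is cut off after the final `continue`, the polling half of the loop is an assumption about the overall flow, not a quote of the patch.

   import java.io.IOException;

   public class RetryLoopSketch {

     /** Simplified stand-in for BigQueryHelpers.RetryJobIdResult. */
     static final class RetryDecision {
       final String jobId;
       final boolean shouldRetry;

       RetryDecision(String jobId, boolean shouldRetry) {
         this.jobId = jobId;
         this.shouldRetry = shouldRetry;
       }
     }

     /** Functional stand-ins so the sketch compiles without any BigQuery dependency. */
     interface JobStarter { void start(String jobId) throws IOException; }
     interface RetryOracle { RetryDecision decide(String currentJobId); }
     interface JobPoller { boolean pollUntilSuccess(String jobId); }

     static boolean runWithRetries(
         String jobIdPrefix,
         int maxRetryJobs,
         JobStarter starter,
         RetryOracle oracle,
         JobPoller poller) {
       String jobId = jobIdPrefix + "-0"; // first attempt is always <prefix>-0
       for (int i = 0; i < maxRetryJobs; ++i) {
         try {
           starter.start(jobId);
         } catch (IOException e) {
           // The job may have reached the service even though submitting it "failed" locally,
           // so ask the helper which id to use next instead of blindly reusing this one.
           RetryDecision decision = oracle.decide(jobId);
           jobId = decision.jobId;
           if (decision.shouldRetry) {
             continue; // resubmit under the (possibly new) job id
           }
           // Otherwise fall through and poll the id the helper returned.
         }
         if (poller.pollUntilSuccess(jobId)) {
           return true; // the job completed successfully
         }
         // The job completed with errors: advance to the next deterministic id and retry.
         jobId = jobIdPrefix + "-" + (i + 1);
       }
       return false; // gave up after maxRetryJobs attempts
     }
   }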


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129061)
Time Spent: 1h 40m  (was: 1.5h)

> BigQueryIO retries infinitely in WriteTable and WriteRename
> ---
>
> Key: BEAM-5040
> URL: https://issues.apache.org/jira/browse/BEAM-5040
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.5.0
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> BigQueryIO retries infinitely in WriteTable and WriteRename
> Several failure scenarios with the current code:
>  # It's possible for a load job to return failure even though it actually 
> succeeded (e.g. the reply might have timed out). In this case, BigQueryIO 
> will retry the job which will fail again (because the job id has already been 
> used), leading to indefinite retries. Correct behavior is to stop retrying as 
> the load job has succeeded.
>  # It's possible for a load job to be accepted by BigQuery, but then to fail 
> on the BigQuery side. In this case a retry with the same job id will fail as 
> that job id has already been used. BigQueryIO will sometimes detect this, but 
> if the worker has restarted it will instead issue a load with the old job id 
> and go into a retry loop. Correct behavior is to generate a new deterministic 
> job id and retry using that new job id.
>  # In many cases of worker restart, BigQueryIO ends up in infinite retry 
> loops.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5040) BigQueryIO retries infinitely in WriteTable and WriteRename

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5040?focusedWorklogId=129062=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129062
 ]

ASF GitHub Bot logged work on BEAM-5040:


Author: ASF GitHub Bot
Created on: 31/Jul/18 01:00
Start Date: 31/Jul/18 01:00
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on a change in pull request #6080: 
[BEAM-5040] Fix retry bug for BigQuery jobs.
URL: https://github.com/apache/beam/pull/6080#discussion_r206367353
 
 

 ##
 File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/WriteTables.java
 ##
 @@ -261,16 +265,33 @@ private void load(
     Job lastFailedLoadJob = null;
     String bqLocation =
         BigQueryHelpers.getDatasetLocation(datasetService, ref.getProjectId(), ref.getDatasetId());
-    for (int i = 0; i < BatchLoads.MAX_RETRY_JOBS; ++i) {
-      String jobId = jobIdPrefix + "-" + i;
-
+    // First attempt is always jobIdPrefix-0.
+    String jobId = jobIdPrefix + "-0";
+    for (int i = 0; i < maxRetryJobs; ++i) {
       JobReference jobRef =
           new JobReference().setProjectId(projectId).setJobId(jobId).setLocation(bqLocation);
 
       LOG.info("Loading {} files into {} using job {}, attempt {}", gcsUris.size(), ref, jobRef, i);
-      jobService.startLoadJob(jobRef, loadConfig);
+      try {
+        jobService.startLoadJob(jobRef, loadConfig);
+      } catch (IOException e) {
+        LOG.warn("Load job {} failed with {}", jobRef, e);
+        // It's possible that the job actually made it to BQ even though we got a failure here.
+        // For example, the response from BQ may have timed out returning. getRetryJobId will
+        // return the correct job id to use on retry, or a job id to continue polling (if it
+        // turns out the job has not actually failed yet).
+        RetryJobIdResult result =
+            BigQueryHelpers.getRetryJobId(jobId, projectId, bqLocation, jobService);
+        jobId = result.jobId;
+        if (result.shouldRetry) {
+          // Try the load again with the new job id.
+          continue;
 
 Review comment:
   Done


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129062)
Time Spent: 1h 50m  (was: 1h 40m)

> BigQueryIO retries infinitely in WriteTable and WriteRename
> ---
>
> Key: BEAM-5040
> URL: https://issues.apache.org/jira/browse/BEAM-5040
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.5.0
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> BigQueryIO retries infinitely in WriteTable and WriteRename
> Several failure scenarios with the current code:
>  # It's possible for a load job to return failure even though it actually 
> succeeded (e.g. the reply might have timed out). In this case, BigQueryIO 
> will retry the job which will fail again (because the job id has already been 
> used), leading to indefinite retries. Correct behavior is to stop retrying as 
> the load job has succeeded.
>  # It's possible for a load job to be accepted by BigQuery, but then to fail 
> on the BigQuery side. In this case a retry with the same job id will fail as 
> that job id has already been used. BigQueryIO will sometimes detect this, but 
> if the worker has restarted it will instead issue a load with the old job id 
> and go into a retry loop. Correct behavior is to generate a new deterministic 
> job id and retry using that new job id.
>  # In many cases of worker restart, BigQueryIO ends up in infinite retry 
> loops.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5040) BigQueryIO retries infinitely in WriteTable and WriteRename

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5040?focusedWorklogId=129060=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129060
 ]

ASF GitHub Bot logged work on BEAM-5040:


Author: ASF GitHub Bot
Created on: 31/Jul/18 00:59
Start Date: 31/Jul/18 00:59
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on a change in pull request #6080: 
[BEAM-5040] Fix retry bug for BigQuery jobs.
URL: https://github.com/apache/beam/pull/6080#discussion_r206367321
 
 

 ##
 File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java
 ##
 @@ -55,6 +59,77 @@
           + " an earlier stage of the pipeline, this validation can be disabled using"
           + " #withoutValidation.";
 
+  private static final Logger LOG = LoggerFactory.getLogger(BigQueryHelpers.class);
+
+  // Given a potential failure and a current job-id, return the next job-id to be used on retry.
+  // The algorithm is as follows (given an input of job_id_prefix-n):
+  //   If BigQuery has no status for job_id_prefix-n, we should retry with the same id.
+  //   If job_id_prefix-n is in the PENDING or successful states, no retry is needed.
+  //   Otherwise (job_id_prefix-n completed with errors), try again with job_id_prefix-(n+1).
+  //
+  // We continue to loop through these job ids until we find one that has either succeeded, or
+  // that has not been issued yet.
+  static class RetryJobIdResult {
+    public final String jobId;
+    public final boolean shouldRetry;
+
+    public RetryJobIdResult(String jobId, boolean shouldRetry) {
+      this.jobId = jobId;
+      this.shouldRetry = shouldRetry;
+    }
+  }
+
+  static RetryJobIdResult getRetryJobId(
+      String currentJobId, String projectId, String bqLocation, JobService jobService)
+      throws InterruptedException {
+    // Job ids should always be of the form -
 
 Review comment:
   Good idea. Done
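
   The comment block in the hunk above fully specifies the id-scanning algorithm, so here is a small, self-contained Java sketch of it for reference. The Map-based job-status lookup and the JobState enum are stand-ins for the real JobService calls; only the control flow mirrors the description.

   import java.util.Map;

   public class RetryJobIdSketch {

     enum JobState { PENDING, SUCCEEDED, FAILED }

     /** Simplified stand-in for RetryJobIdResult. */
     static final class Result {
       final String jobId;
       final boolean shouldRetry;

       Result(String jobId, boolean shouldRetry) {
         this.jobId = jobId;
         this.shouldRetry = shouldRetry;
       }
     }

     static Result getRetryJobId(String currentJobId, Map<String, JobState> knownJobs) {
       // Job ids are assumed to be of the form prefix-N.
       int dash = currentJobId.lastIndexOf('-');
       String prefix = currentJobId.substring(0, dash);
       int index = Integer.parseInt(currentJobId.substring(dash + 1));

       while (true) {
         String candidate = prefix + "-" + index;
         JobState state = knownJobs.get(candidate);
         if (state == null) {
           // The service has never seen this id: safe to (re)issue the job under it.
           return new Result(candidate, true);
         }
         if (state == JobState.PENDING || state == JobState.SUCCEEDED) {
           // The job is already in flight or already done: don't start another one.
           return new Result(candidate, false);
         }
         // The job completed with errors: move on to the next deterministic id.
         index++;
       }
     }

     public static void main(String[] args) {
       Map<String, JobState> jobs =
           Map.of("load-0", JobState.FAILED, "load-1", JobState.SUCCEEDED);
       Result r = getRetryJobId("load-0", jobs);
       System.out.println(r.jobId + " shouldRetry=" + r.shouldRetry); // load-1 shouldRetry=false
     }
   }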


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129060)
Time Spent: 1.5h  (was: 1h 20m)

> BigQueryIO retries infinitely in WriteTable and WriteRename
> ---
>
> Key: BEAM-5040
> URL: https://issues.apache.org/jira/browse/BEAM-5040
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.5.0
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> BigQueryIO retries infinitely in WriteTable and WriteRename
> Several failure scenarios with the current code:
>  # It's possible for a load job to return failure even though it actually 
> succeeded (e.g. the reply might have timed out). In this case, BigQueryIO 
> will retry the job which will fail again (because the job id has already been 
> used), leading to indefinite retries. Correct behavior is to stop retrying as 
> the load job has succeeded.
>  # It's possible for a load job to be accepted by BigQuery, but then to fail 
> on the BigQuery side. In this case a retry with the same job id will fail as 
> that job id has already been used. BigQueryIO will sometimes detect this, but 
> if the worker has restarted it will instead issue a load with the old job id 
> and go into a retry loop. Correct behavior is to generate a new deterministic 
> job id and retry using that new job id.
>  # In many cases of worker restart, BigQueryIO ends up in infinite retry 
> loops.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2930) Flink support for portable side input

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2930?focusedWorklogId=129058=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129058
 ]

ASF GitHub Bot logged work on BEAM-2930:


Author: ASF GitHub Bot
Created on: 31/Jul/18 00:57
Start Date: 31/Jul/18 00:57
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #6082: [BEAM-2930] Side 
inputs are not yet supported in streaming mode.
URL: https://github.com/apache/beam/pull/6082#issuecomment-409059095
 
 
   Run Java Precommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129058)
Time Spent: 1h 40m  (was: 1.5h)

> Flink support for portable side input
> -
>
> Key: BEAM-2930
> URL: https://issues.apache.org/jira/browse/BEAM-2930
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=129056=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129056
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 31/Jul/18 00:56
Start Date: 31/Jul/18 00:56
Worklog Time Spent: 10m 
  Work Description: bsidhom commented on a change in pull request #6073: 
[BEAM-4176] Validate Runner Tests generalization and enable for local reference 
runner
URL: https://github.com/apache/beam/pull/6073#discussion_r206366711
 
 

 ##
 File path: runners/direct-java/build.gradle
 ##
 @@ -147,3 +156,5 @@ createJavaExamplesArchetypeValidationTask(type: 'MobileGaming',
   gcsBucket: gcsBucket,
   bqDataset: bqDataset,
   pubsubTopic: pubsubTopic)
+
+createPortableValidatesRunnerTask(name: "validatesPortableRunner", jobServerDriver: "org.apache.beam.runners.direct.portable.job.ReferenceRunnerJobServer")
 
 Review comment:
   Looks like there's no trailing newline. I guess `./gradlew spotlessApply` 
doesn't fix it because it's not a Java file.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129056)
Time Spent: 7h 10m  (was: 7h)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Priority: Major
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=129055=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129055
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 31/Jul/18 00:56
Start Date: 31/Jul/18 00:56
Worklog Time Spent: 10m 
  Work Description: bsidhom commented on a change in pull request #6073: 
[BEAM-4176] Validate Runner Tests generalization and enable for local reference 
runner
URL: https://github.com/apache/beam/pull/6073#discussion_r206366275
 
 

 ##
 File path: 
buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy
 ##
 @@ -1389,5 +1402,39 @@ artifactId=${project.name}
 args argsNeeded
   }
 }
+
+
+/** 
***/
+
+// Method to create the PortableValidatesRunnerTask.
+project.ext.createPortableValidatesRunnerTask = {
+  def config = it ? it as PortableValidatesRunnerConfig : new 
PortableValidatesRunnerConfig()
+  def name = config.name ? config.name : "validatesPortableRunner"
+  def testCategories = {
+includeCategories 'org.apache.beam.sdk.testing.ValidatesRunner'
 
 Review comment:
   Sorry, make that `createPortableValidatesRunnerTask` rather than 
`applyJavaNature`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129055)
Time Spent: 7h  (was: 6h 50m)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Priority: Major
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=129057=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129057
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 31/Jul/18 00:56
Start Date: 31/Jul/18 00:56
Worklog Time Spent: 10m 
  Work Description: bsidhom commented on a change in pull request #6073: 
[BEAM-4176] Validate Runner Tests generalization and enable for local reference 
runner
URL: https://github.com/apache/beam/pull/6073#discussion_r206366632
 
 

 ##
 File path: buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy
 ##
 @@ -1389,5 +1402,39 @@ artifactId=${project.name}
 args argsNeeded
   }
 }
+
+
+/** ***/
+
+// Method to create the PortableValidatesRunnerTask.
+project.ext.createPortableValidatesRunnerTask = {
+  def config = it ? it as PortableValidatesRunnerConfig : new PortableValidatesRunnerConfig()
+  def name = config.name ? config.name : "validatesPortableRunner"
+  def testCategories = {
+    includeCategories 'org.apache.beam.sdk.testing.ValidatesRunner'
+    excludeCategories 'org.apache.beam.sdk.testing.FlattenWithHeterogeneousCoders'
+    excludeCategories 'org.apache.beam.sdk.testing.LargeKeys$Above100MB'
+    excludeCategories 'org.apache.beam.sdk.testing.UsesCommittedMetrics'
+    excludeCategories 'org.apache.beam.sdk.testing.UsesGaugeMetrics'
+    excludeCategories 'org.apache.beam.sdk.testing.UsesDistributionMetrics'
+    excludeCategories 'org.apache.beam.sdk.testing.UsesAttemptedMetrics'
+    excludeCategories 'org.apache.beam.sdk.testing.UsesTimersInParDo'
+    excludeCategories 'org.apache.beam.sdk.testing.UsesTestStream'
+  }
+  testCategories = config.testCategories ? config.testCategories : testCategories
+  project.tasks.create(name: name, type: Test) {
+    group = "Verification"
+    description = "Validates the PortableRunner with JobServer ${config.jobServerDriver}"
+    systemProperty "beamTestPipelineOptions", JsonOutput.toJson([
+      "--runner=org.apache.beam.runners.reference.testing.TestPortableRunner",
+      "--jobServerDriver=${config.jobServerDriver}",
+      config.jobServerConfig ? "--jobServerConfig=${config.jobServerConfig}" : "",
+    ])
+    classpath = project.configurations.validatesPortableRunner
 
 Review comment:
   Please either document the requirement of the existence of a 
`validatesPortableRunner` configuration or make the configuration pluggable.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129057)
Time Spent: 7h 20m  (was: 7h 10m)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Priority: Major
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=129054=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129054
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 31/Jul/18 00:56
Start Date: 31/Jul/18 00:56
Worklog Time Spent: 10m 
  Work Description: bsidhom commented on a change in pull request #6073: 
[BEAM-4176] Validate Runner Tests generalization and enable for local reference 
runner
URL: https://github.com/apache/beam/pull/6073#discussion_r206366425
 
 

 ##
 File path: buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy
 ##
 @@ -1389,5 +1402,39 @@ artifactId=${project.name}
 args argsNeeded
   }
 }
+
+
+/** ***/
+
+// Method to create the PortableValidatesRunnerTask.
+project.ext.createPortableValidatesRunnerTask = {
+  def config = it ? it as PortableValidatesRunnerConfig : new PortableValidatesRunnerConfig()
+  def name = config.name ? config.name : "validatesPortableRunner"
+  def testCategories = {
+    includeCategories 'org.apache.beam.sdk.testing.ValidatesRunner'
+    excludeCategories 'org.apache.beam.sdk.testing.FlattenWithHeterogeneousCoders'
+    excludeCategories 'org.apache.beam.sdk.testing.LargeKeys$Above100MB'
+    excludeCategories 'org.apache.beam.sdk.testing.UsesCommittedMetrics'
+    excludeCategories 'org.apache.beam.sdk.testing.UsesGaugeMetrics'
+    excludeCategories 'org.apache.beam.sdk.testing.UsesDistributionMetrics'
+    excludeCategories 'org.apache.beam.sdk.testing.UsesAttemptedMetrics'
+    excludeCategories 'org.apache.beam.sdk.testing.UsesTimersInParDo'
+    excludeCategories 'org.apache.beam.sdk.testing.UsesTestStream'
+  }
+  testCategories = config.testCategories ? config.testCategories : testCategories
+  project.tasks.create(name: name, type: Test) {
+    group = "Verification"
+    description = "Validates the PortableRunner with JobServer ${config.jobServerDriver}"
+    systemProperty "beamTestPipelineOptions", JsonOutput.toJson([
+      "--runner=org.apache.beam.runners.reference.testing.TestPortableRunner",
+      "--jobServerDriver=${config.jobServerDriver}",
+      config.jobServerConfig ? "--jobServerConfig=${config.jobServerConfig}" : "",
 
 Review comment:
   It might be better to construct the json list out-of-band in order to avoid 
the empty arg.
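
   For illustration, the reviewer's suggestion of building the argument list out-of-band looks roughly like the following. The sketch is written in Java with Jackson purely for illustration (the real code is Gradle Groovy using JsonOutput), and the class name and option values are made up.

   import com.fasterxml.jackson.databind.ObjectMapper;
   import java.util.ArrayList;
   import java.util.List;

   /** Build the options list first, so no empty "" element is ever serialized. */
   public class PipelineOptionsJsonSketch {
     public static void main(String[] args) throws Exception {
       String jobServerDriver = "org.example.FakeJobServerDriver"; // placeholder value
       String jobServerConfig = null; // optional; often absent

       List<String> options = new ArrayList<>();
       options.add("--runner=org.apache.beam.runners.reference.testing.TestPortableRunner");
       options.add("--jobServerDriver=" + jobServerDriver);
       if (jobServerConfig != null && !jobServerConfig.isEmpty()) {
         options.add("--jobServerConfig=" + jobServerConfig); // only added when actually configured
       }

       // Serialize once the list is complete; the conditional above replaces the inline ternary.
       String json = new ObjectMapper().writeValueAsString(options);
       System.out.println(json);
     }
   }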


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129054)
Time Spent: 6h 50m  (was: 6h 40m)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Priority: Major
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Java_GradleBuild #1122

2018-07-30 Thread Apache Jenkins Server
See 


Changes:

[ehudm] Make with_attributes kwarg optional.

[ehudm] Add decode and encode steps to streaming_wordcount

[qinyeli] Interactive Beam -- display update

[ehudm] Fix PubSubMessageMatcher not acking messages.

[pablo] Add GitScm poll trigger for post-commit tests.

--
[...truncated 20.86 MB...]
Jul 31, 2018 12:48:48 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T00:48:44.391Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Flatten.Iterables/FlattenIterables/FlatMap into 
SpannerIO.Write/Write mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Values/Values/Map
Jul 31, 2018 12:48:48 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T00:48:44.469Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Values/Values/Map into 
SpannerIO.Write/Write mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/Combine.GroupedValues/Extract
Jul 31, 2018 12:48:48 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T00:48:44.511Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Keys sample as view/ParDo(ToIsmMetadataRecordForKey) 
into SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample as 
view/GBKaSVForKeys/Read
Jul 31, 2018 12:48:48 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T00:48:44.609Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/Combine.GroupedValues/Extract
 into SpannerIO.Write/Write mutations to Cloud Spanner/Wait.OnSignal/To wait 
view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/Combine.GroupedValues
Jul 31, 2018 12:48:48 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T00:48:44.651Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/WithKeys/AddKeys/Map into 
SpannerIO.Write/Write mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/ParDo(CollectWindows)
Jul 31, 2018 12:48:48 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T00:48:44.738Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/Combine.GroupedValues
 into SpannerIO.Write/Write mutations to Cloud Spanner/Wait.OnSignal/To wait 
view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/GroupByKey/Read
Jul 31, 2018 12:48:48 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T00:48:44.785Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Write mutations to Spanner into 
SpannerIO.Write/Write mutations to Cloud Spanner/Batch mutations together
Jul 31, 2018 12:48:48 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T00:48:44.884Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/GroupByKey/Reify
 into SpannerIO.Write/Write mutations to Cloud Spanner/Wait.OnSignal/To wait 
view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/GroupByKey+SpannerIO.Write/Write
 mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/Combine.GroupedValues/Partial
Jul 31, 2018 12:48:48 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T00:48:44.953Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/GroupByKey+SpannerIO.Write/Write
 mutations to Cloud Spanner/Wait.OnSignal/To wait view 
0/Sample.Any/Combine.globally(SampleAny)/Combine.perKey(SampleAny)/Combine.GroupedValues/Partial
 into SpannerIO.Write/Write mutations to Cloud Spanner/Wait.OnSignal/To wait 
view 0/Sample.Any/Combine.globally(SampleAny)/WithKeys/AddKeys/Map
Jul 31, 2018 12:48:48 AM 
org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-07-31T00:48:45.022Z: Fusing consumer SpannerIO.Write/Write 
mutations to Cloud Spanner/Group by partition/Reify into SpannerIO.Write/Write 
mutations to 

[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=129052=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129052
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 31/Jul/18 00:51
Start Date: 31/Jul/18 00:51
Worklog Time Spent: 10m 
  Work Description: bsidhom commented on a change in pull request #6073: 
[BEAM-4176] Validate Runner Tests generalization and enable for local reference 
runner
URL: https://github.com/apache/beam/pull/6073#discussion_r206357414
 
 

 ##
 File path: buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy
 ##
 @@ -184,6 +185,18 @@ class BeamModulePlugin implements Plugin<Project> {
 String tag = null // Sets the image tag (optional).
   }
 
+  // A class defining the configuration for PortableValidatesRunner.
+  class PortableValidatesRunnerConfig {
 
 Review comment:
   "Config" -> "Configuration" to match other configuration class names?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129052)
Time Spent: 6h 40m  (was: 6.5h)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Priority: Major
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=129051=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129051
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 31/Jul/18 00:51
Start Date: 31/Jul/18 00:51
Worklog Time Spent: 10m 
  Work Description: bsidhom commented on a change in pull request #6073: 
[BEAM-4176] Validate Runner Tests generalization and enable for local reference 
runner
URL: https://github.com/apache/beam/pull/6073#discussion_r206366073
 
 

 ##
 File path: buildSrc/src/main/groovy/org/apache/beam/gradle/BeamModulePlugin.groovy
 ##
 @@ -1389,5 +1402,39 @@ artifactId=${project.name}
 args argsNeeded
   }
 }
+
+
+/** ***/
+
+// Method to create the PortableValidatesRunnerTask.
+project.ext.createPortableValidatesRunnerTask = {
+  def config = it ? it as PortableValidatesRunnerConfig : new PortableValidatesRunnerConfig()
+  def name = config.name ? config.name : "validatesPortableRunner"
+  def testCategories = {
+    includeCategories 'org.apache.beam.sdk.testing.ValidatesRunner'
 
 Review comment:
   Presumably these categories will be runner-specific (with the exception of 
ValidatesRunner itself). Can you add the excluded categories as a configuration 
parameter that is specified when calling `applyJavaNature`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129051)
Time Spent: 6.5h  (was: 6h 20m)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Priority: Major
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-5023) BeamFnDataGrpcClient should pass the worker_id when connecting to the RunnerHarness

2018-07-30 Thread Thomas Weise (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Weise resolved BEAM-5023.

   Resolution: Fixed
Fix Version/s: 2.7.0

> BeamFnDataGrpcClient should pass the worker_id when connecting to the 
> RunnerHarness
> ---
>
> Key: BEAM-5023
> URL: https://issues.apache.org/jira/browse/BEAM-5023
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-harness
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
> Fix For: 2.7.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5023) BeamFnDataGrpcClient should pass the worker_id when connecting to the RunnerHarness

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5023?focusedWorklogId=129048=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129048
 ]

ASF GitHub Bot logged work on BEAM-5023:


Author: ASF GitHub Bot
Created on: 31/Jul/18 00:47
Start Date: 31/Jul/18 00:47
Worklog Time Spent: 10m 
  Work Description: tweise closed pull request #6066: [BEAM-5023] Send 
HarnessId in DataClient
URL: https://github.com/apache/beam/pull/6066
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnHarness.java b/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnHarness.java
index ec3821dfae9..6ba55bca900 100644
--- a/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnHarness.java
+++ b/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnHarness.java
@@ -19,8 +19,10 @@
 package org.apache.beam.fn.harness;
 
 import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.collect.ImmutableList;
 import java.util.EnumMap;
 import java.util.List;
+import org.apache.beam.fn.harness.control.AddHarnessIdInterceptor;
 import org.apache.beam.fn.harness.control.BeamFnControlClient;
 import org.apache.beam.fn.harness.control.ProcessBundleHandler;
 import org.apache.beam.fn.harness.control.RegisterHandler;
@@ -116,6 +118,8 @@ public static void main(
     }
     OutboundObserverFactory outboundObserverFactory =
         HarnessStreamObserverFactories.fromOptions(options);
+    channelFactory =
+        channelFactory.withInterceptors(ImmutableList.of(AddHarnessIdInterceptor.create(id)));
     main(
         id,
         options,
diff --git a/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/control/BeamFnControlClient.java b/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/control/BeamFnControlClient.java
index 09d18b3f5f0..f1145b805a4 100644
--- a/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/control/BeamFnControlClient.java
+++ b/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/control/BeamFnControlClient.java
@@ -20,7 +20,6 @@
 
 import static com.google.common.base.Throwables.getStackTraceAsString;
 
-import com.google.common.collect.ImmutableList;
 import com.google.common.util.concurrent.Uninterruptibles;
 import java.util.EnumMap;
 import java.util.Objects;
@@ -79,11 +78,7 @@ public BeamFnControlClient(
     this.bufferedInstructions = new LinkedBlockingDeque<>();
     this.outboundObserver =
         outboundObserverFactory.outboundObserverFor(
-            BeamFnControlGrpc.newStub(
-                    channelFactory
-                        .withInterceptors(ImmutableList.of(AddHarnessIdInterceptor.create(id)))
-                        .forDescriptor(apiServiceDescriptor))
-                ::control,
+            BeamFnControlGrpc.newStub(channelFactory.forDescriptor(apiServiceDescriptor))::control,
             new InboundObserver());
     this.handlers = handlers;
     this.onFinish = new CompletableFuture<>();


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129048)
Time Spent: 50m  (was: 40m)

> BeamFnDataGrpcClient should pass the worker_id when connecting to the 
> RunnerHarness
> ---
>
> Key: BEAM-5023
> URL: https://issues.apache.org/jira/browse/BEAM-5023
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-harness
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[beam] branch master updated: [BEAM-5023] Send HarnessId in DataClient (#6066)

2018-07-30 Thread thw
This is an automated email from the ASF dual-hosted git repository.

thw pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
 new c0b64fb  [BEAM-5023] Send HarnessId in DataClient (#6066)
c0b64fb is described below

commit c0b64fb09414e36c67480a8e26a6e0b392e9c6c7
Author: Ankur 
AuthorDate: Mon Jul 30 17:47:21 2018 -0700

[BEAM-5023] Send HarnessId in DataClient (#6066)

* Send HarnessId in DataClient

The data client needs to contain HarnessId to register itself

* Adding harness id to the channelFactory
---
 .../src/main/java/org/apache/beam/fn/harness/FnHarness.java| 4 
 .../org/apache/beam/fn/harness/control/BeamFnControlClient.java| 7 +--
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnHarness.java b/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnHarness.java
index ec3821d..6ba55bc 100644
--- a/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnHarness.java
+++ b/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnHarness.java
@@ -19,8 +19,10 @@
 package org.apache.beam.fn.harness;
 
 import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.collect.ImmutableList;
 import java.util.EnumMap;
 import java.util.List;
+import org.apache.beam.fn.harness.control.AddHarnessIdInterceptor;
 import org.apache.beam.fn.harness.control.BeamFnControlClient;
 import org.apache.beam.fn.harness.control.ProcessBundleHandler;
 import org.apache.beam.fn.harness.control.RegisterHandler;
@@ -116,6 +118,8 @@ public class FnHarness {
     }
     OutboundObserverFactory outboundObserverFactory =
         HarnessStreamObserverFactories.fromOptions(options);
+    channelFactory =
+        channelFactory.withInterceptors(ImmutableList.of(AddHarnessIdInterceptor.create(id)));
     main(
         id,
         options,
diff --git a/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/control/BeamFnControlClient.java b/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/control/BeamFnControlClient.java
index 09d18b3..f1145b8 100644
--- a/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/control/BeamFnControlClient.java
+++ b/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/control/BeamFnControlClient.java
@@ -20,7 +20,6 @@ package org.apache.beam.fn.harness.control;
 
 import static com.google.common.base.Throwables.getStackTraceAsString;
 
-import com.google.common.collect.ImmutableList;
 import com.google.common.util.concurrent.Uninterruptibles;
 import java.util.EnumMap;
 import java.util.Objects;
@@ -79,11 +78,7 @@ public class BeamFnControlClient {
     this.bufferedInstructions = new LinkedBlockingDeque<>();
     this.outboundObserver =
         outboundObserverFactory.outboundObserverFor(
-            BeamFnControlGrpc.newStub(
-                    channelFactory
-                        .withInterceptors(ImmutableList.of(AddHarnessIdInterceptor.create(id)))
-                        .forDescriptor(apiServiceDescriptor))
-                ::control,
+            BeamFnControlGrpc.newStub(channelFactory.forDescriptor(apiServiceDescriptor))::control,
             new InboundObserver());
     this.handlers = handlers;
     this.onFinish = new CompletableFuture<>();
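
For readers unfamiliar with the pattern, a header-adding gRPC ClientInterceptor of the kind this commit attaches to the channel factory generally looks like the sketch below. This is not the actual AddHarnessIdInterceptor source; the class name and the "worker_id" header name are assumptions made for the sketch.

import io.grpc.CallOptions;
import io.grpc.Channel;
import io.grpc.ClientCall;
import io.grpc.ClientInterceptor;
import io.grpc.ForwardingClientCall.SimpleForwardingClientCall;
import io.grpc.Metadata;
import io.grpc.MethodDescriptor;

/** Sketch of a client interceptor that stamps every outgoing call with a worker/harness id. */
public final class WorkerIdClientInterceptor implements ClientInterceptor {
  private static final Metadata.Key<String> WORKER_ID_KEY =
      Metadata.Key.of("worker_id", Metadata.ASCII_STRING_MARSHALLER);

  private final String workerId;

  public WorkerIdClientInterceptor(String workerId) {
    this.workerId = workerId;
  }

  @Override
  public <ReqT, RespT> ClientCall<ReqT, RespT> interceptCall(
      MethodDescriptor<ReqT, RespT> method, CallOptions callOptions, Channel next) {
    return new SimpleForwardingClientCall<ReqT, RespT>(next.newCall(method, callOptions)) {
      @Override
      public void start(Listener<RespT> responseListener, Metadata headers) {
        // Attach the harness/worker id as request metadata on every call made over this channel.
        headers.put(WORKER_ID_KEY, workerId);
        super.start(responseListener, headers);
      }
    };
  }
}

Such an interceptor is normally attached with ClientInterceptors.intercept(channel, interceptor) or a channel builder's intercept(...) method; channelFactory.withInterceptors(...) in the commit presumably applies it to every channel the factory creates.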



Jenkins build is back to normal : beam_PostCommit_Python_Verify #5613

2018-07-30 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-2930) Flink support for portable side input

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2930?focusedWorklogId=129038=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129038
 ]

ASF GitHub Bot logged work on BEAM-2930:


Author: ASF GitHub Bot
Created on: 31/Jul/18 00:31
Start Date: 31/Jul/18 00:31
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #6082: [BEAM-2930] Side 
inputs are not yet supported in streaming mode.
URL: https://github.com/apache/beam/pull/6082#issuecomment-409055117
 
 
   test this please


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129038)
Time Spent: 1.5h  (was: 1h 20m)

> Flink support for portable side input
> -
>
> Key: BEAM-2930
> URL: https://issues.apache.org/jira/browse/BEAM-2930
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Py_VR_Dataflow #663

2018-07-30 Thread Apache Jenkins Server
See 


Changes:

[ehudm] Make with_attributes kwarg optional.

[ehudm] Add decode and encode steps to streaming_wordcount

[qinyeli] Interactive Beam -- display update

[ehudm] Fix PubSubMessageMatcher not acking messages.

[pablo] Add GitScm poll trigger for post-commit tests.

--
[...truncated 162.37 KB...]
  File was already downloaded 
/tmp/dataflow-requirements-cache/setuptools-40.0.0.zip
Collecting six (from pyhamcrest->-r postcommit_requirements.txt (line 1))
  File was already downloaded /tmp/dataflow-requirements-cache/six-1.11.0.tar.gz
Collecting six (from pyhamcrest->-r postcommit_requirements.txt (line 1))
  File was already downloaded /tmp/dataflow-requirements-cache/six-1.11.0.tar.gz
Collecting six (from pyhamcrest->-r postcommit_requirements.txt (line 1))
Collecting six (from pyhamcrest->-r postcommit_requirements.txt (line 1))
  File was already downloaded /tmp/dataflow-requirements-cache/six-1.11.0.tar.gz
  File was already downloaded /tmp/dataflow-requirements-cache/six-1.11.0.tar.gz
Collecting six (from pyhamcrest->-r postcommit_requirements.txt (line 1))
Collecting six (from pyhamcrest->-r postcommit_requirements.txt (line 1))
  File was already downloaded /tmp/dataflow-requirements-cache/six-1.11.0.tar.gz
  File was already downloaded /tmp/dataflow-requirements-cache/six-1.11.0.tar.gz
Collecting pyhamcrest (from -r postcommit_requirements.txt (line 1))
  File was already downloaded 
/tmp/dataflow-requirements-cache/PyHamcrest-1.9.0.tar.gz
Collecting six (from pyhamcrest->-r postcommit_requirements.txt (line 1))
  File was already downloaded /tmp/dataflow-requirements-cache/six-1.11.0.tar.gz
Collecting funcsigs>=1 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded 
/tmp/dataflow-requirements-cache/funcsigs-1.0.2.tar.gz
Collecting funcsigs>=1 (from mock->-r postcommit_requirements.txt (line 2))
Collecting funcsigs>=1 (from mock->-r postcommit_requirements.txt (line 2))
Collecting funcsigs>=1 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded 
/tmp/dataflow-requirements-cache/funcsigs-1.0.2.tar.gz
  File was already downloaded 
/tmp/dataflow-requirements-cache/funcsigs-1.0.2.tar.gz
  File was already downloaded 
/tmp/dataflow-requirements-cache/funcsigs-1.0.2.tar.gz
Collecting funcsigs>=1 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded 
/tmp/dataflow-requirements-cache/funcsigs-1.0.2.tar.gz
Collecting funcsigs>=1 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded 
/tmp/dataflow-requirements-cache/funcsigs-1.0.2.tar.gz
Collecting funcsigs>=1 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded 
/tmp/dataflow-requirements-cache/funcsigs-1.0.2.tar.gz
Collecting mock (from -r postcommit_requirements.txt (line 2))
  File was already downloaded /tmp/dataflow-requirements-cache/mock-2.0.0.tar.gz
Collecting pbr>=0.11 (from mock->-r postcommit_requirements.txt (line 2))
Collecting pbr>=0.11 (from mock->-r postcommit_requirements.txt (line 2))
Collecting pbr>=0.11 (from mock->-r postcommit_requirements.txt (line 2))
Collecting pbr>=0.11 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded /tmp/dataflow-requirements-cache/pbr-4.2.0.tar.gz
  File was already downloaded /tmp/dataflow-requirements-cache/pbr-4.2.0.tar.gz
  File was already downloaded /tmp/dataflow-requirements-cache/pbr-4.2.0.tar.gz
Collecting pbr>=0.11 (from mock->-r postcommit_requirements.txt (line 2))
Collecting pbr>=0.11 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded /tmp/dataflow-requirements-cache/pbr-4.2.0.tar.gz
  File was already downloaded /tmp/dataflow-requirements-cache/pbr-4.2.0.tar.gz
  File was already downloaded /tmp/dataflow-requirements-cache/pbr-4.2.0.tar.gz
Collecting pbr>=0.11 (from mock->-r postcommit_requirements.txt (line 2))
  File was already downloaded /tmp/dataflow-requirements-cache/pbr-4.2.0.tar.gz
Collecting setuptools (from pyhamcrest->-r postcommit_requirements.txt (line 1))
Successfully downloaded pyhamcrest mock setuptools six funcsigs pbr
Successfully downloaded pyhamcrest mock setuptools six funcsigs pbr
Successfully downloaded pyhamcrest mock setuptools six funcsigs pbr
Successfully downloaded pyhamcrest mock setuptools six funcsigs pbr
Successfully downloaded pyhamcrest mock setuptools six funcsigs pbr
Successfully downloaded pyhamcrest mock setuptools six funcsigs pbr
Successfully downloaded pyhamcrest mock setuptools six funcsigs pbr
  File was already downloaded 
/tmp/dataflow-requirements-cache/setuptools-40.0.0.zip
Collecting six (from pyhamcrest->-r postcommit_requirements.txt (line 1))
  File was already downloaded /tmp/dataflow-requirements-cache/six-1.11.0.tar.gz
Collecting funcsigs>=1 (from mock->-r postcommit_requirements.txt 

[jira] [Work logged] (BEAM-2930) Flink support for portable side input

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2930?focusedWorklogId=129037=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129037
 ]

ASF GitHub Bot logged work on BEAM-2930:


Author: ASF GitHub Bot
Created on: 31/Jul/18 00:29
Start Date: 31/Jul/18 00:29
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #6082: [BEAM-2930] Side 
inputs are not yet supported in streaming mode.
URL: https://github.com/apache/beam/pull/6082#issuecomment-409054911
 
 
   Run Java PreCommit
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129037)
Time Spent: 1h 20m  (was: 1h 10m)

> Flink support for portable side input
> -
>
> Key: BEAM-2930
> URL: https://issues.apache.org/jira/browse/BEAM-2930
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3744) Support full PubsubMessages

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-3744?focusedWorklogId=129033=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129033
 ]

ASF GitHub Bot logged work on BEAM-3744:


Author: ASF GitHub Bot
Created on: 31/Jul/18 00:16
Start Date: 31/Jul/18 00:16
Worklog Time Spent: 10m 
  Work Description: aaltay commented on issue #5952: [BEAM-3744] Python 
PubSub API Fixes and Tests
URL: https://github.com/apache/beam/pull/5952#issuecomment-409052837
 
 
   Thank you! I appreciate this.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129033)
Time Spent: 10h 40m  (was: 10.5h)

> Support full PubsubMessages
> ---
>
> Key: BEAM-3744
> URL: https://issues.apache.org/jira/browse/BEAM-3744
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Critical
> Fix For: 2.7.0
>
>  Time Spent: 10h 40m
>  Remaining Estimate: 0h
>
> Tracking changes to Pubsub support in Python SDK.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4793) Enable schemas for all runners

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4793?focusedWorklogId=129032=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129032
 ]

ASF GitHub Bot logged work on BEAM-4793:


Author: ASF GitHub Bot
Created on: 31/Jul/18 00:16
Start Date: 31/Jul/18 00:16
Worklog Time Spent: 10m 
  Work Description: reuvenlax commented on issue #6103: [BEAM-4793] Enable 
schemas for Dataflow runner.
URL: https://github.com/apache/beam/pull/6103#issuecomment-409052734
 
 
   run Dataflow ValidatesRunner


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129032)
Time Spent: 0.5h  (was: 20m)

> Enable schemas for all runners
> --
>
> Key: BEAM-4793
> URL: https://issues.apache.org/jira/browse/BEAM-4793
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently schemas are only enabled in the direct runner.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4793) Enable schemas for all runners

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4793?focusedWorklogId=129030=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129030
 ]

ASF GitHub Bot logged work on BEAM-4793:


Author: ASF GitHub Bot
Created on: 31/Jul/18 00:15
Start Date: 31/Jul/18 00:15
Worklog Time Spent: 10m 
  Work Description: reuvenlax opened a new pull request #6103: [BEAM-4793] 
Enable schemas for Dataflow runner.
URL: https://github.com/apache/beam/pull/6103
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129030)
Time Spent: 10m
Remaining Estimate: 0h

> Enable schemas for all runners
> --
>
> Key: BEAM-4793
> URL: https://issues.apache.org/jira/browse/BEAM-4793
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently schemas are only enabled in the direct runner.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4793) Enable schemas for all runners

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4793?focusedWorklogId=129031=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129031
 ]

ASF GitHub Bot logged work on BEAM-4793:


Author: ASF GitHub Bot
Created on: 31/Jul/18 00:15
Start Date: 31/Jul/18 00:15
Worklog Time Spent: 10m 
  Work Description: holdensmagicalunicorn commented on issue #6103: 
[BEAM-4793] Enable schemas for Dataflow runner.
URL: https://github.com/apache/beam/pull/6103#issuecomment-409052641
 
 
   @reuvenlax, thanks! I am a bot who has found some folks who might be able to help with the review: @jkff, @herohde and @lukecwik


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129031)
Time Spent: 20m  (was: 10m)

> Enable schemas for all runners
> --
>
> Key: BEAM-4793
> URL: https://issues.apache.org/jira/browse/BEAM-4793
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently schemas are only enabled in the direct runner.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PreCommit_Java_Cron #167

2018-07-30 Thread Apache Jenkins Server
See 


Changes:

[ehudm] Make with_attributes kwarg optional.

[ehudm] Add decode and encode steps to streaming_wordcount

[qinyeli] Interactive Beam -- display update

[ehudm] Fix PubSubMessageMatcher not acking messages.

[pablo] Add GitScm poll trigger for post-commit tests.

--
[...truncated 16.63 MB...]
org.apache.beam.sdk.nexmark.queries.sql.SqlQuery5Test > testBids STANDARD_ERROR
Jul 31, 2018 12:11:29 AM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: SQL:
SELECT `AuctionBids`.`auction`, `AuctionBids`.`num`
FROM (SELECT `B1`.`auction`, COUNT(*) AS `num`, HOP_START(`B1`.`dateTime`, 
INTERVAL '5' SECOND, INTERVAL '10' SECOND) AS `starttime`
FROM `beam`.`Bid` AS `B1`
GROUP BY `B1`.`auction`, HOP(`B1`.`dateTime`, INTERVAL '5' SECOND, INTERVAL 
'10' SECOND)) AS `AuctionBids`
INNER JOIN (SELECT MAX(`CountBids`.`num`) AS `maxnum`, 
`CountBids`.`starttime`
FROM (SELECT COUNT(*) AS `num`, HOP_START(`B2`.`dateTime`, INTERVAL '5' 
SECOND, INTERVAL '10' SECOND) AS `starttime`
FROM `beam`.`Bid` AS `B2`
GROUP BY `B2`.`auction`, HOP(`B2`.`dateTime`, INTERVAL '5' SECOND, INTERVAL 
'10' SECOND)) AS `CountBids`
GROUP BY `CountBids`.`starttime`) AS `MaxBids` ON `AuctionBids`.`starttime` 
= `MaxBids`.`starttime` AND `AuctionBids`.`num` >= `MaxBids`.`maxnum`
Jul 31, 2018 12:11:29 AM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: SQLPlan>
LogicalProject(auction=[$0], num=[$1])
  LogicalJoin(condition=[AND(=($2, $4), >=($1, $3))], joinType=[inner])
LogicalProject(auction=[$0], num=[$2], starttime=[$1])
  LogicalAggregate(group=[{0, 1}], num=[COUNT()])
LogicalProject(auction=[$0], $f1=[HOP($3, 5000, 1)])
  BeamIOSourceRel(table=[[beam, Bid]])
LogicalProject(maxnum=[$1], starttime=[$0])
  LogicalAggregate(group=[{0}], maxnum=[MAX($1)])
LogicalProject(starttime=[$1], num=[$0])
  LogicalProject(num=[$2], starttime=[$1])
LogicalAggregate(group=[{0, 1}], num=[COUNT()])
  LogicalProject(auction=[$0], $f1=[HOP($3, 5000, 1)])
BeamIOSourceRel(table=[[beam, Bid]])

Jul 31, 2018 12:11:29 AM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: BEAMPlan>
BeamCalcRel(expr#0..4=[{inputs}], proj#0..1=[{exprs}])
  BeamJoinRel(condition=[AND(=($2, $4), >=($1, $3))], joinType=[inner])
BeamCalcRel(expr#0..2=[{inputs}], auction=[$t0], num=[$t2], 
starttime=[$t1])
  BeamAggregationRel(group=[{0, 1}], num=[COUNT()])
BeamCalcRel(expr#0..4=[{inputs}], expr#5=[5000], expr#6=[1], 
expr#7=[HOP($t3, $t5, $t6)], auction=[$t0], $f1=[$t7])
  BeamIOSourceRel(table=[[beam, Bid]])
BeamCalcRel(expr#0..1=[{inputs}], maxnum=[$t1], starttime=[$t0])
  BeamAggregationRel(group=[{1}], maxnum=[MAX($0)])
BeamCalcRel(expr#0..2=[{inputs}], num=[$t2], starttime=[$t1])
  BeamAggregationRel(group=[{0, 1}], num=[COUNT()])
BeamCalcRel(expr#0..4=[{inputs}], expr#5=[5000], 
expr#6=[1], expr#7=[HOP($t3, $t5, $t6)], auction=[$t0], $f1=[$t7])
  BeamIOSourceRel(table=[[beam, Bid]])


org.apache.beam.sdk.nexmark.queries.sql.SqlQuery3Test > 
testJoinsPeopleWithAuctions STANDARD_ERROR
Jul 31, 2018 12:11:30 AM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: SQL:
SELECT `P`.`name`, `P`.`city`, `P`.`state`, `A`.`id`
FROM `beam`.`Auction` AS `A`
INNER JOIN `beam`.`Person` AS `P` ON `A`.`seller` = `P`.`id`
WHERE `A`.`category` = 10 AND (`P`.`state` = 'OR' OR `P`.`state` = 'ID' OR 
`P`.`state` = 'CA')
Jul 31, 2018 12:11:30 AM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: SQLPlan>
LogicalProject(name=[$11], city=[$14], state=[$15], id=[$0])
  LogicalFilter(condition=[AND(=($8, 10), OR(=($15, 'OR'), =($15, 'ID'), 
=($15, 'CA')))])
LogicalJoin(condition=[=($7, $10)], joinType=[inner])
  BeamIOSourceRel(table=[[beam, Auction]])
  BeamIOSourceRel(table=[[beam, Person]])

Jul 31, 2018 12:11:30 AM 
org.apache.beam.sdk.extensions.sql.impl.BeamQueryPlanner convertToBeamRel
INFO: BEAMPlan>
BeamCalcRel(expr#0..17=[{inputs}], name=[$t11], city=[$t14], state=[$t15], 
id=[$t0])
  BeamJoinRel(condition=[=($7, $10)], joinType=[inner])
BeamCalcRel(expr#0..9=[{inputs}], expr#10=[10], expr#11=[=($t8, $t10)], 
proj#0..9=[{exprs}], $condition=[$t11])
  BeamIOSourceRel(table=[[beam, Auction]])
BeamCalcRel(expr#0..7=[{inputs}], expr#8=['OR'], expr#9=[=($t5, $t8)], 
expr#10=['ID'], expr#11=[=($t5, $t10)], expr#12=['CA'], expr#13=[=($t5, $t12)], 
expr#14=[OR($t9, $t11, $t13)], 

[beam] 01/01: Merge pull request #6089 from qinyeli/display2

2018-07-30 Thread ccy
This is an automated email from the ASF dual-hosted git repository.

ccy pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 48c3d04f46d40eb81e287e849e3862fedf9de297
Merge: 531dd8b 39a9fff
Author: Charles Chen 
AuthorDate: Mon Jul 30 16:37:06 2018 -0700

Merge pull request #6089 from qinyeli/display2

Interactive Beam -- display update

 .../runners/interactive/display_manager.py | 91 --
 .../interactive/interactive_pipeline_graph.py  | 56 +++--
 .../runners/interactive/pipeline_graph.py  | 40 ++
 3 files changed, 107 insertions(+), 80 deletions(-)



[beam] branch master updated (531dd8b -> 48c3d04)

2018-07-30 Thread ccy
This is an automated email from the ASF dual-hosted git repository.

ccy pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 531dd8b  Merge pull request #6085 from udim/beam-5039
 add 39a9fff  Interactive Beam -- display update
 new 48c3d04  Merge pull request #6089 from qinyeli/display2

The 1 revision listed above as "new" is entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../runners/interactive/display_manager.py | 91 --
 .../interactive/interactive_pipeline_graph.py  | 56 +++--
 .../runners/interactive/pipeline_graph.py  | 40 ++
 3 files changed, 107 insertions(+), 80 deletions(-)



[jira] [Work logged] (BEAM-5039) python postcommit broken in call to WriteToPubSub

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5039?focusedWorklogId=129013=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129013
 ]

ASF GitHub Bot logged work on BEAM-5039:


Author: ASF GitHub Bot
Created on: 30/Jul/18 23:23
Start Date: 30/Jul/18 23:23
Worklog Time Spent: 10m 
  Work Description: charlesccychen commented on issue #6085: [BEAM-5039] 
Fix streaming_wordcount_it_test
URL: https://github.com/apache/beam/pull/6085#issuecomment-409043745
 
 
   Thanks! This LGTM.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129013)
Time Spent: 50m  (was: 40m)

> python postcommit broken in call to WriteToPubSub
> -
>
> Key: BEAM-5039
> URL: https://issues.apache.org/jira/browse/BEAM-5039
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> ERROR: test_streaming_wordcount_it 
> (apache_beam.examples.streaming_wordcount_it_test.StreamingWordCountIT)
> --
> Traceback (most recent call last):
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/apache_beam/examples/streaming_wordcount_it_test.py",
>  line 105, in test_streaming_wordcount_it
> self.test_pipeline.get_full_options_as_args(**extra_opts))
>   File 
> "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/apache_beam/examples/streaming_wordcount.py",
>  line 90, in run
> output | beam.io.WriteToPubSub(known_args.output_topic)
> TypeError: __init__() takes at least 3 arguments (2 given)
> https://builds.apache.org/job/beam_PostCommit_Python_Verify/5597/consoleText
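
For context, the TypeError above is Python 2 counting self: with with_attributes
required and positional, calling WriteToPubSub with only a topic supplies 2 of the
3 required arguments. An illustrative sketch (standalone classes, not the real
pubsub module) of the constructor shape before and after the fix:

class WriteToPubSubBefore(object):
  # Shape before the fix: with_attributes is a required positional argument.
  def __init__(self, topic, with_attributes, id_label=None):
    self.topic = topic
    self.with_attributes = with_attributes


class WriteToPubSubAfter(object):
  # Shape after the fix: with_attributes defaults to False, so one-argument
  # call sites such as WriteToPubSub(known_args.output_topic) keep working.
  def __init__(self, topic, with_attributes=False, id_label=None):
    self.topic = topic
    self.with_attributes = with_attributes


try:
  WriteToPubSubBefore('projects/example-project/topics/example-topic')
except TypeError as e:
  print(e)  # on Python 2: __init__() takes at least 3 arguments (2 given)

WriteToPubSubAfter('projects/example-project/topics/example-topic')  # works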



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5039) python postcommit broken in call to WriteToPubSub

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5039?focusedWorklogId=129014=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129014
 ]

ASF GitHub Bot logged work on BEAM-5039:


Author: ASF GitHub Bot
Created on: 30/Jul/18 23:23
Start Date: 30/Jul/18 23:23
Worklog Time Spent: 10m 
  Work Description: charlesccychen closed pull request #6085: [BEAM-5039] 
Fix streaming_wordcount_it_test
URL: https://github.com/apache/beam/pull/6085
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/sdks/python/apache_beam/examples/streaming_wordcount.py 
b/sdks/python/apache_beam/examples/streaming_wordcount.py
index db3b97cda18..2bb8e4fc27a 100644
--- a/sdks/python/apache_beam/examples/streaming_wordcount.py
+++ b/sdks/python/apache_beam/examples/streaming_wordcount.py
@@ -60,10 +60,16 @@ def run(argv=None):
 
   # Read from PubSub into a PCollection.
   if known_args.input_subscription:
-    lines = p | beam.io.ReadFromPubSub(
-        subscription=known_args.input_subscription)
+    messages = (p
+                | beam.io.ReadFromPubSub(
+                    subscription=known_args.input_subscription)
+                .with_output_types(six.binary_type))
   else:
-    lines = p | beam.io.ReadFromPubSub(topic=known_args.input_topic)
+    messages = (p
+                | beam.io.ReadFromPubSub(topic=known_args.input_topic)
+                .with_output_types(six.binary_type))
+
+  lines = messages | 'decode' >> beam.Map(lambda x: x.decode('utf-8'))
 
   # Count the occurrences of each word.
   def count_ones(word_ones):
@@ -83,7 +89,10 @@ def format_result(word_count):
     (word, count) = word_count
     return '%s: %d' % (word, count)
 
-  output = counts | 'format' >> beam.Map(format_result)
+  output = (counts
+            | 'format' >> beam.Map(format_result)
+            | 'encode' >> beam.Map(lambda x: x.encode('utf-8'))
+            .with_output_types(six.binary_type))
 
 
   # Write to PubSub.
   # pylint: disable=expression-not-assigned
diff --git a/sdks/python/apache_beam/io/gcp/pubsub.py 
b/sdks/python/apache_beam/io/gcp/pubsub.py
index a8e8d4de81a..3b65fac80e7 100644
--- a/sdks/python/apache_beam/io/gcp/pubsub.py
+++ b/sdks/python/apache_beam/io/gcp/pubsub.py
@@ -187,7 +187,7 @@ def to_runner_api_parameter(self, context):
     return self.to_runner_api_pickled(context)
 
 
-@deprecated(since='2.6.0', extra_message='Use ReadFromPubSub instead.')
+@deprecated(since='2.7.0', extra_message='Use ReadFromPubSub instead.')
 def ReadStringsFromPubSub(topic=None, subscription=None, id_label=None):
   return _ReadStringsFromPubSub(topic, subscription, id_label)
 
@@ -210,7 +210,7 @@ def expand(self, pvalue):
     return p
 
 
-@deprecated(since='2.6.0', extra_message='Use WriteToPubSub instead.')
+@deprecated(since='2.7.0', extra_message='Use WriteToPubSub instead.')
 def WriteStringsToPubSub(topic):
   return _WriteStringsToPubSub(topic)
 
@@ -238,7 +238,7 @@ class WriteToPubSub(PTransform):
   """A ``PTransform`` for writing messages to Cloud Pub/Sub."""
   # Implementation note: This ``PTransform`` is overridden by Directrunner.
 
-  def __init__(self, topic, with_attributes, id_label=None,
+  def __init__(self, topic, with_attributes=False, id_label=None,
               timestamp_attribute=None):
     """Initializes ``WriteToPubSub``.
 
diff --git a/sdks/python/apache_beam/io/gcp/tests/pubsub_matcher.py 
b/sdks/python/apache_beam/io/gcp/tests/pubsub_matcher.py
index 9890dcd92e1..6217faf569d 100644
--- a/sdks/python/apache_beam/io/gcp/tests/pubsub_matcher.py
+++ b/sdks/python/apache_beam/io/gcp/tests/pubsub_matcher.py
@@ -102,6 +102,7 @@ def _wait_for_messages(self, subscription, expected_num, timeout):
     while time.time() - start_time <= timeout:
       pulled = subscription.pull(max_messages=MAX_MESSAGES_IN_ONE_PULL)
       for ack_id, message in pulled:
+        subscription.acknowledge([ack_id])
         if not self.with_attributes:
           total_messages.append(message.data)
           continue
@@ -116,7 +117,6 @@ def _wait_for_messages(self, subscription, expected_num, timeout):
                           'expected attribute not found.')
         total_messages.append(msg)
 
-        subscription.acknowledge([ack_id])
       if len(total_messages) >= expected_num:
         return total_messages
       time.sleep(1)
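
Taken together, the patch above keeps Pub/Sub I/O working on bytes (decode right
after ReadFromPubSub, encode right before WriteToPubSub) and makes with_attributes
optional again. A minimal sketch of the resulting pipeline shape, with placeholder
paths and the counting steps elided (an illustration only, not the full
streaming_wordcount example):

import apache_beam as beam
import six


def build(p, input_subscription, output_topic):
  # Read raw bytes from Pub/Sub and decode to unicode for processing.
  messages = (p
              | beam.io.ReadFromPubSub(subscription=input_subscription)
              .with_output_types(six.binary_type))
  lines = messages | 'decode' >> beam.Map(lambda b: b.decode('utf-8'))

  # ... counting and formatting would go here, as in streaming_wordcount.py ...

  # Encode back to bytes before handing the elements to Pub/Sub.
  encoded = (lines
             | 'encode' >> beam.Map(lambda s: s.encode('utf-8'))
             .with_output_types(six.binary_type))
  # with_attributes now defaults to False, so only the topic is required.
  return encoded | beam.io.WriteToPubSub(output_topic)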


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue 

[beam] branch master updated (01414c5 -> 531dd8b)

2018-07-30 Thread ccy
This is an automated email from the ASF dual-hosted git repository.

ccy pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 01414c5  Add GitScm poll trigger for post-commit tests.
 add 20a9500  Make with_attributes kwarg optional.
 add 3b80159  Add decode and encode steps to streaming_wordcount
 add 5a9b7d9  Fix PubSubMessageMatcher not acking messages.
 new 531dd8b  Merge pull request #6085 from udim/beam-5039

The 1 revision listed above as "new" is entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 sdks/python/apache_beam/examples/streaming_wordcount.py | 17 +
 sdks/python/apache_beam/io/gcp/pubsub.py|  6 +++---
 sdks/python/apache_beam/io/gcp/tests/pubsub_matcher.py  |  2 +-
 3 files changed, 17 insertions(+), 8 deletions(-)



[beam] 01/01: Merge pull request #6085 from udim/beam-5039

2018-07-30 Thread ccy
This is an automated email from the ASF dual-hosted git repository.

ccy pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 531dd8bd4d3342717306bc3a3453d0d96664113c
Merge: 01414c5 5a9b7d9
Author: Charles Chen 
AuthorDate: Mon Jul 30 16:23:02 2018 -0700

Merge pull request #6085 from udim/beam-5039

[BEAM-5039] Fix streaming_wordcount_it_test

 sdks/python/apache_beam/examples/streaming_wordcount.py | 17 +
 sdks/python/apache_beam/io/gcp/pubsub.py|  6 +++---
 sdks/python/apache_beam/io/gcp/tests/pubsub_matcher.py  |  2 +-
 3 files changed, 17 insertions(+), 8 deletions(-)



[jira] [Work logged] (BEAM-5023) BeamFnDataGrpcClient should pass the worker_id when connecting to the RunnerHarness

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5023?focusedWorklogId=129012=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-129012
 ]

ASF GitHub Bot logged work on BEAM-5023:


Author: ASF GitHub Bot
Created on: 30/Jul/18 23:22
Start Date: 30/Jul/18 23:22
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #6066: [BEAM-5023] Send 
HarnessId in DataClient
URL: https://github.com/apache/beam/pull/6066#issuecomment-409043690
 
 
   R: @tweise 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 129012)
Time Spent: 40m  (was: 0.5h)

> BeamFnDataGrpcClient should pass the worker_id when connecting to the 
> RunnerHarness
> ---
>
> Key: BEAM-5023
> URL: https://issues.apache.org/jira/browse/BEAM-5023
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-harness
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (BEAM-5047) Make clean cleanup go vendor directories for Beam

2018-07-30 Thread holdenk (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

holdenk closed BEAM-5047.
-
   Resolution: Cannot Reproduce
Fix Version/s: Not applicable

> Make clean cleanup go vendor directories for Beam
> -
>
> Key: BEAM-5047
> URL: https://issues.apache.org/jira/browse/BEAM-5047
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: holdenk
>Assignee: Henning Rohde
>Priority: Trivial
> Fix For: Not applicable
>
>
> I got into a state when building where an old version of the file I was 
> working on was cached, which was super confusing since clean didn't clean it 
> up. The Gradle Go plugin seems to have its own notion of clean that we could try?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5047) Make clean cleanup go vendor directories for Beam

2018-07-30 Thread holdenk (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562607#comment-16562607
 ] 

holdenk commented on BEAM-5047:
---

So it seems to be working now, but it wasn't before. I'm going to close this 
issue for now, and if I run into it again I'll re-open and take a look.

> Make clean cleanup go vendor directories for Beam
> -
>
> Key: BEAM-5047
> URL: https://issues.apache.org/jira/browse/BEAM-5047
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: holdenk
>Assignee: Henning Rohde
>Priority: Trivial
> Fix For: Not applicable
>
>
> I got into a state when building where an old version of the file I was 
> working on was cached, which was super confusing since clean didn't clean it 
> up. The Gradle Go plugin seems to have its own notion of clean that we could try?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5048) Document common build pattern for SDKs

2018-07-30 Thread holdenk (JIRA)
holdenk created BEAM-5048:
-

 Summary: Document common build pattern for SDKs
 Key: BEAM-5048
 URL: https://issues.apache.org/jira/browse/BEAM-5048
 Project: Beam
  Issue Type: Improvement
  Components: sdk-go, sdk-py-core
Reporter: holdenk
Assignee: Henning Rohde


It appears some of the devs don't build their SDKs with Gradle during their 
normal development cycle. It would be good to have the customary dev build 
instructions in the respective README.md files in each SDK directory.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5047) Make clean cleanup go vendor directories for Beam

2018-07-30 Thread holdenk (JIRA)
holdenk created BEAM-5047:
-

 Summary: Make clean cleanup go vendor directories for Beam
 Key: BEAM-5047
 URL: https://issues.apache.org/jira/browse/BEAM-5047
 Project: Beam
  Issue Type: Improvement
  Components: sdk-go
Reporter: holdenk
Assignee: Henning Rohde


I got into a state when building where an old version of the file I was working 
on was cached, which was super confusing since clean didn't clean it up. The 
Gradle Go plugin seems to have its own notion of clean that we could try?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4808) Add an integration test for BeamSqlLine

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4808?focusedWorklogId=128982=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128982
 ]

ASF GitHub Bot logged work on BEAM-4808:


Author: ASF GitHub Bot
Created on: 30/Jul/18 22:11
Start Date: 30/Jul/18 22:11
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on a change in pull request #6006: 
[BEAM-4808][SQL] add e2e test for BeamSqlLine.
URL: https://github.com/apache/beam/pull/6006#discussion_r206337441
 
 

 ##
 File path: 
sdks/java/extensions/sql/jdbc/src/test/java/org/apache/beam/sdk/extensions/sql/jdbc/BeamSqlLineIT.java
 ##
 @@ -0,0 +1,328 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.sql.jdbc;
+
+import static java.nio.charset.StandardCharsets.UTF_8;
+import static 
org.apache.beam.sdk.extensions.sql.jdbc.BeamSqlLineTestingUtils.buildArgs;
+import static 
org.apache.beam.sdk.extensions.sql.jdbc.BeamSqlLineTestingUtils.toLines;
+import static 
org.apache.beam.sdk.io.gcp.pubsub.PubsubClient.projectPathFromPath;
+import static org.hamcrest.CoreMatchers.everyItem;
+import static org.junit.Assert.assertThat;
+import static org.junit.Assert.assertTrue;
+
+import com.alibaba.fastjson.JSONObject;
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.Serializable;
+import java.text.ParseException;
+import java.text.SimpleDateFormat;
+import java.util.Arrays;
+import java.util.List;
+import java.util.TimeZone;
+import java.util.concurrent.Callable;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.TimeUnit;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.pubsub.PubsubClient.SubscriptionPath;
+import org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage;
+import org.apache.beam.sdk.io.gcp.pubsub.TestPubsub;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.hamcrest.collection.IsIn;
+import org.junit.BeforeClass;
+import org.junit.Rule;
+import org.junit.Test;
+
+/** BeamSqlLine integration tests. */
+public class BeamSqlLineIT implements Serializable {
+
+  @Rule public transient TestPubsub eventsTopic = TestPubsub.create();
+
+  private static String project =
+  TestPipeline.testingPipelineOptions().as(GcpOptions.class).getProject();
+  private static String createPubsubTableStatement;
+  private static String setProject;
+  private static PubsubMessageJSONStringConstructor constructor;
+  private static final String publicTopic = 
"projects/pubsub-public-data/topics/taxirides-realtime";
+
+  @BeforeClass
+  public static void setUp() {
+setProject = String.format("SET project = '%s';", project);
+
+createPubsubTableStatement =
+"CREATE TABLE taxi_rides (\n"
++ " event_timestamp TIMESTAMP,\n"
++ " attributes MAP,\n"
++ " payload ROW<\n"
++ "   ride_id VARCHAR,\n"
++ "   point_idx INT,\n"
++ "   latitude DOUBLE,\n"
++ "   longitude DOUBLE,\n"
++ "   meter_reading DOUBLE,\n"
++ "   meter_increment DOUBLE,\n"
++ "   ride_status VARCHAR,\n"
++ "   passenger_count TINYINT>)\n"
++ "   TYPE pubsub \n"
++ "   LOCATION '%s'\n"
++ "   TBLPROPERTIES '{\"timestampAttributeKey\": \"ts\"}';";
+
+constructor =
+new PubsubMessageJSONStringConstructor(
+"ride_id",
+"point_idx",
+"latitude",
+"longitude",
+"meter_reading",
+"meter_increment",
+"ride_status",
+"passenger_count");
+  }
+
+  @Test
+  public void testSelectFromPubsub() throws Exception {
+ExecutorService pool = Executors.newFixedThreadPool(1);
+
+Future<List<List<String>>> expectedResult =
+

[jira] [Work logged] (BEAM-4808) Add an integration test for BeamSqlLine

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4808?focusedWorklogId=128978=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128978
 ]

ASF GitHub Bot logged work on BEAM-4808:


Author: ASF GitHub Bot
Created on: 30/Jul/18 22:08
Start Date: 30/Jul/18 22:08
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on a change in pull request #6006: 
[BEAM-4808][SQL] add e2e test for BeamSqlLine.
URL: https://github.com/apache/beam/pull/6006#discussion_r206337441
 
 

 ##
 File path: 
sdks/java/extensions/sql/jdbc/src/test/java/org/apache/beam/sdk/extensions/sql/jdbc/BeamSqlLineIT.java
 ##
 @@ -0,0 +1,328 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.sql.jdbc;
+
+import static java.nio.charset.StandardCharsets.UTF_8;
+import static 
org.apache.beam.sdk.extensions.sql.jdbc.BeamSqlLineTestingUtils.buildArgs;
+import static 
org.apache.beam.sdk.extensions.sql.jdbc.BeamSqlLineTestingUtils.toLines;
+import static 
org.apache.beam.sdk.io.gcp.pubsub.PubsubClient.projectPathFromPath;
+import static org.hamcrest.CoreMatchers.everyItem;
+import static org.junit.Assert.assertThat;
+import static org.junit.Assert.assertTrue;
+
+import com.alibaba.fastjson.JSONObject;
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.Serializable;
+import java.text.ParseException;
+import java.text.SimpleDateFormat;
+import java.util.Arrays;
+import java.util.List;
+import java.util.TimeZone;
+import java.util.concurrent.Callable;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.TimeUnit;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.pubsub.PubsubClient.SubscriptionPath;
+import org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage;
+import org.apache.beam.sdk.io.gcp.pubsub.TestPubsub;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.hamcrest.collection.IsIn;
+import org.junit.BeforeClass;
+import org.junit.Rule;
+import org.junit.Test;
+
+/** BeamSqlLine integration tests. */
+public class BeamSqlLineIT implements Serializable {
+
+  @Rule public transient TestPubsub eventsTopic = TestPubsub.create();
+
+  private static String project =
+  TestPipeline.testingPipelineOptions().as(GcpOptions.class).getProject();
+  private static String createPubsubTableStatement;
+  private static String setProject;
+  private static PubsubMessageJSONStringConstructor constructor;
+  private static final String publicTopic = 
"projects/pubsub-public-data/topics/taxirides-realtime";
+
+  @BeforeClass
+  public static void setUp() {
+setProject = String.format("SET project = '%s';", project);
+
+createPubsubTableStatement =
+"CREATE TABLE taxi_rides (\n"
++ " event_timestamp TIMESTAMP,\n"
++ " attributes MAP,\n"
++ " payload ROW<\n"
++ "   ride_id VARCHAR,\n"
++ "   point_idx INT,\n"
++ "   latitude DOUBLE,\n"
++ "   longitude DOUBLE,\n"
++ "   meter_reading DOUBLE,\n"
++ "   meter_increment DOUBLE,\n"
++ "   ride_status VARCHAR,\n"
++ "   passenger_count TINYINT>)\n"
++ "   TYPE pubsub \n"
++ "   LOCATION '%s'\n"
++ "   TBLPROPERTIES '{\"timestampAttributeKey\": \"ts\"}';";
+
+constructor =
+new PubsubMessageJSONStringConstructor(
+"ride_id",
+"point_idx",
+"latitude",
+"longitude",
+"meter_reading",
+"meter_increment",
+"ride_status",
+"passenger_count");
+  }
+
+  @Test
+  public void testSelectFromPubsub() throws Exception {
+ExecutorService pool = Executors.newFixedThreadPool(1);
+
+Future<List<List<String>>> expectedResult =
+

[jira] [Work logged] (BEAM-4828) SQS Source

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4828?focusedWorklogId=128976=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128976
 ]

ASF GitHub Bot logged work on BEAM-4828:


Author: ASF GitHub Bot
Created on: 30/Jul/18 22:06
Start Date: 30/Jul/18 22:06
Worklog Time Spent: 10m 
  Work Description: JohnRudolfLewis commented on issue #6101: [BEAM-4828] 
Add SqsIO source and sink
URL: https://github.com/apache/beam/pull/6101#issuecomment-409027581
 
 
   @lukecwik @iemejia @jbonofre 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 128976)
Time Spent: 2h 20m  (was: 2h 10m)

> SQS Source
> --
>
> Key: BEAM-4828
> URL: https://issues.apache.org/jira/browse/BEAM-4828
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-aws
>Reporter: John Rudolf Lewis
>Assignee: John Rudolf Lewis
>Priority: Major
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Add an SQS source
>  
> For people who would like to follow progress or help out: 
> [https://github.com/JohnRudolfLewis/beam/tree/Add-SqsIO]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5030) Consolidate defer overhead per bundle

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5030?focusedWorklogId=128975=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128975
 ]

ASF GitHub Bot logged work on BEAM-5030:


Author: ASF GitHub Bot
Created on: 30/Jul/18 22:05
Start Date: 30/Jul/18 22:05
Worklog Time Spent: 10m 
  Work Description: holdenk opened a new pull request #6102: [BEAM-5030]  
Consolidate defer overhead per bundle
URL: https://github.com/apache/beam/pull/6102
 
 
   Directly call Fn since we're already wrapped in plan.go and the additional 
wrapping was adding a 3% overhead. A rough sketch of the pattern follows below.
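
   The change itself is in the Go SDK's plan.go; purely as an illustration of the
   pattern described above (paying the recovery wrapping cost once per bundle
   instead of once per element call), here is a rough Python sketch with made-up
   names, not Beam code:

   def process_wrapped_per_call(elements, fn):
     # Before: every element call goes through its own recovery wrapper,
     # analogous to a deferred recover around each Fn invocation.
     results = []
     for element in elements:
       try:
         results.append(fn(element))
       except Exception as exc:
         raise RuntimeError('DoFn call failed: %r' % (exc,))
     return results


   def process_wrapped_per_bundle(elements, fn):
     # After: one recovery wrapper around the whole bundle; the hot loop calls
     # fn directly, which is where the measured ~3% overhead goes away.
     try:
       return [fn(element) for element in elements]
     except Exception as exc:
       raise RuntimeError('bundle failed: %r' % (exc,))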
   
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [ X ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   It will help us expedite review of your Pull Request if you tag someone 
(e.g. `@username`) to look at it.
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | --- | --- | --- | ---
   
   
   
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 128975)
Time Spent: 10m
Remaining Estimate: 0h

> Consolidate defer overhead per bundle
> -
>
> Key: BEAM-5030
> URL: https://issues.apache.org/jira/browse/BEAM-5030
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Robert Burke
>Assignee: Robert Burke
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> At present, reflectx.CallNoPanic is 

[jira] [Work logged] (BEAM-4808) Add an integration test for BeamSqlLine

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4808?focusedWorklogId=128974=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128974
 ]

ASF GitHub Bot logged work on BEAM-4808:


Author: ASF GitHub Bot
Created on: 30/Jul/18 22:05
Start Date: 30/Jul/18 22:05
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on a change in pull request #6006: 
[BEAM-4808][SQL] add e2e test for BeamSqlLine.
URL: https://github.com/apache/beam/pull/6006#discussion_r206337441
 
 

 ##
 File path: 
sdks/java/extensions/sql/jdbc/src/test/java/org/apache/beam/sdk/extensions/sql/jdbc/BeamSqlLineIT.java
 ##
 @@ -0,0 +1,328 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.sql.jdbc;
+
+import static java.nio.charset.StandardCharsets.UTF_8;
+import static 
org.apache.beam.sdk.extensions.sql.jdbc.BeamSqlLineTestingUtils.buildArgs;
+import static 
org.apache.beam.sdk.extensions.sql.jdbc.BeamSqlLineTestingUtils.toLines;
+import static 
org.apache.beam.sdk.io.gcp.pubsub.PubsubClient.projectPathFromPath;
+import static org.hamcrest.CoreMatchers.everyItem;
+import static org.junit.Assert.assertThat;
+import static org.junit.Assert.assertTrue;
+
+import com.alibaba.fastjson.JSONObject;
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.Serializable;
+import java.text.ParseException;
+import java.text.SimpleDateFormat;
+import java.util.Arrays;
+import java.util.List;
+import java.util.TimeZone;
+import java.util.concurrent.Callable;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.TimeUnit;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.pubsub.PubsubClient.SubscriptionPath;
+import org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage;
+import org.apache.beam.sdk.io.gcp.pubsub.TestPubsub;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.hamcrest.collection.IsIn;
+import org.junit.BeforeClass;
+import org.junit.Rule;
+import org.junit.Test;
+
+/** BeamSqlLine integration tests. */
+public class BeamSqlLineIT implements Serializable {
+
+  @Rule public transient TestPubsub eventsTopic = TestPubsub.create();
+
+  private static String project =
+  TestPipeline.testingPipelineOptions().as(GcpOptions.class).getProject();
+  private static String createPubsubTableStatement;
+  private static String setProject;
+  private static PubsubMessageJSONStringConstructor constructor;
+  private static final String publicTopic = 
"projects/pubsub-public-data/topics/taxirides-realtime";
+
+  @BeforeClass
+  public static void setUp() {
+setProject = String.format("SET project = '%s';", project);
+
+createPubsubTableStatement =
+"CREATE TABLE taxi_rides (\n"
++ " event_timestamp TIMESTAMP,\n"
++ " attributes MAP,\n"
++ " payload ROW<\n"
++ "   ride_id VARCHAR,\n"
++ "   point_idx INT,\n"
++ "   latitude DOUBLE,\n"
++ "   longitude DOUBLE,\n"
++ "   meter_reading DOUBLE,\n"
++ "   meter_increment DOUBLE,\n"
++ "   ride_status VARCHAR,\n"
++ "   passenger_count TINYINT>)\n"
++ "   TYPE pubsub \n"
++ "   LOCATION '%s'\n"
++ "   TBLPROPERTIES '{\"timestampAttributeKey\": \"ts\"}';";
+
+constructor =
+new PubsubMessageJSONStringConstructor(
+"ride_id",
+"point_idx",
+"latitude",
+"longitude",
+"meter_reading",
+"meter_increment",
+"ride_status",
+"passenger_count");
+  }
+
+  @Test
+  public void testSelectFromPubsub() throws Exception {
+ExecutorService pool = Executors.newFixedThreadPool(1);
+
+Future<List<List<String>>> expectedResult =
+

[jira] [Work logged] (BEAM-4828) SQS Source

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4828?focusedWorklogId=128973=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128973
 ]

ASF GitHub Bot logged work on BEAM-4828:


Author: ASF GitHub Bot
Created on: 30/Jul/18 22:02
Start Date: 30/Jul/18 22:02
Worklog Time Spent: 10m 
  Work Description: JohnRudolfLewis opened a new pull request #6101: 
[BEAM-4828] Add SqsIO source and sink
URL: https://github.com/apache/beam/pull/6101
 
 
   Adds support for an SQS source and sink.
   
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | --- | --- | --- | ---
   
   
   
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 128973)
Time Spent: 2h 10m  (was: 2h)

> SQS Source
> --
>
> Key: BEAM-4828
> URL: https://issues.apache.org/jira/browse/BEAM-4828
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-aws
>Reporter: John Rudolf Lewis
>Assignee: John Rudolf Lewis
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Add an SQS source
>  
> For people who would like to follow progress or help out: 
> [https://github.com/JohnRudolfLewis/beam/tree/Add-SqsIO]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4749) fastavro breaks macos tests

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4749?focusedWorklogId=128970=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128970
 ]

ASF GitHub Bot logged work on BEAM-4749:


Author: ASF GitHub Bot
Created on: 30/Jul/18 21:48
Start Date: 30/Jul/18 21:48
Worklog Time Spent: 10m 
  Work Description: ryan-williams opened a new pull request #6100: 
[BEAM-4749] Revert "Install fastavro only in linux"
URL: https://github.com/apache/beam/pull/6100
 
 
   This reverts commit 21d3a7083425615ad7dc8fe07f8aced308a3634f (#5939).
   
   fastavro is now easily installable on macOS
   
   R: @aaltay 
   
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue, if applicable. This will automatically link the pull request to the 
issue.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   It will help us expedite review of your Pull Request if you tag someone 
(e.g. `@username`) to look at it.
   
   Post-Commit Tests Status (on master branch)
   

   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_GradleBuild/lastCompletedBuild/)
 | --- | --- | --- | --- | --- | ---
   Java | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_GradleBuild/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza_Gradle/lastCompletedBuild/)
 | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_Gradle/lastCompletedBuild/)
   Python | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_Verify/lastCompletedBuild/)
 | --- | [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)
  [![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/)
 | --- | --- | --- | ---
   
   
   
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 128970)
Time Spent: 40m  (was: 0.5h)

> fastavro breaks macos tests
> ---
>
> Key: BEAM-4749
> URL: https://issues.apache.org/jira/browse/BEAM-4749
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Ryan Williams
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Recent addition of the fastavro dependency 

[jira] [Work logged] (BEAM-4862) varint overflow -62135596800 exception with Cloud Spanner Timestamp 0001-01-01T00:00:00Z

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4862?focusedWorklogId=128968=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128968
 ]

ASF GitHub Bot logged work on BEAM-4862:


Author: ASF GitHub Bot
Created on: 30/Jul/18 21:45
Start Date: 30/Jul/18 21:45
Worklog Time Spent: 10m 
  Work Description: pabloem closed pull request #6095: [BEAM-4862] Fixes 
bug in Spanner's MutationGroupEncoder by converting timestamps into Long and 
not Int.
URL: https://github.com/apache/beam/pull/6095
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/MutationGroupEncoder.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/MutationGroupEncoder.java
index 77ede3ea058..4c97fac5074 100644
--- 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/MutationGroupEncoder.java
+++ 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/MutationGroupEncoder.java
@@ -478,7 +478,7 @@ private void decodePrimitive(
   if (isNull) {
 m.set(fieldName).to((Timestamp) null);
   } else {
-int seconds = VarInt.decodeInt(bis);
+long seconds = VarInt.decodeLong(bis);
 int nanoseconds = VarInt.decodeInt(bis);
 m.set(fieldName).to(Timestamp.ofTimeSecondsAndNanos(seconds, 
nanoseconds));
   }
diff --git 
a/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/spanner/MutationGroupEncoderTest.java
 
b/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/spanner/MutationGroupEncoderTest.java
index 2509f4d4c35..a600551ed76 100644
--- 
a/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/spanner/MutationGroupEncoderTest.java
+++ 
b/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/spanner/MutationGroupEncoderTest.java
@@ -528,6 +528,36 @@ public void dateKeys() throws Exception {
 verifyEncodedOrdering(schema, "test", keys);
   }
 
+  @Test
+  public void decodeBasicTimestampMutationGroup() {
+SpannerSchema spannerSchemaTimestamp =
+SpannerSchema.builder().addColumn("timestampTest", "timestamp", 
"TIMESTAMP").build();
+Timestamp timestamp1 = Timestamp.now();
+Mutation mutation1 =
+
Mutation.newInsertOrUpdateBuilder("timestampTest").set("timestamp").to(timestamp1).build();
+encodeAndVerify(g(mutation1), spannerSchemaTimestamp);
+
+Timestamp timestamp2 = Timestamp.parseTimestamp("2001-01-01T00:00:00Z");
+Mutation mutation2 =
+
Mutation.newInsertOrUpdateBuilder("timestampTest").set("timestamp").to(timestamp2).build();
+encodeAndVerify(g(mutation2), spannerSchemaTimestamp);
+  }
+
+  @Test
+  public void decodeMinAndMaxTimestampMutationGroup() {
+SpannerSchema spannerSchemaTimestamp =
+SpannerSchema.builder().addColumn("timestampTest", "timestamp", 
"TIMESTAMP").build();
+Timestamp timestamp1 = Timestamp.MIN_VALUE;
+Mutation mutation1 =
+
Mutation.newInsertOrUpdateBuilder("timestampTest").set("timestamp").to(timestamp1).build();
+encodeAndVerify(g(mutation1), spannerSchemaTimestamp);
+
+Timestamp timestamp2 = Timestamp.MAX_VALUE;
+Mutation mutation2 =
+
Mutation.newInsertOrUpdateBuilder("timestampTest").set("timestamp").to(timestamp2).build();
+encodeAndVerify(g(mutation2), spannerSchemaTimestamp);
+  }
+
   @Test
   public void timestampKeys() throws Exception {
 SpannerSchema.Builder builder = SpannerSchema.builder();


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 128968)
Time Spent: 2h 40m  (was: 2.5h)
Remaining Estimate: 21h 20m  (was: 21.5h)

> varint overflow -62135596800 exception with Cloud Spanner Timestamp 
> 0001-01-01T00:00:00Z
> 
>
> Key: BEAM-4862
> URL: https://issues.apache.org/jira/browse/BEAM-4862
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.5.0
>Reporter: Eric Beach
>Assignee: Chamikara Jayalath
>Priority: Minor
>   Original 

[jira] [Work logged] (BEAM-4862) varint overflow -62135596800 exception with Cloud Spanner Timestamp 0001-01-01T00:00:00Z

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4862?focusedWorklogId=128969=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128969
 ]

ASF GitHub Bot logged work on BEAM-4862:


Author: ASF GitHub Bot
Created on: 30/Jul/18 21:45
Start Date: 30/Jul/18 21:45
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #6095: [BEAM-4862] Fixes bug 
in Spanner's MutationGroupEncoder by converting timestamps into Long and not 
Int.
URL: https://github.com/apache/beam/pull/6095#issuecomment-409022104
 
 
   Merged. Only test failure is wordcount due to missing containers.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 128969)
Time Spent: 2h 50m  (was: 2h 40m)
Remaining Estimate: 21h 10m  (was: 21h 20m)

> varint overflow -62135596800 exception with Cloud Spanner Timestamp 
> 0001-01-01T00:00:00Z
> 
>
> Key: BEAM-4862
> URL: https://issues.apache.org/jira/browse/BEAM-4862
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.5.0
>Reporter: Eric Beach
>Assignee: Chamikara Jayalath
>Priority: Minor
>   Original Estimate: 24h
>  Time Spent: 2h 50m
>  Remaining Estimate: 21h 10m
>
> tl;dr - If you try to write a Timestamp of value "0001-01-01T00:00:00Z" as a 
> Spanner Mutation, you get an overflow error.
>  
> The crux of the issue appears to be that 0001-01-01T00:00:00Z, which is a 
> valid Timestamp per 
> [https://cloud.google.com/spanner/docs/data-types#timestamp-type], is too 
> large for an integer. See the two lines of code below. 
> [https://github.com/apache/beam/blob/release-2.5.0/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/MutationGroupEncoder.java#L453]
> [https://github.com/apache/beam/blob/279a05604b83a54e8e5a79e13d8761f94841f326/sdks/java/core/src/main/java/org/apache/beam/sdk/util/VarInt.java#L58]
>  
>  
> Stack Trade
> {{Caused by: java.io.IOException: varint overflow -62135596800 at 
> org.apache.beam.sdk.util.VarInt.decodeInt(VarInt.java:65) at 
> org.apache.beam.sdk.io.gcp.spanner.MutationGroupEncoder.decodePrimitive(MutationGroupEncoder.java:453)
>  at 
> org.apache.beam.sdk.io.gcp.spanner.MutationGroupEncoder.decodeModification(MutationGroupEncoder.java:326)
>  at 
> org.apache.beam.sdk.io.gcp.spanner.MutationGroupEncoder.decodeMutation(MutationGroupEncoder.java:280)
>  at 
> org.apache.beam.sdk.io.gcp.spanner.MutationGroupEncoder.decode(MutationGroupEncoder.java:264)
>  at 
> org.apache.beam.sdk.io.gcp.spanner.SpannerIO$BatchFn.processElement(SpannerIO.java:1030)
>  at 
> org.apache.beam.sdk.io.gcp.spanner.SpannerIO$BatchFn$DoFnInvoker.invokeProcessElement(Unknown
>  Source) at 
> org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:185)
>  at 
> org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:146)
>  at 
> com.google.cloud.dataflow.worker.SimpleParDoFn.processElement(SimpleParDoFn.java:323)
>  at 
> com.google.cloud.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:43)
>  at 
> com.google.cloud.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:48)
>  at 
> com.google.cloud.dataflow.worker.GroupAlsoByWindowsParDoFn$1.output(GroupAlsoByWindowsParDoFn.java:181)
>  at 
> com.google.cloud.dataflow.worker.GroupAlsoByWindowFnRunner$1.outputWindowedValue(GroupAlsoByWindowFnRunner.java:102)
>  at 
> com.google.cloud.dataflow.worker.util.BatchGroupAlsoByWindowViaIteratorsFn.processElement(BatchGroupAlsoByWindowViaIteratorsFn.java:124)
>  at 
> com.google.cloud.dataflow.worker.util.BatchGroupAlsoByWindowViaIteratorsFn.processElement(BatchGroupAlsoByWindowViaIteratorsFn.java:53)
>  at 
> com.google.cloud.dataflow.worker.GroupAlsoByWindowFnRunner.invokeProcessElement(GroupAlsoByWindowFnRunner.java:115)
>  at 
> com.google.cloud.dataflow.worker.GroupAlsoByWindowFnRunner.processElement(GroupAlsoByWindowFnRunner.java:73)
>  at 
> com.google.cloud.dataflow.worker.GroupAlsoByWindowsParDoFn.processElement(GroupAlsoByWindowsParDoFn.java:113)
>  at 
> com.google.cloud.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:43)
>  at 
> com.google.cloud.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:48)
>  at 
> 
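
As a quick sanity check of the numbers above (plain Python, not Beam code):
-62135596800, the seconds component of 0001-01-01T00:00:00Z, lies far outside the
signed 32-bit range that VarInt.decodeInt is limited to, which is why decoding the
seconds as a long fixes the overflow.

seconds = -62135596800                      # 0001-01-01T00:00:00Z relative to the Unix epoch
int32_min, int32_max = -2**31, 2**31 - 1
int64_min, int64_max = -2**63, 2**63 - 1

print(int32_min <= seconds <= int32_max)    # False: overflows a 32-bit int (decodeInt)
print(int64_min <= seconds <= int64_max)    # True:  fits in a 64-bit long (decodeLong)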

[beam] branch release-2.6.0 updated: [BEAM-4862] Fixes bug in Spanner's MutationGroupEncoder by converting timestamps into Long and not Int.

2018-07-30 Thread pabloem
This is an automated email from the ASF dual-hosted git repository.

pabloem pushed a commit to branch release-2.6.0
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/release-2.6.0 by this push:
 new 29dd649  [BEAM-4862] Fixes bug in Spanner's MutationGroupEncoder by 
converting timestamps into Long and not Int.
29dd649 is described below

commit 29dd64913804974f248036535f4c3af4b0dd4ba1
Author: Eric Beach 
AuthorDate: Mon Jul 30 12:24:18 2018 -0400

[BEAM-4862] Fixes bug in Spanner's MutationGroupEncoder by converting 
timestamps into Long and not Int.
---
 .../sdk/io/gcp/spanner/MutationGroupEncoder.java   |  2 +-
 .../io/gcp/spanner/MutationGroupEncoderTest.java   | 30 ++
 2 files changed, 31 insertions(+), 1 deletion(-)

diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/MutationGroupEncoder.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/MutationGroupEncoder.java
index 77ede3e..4c97fac 100644
--- 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/MutationGroupEncoder.java
+++ 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/spanner/MutationGroupEncoder.java
@@ -478,7 +478,7 @@ class MutationGroupEncoder {
   if (isNull) {
 m.set(fieldName).to((Timestamp) null);
   } else {
-int seconds = VarInt.decodeInt(bis);
+long seconds = VarInt.decodeLong(bis);
 int nanoseconds = VarInt.decodeInt(bis);
 m.set(fieldName).to(Timestamp.ofTimeSecondsAndNanos(seconds, 
nanoseconds));
   }
diff --git 
a/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/spanner/MutationGroupEncoderTest.java
 
b/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/spanner/MutationGroupEncoderTest.java
index 2509f4d..a600551 100644
--- 
a/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/spanner/MutationGroupEncoderTest.java
+++ 
b/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/spanner/MutationGroupEncoderTest.java
@@ -529,6 +529,36 @@ public class MutationGroupEncoderTest {
   }
 
   @Test
+  public void decodeBasicTimestampMutationGroup() {
+SpannerSchema spannerSchemaTimestamp =
+SpannerSchema.builder().addColumn("timestampTest", "timestamp", 
"TIMESTAMP").build();
+Timestamp timestamp1 = Timestamp.now();
+Mutation mutation1 =
+
Mutation.newInsertOrUpdateBuilder("timestampTest").set("timestamp").to(timestamp1).build();
+encodeAndVerify(g(mutation1), spannerSchemaTimestamp);
+
+Timestamp timestamp2 = Timestamp.parseTimestamp("2001-01-01T00:00:00Z");
+Mutation mutation2 =
+
Mutation.newInsertOrUpdateBuilder("timestampTest").set("timestamp").to(timestamp2).build();
+encodeAndVerify(g(mutation2), spannerSchemaTimestamp);
+  }
+
+  @Test
+  public void decodeMinAndMaxTimestampMutationGroup() {
+SpannerSchema spannerSchemaTimestamp =
+SpannerSchema.builder().addColumn("timestampTest", "timestamp", 
"TIMESTAMP").build();
+Timestamp timestamp1 = Timestamp.MIN_VALUE;
+Mutation mutation1 =
+
Mutation.newInsertOrUpdateBuilder("timestampTest").set("timestamp").to(timestamp1).build();
+encodeAndVerify(g(mutation1), spannerSchemaTimestamp);
+
+Timestamp timestamp2 = Timestamp.MAX_VALUE;
+Mutation mutation2 =
+
Mutation.newInsertOrUpdateBuilder("timestampTest").set("timestamp").to(timestamp2).build();
+encodeAndVerify(g(mutation2), spannerSchemaTimestamp);
+  }
+
+  @Test
   public void timestampKeys() throws Exception {
 SpannerSchema.Builder builder = SpannerSchema.builder();
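
A minimal, self-contained sketch (not part of the commit above; the class name is illustrative) of why the seconds field needs a long: com.google.cloud.Timestamp, as used by the Spanner IO, spans the years 0001 through 9999, so its seconds-since-epoch values fall far outside the 32-bit range that VarInt.decodeInt can return.

import com.google.cloud.Timestamp;

/** Sketch only, assuming com.google.cloud.Timestamp as used by the Spanner IO. */
public class TimestampSecondsRange {
  public static void main(String[] args) {
    long minSeconds = Timestamp.MIN_VALUE.getSeconds(); // 0001-01-01T00:00:00Z
    long maxSeconds = Timestamp.MAX_VALUE.getSeconds(); // 9999-12-31T23:59:59.999999999Z
    // Both magnitudes exceed Integer.MAX_VALUE (2147483647), so casting to int
    // silently truncates the value and corrupts the decoded timestamp.
    System.out.println(minSeconds + " fits in int: " + (minSeconds == (int) minSeconds));
    System.out.println(maxSeconds + " fits in int: " + (maxSeconds == (int) maxSeconds));
  }
}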
 



[jira] [Work logged] (BEAM-5040) BigQueryIO retries infinitely in WriteTable and WriteRename

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5040?focusedWorklogId=128958=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128958
 ]

ASF GitHub Bot logged work on BEAM-5040:


Author: ASF GitHub Bot
Created on: 30/Jul/18 21:38
Start Date: 30/Jul/18 21:38
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #6080: [BEAM-5040] Fix 
retry bug for BigQuery jobs.
URL: https://github.com/apache/beam/pull/6080#issuecomment-409020265
 
 
   LGTM


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 128958)
Time Spent: 1h 20m  (was: 1h 10m)

> BigQueryIO retries infinitely in WriteTable and WriteRename
> ---
>
> Key: BEAM-5040
> URL: https://issues.apache.org/jira/browse/BEAM-5040
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.5.0
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> BigQueryIO retries infinitely in WriteTable and WriteRename
> Several failure scenarios with the current code:
>  # It's possible for a load job to return failure even though it actually 
> succeeded (e.g. the reply might have timed out). In this case, BigQueryIO 
> will retry the job which will fail again (because the job id has already been 
> used), leading to indefinite retries. Correct behavior is to stop retrying as 
> the load job has succeeded.
>  # It's possible for a load job to be accepted by BigQuery, but then to fail 
> on the BigQuery side. In this case a retry with the same job id will fail as 
> that job id has already been used. BigQueryIO will sometimes detect this, but 
> if the worker has restarted it will instead issue a load with the old job id 
> and go into a retry loop. Correct behavior is to generate a new deterministic 
> job id and retry using that new job id.
>  # In many cases of worker restart, BigQueryIO ends up in infinite retry 
> loops.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
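
The second scenario above calls for generating a new deterministic job id when a retry is needed. A hypothetical sketch of that idea follows; it is not the actual change in PR #6080, and the class name, method name, and sample base id are placeholders.

/** Hypothetical sketch: keep the id deterministic across worker restarts by
 *  deriving it from a stable base plus the retry attempt index. */
public class RetryJobIds {
  static String jobIdForAttempt(String baseJobId, int attempt) {
    // Attempt 0 reuses the base id; later attempts get a new, still deterministic id,
    // so a retry never collides with an id BigQuery has already seen.
    return attempt == 0 ? baseJobId : baseJobId + "_" + attempt;
  }

  public static void main(String[] args) {
    System.out.println(jobIdForAttempt("beam_load_mytable_pane0", 0));
    System.out.println(jobIdForAttempt("beam_load_mytable_pane0", 3));
  }
}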


[jira] [Work logged] (BEAM-5023) BeamFnDataGrpcClient should pass the worker_id when connecting to the RunnerHarness

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5023?focusedWorklogId=128955=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128955
 ]

ASF GitHub Bot logged work on BEAM-5023:


Author: ASF GitHub Bot
Created on: 30/Jul/18 21:34
Start Date: 30/Jul/18 21:34
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #6066: [BEAM-5023] Send 
HarnessId in DataClient
URL: https://github.com/apache/beam/pull/6066#issuecomment-409019269
 
 
   Run Java Precommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 128955)
Time Spent: 0.5h  (was: 20m)

> BeamFnDataGrpcClient should pass the worker_id when connecting to the 
> RunnerHarness
> ---
>
> Key: BEAM-5023
> URL: https://issues.apache.org/jira/browse/BEAM-5023
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-harness
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4808) Add an integration test for BeamSqlLine

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4808?focusedWorklogId=128953=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128953
 ]

ASF GitHub Bot logged work on BEAM-4808:


Author: ASF GitHub Bot
Created on: 30/Jul/18 21:33
Start Date: 30/Jul/18 21:33
Worklog Time Spent: 10m 
  Work Description: akedin commented on a change in pull request #6006: 
[BEAM-4808][SQL] add e2e test for BeamSqlLine.
URL: https://github.com/apache/beam/pull/6006#discussion_r206329468
 
 

 ##
 File path: 
sdks/java/extensions/sql/jdbc/src/test/java/org/apache/beam/sdk/extensions/sql/jdbc/BeamSqlLineIT.java
 ##
 @@ -0,0 +1,328 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.sql.jdbc;
+
+import static java.nio.charset.StandardCharsets.UTF_8;
+import static 
org.apache.beam.sdk.extensions.sql.jdbc.BeamSqlLineTestingUtils.buildArgs;
+import static 
org.apache.beam.sdk.extensions.sql.jdbc.BeamSqlLineTestingUtils.toLines;
+import static 
org.apache.beam.sdk.io.gcp.pubsub.PubsubClient.projectPathFromPath;
+import static org.hamcrest.CoreMatchers.everyItem;
+import static org.junit.Assert.assertThat;
+import static org.junit.Assert.assertTrue;
+
+import com.alibaba.fastjson.JSONObject;
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.Serializable;
+import java.text.ParseException;
+import java.text.SimpleDateFormat;
+import java.util.Arrays;
+import java.util.List;
+import java.util.TimeZone;
+import java.util.concurrent.Callable;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.TimeUnit;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.pubsub.PubsubClient.SubscriptionPath;
+import org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage;
+import org.apache.beam.sdk.io.gcp.pubsub.TestPubsub;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.hamcrest.collection.IsIn;
+import org.junit.BeforeClass;
+import org.junit.Rule;
+import org.junit.Test;
+
+/** BeamSqlLine integration tests. */
+public class BeamSqlLineIT implements Serializable {
+
+  @Rule public transient TestPubsub eventsTopic = TestPubsub.create();
+
+  private static String project =
+  TestPipeline.testingPipelineOptions().as(GcpOptions.class).getProject();
+  private static String createPubsubTableStatement;
+  private static String setProject;
+  private static PubsubMessageJSONStringConstructor constructor;
+  private static final String publicTopic = 
"projects/pubsub-public-data/topics/taxirides-realtime";
+
+  @BeforeClass
+  public static void setUp() {
+setProject = String.format("SET project = '%s';", project);
+
+createPubsubTableStatement =
+"CREATE TABLE taxi_rides (\n"
++ " event_timestamp TIMESTAMP,\n"
++ " attributes MAP,\n"
++ " payload ROW<\n"
++ "   ride_id VARCHAR,\n"
++ "   point_idx INT,\n"
++ "   latitude DOUBLE,\n"
++ "   longitude DOUBLE,\n"
++ "   meter_reading DOUBLE,\n"
++ "   meter_increment DOUBLE,\n"
++ "   ride_status VARCHAR,\n"
++ "   passenger_count TINYINT>)\n"
++ "   TYPE pubsub \n"
++ "   LOCATION '%s'\n"
++ "   TBLPROPERTIES '{\"timestampAttributeKey\": \"ts\"}';";
+
+constructor =
+new PubsubMessageJSONStringConstructor(
+"ride_id",
+"point_idx",
+"latitude",
+"longitude",
+"meter_reading",
+"meter_increment",
+"ride_status",
+"passenger_count");
+  }
+
+  @Test
+  public void testSelectFromPubsub() throws Exception {
+ExecutorService pool = Executors.newFixedThreadPool(1);
+
+Future>> expectedResult =
+
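
The quoted test is cut off above. As a general illustration of the pattern it appears to set up (run the blocking query on a single-thread executor, publish test input, then collect the result with a timeout), here is a hypothetical, self-contained sketch using only the standard library; the names are placeholders, not the actual BeamSqlLineIT code.

import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class AsyncQuerySketch {
  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(1);
    // Stand-in for the blocking "SELECT ... FROM taxi_rides" call made by the test.
    Callable<List<String>> blockingQuery = () -> Arrays.asList("row1", "row2");
    Future<List<String>> expectedResult = pool.submit(blockingQuery);
    // ... the test would publish Pub/Sub messages here while the query runs ...
    List<String> rows = expectedResult.get(30, TimeUnit.SECONDS);
    System.out.println(rows);
    pool.shutdown();
  }
}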

[jira] [Work logged] (BEAM-2930) Flink support for portable side input

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2930?focusedWorklogId=128952=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128952
 ]

ASF GitHub Bot logged work on BEAM-2930:


Author: ASF GitHub Bot
Created on: 30/Jul/18 21:24
Start Date: 30/Jul/18 21:24
Worklog Time Spent: 10m 
  Work Description: tweise commented on issue #6082: [BEAM-2930] Side 
inputs are not yet supported in streaming mode.
URL: https://github.com/apache/beam/pull/6082#issuecomment-409016094
 
 
   Run Java PreCommit
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 128952)
Time Spent: 1h 10m  (was: 1h)

> Flink support for portable side input
> -
>
> Key: BEAM-2930
> URL: https://issues.apache.org/jira/browse/BEAM-2930
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4176) Java: Portable batch runner passes all ValidatesRunner tests that non-portable runner passes

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4176?focusedWorklogId=128950=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128950
 ]

ASF GitHub Bot logged work on BEAM-4176:


Author: ASF GitHub Bot
Created on: 30/Jul/18 21:21
Start Date: 30/Jul/18 21:21
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #6073: [BEAM-4176] Validate 
Runner Tests generalization and enable for local reference runner
URL: https://github.com/apache/beam/pull/6073#issuecomment-409014910
 
 
   @bsidhom Ping for the review :)


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 128950)
Time Spent: 6h 20m  (was: 6h 10m)

> Java: Portable batch runner passes all ValidatesRunner tests that 
> non-portable runner passes
> 
>
> Key: BEAM-4176
> URL: https://issues.apache.org/jira/browse/BEAM-4176
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Ben Sidhom
>Priority: Major
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> We need this as a sanity check that runner execution is correct.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2930) Flink support for portable side input

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2930?focusedWorklogId=128945=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128945
 ]

ASF GitHub Bot logged work on BEAM-2930:


Author: ASF GitHub Bot
Created on: 30/Jul/18 21:02
Start Date: 30/Jul/18 21:02
Worklog Time Spent: 10m 
  Work Description: tweise commented on a change in pull request #6082: 
[BEAM-2930] Side inputs are not yet supported in streaming mode.
URL: https://github.com/apache/beam/pull/6082#discussion_r206320308
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingPortablePipelineTranslator.java
 ##
 @@ -433,12 +433,17 @@ private void translateImpulse(
   throw new RuntimeException(e);
 }
 
-String inputPCollectionId = 
Iterables.getOnlyElement(transform.getInputsMap().values());
+String inputPCollectionId = stagePayload.getInput();
+// TODO: BEAM-2930
+if (stagePayload.getSideInputsCount() > 0) {
+  throw new UnsupportedOperationException(
+  "[BEAM-2930] streaming translator does not support side inputs: " + 
transform);
+}
 
 Map, OutputTag>> tagsToOutputTags = 
Maps.newLinkedHashMap();
 Map, Coder>> tagsToCoders = 
Maps.newLinkedHashMap();
 // TODO: does it matter which output we designate as "main"
-TupleTag mainOutputTag;
+final TupleTag mainOutputTag;
 
 Review comment:
   fixed


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 128945)
Time Spent: 50m  (was: 40m)

> Flink support for portable side input
> -
>
> Key: BEAM-2930
> URL: https://issues.apache.org/jira/browse/BEAM-2930
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4808) Add an integration test for BeamSqlLine

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4808?focusedWorklogId=128947=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128947
 ]

ASF GitHub Bot logged work on BEAM-4808:


Author: ASF GitHub Bot
Created on: 30/Jul/18 21:02
Start Date: 30/Jul/18 21:02
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on a change in pull request #6006: 
[BEAM-4808][SQL] add e2e test for BeamSqlLine.
URL: https://github.com/apache/beam/pull/6006#discussion_r206319736
 
 

 ##
 File path: 
sdks/java/extensions/sql/jdbc/src/test/java/org/apache/beam/sdk/extensions/sql/jdbc/BeamSqlLineIT.java
 ##
 @@ -0,0 +1,328 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.sql.jdbc;
+
+import static java.nio.charset.StandardCharsets.UTF_8;
+import static 
org.apache.beam.sdk.extensions.sql.jdbc.BeamSqlLineTestingUtils.buildArgs;
+import static 
org.apache.beam.sdk.extensions.sql.jdbc.BeamSqlLineTestingUtils.toLines;
+import static 
org.apache.beam.sdk.io.gcp.pubsub.PubsubClient.projectPathFromPath;
+import static org.hamcrest.CoreMatchers.everyItem;
+import static org.junit.Assert.assertThat;
+import static org.junit.Assert.assertTrue;
+
+import com.alibaba.fastjson.JSONObject;
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.Serializable;
+import java.text.ParseException;
+import java.text.SimpleDateFormat;
+import java.util.Arrays;
+import java.util.List;
+import java.util.TimeZone;
+import java.util.concurrent.Callable;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.TimeUnit;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.pubsub.PubsubClient.SubscriptionPath;
+import org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage;
+import org.apache.beam.sdk.io.gcp.pubsub.TestPubsub;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.hamcrest.collection.IsIn;
+import org.junit.BeforeClass;
+import org.junit.Rule;
+import org.junit.Test;
+
+/** BeamSqlLine integration tests. */
+public class BeamSqlLineIT implements Serializable {
+
+  @Rule public transient TestPubsub eventsTopic = TestPubsub.create();
+
+  private static String project =
+  TestPipeline.testingPipelineOptions().as(GcpOptions.class).getProject();
+  private static String createPubsubTableStatement;
+  private static String setProject;
+  private static PubsubMessageJSONStringConstructor constructor;
+  private static final String publicTopic = 
"projects/pubsub-public-data/topics/taxirides-realtime";
+
+  @BeforeClass
+  public static void setUp() {
+setProject = String.format("SET project = '%s';", project);
+
+createPubsubTableStatement =
+"CREATE TABLE taxi_rides (\n"
++ " event_timestamp TIMESTAMP,\n"
++ " attributes MAP,\n"
++ " payload ROW<\n"
++ "   ride_id VARCHAR,\n"
++ "   point_idx INT,\n"
++ "   latitude DOUBLE,\n"
++ "   longitude DOUBLE,\n"
++ "   meter_reading DOUBLE,\n"
++ "   meter_increment DOUBLE,\n"
++ "   ride_status VARCHAR,\n"
++ "   passenger_count TINYINT>)\n"
++ "   TYPE pubsub \n"
++ "   LOCATION '%s'\n"
++ "   TBLPROPERTIES '{\"timestampAttributeKey\": \"ts\"}';";
+
+constructor =
+new PubsubMessageJSONStringConstructor(
+"ride_id",
+"point_idx",
+"latitude",
+"longitude",
+"meter_reading",
+"meter_increment",
+"ride_status",
+"passenger_count");
+  }
+
+  @Test
+  public void testSelectFromPubsub() throws Exception {
+ExecutorService pool = Executors.newFixedThreadPool(1);
+
+Future>> expectedResult =
+

[jira] [Work logged] (BEAM-2930) Flink support for portable side input

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2930?focusedWorklogId=128946=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128946
 ]

ASF GitHub Bot logged work on BEAM-2930:


Author: ASF GitHub Bot
Created on: 30/Jul/18 21:02
Start Date: 30/Jul/18 21:02
Worklog Time Spent: 10m 
  Work Description: tweise commented on a change in pull request #6082: 
[BEAM-2930] Side inputs are not yet supported in streaming mode.
URL: https://github.com/apache/beam/pull/6082#discussion_r206320374
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingPortablePipelineTranslator.java
 ##
 @@ -433,12 +433,17 @@ private void translateImpulse(
   throw new RuntimeException(e);
 }
 
-String inputPCollectionId = 
Iterables.getOnlyElement(transform.getInputsMap().values());
+String inputPCollectionId = stagePayload.getInput();
+// TODO: BEAM-2930
 
 Review comment:
   fixed


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 128946)
Time Spent: 1h  (was: 50m)

> Flink support for portable side input
> -
>
> Key: BEAM-2930
> URL: https://issues.apache.org/jira/browse/BEAM-2930
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 1h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4808) Add an integration test for BeamSqlLine

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4808?focusedWorklogId=128944=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128944
 ]

ASF GitHub Bot logged work on BEAM-4808:


Author: ASF GitHub Bot
Created on: 30/Jul/18 21:00
Start Date: 30/Jul/18 21:00
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on a change in pull request #6006: 
[BEAM-4808][SQL] add e2e test for BeamSqlLine.
URL: https://github.com/apache/beam/pull/6006#discussion_r206319736
 
 

 ##
 File path: 
sdks/java/extensions/sql/jdbc/src/test/java/org/apache/beam/sdk/extensions/sql/jdbc/BeamSqlLineIT.java
 ##
 @@ -0,0 +1,328 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.sql.jdbc;
+
+import static java.nio.charset.StandardCharsets.UTF_8;
+import static 
org.apache.beam.sdk.extensions.sql.jdbc.BeamSqlLineTestingUtils.buildArgs;
+import static 
org.apache.beam.sdk.extensions.sql.jdbc.BeamSqlLineTestingUtils.toLines;
+import static 
org.apache.beam.sdk.io.gcp.pubsub.PubsubClient.projectPathFromPath;
+import static org.hamcrest.CoreMatchers.everyItem;
+import static org.junit.Assert.assertThat;
+import static org.junit.Assert.assertTrue;
+
+import com.alibaba.fastjson.JSONObject;
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.Serializable;
+import java.text.ParseException;
+import java.text.SimpleDateFormat;
+import java.util.Arrays;
+import java.util.List;
+import java.util.TimeZone;
+import java.util.concurrent.Callable;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.TimeUnit;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.pubsub.PubsubClient.SubscriptionPath;
+import org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage;
+import org.apache.beam.sdk.io.gcp.pubsub.TestPubsub;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.hamcrest.collection.IsIn;
+import org.junit.BeforeClass;
+import org.junit.Rule;
+import org.junit.Test;
+
+/** BeamSqlLine integration tests. */
+public class BeamSqlLineIT implements Serializable {
+
+  @Rule public transient TestPubsub eventsTopic = TestPubsub.create();
+
+  private static String project =
+  TestPipeline.testingPipelineOptions().as(GcpOptions.class).getProject();
+  private static String createPubsubTableStatement;
+  private static String setProject;
+  private static PubsubMessageJSONStringConstructor constructor;
+  private static final String publicTopic = 
"projects/pubsub-public-data/topics/taxirides-realtime";
+
+  @BeforeClass
+  public static void setUp() {
+setProject = String.format("SET project = '%s';", project);
+
+createPubsubTableStatement =
+"CREATE TABLE taxi_rides (\n"
++ " event_timestamp TIMESTAMP,\n"
++ " attributes MAP,\n"
++ " payload ROW<\n"
++ "   ride_id VARCHAR,\n"
++ "   point_idx INT,\n"
++ "   latitude DOUBLE,\n"
++ "   longitude DOUBLE,\n"
++ "   meter_reading DOUBLE,\n"
++ "   meter_increment DOUBLE,\n"
++ "   ride_status VARCHAR,\n"
++ "   passenger_count TINYINT>)\n"
++ "   TYPE pubsub \n"
++ "   LOCATION '%s'\n"
++ "   TBLPROPERTIES '{\"timestampAttributeKey\": \"ts\"}';";
+
+constructor =
+new PubsubMessageJSONStringConstructor(
+"ride_id",
+"point_idx",
+"latitude",
+"longitude",
+"meter_reading",
+"meter_increment",
+"ride_status",
+"passenger_count");
+  }
+
+  @Test
+  public void testSelectFromPubsub() throws Exception {
+ExecutorService pool = Executors.newFixedThreadPool(1);
+
+Future>> expectedResult =
+

[jira] [Work logged] (BEAM-4808) Add an integration test for BeamSqlLine

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4808?focusedWorklogId=128943=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128943
 ]

ASF GitHub Bot logged work on BEAM-4808:


Author: ASF GitHub Bot
Created on: 30/Jul/18 20:53
Start Date: 30/Jul/18 20:53
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on a change in pull request #6006: 
[BEAM-4808][SQL] add e2e test for BeamSqlLine.
URL: https://github.com/apache/beam/pull/6006#discussion_r206317603
 
 

 ##
 File path: 
sdks/java/extensions/sql/jdbc/src/test/java/org/apache/beam/sdk/extensions/sql/jdbc/BeamSqlLineIT.java
 ##
 @@ -0,0 +1,328 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.sql.jdbc;
+
+import static java.nio.charset.StandardCharsets.UTF_8;
+import static 
org.apache.beam.sdk.extensions.sql.jdbc.BeamSqlLineTestingUtils.buildArgs;
+import static 
org.apache.beam.sdk.extensions.sql.jdbc.BeamSqlLineTestingUtils.toLines;
+import static 
org.apache.beam.sdk.io.gcp.pubsub.PubsubClient.projectPathFromPath;
+import static org.hamcrest.CoreMatchers.everyItem;
+import static org.junit.Assert.assertThat;
+import static org.junit.Assert.assertTrue;
+
+import com.alibaba.fastjson.JSONObject;
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.Serializable;
+import java.text.ParseException;
+import java.text.SimpleDateFormat;
+import java.util.Arrays;
+import java.util.List;
+import java.util.TimeZone;
+import java.util.concurrent.Callable;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.TimeUnit;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.pubsub.PubsubClient.SubscriptionPath;
+import org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage;
+import org.apache.beam.sdk.io.gcp.pubsub.TestPubsub;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.hamcrest.collection.IsIn;
+import org.junit.BeforeClass;
+import org.junit.Rule;
+import org.junit.Test;
+
+/** BeamSqlLine integration tests. */
+public class BeamSqlLineIT implements Serializable {
+
+  @Rule public transient TestPubsub eventsTopic = TestPubsub.create();
+
+  private static String project =
+  TestPipeline.testingPipelineOptions().as(GcpOptions.class).getProject();
+  private static String createPubsubTableStatement;
+  private static String setProject;
+  private static PubsubMessageJSONStringConstructor constructor;
+  private static final String publicTopic = 
"projects/pubsub-public-data/topics/taxirides-realtime";
+
+  @BeforeClass
+  public static void setUp() {
+setProject = String.format("SET project = '%s';", project);
+
+createPubsubTableStatement =
+"CREATE TABLE taxi_rides (\n"
++ " event_timestamp TIMESTAMP,\n"
++ " attributes MAP,\n"
++ " payload ROW<\n"
++ "   ride_id VARCHAR,\n"
++ "   point_idx INT,\n"
++ "   latitude DOUBLE,\n"
++ "   longitude DOUBLE,\n"
++ "   meter_reading DOUBLE,\n"
++ "   meter_increment DOUBLE,\n"
++ "   ride_status VARCHAR,\n"
++ "   passenger_count TINYINT>)\n"
++ "   TYPE pubsub \n"
++ "   LOCATION '%s'\n"
++ "   TBLPROPERTIES '{\"timestampAttributeKey\": \"ts\"}';";
+
+constructor =
+new PubsubMessageJSONStringConstructor(
+"ride_id",
+"point_idx",
+"latitude",
+"longitude",
+"meter_reading",
+"meter_increment",
+"ride_status",
+"passenger_count");
+  }
+
+  @Test
+  public void testSelectFromPubsub() throws Exception {
+ExecutorService pool = Executors.newFixedThreadPool(1);
+
+Future>> expectedResult =
+

[jira] [Work logged] (BEAM-5023) BeamFnDataGrpcClient should pass the worker_id when connecting to the RunnerHarness

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5023?focusedWorklogId=128942=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128942
 ]

ASF GitHub Bot logged work on BEAM-5023:


Author: ASF GitHub Bot
Created on: 30/Jul/18 20:52
Start Date: 30/Jul/18 20:52
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #6066: [BEAM-5023] Send 
HarnessId in DataClient
URL: https://github.com/apache/beam/pull/6066#issuecomment-409006333
 
 
   Run Java Precommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 128942)
Time Spent: 20m  (was: 10m)

> BeamFnDataGrpcClient should pass the worker_id when connecting to the 
> RunnerHarness
> ---
>
> Key: BEAM-5023
> URL: https://issues.apache.org/jira/browse/BEAM-5023
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-harness
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-4159) Add testing for Pubsub attributes

2018-07-30 Thread Udi Meiri (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562475#comment-16562475
 ] 

Udi Meiri commented on BEAM-4159:
-

This is done for Python in: https://github.com/apache/beam/pull/5952

> Add testing for Pubsub attributes
> -
>
> Key: BEAM-4159
> URL: https://issues.apache.org/jira/browse/BEAM-4159
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Udi Meiri
>Assignee: Jason Kuster
>Priority: Major
>
> Request is to add an integration test that exercises reading and writing 
> pubsub message attributes.
> Platform: Java SDK
> Stretch goals: ID attribute, timestamp attribute, Python SDK and Go SDK (both 
> using the Java runner)
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4808) Add an integration test for BeamSqlLine

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4808?focusedWorklogId=128934=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128934
 ]

ASF GitHub Bot logged work on BEAM-4808:


Author: ASF GitHub Bot
Created on: 30/Jul/18 20:35
Start Date: 30/Jul/18 20:35
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on a change in pull request #6006: 
[BEAM-4808][SQL] add e2e test for BeamSqlLine.
URL: https://github.com/apache/beam/pull/6006#discussion_r206312157
 
 

 ##
 File path: 
sdks/java/extensions/sql/jdbc/src/test/java/org/apache/beam/sdk/extensions/sql/jdbc/BeamSqlLineIT.java
 ##
 @@ -0,0 +1,328 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.sql.jdbc;
+
+import static java.nio.charset.StandardCharsets.UTF_8;
+import static 
org.apache.beam.sdk.extensions.sql.jdbc.BeamSqlLineTestingUtils.buildArgs;
+import static 
org.apache.beam.sdk.extensions.sql.jdbc.BeamSqlLineTestingUtils.toLines;
+import static 
org.apache.beam.sdk.io.gcp.pubsub.PubsubClient.projectPathFromPath;
+import static org.hamcrest.CoreMatchers.everyItem;
+import static org.junit.Assert.assertThat;
+import static org.junit.Assert.assertTrue;
+
+import com.alibaba.fastjson.JSONObject;
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.Serializable;
+import java.text.ParseException;
+import java.text.SimpleDateFormat;
+import java.util.Arrays;
+import java.util.List;
+import java.util.TimeZone;
+import java.util.concurrent.Callable;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.TimeUnit;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.pubsub.PubsubClient.SubscriptionPath;
+import org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage;
+import org.apache.beam.sdk.io.gcp.pubsub.TestPubsub;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.hamcrest.collection.IsIn;
+import org.junit.BeforeClass;
+import org.junit.Rule;
+import org.junit.Test;
+
+/** BeamSqlLine integration tests. */
+public class BeamSqlLineIT implements Serializable {
+
+  @Rule public transient TestPubsub eventsTopic = TestPubsub.create();
+
+  private static String project =
+  TestPipeline.testingPipelineOptions().as(GcpOptions.class).getProject();
+  private static String createPubsubTableStatement;
+  private static String setProject;
+  private static PubsubMessageJSONStringConstructor constructor;
+  private static final String publicTopic = 
"projects/pubsub-public-data/topics/taxirides-realtime";
+
+  @BeforeClass
+  public static void setUp() {
+setProject = String.format("SET project = '%s';", project);
+
+createPubsubTableStatement =
+"CREATE TABLE taxi_rides (\n"
++ " event_timestamp TIMESTAMP,\n"
++ " attributes MAP,\n"
++ " payload ROW<\n"
++ "   ride_id VARCHAR,\n"
++ "   point_idx INT,\n"
++ "   latitude DOUBLE,\n"
++ "   longitude DOUBLE,\n"
++ "   meter_reading DOUBLE,\n"
++ "   meter_increment DOUBLE,\n"
++ "   ride_status VARCHAR,\n"
++ "   passenger_count TINYINT>)\n"
++ "   TYPE pubsub \n"
++ "   LOCATION '%s'\n"
++ "   TBLPROPERTIES '{\"timestampAttributeKey\": \"ts\"}';";
+
+constructor =
+new PubsubMessageJSONStringConstructor(
+"ride_id",
+"point_idx",
+"latitude",
+"longitude",
+"meter_reading",
+"meter_increment",
+"ride_status",
+"passenger_count");
+  }
+
+  @Test
+  public void testSelectFromPubsub() throws Exception {
+ExecutorService pool = Executors.newFixedThreadPool(1);
+
+Future>> expectedResult =
+

[jira] [Work logged] (BEAM-4808) Add an integration test for BeamSqlLine

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4808?focusedWorklogId=128933=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128933
 ]

ASF GitHub Bot logged work on BEAM-4808:


Author: ASF GitHub Bot
Created on: 30/Jul/18 20:34
Start Date: 30/Jul/18 20:34
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on a change in pull request #6006: 
[BEAM-4808][SQL] add e2e test for BeamSqlLine.
URL: https://github.com/apache/beam/pull/6006#discussion_r206311952
 
 

 ##
 File path: 
sdks/java/extensions/sql/jdbc/src/test/java/org/apache/beam/sdk/extensions/sql/jdbc/BeamSqlLineIT.java
 ##
 @@ -0,0 +1,328 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.sql.jdbc;
+
+import static java.nio.charset.StandardCharsets.UTF_8;
+import static 
org.apache.beam.sdk.extensions.sql.jdbc.BeamSqlLineTestingUtils.buildArgs;
+import static 
org.apache.beam.sdk.extensions.sql.jdbc.BeamSqlLineTestingUtils.toLines;
+import static 
org.apache.beam.sdk.io.gcp.pubsub.PubsubClient.projectPathFromPath;
+import static org.hamcrest.CoreMatchers.everyItem;
+import static org.junit.Assert.assertThat;
+import static org.junit.Assert.assertTrue;
+
+import com.alibaba.fastjson.JSONObject;
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.Serializable;
+import java.text.ParseException;
+import java.text.SimpleDateFormat;
+import java.util.Arrays;
+import java.util.List;
+import java.util.TimeZone;
+import java.util.concurrent.Callable;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.TimeUnit;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.pubsub.PubsubClient.SubscriptionPath;
+import org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage;
+import org.apache.beam.sdk.io.gcp.pubsub.TestPubsub;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.hamcrest.collection.IsIn;
+import org.junit.BeforeClass;
+import org.junit.Rule;
+import org.junit.Test;
+
+/** BeamSqlLine integration tests. */
+public class BeamSqlLineIT implements Serializable {
+
+  @Rule public transient TestPubsub eventsTopic = TestPubsub.create();
+
+  private static String project =
+  TestPipeline.testingPipelineOptions().as(GcpOptions.class).getProject();
 
 Review comment:
   Agree


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 128933)
Time Spent: 14h 20m  (was: 14h 10m)

> Add an integration test for BeamSqlLine
> ---
>
> Key: BEAM-4808
> URL: https://issues.apache.org/jira/browse/BEAM-4808
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Time Spent: 14h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4808) Add an integration test for BeamSqlLine

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4808?focusedWorklogId=128931=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128931
 ]

ASF GitHub Bot logged work on BEAM-4808:


Author: ASF GitHub Bot
Created on: 30/Jul/18 20:27
Start Date: 30/Jul/18 20:27
Worklog Time Spent: 10m 
  Work Description: amaliujia commented on a change in pull request #6006: 
[BEAM-4808][SQL] add e2e test for BeamSqlLine.
URL: https://github.com/apache/beam/pull/6006#discussion_r206309980
 
 

 ##
 File path: 
sdks/java/extensions/sql/jdbc/src/test/java/org/apache/beam/sdk/extensions/sql/jdbc/BeamSqlLineIT.java
 ##
 @@ -0,0 +1,328 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.sql.jdbc;
+
+import static java.nio.charset.StandardCharsets.UTF_8;
+import static 
org.apache.beam.sdk.extensions.sql.jdbc.BeamSqlLineTestingUtils.buildArgs;
+import static 
org.apache.beam.sdk.extensions.sql.jdbc.BeamSqlLineTestingUtils.toLines;
+import static 
org.apache.beam.sdk.io.gcp.pubsub.PubsubClient.projectPathFromPath;
+import static org.hamcrest.CoreMatchers.everyItem;
+import static org.junit.Assert.assertThat;
+import static org.junit.Assert.assertTrue;
+
+import com.alibaba.fastjson.JSONObject;
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.Serializable;
+import java.text.ParseException;
+import java.text.SimpleDateFormat;
+import java.util.Arrays;
+import java.util.List;
+import java.util.TimeZone;
+import java.util.concurrent.Callable;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.TimeUnit;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.pubsub.PubsubClient.SubscriptionPath;
+import org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage;
+import org.apache.beam.sdk.io.gcp.pubsub.TestPubsub;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.hamcrest.collection.IsIn;
+import org.junit.BeforeClass;
+import org.junit.Rule;
+import org.junit.Test;
+
+/** BeamSqlLine integration tests. */
+public class BeamSqlLineIT implements Serializable {
+
+  @Rule public transient TestPubsub eventsTopic = TestPubsub.create();
+
+  private static String project =
+  TestPipeline.testingPipelineOptions().as(GcpOptions.class).getProject();
+  private static String createPubsubTableStatement;
+  private static String setProject;
+  private static PubsubMessageJSONStringConstructor constructor;
+  private static final String publicTopic = 
"projects/pubsub-public-data/topics/taxirides-realtime";
+
+  @BeforeClass
+  public static void setUp() {
+setProject = String.format("SET project = '%s';", project);
+
+createPubsubTableStatement =
+"CREATE TABLE taxi_rides (\n"
++ " event_timestamp TIMESTAMP,\n"
++ " attributes MAP,\n"
++ " payload ROW<\n"
++ "   ride_id VARCHAR,\n"
++ "   point_idx INT,\n"
++ "   latitude DOUBLE,\n"
++ "   longitude DOUBLE,\n"
++ "   meter_reading DOUBLE,\n"
++ "   meter_increment DOUBLE,\n"
++ "   ride_status VARCHAR,\n"
++ "   passenger_count TINYINT>)\n"
++ "   TYPE pubsub \n"
++ "   LOCATION '%s'\n"
++ "   TBLPROPERTIES '{\"timestampAttributeKey\": \"ts\"}';";
+
+constructor =
+new PubsubMessageJSONStringConstructor(
+"ride_id",
+"point_idx",
+"latitude",
+"longitude",
+"meter_reading",
+"meter_increment",
+"ride_status",
+"passenger_count");
+  }
+
+  @Test
+  public void testSelectFromPubsub() throws Exception {
+ExecutorService pool = Executors.newFixedThreadPool(1);
+
+Future>> expectedResult =
+

[jira] [Work logged] (BEAM-4808) Add an integration test for BeamSqlLine

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4808?focusedWorklogId=128927=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128927
 ]

ASF GitHub Bot logged work on BEAM-4808:


Author: ASF GitHub Bot
Created on: 30/Jul/18 20:22
Start Date: 30/Jul/18 20:22
Worklog Time Spent: 10m 
  Work Description: amaliujia removed a comment on issue #6006: 
[BEAM-4808][SQL] add e2e test for BeamSqlLine.
URL: https://github.com/apache/beam/pull/6006#issuecomment-408778919
 
 
   run java precommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 128927)
Time Spent: 14h  (was: 13h 50m)

> Add an integration test for BeamSqlLine
> ---
>
> Key: BEAM-4808
> URL: https://issues.apache.org/jira/browse/BEAM-4808
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Time Spent: 14h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4808) Add an integration test for BeamSqlLine

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4808?focusedWorklogId=128925=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128925
 ]

ASF GitHub Bot logged work on BEAM-4808:


Author: ASF GitHub Bot
Created on: 30/Jul/18 20:21
Start Date: 30/Jul/18 20:21
Worklog Time Spent: 10m 
  Work Description: amaliujia removed a comment on issue #6006: 
[BEAM-4808][SQL] add e2e test for BeamSqlLine.
URL: https://github.com/apache/beam/pull/6006#issuecomment-408772279
 
 
   run java precommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 128925)
Time Spent: 13h 40m  (was: 13.5h)

> Add an integration test for BeamSqlLine
> ---
>
> Key: BEAM-4808
> URL: https://issues.apache.org/jira/browse/BEAM-4808
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Time Spent: 13h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4808) Add an integration test for BeamSqlLine

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4808?focusedWorklogId=128926=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128926
 ]

ASF GitHub Bot logged work on BEAM-4808:


Author: ASF GitHub Bot
Created on: 30/Jul/18 20:21
Start Date: 30/Jul/18 20:21
Worklog Time Spent: 10m 
  Work Description: amaliujia removed a comment on issue #6006: 
[BEAM-4808][SQL] add e2e test for BeamSqlLine.
URL: https://github.com/apache/beam/pull/6006#issuecomment-408778876
 
 
   ran java precommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 128926)
Time Spent: 13h 50m  (was: 13h 40m)

> Add an integration test for BeamSqlLine
> ---
>
> Key: BEAM-4808
> URL: https://issues.apache.org/jira/browse/BEAM-4808
> Project: Beam
>  Issue Type: Improvement
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>  Time Spent: 13h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5046) Implement Missing Standard SQL functions in BeamSQL

2018-07-30 Thread Rui Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rui Wang updated BEAM-5046:
---
Issue Type: Task  (was: Improvement)

> Implement Missing Standard SQL functions in BeamSQL
> ---
>
> Key: BEAM-5046
> URL: https://issues.apache.org/jira/browse/BEAM-5046
> Project: Beam
>  Issue Type: Task
>  Components: dsl-sql
>Reporter: Rui Wang
>Assignee: Rui Wang
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[beam] branch master updated: Add GitScm poll trigger for post-commit tests.

2018-07-30 Thread pabloem
This is an automated email from the ASF dual-hosted git repository.

pabloem pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
 new 01414c5  Add GitScm poll trigger for post-commit tests.
01414c5 is described below

commit 01414c5d3e723d2ceb4a4224f67c3f4332ac5cd0
Author: Mikhail Gryzykhin 
AuthorDate: Mon Jul 30 11:28:31 2018 -0700

Add GitScm poll trigger for post-commit tests.
---
 .test-infra/jenkins/CommonJobProperties.groovy  | 7 +++
 .test-infra/jenkins/PostcommitJobBuilder.groovy | 3 ++-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/.test-infra/jenkins/CommonJobProperties.groovy 
b/.test-infra/jenkins/CommonJobProperties.groovy
index 99745f8..a804304 100644
--- a/.test-infra/jenkins/CommonJobProperties.groovy
+++ b/.test-infra/jenkins/CommonJobProperties.groovy
@@ -224,6 +224,7 @@ class CommonJobProperties {
 
   // Sets common config for jobs which run on a schedule; optionally on push
   static void setAutoJob(context,
+ triggerOnCommit = false,
  String buildSchedule = '0 */6 * * *',
  notifyAddress = 'commits@beam.apache.org') {
 
@@ -231,8 +232,14 @@ class CommonJobProperties {
 context.triggers {
   // By default runs every 6 hours.
   cron(buildSchedule)
+
+  if (triggerOnCommit){
+githubPush()
+  }
 }
 
+
+
 context.publishers {
   // Notify an email address for each failed build (defaults to commits@).
   mailer(
diff --git a/.test-infra/jenkins/PostcommitJobBuilder.groovy 
b/.test-infra/jenkins/PostcommitJobBuilder.groovy
index 9ca88bc..3235582 100644
--- a/.test-infra/jenkins/PostcommitJobBuilder.groovy
+++ b/.test-infra/jenkins/PostcommitJobBuilder.groovy
@@ -47,8 +47,9 @@ class PostcommitJobBuilder {
 
   void defineAutoPostCommitJob(name) {
 def autoBuilds = scope.job(name) {
-  commonJobProperties.setAutoJob delegate
+  commonJobProperties.setAutoJob delegate, true
 }
+
 autoBuilds.with(jobDefinition)
   }
 



[jira] [Created] (BEAM-5046) Implement Missing Standard SQL functions in BeamSQL

2018-07-30 Thread Rui Wang (JIRA)
Rui Wang created BEAM-5046:
--

 Summary: Implement Missing Standard SQL functions in BeamSQL
 Key: BEAM-5046
 URL: https://issues.apache.org/jira/browse/BEAM-5046
 Project: Beam
  Issue Type: Improvement
  Components: dsl-sql
Reporter: Rui Wang
Assignee: Rui Wang






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2930) Flink support for portable side input

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2930?focusedWorklogId=128918=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128918
 ]

ASF GitHub Bot logged work on BEAM-2930:


Author: ASF GitHub Bot
Created on: 30/Jul/18 20:03
Start Date: 30/Jul/18 20:03
Worklog Time Spent: 10m 
  Work Description: bsidhom commented on a change in pull request #6082: 
[BEAM-2930] Side inputs are not yet supported in streaming mode.
URL: https://github.com/apache/beam/pull/6082#discussion_r206302099
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingPortablePipelineTranslator.java
 ##
 @@ -433,12 +433,17 @@ private void translateImpulse(
   throw new RuntimeException(e);
 }
 
-String inputPCollectionId = 
Iterables.getOnlyElement(transform.getInputsMap().values());
+String inputPCollectionId = stagePayload.getInput();
+// TODO: BEAM-2930
 
 Review comment:
   This is easier to reference if you include a full link rather than just the 
issue number.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 128918)
Time Spent: 40m  (was: 0.5h)

> Flink support for portable side input
> -
>
> Key: BEAM-2930
> URL: https://issues.apache.org/jira/browse/BEAM-2930
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5023) BeamFnDataGrpcClient should pass the worker_id when connecting to the RunnerHarness

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5023?focusedWorklogId=128915=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128915
 ]

ASF GitHub Bot logged work on BEAM-5023:


Author: ASF GitHub Bot
Created on: 30/Jul/18 19:57
Start Date: 30/Jul/18 19:57
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #6066: 
[BEAM-5023] Send HarnessId in DataClient
URL: https://github.com/apache/beam/pull/6066#discussion_r206300830
 
 

 ##
 File path: 
sdks/java/harness/src/main/java/org/apache/beam/fn/harness/FnHarness.java
 ##
 @@ -150,7 +152,11 @@ public static void main(
 
   RegisterHandler fnApiRegistry = new RegisterHandler();
   BeamFnDataGrpcClient beamFnDataMultiplexer =
-  new BeamFnDataGrpcClient(options, channelFactory::forDescriptor, 
outboundObserverFactory);
+  new BeamFnDataGrpcClient(
+  options,
+  
channelFactory.withInterceptors(ImmutableList.of(AddHarnessIdInterceptor.create(id)))
 
 Review comment:
   Makes sense.
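
For context, the general gRPC technique under discussion is a client interceptor that stamps an id header onto every outgoing call. The following is a hedged sketch of that pattern, not Beam's actual AddHarnessIdInterceptor; the class name and the "worker_id" header name are assumptions here.

import io.grpc.CallOptions;
import io.grpc.Channel;
import io.grpc.ClientCall;
import io.grpc.ClientInterceptor;
import io.grpc.ForwardingClientCall.SimpleForwardingClientCall;
import io.grpc.Metadata;
import io.grpc.MethodDescriptor;

/** Sketch: attach a worker id header to every call made through an intercepted channel. */
public final class WorkerIdInterceptor implements ClientInterceptor {
  private static final Metadata.Key<String> WORKER_ID_KEY =
      Metadata.Key.of("worker_id", Metadata.ASCII_STRING_MARSHALLER);

  private final String workerId;

  public WorkerIdInterceptor(String workerId) {
    this.workerId = workerId;
  }

  @Override
  public <ReqT, RespT> ClientCall<ReqT, RespT> interceptCall(
      MethodDescriptor<ReqT, RespT> method, CallOptions callOptions, Channel next) {
    return new SimpleForwardingClientCall<ReqT, RespT>(next.newCall(method, callOptions)) {
      @Override
      public void start(Listener<RespT> responseListener, Metadata headers) {
        // Add the id to the request metadata before the call starts.
        headers.put(WORKER_ID_KEY, workerId);
        super.start(responseListener, headers);
      }
    };
  }
}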


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 128915)
Time Spent: 10m
Remaining Estimate: 0h

> BeamFnDataGrpcClient should pass the worker_id when connecting to the 
> RunnerHarness
> ---
>
> Key: BEAM-5023
> URL: https://issues.apache.org/jira/browse/BEAM-5023
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-harness
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-4748) Flaky post-commit test org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactServicesTest.putArtifactsMultipleFilesConcurrentlyTest

2018-07-30 Thread Ankur Goenka (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ankur Goenka resolved BEAM-4748.

   Resolution: Duplicate
Fix Version/s: 2.7.0

> Flaky post-commit test 
> org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactServicesTest.putArtifactsMultipleFilesConcurrentlyTest
> 
>
> Key: BEAM-4748
> URL: https://issues.apache.org/jira/browse/BEAM-4748
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Mikhail Gryzykhin
>Assignee: Ankur Goenka
>Priority: Major
> Fix For: 2.7.0
>
>
> Test flaked on following job:
> https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Java_GradleBuild/1040/testReport/junit/org.apache.beam.runners.fnexecution.artifact/BeamFileSystemArtifactServicesTest/putArtifactsMultipleFilesConcurrentlyTest/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-4748) Flaky post-commit test org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactServicesTest.putArtifactsMultipleFilesConcurrentlyTest

2018-07-30 Thread Ankur Goenka (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16562415#comment-16562415
 ] 

Ankur Goenka commented on BEAM-4748:


This flaky test was fixed in [https://github.com/apache/beam/pull/5975] 

Please try with the latest code base.

> Flaky post-commit test 
> org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactServicesTest.putArtifactsMultipleFilesConcurrentlyTest
> 
>
> Key: BEAM-4748
> URL: https://issues.apache.org/jira/browse/BEAM-4748
> Project: Beam
>  Issue Type: Bug
>  Components: test-failures
>Reporter: Mikhail Gryzykhin
>Assignee: Ankur Goenka
>Priority: Major
>
> Test flaked on following job:
> https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Java_GradleBuild/1040/testReport/junit/org.apache.beam.runners.fnexecution.artifact/BeamFileSystemArtifactServicesTest/putArtifactsMultipleFilesConcurrentlyTest/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-5028) getPerDestinationOutputFilenames() is getting processed before write is finished

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5028?focusedWorklogId=128909=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128909
 ]

ASF GitHub Bot logged work on BEAM-5028:


Author: ASF GitHub Bot
Created on: 30/Jul/18 19:45
Start Date: 30/Jul/18 19:45
Worklog Time Spent: 10m 
  Work Description: JozoVilcek commented on a change in pull request #6075: 
[BEAM-5028] Finalize writes before returning filenames via WriteFileResults
URL: https://github.com/apache/beam/pull/6075#discussion_r206297334
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/WriteFiles.java
 ##
 @@ -794,11 +794,11 @@ public void process(ProcessContext c) throws Exception {
 ? writeOperation.finalizeDestination(
 defaultDest, GlobalWindow.INSTANCE, fixedNumShards, 
fileResults)
 : finalizeAllDestinations(fileResults, fixedNumShards);
+writeOperation.moveToOutputFiles(resultsToFinalFilenames);
 
 Review comment:
   Strange. I do not know the internals of Beam, so I cannot comment on that. But 
the empirical observation is that my code does not run without this patch. I am 
running Beam 2.5.0 on Flink 1.4.0.
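
   To make the ordering concrete, here is a small self-contained sketch of the 
invariant the patch enforces (temporary outputs are moved to their final names 
before the filenames are published); it uses plain java.nio.file rather than the 
WriteFiles internals, and the paths are illustrative:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.StandardCopyOption;
    import java.util.Collections;
    import java.util.List;

    // Sketch of the invariant: finalize (move) first, publish filenames second.
    public class FinalizeBeforePublish {
      public static void main(String[] args) throws IOException {
        Path tmpDir = Files.createTempDirectory("write-tmp");
        Path outDir = Files.createTempDirectory("write-out");
        Path tempFile = Files.write(tmpDir.resolve("shard-0.tmp"), "data".getBytes());
        Path finalFile = outDir.resolve("shard-0.txt");

        // 1. Move the temporary file to its final name ...
        Files.move(tempFile, finalFile, StandardCopyOption.REPLACE_EXISTING);

        // 2. ... and only then hand the final filename to downstream consumers.
        List<Path> published = Collections.singletonList(finalFile);
        System.out.println("published=" + published + " exists=" + Files.exists(finalFile));
      }
    }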


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 128909)
Time Spent: 40m  (was: 0.5h)

> getPerDestinationOutputFilenames() is getting processed before write is 
> finished
> 
>
> Key: BEAM-5028
> URL: https://issues.apache.org/jira/browse/BEAM-5028
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-files
>Affects Versions: 2.5.0
>Reporter: Jozef Vilcek
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> I believe I am hitting a race condition between finalising writes and the 
> notification events about writes being done.
> It looks to me that WriteFilesResult.getPerDestinationOutputFilenames() can 
> announce files before they are finished:
> [https://github.com/apache/beam/blob/release-2.5.0/sdks/java/core/src/main/java/org/apache/beam/sdk/io/WriteFiles.java#L799]
> This is related to BEAM-3268, which appears not to be fully fixed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2930) Flink support for portable side input

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-2930?focusedWorklogId=128908=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128908
 ]

ASF GitHub Bot logged work on BEAM-2930:


Author: ASF GitHub Bot
Created on: 30/Jul/18 19:43
Start Date: 30/Jul/18 19:43
Worklog Time Spent: 10m 
  Work Description: angoenka commented on a change in pull request #6082: 
[BEAM-2930] Side inputs are not yet supported in streaming mode.
URL: https://github.com/apache/beam/pull/6082#discussion_r206296850
 
 

 ##
 File path: 
runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingPortablePipelineTranslator.java
 ##
 @@ -433,12 +433,17 @@ private void translateImpulse(
   throw new RuntimeException(e);
 }
 
-String inputPCollectionId = 
Iterables.getOnlyElement(transform.getInputsMap().values());
+String inputPCollectionId = stagePayload.getInput();
+// TODO: BEAM-2930
+if (stagePayload.getSideInputsCount() > 0) {
+  throw new UnsupportedOperationException(
+  "[BEAM-2930] streaming translator does not support side inputs: " + 
transform);
+}
 
 Map, OutputTag>> tagsToOutputTags = 
Maps.newLinkedHashMap();
 Map, Coder>> tagsToCoders = 
Maps.newLinkedHashMap();
 // TODO: does it matter which output we designate as "main"
-TupleTag mainOutputTag;
+final TupleTag mainOutputTag;
 
 Review comment:
   We could single-line this assignment for better readability.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 128908)
Time Spent: 0.5h  (was: 20m)

> Flink support for portable side input
> -
>
> Key: BEAM-2930
> URL: https://issues.apache.org/jira/browse/BEAM-2930
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4778) Less wasteful ArtifactStagingService

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4778?focusedWorklogId=128907=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128907
 ]

ASF GitHub Bot logged work on BEAM-4778:


Author: ASF GitHub Bot
Created on: 30/Jul/18 19:41
Start Date: 30/Jul/18 19:41
Worklog Time Spent: 10m 
  Work Description: angoenka commented on issue #5958: [BEAM-4778] add 
option to flink job server to clean staged artifacts per-job
URL: https://github.com/apache/beam/pull/5958#issuecomment-408985359
 
 
   The PR looks good. 
   We could simplify it a bit by reusing FlinkJobInvocation#addStateListener to 
listen for termination instead of adding a separate listener, but we can take 
that up later.
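
   A hedged sketch of that simplification: react to a terminal job state and drop 
the job's staged artifacts. The JobState values and the shape of the listener are 
assumptions for illustration, not the FlinkJobInvocation API:

    import java.util.function.Consumer;

    // Illustrative only: a state callback that removes the per-job staging
    // directory once the job reaches a terminal state.
    class StagedArtifactCleanup {
      enum JobState { RUNNING, DONE, FAILED, CANCELLED }

      static Consumer<JobState> cleanupOnTermination(Runnable removeStagedArtifacts) {
        return state -> {
          if (state == JobState.DONE || state == JobState.FAILED || state == JobState.CANCELLED) {
            removeStagedArtifacts.run();
          }
        };
      }
    }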


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 128907)
Time Spent: 4h 40m  (was: 4.5h)

> Less wasteful ArtifactStagingService
> 
>
> Key: BEAM-4778
> URL: https://issues.apache.org/jira/browse/BEAM-4778
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Eugene Kirpichov
>Assignee: Ryan Williams
>Priority: Major
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/beam/blob/master/runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/artifact/BeamFileSystemArtifactStagingService.java]
>  is the main implementation of ArtifactStagingService.
> It stages artifacts into a directory; and in practice the passed staging 
> session token is such that the directory is different for every job. This 
> leads to 2 issues:
>  * It doesn't get cleaned up when the job finishes or even when the 
> JobService shuts down, so we have disk space leaks if running a lot of jobs 
> (e.g. a suite of ValidatesRunner tests)
>  * We repeatedly re-stage the same artifacts. Instead, ideally, we should 
> identify that some artifacts don't need to be staged - based on knowing their 
> md5. The artifact staging protocol has rudimentary support for this but may 
> need to be modified.
> CC: [~angoenka]
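
A minimal sketch of the md5-based dedup idea mentioned in the description above, 
assuming a hypothetical in-memory index of already-staged digests; this is not the 
ArtifactStagingService API:

    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.security.MessageDigest;
    import java.util.Base64;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.function.Function;

    // Illustrative only: skip uploading an artifact whose md5 has already been
    // staged, returning the previously staged location instead.
    class Md5DedupStager {
      private final Map<String, String> stagedByMd5 = new ConcurrentHashMap<>();

      String stageIfAbsent(Path artifact, Function<Path, String> upload) throws Exception {
        byte[] bytes = Files.readAllBytes(artifact);
        String md5 =
            Base64.getEncoder().encodeToString(MessageDigest.getInstance("MD5").digest(bytes));
        return stagedByMd5.computeIfAbsent(md5, k -> upload.apply(artifact));
      }
    }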



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4808) Add an integration test for BeamSqlLine

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4808?focusedWorklogId=128900=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128900
 ]

ASF GitHub Bot logged work on BEAM-4808:


Author: ASF GitHub Bot
Created on: 30/Jul/18 19:37
Start Date: 30/Jul/18 19:37
Worklog Time Spent: 10m 
  Work Description: akedin commented on a change in pull request #6006: 
[BEAM-4808][SQL] add e2e test for BeamSqlLine.
URL: https://github.com/apache/beam/pull/6006#discussion_r206284246
 
 

 ##
 File path: 
sdks/java/extensions/sql/jdbc/src/test/java/org/apache/beam/sdk/extensions/sql/jdbc/BeamSqlLineIT.java
 ##
 @@ -0,0 +1,328 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.sql.jdbc;
+
+import static java.nio.charset.StandardCharsets.UTF_8;
+import static 
org.apache.beam.sdk.extensions.sql.jdbc.BeamSqlLineTestingUtils.buildArgs;
+import static 
org.apache.beam.sdk.extensions.sql.jdbc.BeamSqlLineTestingUtils.toLines;
+import static 
org.apache.beam.sdk.io.gcp.pubsub.PubsubClient.projectPathFromPath;
+import static org.hamcrest.CoreMatchers.everyItem;
+import static org.junit.Assert.assertThat;
+import static org.junit.Assert.assertTrue;
+
+import com.alibaba.fastjson.JSONObject;
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.Serializable;
+import java.text.ParseException;
+import java.text.SimpleDateFormat;
+import java.util.Arrays;
+import java.util.List;
+import java.util.TimeZone;
+import java.util.concurrent.Callable;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.TimeUnit;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.pubsub.PubsubClient.SubscriptionPath;
+import org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage;
+import org.apache.beam.sdk.io.gcp.pubsub.TestPubsub;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.hamcrest.collection.IsIn;
+import org.junit.BeforeClass;
+import org.junit.Rule;
+import org.junit.Test;
+
+/** BeamSqlLine integration tests. */
+public class BeamSqlLineIT implements Serializable {
+
+  @Rule public transient TestPubsub eventsTopic = TestPubsub.create();
+
+  private static String project =
+  TestPipeline.testingPipelineOptions().as(GcpOptions.class).getProject();
+  private static String createPubsubTableStatement;
+  private static String setProject;
+  private static PubsubMessageJSONStringConstructor constructor;
+  private static final String publicTopic = 
"projects/pubsub-public-data/topics/taxirides-realtime";
+
+  @BeforeClass
+  public static void setUp() {
+setProject = String.format("SET project = '%s';", project);
+
+createPubsubTableStatement =
+"CREATE TABLE taxi_rides (\n"
++ " event_timestamp TIMESTAMP,\n"
++ " attributes MAP,\n"
++ " payload ROW<\n"
++ "   ride_id VARCHAR,\n"
++ "   point_idx INT,\n"
++ "   latitude DOUBLE,\n"
++ "   longitude DOUBLE,\n"
++ "   meter_reading DOUBLE,\n"
++ "   meter_increment DOUBLE,\n"
++ "   ride_status VARCHAR,\n"
++ "   passenger_count TINYINT>)\n"
++ "   TYPE pubsub \n"
++ "   LOCATION '%s'\n"
++ "   TBLPROPERTIES '{\"timestampAttributeKey\": \"ts\"}';";
+
+constructor =
+new PubsubMessageJSONStringConstructor(
+"ride_id",
+"point_idx",
+"latitude",
+"longitude",
+"meter_reading",
+"meter_increment",
+"ride_status",
+"passenger_count");
+  }
+
+  @Test
+  public void testSelectFromPubsub() throws Exception {
+ExecutorService pool = Executors.newFixedThreadPool(1);
+
+Future>> expectedResult =
+

[jira] [Work logged] (BEAM-4808) Add an integration test for BeamSqlLine

2018-07-30 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-4808?focusedWorklogId=128905=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-128905
 ]

ASF GitHub Bot logged work on BEAM-4808:


Author: ASF GitHub Bot
Created on: 30/Jul/18 19:37
Start Date: 30/Jul/18 19:37
Worklog Time Spent: 10m 
  Work Description: akedin commented on a change in pull request #6006: 
[BEAM-4808][SQL] add e2e test for BeamSqlLine.
URL: https://github.com/apache/beam/pull/6006#discussion_r206290959
 
 

 ##
 File path: 
sdks/java/extensions/sql/jdbc/src/test/java/org/apache/beam/sdk/extensions/sql/jdbc/BeamSqlLineIT.java
 ##
 @@ -0,0 +1,328 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.sql.jdbc;
+
+import static java.nio.charset.StandardCharsets.UTF_8;
+import static 
org.apache.beam.sdk.extensions.sql.jdbc.BeamSqlLineTestingUtils.buildArgs;
+import static 
org.apache.beam.sdk.extensions.sql.jdbc.BeamSqlLineTestingUtils.toLines;
+import static 
org.apache.beam.sdk.io.gcp.pubsub.PubsubClient.projectPathFromPath;
+import static org.hamcrest.CoreMatchers.everyItem;
+import static org.junit.Assert.assertThat;
+import static org.junit.Assert.assertTrue;
+
+import com.alibaba.fastjson.JSONObject;
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.Serializable;
+import java.text.ParseException;
+import java.text.SimpleDateFormat;
+import java.util.Arrays;
+import java.util.List;
+import java.util.TimeZone;
+import java.util.concurrent.Callable;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import java.util.concurrent.TimeUnit;
+import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
+import org.apache.beam.sdk.io.gcp.pubsub.PubsubClient.SubscriptionPath;
+import org.apache.beam.sdk.io.gcp.pubsub.PubsubMessage;
+import org.apache.beam.sdk.io.gcp.pubsub.TestPubsub;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.hamcrest.collection.IsIn;
+import org.junit.BeforeClass;
+import org.junit.Rule;
+import org.junit.Test;
+
+/** BeamSqlLine integration tests. */
+public class BeamSqlLineIT implements Serializable {
+
+  @Rule public transient TestPubsub eventsTopic = TestPubsub.create();
+
+  private static String project =
+  TestPipeline.testingPipelineOptions().as(GcpOptions.class).getProject();
+  private static String createPubsubTableStatement;
+  private static String setProject;
+  private static PubsubMessageJSONStringConstructor constructor;
+  private static final String publicTopic = 
"projects/pubsub-public-data/topics/taxirides-realtime";
+
+  @BeforeClass
+  public static void setUp() {
+setProject = String.format("SET project = '%s';", project);
+
+createPubsubTableStatement =
+"CREATE TABLE taxi_rides (\n"
++ " event_timestamp TIMESTAMP,\n"
++ " attributes MAP,\n"
++ " payload ROW<\n"
++ "   ride_id VARCHAR,\n"
++ "   point_idx INT,\n"
++ "   latitude DOUBLE,\n"
++ "   longitude DOUBLE,\n"
++ "   meter_reading DOUBLE,\n"
++ "   meter_increment DOUBLE,\n"
++ "   ride_status VARCHAR,\n"
++ "   passenger_count TINYINT>)\n"
++ "   TYPE pubsub \n"
++ "   LOCATION '%s'\n"
++ "   TBLPROPERTIES '{\"timestampAttributeKey\": \"ts\"}';";
+
+constructor =
+new PubsubMessageJSONStringConstructor(
+"ride_id",
+"point_idx",
+"latitude",
+"longitude",
+"meter_reading",
+"meter_increment",
+"ride_status",
+"passenger_count");
+  }
+
+  @Test
+  public void testSelectFromPubsub() throws Exception {
+ExecutorService pool = Executors.newFixedThreadPool(1);
+
+Future>> expectedResult =
+
