[beam] branch master updated (a2a5d3d -> 50139b4)

2018-04-04 Thread jbonofre
This is an automated email from the ASF dual-hosted git repository.

jbonofre pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from a2a5d3d  [BEAM-3257] Add Python precommit gradle config
 add 124ead5  [BEAM-3409] waitUntilFinish should wait teardown even for the 
direct runner
 new 50139b4  Merge pull request #4790 from 
rmannibucau/fix/BEAM-3409_wait-for-teardown-execution-in-direct-runner

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../org/apache/beam/runners/core/DoFnRunner.java   |  6 ++
 .../runners/core/LateDataDroppingDoFnRunner.java   |  5 ++
 .../apache/beam/runners/core/ProcessFnRunner.java  |  6 ++
 .../runners/core/PushbackSideInputDoFnRunner.java  |  6 ++
 .../apache/beam/runners/core/SimpleDoFnRunner.java |  5 ++
 .../core/SimplePushbackSideInputDoFnRunner.java|  8 +-
 .../beam/runners/core/StatefulDoFnRunner.java  |  5 ++
 .../SimplePushbackSideInputDoFnRunnerTest.java |  6 ++
 ...LifecycleManagerRemovingTransformEvaluator.java |  4 +
 .../direct/ExecutorServiceParallelExecutor.java| 42 --
 .../apache/beam/runners/direct/ParDoEvaluator.java |  8 ++
 .../SplittableProcessElementsEvaluatorFactory.java | 95 +++---
 .../beam/runners/direct/DirectRunnerTest.java  | 35 
 .../flink/metrics/DoFnRunnerWithMetricsUpdate.java |  6 ++
 .../spark/translation/DoFnRunnerWithMetrics.java   |  6 ++
 .../apache/beam/fn/harness/FnApiDoFnRunner.java|  5 ++
 16 files changed, 193 insertions(+), 55 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
jbono...@apache.org.


[jira] [Work logged] (BEAM-3409) Unexpected behavior of DoFn teardown method running in unit tests

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3409?focusedWorklogId=87881&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87881
 ]

ASF GitHub Bot logged work on BEAM-3409:


Author: ASF GitHub Bot
Created on: 05/Apr/18 04:09
Start Date: 05/Apr/18 04:09
Worklog Time Spent: 10m 
  Work Description: jbonofre closed pull request #4790: [BEAM-3409] 
waitUntilFinish() doesn't wait for the teardown execution on Direct runner - 
fixing compilation issue on flink
URL: https://github.com/apache/beam/pull/4790
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:


diff --git 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/DoFnRunner.java 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/DoFnRunner.java
index 30648f6e582..fd36318e5af 100644
--- 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/DoFnRunner.java
+++ 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/DoFnRunner.java
@@ -49,4 +49,10 @@
    * additional tasks, such as flushing in-memory states.
    */
   void finishBundle();
+
+  /**
+   * @since 2.5.0
+   * @return the underlying fn instance.
+   */
+  DoFn<InputT, OutputT> getFn();
 }
diff --git 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/LateDataDroppingDoFnRunner.java
 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/LateDataDroppingDoFnRunner.java
index f89aa4e839c..d101cc57b27 100644
--- 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/LateDataDroppingDoFnRunner.java
+++ 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/LateDataDroppingDoFnRunner.java
@@ -60,6 +60,11 @@ public LateDataDroppingDoFnRunner(
     lateDataFilter = new LateDataFilter(windowingStrategy, timerInternals);
   }
 
+  @Override
+  public DoFn<KeyedWorkItem<K, InputT>, KV<K, OutputT>> getFn() {
+    return doFnRunner.getFn();
+  }
+
   @Override
   public void startBundle() {
     doFnRunner.startBundle();
diff --git 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/ProcessFnRunner.java
 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/ProcessFnRunner.java
index e4dfd132e2d..8c360ef0bd0 100644
--- 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/ProcessFnRunner.java
+++ 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/ProcessFnRunner.java
@@ -25,6 +25,7 @@
 import org.apache.beam.runners.core.StateNamespaces.WindowNamespace;
 import org.apache.beam.runners.core.TimerInternals.TimerData;
 import org.apache.beam.sdk.state.TimeDomain;
+import org.apache.beam.sdk.transforms.DoFn;
 import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
 import org.apache.beam.sdk.transforms.windowing.GlobalWindow;
 import org.apache.beam.sdk.util.WindowedValue;
@@ -52,6 +53,11 @@ public ProcessFnRunner(
 this.sideInputReader = sideInputReader;
   }
 
+  @Override
+  public DoFn>, OutputT> 
getFn() {
+return underlying.getFn();
+  }
+
   @Override
   public void startBundle() {
 underlying.startBundle();
diff --git 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/PushbackSideInputDoFnRunner.java
 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/PushbackSideInputDoFnRunner.java
index 8f21086794d..f3013eff0cb 100644
--- 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/PushbackSideInputDoFnRunner.java
+++ 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/PushbackSideInputDoFnRunner.java
@@ -48,4 +48,10 @@ void onTimer(String timerId, BoundedWindow window, Instant timestamp,
 
   /** Calls the underlying {@link DoFn.FinishBundle} method. */
   void finishBundle();
+
+  /**
+   * @since 2.5.0
+   * @return the underlying fn instance.
+   */
+  DoFn<InputT, OutputT> getFn();
 }
diff --git 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/SimpleDoFnRunner.java
 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/SimpleDoFnRunner.java
index d4c5775464b..7e60b033e54 100644
--- 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/SimpleDoFnRunner.java
+++ 
b/runners/core-java/src/main/java/org/apache/beam/runners/core/SimpleDoFnRunner.java
@@ -120,6 +120,11 @@ public SimpleDoFnRunner(
     this.allowedLateness = windowingStrategy.getAllowedLateness();
   }
 
+  @Override
+  public DoFn<InputT, OutputT> getFn() {
+    return fn;
+  }
+
   @Override
   public void startBundle() {
     // This can contain user code. Wrap it in case it throws an exception.
diff --git 
a/runners/core-java/src/main/java/org/apache/beam/runners/core/SimplePushbackSideInputDoFnRunner.java
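
The hunks in this diff add a getFn() accessor to the runner interfaces: the innermost runner returns its fn, and wrapper runners forward the call to the runner they wrap. A minimal sketch of that delegation pattern follows. It is not Beam code: Fn, Runner, SimpleRunner and CountingRunner are hypothetical stand-ins chosen for illustration, and the real classes carry more state and methods.

```java
public class GetFnDelegationSketch {

  /** Hypothetical stand-in for a user function; not Beam's DoFn. */
  interface Fn<InputT, OutputT> {
    OutputT apply(InputT input);
  }

  /** Hypothetical stand-in for a runner that exposes the function it executes. */
  interface Runner<InputT, OutputT> {
    OutputT processElement(InputT input);

    /** Returns the underlying fn instance. */
    Fn<InputT, OutputT> getFn();
  }

  /** Innermost runner: owns the fn and returns it directly. */
  static final class SimpleRunner<InputT, OutputT> implements Runner<InputT, OutputT> {
    private final Fn<InputT, OutputT> fn;

    SimpleRunner(Fn<InputT, OutputT> fn) {
      this.fn = fn;
    }

    @Override
    public OutputT processElement(InputT input) {
      return fn.apply(input);
    }

    @Override
    public Fn<InputT, OutputT> getFn() {
      return fn;
    }
  }

  /** Wrapper runner: adds behaviour around the call and forwards getFn(). */
  static final class CountingRunner<InputT, OutputT> implements Runner<InputT, OutputT> {
    private final Runner<InputT, OutputT> underlying;
    private int processed;

    CountingRunner(Runner<InputT, OutputT> underlying) {
      this.underlying = underlying;
    }

    @Override
    public OutputT processElement(InputT input) {
      processed++; // the extra behaviour this wrapper contributes
      return underlying.processElement(input);
    }

    @Override
    public Fn<InputT, OutputT> getFn() {
      return underlying.getFn(); // delegate, as the wrapper runners in the diff do
    }

    int processedCount() {
      return processed;
    }
  }

  public static void main(String[] args) {
    Fn<String, Integer> length = String::length;
    Runner<String, Integer> runner =
        new CountingRunner<>(new CountingRunner<>(new SimpleRunner<>(length)));
    System.out.println(runner.processElement("beam"));
    // However many wrappers are stacked, getFn() reaches the same instance.
    System.out.println(runner.getFn() == length);
  }
}
```

With an accessor like this, code that only holds the outermost runner can still reach the user function, for example to invoke a lifecycle method on it. Whether the pull request uses it exactly this way is not shown in the truncated diff.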
 

[beam] 01/01: Merge pull request #4790 from rmannibucau/fix/BEAM-3409_wait-for-teardown-execution-in-direct-runner

2018-04-04 Thread jbonofre
This is an automated email from the ASF dual-hosted git repository.

jbonofre pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 50139b4395584513099094445f57c495b515
Merge: a2a5d3d 124ead5
Author: Jean-Baptiste Onofré 
AuthorDate: Thu Apr 5 06:09:24 2018 +0200

Merge pull request #4790 from 
rmannibucau/fix/BEAM-3409_wait-for-teardown-execution-in-direct-runner

[BEAM-3409] waitUntilFinish() doesn't wait for the teardown execution on 
Direct runner - fixing compilation issue on flink




[jira] [Work logged] (BEAM-3409) Unexpected behavior of DoFn teardown method running in unit tests

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3409?focusedWorklogId=87880&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87880
 ]

ASF GitHub Bot logged work on BEAM-3409:


Author: ASF GitHub Bot
Created on: 05/Apr/18 04:08
Start Date: 05/Apr/18 04:08
Worklog Time Spent: 10m 
  Work Description: jbonofre commented on issue #4790: [BEAM-3409] 
waitUntilFinish() doesn't wait for the teardown execution on Direct runner - 
fixing compilation issue on flink
URL: https://github.com/apache/beam/pull/4790#issuecomment-378814811
 
 
   Run Java Gradle PreCommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87880)
Time Spent: 7h  (was: 6h 50m)

> Unexpected behavior of DoFn teardown method running in unit tests 
> --
>
> Key: BEAM-3409
> URL: https://issues.apache.org/jira/browse/BEAM-3409
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Affects Versions: 2.3.0
>Reporter: Alexey Romanenko
>Assignee: Romain Manni-Bucau
>Priority: Blocker
>  Labels: test
> Fix For: 2.5.0
>
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> While writing a unit test, I found strange behaviour in the Teardown method 
> of a DoFn implementation when it runs in unit tests using TestPipeline.
> To be more precise, the test run doesn’t wait until the teardown() method has 
> finished; it exits from this method after about 1 sec (on my machine) even if 
> it should take longer (a very simple example: running an infinite loop inside 
> this method or putting the thread to sleep). By contrast, when I run the same 
> code from main() with an ordinary Pipeline and the direct runner, it works as 
> expected: the teardown() method runs to completion no matter how long it 
> takes.
> I created two test cases to reproduce this issue: the first runs with main() 
> and the second runs with JUnit. They use the same implementation of DoFn 
> (class LongTearDownFn) and expect the teardown method to run for at least 
> SLEEP_TIME ms. When run as a JUnit test, this is not the case (see output 
> log).
> - run with main()
> https://github.com/aromanenko-dev/beam-samples/blob/master/runners-tests/src/main/java/TearDown.java
> - run with junit
> https://github.com/aromanenko-dev/beam-samples/blob/master/runners-tests/src/test/java/TearDownTest.java
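
The symptom described above, a run that returns before a slow teardown has finished, can be illustrated without Beam. The sketch below uses only java.util.concurrent and is an analogy under stated assumptions, not the direct runner's implementation: TeardownWaitSketch and its run method are hypothetical names, and the sleep stands in for a long teardown() body. It contrasts awaiting the worker with cancelling it.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class TeardownWaitSketch {

  /**
   * Submits a task whose simulated teardown sleeps for teardownMillis, then ends
   * the run either by awaiting the worker or by cancelling it. Returns true if
   * the teardown ran to completion before this method returned.
   */
  static boolean run(boolean waitForTeardown, long teardownMillis) {
    AtomicBoolean teardownCompleted = new AtomicBoolean(false);
    ExecutorService executor = Executors.newSingleThreadExecutor();
    executor.submit(() -> {
      try {
        Thread.sleep(teardownMillis); // simulated slow teardown
        teardownCompleted.set(true);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // teardown was cut short
      }
    });
    try {
      if (waitForTeardown) {
        // Let submitted work finish, and block until it has.
        executor.shutdown();
        executor.awaitTermination(teardownMillis + 5_000, TimeUnit.MILLISECONDS);
      } else {
        // Interrupt running work and drop queued work without waiting.
        executor.shutdownNow();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return teardownCompleted.get();
  }

  public static void main(String[] args) {
    System.out.println("await:  teardown completed = " + run(true, 300));
    System.out.println("cancel: teardown completed = " + run(false, 300));
  }
}
```

In this sketch the awaiting variant should report a completed teardown and the cancelling variant should not. A test like the reporter's LongTearDownFn case checks the same property against a real pipeline.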



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3250) Migrate ValidatesRunner Jenkins PostCommits to Gradle

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3250?focusedWorklogId=87865&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87865
 ]

ASF GitHub Bot logged work on BEAM-3250:


Author: ASF GitHub Bot
Created on: 05/Apr/18 03:07
Start Date: 05/Apr/18 03:07
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #5029: 
[BEAM-3250] Migrate Dataflow ValidatesRunner test to Gradle
URL: https://github.com/apache/beam/pull/5029#discussion_r179340496
 
 

 ##
 File path: runners/google-cloud-dataflow-java/build.gradle
 ##
 @@ -70,23 +77,60 @@ dependencies {
   shadow library.java.jackson_annotations
   shadow library.java.jackson_databind
   shadow library.java.slf4j_api
-  testCompile library.java.hamcrest_core
-  testCompile library.java.junit
-  testCompile 
project(":sdks:java:io:google-cloud-platform").sourceSets.test.output
-  testCompile project(path: ":sdks:java:core", configuration: "shadowTest")
-  testCompile 
project(":sdks:java:extensions:google-cloud-platform-core").sourceSets.test.output
-  testCompile library.java.guava_testlib
-  testCompile library.java.slf4j_jdk14
-  testCompile library.java.mockito_core
-  testCompile library.java.google_cloud_dataflow_java_proto_library_all
-  testCompile library.java.datastore_v1_protos
-  testCompile library.java.jackson_dataformat_yaml
+  shadowTest library.java.hamcrest_core
+  shadowTest library.java.junit
+  shadowTest 
project(":sdks:java:io:google-cloud-platform").sourceSets.test.output
+  shadowTest project(path: ":sdks:java:core", configuration: "shadowTest")
+  shadowTest 
project(":sdks:java:extensions:google-cloud-platform-core").sourceSets.test.output
+  shadowTest library.java.guava_testlib
+  shadowTest library.java.slf4j_jdk14
+  shadowTest library.java.mockito_core
+  shadowTest library.java.google_cloud_dataflow_java_proto_library_all
+  shadowTest library.java.datastore_v1_protos
+  shadowTest library.java.jackson_dataformat_yaml
+  validatesRunner project(path: ":sdks:java:core", configuration: "shadowTest")
+  validatesRunner project(path: project.path, configuration: "shadow")
 }
 
 test {
   systemProperties = [ "beamUseDummyRunner" : "true" ]
 }
 
+task validatesRunnerTest(type: Test) {
+  group = "Verification"
+  def dataflowProject = project.findProperty('dataflowProject') ?: 
'apache-beam-testing'
+  def dataflowTempRoot = project.findProperty('dataflowTempRoot') ?: 
'gs://temp-storage-for-validates-runner-tests/'
+  systemProperty "beamTestPipelineOptions", JsonOutput.toJson([
+  "--runner=TestDataflowRunner",
+  "--project=${dataflowProject}",
+  "--tempRoot=${dataflowTempRoot}",
+  ])
+
+
+  classpath = configurations.validatesRunner
+  testClassesDirs = 
files(project(":sdks:java:core").sourceSets.test.output.classesDirs)
+  useJUnit {
+includeCategories 'org.apache.beam.sdk.testing.ValidatesRunner'
+excludeCategories 'org.apache.beam.sdk.testing.LargeKeys$Above10MB'
 
 Review comment:
   I don't see any excludes defined here: 
https://github.com/apache/beam/blob/a2a5d3d7aa59b5cfde1c47a6286bcb3ccd7f8c85/runners/google-cloud-dataflow-java/pom.xml#L64
   
   Is there another place where you got this list from?




Issue Time Tracking
---

Worklog Id: (was: 87865)
Time Spent: 2h 20m  (was: 2h 10m)

> Migrate ValidatesRunner Jenkins PostCommits to Gradle
> -
>
> Key: BEAM-3250
> URL: https://issues.apache.org/jira/browse/BEAM-3250
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Henning Rohde
>Priority: Major
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Update these targets to execute ValidatesRunner tests: 
> https://github.com/apache/beam/search?l=Groovy=ValidatesRunner==%E2%9C%93





Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Dataflow #5281

2018-04-04 Thread Apache Jenkins Server
See 




[jira] [Created] (BEAM-4014) Migrate MavenInstall Jenkins PostCommits to Gradle

2018-04-04 Thread Henning Rohde (JIRA)
Henning Rohde created BEAM-4014:
---

 Summary: Migrate MavenInstall Jenkins PostCommits to Gradle
 Key: BEAM-4014
 URL: https://issues.apache.org/jira/browse/BEAM-4014
 Project: Beam
  Issue Type: Sub-task
  Components: build-system, testing
Reporter: Henning Rohde








[jira] [Work logged] (BEAM-3250) Migrate ValidatesRunner Jenkins PostCommits to Gradle

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3250?focusedWorklogId=87857&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87857
 ]

ASF GitHub Bot logged work on BEAM-3250:


Author: ASF GitHub Bot
Created on: 05/Apr/18 02:20
Start Date: 05/Apr/18 02:20
Worklog Time Spent: 10m 
  Work Description: herohde opened a new pull request #5029: [BEAM-3250] 
Migrate Dataflow ValidatesRunner test to Gradle
URL: https://github.com/apache/beam/pull/5029
 
 
   
   




Issue Time Tracking
---

Worklog Id: (was: 87857)
Time Spent: 2h 10m  (was: 2h)

> Migrate ValidatesRunner Jenkins PostCommits to Gradle
> -
>
> Key: BEAM-3250
> URL: https://issues.apache.org/jira/browse/BEAM-3250
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Henning Rohde
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Update these targets to execute ValidatesRunner tests: 
> https://github.com/apache/beam/search?l=Groovy=ValidatesRunner==%E2%9C%93





Build failed in Jenkins: beam_PerformanceTests_Spark #1551

2018-04-04 Thread Apache Jenkins Server
See 


Changes:

[ankurgoenka] Fixing lint errrors

[markliu] [BEAM-3946] Fix pubsub_matcher_test which depends on

[amyrvold] Rename flink job to fix seed

[ehudm] Add HadoopFileSystemOptions support for Dataflow.

[ehudm] Fix HadoopFileSystem.match bugs.

[ehudm] Test HDFS reads in integration test.

[ehudm] Fix linter errors and add missing license.

[github] [BEAM-3774] Adds support for reading from/writing to more BQ

--
[...truncated 70.19 KB...]
2018-04-05 01:00:10,821 3b321c3d MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-04-05 01:00:36,173 3b321c3d MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-04-05 01:00:39,639 3b321c3d MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1
STDOUT: 

BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r6f0dc9f6ed1cf97f_01629351d585_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: Upload complete.Waiting on bqjob_r6f0dc9f6ed1cf97f_01629351d585_1 
... (0s) Current status: RUNNING
  Waiting on 
bqjob_r6f0dc9f6ed1cf97f_01629351d585_1 ... (0s) Current status: DONE   
2018-04-05 01:00:39,640 3b321c3d MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-04-05 01:01:00,297 3b321c3d MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-04-05 01:01:03,934 3b321c3d MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1
STDOUT: 

BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r9a6302013cba5a9_0162935233cd_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: Upload complete.Waiting on bqjob_r9a6302013cba5a9_0162935233cd_1 
... (0s) Current status: RUNNING
 Waiting on 
bqjob_r9a6302013cba5a9_0162935233cd_1 ... (0s) Current status: DONE   
2018-04-05 01:01:03,935 3b321c3d MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-04-05 01:01:20,709 3b321c3d MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-04-05 01:01:23,975 3b321c3d MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1
STDOUT: 

BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r3cbee139df3f0d92_016293528368_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: Upload complete.Waiting on bqjob_r3cbee139df3f0d92_016293528368_1 
... (0s) Current status: RUNNING
  Waiting on 
bqjob_r3cbee139df3f0d92_016293528368_1 ... (0s) Current status: DONE   
2018-04-05 01:01:23,975 3b321c3d MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-04-05 01:01:41,550 3b321c3d MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-04-05 01:01:44,773 3b321c3d MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1
STDOUT: 

BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r4e914b8698da37b3_01629352d4d3_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: Upload 

Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle #1

2018-04-04 Thread Apache Jenkins Server
See 


--
Started by GitHub push by aaltay
[EnvInject] - Loading node environment variables.
Building remotely on beam2 (beam) in workspace 

Cloning the remote Git repository
Cloning repository https://github.com/apache/beam.git
 > git init 
 > 
 >  # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # 
 > timeout=10
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/gearpump-runner^{commit} # timeout=10
 > git rev-parse gearpump-runner^{commit} # timeout=10
ERROR: Couldn't find any revision to build. Verify the repository and branch 
configuration for this job.
Retrying after 10 seconds
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/gearpump-runner^{commit} # timeout=10
 > git rev-parse gearpump-runner^{commit} # timeout=10
ERROR: Couldn't find any revision to build. Verify the repository and branch 
configuration for this job.
Retrying after 10 seconds
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/${ghprbPullId}/*:refs/remotes/origin/pr/${ghprbPullId}/*
 > git rev-parse origin/gearpump-runner^{commit} # timeout=10
 > git rev-parse gearpump-runner^{commit} # timeout=10
ERROR: Couldn't find any revision to build. Verify the repository and branch 
configuration for this job.


Build failed in Jenkins: beam_PerformanceTests_XmlIOIT_HDFS #9

2018-04-04 Thread Apache Jenkins Server
See 


Changes:

[ankurgoenka] Fixing lint errrors

[markliu] [BEAM-3946] Fix pubsub_matcher_test which depends on

[amyrvold] Rename flink job to fix seed

[ehudm] Add HadoopFileSystemOptions support for Dataflow.

[ehudm] Fix HadoopFileSystem.match bugs.

[ehudm] Test HDFS reads in integration test.

[ehudm] Fix linter errors and add missing license.

[github] [BEAM-3774] Adds support for reading from/writing to more BQ

--
[...truncated 207.76 KB...]
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy61.create(Unknown Source)
at 
org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1623)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1703)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1638)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:448)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:444)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:459)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:387)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:892)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:789)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:778)
at 
org.apache.beam.sdk.io.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:109)
at 
org.apache.beam.sdk.io.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:68)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:248)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:235)
at 
org.apache.beam.sdk.io.FileBasedSink$Writer.open(FileBasedSink.java:923)
at 
org.apache.beam.sdk.io.WriteFiles$WriteUnshardedTempFilesWithSpillingFn.processElement(WriteFiles.java:503)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at 
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at 
org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614)
at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)
at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
at org.apache.hadoop.ipc.Client.call(Client.java:1451)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy60.create(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:296)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy61.create(Unknown Source)
at 
org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1623)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1703)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1638)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:448)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:444)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:459)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:387)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:892)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:789)
 

Build failed in Jenkins: beam_PerformanceTests_MongoDBIO_IT #11

2018-04-04 Thread Apache Jenkins Server
See 


Changes:

[ankurgoenka] Fixing lint errrors

[markliu] [BEAM-3946] Fix pubsub_matcher_test which depends on

[amyrvold] Rename flink job to fix seed

[ehudm] Add HadoopFileSystemOptions support for Dataflow.

[ehudm] Fix HadoopFileSystem.match bugs.

[ehudm] Test HDFS reads in integration test.

[ehudm] Fix linter errors and add missing license.

[github] [BEAM-3774] Adds support for reading from/writing to more BQ

--
[...truncated 733.88 KB...]
at 
com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:84)
at 
com.mongodb.operation.CommandReadOperation.execute(CommandReadOperation.java:55)
at com.mongodb.Mongo.execute(Mongo.java:772)
at com.mongodb.Mongo$2.execute(Mongo.java:759)
at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:130)
at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:124)
at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:114)
at 
org.apache.beam.sdk.io.mongodb.MongoDbIO$BoundedMongoDbSource.split(MongoDbIO.java:332)
at 
com.google.cloud.dataflow.worker.WorkerCustomSources.splitAndValidate(WorkerCustomSources.java:275)
at 
com.google.cloud.dataflow.worker.WorkerCustomSources.performSplitTyped(WorkerCustomSources.java:197)
at 
com.google.cloud.dataflow.worker.WorkerCustomSources.performSplitWithApiLimit(WorkerCustomSources.java:181)
at 
com.google.cloud.dataflow.worker.WorkerCustomSources.performSplit(WorkerCustomSources.java:160)
at 
com.google.cloud.dataflow.worker.WorkerCustomSourceOperationExecutor.execute(WorkerCustomSourceOperationExecutor.java:75)
at 
com.google.cloud.dataflow.worker.BatchDataflowWorker.executeWork(BatchDataflowWorker.java:381)
at 
com.google.cloud.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:353)
at 
com.google.cloud.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:284)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:134)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:114)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:101)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
com.mongodb.MongoTimeoutException: Timed out after 3 ms while waiting for a 
server that matches ReadPreferenceServerSelector{readPreference=primary}. 
Client view of cluster state is {type=UNKNOWN, 
servers=[{address=35.225.183.202:27017, type=UNKNOWN, state=CONNECTING, 
exception={com.mongodb.MongoSocketOpenException: Exception opening socket}, 
caused by {java.net.SocketTimeoutException: connect timed out}}]
at 
com.mongodb.connection.BaseCluster.createTimeoutException(BaseCluster.java:369)
at com.mongodb.connection.BaseCluster.selectServer(BaseCluster.java:101)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:75)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:71)
at 
com.mongodb.binding.ClusterBinding.getReadConnectionSource(ClusterBinding.java:63)
at 
com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:89)
at 
com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:84)
at 
com.mongodb.operation.CommandReadOperation.execute(CommandReadOperation.java:55)
at com.mongodb.Mongo.execute(Mongo.java:772)
at com.mongodb.Mongo$2.execute(Mongo.java:759)
at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:130)
at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:124)
at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:114)
at 
org.apache.beam.sdk.io.mongodb.MongoDbIO$BoundedMongoDbSource.split(MongoDbIO.java:332)
at 
com.google.cloud.dataflow.worker.WorkerCustomSources.splitAndValidate(WorkerCustomSources.java:275)
at 
com.google.cloud.dataflow.worker.WorkerCustomSources.performSplitTyped(WorkerCustomSources.java:197)
at 
com.google.cloud.dataflow.worker.WorkerCustomSources.performSplitWithApiLimit(WorkerCustomSources.java:181)
at 
com.google.cloud.dataflow.worker.WorkerCustomSources.performSplit(WorkerCustomSources.java:160)
at 

[jira] [Work logged] (BEAM-3257) Migrate Python Jenkins PreCommits to Gradle

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3257?focusedWorklogId=87840&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87840
 ]

ASF GitHub Bot logged work on BEAM-3257:


Author: ASF GitHub Bot
Created on: 05/Apr/18 00:51
Start Date: 05/Apr/18 00:51
Worklog Time Spent: 10m 
  Work Description: lukecwik closed pull request #5010: [BEAM-3257] Add 
Python precommit gradle config
URL: https://github.com/apache/beam/pull/5010
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/.test-infra/jenkins/job_beam_PreCommit_Python_GradleBuild.groovy b/.test-infra/jenkins/job_beam_PreCommit_Python_GradleBuild.groovy
new file mode 100644
index 000..d2fdef72ad3
--- /dev/null
+++ b/.test-infra/jenkins/job_beam_PreCommit_Python_GradleBuild.groovy
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import common_job_properties
+
+// This is the Python precommit which runs a Gradle build, and the current set
+// of precommit tests.
+job('beam_PreCommit_Python_GradleBuild') {
+  description('Runs Python PreCommit tests for the current GitHub Pull Request.')
+
+  // Execute concurrent builds if necessary.
+  concurrentBuild()
+
+  // Set common parameters.
+  common_job_properties.setTopLevelMainJobProperties(
+delegate,
+'master',
+90)
+
+  def gradle_switches = [
+// Gradle log verbosity enough to diagnose basic build issues
+"--info",
+// Continue the build even if there is a failure to show as many potential failures as possible.
+'--continue',
+// Until we verify the build cache is working appropriately, force rerunning all tasks
+'--rerun-tasks',
+  ]
+
+  def gradle_command_line = './gradlew ' + gradle_switches.join(' ') + ' :pythonPreCommit'
+  // Sets that this is a PreCommit job.
+  common_job_properties.setPreCommit(delegate, gradle_command_line, 'Run Python Gradle PreCommit')
+  steps {
+gradle {
+  rootBuildScriptDir(common_job_properties.checkoutDir)
+  tasks(':pythonPreCommit')
+  for (String gradle_switch : gradle_switches) {
+switches(gradle_switch)
+  }
+}
+  }
+}
diff --git a/build.gradle b/build.gradle
index 602f8e11a33..7b8c60c169e 100644
--- a/build.gradle
+++ b/build.gradle
@@ -126,3 +126,9 @@ task goPreCommit() {
   dependsOn ":rat"
   dependsOn ":sdks:go:test"
 }
+
+task pythonPreCommit() {
+  dependsOn ":rat"
+  dependsOn ":sdks:python:check"
+}
+
diff --git a/sdks/python/build.gradle b/sdks/python/build.gradle
index 2c6637a5373..be17644e8d3 100644
--- a/sdks/python/build.gradle
+++ b/sdks/python/build.gradle
@@ -25,75 +25,123 @@ apply plugin: "base"
 task test {}
 check.dependsOn test
 
-task setupTest {
+def envdir = "${project.buildDir}/gradleenv"
+
+task setupVirtualenv {
   doLast {
+exec {
+  commandLine 'virtualenv', "${envdir}"
+}
 exec {
   executable 'sh'
-  args '-c', 'which tox || pip install --user --upgrade tox'
+  args '-c', ". ${envdir}/bin/activate && pip install --upgrade tox"
 }
   }
+  outputs.files("${envdir}/bin/tox")
 }
 
-task sdist {
+task sdist(dependsOn: 'setupVirtualenv') {
   doLast {
 exec {
-  commandLine 'python', 'setup.py', 'sdist', '--formats', 'zip,gztar', '--dist-dir', project.buildDir
+  executable 'sh'
+  args '-c', ". ${envdir}/bin/activate && python setup.py sdist --formats zip,gztar --dist-dir ${project.buildDir}"
 }
   }
 }
 
-task cleanPython {
+task cleanPython(dependsOn: 'setupVirtualenv') {
   doLast {
 exec {
-  commandLine 'python', 'setup.py', 'clean'
+  executable 'sh'
+  args '-c', ". ${envdir}/bin/activate && python setup.py clean"
 }
   }
 }
 clean.dependsOn cleanPython
 
-task buildPython {
+task buildPython(dependsOn: 'setupVirtualenv') {
   doLast {
 println 'Building Python Dependencies'
 exec {
-  commandLine 'python', 'setup.py', 'build', 

[beam] branch master updated (a4fb844 -> a2a5d3d)

2018-04-04 Thread lcwik
This is an automated email from the ASF dual-hosted git repository.

lcwik pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from a4fb844  Secure GRPC channel for SDK worker (#4984)
 add 76472ff  Update Python Gradle tasks to run in a venv.
 add 84a491e  Add Gradle based Python precommit.
 new a2a5d3d  [BEAM-3257] Add Python precommit gradle config

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 ...> job_beam_PreCommit_Python_GradleBuild.groovy} | 14 ++--
 build.gradle   |  6 ++
 sdks/python/build.gradle   | 90 +-
 3 files changed, 82 insertions(+), 28 deletions(-)
 copy .test-infra/jenkins/{job_beam_PreCommit_Go_GradleBuild.groovy => 
job_beam_PreCommit_Python_GradleBuild.groovy} (85%)

-- 
To stop receiving notification emails like this one, please contact
lc...@apache.org.


[beam] 01/01: [BEAM-3257] Add Python precommit gradle config

2018-04-04 Thread lcwik
This is an automated email from the ASF dual-hosted git repository.

lcwik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit a2a5d3d7aa59b5cfde1c47a6286bcb3ccd7f8c85
Merge: a4fb844 84a491e
Author: Lukasz Cwik 
AuthorDate: Wed Apr 4 17:51:13 2018 -0700

[BEAM-3257] Add Python precommit gradle config

 .../job_beam_PreCommit_Python_GradleBuild.groovy   | 56 ++
 build.gradle   |  6 ++
 sdks/python/build.gradle   | 90 +-
 3 files changed, 131 insertions(+), 21 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
lc...@apache.org.
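
The Jenkins job added by this commit builds its Gradle invocation by joining a list of switches and appending the task name. A minimal Python sketch of that string composition follows. The switch values and task name are taken from the Groovy diff, while the function `gradle_command_line` is a hypothetical helper written only for illustration and is not part of the commit.

```python
def gradle_command_line(switches, task):
    """Join Gradle switches and a task name into one ./gradlew invocation."""
    return './gradlew ' + ' '.join(switches) + ' ' + task


# Same switches, in the same order, as the Groovy job definition.
gradle_switches = ['--info', '--continue', '--rerun-tasks']
command = gradle_command_line(gradle_switches, ':pythonPreCommit')
print(command)
```

Keeping the switches in a list lets the same values feed both the trigger phrase command line and the per-switch `switches(...)` calls in the `gradle` step.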


[beam] branch master updated: Secure GRPC channel for SDK worker (#4984)

2018-04-04 Thread robertwb
This is an automated email from the ASF dual-hosted git repository.

robertwb pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
 new a4fb844  Secure GRPC channel for SDK worker (#4984)
a4fb844 is described below

commit a4fb844df82051ef93bb7e2d47967e143eabcc5c
Author: ananvay 
AuthorDate: Wed Apr 4 17:50:04 2018 -0700

Secure GRPC channel for SDK worker (#4984)
---
 .../apache_beam/runners/worker/data_plane.py   | 25 +++---
 .../apache_beam/runners/worker/sdk_worker.py   | 15 ++---
 2 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/sdks/python/apache_beam/runners/worker/data_plane.py b/sdks/python/apache_beam/runners/worker/data_plane.py
index 7c79c4c..1ff60aa 100644
--- a/sdks/python/apache_beam/runners/worker/data_plane.py
+++ b/sdks/python/apache_beam/runners/worker/data_plane.py
@@ -295,9 +295,13 @@ class GrpcClientDataChannelFactory(DataChannelFactory):
   Caches the created channels by ``data descriptor url``.
   """
 
-  def __init__(self):
+  def __init__(self, credentials=None):
 self._data_channel_cache = {}
 self._lock = threading.Lock()
+self._credentials = None
+if credentials is not None:
+  logging.info('Using secure channel creds.')
+  self._credentials = credentials
 
   def create_data_channel(self, remote_grpc_port):
 url = remote_grpc_port.api_service_descriptor.url
@@ -305,18 +309,23 @@ class GrpcClientDataChannelFactory(DataChannelFactory):
   with self._lock:
 if url not in self._data_channel_cache:
   logging.info('Creating channel for %s', url)
-  grpc_channel = grpc.insecure_channel(
-  url,
-  # Options to have no limits (-1) on the size of the messages
-  # received or sent over the data plane. The actual buffer size is
-  # controlled in a layer above.
-  options=[("grpc.max_receive_message_length", -1),
-   ("grpc.max_send_message_length", -1)])
+  # Options to have no limits (-1) on the size of the messages
+  # received or sent over the data plane. The actual buffer size
+  # is controlled in a layer above.
+  channel_options = [("grpc.max_receive_message_length", -1),
+ ("grpc.max_send_message_length", -1)]
+  grpc_channel = None
+  if self._credentials is None:
+grpc_channel = grpc.insecure_channel(url, options=channel_options)
+  else:
+grpc_channel = grpc.secure_channel(
+url, self._credentials, options=channel_options)
   # Add workerId to the grpc channel
   grpc_channel = grpc.intercept_channel(grpc_channel,
 WorkerIdInterceptor())
   self._data_channel_cache[url] = GrpcClientDataChannel(
   beam_fn_api_pb2_grpc.BeamFnDataStub(grpc_channel))
+
 return self._data_channel_cache[url]
 
   def close(self):
diff --git a/sdks/python/apache_beam/runners/worker/sdk_worker.py b/sdks/python/apache_beam/runners/worker/sdk_worker.py
index c77659b..3b6ed65 100644
--- a/sdks/python/apache_beam/runners/worker/sdk_worker.py
+++ b/sdks/python/apache_beam/runners/worker/sdk_worker.py
@@ -40,12 +40,21 @@ from apache_beam.runners.worker.worker_id_interceptor import WorkerIdInterceptor
 class SdkHarness(object):
   REQUEST_METHOD_PREFIX = '_request_'
 
-  def __init__(self, control_address, worker_count):
+  def __init__(self, control_address, worker_count, credentials=None):
 self._worker_count = worker_count
 self._worker_index = 0
+if credentials is None:
+  logging.info('Creating insecure channel.')
+  self._control_channel = grpc.insecure_channel(control_address)
+else:
+  logging.info('Creating secure channel.')
+  self._control_channel = grpc.secure_channel(control_address, credentials)
+  grpc.channel_ready_future(self._control_channel).result()
+  logging.info('Secure channel established.')
 self._control_channel = grpc.intercept_channel(
-grpc.insecure_channel(control_address), WorkerIdInterceptor())
-self._data_channel_factory = data_plane.GrpcClientDataChannelFactory()
+self._control_channel, WorkerIdInterceptor())
+self._data_channel_factory = data_plane.GrpcClientDataChannelFactory(
+credentials)
 self.workers = queue.Queue()
 # one thread is enough for getting the progress report.
 # Assumption:

-- 
To stop receiving notification emails like this one, please contact
rober...@apache.org.
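
The diff above picks an insecure or a secure channel depending on whether credentials were supplied, and caches one channel per URL under a lock. The following is a minimal sketch of that selection-plus-cache pattern, not the Beam implementation. To keep it runnable without gRPC installed, `fake_insecure` and `fake_secure` are hypothetical stand-ins for `grpc.insecure_channel` and `grpc.secure_channel`, and `ChannelCache` is an illustrative name that does not appear in the commit.

```python
import threading


class ChannelCache(object):
    """Caches one channel per URL, created securely only if credentials exist."""

    def __init__(self, insecure_factory, secure_factory, credentials=None):
        # The factories are injected so the sketch does not depend on gRPC.
        self._insecure_factory = insecure_factory
        self._secure_factory = secure_factory
        self._credentials = credentials
        self._cache = {}
        self._lock = threading.Lock()

    def get(self, url):
        # The lock ensures two threads never create two channels for one URL.
        with self._lock:
            if url not in self._cache:
                if self._credentials is None:
                    channel = self._insecure_factory(url)
                else:
                    channel = self._secure_factory(url, self._credentials)
                self._cache[url] = channel
            return self._cache[url]


def fake_insecure(url):
    """Hypothetical stand-in for grpc.insecure_channel."""
    return ('insecure', url)


def fake_secure(url, credentials):
    """Hypothetical stand-in for grpc.secure_channel."""
    return ('secure', url, credentials)


plain = ChannelCache(fake_insecure, fake_secure)
tls = ChannelCache(fake_insecure, fake_secure, credentials='fake-creds')
print(plain.get('localhost:1234'))
print(tls.get('localhost:1234'))
```

Defaulting `credentials` to `None` keeps existing callers on the insecure path, which matches the backwards-compatible signature change in the diff.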


[jira] [Work logged] (BEAM-3250) Migrate ValidatesRunner Jenkins PostCommits to Gradle

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3250?focusedWorklogId=87839&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87839
 ]

ASF GitHub Bot logged work on BEAM-3250:


Author: ASF GitHub Bot
Created on: 05/Apr/18 00:49
Start Date: 05/Apr/18 00:49
Worklog Time Spent: 10m 
  Work Description: aaltay closed pull request #5025: [BEAM-3250] Migrate 
Apex and Gearpump ValidatesRunner tests to Gradle
URL: https://github.com/apache/beam/pull/5025
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/.test-infra/jenkins/job_beam_PostCommit_Java_ValidatesRunner_Apex.groovy b/.test-infra/jenkins/job_beam_PostCommit_Java_ValidatesRunner_Apex.groovy
index c16a1e2f9d0..512cfa976a1 100644
--- a/.test-infra/jenkins/job_beam_PostCommit_Java_ValidatesRunner_Apex.groovy
+++ b/.test-infra/jenkins/job_beam_PostCommit_Java_ValidatesRunner_Apex.groovy
@@ -19,15 +19,22 @@
 import common_job_properties
 
 // This job runs the suite of ValidatesRunner tests against the Apex runner.
-mavenJob('beam_PostCommit_Java_ValidatesRunner_Apex') {
+job('beam_PostCommit_Java_ValidatesRunner_Apex_Gradle') {
   description('Runs the ValidatesRunner suite on the Apex runner.')
+  previousNames('beam_PostCommit_Java_ValidatesRunner_Apex')
   previousNames('beam_PostCommit_Java_RunnableOnService_Apex')
 
   // Set common parameters.
   common_job_properties.setTopLevelMainJobProperties(delegate)
 
-  // Set maven parameters.
-  common_job_properties.setMavenConfig(delegate)
+  def gradle_switches = [
+// Gradle log verbosity enough to diagnose basic build issues
+"--info",
+// Continue the build even if there is a failure to show as many potential failures as possible.
+'--continue',
+// Until we verify the build cache is working appropriately, force rerunning all tasks
+'--rerun-tasks',
+  ]
 
   // Sets that this is a PostCommit job.
   common_job_properties.setPostCommit(delegate)
@@ -38,11 +45,14 @@ mavenJob('beam_PostCommit_Java_ValidatesRunner_Apex') {
 'Apache Apex Runner ValidatesRunner Tests',
 'Run Apex ValidatesRunner')
 
-  // Maven goals for this job.
-  goals('''clean verify --projects runners/apex \
-  --also-make \
-  --batch-mode \
-  --errors \
-  --activate-profiles validates-runner-tests \
-  --activate-profiles local-validates-runner-tests''')
+  // Gradle goals for this job.
+  steps {
+gradle {
+  rootBuildScriptDir(common_job_properties.checkoutDir)
+  tasks(':runners:apex:validatesRunner')
+  for (String gradle_switch : gradle_switches) {
+switches(gradle_switch)
+  }
+}
+  }
 }
diff --git a/.test-infra/jenkins/job_beam_PostCommit_Java_ValidatesRunner_Gearpump.groovy b/.test-infra/jenkins/job_beam_PostCommit_Java_ValidatesRunner_Gearpump.groovy
index e1cbafe6e4b..8ba0a71dc59 100644
--- a/.test-infra/jenkins/job_beam_PostCommit_Java_ValidatesRunner_Gearpump.groovy
+++ b/.test-infra/jenkins/job_beam_PostCommit_Java_ValidatesRunner_Gearpump.groovy
@@ -20,9 +20,9 @@ import common_job_properties
 
 // This job runs the suite of ValidatesRunner tests against the Gearpump
 // runner.
-mavenJob('beam_PostCommit_Java_ValidatesRunner_Gearpump') {
+job('beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle') {
   description('Runs the ValidatesRunner suite on the Gearpump runner.')
-
+  previousNames('beam_PostCommit_Java_ValidatesRunner_Gearpump')
   previousNames('beam_PostCommit_Java_RunnableOnService_Gearpump')
 
   // Set common parameters.
@@ -30,8 +30,14 @@ mavenJob('beam_PostCommit_Java_ValidatesRunner_Gearpump') {
   delegate,
   'gearpump-runner')
 
-  // Set maven parameters.
-  common_job_properties.setMavenConfig(delegate)
+  def gradle_switches = [
+// Gradle log verbosity enough to diagnose basic build issues
+"--info",
+// Continue the build even if there is a failure to show as many potential failures as possible.
+'--continue',
+// Until we verify the build cache is working appropriately, force rerunning all tasks
+'--rerun-tasks',
+  ]
 
   // Sets that this is a PostCommit job.
   // 0 5 31 2 * will run on Feb 31 (i.e. never) according to job properties.
@@ -44,6 +50,14 @@ mavenJob('beam_PostCommit_Java_ValidatesRunner_Gearpump') {
 'Apache Gearpump Runner ValidatesRunner Tests',
 'Run Gearpump ValidatesRunner')
 
-  // Maven goals for this job.
-  goals('-B -e clean verify -am -pl runners/gearpump -DforkCount=0 -DvalidatesRunnerPipelineOptions=\'[ "--runner=TestGearpumpRunner"]\'')
+  // Gradle goals for this job.
+  steps {
+gradle {
+  rootBuildScriptDir(common_job_properties.checkoutDir)
+  

[beam] branch master updated: [BEAM-3250] Migrate Apex and Gearpump ValidatesRunner tests to Gradle (#5025)

2018-04-04 Thread altay
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
 new e42278c  [BEAM-3250] Migrate Apex and Gearpump ValidatesRunner tests 
to Gradle (#5025)
e42278c is described below

commit e42278c485152fac204e8018cf3134d9d35048fe
Author: Henning Rohde 
AuthorDate: Wed Apr 4 17:49:30 2018 -0700

[BEAM-3250] Migrate Apex and Gearpump ValidatesRunner tests to Gradle 
(#5025)

* [BEAM-3250] Migrate Apex and Gearpump ValidatesRunner tests to Gradle
---
 ...eam_PostCommit_Java_ValidatesRunner_Apex.groovy | 30 -
 ...PostCommit_Java_ValidatesRunner_Gearpump.groovy | 26 ---
 runners/apex/build.gradle  | 47 +---
 runners/flink/build.gradle |  2 +-
 runners/gearpump/build.gradle  | 51 +++---
 .../beam/sdk/testing/UsesParDoLifecycle.java   | 24 ++
 .../beam/sdk/transforms/ParDoLifecycleTest.java| 25 ++-
 7 files changed, 164 insertions(+), 41 deletions(-)

diff --git a/.test-infra/jenkins/job_beam_PostCommit_Java_ValidatesRunner_Apex.groovy b/.test-infra/jenkins/job_beam_PostCommit_Java_ValidatesRunner_Apex.groovy
index c16a1e2..512cfa9 100644
--- a/.test-infra/jenkins/job_beam_PostCommit_Java_ValidatesRunner_Apex.groovy
+++ b/.test-infra/jenkins/job_beam_PostCommit_Java_ValidatesRunner_Apex.groovy
@@ -19,15 +19,22 @@
 import common_job_properties
 
 // This job runs the suite of ValidatesRunner tests against the Apex runner.
-mavenJob('beam_PostCommit_Java_ValidatesRunner_Apex') {
+job('beam_PostCommit_Java_ValidatesRunner_Apex_Gradle') {
   description('Runs the ValidatesRunner suite on the Apex runner.')
+  previousNames('beam_PostCommit_Java_ValidatesRunner_Apex')
   previousNames('beam_PostCommit_Java_RunnableOnService_Apex')
 
   // Set common parameters.
   common_job_properties.setTopLevelMainJobProperties(delegate)
 
-  // Set maven parameters.
-  common_job_properties.setMavenConfig(delegate)
+  def gradle_switches = [
+// Gradle log verbosity enough to diagnose basic build issues
+"--info",
+// Continue the build even if there is a failure to show as many potential failures as possible.
+'--continue',
+// Until we verify the build cache is working appropriately, force rerunning all tasks
+'--rerun-tasks',
+  ]
 
   // Sets that this is a PostCommit job.
   common_job_properties.setPostCommit(delegate)
@@ -38,11 +45,14 @@ mavenJob('beam_PostCommit_Java_ValidatesRunner_Apex') {
 'Apache Apex Runner ValidatesRunner Tests',
 'Run Apex ValidatesRunner')
 
-  // Maven goals for this job.
-  goals('''clean verify --projects runners/apex \
-  --also-make \
-  --batch-mode \
-  --errors \
-  --activate-profiles validates-runner-tests \
-  --activate-profiles local-validates-runner-tests''')
+  // Gradle goals for this job.
+  steps {
+gradle {
+  rootBuildScriptDir(common_job_properties.checkoutDir)
+  tasks(':runners:apex:validatesRunner')
+  for (String gradle_switch : gradle_switches) {
+switches(gradle_switch)
+  }
+}
+  }
 }
diff --git a/.test-infra/jenkins/job_beam_PostCommit_Java_ValidatesRunner_Gearpump.groovy b/.test-infra/jenkins/job_beam_PostCommit_Java_ValidatesRunner_Gearpump.groovy
index e1cbafe..8ba0a71 100644
--- a/.test-infra/jenkins/job_beam_PostCommit_Java_ValidatesRunner_Gearpump.groovy
+++ b/.test-infra/jenkins/job_beam_PostCommit_Java_ValidatesRunner_Gearpump.groovy
@@ -20,9 +20,9 @@ import common_job_properties
 
 // This job runs the suite of ValidatesRunner tests against the Gearpump
 // runner.
-mavenJob('beam_PostCommit_Java_ValidatesRunner_Gearpump') {
+job('beam_PostCommit_Java_ValidatesRunner_Gearpump_Gradle') {
   description('Runs the ValidatesRunner suite on the Gearpump runner.')
-
+  previousNames('beam_PostCommit_Java_ValidatesRunner_Gearpump')
   previousNames('beam_PostCommit_Java_RunnableOnService_Gearpump')
 
   // Set common parameters.
@@ -30,8 +30,14 @@ mavenJob('beam_PostCommit_Java_ValidatesRunner_Gearpump') {
   delegate,
   'gearpump-runner')
 
-  // Set maven parameters.
-  common_job_properties.setMavenConfig(delegate)
+  def gradle_switches = [
+// Gradle log verbosity enough to diagnose basic build issues
+"--info",
+// Continue the build even if there is a failure to show as many potential failures as possible.
+'--continue',
+// Until we verify the build cache is working appropriately, force rerunning all tasks
+'--rerun-tasks',
+  ]
 
   // Sets that this is a PostCommit job.
   // 0 5 31 2 * will run on Feb 31 (i.e. never) according to job properties.
@@ -44,6 +50,14 @@ mavenJob('beam_PostCommit_Java_ValidatesRunner_Gearpump') {
 'Apache Gearpump Runner ValidatesRunner Tests',
   

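The Gearpump job definition above carries the comment that the cron expression `0 5 31 2 *` targets February 31 and therefore never fires. A small Python check of the calendar fact behind that comment follows. It only confirms that no February has a 31st; how Jenkins itself evaluates such an expression is not verified here, and `day_exists` is a hypothetical helper.

```python
import calendar


def day_exists(year, month, day):
    """Return True if the given day of the month exists in that year."""
    days_in_month = calendar.monthrange(year, month)[1]
    return 1 <= day <= days_in_month


# Check a century of Februaries, leap years included.
feb_31_ever_exists = any(day_exists(y, 2, 31) for y in range(2018, 2118))
print(feb_31_ever_exists)
```

Scheduling on a date that cannot occur is a way to keep a required cron field populated while leaving the job runnable only through its manual trigger phrase.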
Build failed in Jenkins: beam_PerformanceTests_Python #1107

2018-04-04 Thread Apache Jenkins Server
See 


Changes:

[ankurgoenka] Fixing lint errrors

[markliu] [BEAM-3946] Fix pubsub_matcher_test which depends on

[amyrvold] Rename flink job to fix seed

[ehudm] Add HadoopFileSystemOptions support for Dataflow.

[ehudm] Fix HadoopFileSystem.match bugs.

[ehudm] Test HDFS reads in integration test.

[ehudm] Fix linter errors and add missing license.

[github] [BEAM-3774] Adds support for reading from/writing to more BQ

--
[...truncated 62.93 KB...]
[INFO] 
[INFO] --- maven-resources-plugin:3.0.2:copy-resources (copy-go-cmd-source) @ 
beam-sdks-go ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 6 resources
[INFO] 
[INFO] --- maven-assembly-plugin:3.1.0:single (export-go-pkg-sources) @ 
beam-sdks-go ---
[INFO] Reading assembly descriptor: descriptor.xml
[INFO] Building zip: 

[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (process-resource-bundles) 
@ beam-sdks-go ---
[INFO] 
[INFO] --- mvn-golang-wrapper:2.1.6:get (go-get-imports) @ beam-sdks-go ---
[INFO] Prepared command line : bin/go get google.golang.org/grpc 
golang.org/x/oauth2/google google.golang.org/api/storage/v1 
github.com/spf13/cobra cloud.google.com/go/bigquery 
google.golang.org/api/googleapi google.golang.org/api/dataflow/v1b3
[INFO] 
[INFO] --- mvn-golang-wrapper:2.1.6:build (go-build) @ beam-sdks-go ---
[INFO] Prepared command line : bin/go build -buildmode=default -o 

 github.com/apache/beam/sdks/go/cmd/beamctl
[INFO] The Result file has been successfuly created : 

[INFO] 
[INFO] --- mvn-golang-wrapper:2.1.6:build (go-build-linux-amd64) @ beam-sdks-go 
---
[INFO] Prepared command line : bin/go build -buildmode=default -o 

 github.com/apache/beam/sdks/go/cmd/beamctl
[INFO] The Result file has been successfuly created : 

[INFO] 
[INFO] --- maven-checkstyle-plugin:3.0.0:check (default) @ beam-sdks-go ---
[INFO] 
[INFO] --- mvn-golang-wrapper:2.1.6:test (go-test) @ beam-sdks-go ---
[INFO] Prepared command line : bin/go test ./...
[INFO] 
[INFO] -Exec.Out-
[INFO] ?github.com/apache/beam/sdks/go/cmd/beamctl  [no test files]
[INFO] ?github.com/apache/beam/sdks/go/cmd/beamctl/cmd  [no test files]
[INFO] ?github.com/apache/beam/sdks/go/cmd/specialize   [no test files]
[INFO] ?github.com/apache/beam/sdks/go/cmd/symtab   [no test files]
[INFO] ok   github.com/apache/beam/sdks/go/pkg/beam 0.055s
[INFO] ok   github.com/apache/beam/sdks/go/pkg/beam/artifact0.100s
[INFO] 
[ERROR] 
[ERROR] -Exec.Err-
[ERROR] # github.com/apache/beam/sdks/go/pkg/beam/util/gcsx
[ERROR] github.com/apache/beam/sdks/go/pkg/beam/util/gcsx/gcs.go:46:37: 
undefined: option.WithoutAuthentication
[ERROR] 
[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Beam :: Parent .. SUCCESS [  3.580 s]
[INFO] Apache Beam :: SDKs :: Java :: Build Tools . SUCCESS [  3.260 s]
[INFO] Apache Beam :: Model ... SUCCESS [  0.097 s]
[INFO] Apache Beam :: Model :: Pipeline ... SUCCESS [ 11.221 s]
[INFO] Apache Beam :: Model :: Job Management . SUCCESS [  3.714 s]
[INFO] Apache Beam :: Model :: Fn Execution ... SUCCESS [  3.753 s]
[INFO] Apache Beam :: SDKs  SUCCESS [  0.180 s]
[INFO] Apache Beam :: SDKs :: Go .. FAILURE [ 31.317 s]
[INFO] Apache Beam :: SDKs :: Go :: Container . SKIPPED
[INFO] Apache Beam :: SDKs :: Java  SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: Core  SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: Fn Execution  SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: Extensions .. SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: Extensions :: Google Cloud Platform Core 
SKIPPED
[INFO] Apache Beam :: Runners . SKIPPED
[INFO] Apache Beam :: Runners :: Core Construction Java ... SKIPPED
[INFO] Apache Beam :: Runners :: Core Java  SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: Harness . SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: Container ... SKIPPED
[INFO] Apache Beam :: SDKs :: Java :: IO 

[jira] [Work logged] (BEAM-2927) Python SDK support for portable side input

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2927?focusedWorklogId=87837&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87837
 ]

ASF GitHub Bot logged work on BEAM-2927:


Author: ASF GitHub Bot
Created on: 05/Apr/18 00:35
Start Date: 05/Apr/18 00:35
Worklog Time Spent: 10m 
  Work Description: robertwb commented on issue #4983: [BEAM-2927] 
Re-enable side inputs for Fn API on Dataflow
URL: https://github.com/apache/beam/pull/4983#issuecomment-378786963
 
 
   run python postcommit
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87837)
Time Spent: 2h 10m  (was: 2h)

> Python SDK support for portable side input
> --
>
> Key: BEAM-2927
> URL: https://issues.apache.org/jira/browse/BEAM-2927
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Henning Rohde
>Assignee: Robert Bradshaw
>Priority: Major
>  Labels: portability
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-2732) State tracking in Python is inefficient and has duplicated code

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2732?focusedWorklogId=87836&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87836
 ]

ASF GitHub Bot logged work on BEAM-2732:


Author: ASF GitHub Bot
Created on: 05/Apr/18 00:32
Start Date: 05/Apr/18 00:32
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #4387: [BEAM-2732] Metrics 
rely on statesampler state
URL: https://github.com/apache/beam/pull/4387#issuecomment-378786407
 
 
   Python Postcommit tests are passing. I'm not sure why the `mvn clean install 
-pl sdks/python -am -am...` runs are broken, as they are a subset of the Postcommit 
suite:
   
   
![image](https://user-images.githubusercontent.com/1301740/38341543-00e4e738-382e-11e8-9b24-35e45dffbc10.png)
   
   Since all tests pass with Postcommit, @robertwb PTAL.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87836)
Time Spent: 6h 20m  (was: 6h 10m)

> State tracking in Python is inefficient and has duplicated code
> ---
>
> Key: BEAM-2732
> URL: https://issues.apache.org/jira/browse/BEAM-2732
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> E.g., logging and metrics keep state separately. State tracking should be 
> unified.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3973) Allow to disable batch API in SpannerIO

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3973?focusedWorklogId=87831&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87831
 ]

ASF GitHub Bot logged work on BEAM-3973:


Author: ASF GitHub Bot
Created on: 05/Apr/18 00:15
Start Date: 05/Apr/18 00:15
Worklog Time Spent: 10m 
  Work Description: mairbek commented on a change in pull request #4946: 
[BEAM-3973] Adds a parameter to the Cloud Spanner read connector that can 
disable batch API
URL: https://github.com/apache/beam/pull/4946#discussion_r179319302
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/spanner/SpannerReadIT.java
 ##
 @@ -193,6 +196,52 @@ public void testQuery() throws Exception {
 p.run();
   }
 
+  @Test
+  public void testReadAll() throws Exception {
+DatabaseClient databaseClient =
+spanner.getDatabaseClient(
+DatabaseId.of(
+project, options.getInstanceId(), databaseName));
+
+List<Mutation> mutations = new ArrayList<>();
+for (int i = 0; i < 5L; i++) {
+  mutations.add(
+  Mutation.newInsertOrUpdateBuilder(options.getTable())
+  .set("key")
+  .to((long) i)
+  .set("value")
+  .to(RandomUtils.randomAlphaNumeric(100))
+  .build());
+}
+
+databaseClient.writeAtLeastOnce(mutations);
+
+SpannerConfig spannerConfig = SpannerConfig.create()
 
 Review comment:
   Done


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87831)
Time Spent: 1h 20m  (was: 1h 10m)

> Allow to disable batch API in SpannerIO
> ---
>
> Key: BEAM-3973
> URL: https://issues.apache.org/jira/browse/BEAM-3973
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.4.0
>Reporter: Mairbek Khadikov
>Assignee: Mairbek Khadikov
>Priority: Major
> Fix For: 2.5.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> In 2.4.0, SpannerIO#read was migrated to use the batch API. The batch API 
> provides abstractions to scale out reads from Spanner, but it requires the 
> query to be root-partitionable. Root-partitionable queries cover the majority 
> of use cases; however, there are cases where running an arbitrary query is 
> useful, for example reading all the table names from 
> information_schema.* and then reading the contents of those tables in the next 
> step. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3973) Allow to disable batch API in SpannerIO

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3973?focusedWorklogId=87832&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87832
 ]

ASF GitHub Bot logged work on BEAM-3973:


Author: ASF GitHub Bot
Created on: 05/Apr/18 00:15
Start Date: 05/Apr/18 00:15
Worklog Time Spent: 10m 
  Work Description: mairbek commented on a change in pull request #4946: 
[BEAM-3973] Adds a parameter to the Cloud Spanner read connector that can 
disable batch API
URL: https://github.com/apache/beam/pull/4946#discussion_r179319684
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/spanner/SpannerReadIT.java
 ##
 @@ -193,6 +196,52 @@ public void testQuery() throws Exception {
 p.run();
   }
 
+  @Test
+  public void testReadAll() throws Exception {
+DatabaseClient databaseClient =
+spanner.getDatabaseClient(
+DatabaseId.of(
+project, options.getInstanceId(), databaseName));
+
+List<Mutation> mutations = new ArrayList<>();
+for (int i = 0; i < 5L; i++) {
+  mutations.add(
+  Mutation.newInsertOrUpdateBuilder(options.getTable())
+  .set("key")
+  .to((long) i)
+  .set("value")
+  .to(RandomUtils.randomAlphaNumeric(100))
+  .build());
+}
+
+databaseClient.writeAtLeastOnce(mutations);
+
+SpannerConfig spannerConfig = SpannerConfig.create()
+.withProjectId(project)
+.withInstanceId(options.getInstanceId())
+.withDatabaseId(databaseName);
+
+PCollectionView<Transaction> tx =
+p.apply(
+SpannerIO.createTransaction()
+.withSpannerConfig(spannerConfig)
+.withTimestampBound(TimestampBound.strong()));
+
+PCollection<Struct> allRecords = p.apply(SpannerIO.read()
+.withSpannerConfig(spannerConfig)
+.withBatching(false)
 
 Review comment:
   So the alternative would be to catch the root-partitionable exception and 
fall back to a naive read. I prefer to keep this transparent flag here: we'd 
rather fail the pipeline and give the user feedback than silently run the 
inefficient query.
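
To make the trade-off in this comment concrete, here is a minimal sketch in
plain Java. It is not the SpannerIO implementation: ReadModeSketch, planRead,
planReadWithSilentFallback and NotPartitionableException are hypothetical
names invented for illustration, and whether a query is partitionable is
passed in as a boolean rather than computed. It only contrasts an explicit
opt-out flag (which fails loudly) with a catch-and-fall-back approach (which
hides the slow path from the user).

```java
/** Hypothetical stand-in for a read connector; this is NOT the SpannerIO API. */
final class ReadModeSketch {

  /** Raised when a batch (partitioned) read is requested for an unsuitable query. */
  static final class NotPartitionableException extends RuntimeException {
    NotPartitionableException(String query) {
      super("Query is not root-partitionable: " + query);
    }
  }

  /**
   * Explicit-flag variant: the caller decides whether batching is used.
   * With batching on, a non-partitionable query fails instead of degrading.
   */
  static String planRead(String query, boolean partitionable, boolean batching) {
    if (!batching) {
      return "naive"; // the user opted out, so the slow path is a visible choice
    }
    if (!partitionable) {
      throw new NotPartitionableException(query);
    }
    return "batch";
  }

  /** Fallback variant: the failure is swallowed and the slow path runs silently. */
  static String planReadWithSilentFallback(String query, boolean partitionable) {
    try {
      return planRead(query, partitionable, true);
    } catch (NotPartitionableException e) {
      return "naive"; // the caller never learns that the query was not partitioned
    }
  }

  public static void main(String[] args) {
    String query = "SELECT table_name FROM information_schema.tables";
    // Explicit opt-out: the pipeline author sees and owns the decision.
    System.out.println(planRead(query, false, false));
    // Silent fallback: same result, but nothing tells the author why.
    System.out.println(planReadWithSilentFallback(query, false));
    try {
      planRead(query, false, true);
    } catch (NotPartitionableException e) {
      System.out.println("failed fast: " + e.getMessage());
    }
  }
}
```

With the flag, the only way to reach the naive path is to ask for it, which
matches the preference stated above for failing the pipeline over running an
inefficient query unnoticed.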


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87832)
Time Spent: 1.5h  (was: 1h 20m)

> Allow to disable batch API in SpannerIO
> ---
>
> Key: BEAM-3973
> URL: https://issues.apache.org/jira/browse/BEAM-3973
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.4.0
>Reporter: Mairbek Khadikov
>Assignee: Mairbek Khadikov
>Priority: Major
> Fix For: 2.5.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> In 2.4.0, SpannerIO#read was migrated to use the batch API. The batch API 
> provides abstractions to scale out reads from Spanner, but it requires the 
> query to be root-partitionable. Root-partitionable queries cover the majority 
> of use cases; however, there are cases where running an arbitrary query is 
> useful. For example, reading all the table names from 
> information_schema.* and reading the content of those tables in the next 
> step. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3973) Allow to disable batch API in SpannerIO

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3973?focusedWorklogId=87833&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87833
 ]

ASF GitHub Bot logged work on BEAM-3973:


Author: ASF GitHub Bot
Created on: 05/Apr/18 00:15
Start Date: 05/Apr/18 00:15
Worklog Time Spent: 10m 
  Work Description: mairbek commented on a change in pull request #4946: 
[BEAM-3973] Adds a parameter to the Cloud Spanner read connector that can 
disable batch API
URL: https://github.com/apache/beam/pull/4946#discussion_r179319684
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/spanner/SpannerReadIT.java
 ##
 @@ -193,6 +196,52 @@ public void testQuery() throws Exception {
 p.run();
   }
 
+  @Test
+  public void testReadAll() throws Exception {
+DatabaseClient databaseClient =
+spanner.getDatabaseClient(
+DatabaseId.of(
+project, options.getInstanceId(), databaseName));
+
+List<Mutation> mutations = new ArrayList<>();
+for (int i = 0; i < 5L; i++) {
+  mutations.add(
+  Mutation.newInsertOrUpdateBuilder(options.getTable())
+  .set("key")
+  .to((long) i)
+  .set("value")
+  .to(RandomUtils.randomAlphaNumeric(100))
+  .build());
+}
+
+databaseClient.writeAtLeastOnce(mutations);
+
+SpannerConfig spannerConfig = SpannerConfig.create()
+.withProjectId(project)
+.withInstanceId(options.getInstanceId())
+.withDatabaseId(databaseName);
+
+PCollectionView<Transaction> tx =
+p.apply(
+SpannerIO.createTransaction()
+.withSpannerConfig(spannerConfig)
+.withTimestampBound(TimestampBound.strong()));
+
+PCollection<Struct> allRecords = p.apply(SpannerIO.read()
+.withSpannerConfig(spannerConfig)
+.withBatching(false)
 
 Review comment:
   So the alternative would be to catch the root-partitionable exception and 
fall back to a naive read. I prefer to keep this transparent flag here: we'd 
rather fail the pipeline and give the user feedback than silently run the 
inefficient query.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87833)
Time Spent: 1h 40m  (was: 1.5h)

> Allow to disable batch API in SpannerIO
> ---
>
> Key: BEAM-3973
> URL: https://issues.apache.org/jira/browse/BEAM-3973
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.4.0
>Reporter: Mairbek Khadikov
>Assignee: Mairbek Khadikov
>Priority: Major
> Fix For: 2.5.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> In 2.4.0, SpannerIO#read was migrated to use the batch API. The batch API 
> provides abstractions to scale out reads from Spanner, but it requires the 
> query to be root-partitionable. Root-partitionable queries cover the majority 
> of use cases; however, there are cases where running an arbitrary query is 
> useful. For example, reading all the table names from 
> information_schema.* and reading the content of those tables in the next 
> step. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3973) Allow to disable batch API in SpannerIO

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3973?focusedWorklogId=87829&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87829
 ]

ASF GitHub Bot logged work on BEAM-3973:


Author: ASF GitHub Bot
Created on: 05/Apr/18 00:15
Start Date: 05/Apr/18 00:15
Worklog Time Spent: 10m 
  Work Description: mairbek commented on a change in pull request #4946: 
[BEAM-3973] Adds a parameter to the Cloud Spanner read connector that can 
disable batch API
URL: https://github.com/apache/beam/pull/4946#discussion_r179319227
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/spanner/SpannerReadIT.java
 ##
 @@ -193,6 +196,52 @@ public void testQuery() throws Exception {
 p.run();
   }
 
+  @Test
+  public void testReadAll() throws Exception {
+DatabaseClient databaseClient =
 
 Review comment:
   Done


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87829)
Time Spent: 1h 10m  (was: 1h)

> Allow to disable batch API in SpannerIO
> ---
>
> Key: BEAM-3973
> URL: https://issues.apache.org/jira/browse/BEAM-3973
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.4.0
>Reporter: Mairbek Khadikov
>Assignee: Mairbek Khadikov
>Priority: Major
> Fix For: 2.5.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> In 2.4.0, SpannerIO#read was migrated to use the batch API. The batch API 
> provides abstractions to scale out reads from Spanner, but it requires the 
> query to be root-partitionable. Root-partitionable queries cover the majority 
> of use cases; however, there are cases where running an arbitrary query is 
> useful. For example, reading all the table names from 
> information_schema.* and reading the content of those tables in the next 
> step. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3973) Allow to disable batch API in SpannerIO

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3973?focusedWorklogId=87830&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87830
 ]

ASF GitHub Bot logged work on BEAM-3973:


Author: ASF GitHub Bot
Created on: 05/Apr/18 00:15
Start Date: 05/Apr/18 00:15
Worklog Time Spent: 10m 
  Work Description: mairbek commented on a change in pull request #4946: 
[BEAM-3973] Adds a parameter to the Cloud Spanner read connector that can 
disable batch API
URL: https://github.com/apache/beam/pull/4946#discussion_r179319153
 
 

 ##
 File path: 
sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/spanner/SpannerReadIT.java
 ##
 @@ -193,6 +196,52 @@ public void testQuery() throws Exception {
 p.run();
   }
 
+  @Test
+  public void testReadAll() throws Exception {
 
 Review comment:
   It does call read all later on, but I've made the name more descriptive 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87830)
Time Spent: 1h 10m  (was: 1h)

> Allow to disable batch API in SpannerIO
> ---
>
> Key: BEAM-3973
> URL: https://issues.apache.org/jira/browse/BEAM-3973
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-gcp
>Affects Versions: 2.4.0
>Reporter: Mairbek Khadikov
>Assignee: Mairbek Khadikov
>Priority: Major
> Fix For: 2.5.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> In 2.4.0, SpannerIO#read was migrated to use the batch API. The batch API 
> provides abstractions to scale out reads from Spanner, but it requires the 
> query to be root-partitionable. Root-partitionable queries cover the majority 
> of use cases; however, there are cases where running an arbitrary query is 
> useful. For example, reading all the table names from 
> information_schema.* and reading the content of those tables in the next 
> step. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3257) Migrate Python Jenkins PreCommits to Gradle

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3257?focusedWorklogId=87826&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87826
 ]

ASF GitHub Bot logged work on BEAM-3257:


Author: ASF GitHub Bot
Created on: 04/Apr/18 23:59
Start Date: 04/Apr/18 23:59
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #5010: [BEAM-3257] Add Python 
precommit gradle config
URL: https://github.com/apache/beam/pull/5010#issuecomment-378781506
 
 
   run seed job


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87826)
Time Spent: 2h 50m  (was: 2h 40m)

> Migrate Python Jenkins PreCommits to Gradle
> ---
>
> Key: BEAM-3257
> URL: https://issues.apache.org/jira/browse/BEAM-3257
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Code is here: 
> https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_PreCommit_Python_MavenInstall.groovy



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3339) Create post-release testing of the nightly snapshots

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3339?focusedWorklogId=87820&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87820
 ]

ASF GitHub Bot logged work on BEAM-3339:


Author: ASF GitHub Bot
Created on: 04/Apr/18 23:44
Start Date: 04/Apr/18 23:44
Worklog Time Spent: 10m 
  Work Description: yifanzou commented on a change in pull request #4788: 
[BEAM-3339] Mobile gaming automation for Java nightly snapshot on core runners
URL: https://github.com/apache/beam/pull/4788#discussion_r179315451
 
 

 ##
 File path: runners/direct-java/build.gradle
 ##
 @@ -131,5 +131,18 @@ artifacts {
   shadowTest shadowTestJar
 }
 
+def gcpProject = project.findProperty('gcpProject') ?: 'apache-beam-testing'
+def gcsBucket = project.findProperty('gcsBucket') ?: 
'temp-storage-for-release-validation-tests/nightly-snapshot-validation'
+def bqDataset = project.findProperty('bqDataset') ?: 
'beam_postrelease_mobile_gaming'
+def pubsubTopic = project.findProperty('pubsubTopic') ?: 
'java_mobile_gaming_topic'
+
 // Generates :runners:direct-java:runQuickstartJavaDirect
-createJavaQuickstartValidationTask(name: 'Direct')
+createJavaExamplesArchetypeValidationTask(type: 'Quickstart', runner: 'Direct')
+
+// Generates :runners:direct-java:runMobileGamingJavaDirect
+createJavaExamplesArchetypeValidationTask(type: 'MobileGaming',
 
 Review comment:
   discussed offline.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87820)
Time Spent: 80h 50m  (was: 80h 40m)

> Create post-release testing of the nightly snapshots
> 
>
> Key: BEAM-3339
> URL: https://issues.apache.org/jira/browse/BEAM-3339
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Alan Myrvold
>Assignee: Jason Kuster
>Priority: Major
>  Time Spent: 80h 50m
>  Remaining Estimate: 0h
>
> The nightly java snapshots in 
> https://repository.apache.org/content/groups/snapshots/org/apache/beam should 
> be verified by following the 
> https://beam.apache.org/get-started/quickstart-java/ instructions, to verify 
> that the release is usable.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is back to normal : beam_PostCommit_Java_MavenInstall #6378

2018-04-04 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #5280

2018-04-04 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-3983) BigQuery writes from pure SQL

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3983?focusedWorklogId=87818&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87818
 ]

ASF GitHub Bot logged work on BEAM-3983:


Author: ASF GitHub Bot
Created on: 04/Apr/18 23:33
Start Date: 04/Apr/18 23:33
Worklog Time Spent: 10m 
  Work Description: apilloud commented on issue #4991: [BEAM-3983] [SQL] 
Tables interface supports BigQuery
URL: https://github.com/apache/beam/pull/4991#issuecomment-378777223
 
 
   run java precommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87818)
Time Spent: 50m  (was: 40m)

> BigQuery writes from pure SQL
> -
>
> Key: BEAM-3983
> URL: https://issues.apache.org/jira/browse/BEAM-3983
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Andrew Pilloud
>Assignee: Andrew Pilloud
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> It would be nice if you could write to BigQuery in SQL without writing any 
> java code. For example:
> {code:java}
> INSERT INTO bigquery SELECT * FROM PCOLLECTION{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is back to normal : beam_PostCommit_Java_ValidatesRunner_Flink_Gradle #4

2018-04-04 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-3250) Migrate ValidatesRunner Jenkins PostCommits to Gradle

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3250?focusedWorklogId=87817&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87817
 ]

ASF GitHub Bot logged work on BEAM-3250:


Author: ASF GitHub Bot
Created on: 04/Apr/18 23:26
Start Date: 04/Apr/18 23:26
Worklog Time Spent: 10m 
  Work Description: herohde commented on issue #5025: [BEAM-3250] Migrate 
Apex and Gearpump ValidatesRunner tests to Gradle
URL: https://github.com/apache/beam/pull/5025#issuecomment-378776123
 
 
   Run seed job


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87817)
Time Spent: 1h 50m  (was: 1h 40m)

> Migrate ValidatesRunner Jenkins PostCommits to Gradle
> -
>
> Key: BEAM-3250
> URL: https://issues.apache.org/jira/browse/BEAM-3250
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Henning Rohde
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Update these targets to execute ValidatesRunner tests: 
> https://github.com/apache/beam/search?l=Groovy=ValidatesRunner==%E2%9C%93



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3250) Migrate ValidatesRunner Jenkins PostCommits to Gradle

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3250?focusedWorklogId=87815&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87815
 ]

ASF GitHub Bot logged work on BEAM-3250:


Author: ASF GitHub Bot
Created on: 04/Apr/18 23:25
Start Date: 04/Apr/18 23:25
Worklog Time Spent: 10m 
  Work Description: herohde commented on a change in pull request #5025: 
[BEAM-3250] Migrate Apex and Gearpump ValidatesRunner tests to Gradle
URL: https://github.com/apache/beam/pull/5025#discussion_r179312315
 
 

 ##
 File path: runners/gearpump/build.gradle
 ##
 @@ -35,10 +43,40 @@ dependencies {
   shadow library.java.joda_time
   shadow library.java.jackson_annotations
   shadow library.java.findbugs_jsr305
-  testCompile project(path: ":sdks:java:core", configuration: "shadowTest")
-  testCompile library.java.junit
-  testCompile library.java.hamcrest_core
-  testCompile library.java.jackson_databind
-  testCompile library.java.jackson_dataformat_yaml
-  testCompile library.java.mockito_core
+  shadowTest project(path: ":sdks:java:core", configuration: "shadowTest")
+  shadowTest library.java.junit
+  shadowTest library.java.hamcrest_core
+  shadowTest library.java.jackson_databind
+  shadowTest library.java.jackson_dataformat_yaml
+  shadowTest library.java.mockito_core
+  validatesRunner project(path: ":sdks:java:core", configuration: "shadowTest")
+  validatesRunner project(path: project.path, configuration: "shadow")
+}
+
+task validatesRunnerStreaming (type: Test) {
+  group = "Verification"
+  systemProperty "beamTestPipelineOptions", JsonOutput.toJson([
+  "--runner=TestGearpumpRunner",
+  "--streaming=true",
+  ])
+
+  classpath = configurations.validatesRunner
+  testClassesDirs = 
files(project(":sdks:java:core").sourceSets.test.output.classesDirs)
+  useJUnit {
+includeCategories 'org.apache.beam.sdk.testing.ValidatesRunner'
+excludeCategories 
'org.apache.beam.sdk.testing.FlattenWithHeterogeneousCoders'
 
 Review comment:
   Done
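
For readers unfamiliar with the systemProperty "beamTestPipelineOptions" line
in the quoted diff: the task hands the pipeline options to the test JVM as a
single system property whose value is a JSON array of strings. The sketch
below reproduces that value in plain Java. The class name
PipelineOptionsJsonSketch and its helper methods are made up for this
example, and the hand-rolled encoder only escapes quotes and backslashes,
which is enough for simple option strings but is not a general JSON writer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Illustrative only: builds the JSON array value passed via the system property. */
final class PipelineOptionsJsonSketch {

  /** Quotes one argument, escaping only double quotes and backslashes. */
  static String quote(String arg) {
    StringBuilder sb = new StringBuilder("\"");
    for (char c : arg.toCharArray()) {
      if (c == '"' || c == '\\') {
        sb.append('\\');
      }
      sb.append(c);
    }
    return sb.append('"').toString();
  }

  /** Joins the quoted arguments into a compact JSON array. */
  static String toJsonArray(List<String> args) {
    return args.stream()
        .map(PipelineOptionsJsonSketch::quote)
        .collect(Collectors.joining(",", "[", "]"));
  }

  public static void main(String[] args) {
    String json = toJsonArray(
        Arrays.asList("--runner=TestGearpumpRunner", "--streaming=true"));
    // Same property name as in the Gradle task above.
    System.setProperty("beamTestPipelineOptions", json);
    // prints ["--runner=TestGearpumpRunner","--streaming=true"]
    System.out.println(System.getProperty("beamTestPipelineOptions"));
  }
}
```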


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87815)
Time Spent: 1.5h  (was: 1h 20m)

> Migrate ValidatesRunner Jenkins PostCommits to Gradle
> -
>
> Key: BEAM-3250
> URL: https://issues.apache.org/jira/browse/BEAM-3250
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Henning Rohde
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Update these targets to execute ValidatesRunner tests: 
> https://github.com/apache/beam/search?l=Groovy=ValidatesRunner==%E2%9C%93



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3250) Migrate ValidatesRunner Jenkins PostCommits to Gradle

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3250?focusedWorklogId=87816&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87816
 ]

ASF GitHub Bot logged work on BEAM-3250:


Author: ASF GitHub Bot
Created on: 04/Apr/18 23:25
Start Date: 04/Apr/18 23:25
Worklog Time Spent: 10m 
  Work Description: herohde commented on issue #5025: [BEAM-3250] Migrate 
Apex and Gearpump ValidatesRunner tests to Gradle
URL: https://github.com/apache/beam/pull/5025#issuecomment-378775911
 
 
   Thanks @lukecwik. PTAL


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87816)
Time Spent: 1h 40m  (was: 1.5h)

> Migrate ValidatesRunner Jenkins PostCommits to Gradle
> -
>
> Key: BEAM-3250
> URL: https://issues.apache.org/jira/browse/BEAM-3250
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Henning Rohde
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Update these targets to execute ValidatesRunner tests: 
> https://github.com/apache/beam/search?l=Groovy=ValidatesRunner==%E2%9C%93



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3995) Launch Nexmark suites from gradle and update web page docs

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3995?focusedWorklogId=87813&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87813
 ]

ASF GitHub Bot logged work on BEAM-3995:


Author: ASF GitHub Bot
Created on: 04/Apr/18 23:18
Start Date: 04/Apr/18 23:18
Worklog Time Spent: 10m 
  Work Description: kennknowles closed pull request #5026: [BEAM-3995] 
Build self-contained Nexmark jar
URL: https://github.com/apache/beam/pull/5026
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/sdks/java/nexmark/build.gradle b/sdks/java/nexmark/build.gradle
index 078365df375..b2f8c3e4552 100644
--- a/sdks/java/nexmark/build.gradle
+++ b/sdks/java/nexmark/build.gradle
@@ -23,26 +23,32 @@ description = "Apache Beam :: SDKs :: Java :: Nexmark"
 
 dependencies {
   compile library.java.guava
-  shadow project(path: ":sdks:java:core", configuration: "shadow")
-  shadow project(path: ":sdks:java:io:google-cloud-platform", configuration: 
"shadow")
-  shadow project(path: ":sdks:java:extensions:google-cloud-platform-core", 
configuration: "shadow")
-  shadow project(path: ":sdks:java:extensions:sql", configuration: "shadow")
-  shadow library.java.google_api_services_bigquery
-  shadow library.java.jackson_core
-  shadow library.java.jackson_annotations
-  shadow library.java.jackson_databind
-  shadow library.java.avro
-  shadow library.java.joda_time
-  shadow library.java.slf4j_api
-  shadow library.java.findbugs_jsr305
-  shadow library.java.junit
-  shadow library.java.hamcrest_core
-  shadow library.java.commons_lang3
-  shadow project(path: ":runners:direct-java", configuration: "shadow")
-  shadow library.java.slf4j_jdk14
+  compile project(path: ":sdks:java:core")
+  compile project(path: ":sdks:java:io:google-cloud-platform")
+  compile project(path: ":sdks:java:extensions:google-cloud-platform-core")
+  compile project(path: ":sdks:java:extensions:sql")
+  compile library.java.google_api_services_bigquery
+  compile library.java.jackson_core
+  compile library.java.jackson_annotations
+  compile library.java.jackson_databind
+  compile library.java.avro
+  compile library.java.joda_time
+  compile library.java.slf4j_api
+  compile library.java.findbugs_jsr305
+  compile library.java.junit
+  compile library.java.hamcrest_core
+  compile library.java.commons_lang3
+  compile project(path: ":runners:direct-java")
+  runtime library.java.slf4j_jdk14
   testCompile library.java.hamcrest_core
 }
 
 test {
   jvmArgs "-da"
 }
+
+jar {
+  manifest {
+attributes 'Main-Class': 'org.apache.beam.sdk.nexmark.Main'
+  }
+}
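
The jar { manifest { ... } } block at the end of the diff sets the Main-Class
attribute, which is what lets "java -jar" find an entry point. As a rough
illustration of what ends up in META-INF/MANIFEST.MF, the sketch below writes
and re-reads such a manifest with the JDK's java.util.jar classes. The class
and method names (ManifestMainClassSketch, manifestWithMainClass,
readMainClass) are invented for the example; only the Main-Class value is
taken from the diff.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

/** Illustrative only: round-trips a manifest carrying a Main-Class attribute. */
final class ManifestMainClassSketch {

  /** Serializes a manifest with the given entry point, as a jar would store it. */
  static byte[] manifestWithMainClass(String mainClass) {
    Manifest manifest = new Manifest();
    Attributes main = manifest.getMainAttributes();
    // Manifest-Version must be present, otherwise the main section is not written.
    main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
    main.put(Attributes.Name.MAIN_CLASS, mainClass);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try {
      manifest.write(out);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out.toByteArray();
  }

  /** Parses manifest bytes and returns the Main-Class value, or null if absent. */
  static String readMainClass(byte[] manifestBytes) {
    try {
      Manifest manifest = new Manifest(new ByteArrayInputStream(manifestBytes));
      return manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    byte[] bytes = manifestWithMainClass("org.apache.beam.sdk.nexmark.Main");
    // prints org.apache.beam.sdk.nexmark.Main
    System.out.println(readMainClass(bytes));
  }
}
```

Note that a Main-Class entry alone does not make the jar self-contained; the
dependencies still have to be on the classpath or bundled.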


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87813)
Time Spent: 40m  (was: 0.5h)

> Launch Nexmark suites from gradle and update web page docs
> --
>
> Key: BEAM-3995
> URL: https://issues.apache.org/jira/browse/BEAM-3995
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark, website
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently our instructions for running Nexmark benchmarks on various runners 
> are pretty tightly tied to Maven. We need a good story for running them with 
> gradle (or just building an executable with gradle and running that 
> standalone).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3995) Launch Nexmark suites from gradle and update web page docs

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3995?focusedWorklogId=87812&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87812
 ]

ASF GitHub Bot logged work on BEAM-3995:


Author: ASF GitHub Bot
Created on: 04/Apr/18 23:18
Start Date: 04/Apr/18 23:18
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #5026: [BEAM-3995] Build 
self-contained Nexmark jar
URL: https://github.com/apache/beam/pull/5026#issuecomment-378774775
 
 
   Ah, I think this is really not going to be workable. String munging to plumb 
cmdline args may still be what we need to do, in 2018.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87812)
Time Spent: 0.5h  (was: 20m)

> Launch Nexmark suites from gradle and update web page docs
> --
>
> Key: BEAM-3995
> URL: https://issues.apache.org/jira/browse/BEAM-3995
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark, website
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently our instructions for running Nexmark benchmarks on various runners 
> are pretty tightly tied to Maven. We need a good story for running them with 
> gradle (or just building an executable with gradle and running that 
> standalone).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3339) Create post-release testing of the nightly snapshots

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3339?focusedWorklogId=87811&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87811
 ]

ASF GitHub Bot logged work on BEAM-3339:


Author: ASF GitHub Bot
Created on: 04/Apr/18 23:17
Start Date: 04/Apr/18 23:17
Worklog Time Spent: 10m 
  Work Description: tgroh commented on a change in pull request #4788: 
[BEAM-3339] Mobile gaming automation for Java nightly snapshot on core runners
URL: https://github.com/apache/beam/pull/4788#discussion_r179310950
 
 

 ##
 File path: runners/direct-java/build.gradle
 ##
 @@ -131,5 +131,18 @@ artifacts {
   shadowTest shadowTestJar
 }
 
+def gcpProject = project.findProperty('gcpProject') ?: 'apache-beam-testing'
+def gcsBucket = project.findProperty('gcsBucket') ?: 
'temp-storage-for-release-validation-tests/nightly-snapshot-validation'
+def bqDataset = project.findProperty('bqDataset') ?: 
'beam_postrelease_mobile_gaming'
+def pubsubTopic = project.findProperty('pubsubTopic') ?: 
'java_mobile_gaming_topic'
+
 // Generates :runners:direct-java:runQuickstartJavaDirect
-createJavaQuickstartValidationTask(name: 'Direct')
+createJavaExamplesArchetypeValidationTask(type: 'Quickstart', runner: 'Direct')
+
+// Generates :runners:direct-java:runMobileGamingJavaDirect
+createJavaExamplesArchetypeValidationTask(type: 'MobileGaming',
 
 Review comment:
   I fear that this may choke without restricting the scope of the input, as 
these are moderately sized files. Have we run these tasks locally 
(potentially with modified staging, output, and projects)?
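
On running with modified projects and outputs: each of the four values in the
quoted diff is looked up with project.findProperty(...) and falls back to a
default through Groovy's ?: operator, so a local run should be able to
override any of them with Gradle's -P flag (for example -PgcpProject=...).
The sketch below mimics that lookup in plain Java. PropertyDefaultsSketch and
propertyOrDefault are hypothetical names, a Map stands in for the Gradle
project properties, and "my-local-project" is a made-up override value.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: mimics "findProperty(name) ?: fallback" from the build file. */
final class PropertyDefaultsSketch {

  /**
   * Returns the supplied property, or the fallback when it is missing or empty.
   * Groovy's ?: treats both null and the empty string as "not set".
   */
  static String propertyOrDefault(Map<String, String> props, String name, String fallback) {
    String value = props.get(name);
    return (value == null || value.isEmpty()) ? fallback : value;
  }

  public static void main(String[] args) {
    Map<String, String> cliProps = new HashMap<>();
    // Nothing passed on the command line: the default from the build file wins.
    System.out.println(propertyOrDefault(cliProps, "gcpProject", "apache-beam-testing"));

    // Simulates -PgcpProject=my-local-project for a local run.
    cliProps.put("gcpProject", "my-local-project");
    System.out.println(propertyOrDefault(cliProps, "gcpProject", "apache-beam-testing"));
  }
}
```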


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87811)
Time Spent: 80h 40m  (was: 80.5h)

> Create post-release testing of the nightly snapshots
> 
>
> Key: BEAM-3339
> URL: https://issues.apache.org/jira/browse/BEAM-3339
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Alan Myrvold
>Assignee: Jason Kuster
>Priority: Major
>  Time Spent: 80h 40m
>  Remaining Estimate: 0h
>
> The nightly java snapshots in 
> https://repository.apache.org/content/groups/snapshots/org/apache/beam should 
> be verified by following the 
> https://beam.apache.org/get-started/quickstart-java/ instructions, to verify 
> that the release is usable.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3995) Launch Nexmark suites from gradle and update web page docs

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3995?focusedWorklogId=87810&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87810
 ]

ASF GitHub Bot logged work on BEAM-3995:


Author: ASF GitHub Bot
Created on: 04/Apr/18 23:14
Start Date: 04/Apr/18 23:14
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on issue #5026: [BEAM-3995] Build 
self-contained Nexmark jar
URL: https://github.com/apache/beam/pull/5026#issuecomment-378774094
 
 
   It is a big hammer; too big. We'll actually need to have a way to get the 
different runners on the classpath without bundling. But Gradle has nothing 
that I would care to use equivalent to `mvn exec:java`. Still poking around but 
the solutions I've found are not reasonable.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87810)
Time Spent: 20m  (was: 10m)

> Launch Nexmark suites from gradle and update web page docs
> --
>
> Key: BEAM-3995
> URL: https://issues.apache.org/jira/browse/BEAM-3995
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark, website
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently our instructions for running Nexmark benchmarks on various runners 
> are pretty tightly tied to Maven. We need a good story for running them with 
> Gradle (or just building an executable with Gradle and running that 
> standalone).





[jira] [Work logged] (BEAM-3250) Migrate ValidatesRunner Jenkins PostCommits to Gradle

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3250?focusedWorklogId=87808&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87808
 ]

ASF GitHub Bot logged work on BEAM-3250:


Author: ASF GitHub Bot
Created on: 04/Apr/18 23:12
Start Date: 04/Apr/18 23:12
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #5025: 
[BEAM-3250] Migrate Apex and Gearpump ValidatesRunner tests to Gradle
URL: https://github.com/apache/beam/pull/5025#discussion_r179309731
 
 

 ##
 File path: 
sdks/java/core/src/main/java/org/apache/beam/sdk/testing/UsesParDoLifecycle.java
 ##
 @@ -0,0 +1,24 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.sdk.testing;
+
+/**
+ * Category tag for the ParDoLifecycleTest for exclusion (BEAM-3241).
+ */
+public class UsesParDoLifecycle {}
 
 Review comment:
   Use interface instead of class




Issue Time Tracking
---

Worklog Id: (was: 87808)
Time Spent: 1h 10m  (was: 1h)

> Migrate ValidatesRunner Jenkins PostCommits to Gradle
> -
>
> Key: BEAM-3250
> URL: https://issues.apache.org/jira/browse/BEAM-3250
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Henning Rohde
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Update these targets to execute ValidatesRunner tests: 
> https://github.com/apache/beam/search?l=Groovy=ValidatesRunner==%E2%9C%93





[jira] [Work logged] (BEAM-3250) Migrate ValidatesRunner Jenkins PostCommits to Gradle

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3250?focusedWorklogId=87809&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87809
 ]

ASF GitHub Bot logged work on BEAM-3250:


Author: ASF GitHub Bot
Created on: 04/Apr/18 23:12
Start Date: 04/Apr/18 23:12
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #5025: 
[BEAM-3250] Migrate Apex and Gearpump ValidatesRunner tests to Gradle
URL: https://github.com/apache/beam/pull/5025#discussion_r179310247
 
 

 ##
 File path: runners/apex/build.gradle
 ##
 @@ -73,5 +82,31 @@ task buildDependencyTree(type: DependencyReportTask) {
 }
 compileJava.dependsOn buildDependencyTree
 
+task validatesRunnerBatch (type: Test) {
 
 Review comment:
   nit: spacing `validatesRunnerBatch (` -> `validatesRunnerBatch(`




Issue Time Tracking
---

Worklog Id: (was: 87809)
Time Spent: 1h 20m  (was: 1h 10m)

> Migrate ValidatesRunner Jenkins PostCommits to Gradle
> -
>
> Key: BEAM-3250
> URL: https://issues.apache.org/jira/browse/BEAM-3250
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Henning Rohde
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Update these targets to execute ValidatesRunner tests: 
> https://github.com/apache/beam/search?l=Groovy=ValidatesRunner==%E2%9C%93





[jira] [Work logged] (BEAM-3995) Launch Nexmark suites from gradle and update web page docs

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3995?focusedWorklogId=87805&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87805
 ]

ASF GitHub Bot logged work on BEAM-3995:


Author: ASF GitHub Bot
Created on: 04/Apr/18 23:12
Start Date: 04/Apr/18 23:12
Worklog Time Spent: 10m 
  Work Description: kennknowles opened a new pull request #5026: 
[BEAM-3995] Build self-contained Nexmark jar
URL: https://github.com/apache/beam/pull/5026
 
 
   To run the Nexmark benchmarks, we will separate the build from the run step. 
The build step is owned by Gradle, while the run step is simply running a 
program. Nexmark is already designed this way.
   
   WIP: the expected full set of dependencies is _not_ contained in the shadow 
jar. I don't know why this is yet, but expect it has to do with how we manage 
the base setup. TBD.
   
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
- [ ] Write a pull request description that is detailed enough to 
understand:
  - [ ] What the pull request does
  - [ ] Why it does it
  - [ ] How it does it
  - [ ] Why this approach
- [ ] Each commit in the pull request should have a meaningful subject line 
and body.
- [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   




Issue Time Tracking
---

Worklog Id: (was: 87805)
Time Spent: 10m
Remaining Estimate: 0h

> Launch Nexmark suites from gradle and update web page docs
> --
>
> Key: BEAM-3995
> URL: https://issues.apache.org/jira/browse/BEAM-3995
> Project: Beam
>  Issue Type: Sub-task
>  Components: examples-nexmark, website
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently our instructions for running Nexmark benchmarks on various runners 
> are pretty tightly tied to Maven. We need a good story for running them with 
> Gradle (or just building an executable with Gradle and running that 
> standalone).





[jira] [Work logged] (BEAM-3250) Migrate ValidatesRunner Jenkins PostCommits to Gradle

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3250?focusedWorklogId=87806&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87806
 ]

ASF GitHub Bot logged work on BEAM-3250:


Author: ASF GitHub Bot
Created on: 04/Apr/18 23:12
Start Date: 04/Apr/18 23:12
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #5025: 
[BEAM-3250] Migrate Apex and Gearpump ValidatesRunner tests to Gradle
URL: https://github.com/apache/beam/pull/5025#discussion_r179309485
 
 

 ##
 File path: runners/gearpump/build.gradle
 ##
 @@ -35,10 +43,40 @@ dependencies {
   shadow library.java.joda_time
   shadow library.java.jackson_annotations
   shadow library.java.findbugs_jsr305
-  testCompile project(path: ":sdks:java:core", configuration: "shadowTest")
-  testCompile library.java.junit
-  testCompile library.java.hamcrest_core
-  testCompile library.java.jackson_databind
-  testCompile library.java.jackson_dataformat_yaml
-  testCompile library.java.mockito_core
+  shadowTest project(path: ":sdks:java:core", configuration: "shadowTest")
+  shadowTest library.java.junit
+  shadowTest library.java.hamcrest_core
+  shadowTest library.java.jackson_databind
+  shadowTest library.java.jackson_dataformat_yaml
+  shadowTest library.java.mockito_core
+  validatesRunner project(path: ":sdks:java:core", configuration: "shadowTest")
+  validatesRunner project(path: project.path, configuration: "shadow")
+}
+
+task validatesRunnerStreaming (type: Test) {
+  group = "Verification"
+  systemProperty "beamTestPipelineOptions", JsonOutput.toJson([
+  "--runner=TestGearpumpRunner",
+  "--streaming=true",
+  ])
+
+  classpath = configurations.validatesRunner
+  testClassesDirs = 
files(project(":sdks:java:core").sourceSets.test.output.classesDirs)
+  useJUnit {
+includeCategories 'org.apache.beam.sdk.testing.ValidatesRunner'
+excludeCategories 
'org.apache.beam.sdk.testing.FlattenWithHeterogeneousCoders'
 
 Review comment:
   These seem like the wrong exclude categories when compared to 
https://github.com/apache/beam/blob/5caa883f191ca4cd9158694ae94a673edbc7b4d5/runners/gearpump/pom.xml#L72




Issue Time Tracking
---

Worklog Id: (was: 87806)
Time Spent: 1h 10m  (was: 1h)

> Migrate ValidatesRunner Jenkins PostCommits to Gradle
> -
>
> Key: BEAM-3250
> URL: https://issues.apache.org/jira/browse/BEAM-3250
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Henning Rohde
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Update these targets to execute ValidatesRunner tests: 
> https://github.com/apache/beam/search?l=Groovy=ValidatesRunner==%E2%9C%93





[jira] [Work logged] (BEAM-3250) Migrate ValidatesRunner Jenkins PostCommits to Gradle

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3250?focusedWorklogId=87807&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87807
 ]

ASF GitHub Bot logged work on BEAM-3250:


Author: ASF GitHub Bot
Created on: 04/Apr/18 23:12
Start Date: 04/Apr/18 23:12
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on a change in pull request #5025: 
[BEAM-3250] Migrate Apex and Gearpump ValidatesRunner tests to Gradle
URL: https://github.com/apache/beam/pull/5025#discussion_r179310219
 
 

 ##
 File path: runners/gearpump/build.gradle
 ##
 @@ -35,10 +43,40 @@ dependencies {
   shadow library.java.joda_time
   shadow library.java.jackson_annotations
   shadow library.java.findbugs_jsr305
-  testCompile project(path: ":sdks:java:core", configuration: "shadowTest")
-  testCompile library.java.junit
-  testCompile library.java.hamcrest_core
-  testCompile library.java.jackson_databind
-  testCompile library.java.jackson_dataformat_yaml
-  testCompile library.java.mockito_core
+  shadowTest project(path: ":sdks:java:core", configuration: "shadowTest")
+  shadowTest library.java.junit
+  shadowTest library.java.hamcrest_core
+  shadowTest library.java.jackson_databind
+  shadowTest library.java.jackson_dataformat_yaml
+  shadowTest library.java.mockito_core
+  validatesRunner project(path: ":sdks:java:core", configuration: "shadowTest")
+  validatesRunner project(path: project.path, configuration: "shadow")
+}
+
+task validatesRunnerStreaming (type: Test) {
 
 Review comment:
   nit: spacing `validatesRunnerStreaming (` -> `validatesRunnerStreaming(`




Issue Time Tracking
---

Worklog Id: (was: 87807)
Time Spent: 1h 10m  (was: 1h)

> Migrate ValidatesRunner Jenkins PostCommits to Gradle
> -
>
> Key: BEAM-3250
> URL: https://issues.apache.org/jira/browse/BEAM-3250
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Henning Rohde
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Update these targets to execute ValidatesRunner tests: 
> https://github.com/apache/beam/search?l=Groovy=ValidatesRunner==%E2%9C%93





[jira] [Work logged] (BEAM-3956) Stacktraces from exceptions in user code should be preserved in the Python SDK

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3956?focusedWorklogId=87803&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87803
 ]

ASF GitHub Bot logged work on BEAM-3956:


Author: ASF GitHub Bot
Created on: 04/Apr/18 23:06
Start Date: 04/Apr/18 23:06
Worklog Time Spent: 10m 
  Work Description: shoyer commented on a change in pull request #4959: 
[BEAM-3956] Preserve stacktraces for Python exceptions
URL: https://github.com/apache/beam/pull/4959#discussion_r179308320
 
 

 ##
 File path: sdks/python/apache_beam/runners/worker/sdk_worker.py
 ##
 @@ -117,15 +117,14 @@ def get_responses():
   def _execute(self, task, request):
 try:
   response = task()
-except Exception as e:  # pylint: disable=broad-except
-  traceback.print_exc(file=sys.stderr)
+except Exception:  # pylint: disable=broad-except
+  traceback_string = traceback.format_exc()
+  print(traceback_string, file=sys.stderr)
 
 Review comment:
   This should print an exactly equivalent message to what we had before. I 
just didn't see any point in formatting the traceback multiple times.
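
The idea in this comment can be sketched in isolation. This is a minimal standalone illustration, not the Beam worker code: `execute`, `failing_task`, and `stderr_capture` are hypothetical names introduced here. It formats the active traceback once with `traceback.format_exc()` and reuses that string for both the stderr output and the log record.

```python
import io
import logging
import traceback


def execute(task, instruction_id, stderr):
    """Run task(); on failure, format the traceback once and reuse it.

    Hypothetical helper for illustration only.
    """
    try:
        return task(), None
    except Exception:  # broad on purpose: any failure in the task is reported
        # format_exc() returns the text that print_exc() would have written.
        traceback_string = traceback.format_exc()
        print(traceback_string, file=stderr)
        logging.error(
            'Error processing instruction %s. Original traceback is\n%s\n',
            instruction_id, traceback_string)
        return None, traceback_string


def failing_task():
    raise ValueError('bad element')


stderr_capture = io.StringIO()
response, error = execute(failing_task, 'instruction-1', stderr_capture)
```

One small difference from `traceback.print_exc()` is that `print` appends a newline of its own, so the printed text ends with one extra blank line.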




Issue Time Tracking
---

Worklog Id: (was: 87803)
Time Spent: 4h 40m  (was: 4.5h)

> Stacktraces from exceptions in user code should be preserved in the Python SDK
> --
>
> Key: BEAM-3956
> URL: https://issues.apache.org/jira/browse/BEAM-3956
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Stephan Hoyer
>Priority: Major
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> Currently, Beam's Python SDK loses stacktraces for exceptions. It does 
> helpfully add a tag like "[while running StageA]" to exception error 
> messages, but that doesn't include the stacktrace of Python functions being 
> called.
> Including the full stacktraces would make a big difference for the ease of 
> debugging Beam pipelines when things go wrong.





[jira] [Work logged] (BEAM-3956) Stacktraces from exceptions in user code should be preserved in the Python SDK

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3956?focusedWorklogId=87804&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87804
 ]

ASF GitHub Bot logged work on BEAM-3956:


Author: ASF GitHub Bot
Created on: 04/Apr/18 23:06
Start Date: 04/Apr/18 23:06
Worklog Time Spent: 10m 
  Work Description: shoyer commented on a change in pull request #4959: 
[BEAM-3956] Preserve stacktraces for Python exceptions
URL: https://github.com/apache/beam/pull/4959#discussion_r179308208
 
 

 ##
 File path: sdks/python/apache_beam/runners/worker/sdk_worker.py
 ##
 @@ -117,15 +117,14 @@ def get_responses():
   def _execute(self, task, request):
 try:
   response = task()
-except Exception as e:  # pylint: disable=broad-except
-  traceback.print_exc(file=sys.stderr)
+except Exception:  # pylint: disable=broad-except
+  traceback_string = traceback.format_exc()
+  print(traceback_string, file=sys.stderr)
   logging.error(
   'Error processing instruction %s. Original traceback is\n%s\n',
-  request.instruction_id,
-  traceback.format_exc(e),
 
 Review comment:
   Note: `traceback.format_exc(e)` was mistakenly using the error as an 
argument, but this argument should actually be a numeric `limit` instead: 
https://docs.python.org/2/library/traceback.html#traceback.format_exc
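
To make the role of `limit` concrete, here is a small sketch under stated assumptions: `inner` and `outer` are hypothetical functions, and the snippet targets Python 3, where `limit` is best passed by keyword. A positive `limit` keeps only that many stack entries, counted from the frame that caught the exception.

```python
import traceback


def inner():
    raise KeyError('missing')


def outer():
    inner()


try:
    outer()
except Exception:
    # No argument: every frame from the catching frame down to the raise site.
    full = traceback.format_exc()
    # limit is a frame count, not an exception object.
    truncated = traceback.format_exc(limit=1)


def count_frames(text):
    """Count the 'File ...' entries in a formatted traceback."""
    return sum(line.startswith('  File ') for line in text.splitlines())
```

The exception line is still present in the truncated string; only the stack entries are cut.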




Issue Time Tracking
---

Worklog Id: (was: 87804)
Time Spent: 4h 50m  (was: 4h 40m)

> Stacktraces from exceptions in user code should be preserved in the Python SDK
> --
>
> Key: BEAM-3956
> URL: https://issues.apache.org/jira/browse/BEAM-3956
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Stephan Hoyer
>Priority: Major
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Currently, Beam's Python SDK loses stacktraces for exceptions. It does 
> helpfully add a tag like "[while running StageA]" to exception error 
> messages, but that doesn't include the stacktrace of Python functions being 
> called.
> Including the full stacktraces would make a big difference for the ease of 
> debugging Beam pipelines when things go wrong.





[jira] [Work logged] (BEAM-3250) Migrate ValidatesRunner Jenkins PostCommits to Gradle

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3250?focusedWorklogId=87802&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87802
 ]

ASF GitHub Bot logged work on BEAM-3250:


Author: ASF GitHub Bot
Created on: 04/Apr/18 23:02
Start Date: 04/Apr/18 23:02
Worklog Time Spent: 10m 
  Work Description: herohde commented on issue #5025: [BEAM-3250] Migrate 
Apex and Gearpump ValidatesRunner tests to Gradle
URL: https://github.com/apache/beam/pull/5025#issuecomment-378771854
 
 
   Run seed job




Issue Time Tracking
---

Worklog Id: (was: 87802)
Time Spent: 1h  (was: 50m)

> Migrate ValidatesRunner Jenkins PostCommits to Gradle
> -
>
> Key: BEAM-3250
> URL: https://issues.apache.org/jira/browse/BEAM-3250
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Henning Rohde
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Update these targets to execute ValidatesRunner tests: 
> https://github.com/apache/beam/search?l=Groovy=ValidatesRunner==%E2%9C%93





[jira] [Work logged] (BEAM-4013) Seed job is failing due to job type change from mavenJob to job.

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4013?focusedWorklogId=87800&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87800
 ]

ASF GitHub Bot logged work on BEAM-4013:


Author: ASF GitHub Bot
Created on: 04/Apr/18 23:00
Start Date: 04/Apr/18 23:00
Worklog Time Spent: 10m 
  Work Description: lukecwik commented on issue #5023: [BEAM-4013] Rename 
flink job to fix seed
URL: https://github.com/apache/beam/pull/5023#issuecomment-378771318
 
 
   You'll want to rename this back once the old job doesn't exist.




Issue Time Tracking
---

Worklog Id: (was: 87800)
Time Spent: 1h  (was: 50m)

> Seed job is failing due to job type change from mavenJob to job.
> 
>
> Key: BEAM-4013
> URL: https://issues.apache.org/jira/browse/BEAM-4013
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Alan Myrvold
>Assignee: Alan Myrvold
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> ERROR: Type of item "beam_PostCommit_Java_ValidatesRunner_Flink" does not 
> match existing type, item type can not be changed
> https://github.com/apache/beam/commit/4479148a209a2b8226ab43cfae9ce2f413f08064





[jira] [Work logged] (BEAM-4013) Seed job is failing due to job type change from mavenJob to job.

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4013?focusedWorklogId=87801&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87801
 ]

ASF GitHub Bot logged work on BEAM-4013:


Author: ASF GitHub Bot
Created on: 04/Apr/18 23:00
Start Date: 04/Apr/18 23:00
Worklog Time Spent: 10m 
  Work Description: lukecwik closed pull request #5023: [BEAM-4013] Rename 
flink job to fix seed
URL: https://github.com/apache/beam/pull/5023
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:


diff --git 
a/.test-infra/jenkins/job_beam_PostCommit_Java_ValidatesRunner_Flink.groovy 
b/.test-infra/jenkins/job_beam_PostCommit_Java_ValidatesRunner_Flink.groovy
index b9faeea6cff..af9855adc89 100644
--- a/.test-infra/jenkins/job_beam_PostCommit_Java_ValidatesRunner_Flink.groovy
+++ b/.test-infra/jenkins/job_beam_PostCommit_Java_ValidatesRunner_Flink.groovy
@@ -19,7 +19,7 @@
 import common_job_properties
 
 // This job runs the suite of ValidatesRunner tests against the Flink runner.
-job('beam_PostCommit_Java_ValidatesRunner_Flink') {
+job('beam_PostCommit_Java_ValidatesRunner_Flink_Gradle') {
   description('Runs the ValidatesRunner suite on the Flink runner.')
 
   // Set common parameters.


 




Issue Time Tracking
---

Worklog Id: (was: 87801)
Time Spent: 1h 10m  (was: 1h)

> Seed job is failing due to job type change from mavenJob to job.
> 
>
> Key: BEAM-4013
> URL: https://issues.apache.org/jira/browse/BEAM-4013
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Alan Myrvold
>Assignee: Alan Myrvold
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> ERROR: Type of item "beam_PostCommit_Java_ValidatesRunner_Flink" does not 
> match existing type, item type can not be changed
> https://github.com/apache/beam/commit/4479148a209a2b8226ab43cfae9ce2f413f08064





[beam] branch master updated (177c1ba -> 5caa883)

2018-04-04 Thread lcwik
This is an automated email from the ASF dual-hosted git repository.

lcwik pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 177c1ba  Merge pull request #5018
 add 6c7e17f  Rename flink job to fix seed
 new 5caa883  [BEAM-4013] Rename flink job to fix seed

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../jenkins/job_beam_PostCommit_Java_ValidatesRunner_Flink.groovy   | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-- 
To stop receiving notification emails like this one, please contact
lc...@apache.org.


[beam] 01/01: [BEAM-4013] Rename flink job to fix seed

2018-04-04 Thread lcwik
This is an automated email from the ASF dual-hosted git repository.

lcwik pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 5caa883f191ca4cd9158694ae94a673edbc7b4d5
Merge: 177c1ba 6c7e17f
Author: Lukasz Cwik 
AuthorDate: Wed Apr 4 16:00:12 2018 -0700

[BEAM-4013] Rename flink job to fix seed

 .../jenkins/job_beam_PostCommit_Java_ValidatesRunner_Flink.groovy   | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)



[jira] [Work logged] (BEAM-3956) Stacktraces from exceptions in user code should be preserved in the Python SDK

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3956?focusedWorklogId=87799&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87799
 ]

ASF GitHub Bot logged work on BEAM-3956:


Author: ASF GitHub Bot
Created on: 04/Apr/18 22:59
Start Date: 04/Apr/18 22:59
Worklog Time Spent: 10m 
  Work Description: shoyer commented on issue #4959: [BEAM-3956] Preserve 
stacktraces for Python exceptions
URL: https://github.com/apache/beam/pull/4959#issuecomment-378771136
 
 
   OK, I've implemented (2), using the stringified traceback as the error field 
of the `InstructionResponse` proto.
   
   I dropped the original error message because it can always be found on the 
last line of the traceback. 
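
A minimal sketch of why dropping the separate error message loses nothing, assuming a stand-in for the response type: `FakeInstructionResponse`, `run_instruction`, and `user_fn` are hypothetical names and do not reproduce the real `InstructionResponse` proto. The last line of a formatted traceback already carries the exception type and message.

```python
import traceback
from dataclasses import dataclass


@dataclass
class FakeInstructionResponse:
    """Hypothetical stand-in for the InstructionResponse proto."""
    instruction_id: str
    error: str = ''


def run_instruction(instruction_id, task):
    """Run task() and report any failure as a stringified traceback."""
    try:
        task()
        return FakeInstructionResponse(instruction_id)
    except Exception:
        return FakeInstructionResponse(
            instruction_id, error=traceback.format_exc())


def user_fn():
    raise ValueError('bad element')


result = run_instruction('instruction-1', user_fn)
# The original message is recoverable from the final line of the traceback.
last_line = result.error.strip().splitlines()[-1]
```

On the receiving side, a caller could raise with the whole string, which is what the `raise RuntimeError(result.error)` frame in the example traceback in this message does.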
   
   Example of tracebacks sent back from a worker:
   
   
   
   Note the "RuntimeError: Traceback" line about 12 lines down:
   ```python-traceback
   Traceback (most recent call last):
     File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
       self.run()
     File "/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/portability/universal_local_runner.py", line 245, in run
       ).run_via_runner_api(self._pipeline_proto)
     File "/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 218, in run_via_runner_api
       return self.run_stages(*self.create_stages(pipeline_proto))
     File "/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 837, in run_stages
       pcoll_buffers, safe_coders).process_bundle.metrics
     File "/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 938, in run_stage
       self._progress_frequency).process_bundle(data_input, data_output)
     File "/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/portability/fn_api_runner.py", line 1131, in process_bundle
       raise RuntimeError(result.error)
   RuntimeError: Traceback (most recent call last):
     File "/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", line 119, in _execute
       response = task()
     File "/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", line 154, in <lambda>
       self._execute(lambda: worker.do_instruction(work), work)
     File "/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", line 200, in do_instruction
       request.instruction_id)
     File "/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/worker/sdk_worker.py", line 217, in process_bundle
       processor.process_bundle(instruction_id)
     File "/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/worker/bundle_processor.py", line 286, in process_bundle
       op.start()
     File "/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/worker/operations.py", line 238, in start
       self.output(windowed_value)
     File "/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/worker/operations.py", line 159, in output
       cython.cast(Receiver, self.receivers[output_index]).receive(windowed_value)
     File "/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/worker/operations.py", line 85, in receive
       cython.cast(Operation, consumer).process(windowed_value)
     File "/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/worker/operations.py", line 392, in process
       self.dofn_receiver.receive(o)
     File "/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/common.py", line 488, in receive
       self.process(windowed_value)
     File "/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/common.py", line 496, in process
       self._reraise_augmented(exn)
     File "/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/common.py", line 540, in _reraise_augmented
       six.reraise(type(new_exn), new_exn, original_traceback)
     File "/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/common.py", line 494, in process
       self.do_fn_invoker.invoke_process(windowed_value)
     File "/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/common.py", line 284, in invoke_process
       windowed_value, self.process_method(windowed_value.value))
     File "/usr/local/google/home/shoyer/open-source/beam/sdks/python/apache_beam/runners/common.py", line 595, in process_outputs
       self.main_receivers.receive(windowed_value)
 File 

[jira] [Work logged] (BEAM-3977) Member classes of SdkHarnessClient should have their own files.

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3977?focusedWorklogId=87797&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87797
 ]

ASF GitHub Bot logged work on BEAM-3977:


Author: ASF GitHub Bot
Created on: 04/Apr/18 22:51
Start Date: 04/Apr/18 22:51
Worklog Time Spent: 10m 
  Work Description: axelmagn commented on issue #4988: [BEAM-3977] Move out 
nested classes from SdkHarnessClient.
URL: https://github.com/apache/beam/pull/4988#issuecomment-378769685
 
 
   putting this on hold this week in favor of more urgent tasks.




Issue Time Tracking
---

Worklog Id: (was: 87797)
Time Spent: 1h 50m  (was: 1h 40m)

> Member classes of SdkHarnessClient should have their own files.
> ---
>
> Key: BEAM-3977
> URL: https://issues.apache.org/jira/browse/BEAM-3977
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core
>Reporter: Axel Magnuson
>Assignee: Axel Magnuson
>Priority: Minor
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> SdkHarnessClient contains quite a few nested classes that could be split out. 
> Of these, BundleProcessor and ActiveBundle have grown up to be first-class 
> concepts that we interact with just as much as the SdkHarnessClient.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3250) Migrate ValidatesRunner Jenkins PostCommits to Gradle

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3250?focusedWorklogId=87794=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87794
 ]

ASF GitHub Bot logged work on BEAM-3250:


Author: ASF GitHub Bot
Created on: 04/Apr/18 22:47
Start Date: 04/Apr/18 22:47
Worklog Time Spent: 10m 
  Work Description: herohde opened a new pull request #5025: [BEAM-3250] 
Migrate Apex and Gearpump ValidatesRunner tests to Gradle
URL: https://github.com/apache/beam/pull/5025
 
 
* Added a new test category to exclude ParDoLifecycleTest
   
   




Issue Time Tracking
---

Worklog Id: (was: 87794)
Time Spent: 50m  (was: 40m)

> Migrate ValidatesRunner Jenkins PostCommits to Gradle
> -
>
> Key: BEAM-3250
> URL: https://issues.apache.org/jira/browse/BEAM-3250
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Henning Rohde
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Update these targets to execute ValidatesRunner tests: 
> https://github.com/apache/beam/search?l=Groovy=ValidatesRunner==%E2%9C%93





Jenkins build is back to normal : beam_PostCommit_Python_ValidatesRunner_Dataflow #1257

2018-04-04 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-3737) Key-aware batching function

2018-04-04 Thread Robert Bradshaw (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426269#comment-16426269
 ] 

Robert Bradshaw commented on BEAM-3737:
---

It seems that GroupByKey() would already give you values batched per key, 
right? Or are you looking for something you can place before the GBK that 
enables combiner lifting? 

> Key-aware batching function
> ---
>
> Key: BEAM-3737
> URL: https://issues.apache.org/jira/browse/BEAM-3737
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py-core
>Reporter: Chuan Yu Foo
>Priority: Major
>
> I have a CombineFn for which add_input has very large overhead. I would like 
> to batch the incoming elements into a large batch before each call to 
> add_input to reduce this overhead. In other words, I would like to do 
> something like: 
> {{elements | GroupByKey() | BatchElements() | CombineValues(MyCombineFn())}}
> Unfortunately, BatchElements is not key-aware, and can't be used after a 
> GroupByKey to batch elements per key. I'm working around this by doing the 
> batching within CombineValues, which makes the CombineFn rather messy. It 
> would be nice if there were a key-aware BatchElements transform which could 
> be used in this context.
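
A minimal sketch of the per-key batching requested above, written in plain Python so it runs without Beam. The function name batch_values_per_key and the batch_size parameter are hypothetical and not part of the Beam SDK. It takes one (key, values) pair, as GroupByKey would emit, and yields (key, batch) pairs with at most batch_size values each.

```python
from itertools import islice


def batch_values_per_key(keyed_values, batch_size):
    """Yield (key, batch) pairs; each batch is a list of at most batch_size values."""
    key, values = keyed_values
    iterator = iter(values)
    while True:
        # Pull the next chunk of values for this key.
        batch = list(islice(iterator, batch_size))
        if not batch:
            break
        yield key, batch


# One grouped element, shaped like the output of GroupByKey.
grouped = ("a", [1, 2, 3, 4, 5])
batches = list(batch_values_per_key(grouped, batch_size=2))
print(batches)
```

In a pipeline, a function like this could presumably be applied with beam.FlatMap between GroupByKey and the combining step, so that the CombineFn receives whole batches instead of single elements. That wiring is an assumption and is not taken from the issue.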





[jira] [Commented] (BEAM-3950) Dataflow Runner should supply a wheel version of Python SDK if it is available

2018-04-04 Thread Valentyn Tymofieiev (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426263#comment-16426263
 ] 

Valentyn Tymofieiev commented on BEAM-3950:
---

Thank you. Fair point. 

As for staging all wheel files, I couldn't find a way to download all available 
wheels using pip. Looks like there was a PR to add an --all option, but it was 
not merged: 
https://github.com/pypa/pip/issues/4422, but choosing a sane default or picking 
one of the wheels should be doable.

Also, the codepath that downloads the SDK from pypi is not tested during 
release qualification since we pass --sdk_location, so we should test the 
changes to the logic carefully. 

> Dataflow Runner should supply a wheel version of Python SDK if it is available
> --
>
> Key: BEAM-3950
> URL: https://issues.apache.org/jira/browse/BEAM-3950
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow, sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>






[beam] branch master updated (4641fdb -> 177c1ba)

2018-04-04 Thread tgroh
This is an automated email from the ASF dual-hosted git repository.

tgroh pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 4641fdb  Merge pull request #4979: [BEAM-3965] HDFS Read fixes
 add 0884a53  Fixing lint errrors
 new 177c1ba  Merge pull request #5018

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../fnexecution/control/SdkHarnessClient.java  | 63 +++---
 .../fnexecution/state/GrpcStateService.java|  3 --
 2 files changed, 33 insertions(+), 33 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
tg...@apache.org.


[beam] 01/01: Merge pull request #5018

2018-04-04 Thread tgroh
This is an automated email from the ASF dual-hosted git repository.

tgroh pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 177c1ba7f043c01d60e201110ce12b49cd192da3
Merge: 4641fdb 0884a53
Author: Thomas Groh 
AuthorDate: Wed Apr 4 15:08:31 2018 -0700

Merge pull request #5018

Fixing lint errrors

 .../fnexecution/control/SdkHarnessClient.java  | 63 +++---
 .../fnexecution/state/GrpcStateService.java|  3 --
 2 files changed, 33 insertions(+), 33 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
tg...@apache.org.


[jira] [Work logged] (BEAM-4013) Seed job is failing due to job type change from mavenJob to job.

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4013?focusedWorklogId=87782=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87782
 ]

ASF GitHub Bot logged work on BEAM-4013:


Author: ASF GitHub Bot
Created on: 04/Apr/18 21:57
Start Date: 04/Apr/18 21:57
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #5023: [BEAM-4013] Rename flink 
job to fix seed
URL: https://github.com/apache/beam/pull/5023#issuecomment-378758419
 
 
   Please merge this PR: I'm blocked on the bug.




Issue Time Tracking
---

Worklog Id: (was: 87782)
Time Spent: 50m  (was: 40m)

> Seed job is failing due to job type change from mavenJob to job.
> 
>
> Key: BEAM-4013
> URL: https://issues.apache.org/jira/browse/BEAM-4013
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Alan Myrvold
>Assignee: Alan Myrvold
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> ERROR: Type of item "beam_PostCommit_Java_ValidatesRunner_Flink" does not 
> match existing type, item type can not be changed
> https://github.com/apache/beam/commit/4479148a209a2b8226ab43cfae9ce2f413f08064





[jira] [Work logged] (BEAM-3938) Gradle publish task should authenticate when run from jenkins

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3938?focusedWorklogId=87780=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87780
 ]

ASF GitHub Bot logged work on BEAM-3938:


Author: ASF GitHub Bot
Created on: 04/Apr/18 21:51
Start Date: 04/Apr/18 21:51
Worklog Time Spent: 10m 
  Work Description: alanmyrvold commented on issue #5022: Do not merge, 
test [BEAM-3938] Publish nightly snapshot with gradle
URL: https://github.com/apache/beam/pull/5022#issuecomment-378756977
 
 
   Run Maven Publish




Issue Time Tracking
---

Worklog Id: (was: 87780)
Time Spent: 1h 40m  (was: 1.5h)

> Gradle publish task should authenticate when run from jenkins
> -
>
> Key: BEAM-3938
> URL: https://issues.apache.org/jira/browse/BEAM-3938
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Alan Myrvold
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> ./gradlew publish should be able to write to 
> [https://repository.apache.org/content/repositories/snapshots] when run from 
> jenkins, as the maven 
> [job_beam_Release_NightlySnapshot.groovy|https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_Release_NightlySnapshot.groovy]
>  does.





[jira] [Commented] (BEAM-3950) Dataflow Runner should supply a wheel version of Python SDK if it is available

2018-04-04 Thread Ahmet Altay (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426222#comment-16426222
 ] 

Ahmet Altay commented on BEAM-3950:
---

I assume that the SDK will know whether the default container accepts wheel 
files or not. Based on that it can choose a sane default. If custom containers 
are used, it could be up to the user to pick the right version of the SDK 
(including the wheel files).

It is also fine to stage both the tar file and the whl file, but if neither the 
SDK nor the user knows the target platform, then all variations of the wheel 
files need to be staged.

> Dataflow Runner should supply a wheel version of Python SDK if it is available
> --
>
> Key: BEAM-3950
> URL: https://issues.apache.org/jira/browse/BEAM-3950
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow, sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>






[jira] [Work logged] (BEAM-3938) Gradle publish task should authenticate when run from jenkins

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3938?focusedWorklogId=87774=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87774
 ]

ASF GitHub Bot logged work on BEAM-3938:


Author: ASF GitHub Bot
Created on: 04/Apr/18 21:44
Start Date: 04/Apr/18 21:44
Worklog Time Spent: 10m 
  Work Description: alanmyrvold commented on issue #5022: Do not merge, 
test [BEAM-3938] Publish nightly snapshot with gradle
URL: https://github.com/apache/beam/pull/5022#issuecomment-378755088
 
 
   Run Seed Job




Issue Time Tracking
---

Worklog Id: (was: 87774)
Time Spent: 1.5h  (was: 1h 20m)

> Gradle publish task should authenticate when run from jenkins
> -
>
> Key: BEAM-3938
> URL: https://issues.apache.org/jira/browse/BEAM-3938
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Alan Myrvold
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> ./gradlew publish should be able to write to 
> [https://repository.apache.org/content/repositories/snapshots] when run from 
> jenkins, as the maven 
> [job_beam_Release_NightlySnapshot.groovy|https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_Release_NightlySnapshot.groovy]
>  does.





Build failed in Jenkins: beam_PostCommit_Python_ValidatesRunner_Dataflow #1256

2018-04-04 Thread Apache Jenkins Server
See 


Changes:

[markliu] [BEAM-3946] Fix pubsub_matcher_test which depends on

--
[...truncated 253.07 KB...]
"encoding": {
  "@type": "kind:windowed_value", 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}, 
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}
  ], 
  "is_pair_like": true
}, 
{
  "@type": "kind:global_window"
}
  ], 
  "is_wrapper": true
}, 
"output_name": "out", 
"user_name": 
"assert_that/Group/Map(_merge_tagged_vals_under_key).out"
  }
], 
"parallel_input": {
  "@type": "OutputReference", 
  "output_name": "out", 
  "step_name": "s12"
}, 
"serialized_fn": "", 
"user_name": "assert_that/Group/Map(_merge_tagged_vals_under_key)"
  }
}, 
{
  "kind": "ParallelDo", 
  "name": "s14", 
  "properties": {
"display_data": [
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.CallableWrapperDoFn", 
"type": "STRING", 
"value": ""
  }, 
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.ParDo", 
"shortValue": "CallableWrapperDoFn", 
"type": "STRING", 
"value": "apache_beam.transforms.core.CallableWrapperDoFn"
  }
], 
"non_parallel_inputs": {}, 
"output_info": [
  {
"encoding": {
  "@type": "kind:windowed_value", 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}, 
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}
  ], 
  "is_pair_like": true
}, 
{
  "@type": "kind:global_window"
}
  ], 
  "is_wrapper": true
}, 
"output_name": "out", 
"user_name": "assert_that/Unkey.out"
  }
], 
"parallel_input": {
  "@type": "OutputReference", 
  "output_name": "out", 
  "step_name": "s13"
}, 
"serialized_fn": "", 
"user_name": "assert_that/Unkey"
  }
}, 
{
  "kind": "ParallelDo", 
  "name": "s15", 
  "properties": {
"display_data": [
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.CallableWrapperDoFn", 
"type": "STRING", 
"value": "match"
  }, 
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.ParDo", 
"shortValue": "CallableWrapperDoFn", 
"type": "STRING", 
"value": "apache_beam.transforms.core.CallableWrapperDoFn"
  }
], 
"non_parallel_inputs": {}, 
"output_info": [
  {
"encoding": {
  "@type": "kind:windowed_value", 
  "component_encodings": [
{
  "@type": 

[jira] [Work logged] (BEAM-3965) HDFS read broken in python

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3965?focusedWorklogId=87773=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87773
 ]

ASF GitHub Bot logged work on BEAM-3965:


Author: ASF GitHub Bot
Created on: 04/Apr/18 21:41
Start Date: 04/Apr/18 21:41
Worklog Time Spent: 10m 
  Work Description: chamikaramj closed pull request #4979: [BEAM-3965] HDFS 
Read fixes
URL: https://github.com/apache/beam/pull/4979
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/sdks/python/apache_beam/io/filesystem.py 
b/sdks/python/apache_beam/io/filesystem.py
index 28a0c434dc5..3f7e9aba847 100644
--- a/sdks/python/apache_beam/io/filesystem.py
+++ b/sdks/python/apache_beam/io/filesystem.py
@@ -437,7 +437,8 @@ class FileSystem(BeamPlugin):
   def __init__(self, pipeline_options):
 """
 Args:
-  pipeline_options: Instance of ``PipelineOptions``.
+  pipeline_options: Instance of ``PipelineOptions`` or dict of options and
+values (like ``RuntimeValueProvider.runtime_options``).
 """
 
   @staticmethod
diff --git a/sdks/python/apache_beam/io/filesystems.py 
b/sdks/python/apache_beam/io/filesystems.py
index 17d8d37a061..5bc195bb0d9 100644
--- a/sdks/python/apache_beam/io/filesystems.py
+++ b/sdks/python/apache_beam/io/filesystems.py
@@ -24,6 +24,7 @@
 from apache_beam.io.filesystem import BeamIOError
 from apache_beam.io.filesystem import CompressionTypes
 from apache_beam.io.filesystem import FileSystem
+from apache_beam.options.value_provider import RuntimeValueProvider
 
 # All filesystem implements should be added here as
 # best effort imports. We don't want to force loading
@@ -85,7 +86,11 @@ def get_filesystem(path):
   if len(systems) == 0:
 raise ValueError('Unable to get the Filesystem for path %s' % path)
   elif len(systems) == 1:
-return systems[0](pipeline_options=FileSystems._pipeline_options)
+# Pipeline options could come either from the Pipeline itself (using
+# direct runner), or via RuntimeValueProvider (other runners).
+options = (FileSystems._pipeline_options or
+   RuntimeValueProvider.runtime_options)
+return systems[0](pipeline_options=options)
   else:
 raise ValueError('Found more than one filesystem for path %s' % path)
 except ValueError:
diff --git a/sdks/python/apache_beam/io/hadoopfilesystem.py 
b/sdks/python/apache_beam/io/hadoopfilesystem.py
index bff243aa555..7382c3c8ade 100644
--- a/sdks/python/apache_beam/io/hadoopfilesystem.py
+++ b/sdks/python/apache_beam/io/hadoopfilesystem.py
@@ -35,6 +35,7 @@
 from apache_beam.io.filesystem import FileSystem
 from apache_beam.io.filesystem import MatchResult
 from apache_beam.options.pipeline_options import HadoopFileSystemOptions
+from apache_beam.options.pipeline_options import PipelineOptions
 
 __all__ = ['HadoopFileSystem']
 
@@ -48,12 +49,11 @@
 _FILE_CHECKSUM_BYTES = 'bytes'
 _FILE_CHECKSUM_LENGTH = 'length'
 # WebHDFS FileStatus property constants.
-_FILE_STATUS_NAME = 'name'
+_FILE_STATUS_LENGTH = 'length'
 _FILE_STATUS_PATH_SUFFIX = 'pathSuffix'
 _FILE_STATUS_TYPE = 'type'
 _FILE_STATUS_TYPE_DIRECTORY = 'DIRECTORY'
 _FILE_STATUS_TYPE_FILE = 'FILE'
-_FILE_STATUS_SIZE = 'size'
 
 
 class HdfsDownloader(filesystemio.Downloader):
@@ -61,7 +61,7 @@ class HdfsDownloader(filesystemio.Downloader):
   def __init__(self, hdfs_client, path):
 self._hdfs_client = hdfs_client
 self._path = path
-self._size = self._hdfs_client.status(path)[_FILE_STATUS_SIZE]
+self._size = self._hdfs_client.status(path)[_FILE_STATUS_LENGTH]
 
   @property
   def size(self):
@@ -106,20 +106,26 @@ def __init__(self, pipeline_options):
 """
 super(HadoopFileSystem, self).__init__(pipeline_options)
 logging.getLogger('hdfs.client').setLevel(logging.WARN)
-
 if pipeline_options is None:
   raise ValueError('pipeline_options is not set')
-hdfs_options = pipeline_options.view_as(HadoopFileSystemOptions)
-if hdfs_options.hdfs_host is None:
+if isinstance(pipeline_options, PipelineOptions):
+  hdfs_options = pipeline_options.view_as(HadoopFileSystemOptions)
+  hdfs_host = hdfs_options.hdfs_host
+  hdfs_port = hdfs_options.hdfs_port
+  hdfs_user = hdfs_options.hdfs_user
+else:
+  hdfs_host = pipeline_options.get('hdfs_host')
+  hdfs_port = pipeline_options.get('hdfs_port')
+  hdfs_user = pipeline_options.get('hdfs_user')
+
+if hdfs_host is None:
   raise ValueError('hdfs_host is not set')
-if hdfs_options.hdfs_port is None:
+if hdfs_port is None:
   raise ValueError('hdfs_port is not set')
-if hdfs_options.hdfs_user is 
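
The idea in the hadoopfilesystem.py hunk above (accept either an options object or a plain dict, then validate the three HDFS settings) can be sketched as follows. This is a simplified stand-in and not the code from the pull request: the helper name read_hdfs_settings is hypothetical, and it uses attribute access on a generic object in place of PipelineOptions.view_as so that it runs without Beam installed.

```python
from types import SimpleNamespace


def read_hdfs_settings(options):
    """Return (host, port, user) from either an options object or a plain dict."""
    if options is None:
        raise ValueError('pipeline_options is not set')
    if isinstance(options, dict):
        # Dict of option names to values (the runtime-options case).
        getter = options.get
    else:
        # Object exposing the settings as attributes (stand-in for an options view).
        def getter(name):
            return getattr(options, name, None)
    names = ('hdfs_host', 'hdfs_port', 'hdfs_user')
    values = [getter(name) for name in names]
    for name, value in zip(names, values):
        if value is None:
            raise ValueError('%s is not set' % name)
    return tuple(values)


# Both input shapes resolve to the same settings (sample values are made up).
from_dict = read_hdfs_settings(
    {'hdfs_host': 'namenode', 'hdfs_port': 50070, 'hdfs_user': 'beam'})
from_object = read_hdfs_settings(
    SimpleNamespace(hdfs_host='namenode', hdfs_port=50070, hdfs_user='beam'))
```

Resolving the values once, before validation, keeps the "is not set" checks identical for both input shapes.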

[beam] 01/01: Merge pull request #4979: [BEAM-3965] HDFS Read fixes

2018-04-04 Thread chamikara
This is an automated email from the ASF dual-hosted git repository.

chamikara pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit 4641fdbf1533c26dff38cc7a4a8228e6f34bb0db
Merge: 20405c3 9ed922f
Author: Chamikara Jayalath 
AuthorDate: Wed Apr 4 14:41:40 2018 -0700

Merge pull request #4979: [BEAM-3965] HDFS Read fixes

 sdks/python/apache_beam/io/filesystem.py   |  3 +-
 sdks/python/apache_beam/io/filesystems.py  |  7 ++-
 sdks/python/apache_beam/io/hadoopfilesystem.py | 42 +++--
 .../python/apache_beam/io/hadoopfilesystem_test.py | 68 +-
 .../io/hdfs_integration_test/Dockerfile| 12 +++-
 .../io/hdfs_integration_test/hdfscli.cfg   | 22 +++
 sdks/python/run_postcommit.sh  |  2 +-
 7 files changed, 119 insertions(+), 37 deletions(-)

-- 
To stop receiving notification emails like this one, please contact
chamik...@apache.org.


[beam] branch master updated (20405c3 -> 4641fdb)

2018-04-04 Thread chamikara
This is an automated email from the ASF dual-hosted git repository.

chamikara pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 20405c3  [BEAM-3774] Adds support for reading from/writing to more BQ 
geographical locations (#5001)
 add f078f4a  Add HadoopFileSystemOptions support for Dataflow.
 add 6caab3b  Fix HadoopFileSystem.match bugs.
 add 825e797  Test HDFS reads in integration test.
 add 9ed922f  Fix linter errors and add missing license.
 new 4641fdb  Merge pull request #4979: [BEAM-3965] HDFS Read fixes

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 sdks/python/apache_beam/io/filesystem.py   |  3 +-
 sdks/python/apache_beam/io/filesystems.py  |  7 ++-
 sdks/python/apache_beam/io/hadoopfilesystem.py | 42 +++--
 .../python/apache_beam/io/hadoopfilesystem_test.py | 68 +-
 .../io/hdfs_integration_test/Dockerfile| 12 +++-
 .../io/hdfs_integration_test/hdfscli.cfg}  | 10 ++--
 sdks/python/run_postcommit.sh  |  2 +-
 7 files changed, 102 insertions(+), 42 deletions(-)
 copy sdks/{java/core/src/main/resources/org/apache/beam/sdk/sdk.properties => 
python/apache_beam/io/hdfs_integration_test/hdfscli.cfg} (90%)

-- 
To stop receiving notification emails like this one, please contact
chamik...@apache.org.


[jira] [Commented] (BEAM-3950) Dataflow Runner should supply a wheel version of Python SDK if it is available

2018-04-04 Thread Valentyn Tymofieiev (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426212#comment-16426212
 ] 

Valentyn Tymofieiev commented on BEAM-3950:
---

We should start with allowing --sdk_location to point to a wheel file and 
handle that correctly, without renaming to tar. This will unblock release 
qualification of wheels, and allow users to pass wheels if they want to, 
although just that is not a very convenient user experience. 

If we want to stage a wheel by default, I think we should stage both the source 
tarball and wheel(s). Then, the worker container should decide what to use. It 
can try to use the wheel and, if that does not work out, fall back to the 
tarball. Reasons:
 * Custom containers may have a platform that is incompatible with the wheel we 
choose to stage.
 * Python 3 containers may choose to install the SDK from sources for some time, 
until we start building Python 3 wheels.
 * Wheels may not be immediately recognized by Dataflow worker containers, 
although this is not critical if we can wait with SDK changes.

Starting from version 2.4, the Dataflow SDK should already be installing its 
dependency apache-beam[gcp] on Dataflow workers from wheels. 
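
The wheel-first, tarball-fallback choice described above could look roughly like the sketch below. Everything here is an assumption for illustration: choose_sdk_artifact, the wheel_is_compatible callback, and the file names are hypothetical and do not correspond to existing Beam or Dataflow code.

```python
def choose_sdk_artifact(staged_files, wheel_is_compatible):
    """Return a compatible staged wheel if there is one, else the source tarball."""
    wheels = [f for f in staged_files if f.endswith('.whl')]
    tarballs = [f for f in staged_files if f.endswith('.tar.gz')]
    for wheel in wheels:
        # The caller decides compatibility, e.g. by checking platform tags.
        if wheel_is_compatible(wheel):
            return wheel
    if tarballs:
        # No usable wheel: fall back to installing from sources.
        return tarballs[0]
    raise ValueError('no usable SDK artifact among: %r' % (staged_files,))


# Hypothetical staged files: one source tarball and one Linux wheel.
staged = ['sdk.tar.gz', 'sdk-cp27-none-linux_x86_64.whl']
on_linux = choose_sdk_artifact(staged, lambda name: 'linux' in name)
on_other = choose_sdk_artifact(staged, lambda name: False)
```

Staging both artifacts keeps the decision on the worker side, which is the point made in the first bullet about custom containers.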

> Dataflow Runner should supply a wheel version of Python SDK if it is available
> --
>
> Key: BEAM-3950
> URL: https://issues.apache.org/jira/browse/BEAM-3950
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow, sdk-py-core
>Reporter: Valentyn Tymofieiev
>Assignee: Valentyn Tymofieiev
>Priority: Major
>






Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Flink_Gradle #1

2018-04-04 Thread Apache Jenkins Server
See 


--
[...truncated 67.16 MB...]
04/04/2018 21:39:45 PAssert$168/GroupGlobally/GatherAllOutputs/GroupByKey 
-> 
PAssert$168/GroupGlobally/GatherAllOutputs/Values/Values/Map/ParMultiDo(Anonymous)
 -> PAssert$168/GroupGlobally/RewindowActuals/Window.Assign.out -> 
PAssert$168/GroupGlobally/KeyForDummy/AddKeys/Map/ParMultiDo(Anonymous)(1/1) 
switched to FINISHED 

org.apache.beam.sdk.transforms.CombineTest > testSimpleCombineWithContextEmpty 
STANDARD_ERROR
Apr 04, 2018 9:39:45 PM org.apache.flink.runtime.client.JobClientActor 
logAndPrintMessage
INFO: 04/04/2018 21:39:45   ToKeyedWorkItem(1/1) switched to FINISHED 

org.apache.beam.sdk.transforms.CombineTest > testSimpleCombineWithContextEmpty 
STANDARD_OUT
04/04/2018 21:39:45 ToKeyedWorkItem(1/1) switched to FINISHED 

org.apache.beam.sdk.transforms.CombineTest > testSimpleCombineWithContextEmpty 
STANDARD_ERROR
Apr 04, 2018 9:39:45 PM org.apache.flink.runtime.taskmanager.Task 
transitionState
INFO: Combine.perKey(TestCombineFnWithContext) -> 
PAssert$167/GroupGlobally/Window.Into()/Window.Assign.out -> 
PAssert$167/GroupGlobally/GatherAllOutputs/Reify.Window/ParDo(Anonymous)/ParMultiDo(Anonymous)
 -> 
PAssert$167/GroupGlobally/GatherAllOutputs/WithKeys/AddKeys/Map/ParMultiDo(Anonymous)
 -> PAssert$167/GroupGlobally/GatherAllOutputs/Window.Into()/Window.Assign.out 
-> ToKeyedWorkItem (1/1) (092c66234519bd2f28c9f1b120d61c19) switched from 
RUNNING to FINISHED.
Apr 04, 2018 9:39:45 PM org.apache.flink.runtime.taskmanager.Task run
INFO: Freeing task resources for Combine.perKey(TestCombineFnWithContext) 
-> PAssert$167/GroupGlobally/Window.Into()/Window.Assign.out -> 
PAssert$167/GroupGlobally/GatherAllOutputs/Reify.Window/ParDo(Anonymous)/ParMultiDo(Anonymous)
 -> 
PAssert$167/GroupGlobally/GatherAllOutputs/WithKeys/AddKeys/Map/ParMultiDo(Anonymous)
 -> PAssert$167/GroupGlobally/GatherAllOutputs/Window.Into()/Window.Assign.out 
-> ToKeyedWorkItem (1/1) (092c66234519bd2f28c9f1b120d61c19).
Apr 04, 2018 9:39:45 PM org.apache.flink.runtime.taskmanager.Task run
INFO: Ensuring all FileSystem streams are closed for task 
Combine.perKey(TestCombineFnWithContext) -> 
PAssert$167/GroupGlobally/Window.Into()/Window.Assign.out -> 
PAssert$167/GroupGlobally/GatherAllOutputs/Reify.Window/ParDo(Anonymous)/ParMultiDo(Anonymous)
 -> 
PAssert$167/GroupGlobally/GatherAllOutputs/WithKeys/AddKeys/Map/ParMultiDo(Anonymous)
 -> PAssert$167/GroupGlobally/GatherAllOutputs/Window.Into()/Window.Assign.out 
-> ToKeyedWorkItem (1/1) (092c66234519bd2f28c9f1b120d61c19) [FINISHED]
Apr 04, 2018 9:39:45 PM grizzled.slf4j.Logger info
INFO: Un-registering task and sending final execution state FINISHED to 
JobManager for task Combine.perKey(TestCombineFnWithContext) -> 
PAssert$167/GroupGlobally/Window.Into()/Window.Assign.out -> 
PAssert$167/GroupGlobally/GatherAllOutputs/Reify.Window/ParDo(Anonymous)/ParMultiDo(Anonymous)
 -> 
PAssert$167/GroupGlobally/GatherAllOutputs/WithKeys/AddKeys/Map/ParMultiDo(Anonymous)
 -> PAssert$167/GroupGlobally/GatherAllOutputs/Window.Into()/Window.Assign.out 
-> ToKeyedWorkItem (092c66234519bd2f28c9f1b120d61c19)
Apr 04, 2018 9:39:45 PM org.apache.flink.runtime.taskmanager.Task 
transitionState
INFO: PAssert$167/GroupGlobally/GatherAllOutputs/GroupByKey -> 
PAssert$167/GroupGlobally/GatherAllOutputs/Values/Values/Map/ParMultiDo(Anonymous)
 -> PAssert$167/GroupGlobally/RewindowActuals/Window.Assign.out -> 
PAssert$167/GroupGlobally/KeyForDummy/AddKeys/Map/ParMultiDo(Anonymous) (1/1) 
(33898a90dc5185139b4d884e66af6c70) switched from RUNNING to FINISHED.
Apr 04, 2018 9:39:45 PM org.apache.flink.runtime.taskmanager.Task run
INFO: Freeing task resources for 
PAssert$167/GroupGlobally/GatherAllOutputs/GroupByKey -> 
PAssert$167/GroupGlobally/GatherAllOutputs/Values/Values/Map/ParMultiDo(Anonymous)
 -> PAssert$167/GroupGlobally/RewindowActuals/Window.Assign.out -> 
PAssert$167/GroupGlobally/KeyForDummy/AddKeys/Map/ParMultiDo(Anonymous) (1/1) 
(33898a90dc5185139b4d884e66af6c70).
Apr 04, 2018 9:39:45 PM org.apache.flink.runtime.taskmanager.Task run
INFO: Ensuring all FileSystem streams are closed for task 
PAssert$167/GroupGlobally/GatherAllOutputs/GroupByKey -> 
PAssert$167/GroupGlobally/GatherAllOutputs/Values/Values/Map/ParMultiDo(Anonymous)
 -> PAssert$167/GroupGlobally/RewindowActuals/Window.Assign.out -> 
PAssert$167/GroupGlobally/KeyForDummy/AddKeys/Map/ParMultiDo(Anonymous) (1/1) 
(33898a90dc5185139b4d884e66af6c70) [FINISHED]
Apr 04, 2018 9:39:45 PM org.apache.flink.runtime.taskmanager.Task 
transitionState
INFO: ToKeyedWorkItem (1/1) (7c2e02e598286e6d11c673178bb780db) switched 
from RUNNING to FINISHED.
Apr 04, 2018 9:39:45 PM org.apache.flink.runtime.executiongraph.Execution 
transitionState
INFO: 

[jira] [Resolved] (BEAM-3946) Python SDK tests are failing if no GOOGLE_APPLICATION_CREDENTIALS was set

2018-04-04 Thread Mark Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Liu resolved BEAM-3946.

   Resolution: Fixed
Fix Version/s: Not applicable

> Python SDK tests are failing if no GOOGLE_APPLICATION_CREDENTIALS was set
> -
>
> Key: BEAM-3946
> URL: https://issues.apache.org/jira/browse/BEAM-3946
> Project: Beam
>  Issue Type: Bug
>  Components: examples-python
>Reporter: Alexey Romanenko
>Assignee: Mark Liu
>Priority: Major
> Fix For: Not applicable
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Running mvn clean install locally fails on the following Apache Beam :: SDKs :: 
> Python tests:
> {{ERROR: test_message_matcher_mismatch 
> (apache_beam.io.gcp.tests.pubsub_matcher_test.PubSubMatcherTest)}}
>  {{ERROR: test_message_matcher_success 
> (apache_beam.io.gcp.tests.pubsub_matcher_test.PubSubMatcherTest)}}
>  {{ERROR: test_message_metcher_timeout 
> (apache_beam.io.gcp.tests.pubsub_matcher_test.PubSubMatcherTest)}}
>  
> with an error:
> DefaultCredentialsError: Could not automatically determine credentials. 
> Please set GOOGLE_APPLICATION_CREDENTIALS or
>  explicitly create credential and re-run the application. For more
>  information, please see
>  
> [https://developers.google.com/accounts/docs/application-default-credentials].
>   >> begin captured logging << 
>  google.auth.transport._http_client: DEBUG: Making request: GET 
> [http://169.254.169.254|http://169.254.169.254/]
>  google.auth.compute_engine._metadata: INFO: Compute Engine Metadata server 
> unavailable.
>  - >> end captured logging << -
>  
> It looks like it's a regression and it was caused by this commit: 
> [301853647f2c726c04c5bdb02cab6ff6b39f09d0|https://github.com/apache/beam/commit/301853647f2c726c04c5bdb02cab6ff6b39f09d0]





[jira] [Commented] (BEAM-3946) Python SDK tests are failing if no GOOGLE_APPLICATION_CREDENTIALS was set

2018-04-04 Thread Mark Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16426204#comment-16426204
 ] 

Mark Liu commented on BEAM-3946:


https://github.com/apache/beam/pull/5021 should fix this problem. 

> Python SDK tests are failing if no GOOGLE_APPLICATION_CREDENTIALS was set
> -
>
> Key: BEAM-3946
> URL: https://issues.apache.org/jira/browse/BEAM-3946
> Project: Beam
>  Issue Type: Bug
>  Components: examples-python
>Reporter: Alexey Romanenko
>Assignee: Mark Liu
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Running mvn clean install locally fails on the following Apache Beam :: SDKs :: 
> Python tests:
> {{ERROR: test_message_matcher_mismatch 
> (apache_beam.io.gcp.tests.pubsub_matcher_test.PubSubMatcherTest)}}
>  {{ERROR: test_message_matcher_success 
> (apache_beam.io.gcp.tests.pubsub_matcher_test.PubSubMatcherTest)}}
>  {{ERROR: test_message_metcher_timeout 
> (apache_beam.io.gcp.tests.pubsub_matcher_test.PubSubMatcherTest)}}
>  
> with an error:
> DefaultCredentialsError: Could not automatically determine credentials. 
> Please set GOOGLE_APPLICATION_CREDENTIALS or
>  explicitly create credential and re-run the application. For more
>  information, please see
>  
> [https://developers.google.com/accounts/docs/application-default-credentials].
>   >> begin captured logging << 
>  google.auth.transport._http_client: DEBUG: Making request: GET 
> [http://169.254.169.254|http://169.254.169.254/]
>  google.auth.compute_engine._metadata: INFO: Compute Engine Metadata server 
> unavailable.
>  - >> end captured logging << -
>  
> It looks like it's a regression and it was caused by this commit: 
> [301853647f2c726c04c5bdb02cab6ff6b39f09d0|https://github.com/apache/beam/commit/301853647f2c726c04c5bdb02cab6ff6b39f09d0]





Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #5279

2018-04-04 Thread Apache Jenkins Server
See 




[jira] [Resolved] (BEAM-3774) Update BigQuery jobs to explicitly specify the region

2018-04-04 Thread Chamikara Jayalath (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chamikara Jayalath resolved BEAM-3774.
--
   Resolution: Fixed
Fix Version/s: 2.5.0

> Update BigQuery jobs to explicitly specify the region
> -
>
> Key: BEAM-3774
> URL: https://issues.apache.org/jira/browse/BEAM-3774
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>Priority: Major
> Fix For: 2.5.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> This is needed to support BQ regions other than US and EU. The region can be 
> obtained with a Dataset.get() request, so there is no need to update the user API.
> Both the Python and Java SDKs have to be updated.
>  
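
The lookup described above can be sketched roughly as follows. This is an illustrative Python sketch, not the SDK implementation: `FakeDatasetService`, `get_dataset_location` and `build_job_reference` are hypothetical names, and the dict-shaped dataset and job reference are assumptions that loosely mirror the Java `getDatasetLocation` helper added in PR #5001.

```python
class FakeDatasetService(object):
    """Hypothetical in-memory stand-in for a BigQuery dataset service."""

    def __init__(self, locations):
        # Maps (project_id, dataset_id) to a location string.
        self._locations = locations

    def get_dataset(self, project_id, dataset_id):
        return {'location': self._locations[(project_id, dataset_id)]}


def get_dataset_location(dataset_service, project_id, dataset_id):
    """Returns the dataset's location, wrapping any lookup failure."""
    try:
        dataset = dataset_service.get_dataset(project_id, dataset_id)
    except Exception as e:
        raise RuntimeError(
            'unable to obtain dataset for dataset %s in project %s: %r'
            % (dataset_id, project_id, e))
    return dataset.get('location')


def build_job_reference(dataset_service, project_id, dataset_id, job_id):
    """Builds a job reference dict that carries the dataset's location."""
    return {
        'projectId': project_id,
        'jobId': job_id,
        'location': get_dataset_location(
            dataset_service, project_id, dataset_id),
    }
```

Because the location is derived from the dataset itself, callers in this sketch never pass a region explicitly, which is the reason the issue gives for leaving the user API unchanged.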





[jira] [Work logged] (BEAM-3774) Update BigQuery jobs to explicitly specify the region

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3774?focusedWorklogId=87767=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87767
 ]

ASF GitHub Bot logged work on BEAM-3774:


Author: ASF GitHub Bot
Created on: 04/Apr/18 21:20
Start Date: 04/Apr/18 21:20
Worklog Time Spent: 10m 
  Work Description: chamikaramj closed pull request #5001: [BEAM-3774] Adds 
support for reading from/writing to more BQ geographical locations
URL: https://github.com/apache/beam/pull/5001
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:


diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java
index 96a06229713..29b405bf368 100644
--- 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java
+++ 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java
@@ -20,6 +20,7 @@
 
 import static com.google.common.base.Preconditions.checkState;
 
+import com.google.api.services.bigquery.model.Dataset;
 import com.google.api.services.bigquery.model.Job;
 import com.google.api.services.bigquery.model.JobStatus;
 import com.google.api.services.bigquery.model.TableReference;
@@ -203,12 +204,28 @@ static void verifyDatasetPresence(DatasetService 
datasetService, TableReference
   } else {
 throw new RuntimeException(
 String.format(
-UNABLE_TO_CONFIRM_PRESENCE_OF_RESOURCE_ERROR, "dataset", 
toTableSpec(table)),
-e);
+UNABLE_TO_CONFIRM_PRESENCE_OF_RESOURCE_ERROR, "dataset", 
toTableSpec(table)), e);
   }
 }
   }
 
+  static String getDatasetLocation(
+  DatasetService datasetService, String projectId, String datasetId) {
+Dataset dataset;
+try {
+  dataset = datasetService.getDataset(projectId, datasetId);
+} catch (Exception e) {
+  if (e instanceof InterruptedException) {
+Thread.currentThread().interrupt();
+  }
+  throw new RuntimeException(
+  String.format(
+  "unable to obtain dataset for dataset %s in project %s", 
datasetId, projectId),
+  e);
+}
+return dataset.getLocation();
+  }
+
   static void verifyTablePresence(DatasetService datasetService, 
TableReference table) {
 try {
   datasetService.getTable(table);
diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
index 88de9b4e505..fab238cb788 100644
--- 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
+++ 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
@@ -170,6 +170,13 @@
  * .fromQuery("SELECT year, mean_temp FROM [samples.weather_stations]"));
  * }
  *
+ * Users can optionally specify a query priority using {@link 
TypedRead#withQueryPriority(
+ * TypedRead.QueryPriority)} and a geographic location where the query will be 
executed using {@link
+ * TypedRead#withQueryLocation(String)}. Query location must be specified for 
jobs that are not
+ * executed in US or EU. See <a href="https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query">BigQuery Jobs:
+ * query</a>.
+ *
  * Writing
  *
  * To write to a BigQuery table, apply a {@link BigQueryIO.Write} 
transformation. This consumes a
@@ -549,6 +556,7 @@ public Read withTemplateCompatibility() {
   abstract Builder setWithTemplateCompatibility(Boolean 
useTemplateCompatibility);
   abstract Builder setBigQueryServices(BigQueryServices 
bigQueryServices);
   abstract Builder setQueryPriority(QueryPriority priority);
+  abstract Builder setQueryLocation(String location);
   abstract TypedRead build();
 
   abstract Builder setParseFn(
@@ -570,6 +578,8 @@ public Read withTemplateCompatibility() {
 
 @Nullable abstract QueryPriority getQueryPriority();
 
+@Nullable abstract String getQueryLocation();
+
 @Nullable abstract Coder getCoder();
 
 /**
@@ -632,7 +642,8 @@ public Read withTemplateCompatibility() {
 getBigQueryServices(),
 coder,
 getParseFn(),
-MoreObjects.firstNonNull(getQueryPriority(), 
QueryPriority.BATCH));
+MoreObjects.firstNonNull(getQueryPriority(), 
QueryPriority.BATCH),
+getQueryLocation());
   }
   return source;
 }
@@ -687,7 

[beam] branch master updated: [BEAM-3774] Adds support for reading from/writing to more BQ geographical locations (#5001)

2018-04-04 Thread chamikara
This is an automated email from the ASF dual-hosted git repository.

chamikara pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
 new 20405c3  [BEAM-3774] Adds support for reading from/writing to more BQ 
geographical locations (#5001)
20405c3 is described below

commit 20405c3eb5d5a58176ab93e62fa730f76758e208
Author: Chamikara Jayalath 
AuthorDate: Wed Apr 4 14:20:23 2018 -0700

[BEAM-3774] Adds support for reading from/writing to more BQ geographical 
locations (#5001)

* Adds support for reading from/writing to BigQuery datasets that are not 
in US or EU locations.

* Addressing reviewer comments.
---
 .../beam/sdk/io/gcp/bigquery/BigQueryHelpers.java  | 21 -
 .../beam/sdk/io/gcp/bigquery/BigQueryIO.java   | 28 ++-
 .../sdk/io/gcp/bigquery/BigQueryQuerySource.java   | 54 +++---
 .../beam/sdk/io/gcp/bigquery/BigQueryServices.java |  2 +-
 .../sdk/io/gcp/bigquery/BigQueryServicesImpl.java  | 10 ++--
 .../sdk/io/gcp/bigquery/BigQuerySourceBase.java| 17 ---
 .../beam/sdk/io/gcp/bigquery/WriteTables.java  | 11 +++--
 .../sdk/io/gcp/bigquery/BigQueryIOReadTest.java|  6 ++-
 .../beam/sdk/io/gcp/bigquery/FakeJobService.java   |  2 +-
 9 files changed, 115 insertions(+), 36 deletions(-)

diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java
index 96a0622..29b405b 100644
--- 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java
+++ 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryHelpers.java
@@ -20,6 +20,7 @@ package org.apache.beam.sdk.io.gcp.bigquery;
 
 import static com.google.common.base.Preconditions.checkState;
 
+import com.google.api.services.bigquery.model.Dataset;
 import com.google.api.services.bigquery.model.Job;
 import com.google.api.services.bigquery.model.JobStatus;
 import com.google.api.services.bigquery.model.TableReference;
@@ -203,12 +204,28 @@ public class BigQueryHelpers {
   } else {
 throw new RuntimeException(
 String.format(
-UNABLE_TO_CONFIRM_PRESENCE_OF_RESOURCE_ERROR, "dataset", 
toTableSpec(table)),
-e);
+UNABLE_TO_CONFIRM_PRESENCE_OF_RESOURCE_ERROR, "dataset", 
toTableSpec(table)), e);
   }
 }
   }
 
+  static String getDatasetLocation(
+  DatasetService datasetService, String projectId, String datasetId) {
+Dataset dataset;
+try {
+  dataset = datasetService.getDataset(projectId, datasetId);
+} catch (Exception e) {
+  if (e instanceof InterruptedException) {
+Thread.currentThread().interrupt();
+  }
+  throw new RuntimeException(
+  String.format(
+  "unable to obtain dataset for dataset %s in project %s", 
datasetId, projectId),
+  e);
+}
+return dataset.getLocation();
+  }
+
   static void verifyTablePresence(DatasetService datasetService, 
TableReference table) {
 try {
   datasetService.getTable(table);
diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
index 88de9b4..fab238c 100644
--- 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
+++ 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java
@@ -170,6 +170,13 @@ import org.slf4j.LoggerFactory;
  * .fromQuery("SELECT year, mean_temp FROM [samples.weather_stations]"));
  * }
  *
+ * Users can optionally specify a query priority using {@link 
TypedRead#withQueryPriority(
+ * TypedRead.QueryPriority)} and a geographic location where the query will be 
executed using {@link
+ * TypedRead#withQueryLocation(String)}. Query location must be specified for 
jobs that are not
+ * executed in US or EU. See <a href="https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query">BigQuery Jobs:
+ * query</a>.
+ *
  * Writing
  *
  * To write to a BigQuery table, apply a {@link BigQueryIO.Write} 
transformation. This consumes a
@@ -549,6 +556,7 @@ public class BigQueryIO {
   abstract Builder setWithTemplateCompatibility(Boolean 
useTemplateCompatibility);
   abstract Builder setBigQueryServices(BigQueryServices 
bigQueryServices);
   abstract Builder setQueryPriority(QueryPriority priority);
+  abstract Builder setQueryLocation(String location);
   abstract TypedRead build();
 
   abstract Builder setParseFn(
@@ -570,6 +578,8 @@ public class BigQueryIO {
 
 @Nullable abstract QueryPriority 

[jira] [Assigned] (BEAM-3252) Update contributors guide to discuss Gradle

2018-04-04 Thread Daniel Oliveira (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Oliveira reassigned BEAM-3252:
-

Assignee: Daniel Oliveira

> Update contributors guide to discuss Gradle
> ---
>
> Key: BEAM-3252
> URL: https://issues.apache.org/jira/browse/BEAM-3252
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, website
>Reporter: Luke Cwik
>Assignee: Daniel Oliveira
>Priority: Major
>






[jira] [Work logged] (BEAM-3965) HDFS read broken in python

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3965?focusedWorklogId=87762=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87762
 ]

ASF GitHub Bot logged work on BEAM-3965:


Author: ASF GitHub Bot
Created on: 04/Apr/18 21:06
Start Date: 04/Apr/18 21:06
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #4979: [BEAM-3965] HDFS Read 
fixes
URL: https://github.com/apache/beam/pull/4979#issuecomment-378745249
 
 
   Ready to merge - postcommit passed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 87762)
Time Spent: 3.5h  (was: 3h 20m)

> HDFS read broken in python
> --
>
> Key: BEAM-3965
> URL: https://issues.apache.org/jira/browse/BEAM-3965
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> When running a command like:
> {noformat}
> python setup.py sdist > /dev/null && python -m apache_beam.examples.wordcount 
> --output gs://.../py-wordcount-output \
>   --hdfs_host ... --hdfs_port 50070 --hdfs_user ehudm --runner DataflowRunner 
> --project ... \
>   --temp_location gs://.../temp-hdfs-int --staging_location 
> gs://.../staging-hdfs-int \
>   --sdk_location dist/apache-beam-2.5.0.dev0.tar.gz --input 
> hdfs://kinglear.txt
> {noformat}
> I get:
> {noformat}
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
> "__main__", fname, loader, pkg_name)
>   File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
> exec code in run_globals
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/examples/wordcount.py",
>  line 136, in 
> run()
>   File 
> "/usr/local/google/home/ehudm/src/beam/sdks/python/apache_beam/examples/wordcount.py",
>  line 90, in run
> lines = p | 'read' >> ReadFromText(known_args.input)
>   File "apache_beam/io/textio.py", line 522, in __init__
> skip_header_lines=skip_header_lines)
>   File "apache_beam/io/textio.py", line 117, in __init__
> validate=validate)
>   File "apache_beam/io/filebasedsource.py", line 119, in __init__
> self._validate()
>   File "apache_beam/options/value_provider.py", line 124, in _f
> return fnc(self, *args, **kwargs)
>   File "apache_beam/io/filebasedsource.py", line 176, in _validate
> match_result = FileSystems.match([pattern], limits=[1])[0]
>   File "apache_beam/io/filesystems.py", line 159, in match
> return filesystem.match(patterns, limits)
>   File "apache_beam/io/hadoopfilesystem.py", line 221, in match
> raise BeamIOError('Match operation failed', exceptions)
> apache_beam.io.filesystem.BeamIOError: Match operation failed with exceptions 
> {'hdfs://kinglear.txt': KeyError('name',)}
> {noformat}





[jira] [Work logged] (BEAM-3938) Gradle publish task should authenticate when run from jenkins

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3938?focusedWorklogId=87761=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87761
 ]

ASF GitHub Bot logged work on BEAM-3938:


Author: ASF GitHub Bot
Created on: 04/Apr/18 21:04
Start Date: 04/Apr/18 21:04
Worklog Time Spent: 10m 
  Work Description: alanmyrvold commented on issue #5022: Do not merge, 
test [BEAM-3938] Publish nightly snapshot with gradle
URL: https://github.com/apache/beam/pull/5022#issuecomment-378744619
 
 
   Run Seed Job




Issue Time Tracking
---

Worklog Id: (was: 87761)
Time Spent: 1h 20m  (was: 1h 10m)

> Gradle publish task should authenticate when run from jenkins
> -
>
> Key: BEAM-3938
> URL: https://issues.apache.org/jira/browse/BEAM-3938
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Alan Myrvold
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> ./gradlew publish should be able to write to 
> [https://repository.apache.org/content/repositories/snapshots] when run from 
> jenkins, as the maven 
> [job_beam_Release_NightlySnapshot.groovy|https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_Release_NightlySnapshot.groovy]
>  does.





[jira] [Work logged] (BEAM-3938) Gradle publish task should authenticate when run from jenkins

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3938?focusedWorklogId=87758=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87758
 ]

ASF GitHub Bot logged work on BEAM-3938:


Author: ASF GitHub Bot
Created on: 04/Apr/18 20:38
Start Date: 04/Apr/18 20:38
Worklog Time Spent: 10m 
  Work Description: alanmyrvold commented on issue #5022: Do not merge, 
test [BEAM-3938] Publish nightly snapshot with gradle
URL: https://github.com/apache/beam/pull/5022#issuecomment-378737235
 
 
   Run Gradle Publish




Issue Time Tracking
---

Worklog Id: (was: 87758)
Time Spent: 1h 10m  (was: 1h)

> Gradle publish task should authenticate when run from jenkins
> -
>
> Key: BEAM-3938
> URL: https://issues.apache.org/jira/browse/BEAM-3938
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Alan Myrvold
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> ./gradlew publish should be able to write to 
> [https://repository.apache.org/content/repositories/snapshots] when run from 
> jenkins, as the maven 
> [job_beam_Release_NightlySnapshot.groovy|https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_Release_NightlySnapshot.groovy]
>  does.





[jira] [Work logged] (BEAM-4013) Seed job is failing due to job type change from mavenJob to job.

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4013?focusedWorklogId=87756=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87756
 ]

ASF GitHub Bot logged work on BEAM-4013:


Author: ASF GitHub Bot
Created on: 04/Apr/18 20:33
Start Date: 04/Apr/18 20:33
Worklog Time Spent: 10m 
  Work Description: tgroh commented on issue #5023: [BEAM-4013] Rename 
flink job to fix seed
URL: https://github.com/apache/beam/pull/5023#issuecomment-378735722
 
 
   Run seed job




Issue Time Tracking
---

Worklog Id: (was: 87756)
Time Spent: 40m  (was: 0.5h)

> Seed job is failing due to job type change from mavenJob to job.
> 
>
> Key: BEAM-4013
> URL: https://issues.apache.org/jira/browse/BEAM-4013
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Alan Myrvold
>Assignee: Alan Myrvold
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> ERROR: Type of item "beam_PostCommit_Java_ValidatesRunner_Flink" does not 
> match existing type, item type can not be changed
> https://github.com/apache/beam/commit/4479148a209a2b8226ab43cfae9ce2f413f08064





[jira] [Work logged] (BEAM-4013) Seed job is failing due to job type change from mavenJob to job.

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4013?focusedWorklogId=87742=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87742
 ]

ASF GitHub Bot logged work on BEAM-4013:


Author: ASF GitHub Bot
Created on: 04/Apr/18 19:53
Start Date: 04/Apr/18 19:53
Worklog Time Spent: 10m 
  Work Description: alanmyrvold commented on issue #5023: [BEAM-4013] 
Rename flink job to fix seed
URL: https://github.com/apache/beam/pull/5023#issuecomment-378724665
 
 
   @udim 
   +R: @tgroh PTAL ?
   
   Sample failure:
   https://builds.apache.org/view/A-D/view/Beam/job/beam_SeedJob/1425/console




Issue Time Tracking
---

Worklog Id: (was: 87742)
Time Spent: 0.5h  (was: 20m)

> Seed job is failing due to job type change from mavenJob to job.
> 
>
> Key: BEAM-4013
> URL: https://issues.apache.org/jira/browse/BEAM-4013
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Alan Myrvold
>Assignee: Alan Myrvold
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> ERROR: Type of item "beam_PostCommit_Java_ValidatesRunner_Flink" does not 
> match existing type, item type can not be changed
> https://github.com/apache/beam/commit/4479148a209a2b8226ab43cfae9ce2f413f08064





[jira] [Work logged] (BEAM-4013) Seed job is failing due to job type change from mavenJob to job.

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4013?focusedWorklogId=87741=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87741
 ]

ASF GitHub Bot logged work on BEAM-4013:


Author: ASF GitHub Bot
Created on: 04/Apr/18 19:53
Start Date: 04/Apr/18 19:53
Worklog Time Spent: 10m 
  Work Description: alanmyrvold commented on issue #5023: [BEAM-4013] 
Rename flink job to fix seed
URL: https://github.com/apache/beam/pull/5023#issuecomment-378707932
 
 
   Run Seed Job
   




Issue Time Tracking
---

Worklog Id: (was: 87741)
Time Spent: 20m  (was: 10m)

> Seed job is failing due to job type change from mavenJob to job.
> 
>
> Key: BEAM-4013
> URL: https://issues.apache.org/jira/browse/BEAM-4013
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Alan Myrvold
>Assignee: Alan Myrvold
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> ERROR: Type of item "beam_PostCommit_Java_ValidatesRunner_Flink" does not 
> match existing type, item type can not be changed
> https://github.com/apache/beam/commit/4479148a209a2b8226ab43cfae9ce2f413f08064





[jira] [Work logged] (BEAM-4013) Seed job is failing due to job type change from mavenJob to job.

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4013?focusedWorklogId=87740=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87740
 ]

ASF GitHub Bot logged work on BEAM-4013:


Author: ASF GitHub Bot
Created on: 04/Apr/18 19:53
Start Date: 04/Apr/18 19:53
Worklog Time Spent: 10m 
  Work Description: alanmyrvold commented on issue #5023: [BEAM-4013] 
Rename flink job to fix seed
URL: https://github.com/apache/beam/pull/5023#issuecomment-378707873
 
 
   Trying to fix seed job
   




Issue Time Tracking
---

Worklog Id: (was: 87740)
Time Spent: 10m
Remaining Estimate: 0h

> Seed job is failing due to job type change from mavenJob to job.
> 
>
> Key: BEAM-4013
> URL: https://issues.apache.org/jira/browse/BEAM-4013
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Alan Myrvold
>Assignee: Alan Myrvold
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> ERROR: Type of item "beam_PostCommit_Java_ValidatesRunner_Flink" does not 
> match existing type, item type can not be changed
> https://github.com/apache/beam/commit/4479148a209a2b8226ab43cfae9ce2f413f08064





[jira] [Assigned] (BEAM-4013) Seed job is failing due to job type change from mavenJob to job.

2018-04-04 Thread Alan Myrvold (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Myrvold reassigned BEAM-4013:
--

Assignee: Alan Myrvold  (was: Davor Bonaci)

> Seed job is failing due to job type change from mavenJob to job.
> 
>
> Key: BEAM-4013
> URL: https://issues.apache.org/jira/browse/BEAM-4013
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Alan Myrvold
>Assignee: Alan Myrvold
>Priority: Major
>
> ERROR: Type of item "beam_PostCommit_Java_ValidatesRunner_Flink" does not 
> match existing type, item type can not be changed
> https://github.com/apache/beam/commit/4479148a209a2b8226ab43cfae9ce2f413f08064





[jira] [Created] (BEAM-4013) Seed job is failing due to job type change from mavenJob to job.

2018-04-04 Thread Alan Myrvold (JIRA)
Alan Myrvold created BEAM-4013:
--

 Summary: Seed job is failing due to job type change from mavenJob 
to job.
 Key: BEAM-4013
 URL: https://issues.apache.org/jira/browse/BEAM-4013
 Project: Beam
  Issue Type: Bug
  Components: build-system
Reporter: Alan Myrvold
Assignee: Davor Bonaci


ERROR: Type of item "beam_PostCommit_Java_ValidatesRunner_Flink" does not match 
existing type, item type can not be changed

https://github.com/apache/beam/commit/4479148a209a2b8226ab43cfae9ce2f413f08064





[jira] [Work logged] (BEAM-4011) Python SDK: add glob support for HDFS

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4011?focusedWorklogId=87735=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87735
 ]

ASF GitHub Bot logged work on BEAM-4011:


Author: ASF GitHub Bot
Created on: 04/Apr/18 19:26
Start Date: 04/Apr/18 19:26
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #5024: [BEAM-4011] Normalize 
Filesystems.match() glob behavior.
URL: https://github.com/apache/beam/pull/5024#issuecomment-378716918
 
 
   Do not merge before https://github.com/apache/beam/pull/4979




Issue Time Tracking
---

Worklog Id: (was: 87735)
Time Spent: 20m  (was: 10m)

> Python SDK: add glob support for HDFS
> -
>
> Key: BEAM-4011
> URL: https://issues.apache.org/jira/browse/BEAM-4011
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (BEAM-3946) Python SDK tests are failing if no GOOGLE_APPLICATION_CREDENTIALS was set

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3946?focusedWorklogId=87734=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87734
 ]

ASF GitHub Bot logged work on BEAM-3946:


Author: ASF GitHub Bot
Created on: 04/Apr/18 19:23
Start Date: 04/Apr/18 19:23
Worklog Time Spent: 10m 
  Work Description: aaltay closed pull request #5021: [BEAM-3946] Fix 
pubsub_matcher_test which depends on GOOGLE_APPLICATION_CREDENTIALS
URL: https://github.com/apache/beam/pull/5021
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:


diff --git a/sdks/python/apache_beam/io/gcp/tests/pubsub_matcher.py 
b/sdks/python/apache_beam/io/gcp/tests/pubsub_matcher.py
index 1fb712fbf67..695bfcd70f6 100644
--- a/sdks/python/apache_beam/io/gcp/tests/pubsub_matcher.py
+++ b/sdks/python/apache_beam/io/gcp/tests/pubsub_matcher.py
@@ -73,14 +73,14 @@ def __init__(self, project, sub_name, expected_msg, 
timeout=DEFAULT_TIMEOUT):
 
   def _matches(self, _):
 if self.messages is None:
-  subscription = (pubsub
-  .Client(project=self.project)
-  .subscription(self.sub_name))
-  self.messages = self._wait_for_messages(subscription,
+  self.messages = self._wait_for_messages(self._get_subscription(),
   len(self.expected_msg),
   self.timeout)
 return Counter(self.messages) == Counter(self.expected_msg)
 
+  def _get_subscription(self):
+return pubsub.Client(project=self.project).subscription(self.sub_name)
+
   def _wait_for_messages(self, subscription, expected_num, timeout):
 """Wait for messages from given subscription."""
 logging.debug('Start pulling messages from %s', subscription.full_name)
diff --git a/sdks/python/apache_beam/io/gcp/tests/pubsub_matcher_test.py b/sdks/python/apache_beam/io/gcp/tests/pubsub_matcher_test.py
index a7fd310c7dd..6bb780cb714 100644
--- a/sdks/python/apache_beam/io/gcp/tests/pubsub_matcher_test.py
+++ b/sdks/python/apache_beam/io/gcp/tests/pubsub_matcher_test.py
@@ -44,10 +44,11 @@ def setUp(self):
                                                ['mock_expected_msg'])
 
   @mock.patch('time.sleep', return_value=None)
-  @mock.patch('google.cloud.pubsub.Client.subscription')
-  def test_message_matcher_success(self, mock_sub_cls, unsued_mock):
+  @mock.patch('apache_beam.io.gcp.tests.pubsub_matcher.'
+              'PubSubMessageMatcher._get_subscription')
+  def test_message_matcher_success(self, mock_get_sub, unsued_mock):
     self.pubsub_matcher.expected_msg = ['a', 'b']
-    mock_sub = mock_sub_cls.return_value
+    mock_sub = mock_get_sub.return_value
     mock_sub.pull.side_effect = [
         [(1, pubsub.message.Message(b'a', 'unused_id'))],
         [(2, pubsub.message.Message(b'b', 'unused_id'))],
@@ -56,10 +57,11 @@ def test_message_matcher_success(self, mock_sub_cls, unsued_mock):
     self.assertEqual(mock_sub.pull.call_count, 2)
 
   @mock.patch('time.sleep', return_value=None)
-  @mock.patch('google.cloud.pubsub.Client.subscription')
-  def test_message_matcher_mismatch(self, mock_sub_cls, unused_mock):
+  @mock.patch('apache_beam.io.gcp.tests.pubsub_matcher.'
+              'PubSubMessageMatcher._get_subscription')
+  def test_message_matcher_mismatch(self, mock_get_sub, unused_mock):
     self.pubsub_matcher.expected_msg = ['a']
-    mock_sub = mock_sub_cls.return_value
+    mock_sub = mock_get_sub.return_value
     mock_sub.pull.return_value = [
         (1, pubsub.message.Message(b'c', 'unused_id')),
         (1, pubsub.message.Message(b'd', 'unused_id')),
@@ -73,9 +75,10 @@ def test_message_matcher_mismatch(self, mock_sub_cls, unused_mock):
                     in str(error.exception.args[0]))
 
   @mock.patch('time.sleep', return_value=None)
-  @mock.patch('google.cloud.pubsub.Client.subscription')
-  def test_message_metcher_timeout(self, mock_sub_cls, unused_mock):
-    mock_sub = mock_sub_cls.return_value
+  @mock.patch('apache_beam.io.gcp.tests.pubsub_matcher.'
+              'PubSubMessageMatcher._get_subscription')
+  def test_message_metcher_timeout(self, mock_get_sub, unused_mock):
+    mock_sub = mock_get_sub.return_value
     mock_sub.return_value.full_name.return_value = 'mock_sub'
     self.pubsub_matcher.timeout = 0.1
     with self.assertRaises(AssertionError) as error:


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[beam] 01/01: Merge pull request #5021 from markflyhigh/fix-unittest

2018-04-04 Thread altay
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit cf19a07566548cd1112095334bc0cddf7a8aa9cb
Merge: e80babb 631cbb7
Author: Ahmet Altay 
AuthorDate: Wed Apr 4 12:23:54 2018 -0700

Merge pull request #5021 from markflyhigh/fix-unittest

[BEAM-3946] Fix pubsub_matcher_test which depends on 
GOOGLE_APPLICATION_CREDENTIALS

 .../apache_beam/io/gcp/tests/pubsub_matcher.py  |  8 
 .../apache_beam/io/gcp/tests/pubsub_matcher_test.py | 21 -
 2 files changed, 16 insertions(+), 13 deletions(-)
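
The idea behind this fix can be illustrated in isolation. The following is a minimal sketch, not Beam's code: `FakeMatcher` and `run_with_patched_subscription` are hypothetical names, and the factory method simply raises to stand in for a client that needs credentials. Patching the factory method means the test never constructs a real client.

```python
from unittest import mock


class FakeMatcher(object):
  """Hypothetical stand-in for a matcher that pulls from a remote service."""

  def __init__(self, expected):
    self.expected = expected

  def _get_subscription(self):
    # In real code this would build a client, which needs credentials.
    raise RuntimeError('credentials required')

  def matches(self):
    subscription = self._get_subscription()
    return sorted(subscription.pull()) == sorted(self.expected)


def run_with_patched_subscription(expected, pulled):
  """Patches the factory method so no client is ever constructed."""
  with mock.patch.object(FakeMatcher, '_get_subscription') as mock_get_sub:
    # The mock replaces the method; its return value acts as the subscription.
    mock_get_sub.return_value.pull.return_value = pulled
    return FakeMatcher(expected).matches()
```

Because only the matcher's own method is replaced, the test does not depend on how the third-party client resolves credentials at construction time.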

-- 
To stop receiving notification emails like this one, please contact
al...@apache.org.


[beam] branch master updated (e80babb -> cf19a07)

2018-04-04 Thread altay
This is an automated email from the ASF dual-hosted git repository.

altay pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from e80babb  [BEAM-3982] Register Go transform types and functions
 add 631cbb7  [BEAM-3946] Fix pubsub_matcher_test which depends on 
GOOGLE_APPLICATION_CREDENTIALS
 new cf19a07  Merge pull request #5021 from markflyhigh/fix-unittest

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../apache_beam/io/gcp/tests/pubsub_matcher.py  |  8 
 .../apache_beam/io/gcp/tests/pubsub_matcher_test.py | 21 -
 2 files changed, 16 insertions(+), 13 deletions(-)



[jira] [Work logged] (BEAM-3938) Gradle publish task should authenticate when run from jenkins

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3938?focusedWorklogId=87733&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87733
 ]

ASF GitHub Bot logged work on BEAM-3938:


Author: ASF GitHub Bot
Created on: 04/Apr/18 19:21
Start Date: 04/Apr/18 19:21
Worklog Time Spent: 10m 
  Work Description: alanmyrvold commented on issue #5022: Do not merge, 
test [BEAM-3938] Publish nightly snapshot with gradle
URL: https://github.com/apache/beam/pull/5022#issuecomment-378715507
 
 
   Run Seed Job




Issue Time Tracking
---

Worklog Id: (was: 87733)
Time Spent: 1h  (was: 50m)

> Gradle publish task should authenticate when run from jenkins
> -
>
> Key: BEAM-3938
> URL: https://issues.apache.org/jira/browse/BEAM-3938
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Alan Myrvold
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> ./gradlew publish should be able to write to 
> [https://repository.apache.org/content/repositories/snapshots] when run from 
> jenkins, as the maven 
> [job_beam_Release_NightlySnapshot.groovy|https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_Release_NightlySnapshot.groovy]
>  does.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3938) Gradle publish task should authenticate when run from jenkins

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3938?focusedWorklogId=87732=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87732
 ]

ASF GitHub Bot logged work on BEAM-3938:


Author: ASF GitHub Bot
Created on: 04/Apr/18 19:21
Start Date: 04/Apr/18 19:21
Worklog Time Spent: 10m 
  Work Description: alanmyrvold commented on issue #5022: Do not merge, 
test [BEAM-3938] Publish nightly snapshot with gradle
URL: https://github.com/apache/beam/pull/5022#issuecomment-378699194
 
 
   Run Seed Job
   




Issue Time Tracking
---

Worklog Id: (was: 87732)
Time Spent: 50m  (was: 40m)

> Gradle publish task should authenticate when run from jenkins
> -
>
> Key: BEAM-3938
> URL: https://issues.apache.org/jira/browse/BEAM-3938
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Alan Myrvold
>Assignee: Jean-Baptiste Onofré
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> ./gradlew publish should be able to write to 
> [https://repository.apache.org/content/repositories/snapshots] when run from 
> jenkins, as the maven 
> [job_beam_Release_NightlySnapshot.groovy|https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_beam_Release_NightlySnapshot.groovy]
>  does.





[jira] [Resolved] (BEAM-2861) test_delete_bq_table_succeeds fails with GOOGLE_APPLICATION_CREDENTIALS

2018-04-04 Thread Mark Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Liu resolved BEAM-2861.

   Resolution: Fixed
Fix Version/s: Not applicable

> test_delete_bq_table_succeeds fails with GOOGLE_APPLICATION_CREDENTIALS
> ---
>
> Key: BEAM-2861
> URL: https://issues.apache.org/jira/browse/BEAM-2861
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Mark Liu
>Priority: Major
> Fix For: Not applicable
>
>
> This is a variation of the https://issues.apache.org/jira/browse/BEAM-2101
> The tests are not skipped if the GCP libraries are installed. But the tests 
> also require GCP authentication. We should probably also skip the tests if 
> GCP is installed but user is not authenticated.
> cc: [~pei...@gmail.com]
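
The skip condition proposed above could look roughly like this. It is a minimal sketch under a stated assumption: credentials are detected only through the `GOOGLE_APPLICATION_CREDENTIALS` environment variable pointing at an existing file, which ignores other ways a user may be authenticated. `has_gcp_credentials` and `DeleteTableTest` are hypothetical names.

```python
import os
import unittest


def has_gcp_credentials(environ=None):
  """Returns True if GOOGLE_APPLICATION_CREDENTIALS names an existing file."""
  environ = os.environ if environ is None else environ
  path = environ.get('GOOGLE_APPLICATION_CREDENTIALS')
  return bool(path) and os.path.isfile(path)


@unittest.skipUnless(has_gcp_credentials(), 'GCP credentials not configured')
class DeleteTableTest(unittest.TestCase):

  def test_placeholder(self):
    # A real test would call the GCP API here.
    self.assertTrue(has_gcp_credentials())
```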





[jira] [Closed] (BEAM-2861) test_delete_bq_table_succeeds fails with GOOGLE_APPLICATION_CREDENTIALS

2018-04-04 Thread Mark Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Liu closed BEAM-2861.
--

> test_delete_bq_table_succeeds fails with GOOGLE_APPLICATION_CREDENTIALS
> ---
>
> Key: BEAM-2861
> URL: https://issues.apache.org/jira/browse/BEAM-2861
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Ahmet Altay
>Assignee: Mark Liu
>Priority: Major
> Fix For: Not applicable
>
>
> This is a variation of the https://issues.apache.org/jira/browse/BEAM-2101
> The tests are not skipped if the GCP libraries are installed. But the tests 
> also require GCP authentication. We should probably also skip the tests if 
> GCP is installed but user is not authenticated.
> cc: [~pei...@gmail.com]





[jira] [Resolved] (BEAM-1033) BigQueryMatcher is flaky

2018-04-04 Thread Mark Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Liu resolved BEAM-1033.

   Resolution: Fixed
Fix Version/s: Not applicable

> BigQueryMatcher is flaky
> 
>
> Key: BEAM-1033
> URL: https://issues.apache.org/jira/browse/BEAM-1033
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Pei He
>Assignee: Mark Liu
>Priority: Major
> Fix For: Not applicable
>
>
> Jenkins link:
> https://builds.apache.org/job/beam_PreCommit_MavenVerify/5145/console
> Running org.apache.beam.examples.WindowedWordCountIT
> Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 304.282 sec 
> <<< FAILURE! - in org.apache.beam.examples.WindowedWordCountIT
> testWindowedWordCountInBatch(org.apache.beam.examples.WindowedWordCountIT)  
> Time elapsed: 304.282 sec  <<< FAILURE!
> java.lang.AssertionError: 
> Expected: Expected checksum is (cd5b52939257e12428a9fa085c32a84dd209b180)
>  but: Invalid BigQuery response: 
> {"jobComplete":false,"jobReference":{"jobId":"job_0STNX_OD83tQOzo6MvmqXCrk61U","projectId":"apache-beam-testing"},"kind":"bigquery#queryResponse"}
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
>   at 
> org.apache.beam.runners.dataflow.testing.TestDataflowRunner.run(TestDataflowRunner.java:164)
>   at 
> org.apache.beam.runners.dataflow.testing.TestDataflowRunner.run(TestDataflowRunner.java:93)
>   at 
> org.apache.beam.runners.dataflow.testing.TestDataflowRunner.run(TestDataflowRunner.java:61)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:179)
>   at 
> org.apache.beam.examples.WindowedWordCount.main(WindowedWordCount.java:224)
>   at 
> org.apache.beam.examples.WindowedWordCountIT.testWindowedWordCountPipeline(WindowedWordCountIT.java:88)
>   at 
> org.apache.beam.examples.WindowedWordCountIT.testWindowedWordCountInBatch(WindowedWordCountIT.java:59)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at 
> org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:393)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Results :
> Failed tests: 
>   
> WindowedWordCountIT.testWindowedWordCountInBatch:59->testWindowedWordCountPipeline:88
>  
> Expected: Expected checksum is (cd5b52939257e12428a9fa085c32a84dd209b180)
>  but: Invalid BigQuery response: 
> {"jobComplete":false,"jobReference":{"jobId":"job_0STNX_OD83tQOzo6MvmqXCrk61U","projectId":"apache-beam-testing"},"kind":"bigquery#queryResponse"}
> Tests run: 3, Failures: 1, Errors: 0, Skipped: 0
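
The failure above shows a response with `"jobComplete":false` being treated as invalid. One common way to tolerate that is to poll until the job reports completion. The following is a minimal sketch in Python, not the matcher's actual Java implementation: `wait_for_query` and `fetch_response` are hypothetical, and the response is assumed to be a dict carrying a `jobComplete` key as in the log.

```python
import time


def wait_for_query(fetch_response, max_attempts=5, delay_secs=1.0,
                   sleep=time.sleep):
  """Calls fetch_response until the response reports jobComplete=True."""
  for attempt in range(max_attempts):
    response = fetch_response()
    if response.get('jobComplete'):
      return response
    # Do not sleep after the final attempt; fail immediately instead.
    if attempt < max_attempts - 1:
      sleep(delay_secs)
  raise RuntimeError('query did not complete after %d attempts' % max_attempts)
```

Passing `sleep` as a parameter keeps the loop testable without real delays.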





[jira] [Closed] (BEAM-1033) BigQueryMatcher is flaky

2018-04-04 Thread Mark Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Liu closed BEAM-1033.
--

> BigQueryMatcher is flaky
> 
>
> Key: BEAM-1033
> URL: https://issues.apache.org/jira/browse/BEAM-1033
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Pei He
>Assignee: Mark Liu
>Priority: Major
> Fix For: Not applicable
>
>
> Jenkins link:
> https://builds.apache.org/job/beam_PreCommit_MavenVerify/5145/console
> Running org.apache.beam.examples.WindowedWordCountIT
> Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 304.282 sec 
> <<< FAILURE! - in org.apache.beam.examples.WindowedWordCountIT
> testWindowedWordCountInBatch(org.apache.beam.examples.WindowedWordCountIT)  
> Time elapsed: 304.282 sec  <<< FAILURE!
> java.lang.AssertionError: 
> Expected: Expected checksum is (cd5b52939257e12428a9fa085c32a84dd209b180)
>  but: Invalid BigQuery response: 
> {"jobComplete":false,"jobReference":{"jobId":"job_0STNX_OD83tQOzo6MvmqXCrk61U","projectId":"apache-beam-testing"},"kind":"bigquery#queryResponse"}
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
>   at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
>   at 
> org.apache.beam.runners.dataflow.testing.TestDataflowRunner.run(TestDataflowRunner.java:164)
>   at 
> org.apache.beam.runners.dataflow.testing.TestDataflowRunner.run(TestDataflowRunner.java:93)
>   at 
> org.apache.beam.runners.dataflow.testing.TestDataflowRunner.run(TestDataflowRunner.java:61)
>   at org.apache.beam.sdk.Pipeline.run(Pipeline.java:179)
>   at 
> org.apache.beam.examples.WindowedWordCount.main(WindowedWordCount.java:224)
>   at 
> org.apache.beam.examples.WindowedWordCountIT.testWindowedWordCountPipeline(WindowedWordCountIT.java:88)
>   at 
> org.apache.beam.examples.WindowedWordCountIT.testWindowedWordCountInBatch(WindowedWordCountIT.java:59)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at 
> org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:393)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> Results :
> Failed tests: 
>   
> WindowedWordCountIT.testWindowedWordCountInBatch:59->testWindowedWordCountPipeline:88
>  
> Expected: Expected checksum is (cd5b52939257e12428a9fa085c32a84dd209b180)
>  but: Invalid BigQuery response: 
> {"jobComplete":false,"jobReference":{"jobId":"job_0STNX_OD83tQOzo6MvmqXCrk61U","projectId":"apache-beam-testing"},"kind":"bigquery#queryResponse"}
> Tests run: 3, Failures: 1, Errors: 0, Skipped: 0





[jira] [Resolved] (BEAM-1583) Separate GCP test required packages from general GCP dependencies

2018-04-04 Thread Mark Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Liu resolved BEAM-1583.

   Resolution: Done
Fix Version/s: Not applicable

> Separate GCP test required packages from general GCP dependencies
> -
>
> Key: BEAM-1583
> URL: https://issues.apache.org/jira/browse/BEAM-1583
> Project: Beam
>  Issue Type: Task
>  Components: sdk-py-core
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
> Fix For: Not applicable
>
>
> This issue comes from discussion under:
> https://github.com/apache/beam/pull/2064#discussion_r103755653
> If more GCP dependencies are introduced for test-only purposes, consider 
> moving them to a separate group.





[jira] [Commented] (BEAM-1583) Separate GCP test required packages from general GCP dependencies

2018-04-04 Thread Mark Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16426032#comment-16426032
 ] 

Mark Liu commented on BEAM-1583:


Currently, there is a separate dependency group GCP_REQUIREMENTS in setup.py. 
People can choose to install GCP related dependencies separately.

I think this satisfies the intent of this JIRA. I will close it; feel free to 
reopen it if further improvement is needed.
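
For readers unfamiliar with dependency groups, the arrangement described here maps onto the `extras_require` argument of setuptools. The following is a minimal sketch with hypothetical package names; the real lists live in Beam's setup.py, and the `gcp_test` group is only an illustration of the separation the issue asked about, not something the project is known to define.

```python
# Hypothetical package names, for illustration only.
REQUIRED_PACKAGES = ['example-core-lib>=1.0']
GCP_REQUIREMENTS = ['example-gcp-client>=0.1']
GCP_TEST_REQUIREMENTS = ['example-gcp-test-helper>=0.1']

# This dict would be passed as setup(..., extras_require=EXTRAS_REQUIRE).
EXTRAS_REQUIRE = {
    'gcp': GCP_REQUIREMENTS,
    'gcp_test': GCP_REQUIREMENTS + GCP_TEST_REQUIREMENTS,
}


def packages_for(extras):
  """Returns the requirements selected by the given extras, without repeats."""
  packages = list(REQUIRED_PACKAGES)
  for extra in extras:
    for requirement in EXTRAS_REQUIRE[extra]:
      if requirement not in packages:
        packages.append(requirement)
  return packages
```

With such a layout, a user who wants the GCP group installs it with a command of the form `pip install apache-beam[gcp]`.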

> Separate GCP test required packages from general GCP dependencies
> -
>
> Key: BEAM-1583
> URL: https://issues.apache.org/jira/browse/BEAM-1583
> Project: Beam
>  Issue Type: Task
>  Components: sdk-py-core
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
> Fix For: Not applicable
>
>
> This issue comes from discussion under:
> https://github.com/apache/beam/pull/2064#discussion_r103755653
> If more GCP dependencies are introduced for test-only purposes, consider 
> moving them to a separate group.





[jira] [Closed] (BEAM-1583) Separate GCP test required packages from general GCP dependencies

2018-04-04 Thread Mark Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Liu closed BEAM-1583.
--

> Separate GCP test required packages from general GCP dependencies
> -
>
> Key: BEAM-1583
> URL: https://issues.apache.org/jira/browse/BEAM-1583
> Project: Beam
>  Issue Type: Task
>  Components: sdk-py-core
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Major
> Fix For: Not applicable
>
>
> This issue comes from discussion under:
> https://github.com/apache/beam/pull/2064#discussion_r103755653
> If more GCP dependencies are introduced for test-only purposes, consider 
> moving them to a separate group.





Build failed in Jenkins: beam_PerformanceTests_Spark #1550

2018-04-04 Thread Apache Jenkins Server
See 


Changes:

[rmannibucau] [BEAM-3993] read gitignore and add it in rat exclusions

[lcwik] [BEAM-3993] Remove duplicate definitions between .gitignore and

[herohde] [BEAM-3982] Register transform types and functions

[lcwik] BEAM-3256 Add archetype testing/generation to existing GradleBuild

[lcwik] [BEAM-3250] Creating a gradle Jenkins config for Flink PostCommit.

[lcwik] Replace Maven based Flink ValidatesRunner postcommit with Gradle based

[lcwik] [BEAM-3249] Drop Java Maven PreCommit.

[aaltay] Fix Python streaming sordcount IT to unblock PostCommit (#5015)

--
[...truncated 89.79 KB...]
'apache-beam-testing:bqjob_r233028e4761af89e_0162920d9004_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)

2018-04-04 19:06:26,971 56d5f986 MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-04-04 19:06:49,404 56d5f986 MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-04-04 19:06:51,609 56d5f986 MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1
STDOUT: Upload complete.
Waiting on bqjob_r462c8d483d1b33c5_0162920deffe_1 ... (0s) Current status: 
RUNNING 
 Waiting on bqjob_r462c8d483d1b33c5_0162920deffe_1 ... (0s) 
Current status: DONE   
BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r462c8d483d1b33c5_0162920deffe_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)

2018-04-04 19:06:51,609 56d5f986 MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-04-04 19:07:06,702 56d5f986 MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-04-04 19:07:08,891 56d5f986 MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1
STDOUT: Upload complete.
Waiting on bqjob_r16723ef8fe95e11c_0162920e3363_1 ... (0s) Current status: 
RUNNING 
 Waiting on bqjob_r16723ef8fe95e11c_0162920e3363_1 ... (0s) 
Current status: DONE   
BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r16723ef8fe95e11c_0162920e3363_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)

2018-04-04 19:07:08,892 56d5f986 MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-04-04 19:07:26,414 56d5f986 MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-04-04 19:07:28,599 56d5f986 MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 
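
The repeated error suggests that the uploaded JSON carries `timestamp` as a number while the existing table column is TIMESTAMP. The following is a minimal sketch, assuming the input is newline-delimited JSON objects, for spotting such records before running the load; `numeric_timestamp_lines` is a hypothetical helper and is not part of the benchmark tooling shown in the log.

```python
import json


def numeric_timestamp_lines(ndjson_text, field='timestamp'):
  """Returns 1-based line numbers whose `field` value is a JSON number."""
  bad = []
  for number, line in enumerate(ndjson_text.splitlines(), start=1):
    if not line.strip():
      continue  # Ignore blank lines between records.
    value = json.loads(line).get(field)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, (int, float)) and not isinstance(value, bool):
      bad.append(number)
  return bad
```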

[jira] [Work logged] (BEAM-4011) Python SDK: add glob support for HDFS

2018-04-04 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4011?focusedWorklogId=87724&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-87724
 ]

ASF GitHub Bot logged work on BEAM-4011:


Author: ASF GitHub Bot
Created on: 04/Apr/18 19:03
Start Date: 04/Apr/18 19:03
Worklog Time Spent: 10m 
  Work Description: udim opened a new pull request #5024: [BEAM-4011] 
Normalize Filesystems.match() glob behavior.
URL: https://github.com/apache/beam/pull/5024
 
 
   - Introduces the abstract method FileSystem.list(), which lists a
   directory or prefix.
   - Implements FileSystem.match(), which is no longer abstract and unifies
   glob behavior using fnmatch.fnmatch.
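
The split between an abstract `list()` and a shared `match()` can be sketched as follows. This is a simplified illustration, not Beam's actual class: the signatures, the `_literal_prefix` helper and `InMemoryFileSystem` are assumptions made for the example.

```python
import abc
import fnmatch


class FileSystem(abc.ABC):
  """Hypothetical simplified interface; not Beam's actual class."""

  @abc.abstractmethod
  def list(self, prefix):
    """Returns every path under the given directory or prefix."""

  def match(self, pattern):
    """Returns paths matching a glob pattern; shared by all subclasses."""
    prefix = self._literal_prefix(pattern)
    return sorted(p for p in self.list(prefix) if fnmatch.fnmatch(p, pattern))

  @staticmethod
  def _literal_prefix(pattern):
    # Longest leading part of the pattern with no glob metacharacters.
    for index, char in enumerate(pattern):
      if char in '*?[':
        return pattern[:index]
    return pattern


class InMemoryFileSystem(FileSystem):
  """Toy backend that only has to implement list()."""

  def __init__(self, paths):
    self._paths = tuple(paths)

  def list(self, prefix):
    return [p for p in self._paths if p.startswith(prefix)]
```

Note that `fnmatch.fnmatch` lets `*` match path separators, so in this sketch a pattern such as `dir/*.txt` also matches files in subdirectories.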
   




Issue Time Tracking
---

Worklog Id: (was: 87724)
Time Spent: 10m
Remaining Estimate: 0h

> Python SDK: add glob support for HDFS
> -
>
> Key: BEAM-4011
> URL: https://issues.apache.org/jira/browse/BEAM-4011
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>





