[jira] [Commented] (BEAM-3396) Update docker development images to use the Gradle build

2018-04-11 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434606#comment-16434606
 ] 

Ismaël Mejía commented on BEAM-3396:


Yes I refer to [these 
images|https://github.com/apache/beam/tree/master/sdks/java/build-tools/src/main/resources/docker]
 and the [documentation|https://beam.apache.org/contribute/docker-images/]. 
Changes to the images will be infrequent. These images will probably be 
migrated/merged with the ongoing work on using Docker agents on Jenkins by 
[~alanmyrvold] and [~yifanzou] in the future.
I will move it into BEAM-4045 because it is definitely not a blocker.

> Update docker development images to use the Gradle build
> 
>
> Key: BEAM-3396
> URL: https://issues.apache.org/jira/browse/BEAM-3396
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Ismaël Mejía
>Priority: Minor
>
> The docker development images introduced recently are part of the ongoing 
> work on getting reproducible builds on Beam and they should be updated as 
> part of the move to gradle.
> https://beam.apache.org/contribute/docker-images/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3714) JdbcIO.read() should create a forward-only, read-only result set

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3714?focusedWorklogId=90203&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90203
 ]

ASF GitHub Bot logged work on BEAM-3714:


Author: ASF GitHub Bot
Created on: 11/Apr/18 22:38
Start Date: 11/Apr/18 22:38
Worklog Time Spent: 10m 
  Work Description: jkff commented on issue #4786: [BEAM-3714]modified 
result set to be forward only and read only
URL: https://github.com/apache/beam/pull/4786#issuecomment-380617977
 
 
   (sorry, was on leave) - Okay, let's keep the parameter then. I'll be happy 
to merge after the following:
   
   - Please rebase
   - Please document the `withFetchSize` method and say that it should be used 
ONLY if the default value produces out-of-memory errors.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 90203)
Time Spent: 2h 10m  (was: 2h)

> JdbcIO.read() should create a forward-only, read-only result set
> 
>
> Key: BEAM-3714
> URL: https://issues.apache.org/jira/browse/BEAM-3714
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-jdbc
>Reporter: Eugene Kirpichov
>Assignee: Innocent
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> [https://stackoverflow.com/questions/48784889/streaming-data-from-cloudsql-into-dataflow/48819934#48819934]
>  - a user is trying to load a large table from MySQL, and the MySQL JDBC 
> driver requires special measures when loading large result sets.
> JdbcIO currently calls simply "connection.prepareStatement(query)" 
> https://github.com/apache/beam/blob/bb8c12c4956cbe3c6f2e57113e7c0ce2a5c05009/sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java#L508
>  - it should specify type TYPE_FORWARD_ONLY and concurrency CONCUR_READ_ONLY 
> - these values should always be used.
> Seems that different databases have different requirements for streaming 
> result sets.
> E.g. MySQL requires setting fetch size; PostgreSQL says "The Connection must 
> not be in autocommit mode." 
> https://jdbc.postgresql.org/documentation/head/query.html#query-with-cursor . 
> Oracle, I think, doesn't have any special requirements but I don't know. 
> Fetch size should probably still be set to a reasonably large value.
> Seems that the common denominator of these requirements is: set fetch size to 
> a reasonably large but not maximum value; disable autocommit (there's nothing 
> to commit in read() anyway).
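The common denominator described above — pull rows in moderately sized batches rather than materializing the whole result set — can be illustrated outside JDBC. The following is a minimal Python sketch (not Beam/JdbcIO code) using the standard-library sqlite3 module, where `fetchmany()` plays the role of the JDBC fetch size; the names `stream_rows` and `fetch_size` are illustrative, and the actual fix belongs in JdbcIO's Java code:

```python
# Illustrative sketch: stream a large result set in batches of a fixed
# "fetch size" instead of loading all rows into memory at once.
import sqlite3

def stream_rows(conn, query, fetch_size=10_000):
    """Yield rows one at a time, pulling them from the DB in batches."""
    cursor = conn.cursor()
    cursor.execute(query)
    while True:
        batch = cursor.fetchmany(fetch_size)
        if not batch:
            break
        for row in batch:
            yield row

# Tiny demo with an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(100)])
total = sum(n for (n,) in stream_rows(conn, "SELECT n FROM t", fetch_size=7))
```

The deliberately small batch size here only exists to exercise the batching loop; in practice the fetch size would be "reasonably large but not maximum", as the issue suggests.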





[jira] [Created] (BEAM-4055) Migrate Python SDK off apitools

2018-04-11 Thread Chamikara Jayalath (JIRA)
Chamikara Jayalath created BEAM-4055:


 Summary: Migrate Python SDK off apitools
 Key: BEAM-4055
 URL: https://issues.apache.org/jira/browse/BEAM-4055
 Project: Beam
  Issue Type: Improvement
  Components: sdk-py-core
Reporter: Chamikara Jayalath


The apitools library used by the Python SDK for connecting to various GCP 
services (Dataflow, GCS, BigQuery) is not being actively developed. We should 
move off this library to better-supported clients, for example [1][2].

[1] [https://developers.google.com/api-client-library/python/]

[2] [https://googlecloudplatform.github.io/google-cloud-python/]

 





[jira] [Commented] (BEAM-4055) Migrate Python SDK off apitools

2018-04-11 Thread Chamikara Jayalath (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-4055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434667#comment-16434667
 ] 

Chamikara Jayalath commented on BEAM-4055:
--

cc: [~altay]

> Migrate Python SDK off apitools
> ---
>
> Key: BEAM-4055
> URL: https://issues.apache.org/jira/browse/BEAM-4055
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Chamikara Jayalath
>Priority: Major
>
> The apitools library used by the Python SDK for connecting to various GCP 
> services (Dataflow, GCS, BigQuery) is not being actively developed. We should 
> move off this library to better-supported clients, for example [1][2].
> [1] [https://developers.google.com/api-client-library/python/]
> [2] [https://googlecloudplatform.github.io/google-cloud-python/]
>  





[jira] [Assigned] (BEAM-3937) Track performance of output_counter before feature enabled

2018-04-11 Thread Boyuan Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boyuan Zhang reassigned BEAM-3937:
--

Assignee: Boyuan Zhang  (was: Robert Bradshaw)

> Track performance of output_counter before feature enabled
> --
>
> Key: BEAM-3937
> URL: https://issues.apache.org/jira/browse/BEAM-3937
> Project: Beam
>  Issue Type: Task
>  Components: sdk-py-harness
>Reporter: Boyuan Zhang
>Assignee: Boyuan Zhang
>Priority: Major
>
> Track the performance of output_counter before the feature is enabled by 
> default. We need to make sure there is no adverse effect before enabling it.
> Related PR: [https://github.com/apache/beam/pull/4741]
>  





[jira] [Closed] (BEAM-3779) Enable deserialization of a non-java Pipeline

2018-04-11 Thread Ben Sidhom (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Sidhom closed BEAM-3779.

   Resolution: Won't Do
Fix Version/s: Not applicable

> Enable deserialization of a non-java Pipeline
> -
>
> Key: BEAM-3779
> URL: https://issues.apache.org/jira/browse/BEAM-3779
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Thomas Groh
>Assignee: Ben Sidhom
>Priority: Major
>  Labels: portability
> Fix For: Not applicable
>
>
> Currently, rehydrating a Pipeline works on the PCollection and PTransform 
> levels with the use of RawPTransform, but the runner-core-construction-java 
> utilities will throw if the runner attempts to deserialize a 
> WindowingStrategy or Coder which contains non-Java custom (or otherwise 
> unknown) Coders or WindowFns.
>  
> Use a strategy like RawPTransform to deserialize WindowFns and Coders, so 
> they can be interacted with as intermediate tokens in the java form.





Jenkins build is back to normal : beam_PostCommit_Java_ValidatesRunner_Flink_Gradle #86

2018-04-11 Thread Apache Jenkins Server
See 




[jira] [Updated] (BEAM-3396) Update docker development images to use the Gradle build

2018-04-11 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía updated BEAM-3396:
---
Parent Issue: BEAM-4045  (was: BEAM-3249)

> Update docker development images to use the Gradle build
> 
>
> Key: BEAM-3396
> URL: https://issues.apache.org/jira/browse/BEAM-3396
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system
>Reporter: Ismaël Mejía
>Priority: Minor
>
> The docker development images introduced recently are part of the ongoing 
> work on getting reproducible builds on Beam and they should be updated as 
> part of the move to gradle.
> https://beam.apache.org/contribute/docker-images/





[jira] [Created] (BEAM-4054) Generalize DATAFLOW_DISTRIBUTION counter if necessary

2018-04-11 Thread Boyuan Zhang (JIRA)
Boyuan Zhang created BEAM-4054:
--

 Summary: Generalize DATAFLOW_DISTRIBUTION counter if necessary
 Key: BEAM-4054
 URL: https://issues.apache.org/jira/browse/BEAM-4054
 Project: Beam
  Issue Type: Task
  Components: sdk-py-core
Reporter: Boyuan Zhang
Assignee: Ahmet Altay


Generalize the DATAFLOW_DISTRIBUTION counter for other cases (e.g. when Beam on 
Flink / portability emerges) if necessary.





[jira] [Work logged] (BEAM-4053) Go should have a postcommit run on a cron schedule

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4053?focusedWorklogId=90159&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90159
 ]

ASF GitHub Bot logged work on BEAM-4053:


Author: ASF GitHub Bot
Created on: 11/Apr/18 20:36
Start Date: 11/Apr/18 20:36
Worklog Time Spent: 10m 
  Work Description: alanmyrvold commented on issue #5101: [BEAM-4053] Add a 
Go postcommit
URL: https://github.com/apache/beam/pull/5101#issuecomment-380587409
 
 
   Passed, 
https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Go_GradleBuild/
   @aaltay - can you merge?




Issue Time Tracking
---

Worklog Id: (was: 90159)
Time Spent: 1.5h  (was: 1h 20m)

> Go should have a postcommit run on a cron schedule
> --
>
> Key: BEAM-4053
> URL: https://issues.apache.org/jira/browse/BEAM-4053
> Project: Beam
>  Issue Type: New Feature
>  Components: build-system
>Reporter: Alan Myrvold
>Assignee: Alan Myrvold
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> To allow assessing if it is reliable, there should be a Go postcommit, 
> initially the same as the precommit.





Build failed in Jenkins: beam_PostCommit_Python_Verify #4665

2018-04-11 Thread Apache Jenkins Server
See 


--
[...truncated 1.04 MB...]
copying apache_beam/runners/portability/universal_local_runner_main.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/test/__init__.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/test
copying apache_beam/runners/worker/__init__.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/bundle_processor.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/data_plane.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/data_plane_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/log_handler.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/log_handler_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/logger.pxd -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/logger.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/logger_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/opcounters.pxd -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/opcounters.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/opcounters_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/operation_specs.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/operations.pxd -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/operations.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sdk_worker.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sdk_worker_main.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sdk_worker_main_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sdk_worker_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sideinputs.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sideinputs_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/statesampler.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/statesampler_fast.pyx -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/statesampler_slow.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/statesampler_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/worker_id_interceptor.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/worker_id_interceptor_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/testing/__init__.py -> 
apache-beam-2.5.0.dev0/apache_beam/testing
copying apache_beam/testing/pipeline_verifiers.py -> 
apache-beam-2.5.0.dev0/apache_beam/testing
copying apache_beam/testing/pipeline_verifiers_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/testing
copying apache_beam/testing/test_pipeline.py -> 
apache-beam-2.5.0.dev0/apache_beam/testing
copying apache_beam/testing/test_pipeline_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/testing
copying apache_beam/testing/test_stream.py -> 
apache-beam-2.5.0.dev0/apache_beam/testing
copying apache_beam/testing/test_stream_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/testing
copying apache_beam/testing/test_utils.py -> 
apache-beam-2.5.0.dev0/apache_beam/testing
copying apache_beam/testing/test_utils_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/testing
copying apache_beam/testing/util.py -> 
apache-beam-2.5.0.dev0/apache_beam/testing
copying apache_beam/testing/util_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/testing
copying apache_beam/testing/data/standard_coders.yaml -> 
apache-beam-2.5.0.dev0/apache_beam/testing/data
copying apache_beam/testing/data/trigger_transcripts.yaml -> 
apache-beam-2.5.0.dev0/apache_beam/testing/data
copying apache_beam/tools/__init__.py -> 
apache-beam-2.5.0.dev0/apache_beam/tools
copying apache_beam/tools/distribution_counter_microbenchmark.py -> 
apache-beam-2.5.0.dev0/apache_beam/tools
copying apache_beam/tools/map_fn_microbenchmark.py -> 
apache-beam-2.5.0.dev0/apache_beam/tools
copying apache_beam/tools/utils.py -> apache-beam-2.5.0.dev0/apache_beam/tools

[jira] [Work logged] (BEAM-4053) Go should have a postcommit run on a cron schedule

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4053?focusedWorklogId=90169&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90169
 ]

ASF GitHub Bot logged work on BEAM-4053:


Author: ASF GitHub Bot
Created on: 11/Apr/18 21:13
Start Date: 11/Apr/18 21:13
Worklog Time Spent: 10m 
  Work Description: aaltay closed pull request #5101: [BEAM-4053] Add a Go 
postcommit
URL: https://github.com/apache/beam/pull/5101
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/.test-infra/jenkins/job_beam_PostCommit_Go_GradleBuild.groovy 
b/.test-infra/jenkins/job_beam_PostCommit_Go_GradleBuild.groovy
new file mode 100644
index 0000000..0e667b4c109
--- /dev/null
+++ b/.test-infra/jenkins/job_beam_PostCommit_Go_GradleBuild.groovy
@@ -0,0 +1,53 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import common_job_properties
+
+// This is the Go postcommit which runs a gradle build, and the current set
+// of postcommit tests.
+job('beam_PostCommit_Go_GradleBuild') {
+  description('Runs Go PostCommit tests against master.')
+
+  // Execute concurrent builds if necessary.
+  concurrentBuild()
+
+  // Set common parameters.
+  common_job_properties.setTopLevelMainJobProperties(
+delegate,
+'master',
+150)
+
+  // Sets that this is a PostCommit job.
+  common_job_properties.setPostCommit(delegate, '15 */6 * * *', false)
+
+  def gradle_command_line = './gradlew ' + 
common_job_properties.gradle_switches.join(' ') + ' :goPostCommit'
+
+  // Allows triggering this build against pull requests.
+  common_job_properties.enablePhraseTriggeringFromPullRequest(
+  delegate,
+  gradle_command_line,
+  'Run Go PostCommit')
+
+  steps {
+gradle {
+  rootBuildScriptDir(common_job_properties.checkoutDir)
+  tasks(':goPostCommit')
+  common_job_properties.setGradleSwitches(delegate)
+}
+  }
+}
diff --git a/build.gradle b/build.gradle
index 102fffbe687..128ac26e686 100644
--- a/build.gradle
+++ b/build.gradle
@@ -132,6 +132,11 @@ task goPreCommit() {
   dependsOn ":beam-sdks-go:test"
 }
 
+task goPostCommit() {
+  // Same currently as precommit, but duplicated to provide a clearer signal 
of reliability.
+  dependsOn ":goPreCommit"
+}
+
 task pythonPreCommit() {
   dependsOn ":rat"
   dependsOn ":beam-sdks-python:check"


 




Issue Time Tracking
---

Worklog Id: (was: 90169)
Time Spent: 1h 40m  (was: 1.5h)

> Go should have a postcommit run on a cron schedule
> --
>
> Key: BEAM-4053
> URL: https://issues.apache.org/jira/browse/BEAM-4053
> Project: Beam
>  Issue Type: New Feature
>  Components: build-system
>Reporter: Alan Myrvold
>Assignee: Alan Myrvold
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> To allow assessing if it is reliable, there should be a Go postcommit, 
> initially the same as the precommit.





[beam] branch master updated (13a65c6 -> c85f998)

2018-04-11 Thread altay
This is an automated email from the ASF dual-hosted git repository.

altay pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from 13a65c6  [BEAM-3249] Only generate all artifacts when publishing.
 add f7a4192  [BEAM-4053] Add a Go postcommit
 new c85f998  Merge pull request #5101 from alanmyrvold/alan-pre-post

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 ...y => job_beam_PostCommit_Go_GradleBuild.groovy} | 24 ++
 build.gradle   |  5 +
 2 files changed, 21 insertions(+), 8 deletions(-)
 copy .test-infra/jenkins/{job_beam_PreCommit_Go_GradleBuild.groovy => 
job_beam_PostCommit_Go_GradleBuild.groovy} (68%)

-- 
To stop receiving notification emails like this one, please contact
al...@apache.org.


[beam] 01/01: Merge pull request #5101 from alanmyrvold/alan-pre-post

2018-04-11 Thread altay
This is an automated email from the ASF dual-hosted git repository.

altay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit c85f9988728c61f87714a329bf920640a3443c5a
Merge: 13a65c6 f7a4192
Author: Ahmet Altay 
AuthorDate: Wed Apr 11 14:13:51 2018 -0700

Merge pull request #5101 from alanmyrvold/alan-pre-post

[BEAM-4053] Add a Go postcommit

 .../job_beam_PostCommit_Go_GradleBuild.groovy  | 53 ++
 build.gradle   |  5 ++
 2 files changed, 58 insertions(+)

-- 
To stop receiving notification emails like this one, please contact
al...@apache.org.


[jira] [Work logged] (BEAM-3956) Stacktraces from exceptions in user code should be preserved in the Python SDK

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3956?focusedWorklogId=90172&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90172
 ]

ASF GitHub Bot logged work on BEAM-3956:


Author: ASF GitHub Bot
Created on: 11/Apr/18 21:24
Start Date: 11/Apr/18 21:24
Worklog Time Spent: 10m 
  Work Description: shoyer commented on issue #4959: [BEAM-3956] Preserve 
stacktraces for Python exceptions
URL: https://github.com/apache/beam/pull/4959#issuecomment-380600389
 
 
   Is there anything else I can do to help move this along?




Issue Time Tracking
---

Worklog Id: (was: 90172)
Time Spent: 6h 40m  (was: 6.5h)

> Stacktraces from exceptions in user code should be preserved in the Python SDK
> --
>
> Key: BEAM-3956
> URL: https://issues.apache.org/jira/browse/BEAM-3956
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Stephan Hoyer
>Priority: Major
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>
> Currently, Beam's Python SDK loses stacktraces for exceptions. It does 
> helpfully add a tag like "[while running StageA]" to exception error 
> messages, but that doesn't include the stacktrace of Python functions being 
> called.
> Including the full stacktraces would make a big difference for the ease of 
> debugging Beam pipelines when things go wrong.
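The idea described above can be sketched in plain Python (this is not the SDK's actual implementation, and it uses the Python 3 `raise ... from` idiom, whereas the SDK of this era also targeted Python 2): annotate the error message with the stage tag while chaining the original exception, so the user function still appears in the formatted traceback. The names `user_fn` and `run_stage` are hypothetical:

```python
# Sketch: re-raise a user-code exception with an annotating stage tag while
# preserving the original traceback via exception chaining.
import traceback

def user_fn(x):
    return 1 // x  # raises ZeroDivisionError for x == 0

def run_stage(fn, value, stage_name):
    try:
        return fn(value)
    except Exception as exc:
        # Annotate the message, but chain the original exception so its
        # traceback (including the user_fn frame) is preserved.
        raise RuntimeError(
            "%s [while running %s]" % (exc, stage_name)) from exc

try:
    run_stage(user_fn, 0, "StageA")
except RuntimeError as err:
    tb_text = "".join(
        traceback.format_exception(type(err), err, err.__traceback__))
```

Formatting the chained exception shows both the "[while running StageA]" tag and the frames of the user function, which is the behavior the issue asks for.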





Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Flink_Gradle #85

2018-04-11 Thread Apache Jenkins Server
See 


Changes:

[lcwik] [BEAM-3249] Only generate all artifacts when publishing.

--
[...truncated 72.67 MB...]
Apr 11, 2018 8:27:30 PM org.apache.flink.runtime.executiongraph.Execution 
transitionState
INFO: ToKeyedWorkItem (1/1) (00526b19da0e45dede51898be5e1b66b) switched 
from RUNNING to FINISHED.
Apr 11, 2018 8:27:30 PM org.apache.flink.runtime.client.JobClientActor 
logAndPrintMessage
INFO: 04/11/2018 20:27:30   ToKeyedWorkItem(1/1) switched to FINISHED 

org.apache.beam.sdk.transforms.CombineTest > testSimpleCombineWithContextEmpty 
STANDARD_OUT
04/11/2018 20:27:30 ToKeyedWorkItem(1/1) switched to FINISHED 

org.apache.beam.sdk.transforms.CombineTest > testSimpleCombineWithContextEmpty 
STANDARD_ERROR
Apr 11, 2018 8:27:30 PM org.apache.flink.runtime.taskmanager.Task 
transitionState
INFO: 
View.AsSingleton/Combine.GloballyAsSingletonView/View.CreatePCollectionView/Combine.globally(Concatenate)/Combine.perKey(Concatenate)
 -> 
View.AsSingleton/Combine.GloballyAsSingletonView/View.CreatePCollectionView/Combine.globally(Concatenate)/Values/Values/Map/ParMultiDo(Anonymous)
 -> (Map, Map) (1/1) (6d0046157b699886a0f6bd0dfaa376c5) switched from RUNNING 
to FINISHED.
Apr 11, 2018 8:27:30 PM org.apache.flink.runtime.taskmanager.Task run
INFO: Freeing task resources for 
View.AsSingleton/Combine.GloballyAsSingletonView/View.CreatePCollectionView/Combine.globally(Concatenate)/Combine.perKey(Concatenate)
 -> 
View.AsSingleton/Combine.GloballyAsSingletonView/View.CreatePCollectionView/Combine.globally(Concatenate)/Values/Values/Map/ParMultiDo(Anonymous)
 -> (Map, Map) (1/1) (6d0046157b699886a0f6bd0dfaa376c5).
Apr 11, 2018 8:27:30 PM org.apache.flink.runtime.taskmanager.Task run
INFO: Ensuring all FileSystem streams are closed for task 
View.AsSingleton/Combine.GloballyAsSingletonView/View.CreatePCollectionView/Combine.globally(Concatenate)/Combine.perKey(Concatenate)
 -> 
View.AsSingleton/Combine.GloballyAsSingletonView/View.CreatePCollectionView/Combine.globally(Concatenate)/Values/Values/Map/ParMultiDo(Anonymous)
 -> (Map, Map) (1/1) (6d0046157b699886a0f6bd0dfaa376c5) [FINISHED]
Apr 11, 2018 8:27:30 PM grizzled.slf4j.Logger info
INFO: Un-registering task and sending final execution state FINISHED to 
JobManager for task 
View.AsSingleton/Combine.GloballyAsSingletonView/View.CreatePCollectionView/Combine.globally(Concatenate)/Combine.perKey(Concatenate)
 -> 
View.AsSingleton/Combine.GloballyAsSingletonView/View.CreatePCollectionView/Combine.globally(Concatenate)/Values/Values/Map/ParMultiDo(Anonymous)
 -> (Map, Map) (6d0046157b699886a0f6bd0dfaa376c5)
Apr 11, 2018 8:27:30 PM org.apache.flink.runtime.executiongraph.Execution 
transitionState
INFO: 
View.AsSingleton/Combine.GloballyAsSingletonView/View.CreatePCollectionView/Combine.globally(Concatenate)/Combine.perKey(Concatenate)
 -> 
View.AsSingleton/Combine.GloballyAsSingletonView/View.CreatePCollectionView/Combine.globally(Concatenate)/Values/Values/Map/ParMultiDo(Anonymous)
 -> (Map, Map) (1/1) (6d0046157b699886a0f6bd0dfaa376c5) switched from RUNNING 
to FINISHED.
Apr 11, 2018 8:27:30 PM org.apache.flink.runtime.client.JobClientActor 
logAndPrintMessage
INFO: 04/11/2018 20:27:30   
View.AsSingleton/Combine.GloballyAsSingletonView/View.CreatePCollectionView/Combine.globally(Concatenate)/Combine.perKey(Concatenate)
 -> 
View.AsSingleton/Combine.GloballyAsSingletonView/View.CreatePCollectionView/Combine.globally(Concatenate)/Values/Values/Map/ParMultiDo(Anonymous)
 -> (Map, Map)(1/1) switched to FINISHED 

org.apache.beam.sdk.transforms.CombineTest > testSimpleCombineWithContextEmpty 
STANDARD_OUT
04/11/2018 20:27:30 
View.AsSingleton/Combine.GloballyAsSingletonView/View.CreatePCollectionView/Combine.globally(Concatenate)/Combine.perKey(Concatenate)
 -> 
View.AsSingleton/Combine.GloballyAsSingletonView/View.CreatePCollectionView/Combine.globally(Concatenate)/Values/Values/Map/ParMultiDo(Anonymous)
 -> (Map, Map)(1/1) switched to FINISHED 

org.apache.beam.sdk.transforms.CombineTest > testSimpleCombineWithContextEmpty 
STANDARD_ERROR
Apr 11, 2018 8:27:30 PM org.apache.flink.runtime.taskmanager.Task 
transitionState
INFO: 
Combine.perKey(TestCombineFnWithContext)/Combine.GroupedValues/ParDo(Anonymous)/ParMultiDo(Anonymous)
 -> PAssert$165/GroupGlobally/Window.Into()/Window.Assign.out -> 
PAssert$165/GroupGlobally/GatherAllOutputs/Reify.Window/ParDo(Anonymous)/ParMultiDo(Anonymous)
 -> 
PAssert$165/GroupGlobally/GatherAllOutputs/WithKeys/AddKeys/Map/ParMultiDo(Anonymous)
 -> PAssert$165/GroupGlobally/GatherAllOutputs/Window.Into()/Window.Assign.out 
-> ToKeyedWorkItem (1/1) (15c50fd3efc7f4c97b56727aba168f5d) switched from 
RUNNING to FINISHED.
Apr 11, 2018 8:27:30 PM org.apache.flink.runtime.taskmanager.Task 
transitionState

[jira] [Commented] (BEAM-3521) Review and update the references of maven to gradle in the website

2018-04-11 Thread Scott Wegner (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434556#comment-16434556
 ] 

Scott Wegner commented on BEAM-3521:


I am working on migrating Eclipse documentation.

> Review and update the references of maven to gradle in the website
> --
>
> Key: BEAM-3521
> URL: https://issues.apache.org/jira/browse/BEAM-3521
> Project: Beam
>  Issue Type: Sub-task
>  Components: build-system, website
>Reporter: Ismaël Mejía
>Assignee: Scott Wegner
>Priority: Major
>
> I suppose the only Maven reference that will stay is the one for the 
> maven-archetype, but this will be decided later on.





Jenkins build is back to normal : beam_PostCommit_Java_GradleBuild #47

2018-04-11 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-3479) Add a regression test for the DoFn classloader selection

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3479?focusedWorklogId=90144&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90144
 ]

ASF GitHub Bot logged work on BEAM-3479:


Author: ASF GitHub Bot
Created on: 11/Apr/18 19:36
Start Date: 11/Apr/18 19:36
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on a change in pull request 
#4412: [BEAM-3479] adding a test to ensure the right classloader is used to 
defined the dofninvoker
URL: https://github.com/apache/beam/pull/4412#discussion_r180873630
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/reflect/ByteBuddyDoFnInvokerFactoryTest.java
 ##
 @@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.transforms.reflect;
+
+import static java.util.Arrays.asList;
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.fail;
+
+import org.apache.beam.sdk.testing.InterceptingUrlClassLoader;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.junit.Test;
+
+/**
+ * Tests for the proxy generator based on byte buddy.
+ */
+public class ByteBuddyDoFnInvokerFactoryTest {
+  /**
+   * Ensuring we define the subclass using bytebuddy in the right classloader,
+   * i.e. the doFn classloader and not beam classloader.
+   */
+  @Test
+  public void validateProxyClassLoaderSelectionLogic() throws Exception {
+    final ClassLoader testLoader = Thread.currentThread().getContextClassLoader();
+    final ClassLoader loader = new InterceptingUrlClassLoader(testLoader, MyDoFn.class.getName());
+    final Class<? extends DoFn<?, ?>> source = (Class<? extends DoFn<?, ?>>) loader.loadClass(
+        "org.apache.beam.sdk.transforms.reflect.ByteBuddyDoFnInvokerFactoryTest$MyDoFn");
+    assertEquals(loader, source.getClassLoader()); // precondition check
+    final String proxyName = source.getName()
 
 Review comment:
   What I mean is this: if the classloader is equal, then what is the problem 
with the other conditions you are checking?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 90144)
Time Spent: 2h 20m  (was: 2h 10m)

> Add a regression test for the DoFn classloader selection
> 
>
> Key: BEAM-3479
> URL: https://issues.apache.org/jira/browse/BEAM-3479
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Romain Manni-Bucau
>Assignee: Romain Manni-Bucau
>Priority: Major
> Fix For: 2.5.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Follow up task after https://github.com/apache/beam/pull/4235 merge. This 
> task is about ensuring we test that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3997) implement HCatalog integration test

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3997?focusedWorklogId=90158&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90158
 ]

ASF GitHub Bot logged work on BEAM-3997:


Author: ASF GitHub Bot
Created on: 11/Apr/18 20:32
Start Date: 11/Apr/18 20:32
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #5097: [BEAM-3997] 
HCatalog integration test
URL: https://github.com/apache/beam/pull/5097#issuecomment-380586364
 
 
   Thanks @kkucharc for the contribution BTW.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 90158)
Time Spent: 0.5h  (was: 20m)

> implement HCatalog integration test
> ---
>
> Key: BEAM-3997
> URL: https://issues.apache.org/jira/browse/BEAM-3997
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Dariusz Aniszewski
>Assignee: Jason Kuster
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3249) Use Gradle to build/release project

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3249?focusedWorklogId=90164&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90164
 ]

ASF GitHub Bot logged work on BEAM-3249:


Author: ASF GitHub Bot
Created on: 11/Apr/18 20:47
Start Date: 11/Apr/18 20:47
Worklog Time Spent: 10m 
  Work Description: lukecwik opened a new pull request #5107: [BEAM-3249] 
Clean-up and use shaded test jars, removing evaluationDependsOn
URL: https://github.com/apache/beam/pull/5107
 
 
   
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
- [ ] Write a pull request description that is detailed enough to 
understand:
  - [ ] What the pull request does
  - [ ] Why it does it
  - [ ] How it does it
  - [ ] Why this approach
- [ ] Each commit in the pull request should have a meaningful subject line 
and body.
- [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 90164)
Time Spent: 12h 20m  (was: 12h 10m)

> Use Gradle to build/release project
> ---
>
> Key: BEAM-3249
> URL: https://issues.apache.org/jira/browse/BEAM-3249
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>Priority: Major
>  Time Spent: 12h 20m
>  Remaining Estimate: 0h
>
> I have collected data by running several builds against master using Gradle 
> and Maven without using Gradle's support for incremental builds.
> Gradle (mins)
> min: 25.04
> max: 160.14
> median: 45.78
> average: 52.19
> stdev: 30.80
> Maven (mins)
> min: 56.86
> max: 216.55
> median: 87.93
> average: 109.10
> stdev: 48.01
> I excluded a few timeouts (240 mins) that happened during the Maven build
> from its numbers, but we can see conclusively that Gradle is about twice as
> fast as Maven when the build is run on Jenkins.
> Original dev@ thread: 
> https://lists.apache.org/thread.html/225dddcfc78f39bbb296a0d2bbef1caf37e17677c7e5573f0b6fe253@%3Cdev.beam.apache.org%3E
> The data is available here 
> https://docs.google.com/spreadsheets/d/1MHVjF-xoI49_NJqEQakUgnNIQ7Qbjzu8Y1q_h3dbF1M/edit?usp=sharing
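> The summary statistics quoted above (min/max/median/average/stdev) can be
> reproduced from raw timing samples with a short sketch like the following.
> The timings in the example call are made up for illustration; the actual
> data lives in the linked spreadsheet.

```python
import statistics

def summarize(minutes):
    """Summary statistics (in minutes) for a list of build durations."""
    return {
        'min': min(minutes),
        'max': max(minutes),
        'median': statistics.median(minutes),
        'average': statistics.mean(minutes),
        'stdev': statistics.stdev(minutes),  # sample standard deviation
    }

# Example with made-up timings, not the actual Jenkins data:
print(summarize([25.04, 45.78, 52.19, 160.14]))
```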



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3326) Execute a Stage via the portability framework in the ReferenceRunner

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3326?focusedWorklogId=90163&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90163
 ]

ASF GitHub Bot logged work on BEAM-3326:


Author: ASF GitHub Bot
Created on: 11/Apr/18 20:47
Start Date: 11/Apr/18 20:47
Worklog Time Spent: 10m 
  Work Description: tgroh commented on issue #4761: [BEAM-3326] Remove 
SdkHarnessClientControlService
URL: https://github.com/apache/beam/pull/4761#issuecomment-380590353
 
 
   run java precommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 90163)
Time Spent: 6h 20m  (was: 6h 10m)

> Execute a Stage via the portability framework in the ReferenceRunner
> 
>
> Key: BEAM-3326
> URL: https://issues.apache.org/jira/browse/BEAM-3326
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-core
>Reporter: Thomas Groh
>Assignee: Thomas Groh
>Priority: Major
>  Labels: portability
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> This is the supertask for remote execution in the Universal Local Runner 
> (BEAM-2899).
> This executes a stage remotely via portability framework APIs



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[beam] branch master updated (c85f998 -> 2bc909c)

2018-04-11 Thread altay
This is an automated email from the ASF dual-hosted git repository.

altay pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from c85f998  Merge pull request #5101 from alanmyrvold/alan-pre-post
 add a3b3b45  Clean up docker networks generated by hdfs integration tests.
 add 2bc909c  Merge pull request #5100 from 
udim/hdfs-postcommit-docker-network

No new revisions were added by this update.

Summary of changes:
 .../apache_beam/io/hdfs_integration_test/hdfs_integration_test.sh   | 2 ++
 1 file changed, 2 insertions(+)

-- 
To stop receiving notification emails like this one, please contact
al...@apache.org.


[jira] [Work logged] (BEAM-4053) Go should have a postcommit run on a cron schedule

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4053?focusedWorklogId=90139&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90139
 ]

ASF GitHub Bot logged work on BEAM-4053:


Author: ASF GitHub Bot
Created on: 11/Apr/18 19:26
Start Date: 11/Apr/18 19:26
Worklog Time Spent: 10m 
  Work Description: alanmyrvold commented on issue #5101: [BEAM-4053] Add a 
Go postcommit
URL: https://github.com/apache/beam/pull/5101#issuecomment-380568021
 
 
   Run Go PostCommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 90139)
Time Spent: 1h 20m  (was: 1h 10m)

> Go should have a postcommit run on a cron schedule
> --
>
> Key: BEAM-4053
> URL: https://issues.apache.org/jira/browse/BEAM-4053
> Project: Beam
>  Issue Type: New Feature
>  Components: build-system
>Reporter: Alan Myrvold
>Assignee: Alan Myrvold
>Priority: Major
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> To allow assessing if it is reliable, there should be a Go postcommit, 
> initially the same as the precommit.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3991) Update dependency 'google-api-services-storage' to latest version

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3991?focusedWorklogId=90151&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90151
 ]

ASF GitHub Bot logged work on BEAM-3991:


Author: ASF GitHub Bot
Created on: 11/Apr/18 19:49
Start Date: 11/Apr/18 19:49
Worklog Time Spent: 10m 
  Work Description: aaltay commented on a change in pull request #5105: 
[BEAM-3991] Updates gcsio to use a API specific batch endpoint.
URL: https://github.com/apache/beam/pull/5105#discussion_r180876774
 
 

 ##
 File path: sdks/python/apache_beam/io/gcp/gcsio.py
 ##
 @@ -91,6 +91,16 @@
 # GcsIO.delete_batch().
 MAX_BATCH_OPERATION_SIZE = 100
 
+# Batch endpoint URL for GCS.
+# We have to specify an API specific endpoint here since Google APIs global
+# batch endpoints will be deprecated on 03/25/2019.
+# See
+# https://developers.googleblog.com/2018/03/discontinuing-support-for-json-rpc-and.html.
+# Currently the apitools library uses a global batch endpoint by default:
+# https://github.com/google/apitools/blob/master/apitools/base/py/batch.py#L152
+# TODO: remove this constant and its usage after apitools moves to using an API
 
 Review comment:
   Let's create a JIRA to move off of apitools. That library itself is in 
maintenance mode.
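
   For illustration, an API-specific batch endpoint of the shape discussed
above can be built like this. This is a hedged sketch: `api_batch_endpoint`
is a hypothetical helper, not Beam's or apitools' actual code, and only the
endpoint URL format comes from the quoted comment.

```python
# Google's global batch endpoint is deprecated, so an API-specific
# batch endpoint must be constructed per service.

def api_batch_endpoint(api, version, host='https://www.googleapis.com'):
    """Build an API-specific batch endpoint, e.g. for GCS (storage/v1)."""
    return '%s/batch/%s/%s' % (host, api, version)

GCS_BATCH_ENDPOINT = api_batch_endpoint('storage', 'v1')
# -> 'https://www.googleapis.com/batch/storage/v1'
```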


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 90151)
Time Spent: 20m  (was: 10m)

> Update dependency 'google-api-services-storage' to latest version
> -
>
> Key: BEAM-3991
> URL: https://issues.apache.org/jira/browse/BEAM-3991
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp, sdk-py-core
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>Priority: Blocker
> Fix For: 2.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently we use version 'v1-rev71-1.22.0' which is deprecated and about two 
> years old.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3479) Add a regression test for the DoFn classloader selection

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3479?focusedWorklogId=90157&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90157
 ]

ASF GitHub Bot logged work on BEAM-3479:


Author: ASF GitHub Bot
Created on: 11/Apr/18 20:32
Start Date: 11/Apr/18 20:32
Worklog Time Spent: 10m 
  Work Description: rmannibucau commented on a change in pull request 
#4412: [BEAM-3479] adding a test to ensure the right classloader is used to 
defined the dofninvoker
URL: https://github.com/apache/beam/pull/4412#discussion_r180888659
 
 

 ##
 File path: 
sdks/java/core/src/test/java/org/apache/beam/sdk/transforms/reflect/ByteBuddyDoFnInvokerFactoryTest.java
 ##
 @@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.transforms.reflect;
+
+import static java.util.Arrays.asList;
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.fail;
+
+import org.apache.beam.sdk.testing.InterceptingUrlClassLoader;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.junit.Test;
+
+/**
+ * Tests for the proxy generator based on byte buddy.
+ */
+public class ByteBuddyDoFnInvokerFactoryTest {
+  /**
+   * Ensuring we define the subclass using bytebuddy in the right classloader,
+   * i.e. the doFn classloader and not beam classloader.
+   */
+  @Test
+  public void validateProxyClassLoaderSelectionLogic() throws Exception {
+    final ClassLoader testLoader = Thread.currentThread().getContextClassLoader();
+    final ClassLoader loader = new InterceptingUrlClassLoader(testLoader, MyDoFn.class.getName());
+    final Class<? extends DoFn<?, ?>> source = (Class<? extends DoFn<?, ?>>) loader.loadClass(
+        "org.apache.beam.sdk.transforms.reflect.ByteBuddyDoFnInvokerFactoryTest$MyDoFn");
+    assertEquals(loader, source.getClassLoader()); // precondition check
+    final String proxyName = source.getName()
 
 Review comment:
   That the test ran well and still validates what we want even if our test
loader implementation changes and is silently broken. They are precondition
checks.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 90157)
Time Spent: 2.5h  (was: 2h 20m)

> Add a regression test for the DoFn classloader selection
> 
>
> Key: BEAM-3479
> URL: https://issues.apache.org/jira/browse/BEAM-3479
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-core
>Reporter: Romain Manni-Bucau
>Assignee: Romain Manni-Bucau
>Priority: Major
> Fix For: 2.5.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Follow up task after https://github.com/apache/beam/pull/4235 merge. This 
> task is about ensuring we test that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3997) implement HCatalog integration test

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3997?focusedWorklogId=90156&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90156
 ]

ASF GitHub Bot logged work on BEAM-3997:


Author: ASF GitHub Bot
Created on: 11/Apr/18 20:31
Start Date: 11/Apr/18 20:31
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #5097: [BEAM-3997] 
HCatalog integration test
URL: https://github.com/apache/beam/pull/5097#issuecomment-380586116
 
 
   @iemejia will you be interested in reviewing this ?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 90156)
Time Spent: 20m  (was: 10m)

> implement HCatalog integration test
> ---
>
> Key: BEAM-3997
> URL: https://issues.apache.org/jira/browse/BEAM-3997
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Dariusz Aniszewski
>Assignee: Jason Kuster
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4044) Take advantage of Calcite DDL

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4044?focusedWorklogId=90160&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90160
 ]

ASF GitHub Bot logged work on BEAM-4044:


Author: ASF GitHub Bot
Created on: 11/Apr/18 20:42
Start Date: 11/Apr/18 20:42
Worklog Time Spent: 10m 
  Work Description: kennknowles commented on a change in pull request 
#5040: [BEAM-4044] [SQL] Refresh DDL from 1.16
URL: https://github.com/apache/beam/pull/5040#discussion_r180890261
 
 

 ##
 File path: 
sdks/java/extensions/sql/src/test/java/org/apache/beam/sdk/extensions/sql/meta/provider/kafka/KafkaTableProviderTest.java
 ##
 @@ -65,17 +64,17 @@ private static Table mockTable(String name) {
 return Table.builder()
 .name(name)
 .comment(name + " table")
-.location(URI.create("kafka://localhost:2181/brokers?topic=test"))
+.location("kafka://localhost:2181/brokers?topic=test")
 
 Review comment:
   So we remove the U from URL and then put it back via `TYPE`. Can't object.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 90160)
Time Spent: 2h 40m  (was: 2.5h)

> Take advantage of Calcite DDL
> -
>
> Key: BEAM-4044
> URL: https://issues.apache.org/jira/browse/BEAM-4044
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Andrew Pilloud
>Assignee: Andrew Pilloud
>Priority: Major
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> In Calcite 1.15 support for abstract DDL moved into calcite core. We should 
> take advantage of that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-3286) Go SDK support for portable side input

2018-04-11 Thread Henning Rohde (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henning Rohde reassigned BEAM-3286:
---

Assignee: (was: Henning Rohde)

> Go SDK support for portable side input
> --
>
> Key: BEAM-3286
> URL: https://issues.apache.org/jira/browse/BEAM-3286
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-go
>Reporter: Henning Rohde
>Priority: Major
>  Labels: portability
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4051) python postcommit failure in docker-compose

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4051?focusedWorklogId=90166&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90166
 ]

ASF GitHub Bot logged work on BEAM-4051:


Author: ASF GitHub Bot
Created on: 11/Apr/18 20:57
Start Date: 11/Apr/18 20:57
Worklog Time Spent: 10m 
  Work Description: udim commented on issue #5100: [BEAM-4051] Clean up 
docker networks generated by hdfs integration tests.
URL: https://github.com/apache/beam/pull/5100#issuecomment-380593272
 
 
   R: @aaltay 
   Postcommit passes: 
https://builds.apache.org/job/beam_PostCommit_Python_Verify/4666/console


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 90166)
Time Spent: 40m  (was: 0.5h)

> python postcommit failure in docker-compose
> ---
>
> Key: BEAM-4051
> URL: https://issues.apache.org/jira/browse/BEAM-4051
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> [https://builds.apache.org/job/beam_PostCommit_Python_Verify/4661/consoleText]
> Error is:
> Creating network 
> "hdfs_it-jenkins-beam_postcommit_python_verify-4661_test_net" with the 
> default driver
> could not find an available, non-overlapping IPv4 address pool among the 
> defaults to assign to the network
>  
> I assume that the issue is that the networks being created are not being 
> cleaned up.
> Looking briefly online the solution seems to be something like "docker 
> network prune --force".
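> The cleanup suggested above can be wrapped in a small helper. This is a
> sketch only: it assumes the `docker` CLI is on PATH, and the helper name
> `prune_docker_networks` is hypothetical, not part of the actual fix.

```python
import subprocess

def prune_docker_networks(dry_run=False):
    """Remove unused docker networks left behind by integration tests.

    `--force` skips docker's interactive confirmation prompt.
    """
    cmd = ['docker', 'network', 'prune', '--force']
    if dry_run:
        return cmd  # expose the command for inspection without running it
    return subprocess.run(cmd, check=True)
```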



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Spark_Gradle #69

2018-04-11 Thread Apache Jenkins Server
See 


Changes:

[swegner] Migrate a few more references to Maven from our README.md

--
[...truncated 1.11 MB...]
at 
org.apache.spark.api.java.JavaSparkContext.(JavaSparkContext.scala:58)
at 
org.apache.beam.runners.spark.translation.SparkContextFactory.createSparkContext(SparkContextFactory.java:103)
at 
org.apache.beam.runners.spark.translation.SparkContextFactory.getSparkContext(SparkContextFactory.java:68)
at 
org.apache.beam.runners.spark.translation.streaming.SparkRunnerStreamingContextFactory.call(SparkRunnerStreamingContextFactory.java:79)
at 
org.apache.beam.runners.spark.translation.streaming.SparkRunnerStreamingContextFactory.call(SparkRunnerStreamingContextFactory.java:47)
at 
org.apache.spark.streaming.api.java.JavaStreamingContext$$anonfun$7.apply(JavaStreamingContext.scala:627)
at 
org.apache.spark.streaming.api.java.JavaStreamingContext$$anonfun$7.apply(JavaStreamingContext.scala:626)
at scala.Option.getOrElse(Option.scala:121)
at 
org.apache.spark.streaming.StreamingContext$.getOrCreate(StreamingContext.scala:828)
at 
org.apache.spark.streaming.api.java.JavaStreamingContext$.getOrCreate(JavaStreamingContext.scala:626)
at 
org.apache.spark.streaming.api.java.JavaStreamingContext.getOrCreate(JavaStreamingContext.scala)
at org.apache.beam.runners.spark.SparkRunner.run(SparkRunner.java:169)
at 
org.apache.beam.runners.spark.TestSparkRunner.run(TestSparkRunner.java:123)
at 
org.apache.beam.runners.spark.TestSparkRunner.run(TestSparkRunner.java:83)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:311)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:346)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:328)
at 
org.apache.beam.runners.spark.translation.streaming.CreateStreamTest.testFirstElementLate(CreateStreamTest.java:240)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
at 
org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:317)
at 
org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:114)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:57)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:66)
at 
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
at 
org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
at 

Build failed in Jenkins: beam_PostCommit_Python_Verify #4664

2018-04-11 Thread Apache Jenkins Server
See 


Changes:

[lcwik] [BEAM-3249] Add support for translating 'provided' scope dependencies to

[kenn] Basic run task for nexmark with runner selection

[swegner] Migrate a few more references to Maven from our README.md

--
[...truncated 1.04 MB...]
copying apache_beam/runners/portability/universal_local_runner_main.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/portability/universal_local_runner_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/portability
copying apache_beam/runners/test/__init__.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/test
copying apache_beam/runners/worker/__init__.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/bundle_processor.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/data_plane.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/data_plane_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/log_handler.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/log_handler_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/logger.pxd -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/logger.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/logger_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/opcounters.pxd -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/opcounters.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/opcounters_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/operation_specs.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/operations.pxd -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/operations.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sdk_worker.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sdk_worker_main.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sdk_worker_main_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sdk_worker_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sideinputs.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/sideinputs_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/statesampler.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/statesampler_fast.pyx -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/statesampler_slow.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/statesampler_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/worker_id_interceptor.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/runners/worker/worker_id_interceptor_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/runners/worker
copying apache_beam/testing/__init__.py -> 
apache-beam-2.5.0.dev0/apache_beam/testing
copying apache_beam/testing/pipeline_verifiers.py -> 
apache-beam-2.5.0.dev0/apache_beam/testing
copying apache_beam/testing/pipeline_verifiers_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/testing
copying apache_beam/testing/test_pipeline.py -> 
apache-beam-2.5.0.dev0/apache_beam/testing
copying apache_beam/testing/test_pipeline_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/testing
copying apache_beam/testing/test_stream.py -> 
apache-beam-2.5.0.dev0/apache_beam/testing
copying apache_beam/testing/test_stream_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/testing
copying apache_beam/testing/test_utils.py -> 
apache-beam-2.5.0.dev0/apache_beam/testing
copying apache_beam/testing/test_utils_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/testing
copying apache_beam/testing/util.py -> 
apache-beam-2.5.0.dev0/apache_beam/testing
copying apache_beam/testing/util_test.py -> 
apache-beam-2.5.0.dev0/apache_beam/testing
copying apache_beam/testing/data/standard_coders.yaml -> 
apache-beam-2.5.0.dev0/apache_beam/testing/data
copying apache_beam/testing/data/trigger_transcripts.yaml -> 
apache-beam-2.5.0.dev0/apache_beam/testing/data
copying apache_beam/tools/__init__.py -> 
apache-beam-2.5.0.dev0/apache_beam/tools
copying 

[jira] [Work logged] (BEAM-3249) Use Gradle to build/release project

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3249?focusedWorklogId=90152&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90152
 ]

ASF GitHub Bot logged work on BEAM-3249:


Author: ASF GitHub Bot
Created on: 11/Apr/18 20:03
Start Date: 11/Apr/18 20:03
Worklog Time Spent: 10m 
  Work Description: lukecwik closed pull request #5103: [BEAM-3249] Only 
generate all artifacts when publishing.
URL: https://github.com/apache/beam/pull/5103
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/build_rules.gradle b/build_rules.gradle
index fba441e2677..8f54fe0f478 100644
--- a/build_rules.gradle
+++ b/build_rules.gradle
@@ -325,7 +325,6 @@ ext.DEFAULT_SHADOW_CLOSURE = {
 class JavaNatureConfiguration {
   double javaVersion = 1.8// Controls the JDK source language and 
target compatibility
   boolean enableFindbugs = true   // Controls whether the findbugs plugin is 
enabled and configured
-  boolean enableShadow = true // Controls whether the shadow plugin is 
enabled and configured
   // The shadowJar / shadowTestJar tasks execute the following closure to 
configure themselves.
   // Users can compose their closure with the default closure via:
   // DEFAULT_SHADOW_CLOSURE << {
@@ -349,6 +348,7 @@ class JavaNatureConfiguration {
 //  * maven
 //  * net.ltgt.apt (plugin to configure annotation processing tool)
 //  * propdeps (provide optional and provided dependency configurations)
+//  * propdeps-maven
 //  * checkstyle
 //  * findbugs
 //  * shadow
@@ -447,6 +447,7 @@ ext.applyJavaNature = {
   // TODO: Either remove these plugins and find another way to generate the 
Maven poms
   // with the correct dependency scopes configured.
   apply plugin: 'propdeps'
+  apply plugin: 'propdeps-maven'
 
   // Configures a checkstyle plugin enforcing a set of rules and also allows 
for a set of
   // suppressions.
@@ -494,64 +495,62 @@ ext.applyJavaNature = {
   //
   // TODO: Enforce all relocations are always performed to:
   // getJavaRelocatedPath(package_suffix) where package_suffix is something 
like "com.google.commmon"
-  if (configuration.enableShadow) {
-apply plugin: 'com.github.johnrengelman.shadow'
+  apply plugin: 'com.github.johnrengelman.shadow'
 
-// Create a new configuration 'shadowTest' like 'shadow' for the test scope
-configurations {
-  shadow {
-description = "Dependencies for shaded source set 'main'"
-  }
-  compile.extendsFrom shadow
-  shadowTest {
-description = "Dependencies for shaded source set 'test'"
-extendsFrom shadow
-  }
-  testCompile.extendsFrom shadowTest
+  // Create a new configuration 'shadowTest' like 'shadow' for the test scope
+  configurations {
+shadow {
+  description = "Dependencies for shaded source set 'main'"
+}
+compile.extendsFrom shadow
+shadowTest {
+  description = "Dependencies for shaded source set 'test'"
+  extendsFrom shadow
 }
+testCompile.extendsFrom shadowTest
+  }
 
-// Always configure the shadowJar classifier and merge service files.
-shadowJar ({
-  classifier = "shaded"
-  mergeServiceFiles()
+  // Always configure the shadowJar classifier and merge service files.
+  shadowJar ({
+classifier = "shaded"
+mergeServiceFiles()
+  } << configuration.shadowClosure)
 
-} << configuration.shadowClosure)
+  // Always configure the shadowTestJar classifier and merge service files.
+  tasks.create(name: 'shadowTestJar', type: 
com.github.jengelman.gradle.plugins.shadow.tasks.ShadowJar, {
+classifier = "shaded-tests"
+from sourceSets.test.output
+configurations = [project.configurations.testRuntime]
+  } << configuration.shadowClosure)
 
-// Always configure the shadowTestJar classifier and merge service files.
-tasks.create(name: 'shadowTestJar', type: 
com.github.jengelman.gradle.plugins.shadow.tasks.ShadowJar, {
-  classifier = "shaded-tests"
-  from sourceSets.test.output
-  configurations = [project.configurations.testRuntime]
-} << configuration.shadowClosure)
+  // Ensure that shaded jar and test-jar are part of the their own 
configuration artifact sets
+  artifacts.shadow shadowJar
+  artifacts.shadowTest shadowTestJar
 
-// Ensure that shaded jar and test-jar are part of the their own 
cconfiguration artifact sets
-// and the archives configuration set.
-artifacts.shadow shadowJar
-artifacts.shadowTest shadowTestJar
+  if (isRelease() || project.hasProperty('publishing')) {
+apply plugin: "maven-publish"
+
+// Only build artifacts for archives if we are publishing
 artifacts.archives shadowJar
 

[jira] [Work logged] (BEAM-3326) Execute a Stage via the portability framework in the ReferenceRunner

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3326?focusedWorklogId=90162&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90162
 ]

ASF GitHub Bot logged work on BEAM-3326:


Author: ASF GitHub Bot
Created on: 11/Apr/18 20:47
Start Date: 11/Apr/18 20:47
Worklog Time Spent: 10m 
  Work Description: tgroh commented on issue #4761: [BEAM-3326] Remove 
SdkHarnessClientControlService
URL: https://github.com/apache/beam/pull/4761#issuecomment-380590353
 
 
   run java precommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 90162)
Time Spent: 6h 10m  (was: 6h)

> Execute a Stage via the portability framework in the ReferenceRunner
> 
>
> Key: BEAM-3326
> URL: https://issues.apache.org/jira/browse/BEAM-3326
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-core
>Reporter: Thomas Groh
>Assignee: Thomas Groh
>Priority: Major
>  Labels: portability
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> This is the supertask for remote execution in the Universal Local Runner 
> (BEAM-2899).
> This executes a stage remotely via portability framework APIs



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-4031) Add missing dataflow customization options for Go SDK

2018-04-11 Thread Henning Rohde (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Henning Rohde resolved BEAM-4031.
-
   Resolution: Fixed
Fix Version/s: 2.5.0

> Add missing dataflow customization options for Go SDK
> -
>
> Key: BEAM-4031
> URL: https://issues.apache.org/jira/browse/BEAM-4031
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-go
>Reporter: Henning Rohde
>Assignee: Henning Rohde
>Priority: Minor
> Fix For: 2.5.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> We're missing at least:
> zone
> temp_location
> worker_machine_type



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is back to normal : beam_PostCommit_Python_Verify #4666

2018-04-11 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-2732) State tracking in Python is inefficient and has duplicated code

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2732?focusedWorklogId=90170&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90170
 ]

ASF GitHub Bot logged work on BEAM-2732:


Author: ASF GitHub Bot
Created on: 11/Apr/18 21:17
Start Date: 11/Apr/18 21:17
Worklog Time Spent: 10m 
  Work Description: pabloem commented on issue #4387: [BEAM-2732] Metrics 
rely on statesampler state
URL: https://github.com/apache/beam/pull/4387#issuecomment-380598583
 
 
   Rebased change.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 90170)
Time Spent: 6.5h  (was: 6h 20m)

> State tracking in Python is inefficient and has duplicated code
> ---
>
> Key: BEAM-2732
> URL: https://issues.apache.org/jira/browse/BEAM-2732
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> e.g logging and metrics keep state separately. State tracking should be 
> unified.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-4051) python postcommit failure in docker-compose

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4051?focusedWorklogId=90171&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90171
 ]

ASF GitHub Bot logged work on BEAM-4051:


Author: ASF GitHub Bot
Created on: 11/Apr/18 21:23
Start Date: 11/Apr/18 21:23
Worklog Time Spent: 10m 
  Work Description: aaltay closed pull request #5100: [BEAM-4051] Clean up 
docker networks generated by hdfs integration tests.
URL: https://github.com/apache/beam/pull/5100
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/sdks/python/apache_beam/io/hdfs_integration_test/hdfs_integration_test.sh 
b/sdks/python/apache_beam/io/hdfs_integration_test/hdfs_integration_test.sh
index 3e756da237c..6f527471d87 100755
--- a/sdks/python/apache_beam/io/hdfs_integration_test/hdfs_integration_test.sh
+++ b/sdks/python/apache_beam/io/hdfs_integration_test/hdfs_integration_test.sh
@@ -35,6 +35,8 @@ cp -r ${ROOT_DIR}/model ${CONTEXT_DIR}/
 PROJECT_NAME=$(echo hdfs_IT-${BUILD_TAG:-non-jenkins})
 
 cd ${CONTEXT_DIR}
+# Clean up leftover unused networks. BEAM-4051
+docker network prune --force
 time docker-compose -p ${PROJECT_NAME} build
 time docker-compose -p ${PROJECT_NAME} up --exit-code-from test \
 --abort-on-container-exit


 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 90171)
Time Spent: 50m  (was: 40m)

> python postcommit failure in docker-compose
> ---
>
> Key: BEAM-4051
> URL: https://issues.apache.org/jira/browse/BEAM-4051
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Udi Meiri
>Assignee: Udi Meiri
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> [https://builds.apache.org/job/beam_PostCommit_Python_Verify/4661/consoleText]
> Error is:
> Creating network 
> "hdfs_it-jenkins-beam_postcommit_python_verify-4661_test_net" with the 
> default driver
> could not find an available, non-overlapping IPv4 address pool among the 
> defaults to assign to the network
>  
> I assume that the issue is that the networks being created are not being 
> cleaned up.
> Looking briefly online the solution seems to be something like "docker 
> network prune --force".



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PerformanceTests_MongoDBIO_IT #39

2018-04-11 Thread Apache Jenkins Server
See 


Changes:

[swegner] Publish a build scan from Jenkins build

[swegner] Migrate a few more references to Maven from our README.md

[amyrvold] [BEAM-4053] Add a Go postcommit

[lcwik] [BEAM-3249] Only generate all artifacts when publishing.

[ehudm] Clean up docker networks generated by hdfs integration tests.

[swegner] Add comment about build scans only on Jenkins

[chamikara] Updates gcsio to use a API specific batch endpoint.

--
[...truncated 47.85 KB...]
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
com.mongodb.MongoTimeoutException: Timed out after 3 ms while waiting for a 
server that matches ReadPreferenceServerSelector{readPreference=primary}. 
Client view of cluster state is {type=UNKNOWN, 
servers=[{address=35.202.5.247:27017, type=UNKNOWN, state=CONNECTING, 
exception={com.mongodb.MongoSocketOpenException: Exception opening socket}, 
caused by {java.net.SocketTimeoutException: connect timed out}}]
at 
com.mongodb.connection.BaseCluster.createTimeoutException(BaseCluster.java:369)
at com.mongodb.connection.BaseCluster.selectServer(BaseCluster.java:101)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:75)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:71)
at 
com.mongodb.binding.ClusterBinding.getReadConnectionSource(ClusterBinding.java:63)
at 
com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:89)
at 
com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:84)
at 
com.mongodb.operation.CommandReadOperation.execute(CommandReadOperation.java:55)
at com.mongodb.Mongo.execute(Mongo.java:772)
at com.mongodb.Mongo$2.execute(Mongo.java:759)
at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:130)
at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:124)
at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:114)
at 
org.apache.beam.sdk.io.mongodb.MongoDbIO$BoundedMongoDbSource.split(MongoDbIO.java:332)
at 
com.google.cloud.dataflow.worker.WorkerCustomSources.splitAndValidate(WorkerCustomSources.java:275)
at 
com.google.cloud.dataflow.worker.WorkerCustomSources.performSplitTyped(WorkerCustomSources.java:197)
at 
com.google.cloud.dataflow.worker.WorkerCustomSources.performSplitWithApiLimit(WorkerCustomSources.java:181)
at 
com.google.cloud.dataflow.worker.WorkerCustomSources.performSplit(WorkerCustomSources.java:160)
at 
com.google.cloud.dataflow.worker.WorkerCustomSourceOperationExecutor.execute(WorkerCustomSourceOperationExecutor.java:75)
at 
com.google.cloud.dataflow.worker.BatchDataflowWorker.executeWork(BatchDataflowWorker.java:381)
at 
com.google.cloud.dataflow.worker.BatchDataflowWorker.doWork(BatchDataflowWorker.java:353)
at 
com.google.cloud.dataflow.worker.BatchDataflowWorker.getAndPerformWork(BatchDataflowWorker.java:284)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:134)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:114)
at 
com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:101)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
com.mongodb.MongoTimeoutException: Timed out after 3 ms while waiting for a 
server that matches ReadPreferenceServerSelector{readPreference=primary}. 
Client view of cluster state is {type=UNKNOWN, 
servers=[{address=35.202.5.247:27017, type=UNKNOWN, state=CONNECTING, 
exception={com.mongodb.MongoSocketOpenException: Exception opening socket}, 
caused by {java.net.SocketTimeoutException: connect timed out}}]
at 
com.mongodb.connection.BaseCluster.createTimeoutException(BaseCluster.java:369)
at com.mongodb.connection.BaseCluster.selectServer(BaseCluster.java:101)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:75)
at 
com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:71)
at 
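The repeated MongoTimeoutException above is the driver's server-selection deadline expiring while every listed server is still CONNECTING. A minimal stdlib-only sketch of that retry-until-deadline pattern (illustrative only; this is not the MongoDB driver's or Beam's actual code, and the names are hypothetical):

```java
import java.util.function.BooleanSupplier;

// Illustrative sketch of deadline-bounded server selection: poll until a
// server becomes reachable or the timeout elapses, then fail with a
// message similar to the one in the log above.
final class ServerSelectorSketch {
    static void selectServer(BooleanSupplier reachable, long timeoutMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!reachable.getAsBoolean()) {
            if (System.nanoTime() >= deadline) {
                throw new RuntimeException(
                    "Timed out after " + timeoutMs + " ms while waiting for a server");
            }
            try {
                Thread.sleep(10); // back off briefly between selection attempts
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RuntimeException("interrupted during server selection", e);
            }
        }
    }
}
```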

[jira] [Work logged] (BEAM-2990) support data type MAP

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2990?focusedWorklogId=90244&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90244
 ]

ASF GitHub Bot logged work on BEAM-2990:


Author: ASF GitHub Bot
Created on: 12/Apr/18 00:56
Start Date: 12/Apr/18 00:56
Worklog Time Spent: 10m 
  Work Description: XuMingmin commented on a change in pull request #5079: 
[BEAM-2990] support MAP in SQL schema
URL: https://github.com/apache/beam/pull/5079#discussion_r180939363
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/Schema.java
 ##
 @@ -242,6 +247,10 @@ public FieldType type() {
 public abstract TypeName getTypeName();
 // For container types (e.g. ARRAY), returns the type of the contained 
element.
 @Nullable public abstract FieldType getComponentType();
+// For MAP type, returns the type of the key element.
+@Nullable public abstract FieldType getComponentKeyType();
+// For MAP type, returns the type of the value element.
+@Nullable public abstract FieldType getComponentValueType();
 
 Review comment:
   +1, will change to key as primitive, and value can be primitive/array/map/row


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 90244)
Time Spent: 3h  (was: 2h 50m)

> support data type MAP
> -
>
> Key: BEAM-2990
> URL: https://issues.apache.org/jira/browse/BEAM-2990
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Xu Mingmin
>Assignee: Xu Mingmin
>Priority: Major
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> support Non-scalar types:
> MAP   Collection of keys mapped to values
> ARRAY Ordered, contiguous collection that may contain duplicates



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
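The key/value rule agreed on in the review above (MAP keys primitive; values may be primitive, array, map, or row) can be sketched with a tiny self-contained type model. This is illustrative only and is not Beam's actual Schema API; all names below are hypothetical:

```java
// Hypothetical, simplified model of the MAP constraint discussed above:
// keys must be primitive, values may be primitive or container types.
final class FieldTypeSketch {
    enum Kind { INT32, STRING, ARRAY, MAP, ROW }

    final Kind kind;
    final FieldTypeSketch mapKeyType;   // set only for MAP
    final FieldTypeSketch mapValueType; // set only for MAP

    private FieldTypeSketch(Kind kind, FieldTypeSketch key, FieldTypeSketch value) {
        this.kind = kind;
        this.mapKeyType = key;
        this.mapValueType = value;
    }

    static FieldTypeSketch primitive(Kind k) {
        if (k == Kind.ARRAY || k == Kind.MAP || k == Kind.ROW) {
            throw new IllegalArgumentException(k + " is not a primitive type");
        }
        return new FieldTypeSketch(k, null, null);
    }

    static FieldTypeSketch map(FieldTypeSketch key, FieldTypeSketch value) {
        // Enforce the rule from the review: MAP keys are primitive only,
        // while values may themselves be maps/arrays/rows.
        if (key.kind == Kind.ARRAY || key.kind == Kind.MAP || key.kind == Kind.ROW) {
            throw new IllegalArgumentException(
                "MAP key must be a primitive type, got " + key.kind);
        }
        return new FieldTypeSketch(Kind.MAP, key, value);
    }
}
```

Under this model a `MAP<STRING, MAP<STRING, INT32>>` is legal, while any map-keyed map is rejected at construction time.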


[jira] [Work logged] (BEAM-3942) Update performance testing framework to use Gradle.

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3942?focusedWorklogId=90246&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90246
 ]

ASF GitHub Bot logged work on BEAM-3942:


Author: ASF GitHub Bot
Created on: 12/Apr/18 01:08
Start Date: 12/Apr/18 01:08
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5003: [BEAM-3942] Update performance testing framework to use Gradle
URL: https://github.com/apache/beam/pull/5003#discussion_r180940672
 
 

 ##
 File path: build_rules.gradle
 ##
 @@ -593,6 +593,149 @@ ext.applyJavaNature = {
   }
 }
 
+// Reads and contains all necessary performance test parameters
+class JavaPerformanceTestConfiguration {
+
+  // Path to PerfKitBenchmarker application (pkb.py).
+  String pkbLocation = System.getProperty('pkbLocation')
+
+  // Data Processing Backend's log level.
+  String logLevel = System.getProperty('logLevel', 'INFO')
+
+  // Path to gradle binary.
+  String gradleBinary = System.getProperty('gradleBinary', './gradlew')
+
+  // If benchmark is official or not.
+  // Official benchmark results are meant to be displayed on PerfKitExplorer 
dashboards.
+  String isOfficial = System.getProperty('official', 'false')
+
+  // Specifies names of benchmarks to be run by perfkit.
+  String benchmarks = System.getProperty('benchmarks', 
'beam_integration_benchmark')
+
+  // If beam is not "prebuilt then perfkit runs the build task before running 
the tests.
+  String beamPrebuilt = System.getProperty('beamPrebuilt', 'true')
+
+  // Beam's sdk to be used by perfkit.
+  String beamSdk = System.getProperty('beamSdk', 'java')
+
+  // Timeout (in seconds) after which PerfKit will stop executing the 
benchmark (and will fail).
+  String timeout = System.getProperty('itTimeout', '1200')
+
+  // Path to kubernetes configuration file.
+  String kubeconfig = System.getProperty('kubeconfig', 
System.getProperty('user.home') + '/.kube/config')
+
+  // Path to kubernetes executable.
+  String kubectl = System.getProperty('kubectl', 'kubectl')
+
+  // Paths to files with kubernetes infrastructure to setup before the test 
runs.
+  // PerfKit will have trouble reading 'null' path. It expects empty string if 
no scripts are expected.
+  String kubernetesScripts = System.getProperty('kubernetesScripts', '')
+
+  // Pipeline options to be used by the tested pipeline.
+  String integrationTestPipelineOptions = 
System.getProperty('integrationTestPipelineOptions')
+
+  // Path to file with 'dynamic' and 'static' pipeline options.
+  // that will be appended by perfkit to the test running command.
+  // PerfKit will have trouble reading 'null' path. It expects empty string if 
no config file is expected.
+  String optionsConfigFile = System.getProperty('beamITOptions', '')
+
+  // Fully qualified name of the test to be run, eg:
+  // 'org.apache.beam.sdks.java.io.jdbc.JdbcIOIT'.
 
 Review comment:
   
   Can we group required and optional properties separately ?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 90246)
Time Spent: 7h 20m  (was: 7h 10m)

> Update performance testing framework to use Gradle.
> ---
>
> Key: BEAM-3942
> URL: https://issues.apache.org/jira/browse/BEAM-3942
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Łukasz Gajowy
>Priority: Major
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> This requires performing updates to PerfKitBenchmarker and Beam so that we 
> can execute performance tests using Gradle.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
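The `JavaPerformanceTestConfiguration` fields quoted above all follow the same `System.getProperty` pattern: optional parameters carry a default, while required ones (such as `pkbLocation`) come back null and must be checked by the caller. A minimal sketch of that pattern (class and method names are illustrative, not part of the Beam build):

```java
// Illustrative sketch of the property-with-default pattern used by
// JavaPerformanceTestConfiguration: optional properties fall back to a
// default, required ones fail fast when missing.
final class PerfTestPropsSketch {
    static String optional(String name, String defaultValue) {
        return System.getProperty(name, defaultValue);
    }

    static String required(String name) {
        String value = System.getProperty(name);
        if (value == null) {
            throw new IllegalStateException("missing required property -D" + name);
        }
        return value;
    }
}
```

Properties of this kind would be supplied on the command line, e.g. `./gradlew integrationTest -DintegrationTestRunner=direct` (invocation shown for illustration only).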


[jira] [Work logged] (BEAM-3942) Update performance testing framework to use Gradle.

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3942?focusedWorklogId=90249&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90249
 ]

ASF GitHub Bot logged work on BEAM-3942:


Author: ASF GitHub Bot
Created on: 12/Apr/18 01:08
Start Date: 12/Apr/18 01:08
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5003: [BEAM-3942] Update performance testing framework to use Gradle
URL: https://github.com/apache/beam/pull/5003#discussion_r180940674
 
 

 ##
 File path: build_rules.gradle
 ##
 @@ -593,6 +593,149 @@ ext.applyJavaNature = {
   }
 }
 
+// Reads and contains all necessary performance test parameters
+class JavaPerformanceTestConfiguration {
+
+  // Path to PerfKitBenchmarker application (pkb.py).
+  String pkbLocation = System.getProperty('pkbLocation')
+
+  // Data Processing Backend's log level.
+  String logLevel = System.getProperty('logLevel', 'INFO')
+
+  // Path to gradle binary.
+  String gradleBinary = System.getProperty('gradleBinary', './gradlew')
+
+  // If benchmark is official or not.
+  // Official benchmark results are meant to be displayed on PerfKitExplorer 
dashboards.
+  String isOfficial = System.getProperty('official', 'false')
+
+  // Specifies names of benchmarks to be run by perfkit.
+  String benchmarks = System.getProperty('benchmarks', 
'beam_integration_benchmark')
+
+  // If beam is not "prebuilt then perfkit runs the build task before running 
the tests.
+  String beamPrebuilt = System.getProperty('beamPrebuilt', 'true')
+
+  // Beam's sdk to be used by perfkit.
+  String beamSdk = System.getProperty('beamSdk', 'java')
+
+  // Timeout (in seconds) after which PerfKit will stop executing the 
benchmark (and will fail).
+  String timeout = System.getProperty('itTimeout', '1200')
+
+  // Path to kubernetes configuration file.
+  String kubeconfig = System.getProperty('kubeconfig', 
System.getProperty('user.home') + '/.kube/config')
+
+  // Path to kubernetes executable.
+  String kubectl = System.getProperty('kubectl', 'kubectl')
+
+  // Paths to files with kubernetes infrastructure to setup before the test 
runs.
+  // PerfKit will have trouble reading 'null' path. It expects empty string if 
no scripts are expected.
+  String kubernetesScripts = System.getProperty('kubernetesScripts', '')
+
+  // Pipeline options to be used by the tested pipeline.
+  String integrationTestPipelineOptions = 
System.getProperty('integrationTestPipelineOptions')
+
+  // Path to file with 'dynamic' and 'static' pipeline options.
+  // that will be appended by perfkit to the test running command.
+  // PerfKit will have trouble reading 'null' path. It expects empty string if 
no config file is expected.
+  String optionsConfigFile = System.getProperty('beamITOptions', '')
+
+  // Fully qualified name of the test to be run, eg:
+  // 'org.apache.beam.sdks.java.io.jdbc.JdbcIOIT'.
+  String integrationTest = System.getProperty('integrationTest')
+
+  // Relative path to module where the test is, eg. 'sdks/java/io/jdbc.
+  String itModule = System.getProperty('itModule')
+
+  // Runner which will be used for running the tests.
+  String runner = System.getProperty('integrationTestRunner', 'direct')
+
+  // Any additional properties to be appended to benchmark execution command.
+  String extraProperties = System.getProperty('beamExtraProperties', '')
+}
+
+// When applied in a module's build.gradle file, this closure provides set of 
tasks
+// needed to run IOIT integration tests (manually, without perfkit).
+ext.enableJavaPerformanceTesting = {
+  println "enableJavaPerformanceTesting with ${it ? "$it" : "default 
configuration"} for project ${project.name}"
+
+  // Use the implicit it parameter of the closure to handle zero argument or 
one argument map calls.
+  // See: http://groovy-lang.org/closures.html#implicit-it
+  JavaPerformanceTestConfiguration configuration = it ? it as 
JavaPerformanceTestConfiguration : new JavaPerformanceTestConfiguration()
+
+  // Add runners needed to run integration tests on
+  task packageIntegrationTests(type: Jar) {
+  def runner = configuration.runner
+  dependencies {
+if (runner.contains('dataflow')) {
+  testCompile project(path: ":runners:google-cloud-dataflow-java", 
configuration: 'shadowTest')
+}
+
+if (runner.contains('direct')) {
+  testCompile project(path: ":runners:direct-java", configuration: 
'shadowTest')
+}
+}
+  }
+
+  // Task for running integration tests
+  task integrationTest(type: Test) {
 
 Review comment:
   
   Why do we have to define a new task for integration tests ? I think for 
Maven we used a standard task (verify). Is this a requirement for Gradle ?


This is an automated message from the Apache Git Service.
To respond to the 

[jira] [Work logged] (BEAM-3942) Update performance testing framework to use Gradle.

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3942?focusedWorklogId=90248&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90248
 ]

ASF GitHub Bot logged work on BEAM-3942:


Author: ASF GitHub Bot
Created on: 12/Apr/18 01:08
Start Date: 12/Apr/18 01:08
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5003: [BEAM-3942] Update performance testing framework to use Gradle
URL: https://github.com/apache/beam/pull/5003#discussion_r180940673
 
 

 ##
 File path: build_rules.gradle
 ##
 @@ -593,6 +593,149 @@ ext.applyJavaNature = {
   }
 }
 
+// Reads and contains all necessary performance test parameters
+class JavaPerformanceTestConfiguration {
+
+  // Path to PerfKitBenchmarker application (pkb.py).
+  String pkbLocation = System.getProperty('pkbLocation')
+
+  // Data Processing Backend's log level.
+  String logLevel = System.getProperty('logLevel', 'INFO')
+
+  // Path to gradle binary.
+  String gradleBinary = System.getProperty('gradleBinary', './gradlew')
+
+  // If benchmark is official or not.
+  // Official benchmark results are meant to be displayed on PerfKitExplorer 
dashboards.
+  String isOfficial = System.getProperty('official', 'false')
+
+  // Specifies names of benchmarks to be run by perfkit.
+  String benchmarks = System.getProperty('benchmarks', 
'beam_integration_benchmark')
+
+  // If beam is not "prebuilt then perfkit runs the build task before running 
the tests.
 
 Review comment:
   
   Nit: end quote after prebuilt.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 90248)
Time Spent: 7h 40m  (was: 7.5h)

> Update performance testing framework to use Gradle.
> ---
>
> Key: BEAM-3942
> URL: https://issues.apache.org/jira/browse/BEAM-3942
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Łukasz Gajowy
>Priority: Major
>  Time Spent: 7h 40m
>  Remaining Estimate: 0h
>
> This requires performing updates to PerfKitBenchmarker and Beam so that we 
> can execute performance tests using Gradle.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3942) Update performance testing framework to use Gradle.

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3942?focusedWorklogId=90247&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90247
 ]

ASF GitHub Bot logged work on BEAM-3942:


Author: ASF GitHub Bot
Created on: 12/Apr/18 01:08
Start Date: 12/Apr/18 01:08
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5003: [BEAM-3942] Update performance testing framework to use Gradle
URL: https://github.com/apache/beam/pull/5003#discussion_r180940671
 
 

 ##
 File path: build_rules.gradle
 ##
 @@ -593,6 +593,149 @@ ext.applyJavaNature = {
   }
 }
 
+// Reads and contains all necessary performance test parameters
+class JavaPerformanceTestConfiguration {
+
+  // Path to PerfKitBenchmarker application (pkb.py).
+  String pkbLocation = System.getProperty('pkbLocation')
+
+  // Data Processing Backend's log level.
+  String logLevel = System.getProperty('logLevel', 'INFO')
+
+  // Path to gradle binary.
+  String gradleBinary = System.getProperty('gradleBinary', './gradlew')
+
+  // If benchmark is official or not.
+  // Official benchmark results are meant to be displayed on PerfKitExplorer 
dashboards.
+  String isOfficial = System.getProperty('official', 'false')
+
+  // Specifies names of benchmarks to be run by perfkit.
+  String benchmarks = System.getProperty('benchmarks', 
'beam_integration_benchmark')
+
+  // If beam is not "prebuilt then perfkit runs the build task before running 
the tests.
+  String beamPrebuilt = System.getProperty('beamPrebuilt', 'true')
+
+  // Beam's sdk to be used by perfkit.
+  String beamSdk = System.getProperty('beamSdk', 'java')
+
+  // Timeout (in seconds) after which PerfKit will stop executing the 
benchmark (and will fail).
+  String timeout = System.getProperty('itTimeout', '1200')
+
+  // Path to kubernetes configuration file.
+  String kubeconfig = System.getProperty('kubeconfig', 
System.getProperty('user.home') + '/.kube/config')
+
+  // Path to kubernetes executable.
+  String kubectl = System.getProperty('kubectl', 'kubectl')
+
+  // Paths to files with kubernetes infrastructure to setup before the test 
runs.
+  // PerfKit will have trouble reading 'null' path. It expects empty string if 
no scripts are expected.
+  String kubernetesScripts = System.getProperty('kubernetesScripts', '')
+
+  // Pipeline options to be used by the tested pipeline.
+  String integrationTestPipelineOptions = 
System.getProperty('integrationTestPipelineOptions')
+
+  // Path to file with 'dynamic' and 'static' pipeline options.
+  // that will be appended by perfkit to the test running command.
+  // PerfKit will have trouble reading 'null' path. It expects empty string if 
no config file is expected.
+  String optionsConfigFile = System.getProperty('beamITOptions', '')
+
+  // Fully qualified name of the test to be run, eg:
+  // 'org.apache.beam.sdks.java.io.jdbc.JdbcIOIT'.
+  String integrationTest = System.getProperty('integrationTest')
+
+  // Relative path to module where the test is, eg. 'sdks/java/io/jdbc.
+  String itModule = System.getProperty('itModule')
+
+  // Runner which will be used for running the tests.
+  String runner = System.getProperty('integrationTestRunner', 'direct')
+
+  // Any additional properties to be appended to benchmark execution command.
+  String extraProperties = System.getProperty('beamExtraProperties', '')
+}
+
+// When applied in a module's build.gradle file, this closure provides a set of 
tasks
+// needed to run IOIT integration tests (manually, without perfkit).
 
 Review comment:
   
   s/perfkit/PerfKitBenchmarker (here and elsewhere).
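   The configuration class in the diff above reads every parameter through
   `System.getProperty(name, default)`. A minimal Python stand-in of that
   lookup-with-fallback pattern (the `get_property` helper is hypothetical
   illustration, not Beam code):

```python
# Hypothetical stand-in for Java's System.getProperty(name, default), the
# pattern the JavaPerformanceTestConfiguration class uses for every test
# parameter: read a -D system property, falling back to a default when unset.
def get_property(props, name, default=None):
    return props.get(name, default)

# e.g. JVM properties as passed on the command line: ./gradlew -DitTimeout=3600 ...
props = {'itTimeout': '3600'}

timeout = get_property(props, 'itTimeout', '1200')   # explicit value wins
sdk = get_property(props, 'beamSdk', 'java')         # default applies
test = get_property(props, 'integrationTest')        # no default -> None
```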


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 90247)
Time Spent: 7.5h  (was: 7h 20m)

> Update performance testing framework to use Gradle.
> ---
>
> Key: BEAM-3942
> URL: https://issues.apache.org/jira/browse/BEAM-3942
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Łukasz Gajowy
>Priority: Major
>  Time Spent: 7.5h
>  Remaining Estimate: 0h
>
> This requires performing updates to PerfKitBenchmarker and Beam so that we 
> can execute performance tests using Gradle.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3942) Update performance testing framework to use Gradle.

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3942?focusedWorklogId=90251=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90251
 ]

ASF GitHub Bot logged work on BEAM-3942:


Author: ASF GitHub Bot
Created on: 12/Apr/18 01:08
Start Date: 12/Apr/18 01:08
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5003: [BEAM-3942] Update performance testing framework to use Gradle
URL: https://github.com/apache/beam/pull/5003#discussion_r180940679
 
 

 ##
 File path: build_rules.gradle
 ##
 @@ -593,6 +593,118 @@ ext.applyJavaNature = {
   }
 }
 
+// Reads and contains all necessary performance test parameters
+class JavaPerformanceTestConfiguration {
+  String pkbLocation = System.getProperty('pkbLocation')
+
+  String logLevel = System.getProperty('logLevel', 'INFO')
+  String gradleBinary = System.getProperty('gradleBinary', './gradlew')
+  String isOfficial = System.getProperty('official', 'true')
+  String benchmarks = System.getProperty('benchmarks', 
'beam_integration_benchmark')
+
+  String beamPrebuilt = System.getProperty('beamPrebuilt', 'true')
+  String beamSdk = System.getProperty('beamSdk', 'java')
+
+  String timeout = System.getProperty('itTimeout', '1200')
+
+  String kubeconfig = System.getProperty('kubeconfig', 
System.getProperty('user.home') + '/.kube/config')
+  String kubectl = System.getProperty('kubectl', 'kubectl')
+
+  // PerfKit will have trouble reading 'null' path. It expects empty string if 
no scripts are expected.
+  String kubernetesScripts = System.getProperty('kubernetesScripts', '')
+
+  String integrationTestPipelineOptions = 
System.getProperty('integrationTestPipelineOptions')
+
+  // PerfKit will have trouble reading 'null' path. It expects empty string if 
no config file is expected.
+  String optionsConfigFile = System.getProperty('beamITOptions', '')
+
+  String integrationTest = System.getProperty('integrationTest')
+  String itModule = System.getProperty('itModule')
+
+  String runner = System.getProperty('integrationTestRunner', 'direct')
+
+  String extraProperties = System.getProperty('beamExtraProperties', '')
+}
+
+// Configures a project with a set of tasks needed for running performance 
tests
+ext.enableJavaPerformanceTesting = {
+  println "enableJavaPerformanceTesting with ${it ? "$it" : "default 
configuration"} for project ${project.name}"
+
+  // Use the implicit it parameter of the closure to handle zero argument or 
one argument map calls.
+  JavaPerformanceTestConfiguration configuration = it ? it as 
JavaPerformanceTestConfiguration : new JavaPerformanceTestConfiguration()
+
+  // Add runners needed to run integration tests on
+  task packageIntegrationTests(type: Jar) {
+if (gradle.startParameter.taskNames.contains('integrationTest')) {
 
 Review comment:
   
   
   > **lgajowy** wrote:
   > This line effectively means that we should include the runner dependencies 
only when the `integrationTest` task is run. So when user runs `./gradlew 
integrationTest...` those dependencies are added while building the jar. 
   > 
   > Now I see that this is not enough due the following reason: there is no 
way to "prebuild" tests - we cannot build with a separate build command and 
then just run the tests. Such prebuilding is needed by perfkit so it's a 
necessity I forgot to take into account here. "Prebuilding" is invoked when we 
specify the "beam_prebuilt" flag to false, see more in [Perfkit's 
code](https://github.com/GoogleCloudPlatform/PerfKitBenchmarker/blob/a3e6c0382809cb68cd294f34cbd5b3b558e53f56/perfkitbenchmarker/beam_benchmark_helper.py#L147).
   > 
   > I think we should remove this condition and include the runner dependency 
only if the command is run with appropriate "integrationTestRunner" property . 
I changed this in the next commit. 
   > So now we can prebuild in PerfKit using this command: 
   > `./gradlew clean build -xtest -DintegrationTestRunner=dataflow`
   > and execute the tests using the `integrationTest` task, for example:
   > `./gradlew clean integrationTest -p sdks/java/io/file-based-io-tests/ 
-DintegrationTestPipelineOptions='[]' --tests 
org.apache.beam.sdk.io.text.TextIOIT  -DintegrationTestRunner=dataflow`
   > 
   > WDYT?
   
   
   Makes sense. Thanks.




Issue Time Tracking
---

Worklog Id: (was: 90251)
Time Spent: 8h 10m  (was: 8h)

> Update performance testing framework to use Gradle.
> ---
>
>   

[jira] [Work logged] (BEAM-3942) Update performance testing framework to use Gradle.

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3942?focusedWorklogId=90250=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90250
 ]

ASF GitHub Bot logged work on BEAM-3942:


Author: ASF GitHub Bot
Created on: 12/Apr/18 01:08
Start Date: 12/Apr/18 01:08
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on a change in pull request 
#5003: [BEAM-3942] Update performance testing framework to use Gradle
URL: https://github.com/apache/beam/pull/5003#discussion_r180940678
 
 

 ##
 File path: build_rules.gradle
 ##
 @@ -593,6 +593,118 @@ ext.applyJavaNature = {
   }
 }
 
+// Reads and contains all necessary performance test parameters
+class JavaPerformanceTestConfiguration {
+  String pkbLocation = System.getProperty('pkbLocation')
+
+  String logLevel = System.getProperty('logLevel', 'INFO')
+  String gradleBinary = System.getProperty('gradleBinary', './gradlew')
+  String isOfficial = System.getProperty('official', 'true')
+  String benchmarks = System.getProperty('benchmarks', 
'beam_integration_benchmark')
+
+  String beamPrebuilt = System.getProperty('beamPrebuilt', 'true')
+  String beamSdk = System.getProperty('beamSdk', 'java')
+
+  String timeout = System.getProperty('itTimeout', '1200')
+
+  String kubeconfig = System.getProperty('kubeconfig', 
System.getProperty('user.home') + '/.kube/config')
+  String kubectl = System.getProperty('kubectl', 'kubectl')
+
+  // PerfKit will have trouble reading 'null' path. It expects empty string if 
no scripts are expected.
 
 Review comment:
   
   
   > **lgajowy** wrote:
   > I think we can fix this on the perfkit side but I rather tend to think it 
should be done in separate PR. I wanted to change the least possible amount of 
code in perfkit to integrate with gradle and then improve. Do you agree with me?
   
   
   Sounds good.




Issue Time Tracking
---

Worklog Id: (was: 90250)
Time Spent: 8h  (was: 7h 50m)

> Update performance testing framework to use Gradle.
> ---
>
> Key: BEAM-3942
> URL: https://issues.apache.org/jira/browse/BEAM-3942
> Project: Beam
>  Issue Type: Sub-task
>  Components: testing
>Reporter: Chamikara Jayalath
>Assignee: Łukasz Gajowy
>Priority: Major
>  Time Spent: 8h
>  Remaining Estimate: 0h
>
> This requires performing updates to PerfKitBenchmarker and Beam so that we 
> can execute performance tests using Gradle.





[jira] [Work logged] (BEAM-3981) Futurize and fix python 2 compatibility for coders package

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3981?focusedWorklogId=90256=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90256
 ]

ASF GitHub Bot logged work on BEAM-3981:


Author: ASF GitHub Bot
Created on: 12/Apr/18 01:33
Start Date: 12/Apr/18 01:33
Worklog Time Spent: 10m 
  Work Description: charlesccychen commented on a change in pull request 
#5053: [BEAM-3981] Futurize coders subpackage
URL: https://github.com/apache/beam/pull/5053#discussion_r180943594
 
 

 ##
 File path: sdks/python/apache_beam/coders/coder_impl.py
 ##
 @@ -304,7 +307,7 @@ def encode_to_stream(self, value, stream, nested):
   dict_value = value  # for typing
   stream.write_byte(DICT_TYPE)
   stream.write_var_int64(len(dict_value))
-  for k, v in dict_value.iteritems():
+  for k, v in dict_value.items():
 
 Review comment:
   Any particular reason for the `iteritems()` -> `items()` change?
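   For context on the question above, a sketch of the Python 2/3 difference
   behind the change (plain dict semantics, not Beam's coder code; the
   `iteritems` helper below is a minimal stand-in for `future.utils.iteritems`):

```python
# In Python 2, dict.iteritems() returned a lazy iterator while dict.items()
# built a list; Python 3 dropped iteritems() and made items() a lazy view,
# so items() is the spelling that works on both interpreters.
d = {'a': 1, 'b': 2}

pairs = sorted(d.items())  # portable across Python 2 and 3

# Minimal stand-in for future.utils.iteritems: lazy on both interpreters.
def iteritems(mapping):
    try:
        return mapping.iteritems()        # Python 2 fast path
    except AttributeError:
        return iter(mapping.items())      # Python 3 view iterator
```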




Issue Time Tracking
---

Worklog Id: (was: 90256)
Time Spent: 7.5h  (was: 7h 20m)

> Futurize and fix python 2 compatibility for coders package
> --
>
> Key: BEAM-3981
> URL: https://issues.apache.org/jira/browse/BEAM-3981
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Ahmet Altay
>Priority: Major
>  Time Spent: 7.5h
>  Remaining Estimate: 0h
>
> Run automatic conversion with futurize tool on coders subpackage and fix 
> python 2 compatibility. This prepares the subpackage for python 3 support.





[jira] [Work logged] (BEAM-4028) Step / Operation naming should rely on a NameContext class

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4028?focusedWorklogId=90258=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90258
 ]

ASF GitHub Bot logged work on BEAM-4028:


Author: ASF GitHub Bot
Created on: 12/Apr/18 01:40
Start Date: 12/Apr/18 01:40
Worklog Time Spent: 10m 
  Work Description: charlesccychen commented on a change in pull request 
#5043: [BEAM-4028] Adding NameContext to Python SDK.
URL: https://github.com/apache/beam/pull/5043#discussion_r180944556
 
 

 ##
 File path: sdks/python/apache_beam/runners/worker/operations.py
 ##
 @@ -682,14 +701,14 @@ def execute(self):
 # The order of the elements is important because the inputs use
 # list indexes as references.
 
-step_names = (
-self._map_task.step_names or [None] * len(self._map_task.operations))
 for ix, spec in enumerate(self._map_task.operations):
   # This is used for logging and assigning names to counters.
-  operation_name = self._map_task.system_names[ix]
-  step_name = step_names[ix]
+  name_context = common.DataflowNameContext(
 
 Review comment:
   
   Will we always need the Dataflow version here?




Issue Time Tracking
---

Worklog Id: (was: 90258)
Time Spent: 2h  (was: 1h 50m)

> Step / Operation naming should rely on a NameContext class
> --
>
> Key: BEAM-4028
> URL: https://issues.apache.org/jira/browse/BEAM-4028
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Steps can have different names depending on the runner (stage, step, user, 
> system name...). 
> Depending on the needs of different components (operations, logging, metrics, 
> statesampling) these step names are passed around without a specific order.
> Instead, SDK should rely on `NameContext` objects that carry all the naming 
> information for a single step.
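The idea described above can be sketched as follows (a hypothetical minimal
`NameContext`, not the actual Beam class):

```python
# Hypothetical sketch: carry all of a step's names in one object instead of
# threading stage/step/user/system names through call sites separately.
from collections import namedtuple

NameContext = namedtuple('NameContext', ['step_name', 'user_name', 'system_name'])

ctx = NameContext(step_name='s2', user_name='Map(do_fn)', system_name='s2-shuffle')

# Consumers (logging, counters, state sampling) each pick the name they need:
log_label = ctx.user_name
counter_prefix = ctx.system_name
```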





[jira] [Work logged] (BEAM-3714) JdbcIO.read() should create a forward-only, read-only result set

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3714?focusedWorklogId=90263=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90263
 ]

ASF GitHub Bot logged work on BEAM-3714:


Author: ASF GitHub Bot
Created on: 12/Apr/18 01:49
Start Date: 12/Apr/18 01:49
Worklog Time Spent: 10m 
  Work Description: evindj commented on issue #5109: [BEAM-3714]modified 
result set to be forward only and read only
URL: https://github.com/apache/beam/pull/5109#issuecomment-380648883
 
 
   @jkff I had to close the other PR; rebasing added some other commits to it 
that could have been confusing.




Issue Time Tracking
---

Worklog Id: (was: 90263)
Time Spent: 2h 40m  (was: 2.5h)

> JdbcIO.read() should create a forward-only, read-only result set
> 
>
> Key: BEAM-3714
> URL: https://issues.apache.org/jira/browse/BEAM-3714
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-jdbc
>Reporter: Eugene Kirpichov
>Assignee: Innocent
>Priority: Major
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> [https://stackoverflow.com/questions/48784889/streaming-data-from-cloudsql-into-dataflow/48819934#48819934]
>  - a user is trying to load a large table from MySQL, and the MySQL JDBC 
> driver requires special measures when loading large result sets.
> JdbcIO currently calls simply "connection.prepareStatement(query)" 
> https://github.com/apache/beam/blob/bb8c12c4956cbe3c6f2e57113e7c0ce2a5c05009/sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java#L508
>  - it should specify type TYPE_FORWARD_ONLY and concurrency CONCUR_READ_ONLY 
> - these values should always be used.
> Seems that different databases have different requirements for streaming 
> result sets.
> E.g. MySQL requires setting fetch size; PostgreSQL says "The Connection must 
> not be in autocommit mode." 
> https://jdbc.postgresql.org/documentation/head/query.html#query-with-cursor . 
> Oracle, I think, doesn't have any special requirements but I don't know. 
> Fetch size should probably still be set to a reasonably large value.
> Seems that the common denominator of these requirements is: set fetch size to 
> a reasonably large but not maximum value; disable autocommit (there's nothing 
> to commit in read() anyway).
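The common denominator named above, a bounded fetch size with strictly forward,
read-once iteration, can be illustrated with Python's DB-API and sqlite3 (an
analogy only; JdbcIO itself is Java and would use JDBC's TYPE_FORWARD_ONLY and
CONCUR_READ_ONLY):

```python
import sqlite3

# Analogy to the JDBC recommendation: stream rows in bounded chunks rather
# than materializing the whole result set at once.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE t (x INTEGER)')
conn.executemany('INSERT INTO t VALUES (?)', [(i,) for i in range(10)])

cur = conn.cursor()
cur.arraysize = 4                  # analogous to a "reasonably large" fetch size
cur.execute('SELECT x FROM t ORDER BY x')

rows = []
while True:
    chunk = cur.fetchmany()        # forward-only: each row is read exactly once
    if not chunk:
        break
    rows.extend(x for (x,) in chunk)
```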





[jira] [Work logged] (BEAM-2990) support data type MAP

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2990?focusedWorklogId=90229=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90229
 ]

ASF GitHub Bot logged work on BEAM-2990:


Author: ASF GitHub Bot
Created on: 11/Apr/18 23:48
Start Date: 11/Apr/18 23:48
Worklog Time Spent: 10m 
  Work Description: XuMingmin commented on a change in pull request #5079: 
[BEAM-2990] support MAP in SQL schema
URL: https://github.com/apache/beam/pull/5079#discussion_r180930489
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/values/Row.java
 ##
 @@ -355,6 +379,11 @@ public Builder addValues(Object ... values) {
   return this;
 }
 
+public  Builder addMap(Map data) {
 
 Review comment:
   it's not necessary, will remove.




Issue Time Tracking
---

Worklog Id: (was: 90229)
Time Spent: 2h 40m  (was: 2.5h)

> support data type MAP
> -
>
> Key: BEAM-2990
> URL: https://issues.apache.org/jira/browse/BEAM-2990
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Xu Mingmin
>Assignee: Xu Mingmin
>Priority: Major
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> support Non-scalar types:
> MAP   Collection of keys mapped to values
> ARRAY Ordered, contiguous collection that may contain duplicates





[jira] [Work logged] (BEAM-2898) Flink supports chaining/fusion of single-SDK stages

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2898?focusedWorklogId=90231=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90231
 ]

ASF GitHub Bot logged work on BEAM-2898:


Author: ASF GitHub Bot
Created on: 11/Apr/18 23:56
Start Date: 11/Apr/18 23:56
Worklog Time Spent: 10m 
  Work Description: tgroh commented on issue #4783: [BEAM-2898] Support 
Impulse transforms in Flink batch runner
URL: https://github.com/apache/beam/pull/4783#issuecomment-380631588
 
 
   run java precommit




Issue Time Tracking
---

Worklog Id: (was: 90231)
Time Spent: 4h 20m  (was: 4h 10m)

> Flink supports chaining/fusion of single-SDK stages
> ---
>
> Key: BEAM-2898
> URL: https://issues.apache.org/jira/browse/BEAM-2898
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink
>Reporter: Henning Rohde
>Priority: Major
>  Labels: portability
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> The Fn API supports fused stages, which avoids unnecessarily round-tripping 
> the data over the Fn API between stages. The Flink runner should use that 
> capability for better performance.





[beam] branch master updated (afddb22 -> b8cf07d)

2018-04-11 Thread chamikara
This is an automated email from the ASF dual-hosted git repository.

chamikara pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git.


from afddb22  Merge pull request #5091 from swegner/jenkins_build_scan
 add 919295f  Updates gcsio to use a API specific batch endpoint.
 new b8cf07d  Merge pull request #5105: [BEAM-3991] Updates gcsio to use a 
API specific batch endpoint

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 sdks/python/apache_beam/io/gcp/gcsio.py | 13 +
 1 file changed, 13 insertions(+)

-- 
To stop receiving notification emails like this one, please contact
chamik...@apache.org.


[jira] [Work logged] (BEAM-3991) Update dependency 'google-api-services-storage' to latest version

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3991?focusedWorklogId=90234=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90234
 ]

ASF GitHub Bot logged work on BEAM-3991:


Author: ASF GitHub Bot
Created on: 12/Apr/18 00:00
Start Date: 12/Apr/18 00:00
Worklog Time Spent: 10m 
  Work Description: chamikaramj closed pull request #5105: [BEAM-3991] 
Updates gcsio to use a API specific batch endpoint.
URL: https://github.com/apache/beam/pull/5105
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/sdks/python/apache_beam/io/gcp/gcsio.py 
b/sdks/python/apache_beam/io/gcp/gcsio.py
index f687686fd64..e428e3b1401 100644
--- a/sdks/python/apache_beam/io/gcp/gcsio.py
+++ b/sdks/python/apache_beam/io/gcp/gcsio.py
@@ -91,6 +91,17 @@
 # GcsIO.delete_batch().
 MAX_BATCH_OPERATION_SIZE = 100
 
+# Batch endpoint URL for GCS.
+# We have to specify an API specific endpoint here since Google APIs global
+# batch endpoints will be deprecated on 03/25/2019.
+# See 
https://developers.googleblog.com/2018/03/discontinuing-support-for-json-rpc-and.html.
  # pylint: disable=line-too-long
+# Currently apitools library uses a global batch endpoint by default:
+# https://github.com/google/apitools/blob/master/apitools/base/py/batch.py#L152
+# TODO: remove this constant and it's usage after apitools move to using an API
+# specific batch endpoint or after Beam gcsio module start using a GCS client
+# library that does not use global batch endpoints.
+GCS_BATCH_ENDPOINT = 'https://www.googleapis.com/batch/storage/v1'
+
 
 def proxy_info_from_environment_var(proxy_env_var):
   """Reads proxy info from the environment and converts to httplib2.ProxyInfo.
@@ -274,6 +285,7 @@ def delete_batch(self, paths):
 if not paths:
   return []
 batch_request = BatchApiRequest(
+batch_url=GCS_BATCH_ENDPOINT,
 retryable_codes=retry.SERVER_ERROR_OR_TIMEOUT_CODES)
 for path in paths:
   bucket, object_path = parse_gcs_path(path)
@@ -328,6 +340,7 @@ def copy_batch(self, src_dest_pairs):
 if not src_dest_pairs:
   return []
 batch_request = BatchApiRequest(
+batch_url=GCS_BATCH_ENDPOINT,
 retryable_codes=retry.SERVER_ERROR_OR_TIMEOUT_CODES)
 for src, dest in src_dest_pairs:
   src_bucket, src_path = parse_gcs_path(src)


 




Issue Time Tracking
---

Worklog Id: (was: 90234)
Time Spent: 50m  (was: 40m)

> Update dependency 'google-api-services-storage' to latest version
> -
>
> Key: BEAM-3991
> URL: https://issues.apache.org/jira/browse/BEAM-3991
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp, sdk-py-core
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>Priority: Blocker
> Fix For: 2.5.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Currently we use version 'v1-rev71-1.22.0' which is deprecated and about two 
> years old.





[jira] [Work logged] (BEAM-3991) Update dependency 'google-api-services-storage' to latest version

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3991?focusedWorklogId=90233=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90233
 ]

ASF GitHub Bot logged work on BEAM-3991:


Author: ASF GitHub Bot
Created on: 12/Apr/18 00:00
Start Date: 12/Apr/18 00:00
Worklog Time Spent: 10m 
  Work Description: chamikaramj commented on issue #5105: [BEAM-3991] 
Updates gcsio to use a API specific batch endpoint.
URL: https://github.com/apache/beam/pull/5105#issuecomment-380632160
 
 
   Thanks. Merging.




Issue Time Tracking
---

Worklog Id: (was: 90233)
Time Spent: 40m  (was: 0.5h)

> Update dependency 'google-api-services-storage' to latest version
> -
>
> Key: BEAM-3991
> URL: https://issues.apache.org/jira/browse/BEAM-3991
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp, sdk-py-core
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>Priority: Blocker
> Fix For: 2.5.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently we use version 'v1-rev71-1.22.0' which is deprecated and about two 
> years old.





[jira] [Work logged] (BEAM-3249) Use Gradle to build/release project

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3249?focusedWorklogId=90235=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90235
 ]

ASF GitHub Bot logged work on BEAM-3249:


Author: ASF GitHub Bot
Created on: 12/Apr/18 00:01
Start Date: 12/Apr/18 00:01
Worklog Time Spent: 10m 
  Work Description: chamikaramj closed pull request #5107: [BEAM-3249] 
Clean-up and use shaded test jars, removing evaluationDependsOn
URL: https://github.com/apache/beam/pull/5107
 
 
   




 




Issue Time Tracking
---

Worklog Id: (was: 90235)
Time Spent: 12.5h  (was: 12h 20m)

> Use Gradle to build/release project
> ---
>
> Key: BEAM-3249
> URL: https://issues.apache.org/jira/browse/BEAM-3249
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system, testing
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>Priority: Major
>  Time Spent: 12.5h
>  Remaining Estimate: 0h
>
> I have collected data by running several builds against master using Gradle 
> and Maven without using Gradle's support for incremental builds.
> Gradle (mins)
> min: 25.04
> max: 160.14
> median: 45.78
> average: 52.19
> stdev: 30.80
> Maven (mins)
> min: 56.86
> max: 216.55
> median: 87.93
> average: 109.10
> stdev: 48.01
> I excluded a few timeouts (240 mins) that happened during the Maven build 
> from its numbers but we can see conclusively that Gradle is about twice as 
> fast for the build when compared to Maven when run using Jenkins.
> Original dev@ thread: 
> https://lists.apache.org/thread.html/225dddcfc78f39bbb296a0d2bbef1caf37e17677c7e5573f0b6fe253@%3Cdev.beam.apache.org%3E
> The data is available here 
> https://docs.google.com/spreadsheets/d/1MHVjF-xoI49_NJqEQakUgnNIQ7Qbjzu8Y1q_h3dbF1M/edit?usp=sharing
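The "about twice as fast" claim can be checked directly against the statistics
quoted above:

```python
# Ratios computed from the reported build times (minutes).
gradle = {'median': 45.78, 'average': 52.19}
maven = {'median': 87.93, 'average': 109.10}

median_ratio = maven['median'] / gradle['median']     # ~1.92
average_ratio = maven['average'] / gradle['average']  # ~2.09
```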





Build failed in Jenkins: beam_PerformanceTests_TextIOIT #380

2018-04-11 Thread Apache Jenkins Server
See 


Changes:

[swegner] Publish a build scan from Jenkins build

[swegner] Migrate a few more references to Maven from our README.md

[amyrvold] [BEAM-4053] Add a Go postcommit

[lcwik] [BEAM-3249] Only generate all artifacts when publishing.

[ehudm] Clean up docker networks generated by hdfs integration tests.

[swegner] Add comment about build scans only on Jenkins

[chamikara] Updates gcsio to use a API specific batch endpoint.

--
[...truncated 27.34 KB...]
[INFO] Excluding com.google.api-client:google-api-client-jackson2:jar:1.22.0 
from the shaded jar.
[INFO] Excluding com.google.oauth-client:google-oauth-client-java6:jar:1.22.0 
from the shaded jar.
[INFO] Excluding 
org.apache.beam:beam-sdks-java-io-google-cloud-platform:jar:2.5.0-SNAPSHOT from 
the shaded jar.
[INFO] Excluding 
org.apache.beam:beam-sdks-java-extensions-protobuf:jar:2.5.0-SNAPSHOT from the 
shaded jar.
[INFO] Excluding io.grpc:grpc-core:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.google.errorprone:error_prone_annotations:jar:2.0.15 from 
the shaded jar.
[INFO] Excluding io.grpc:grpc-context:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.google.instrumentation:instrumentation-api:jar:0.3.0 from 
the shaded jar.
[INFO] Excluding 
com.google.apis:google-api-services-bigquery:jar:v2-rev374-1.22.0 from the 
shaded jar.
[INFO] Excluding com.google.api:gax-grpc:jar:0.20.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-protobuf:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.google.api:api-common:jar:1.0.0-rc2 from the shaded jar.
[INFO] Excluding com.google.auto.value:auto-value:jar:1.5.3 from the shaded jar.
[INFO] Excluding com.google.api:gax:jar:1.3.1 from the shaded jar.
[INFO] Excluding org.threeten:threetenbp:jar:1.3.3 from the shaded jar.
[INFO] Excluding com.google.cloud:google-cloud-core-grpc:jar:1.2.0 from the 
shaded jar.
[INFO] Excluding com.google.apis:google-api-services-pubsub:jar:v1-rev10-1.22.0 
from the shaded jar.
[INFO] Excluding com.google.api.grpc:grpc-google-cloud-pubsub-v1:jar:0.1.18 
from the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-cloud-pubsub-v1:jar:0.1.18 
from the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-iam-v1:jar:0.1.18 from the 
shaded jar.
[INFO] Excluding com.google.cloud.datastore:datastore-v1-proto-client:jar:1.4.0 
from the shaded jar.
[INFO] Excluding com.google.http-client:google-http-client-protobuf:jar:1.22.0 
from the shaded jar.
[INFO] Excluding com.google.http-client:google-http-client-jackson:jar:1.22.0 
from the shaded jar.
[INFO] Excluding com.google.cloud.datastore:datastore-v1-protos:jar:1.3.0 from 
the shaded jar.
[INFO] Excluding com.google.api.grpc:grpc-google-common-protos:jar:0.1.9 from 
the shaded jar.
[INFO] Excluding io.grpc:grpc-auth:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-netty:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.netty:netty-codec-http2:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-codec-http:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-handler-proxy:jar:4.1.8.Final from the shaded 
jar.
[INFO] Excluding io.netty:netty-codec-socks:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-handler:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-buffer:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-common:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-transport:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-resolver:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-codec:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.grpc:grpc-stub:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.google.cloud:google-cloud-core:jar:1.0.2 from the shaded 
jar.
[INFO] Excluding org.json:json:jar:20160810 from the shaded jar.
[INFO] Excluding com.google.cloud:google-cloud-spanner:jar:0.20.0b-beta from 
the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-cloud-spanner-v1:jar:0.1.11b 
from the shaded jar.
[INFO] Excluding 
com.google.api.grpc:proto-google-cloud-spanner-admin-instance-v1:jar:0.1.11 
from the shaded jar.
[INFO] Excluding com.google.api.grpc:grpc-google-cloud-spanner-v1:jar:0.1.11b 
from the shaded jar.
[INFO] Excluding 
com.google.api.grpc:grpc-google-cloud-spanner-admin-database-v1:jar:0.1.11 from 
the shaded jar.
[INFO] Excluding 
com.google.api.grpc:grpc-google-cloud-spanner-admin-instance-v1:jar:0.1.11 from 
the shaded jar.
[INFO] Excluding com.google.api.grpc:grpc-google-longrunning-v1:jar:0.1.11 from 
the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-longrunning-v1:jar:0.1.11 
from the shaded jar.
[INFO] Excluding com.google.cloud.bigtable:bigtable-protos:jar:1.0.0-pre3 from 
the shaded jar.
[INFO] Excluding 

Jenkins build is back to normal : beam_PerformanceTests_HadoopInputFormat #130

2018-04-11 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-3991) Update dependency 'google-api-services-storage' to latest version

2018-04-11 Thread Chamikara Jayalath (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434750#comment-16434750
 ] 

Chamikara Jayalath commented on BEAM-3991:
--

Python SDK was updated.

PR for updating Java SDK is [https://github.com/apache/beam/pull/5046]

This is waiting till next iteration of Dataflow service is released (within a 
week).

> Update dependency 'google-api-services-storage' to latest version
> -
>
> Key: BEAM-3991
> URL: https://issues.apache.org/jira/browse/BEAM-3991
> Project: Beam
>  Issue Type: Improvement
>  Components: io-java-gcp, sdk-py-core
>Reporter: Chamikara Jayalath
>Assignee: Chamikara Jayalath
>Priority: Blocker
> Fix For: 2.5.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Currently we use version 'v1-rev71-1.22.0' which is deprecated and about two 
> years old.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PerformanceTests_Python #1137

2018-04-11 Thread Apache Jenkins Server
See 


Changes:

[swegner] Publish a build scan from Jenkins build

[swegner] Migrate a few more references to Maven from our README.md

[amyrvold] [BEAM-4053] Add a Go postcommit

[lcwik] [BEAM-3249] Only generate all artifacts when publishing.

[ehudm] Clean up docker networks generated by hdfs integration tests.

[swegner] Add comment about build scans only on Jenkins

[chamikara] Updates gcsio to use a API specific batch endpoint.

--
[...truncated 1.62 KB...]
[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins7847917087091308028.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins4764724376474456743.sh
+ rm -rf .env
[beam_PerformanceTests_Python] $ /bin/bash -xe /tmp/jenkins140245675040456348.sh
+ virtualenv .env --system-site-packages
New python executable in 

Installing setuptools, pip, wheel...done.
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins1524304689042729770.sh
+ .env/bin/pip install --upgrade setuptools pip
Requirement already up-to-date: setuptools in ./.env/lib/python2.7/site-packages
Requirement already up-to-date: pip in ./.env/lib/python2.7/site-packages
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins2592019467096105231.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins5790369867633706259.sh
+ .env/bin/pip install -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied: absl-py in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: setuptools in ./.env/lib/python2.7/site-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 16))
Requirement already satisfied: colorlog[windows]==2.6.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied: futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied: PyYAML==3.12 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied: pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Collecting numpy==1.13.3 (from -r PerfKitBenchmarker/requirements.txt (line 22))
  Using cached numpy-1.13.3-cp27-cp27mu-manylinux1_x86_64.whl
Requirement already satisfied: functools32 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied: contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Requirement already satisfied: pywinrm in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: six in 
/home/jenkins/.local/lib/python2.7/site-packages (from absl-py->-r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: MarkupSafe>=0.23 in 
/usr/local/lib/python2.7/dist-packages (from jinja2>=2.7->-r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: colorama; extra == "windows" in 
/usr/lib/python2.7/dist-packages (from colorlog[windows]==2.6.0->-r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: xmltodict in 
/home/jenkins/.local/lib/python2.7/site-packages (from pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: requests-ntlm>=0.3.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: requests>=2.9.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: ntlm-auth>=1.0.2 in 
/home/jenkins/.local/lib/python2.7/site-packages (from 
requests-ntlm>=0.3.0->pywinrm->-r PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: cryptography>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from 
requests-ntlm>=0.3.0->pywinrm->-r PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: idna<2.6,>=2.5 in 

Build failed in Jenkins: beam_PostCommit_Java_GradleBuild #50

2018-04-11 Thread Apache Jenkins Server
See 


Changes:

[swegner] Publish a build scan from Jenkins build

[swegner] Add comment about build scans only on Jenkins

--
[...truncated 19.00 MB...]
Apr 12, 2018 12:13:42 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Schema 
View/Combine.GloballyAsSingletonView/CreateDataflowView as step s13
Apr 12, 2018 12:13:43 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Serialize 
mutations as step s14
Apr 12, 2018 12:13:43 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Extract keys 
as step s15
Apr 12, 2018 12:13:43 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/GroupByKey as step s16
Apr 12, 2018 12:13:43 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Sample 
keys/Combine.GroupedValues as step s17
Apr 12, 2018 12:13:43 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/GBKaSVForData/ParDo(GroupByKeyHashAndSortByKeyAndWindow) as step s18
Apr 12, 2018 12:13:43 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/GBKaSVForData/BatchViewOverrides.GroupByKeyAndSortValuesOnly as step s19
Apr 12, 2018 12:13:43 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/ParMultiDo(ToIsmRecordForMapLike) as step s20
Apr 12, 2018 12:13:43 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/GBKaSVForSize as step s21
Apr 12, 2018 12:13:43 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/ParDo(ToIsmMetadataRecordForSize) as step s22
Apr 12, 2018 12:13:43 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/GBKaSVForKeys as step s23
Apr 12, 2018 12:13:43 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/ParDo(ToIsmMetadataRecordForKey) as step s24
Apr 12, 2018 12:13:43 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/Flatten.PCollections as step s25
Apr 12, 2018 12:13:43 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/CreateDataflowView as step s26
Apr 12, 2018 12:13:43 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Partition 
input as step s27
Apr 12, 2018 12:13:43 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Group by 
partition as step s28
Apr 12, 2018 12:13:43 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Batch 
mutations together as step s29
Apr 12, 2018 12:13:43 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Write 
mutations to Spanner as step s30
Apr 12, 2018 12:13:43 AM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: Staging pipeline description to 
gs://temp-storage-for-end-to-end-tests/spannerwriteit0testwrite-jenkins-0412001325-6999722f/output/results/staging/
Apr 12, 2018 12:13:43 AM org.apache.beam.runners.dataflow.util.PackageUtil 
tryStagePackage
INFO: Uploading <80355 bytes, hash DSgYS3XRZb3oP5dHXXoOCQ> to 
gs://temp-storage-for-end-to-end-tests/spannerwriteit0testwrite-jenkins-0412001325-6999722f/output/results/staging/pipeline-DSgYS3XRZb3oP5dHXXoOCQ.pb

org.apache.beam.sdk.io.gcp.spanner.SpannerWriteIT > testWrite 

[jira] [Work logged] (BEAM-3981) Futurize and fix python 2 compatibility for coders package

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3981?focusedWorklogId=90255=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90255
 ]

ASF GitHub Bot logged work on BEAM-3981:


Author: ASF GitHub Bot
Created on: 12/Apr/18 01:32
Start Date: 12/Apr/18 01:32
Worklog Time Spent: 10m 
  Work Description: charlesccychen commented on a change in pull request 
#5053: [BEAM-3981] Futurize coders subpackage
URL: https://github.com/apache/beam/pull/5053#discussion_r180943388
 
 

 ##
 File path: sdks/python/apache_beam/coders/coder_impl.py
 ##
 @@ -279,21 +282,21 @@ def get_estimated_size_and_observables(self, value, 
nested=False):
 
   def encode_to_stream(self, value, stream, nested):
 t = type(value)
-if t is NoneType:
+if value is None:
   stream.write_byte(NONE_TYPE)
 elif t is int:
   stream.write_byte(INT_TYPE)
   stream.write_var_int64(value)
 elif t is float:
   stream.write_byte(FLOAT_TYPE)
   stream.write_bigendian_double(value)
-elif t is str:
-  stream.write_byte(STR_TYPE)
+elif t is bytes:
+  stream.write_byte(BYTES_TYPE)
   stream.write(value, nested)
-elif t is six.text_type:
-  unicode_value = value  # for typing
-  stream.write_byte(UNICODE_TYPE)
-  stream.write(unicode_value.encode('utf-8'), nested)
+elif t is unicode:
+  text_value = value  # for typing
 
 Review comment:
   Can you use the same `try: unicode except: unicode = str` in the 
corresponding part of the .pxd file so that the type annotation directive for 
`text_value` is respected?
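For context, the `try`/`except` shim referenced in this comment is a common Python 2/3 compatibility pattern. A minimal sketch, assuming text is always written as UTF-8 (the function name `encode_text` is hypothetical, not Beam's actual code):

```python
# Minimal sketch of the py2/py3 compatibility shim discussed above.
# Under Python 3 the builtin `unicode` does not exist, so we alias it
# to `str`; under Python 2 the real builtin is left untouched.
try:
    unicode  # noqa: F821 (exists only on Python 2)
except NameError:
    unicode = str

def encode_text(value):
    # Text values are written as UTF-8 bytes regardless of which
    # Python version produced them; byte strings pass through.
    if isinstance(value, unicode):
        return value.encode('utf-8')
    return value
```

With the alias in place, the same `isinstance(value, unicode)` check works on both interpreter versions, which is what makes the corresponding `.pxd` type annotation consistent.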


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 90255)
Time Spent: 7h 20m  (was: 7h 10m)

> Futurize and fix python 2 compatibility for coders package
> --
>
> Key: BEAM-3981
> URL: https://issues.apache.org/jira/browse/BEAM-3981
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Ahmet Altay
>Priority: Major
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> Run automatic conversion with futurize tool on coders subpackage and fix 
> python 2 compatibility. This prepares the subpackage for python 3 support.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (BEAM-3981) Futurize and fix python 2 compatibility for coders package

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3981?focusedWorklogId=90253=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90253
 ]

ASF GitHub Bot logged work on BEAM-3981:


Author: ASF GitHub Bot
Created on: 12/Apr/18 01:32
Start Date: 12/Apr/18 01:32
Worklog Time Spent: 10m 
  Work Description: charlesccychen commented on a change in pull request 
#5053: [BEAM-3981] Futurize coders subpackage
URL: https://github.com/apache/beam/pull/5053#discussion_r180943361
 
 

 ##
 File path: sdks/python/apache_beam/coders/coders.py
 ##
 @@ -309,11 +318,6 @@ class ToStringCoder(Coder):
   """A default string coder used if no sink coder is specified."""
 
   def encode(self, value):
-try:   # Python 2
-  if isinstance(value, unicode):
-return value.encode('utf-8')
 
 Review comment:
   Should we do the same `unicode = str` thing here?  (In the new version, we 
will raise an error if the value is a non-ASCII unicode string; e.g. `str(u'')` 
raises an error, while this worked before.)
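A version-safe variant of the encode path under discussion might look like the following sketch (the name `to_string_encode` is hypothetical, not Beam's actual `ToStringCoder`); the point is that non-ASCII text must be UTF-8 encoded explicitly rather than passed through a bare `str()`:

```python
# Hypothetical sketch of a version-safe ToString-style encode, per the
# review comment above: calling plain str() on a non-ASCII unicode
# string fails under Python 2, so text is encoded explicitly instead.
try:
    unicode  # present only on Python 2
except NameError:
    unicode = str

def to_string_encode(value):
    if isinstance(value, unicode):
        return value.encode('utf-8')   # explicit UTF-8, never ASCII str()
    if isinstance(value, bytes):
        return value                   # already encoded
    return str(value).encode('utf-8')  # fall back to the text repr
```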




Issue Time Tracking
---

Worklog Id: (was: 90253)
Time Spent: 7h  (was: 6h 50m)

> Futurize and fix python 2 compatibility for coders package
> --
>
> Key: BEAM-3981
> URL: https://issues.apache.org/jira/browse/BEAM-3981
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Ahmet Altay
>Priority: Major
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> Run automatic conversion with futurize tool on coders subpackage and fix 
> python 2 compatibility. This prepares the subpackage for python 3 support.





[jira] [Work logged] (BEAM-3981) Futurize and fix python 2 compatibility for coders package

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3981?focusedWorklogId=90254=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90254
 ]

ASF GitHub Bot logged work on BEAM-3981:


Author: ASF GitHub Bot
Created on: 12/Apr/18 01:32
Start Date: 12/Apr/18 01:32
Worklog Time Spent: 10m 
  Work Description: charlesccychen commented on a change in pull request 
#5053: [BEAM-3981] Futurize coders subpackage
URL: https://github.com/apache/beam/pull/5053#discussion_r180943338
 
 

 ##
 File path: sdks/python/apache_beam/coders/coders.py
 ##
 @@ -216,6 +222,9 @@ def __eq__(self, other):
 and self._dict_without_impl() == other._dict_without_impl())
 # pylint: enable=protected-access
 
+  def __hash__(self):
+return hash(type(self))
 
 Review comment:
   Any particular reason for this change?  Previously, the hash would default 
to `object.__hash__`, which tries to give a different hash code for each 
instance.  This change would give the same hash code for each class, which may 
not be desirable, since coders in general could be parameterized.
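The reviewer's concern can be illustrated with a toy parameterized class (hypothetical, not Beam's `Coder`): hashing only on `type(self)` gives every instance of the class the same hash, even when the instances compare unequal.

```python
# Toy illustration of the hashing concern raised above: hashing on
# type(self) alone collapses all instances of a parameterized class
# onto one hash bucket, ignoring the parameters that drive equality.
class ToyCoder(object):
    def __init__(self, width):
        self.width = width

    def __eq__(self, other):
        return type(self) == type(other) and self.width == other.width

    # The reviewed change: hash on the class alone.
    def __hash__(self):
        return hash(type(self))

a, b = ToyCoder(8), ToyCoder(16)
# a != b, yet hash(a) == hash(b): legal (hashes may collide), but it
# degrades dict/set performance for parameterized coders.
```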




Issue Time Tracking
---

Worklog Id: (was: 90254)
Time Spent: 7h 10m  (was: 7h)

> Futurize and fix python 2 compatibility for coders package
> --
>
> Key: BEAM-3981
> URL: https://issues.apache.org/jira/browse/BEAM-3981
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-py-core
>Reporter: Robbe
>Assignee: Ahmet Altay
>Priority: Major
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> Run automatic conversion with futurize tool on coders subpackage and fix 
> python 2 compatibility. This prepares the subpackage for python 3 support.





[beam] 01/01: Merge pull request #5105: [BEAM-3991] Updates gcsio to use a API specific batch endpoint

2018-04-11 Thread chamikara
This is an automated email from the ASF dual-hosted git repository.

chamikara pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git

commit b8cf07dc03643ded1d4e9d45caf4151316cf2c3e
Merge: afddb22 919295f
Author: Chamikara Jayalath 
AuthorDate: Wed Apr 11 17:00:41 2018 -0700

Merge pull request #5105: [BEAM-3991] Updates gcsio to use a API specific 
batch endpoint

 sdks/python/apache_beam/io/gcp/gcsio.py | 13 +
 1 file changed, 13 insertions(+)

-- 
To stop receiving notification emails like this one, please contact
chamik...@apache.org.


Build failed in Jenkins: beam_PerformanceTests_Spark #1579

2018-04-11 Thread Apache Jenkins Server
See 


Changes:

[swegner] Publish a build scan from Jenkins build

[swegner] Migrate a few more references to Maven from our README.md

[amyrvold] [BEAM-4053] Add a Go postcommit

[lcwik] [BEAM-3249] Only generate all artifacts when publishing.

[ehudm] Clean up docker networks generated by hdfs integration tests.

[swegner] Add comment about build scans only on Jenkins

[chamikara] Updates gcsio to use a API specific batch endpoint.

--
[...truncated 89.30 KB...]
'apache-beam-testing:bqjob_r9c3ae5d9026ac04_0162b7407a31_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)
Upload complete.Waiting on bqjob_r9c3ae5d9026ac04_0162b7407a31_1 ... (0s) 
Current status: RUNNING 
Waiting on 
bqjob_r9c3ae5d9026ac04_0162b7407a31_1 ... (0s) Current status: DONE   
2018-04-12 00:28:00,962 1d1349eb MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-04-12 00:28:27,934 1d1349eb MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-04-12 00:28:30,277 1d1349eb MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1
STDOUT: 

BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r63905bbf0b364fe_0162b740ed33_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)
Upload complete.Waiting on bqjob_r63905bbf0b364fe_0162b740ed33_1 ... (0s) 
Current status: RUNNING 
Waiting on 
bqjob_r63905bbf0b364fe_0162b740ed33_1 ... (0s) Current status: DONE   
2018-04-12 00:28:30,277 1d1349eb MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-04-12 00:28:55,626 1d1349eb MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-04-12 00:28:57,978 1d1349eb MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1
STDOUT: 

BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r7512daeaf9692f82_0162b7415939_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)
Upload complete.Waiting on bqjob_r7512daeaf9692f82_0162b7415939_1 ... (0s) 
Current status: RUNNING 
 Waiting on 
bqjob_r7512daeaf9692f82_0162b7415939_1 ... (0s) Current status: DONE   
2018-04-12 00:28:57,979 1d1349eb MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-04-12 00:29:26,930 1d1349eb MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-04-12 00:29:29,358 1d1349eb MainThread INFO Ran: {bq load --autodetect 

[jira] [Work logged] (BEAM-2990) support data type MAP

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2990?focusedWorklogId=90242=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90242
 ]

ASF GitHub Bot logged work on BEAM-2990:


Author: ASF GitHub Bot
Created on: 12/Apr/18 00:55
Start Date: 12/Apr/18 00:55
Worklog Time Spent: 10m 
  Work Description: XuMingmin commented on a change in pull request #5079: 
[BEAM-2990] support MAP in SQL schema
URL: https://github.com/apache/beam/pull/5079#discussion_r180939261
 
 

 ##
 File path: sdks/java/core/src/test/java/org/apache/beam/sdk/values/RowTest.java
 ##
 @@ -174,6 +176,24 @@ public void testCreatesArrayArray() {
 assertEquals(data, row.getArray("array"));
   }
 
+  @Test
+  public void testCreatesMap() {
+Map data = new HashMap() {
+  {
+put(1, "value1");
+put(2, "value2");
+put(3, "value3");
+put(4, "value4");
 
 Review comment:
   will update to support primitive/array/map/row as value type in Map
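    As a rough illustration of that plan (toy code, not Beam's schema machinery), a map's values could be checked recursively against those kinds:

```python
# Toy recursive check that a map's values are drawn from the kinds
# mentioned above: primitives, arrays (lists), maps (dicts), or
# row-like values (modeled here as tuples for illustration).
PRIMITIVES = (int, float, str, bytes, bool)

def is_valid_map_value(value):
    if isinstance(value, PRIMITIVES):
        return True
    if isinstance(value, list):          # "array" value
        return all(is_valid_map_value(v) for v in value)
    if isinstance(value, dict):          # nested "map" value
        return all(is_valid_map_value(v) for v in value.values())
    if isinstance(value, tuple):         # "row" stand-in
        return all(is_valid_map_value(v) for v in value)
    return False
```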




Issue Time Tracking
---

Worklog Id: (was: 90242)
Time Spent: 2h 50m  (was: 2h 40m)

> support data type MAP
> -
>
> Key: BEAM-2990
> URL: https://issues.apache.org/jira/browse/BEAM-2990
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Xu Mingmin
>Assignee: Xu Mingmin
>Priority: Major
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> support Non-scalar types:
> MAP   Collection of keys mapped to values
> ARRAY Ordered, contiguous collection that may contain duplicates





[jira] [Work logged] (BEAM-2990) support data type MAP

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2990?focusedWorklogId=90245=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90245
 ]

ASF GitHub Bot logged work on BEAM-2990:


Author: ASF GitHub Bot
Created on: 12/Apr/18 00:59
Start Date: 12/Apr/18 00:59
Worklog Time Spent: 10m 
  Work Description: XuMingmin commented on a change in pull request #5079: 
[BEAM-2990] support MAP in SQL schema
URL: https://github.com/apache/beam/pull/5079#discussion_r180939620
 
 

 ##
 File path: sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/Schema.java
 ##
 @@ -208,6 +209,7 @@ public int hashCode() {
 public static final Set STRING_TYPES = ImmutableSet.of(STRING);
 public static final Set DATE_TYPES = ImmutableSet.of(DATETIME);
 public static final Set CONTAINER_TYPES = ImmutableSet.of(ARRAY);
+public static final Set MAP_TYPES = ImmutableSet.of(MAP);
 
 Review comment:
   I would separate these here; CONTAINER should cover ARRAY/SET. A `List>` could be a 
CONTAINER_TYPE, while a `Map<>` is not.




Issue Time Tracking
---

Worklog Id: (was: 90245)
Time Spent: 3h 10m  (was: 3h)

> support data type MAP
> -
>
> Key: BEAM-2990
> URL: https://issues.apache.org/jira/browse/BEAM-2990
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Xu Mingmin
>Assignee: Xu Mingmin
>Priority: Major
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> support Non-scalar types:
> MAP   Collection of keys mapped to values
> ARRAY Ordered, contiguous collection that may contain duplicates



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Jenkins build is back to normal : beam_PostCommit_Java_GradleBuild #51

2018-04-11 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-4028) Step / Operation naming should rely on a NameContext class

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4028?focusedWorklogId=90260=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90260
 ]

ASF GitHub Bot logged work on BEAM-4028:


Author: ASF GitHub Bot
Created on: 12/Apr/18 01:41
Start Date: 12/Apr/18 01:41
Worklog Time Spent: 10m 
  Work Description: charlesccychen commented on a change in pull request 
#5043: [BEAM-4028] Adding NameContext to Python SDK.
URL: https://github.com/apache/beam/pull/5043#discussion_r180944559
 
 

 ##
 File path: sdks/python/apache_beam/runners/common.py
 ##
 @@ -39,6 +39,75 @@
 from apache_beam.utils.windowed_value import WindowedValue
 
 
+class NameContext(object):
+  """Holds the name information for a step."""
+
+  def __init__(self, step_name):
+"""Creates a new step NameContext.
+
+Args:
+  step_name: The name of the step.
+"""
+self.step_name = step_name
+
+  def __eq__(self, other):
+return self.step_name == other.step_name
+
+  def __ne__(self, other):
+return not self == other
+
+  def __repr__(self):
+return 'NameContext(%s)' % self.__dict__()
+
+  def __hash__(self):
+return hash(self.step_name)
+
+  def metrics_name(self):
+"""Returns the step name used for metrics reporting."""
+return self.step_name
+
+  def logging_name(self):
+"""Returns the step name used for logging."""
+return self.step_name
+
+
+class DataflowNameContext(NameContext):
 
 Review comment:
   
   Do we need this to be exposed in the Beam code?  Can we have this only in 
the worker?  If this needs to be done in two steps, can you add a TODO / Jira?




Issue Time Tracking
---

Worklog Id: (was: 90260)
Time Spent: 2h 20m  (was: 2h 10m)

> Step / Operation naming should rely on a NameContext class
> --
>
> Key: BEAM-4028
> URL: https://issues.apache.org/jira/browse/BEAM-4028
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Steps can have different names depending on the runner (stage, step, user, 
> system name...). 
> Depending on the needs of different components (operations, logging, metrics, 
> statesampling) these step names are passed around without a specific order.
> Instead, SDK should rely on `NameContext` objects that carry all the naming 
> information for a single step.





[jira] [Work logged] (BEAM-4028) Step / Operation naming should rely on a NameContext class

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4028?focusedWorklogId=90259=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90259
 ]

ASF GitHub Bot logged work on BEAM-4028:


Author: ASF GitHub Bot
Created on: 12/Apr/18 01:40
Start Date: 12/Apr/18 01:40
Worklog Time Spent: 10m 
  Work Description: charlesccychen commented on a change in pull request 
#5043: [BEAM-4028] Adding NameContext to Python SDK.
URL: https://github.com/apache/beam/pull/5043#discussion_r180944557
 
 

 ##
 File path: sdks/python/apache_beam/runners/worker/operations.py
 ##
 @@ -104,34 +104,45 @@ class Operation(object):
   one or more receiver operations that will take that as input.
   """
 
-  def __init__(self, operation_name, spec, counter_factory, state_sampler):
+  def __init__(self, name_context, spec, counter_factory, state_sampler):
 """Initializes a worker operation instance.
 
 Args:
-  operation_name: The system name assigned by the runner for this
-operation.
+  name_context: A NameContext instance or string(deprecated), with the
+name information for this operation.
   spec: A operation_specs.Worker* instance.
   counter_factory: The CounterFactory to use for our counters.
   state_sampler: The StateSampler for the current operation.
 """
-self.operation_name = operation_name
+if isinstance(name_context, common.NameContext):
+  #TODO(pabloem) - Clean this up once it's completely migrated.
 
 Review comment:
   
   Space before TODO.  Can you also add a Jira for the remaining work?




Issue Time Tracking
---

Worklog Id: (was: 90259)
Time Spent: 2h 10m  (was: 2h)

> Step / Operation naming should rely on a NameContext class
> --
>
> Key: BEAM-4028
> URL: https://issues.apache.org/jira/browse/BEAM-4028
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Steps can have different names depending on the runner (stage, step, user, 
> system name...). 
> Depending on the needs of different components (operations, logging, metrics, 
> statesampling) these step names are passed around without a specific order.
> Instead, SDK should rely on `NameContext` objects that carry all the naming 
> information for a single step.





[jira] [Work logged] (BEAM-3714) JdbcIO.read() should create a forward-only, read-only result set

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3714?focusedWorklogId=90262=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90262
 ]

ASF GitHub Bot logged work on BEAM-3714:


Author: ASF GitHub Bot
Created on: 12/Apr/18 01:47
Start Date: 12/Apr/18 01:47
Worklog Time Spent: 10m 
  Work Description: evindj opened a new pull request #5109: 
[BEAM-3714]modified result set to be forward only and read only
URL: https://github.com/apache/beam/pull/5109
 
 
   DESCRIPTION HERE
   
   
   
   Follow this checklist to help us incorporate your contribution quickly and 
easily:
   
- [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
- [ ] Write a pull request description that is detailed enough to 
understand:
  - [ ] What the pull request does
  - [ ] Why it does it
  - [ ] How it does it
  - [ ] Why this approach
- [ ] Each commit in the pull request should have a meaningful subject line 
and body.
- [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
- [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 90262)
Time Spent: 2.5h  (was: 2h 20m)

> JdbcIO.read() should create a forward-only, read-only result set
> 
>
> Key: BEAM-3714
> URL: https://issues.apache.org/jira/browse/BEAM-3714
> Project: Beam
>  Issue Type: Bug
>  Components: io-java-jdbc
>Reporter: Eugene Kirpichov
>Assignee: Innocent
>Priority: Major
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> [https://stackoverflow.com/questions/48784889/streaming-data-from-cloudsql-into-dataflow/48819934#48819934]
>  - a user is trying to load a large table from MySQL, and the MySQL JDBC 
> driver requires special measures when loading large result sets.
> JdbcIO currently calls simply "connection.prepareStatement(query)" 
> https://github.com/apache/beam/blob/bb8c12c4956cbe3c6f2e57113e7c0ce2a5c05009/sdks/java/io/jdbc/src/main/java/org/apache/beam/sdk/io/jdbc/JdbcIO.java#L508
>  - it should specify type TYPE_FORWARD_ONLY and concurrency CONCUR_READ_ONLY 
> - these values should always be used.
> Seems that different databases have different requirements for streaming 
> result sets.
> E.g. MySQL requires setting fetch size; PostgreSQL says "The Connection must 
> not be in autocommit mode." 
> https://jdbc.postgresql.org/documentation/head/query.html#query-with-cursor . 
> Oracle, I think, doesn't have any special requirements but I don't know. 
> Fetch size should probably still be set to a reasonably large value.
> Seems that the common denominator of these requirements is: set fetch size to 
> a reasonably large but not maximum value; disable autocommit (there's nothing 
> to commit in read() anyway).
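
The common denominator described above can be sketched as a plain JDBC helper. This is a hypothetical illustration, not JdbcIO's actual code; the class name, method name, and the fetch size of 10,000 are placeholders chosen for the example:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class StreamingQuery {

  // Prepares a statement suitable for streaming a large result set:
  // forward-only, read-only cursor, autocommit off, bounded fetch size.
  static PreparedStatement prepareStreamingStatement(Connection connection, String query)
      throws java.sql.SQLException {
    // There is nothing to commit in a read; PostgreSQL additionally requires
    // autocommit to be off before it will use a cursor-based result set.
    connection.setAutoCommit(false);
    // TYPE_FORWARD_ONLY / CONCUR_READ_ONLY is the cheapest cursor type and
    // matches what read() needs.
    PreparedStatement statement =
        connection.prepareStatement(
            query, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
    // A reasonably large (but not maximum) fetch size bounds memory while
    // avoiding a round trip per row. 10,000 is an arbitrary placeholder.
    statement.setFetchSize(10_000);
    return statement;
  }

  public static void main(String[] args) {
    // No database connection here; just print the JDBC constants used above.
    System.out.println(ResultSet.TYPE_FORWARD_ONLY);
    System.out.println(ResultSet.CONCUR_READ_ONLY);
  }
}
```

Note that some drivers need more than this: the MySQL driver in the linked question interprets fetch size specially to enable row-by-row streaming, so a per-driver tweak may still be required on top of the common settings.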



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-4056) Identify Side Inputs by PTransform ID and local name

2018-04-11 Thread Ben Sidhom (JIRA)
Ben Sidhom created BEAM-4056:


 Summary: Identify Side Inputs by PTransform ID and local name
 Key: BEAM-4056
 URL: https://issues.apache.org/jira/browse/BEAM-4056
 Project: Beam
  Issue Type: New Feature
  Components: runner-core
Reporter: Ben Sidhom
Assignee: Ben Sidhom


This is necessary in order to correctly identify side inputs during all phases 
of portable pipeline execution (fusion, translation, and SDK execution).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PerformanceTests_XmlIOIT #130

2018-04-11 Thread Apache Jenkins Server
See 


Changes:

[robertwb] Use valid name for fake ArgSpec args.

[ccy] Update DataflowRunner to suppress duplicate system messages

[Pablo] Fixing data race in statesampler_fast.

[swegner] Decrease flink ValidatesRunner maxParallelForks from 4 to 2 in order 
to

[herohde] [BEAM-4015] Always update Go dependences in Maven build

[herohde] Update Go maven plugin

--
[...truncated 26.53 KB...]
[INFO] Excluding 
com.google.apis:google-api-services-storage:jar:v1-rev71-1.22.0 from the shaded 
jar.
[INFO] Excluding com.google.auth:google-auth-library-credentials:jar:0.7.1 from 
the shaded jar.
[INFO] Excluding com.google.auth:google-auth-library-oauth2-http:jar:0.7.1 from 
the shaded jar.
[INFO] Excluding com.google.cloud.bigdataoss:util:jar:1.4.5 from the shaded jar.
[INFO] Excluding com.google.api-client:google-api-client-java6:jar:1.22.0 from 
the shaded jar.
[INFO] Excluding com.google.api-client:google-api-client-jackson2:jar:1.22.0 
from the shaded jar.
[INFO] Excluding com.google.oauth-client:google-oauth-client-java6:jar:1.22.0 
from the shaded jar.
[INFO] Excluding 
org.apache.beam:beam-sdks-java-io-google-cloud-platform:jar:2.5.0-SNAPSHOT from 
the shaded jar.
[INFO] Excluding 
org.apache.beam:beam-sdks-java-extensions-protobuf:jar:2.5.0-SNAPSHOT from the 
shaded jar.
[INFO] Excluding io.grpc:grpc-core:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.google.errorprone:error_prone_annotations:jar:2.0.15 from 
the shaded jar.
[INFO] Excluding io.grpc:grpc-context:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.google.instrumentation:instrumentation-api:jar:0.3.0 from 
the shaded jar.
[INFO] Excluding 
com.google.apis:google-api-services-bigquery:jar:v2-rev374-1.22.0 from the 
shaded jar.
[INFO] Excluding com.google.api:gax-grpc:jar:0.20.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-protobuf:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.google.api:api-common:jar:1.0.0-rc2 from the shaded jar.
[INFO] Excluding com.google.auto.value:auto-value:jar:1.5.3 from the shaded jar.
[INFO] Excluding com.google.api:gax:jar:1.3.1 from the shaded jar.
[INFO] Excluding org.threeten:threetenbp:jar:1.3.3 from the shaded jar.
[INFO] Excluding com.google.cloud:google-cloud-core-grpc:jar:1.2.0 from the 
shaded jar.
[INFO] Excluding com.google.protobuf:protobuf-java-util:jar:3.2.0 from the 
shaded jar.
[INFO] Excluding com.google.code.gson:gson:jar:2.7 from the shaded jar.
[INFO] Excluding com.google.apis:google-api-services-pubsub:jar:v1-rev10-1.22.0 
from the shaded jar.
[INFO] Excluding com.google.api.grpc:grpc-google-cloud-pubsub-v1:jar:0.1.18 
from the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-cloud-pubsub-v1:jar:0.1.18 
from the shaded jar.
[INFO] Excluding com.google.api.grpc:proto-google-iam-v1:jar:0.1.18 from the 
shaded jar.
[INFO] Excluding com.google.cloud.datastore:datastore-v1-proto-client:jar:1.4.0 
from the shaded jar.
[INFO] Excluding com.google.http-client:google-http-client-protobuf:jar:1.22.0 
from the shaded jar.
[INFO] Excluding com.google.http-client:google-http-client-jackson:jar:1.22.0 
from the shaded jar.
[INFO] Excluding com.google.cloud.datastore:datastore-v1-protos:jar:1.3.0 from 
the shaded jar.
[INFO] Excluding com.google.api.grpc:grpc-google-common-protos:jar:0.1.9 from 
the shaded jar.
[INFO] Excluding io.grpc:grpc-auth:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-netty:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.netty:netty-codec-http2:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-codec-http:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-handler-proxy:jar:4.1.8.Final from the shaded 
jar.
[INFO] Excluding io.netty:netty-codec-socks:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-handler:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-buffer:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-common:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-transport:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-resolver:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.netty:netty-codec:jar:4.1.8.Final from the shaded jar.
[INFO] Excluding io.grpc:grpc-stub:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-all:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-okhttp:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.squareup.okhttp:okhttp:jar:2.5.0 from the shaded jar.
[INFO] Excluding com.squareup.okio:okio:jar:1.6.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-protobuf-lite:jar:1.2.0 from the shaded jar.
[INFO] Excluding io.grpc:grpc-protobuf-nano:jar:1.2.0 from the shaded jar.
[INFO] Excluding com.google.protobuf.nano:protobuf-javanano:jar:3.0.0-alpha-5 
from the shaded jar.
[INFO] Excluding 

Build failed in Jenkins: beam_PerformanceTests_Python #1134

2018-04-11 Thread Apache Jenkins Server
See 


Changes:

[robertwb] Use valid name for fake ArgSpec args.

[ccy] Update DataflowRunner to suppress duplicate system messages

[Pablo] Fixing data race in statesampler_fast.

[swegner] Decrease flink ValidatesRunner maxParallelForks from 4 to 2 in order 
to

[herohde] [BEAM-4015] Always update Go dependences in Maven build

[herohde] Update Go maven plugin

--
[...truncated 1.84 KB...]
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins6458251481524553331.sh
+ virtualenv .env --system-site-packages
New python executable in .env/bin/python
Installing setuptools, pip...done.
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins6534466918144038700.sh
+ .env/bin/pip install --upgrade setuptools pip
Downloading/unpacking setuptools from 
https://pypi.python.org/packages/20/d7/04a0b689d3035143e2ff288f4b9ee4bf6ed80585cc121c90bfd85a1a8c2e/setuptools-39.0.1-py2.py3-none-any.whl#md5=ca299c7acd13a72e1171a3697f2b99bc
Downloading/unpacking pip from 
https://pypi.python.org/packages/ac/95/a05b56bb975efa78d3557efa36acaf9cf5d2fd0ee0062060493687432e03/pip-9.0.3-py2.py3-none-any.whl#md5=d512ceb964f38ba31addb8142bc657cb
Installing collected packages: setuptools, pip
  Found existing installation: setuptools 2.2
Uninstalling setuptools:
  Successfully uninstalled setuptools
  Found existing installation: pip 1.5.4
Uninstalling pip:
  Successfully uninstalled pip
Successfully installed setuptools pip
Cleaning up...
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins3844849458040639615.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_Python] $ /bin/bash -xe 
/tmp/jenkins1768250496408946833.sh
+ .env/bin/pip install -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied: absl-py in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: setuptools in ./.env/lib/python2.7/site-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 16))
Requirement already satisfied: colorlog[windows]==2.6.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied: futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied: PyYAML==3.12 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied: pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Requirement already satisfied: numpy==1.13.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 22))
Requirement already satisfied: functools32 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied: contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Requirement already satisfied: pywinrm in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: six in /usr/local/lib/python2.7/dist-packages 
(from absl-py->-r PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied: MarkupSafe>=0.23 in 
/usr/local/lib/python2.7/dist-packages (from jinja2>=2.7->-r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied: colorama; extra == "windows" in 
/usr/lib/python2.7/dist-packages (from colorlog[windows]==2.6.0->-r 
PerfKitBenchmarker/requirements.txt (line 17))
Requirement already satisfied: xmltodict in 
/home/jenkins/.local/lib/python2.7/site-packages (from pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: requests-ntlm>=0.3.0 in 
/home/jenkins/.local/lib/python2.7/site-packages (from pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: requests>=2.9.1 in 
/usr/local/lib/python2.7/dist-packages (from pywinrm->-r 
PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: ntlm-auth>=1.0.2 in 
/home/jenkins/.local/lib/python2.7/site-packages (from 
requests-ntlm>=0.3.0->pywinrm->-r PerfKitBenchmarker/requirements.txt (line 25))
Requirement already satisfied: cryptography>=1.3 in 

Build failed in Jenkins: beam_PerformanceTests_Compressed_TextIOIT_HDFS #35

2018-04-11 Thread Apache Jenkins Server
See 


Changes:

[robertwb] Use valid name for fake ArgSpec args.

[ccy] Update DataflowRunner to suppress duplicate system messages

[Pablo] Fixing data race in statesampler_fast.

[swegner] Decrease flink ValidatesRunner maxParallelForks from 4 to 2 in order 
to

[herohde] [BEAM-4015] Always update Go dependences in Maven build

[herohde] Update Go maven plugin

--
[...truncated 286.44 KB...]
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1703)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1638)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:448)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:444)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:459)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:387)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:892)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:789)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:778)
at 
org.apache.beam.sdk.io.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:109)
at 
org.apache.beam.sdk.io.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:68)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:248)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:235)
at 
org.apache.beam.sdk.io.FileBasedSink$Writer.open(FileBasedSink.java:923)
at 
org.apache.beam.sdk.io.WriteFiles$WriteUnshardedTempFilesWithSpillingFn.processElement(WriteFiles.java:503)
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at 
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at 
org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:614)
at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:712)
at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
at org.apache.hadoop.ipc.Client.call(Client.java:1451)
at org.apache.hadoop.ipc.Client.call(Client.java:1412)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
at com.sun.proxy.$Proxy60.create(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:296)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy61.create(Unknown Source)
at 
org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1623)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1703)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1638)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:448)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:444)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:459)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:387)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:892)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:789)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:778)
at 
org.apache.beam.sdk.io.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:109)
at 
org.apache.beam.sdk.io.hdfs.HadoopFileSystem.create(HadoopFileSystem.java:68)
at 

Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Spark_Gradle #61

2018-04-11 Thread Apache Jenkins Server
See 


--
[...truncated 1.59 MB...]
at 
org.apache.spark.SparkContext$$anonfun$12.apply(SparkContext.scala:457)
at 
org.apache.spark.SparkContext$$anonfun$12.apply(SparkContext.scala:457)
at scala.collection.immutable.List.foreach(List.scala:392)
at org.apache.spark.SparkContext.(SparkContext.scala:457)
at 
org.apache.spark.api.java.JavaSparkContext.(JavaSparkContext.scala:58)
at 
org.apache.beam.runners.spark.translation.SparkContextFactory.createSparkContext(SparkContextFactory.java:103)
at 
org.apache.beam.runners.spark.translation.SparkContextFactory.getSparkContext(SparkContextFactory.java:68)
at 
org.apache.beam.runners.spark.translation.streaming.SparkRunnerStreamingContextFactory.call(SparkRunnerStreamingContextFactory.java:79)
at 
org.apache.beam.runners.spark.translation.streaming.SparkRunnerStreamingContextFactory.call(SparkRunnerStreamingContextFactory.java:47)
at 
org.apache.spark.streaming.api.java.JavaStreamingContext$$anonfun$7.apply(JavaStreamingContext.scala:627)
at 
org.apache.spark.streaming.api.java.JavaStreamingContext$$anonfun$7.apply(JavaStreamingContext.scala:626)
at scala.Option.getOrElse(Option.scala:121)
at 
org.apache.spark.streaming.StreamingContext$.getOrCreate(StreamingContext.scala:828)
at 
org.apache.spark.streaming.api.java.JavaStreamingContext$.getOrCreate(JavaStreamingContext.scala:626)
at 
org.apache.spark.streaming.api.java.JavaStreamingContext.getOrCreate(JavaStreamingContext.scala)
at org.apache.beam.runners.spark.SparkRunner.run(SparkRunner.java:169)
at 
org.apache.beam.runners.spark.TestSparkRunner.run(TestSparkRunner.java:123)
at 
org.apache.beam.runners.spark.TestSparkRunner.run(TestSparkRunner.java:83)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:311)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:346)
at org.apache.beam.sdk.testing.TestPipeline.run(TestPipeline.java:328)
at 
org.apache.beam.runners.spark.translation.streaming.CreateStreamTest.testDiscardingMode(CreateStreamTest.java:203)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
at 
org.apache.beam.sdk.testing.TestPipeline$1.evaluate(TestPipeline.java:317)
at 
org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239)
at org.junit.rules.RunRules.evaluate(RunRules.java:20)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:114)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:57)
at 
org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:66)
at 
org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
at 

[jira] [Work logged] (BEAM-2990) support data type MAP

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2990?focusedWorklogId=89795&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-89795
 ]

ASF GitHub Bot logged work on BEAM-2990:


Author: ASF GitHub Bot
Created on: 11/Apr/18 06:22
Start Date: 11/Apr/18 06:22
Worklog Time Spent: 10m 
  Work Description: XuMingmin commented on issue #5079: [BEAM-2990] support 
MAP in SQL schema
URL: https://github.com/apache/beam/pull/5079#issuecomment-380340663
 
 
   retest this please


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 89795)
Time Spent: 1.5h  (was: 1h 20m)

> support data type MAP
> -
>
> Key: BEAM-2990
> URL: https://issues.apache.org/jira/browse/BEAM-2990
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Xu Mingmin
>Assignee: Xu Mingmin
>Priority: Major
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> support Non-scalar types:
> MAP: Collection of keys mapped to values
> ARRAY: Ordered, contiguous collection that may contain duplicates



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: beam_PerformanceTests_Spark #1576

2018-04-11 Thread Apache Jenkins Server
See 


Changes:

[robertwb] Use valid name for fake ArgSpec args.

[ccy] Update DataflowRunner to suppress duplicate system messages

[Pablo] Fixing data race in statesampler_fast.

[swegner] Decrease flink ValidatesRunner maxParallelForks from 4 to 2 in order 
to

[herohde] [BEAM-4015] Always update Go dependences in Maven build

[herohde] Update Go maven plugin

--
[...truncated 89.48 KB...]
'apache-beam-testing:bqjob_r5243450914a1b779_0162b35cbe7d_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)

2018-04-11 06:20:24,640 6d7e215a MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-04-11 06:20:50,552 6d7e215a MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-04-11 06:20:52,946 6d7e215a MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1
STDOUT: Upload complete.
Waiting on bqjob_r74c396a7e922c1c3_0162b35d2d86_1 ... (0s) Current status: 
RUNNING 
 Waiting on bqjob_r74c396a7e922c1c3_0162b35d2d86_1 ... (0s) 
Current status: DONE   
BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r74c396a7e922c1c3_0162b35d2d86_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)

2018-04-11 06:20:52,947 6d7e215a MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-04-11 06:21:21,118 6d7e215a MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-04-11 06:21:23,913 6d7e215a MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1
STDOUT: Upload complete.
Waiting on bqjob_r47eaffa5fabc537e_0162b35da4dc_1 ... (0s) Current status: 
RUNNING 
 Waiting on bqjob_r47eaffa5fabc537e_0162b35da4dc_1 ... (0s) 
Current status: DONE   
BigQuery error in load operation: Error processing job
'apache-beam-testing:bqjob_r47eaffa5fabc537e_0162b35da4dc_1': Invalid schema
update. Field timestamp has changed type from TIMESTAMP to FLOAT

STDERR: 
/usr/lib/google-cloud-sdk/platform/bq/third_party/oauth2client/contrib/gce.py:73:
 UserWarning: You have requested explicit scopes to be used with a GCE service 
account.
Using this argument will have no effect on the actual scopes for tokens
requested. These scopes are set at VM instance creation time and
can't be overridden in the request.

  warnings.warn(_SCOPES_WARNING)

2018-04-11 06:21:23,913 6d7e215a MainThread INFO Retrying exception running 
IssueRetryableCommand: Command returned a non-zero exit code.

2018-04-11 06:21:53,735 6d7e215a MainThread INFO Running: bq load 
--autodetect --source_format=NEWLINE_DELIMITED_JSON 
beam_performance.pkb_results 

2018-04-11 06:21:56,101 6d7e215a MainThread INFO Ran: {bq load --autodetect 
--source_format=NEWLINE_DELIMITED_JSON beam_performance.pkb_results 

  ReturnCode:1
STDOUT: Upload complete.
Waiting on bqjob_r35638e5254a0690f_0162b35e23c9_1 ... (0s) Current status: 
RUNNING 

Build failed in Jenkins: beam_PostCommit_Java_GradleBuild #38

2018-04-11 Thread Apache Jenkins Server
See 


--
[...truncated 17.39 MB...]
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/GBKaSVForData/ParDo(GroupByKeyHashAndSortByKeyAndWindow) as step s18
Apr 11, 2018 6:44:22 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/GBKaSVForData/BatchViewOverrides.GroupByKeyAndSortValuesOnly as step s19
Apr 11, 2018 6:44:22 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/ParMultiDo(ToIsmRecordForMapLike) as step s20
Apr 11, 2018 6:44:22 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/GBKaSVForSize as step s21
Apr 11, 2018 6:44:22 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/ParDo(ToIsmMetadataRecordForSize) as step s22
Apr 11, 2018 6:44:22 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/GBKaSVForKeys as step s23
Apr 11, 2018 6:44:22 AM 
org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample 
as view/ParDo(ToIsmMetadataRecordForKey) as step s24
Apr 11, 2018 6:44:22 AM org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample as view/Flatten.PCollections as step s25
Apr 11, 2018 6:44:22 AM org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Keys sample as view/CreateDataflowView as step s26
Apr 11, 2018 6:44:22 AM org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Partition input as step s27
Apr 11, 2018 6:44:22 AM org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Group by partition as step s28
Apr 11, 2018 6:44:22 AM org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Batch mutations together as step s29
Apr 11, 2018 6:44:22 AM org.apache.beam.runners.dataflow.DataflowPipelineTranslator$Translator addStep
INFO: Adding SpannerIO.Write/Write mutations to Cloud Spanner/Write mutations to Spanner as step s30
Apr 11, 2018 6:44:22 AM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: Staging pipeline description to gs://temp-storage-for-end-to-end-tests/spannerwriteit0testwrite-jenkins-0411064419-b9126d6a/output/results/staging/
Apr 11, 2018 6:44:22 AM org.apache.beam.runners.dataflow.util.PackageUtil tryStagePackage
INFO: Uploading <80747 bytes, hash k_VJmohlkRFDVo4pkyf6Fw> to gs://temp-storage-for-end-to-end-tests/spannerwriteit0testwrite-jenkins-0411064419-b9126d6a/output/results/staging/pipeline-k_VJmohlkRFDVo4pkyf6Fw.pb

org.apache.beam.sdk.io.gcp.spanner.SpannerWriteIT > testWrite STANDARD_OUT
Dataflow SDK version: 2.5.0-SNAPSHOT

org.apache.beam.sdk.io.gcp.spanner.SpannerWriteIT > testWrite STANDARD_ERROR
Apr 11, 2018 6:44:24 AM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: To access the Dataflow monitoring console, please navigate to https://console.cloud.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-04-10_23_44_23-3950490098324126581?project=apache-beam-testing

org.apache.beam.sdk.io.gcp.spanner.SpannerWriteIT > testWrite STANDARD_OUT
Submitted job: 2018-04-10_23_44_23-3950490098324126581

org.apache.beam.sdk.io.gcp.spanner.SpannerWriteIT > testWrite STANDARD_ERROR
Apr 11, 2018 6:44:24 AM org.apache.beam.runners.dataflow.DataflowRunner run
INFO: To cancel the job using the 'gcloud' tool, run:
> gcloud dataflow jobs --project=apache-beam-testing cancel --region=us-central1 2018-04-10_23_44_23-3950490098324126581
Apr 11, 2018 6:44:24 AM org.apache.beam.runners.dataflow.TestDataflowRunner run
INFO: Running Dataflow job 2018-04-10_23_44_23-3950490098324126581 with 0 expected assertions.
Apr 11, 2018 6:44:34 AM org.apache.beam.runners.dataflow.util.MonitoringUtil$LoggingHandler process
INFO: 2018-04-11T06:44:23.523Z: Autoscaling is enabled for job 2018-04-10_23_44_23-3950490098324126581. The number of workers will be between 

[jira] [Work logged] (BEAM-4044) Take advantage of Calcite DDL

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4044?focusedWorklogId=89807=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-89807
 ]

ASF GitHub Bot logged work on BEAM-4044:


Author: ASF GitHub Bot
Created on: 11/Apr/18 06:48
Start Date: 11/Apr/18 06:48
Worklog Time Spent: 10m 
  Work Description: XuMingmin commented on a change in pull request #5040: 
[BEAM-4044] [SQL] Refresh DDL from 1.16
URL: https://github.com/apache/beam/pull/5040#discussion_r180646402
 
 

 ##
 File path: sdks/java/extensions/sql/src/main/codegen/includes/parserImpls.ftl
 ##
 @@ -1,106 +1,154 @@
-<#-- Licensed to the Apache Software Foundation (ASF) under one or more 
contributor
-  license agreements. See the NOTICE file distributed with this work for 
additional
-  information regarding copyright ownership. The ASF licenses this file to
-  You under the Apache License, Version 2.0 (the "License"); you may not use
-  this file except in compliance with the License. You may obtain a copy of
-  the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required
-  by applicable law or agreed to in writing, software distributed under the
-  License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS
-  OF ANY KIND, either express or implied. See the License for the specific
-  language governing permissions and limitations under the License. -->
+<#--
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to you under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+-->
 
+boolean IfNotExistsOpt() :
+{
+}
+{
+    <IF> <NOT> <EXISTS> { return true; }
+|
+    { return false; }
+}
 
-private void ColumnDef(List<ColumnDefinition> list) :
+boolean IfExistsOpt() :
 {
-    SqlParserPos pos;
-    SqlIdentifier name;
-    SqlDataTypeSpec type;
-    ColumnConstraint constraint = null;
-    SqlNode comment = null;
 }
 {
-    name = SimpleIdentifier() { pos = getPos(); }
-    type = DataType()
-    [
-        <PRIMARY> <KEY>
-        { constraint = new ColumnConstraint.PrimaryKey(getPos()); }
-    ]
+    <IF> <EXISTS> { return true; }
+|
+    { return false; }
+}
+
+SqlNodeList Options() :
+{
+    final Span s;
+    final List<SqlNode> list = Lists.newArrayList();
+}
+{
+    <OPTIONS> { s = span(); } <LPAREN>
     [
-        <COMMENT> comment = StringLiteral()
+        Option(list)
+        (
+            <COMMA>
+            Option(list)
+        )*
     ]
-    {
-        list.add(new ColumnDefinition(name, type, constraint, comment, pos));
+    <RPAREN> {
+        return new SqlNodeList(list, s.end(this));
     }
 }
 
-SqlNodeList ColumnDefinitionList() :
+void Option(List<SqlNode> list) :
 {
-    SqlParserPos pos;
-    List<ColumnDefinition> list = Lists.newArrayList();
+    final SqlIdentifier id;
+    final SqlNode value;
 }
 {
-    <LPAREN> { pos = getPos(); }
-    ColumnDef(list)
-    ( <COMMA> ColumnDef(list) )*
+    id = SimpleIdentifier()
+    value = Literal() {
+        list.add(id);
+        list.add(value);
+    }
+}
+
+SqlNodeList TableElementList() :
+{
+    final Span s;
+    final List<SqlNode> list = Lists.newArrayList();
+}
+{
+    <LPAREN> { s = span(); }
+    TableElement(list)
+    (
+        <COMMA> TableElement(list)
+    )*
     <RPAREN> {
-        return new SqlNodeList(list, pos.plus(getPos()));
+        return new SqlNodeList(list, s.end(this));
+    }
+}
+
+void TableElement(List<SqlNode> list) :
+{
+    final SqlIdentifier id;
+    final SqlDataTypeSpec type;
+    final boolean nullable;
+    SqlNode comment = null;
+    final Span s = Span.of();
+}
+{
+    id = SimpleIdentifier()
+    (
+        type = DataType()
+        (
+            <NULL> { nullable = true; }
+        |
+            <NOT> <NULL> { nullable = false; }
+        |
+            { nullable = true; }
+        )
+        [ <COMMENT> comment = StringLiteral() ]
+        {
+            list.add(
+                SqlDdlNodes.column(s.add(id).end(this), id,
+                    type.withNullable(nullable), comment));
+        }
+    |
+        { list.add(id); }
+    )
+|
+    id = SimpleIdentifier() {
+        list.add(id);
     }
 }
 
-/**
- * CREATE TABLE ( IF NOT EXISTS )?
- *   ( database_name '.' )? table_name ( '(' column_def ( ',' column_def )* ')'
- *   ( STORED AS INPUTFORMAT input_format_classname OUTPUTFORMAT output_format_classname )?
- *   LOCATION location_uri
- *   ( TBLPROPERTIES tbl_properties )?
- *   ( AS select_stmt )
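The review snippet above is a JavaCC/Freemarker grammar, where a production such as IfNotExistsOpt() either consumes an optional keyword sequence or matches nothing. To make that control flow concrete, here is a self-contained plain-Java sketch of the same optional-keyword pattern; all names are illustrative and this is not Beam's generated parser.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/**
 * Hand-rolled sketch of what a production like IfNotExistsOpt() does:
 * try to match an optional keyword sequence at the head of the token
 * stream, consume it on success, and leave the stream untouched otherwise.
 */
public class OptionalKeywordSketch {

    /** Matches "IF NOT EXISTS" at the head of the stream; returns whether it did. */
    static boolean ifNotExistsOpt(Deque<String> tokens) {
        List<String> keywords = List.of("IF", "NOT", "EXISTS");
        int matched = 0;
        for (String t : tokens) {                  // lookahead without consuming
            if (matched == keywords.size()) break;
            if (!keywords.get(matched).equalsIgnoreCase(t)) break;
            matched++;
        }
        if (matched < keywords.size()) {
            return false;                          // alternative branch: match nothing
        }
        for (int i = 0; i < keywords.size(); i++) {
            tokens.poll();                         // commit: consume the keywords
        }
        return true;
    }

    public static void main(String[] args) {
        Deque<String> withClause = new ArrayDeque<>(List.of("IF", "NOT", "EXISTS", "t"));
        System.out.println(ifNotExistsOpt(withClause) + " next=" + withClause.peek());   // true next=t
        Deque<String> withoutClause = new ArrayDeque<>(List.of("t"));
        System.out.println(ifNotExistsOpt(withoutClause) + " next=" + withoutClause.peek()); // false next=t
    }
}
```

JavaCC generates this lookahead-then-consume code from the grammar; the hand-written version only shows why the production can safely return false without disturbing the rest of the parse.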
 
 

[jira] [Work logged] (BEAM-2990) support data type MAP

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2990?focusedWorklogId=89808=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-89808
 ]

ASF GitHub Bot logged work on BEAM-2990:


Author: ASF GitHub Bot
Created on: 11/Apr/18 06:50
Start Date: 11/Apr/18 06:50
Worklog Time Spent: 10m 
  Work Description: XuMingmin commented on issue #5079: [BEAM-2990] support 
MAP in SQL schema
URL: https://github.com/apache/beam/pull/5079#issuecomment-380346045
 
 
   run java precommit


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 89808)
Time Spent: 1h 40m  (was: 1.5h)

> support data type MAP
> -
>
> Key: BEAM-2990
> URL: https://issues.apache.org/jira/browse/BEAM-2990
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Xu Mingmin
>Assignee: Xu Mingmin
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> support Non-scalar types:
> MAP   Collection of keys mapped to values
> ARRAY Ordered, contiguous collection that may contain duplicates
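The non-scalar types requested above can be pictured with ordinary Java collections. The toy row below is an illustrative sketch only, not Beam SQL's actual Row/Schema API; it just shows what a MAP or ARRAY column value means for a row-shaped record.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy row holding scalar and non-scalar (MAP, ARRAY) column values. */
public class NonScalarRowSketch {
    private final Map<String, Object> columns = new HashMap<>();

    NonScalarRowSketch set(String name, Object value) {
        columns.put(name, value);
        return this;
    }

    @SuppressWarnings("unchecked")
    Map<String, String> getMap(String name) {      // MAP column: keys mapped to values
        return (Map<String, String>) columns.get(name);
    }

    @SuppressWarnings("unchecked")
    List<Integer> getArray(String name) {          // ARRAY column: ordered, duplicates allowed
        return (List<Integer>) columns.get(name);
    }

    public static void main(String[] args) {
        NonScalarRowSketch row = new NonScalarRowSketch()
            .set("tags", Map.of("env", "prod"))
            .set("scores", List.of(1, 2, 2, 3));   // duplicate elements are legal
        System.out.println(row.getMap("tags").get("env"));   // prod
        System.out.println(row.getArray("scores").size());   // 4
    }
}
```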



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-4047) TransformExecutorServices should use Executor, not ExecutorService

2018-04-11 Thread Romain Manni-Bucau (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Romain Manni-Bucau reassigned BEAM-4047:


Assignee: (was: Thomas Groh)

> TransformExecutorServices should use Executor, not ExecutorService
> --
>
> Key: BEAM-4047
> URL: https://issues.apache.org/jira/browse/BEAM-4047
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-direct
>Reporter: Romain Manni-Bucau
>Priority: Trivial
>






[jira] [Created] (BEAM-4047) TransformExecutorServices should use Executor, not ExecutorService

2018-04-11 Thread Romain Manni-Bucau (JIRA)
Romain Manni-Bucau created BEAM-4047:


 Summary: TransformExecutorServices should use Executor, not 
ExecutorService
 Key: BEAM-4047
 URL: https://issues.apache.org/jira/browse/BEAM-4047
 Project: Beam
  Issue Type: Improvement
  Components: runner-direct
Reporter: Romain Manni-Bucau
Assignee: Thomas Groh








Jenkins build is back to normal : beam_PostCommit_Python_ValidatesContainer_Dataflow #101

2018-04-11 Thread Apache Jenkins Server
See 




[jira] [Work logged] (BEAM-3956) Stacktraces from exceptions in user code should be preserved in the Python SDK

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3956?focusedWorklogId=90282=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90282
 ]

ASF GitHub Bot logged work on BEAM-3956:


Author: ASF GitHub Bot
Created on: 12/Apr/18 05:25
Start Date: 12/Apr/18 05:25
Worklog Time Spent: 10m 
  Work Description: shoyer commented on issue #4959: [BEAM-3956] Preserve 
stacktraces for Python exceptions
URL: https://github.com/apache/beam/pull/4959#issuecomment-380682171
 
 
   @tvalentyn I rebased and tests pass




Issue Time Tracking
---

Worklog Id: (was: 90282)
Time Spent: 7h 10m  (was: 7h)

> Stacktraces from exceptions in user code should be preserved in the Python SDK
> --
>
> Key: BEAM-3956
> URL: https://issues.apache.org/jira/browse/BEAM-3956
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Stephan Hoyer
>Priority: Major
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> Currently, Beam's Python SDK loses stacktraces for exceptions. It does 
> helpfully add a tag like "[while running StageA]" to exception error 
> messages, but that doesn't include the stacktrace of Python functions being 
> called.
> Including the full stacktraces would make a big difference for the ease of 
> debugging Beam pipelines when things go wrong.
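The ticket concerns the Python SDK, but the underlying idea translates directly to Java: when user code throws inside a pipeline step, wrap the exception with step context while keeping the original as the cause, so its stack trace survives. The StepExecutionException name below is an illustrative stand-in, not Beam's actual class.

```java
/** Sketch of preserving a user-code stack trace while tagging the failing step. */
public class StackTraceSketch {

    static class StepExecutionException extends RuntimeException {
        StepExecutionException(String step, Throwable cause) {
            // Chain the cause so its frames print under "Caused by:".
            super(cause.getMessage() + " [while running " + step + "]", cause);
        }
    }

    static int userCode(int x) {
        return 10 / x;                      // throws ArithmeticException when x == 0
    }

    public static void main(String[] args) {
        try {
            userCode(0);
        } catch (RuntimeException e) {
            StepExecutionException wrapped = new StepExecutionException("StageA", e);
            // Both the tagged message and the original exception are available:
            System.out.println(wrapped.getMessage());                        // / by zero [while running StageA]
            System.out.println(wrapped.getCause().getClass().getSimpleName()); // ArithmeticException
        }
    }
}
```

Losing the equivalent of the `cause` chain is exactly what the ticket describes: the "[while running StageA]" tag survives, but the frames of the user function do not.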





[jira] [Work logged] (BEAM-2990) support data type MAP

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2990?focusedWorklogId=90283=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90283
 ]

ASF GitHub Bot logged work on BEAM-2990:


Author: ASF GitHub Bot
Created on: 12/Apr/18 05:28
Start Date: 12/Apr/18 05:28
Worklog Time Spent: 10m 
  Work Description: XuMingmin commented on issue #5079: [BEAM-2990] support 
MAP in SQL schema
URL: https://github.com/apache/beam/pull/5079#issuecomment-380682698
 
 
   retest this please




Issue Time Tracking
---

Worklog Id: (was: 90283)
Time Spent: 3h 20m  (was: 3h 10m)

> support data type MAP
> -
>
> Key: BEAM-2990
> URL: https://issues.apache.org/jira/browse/BEAM-2990
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Xu Mingmin
>Assignee: Xu Mingmin
>Priority: Major
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> support Non-scalar types:
> MAP   Collection of keys mapped to values
> ARRAY Ordered, contiguous collection that may contain duplicates





[jira] [Work logged] (BEAM-4038) Support Kafka Headers in KafkaIO

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-4038?focusedWorklogId=90275=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90275
 ]

ASF GitHub Bot logged work on BEAM-4038:


Author: ASF GitHub Bot
Created on: 12/Apr/18 03:29
Start Date: 12/Apr/18 03:29
Worklog Time Spent: 10m 
  Work Description: gkumar7 opened a new pull request #5111: BEAM-4038: 
Support Kafka Headers in KafkaIO
URL: https://github.com/apache/beam/pull/5111
 
 
   Adds read support for Kafka headers. These changes have been tested with prior Kafka versions (0.9.x) as well as the latest version (1.0.x).
   
   Added KafkaHeader, KafkaHeaders, KafkaRecordHeader, and KafkaRecordHeaders for backwards compatibility.
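Kafka record headers (added by KAFKA-4208 in Kafka 0.11) are an ordered list of key/byte[] pairs in which a key may repeat, so any reader must pick lookup semantics. The sketch below is a hedged illustration of what such a header model could look like; the KafkaHeader/KafkaHeaders names are stand-ins, not the code from the PR.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Minimal header model: ordered key/byte[] pairs with "last value wins" lookup. */
public class KafkaHeaderSketch {

    static final class KafkaHeader {
        final String key;
        final byte[] value;
        KafkaHeader(String key, byte[] value) { this.key = key; this.value = value; }
    }

    static final class KafkaHeaders {
        private final List<KafkaHeader> headers = new ArrayList<>();

        KafkaHeaders add(String key, String value) {
            headers.add(new KafkaHeader(key, value.getBytes(StandardCharsets.UTF_8)));
            return this;
        }

        /** Returns the last value for a key, or null when no such header exists
         *  (e.g. records from pre-0.11 brokers, which carry no headers at all). */
        String lastValue(String key) {
            String result = null;
            for (KafkaHeader h : headers) {
                if (h.key.equals(key)) {
                    result = new String(h.value, StandardCharsets.UTF_8);
                }
            }
            return result;
        }
    }

    public static void main(String[] args) {
        KafkaHeaders headers = new KafkaHeaders()
            .add("trace-id", "abc")
            .add("trace-id", "def");               // duplicate key: last one wins
        System.out.println(headers.lastValue("trace-id"));   // def
        System.out.println(headers.lastValue("missing"));    // null
    }
}
```

Returning null for records without headers is one way a reader can stay compatible with older broker versions, which matches the backwards-compatibility goal stated in the PR description.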




Issue Time Tracking
---

Worklog Id: (was: 90275)
Time Spent: 10m
Remaining Estimate: 0h

> Support Kafka Headers in KafkaIO
> 
>
> Key: BEAM-4038
> URL: https://issues.apache.org/jira/browse/BEAM-4038
> Project: Beam
>  Issue Type: New Feature
>  Components: io-java-kafka
>Reporter: Geet Kumar
>Assignee: Raghu Angadi
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Headers have been added to Kafka Consumer/Producer records (KAFKA-4208). The 
> purpose of this JIRA is to support this feature in KafkaIO.  
>  





[jira] [Work logged] (BEAM-3956) Stacktraces from exceptions in user code should be preserved in the Python SDK

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3956?focusedWorklogId=90273=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90273
 ]

ASF GitHub Bot logged work on BEAM-3956:


Author: ASF GitHub Bot
Created on: 12/Apr/18 02:58
Start Date: 12/Apr/18 02:58
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #4959: [BEAM-3956] 
Preserve stacktraces for Python exceptions
URL: https://github.com/apache/beam/pull/4959#issuecomment-380660202
 
 
   retest this please




Issue Time Tracking
---

Worklog Id: (was: 90273)
Time Spent: 6h 50m  (was: 6h 40m)

> Stacktraces from exceptions in user code should be preserved in the Python SDK
> --
>
> Key: BEAM-3956
> URL: https://issues.apache.org/jira/browse/BEAM-3956
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Stephan Hoyer
>Priority: Major
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> Currently, Beam's Python SDK loses stacktraces for exceptions. It does 
> helpfully add a tag like "[while running StageA]" to exception error 
> messages, but that doesn't include the stacktrace of Python functions being 
> called.
> Including the full stacktraces would make a big difference for the ease of 
> debugging Beam pipelines when things go wrong.





[jira] [Work logged] (BEAM-3956) Stacktraces from exceptions in user code should be preserved in the Python SDK

2018-04-11 Thread ASF GitHub Bot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-3956?focusedWorklogId=90274=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-90274
 ]

ASF GitHub Bot logged work on BEAM-3956:


Author: ASF GitHub Bot
Created on: 12/Apr/18 03:02
Start Date: 12/Apr/18 03:02
Worklog Time Spent: 10m 
  Work Description: tvalentyn commented on issue #4959: [BEAM-3956] 
Preserve stacktraces for Python exceptions
URL: https://github.com/apache/beam/pull/4959#issuecomment-380660844
 
 
   @shoyer can you please rebase this on top of recent master and rerun 
precommits and make sure they pass? Thank you.  




Issue Time Tracking
---

Worklog Id: (was: 90274)
Time Spent: 7h  (was: 6h 50m)

> Stacktraces from exceptions in user code should be preserved in the Python SDK
> --
>
> Key: BEAM-3956
> URL: https://issues.apache.org/jira/browse/BEAM-3956
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Stephan Hoyer
>Priority: Major
>  Time Spent: 7h
>  Remaining Estimate: 0h
>
> Currently, Beam's Python SDK loses stacktraces for exceptions. It does 
> helpfully add a tag like "[while running StageA]" to exception error 
> messages, but that doesn't include the stacktrace of Python functions being 
> called.
> Including the full stacktraces would make a big difference for the ease of 
> debugging Beam pipelines when things go wrong.




